# **`sagesim` Implementation**

**SAGESim** (Scalable Agent-Based GPU-Enabled Simulator) is the first scalable, pure-Python, general-purpose agent-based modeling framework that supports both distributed computing and GPU acceleration. It is designed to run efficiently on modern high-performance computing (HPC) systems.


We begin by exploring how `sagesim` combines distributed computing with GPU acceleration. This leads naturally to the *Agent Data Tensor* (ADT)—a CuPy-based data structure that stores agent properties in GPU-compatible vectors. The ADT is central to how `sagesim` manages and transfers agent data across processes and GPUs. A solid understanding of the ADT is essential for effectively using `sagesim` and for scaling agent-based simulations in high-performance environments.

After introducing the system architecture and data model, we will demonstrate how to use **`sagesim`** to build and run large-scale agent-based simulations tailored to your specific problem domain. The main workflow consists of:

1. **Defining a Custom Model Class**  
   Subclass the `Model` class provided by the `sagesim` library to define your problem-specific model (e.g., `UserModel`).

2. **Instantiating the Model**  
   Create an instance of your custom model, providing the necessary network structure and problem-specific parameters.

3. **Running the Simulation and Analyzing Results**  
   Execute the simulation and collect results for analysis and visualization.

This notebook series will guide you through each of these steps in detail.


## Distributed Computing and GPU Acceleration


For distributed execution, `sagesim` uses the Python-based Message Passing Interface library, **`mpi4py`**, to manage inter-process communication. In an MPI application, each process is assigned a unique *rank* and runs the same simulation code concurrently. However, each rank has its own local memory, meaning data (e.g., variables or objects) created in one process is inaccessible to others. As a result, communication and coordination between ranks must be explicitly handled through message passing.

Each rank is designed to offload its portion of the agent simulation to a GPU. Ideally, every process independently simulates a subset of the overall agent population on a dedicated GPU. Details on running `sagesim` on an HPC system, including GPU assignment and job scheduling, are discussed in a later section.
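
To make the idea of per-rank subsets concrete, here is a minimal sketch, assuming a simple round-robin assignment of agent IDs to ranks. The helper `local_agent_ids` is hypothetical and not part of `sagesim`, whose actual partitioning scheme is not described here. In a real MPI run, `rank` and `world_size` would come from `MPI.COMM_WORLD.Get_rank()` and `MPI.COMM_WORLD.Get_size()` in `mpi4py`; they are plain integers below so the sketch runs without MPI.

```python
def local_agent_ids(num_agents, rank, world_size):
    """Return the agent IDs one rank would own under round-robin assignment.

    Hypothetical helper for illustration only; not a sagesim function.
    """
    return list(range(rank, num_agents, world_size))


# Pretend we launched 4 processes over a population of 10 agents.
world_size = 4
partitions = [local_agent_ids(10, rank, world_size) for rank in range(world_size)]

for rank, ids in enumerate(partitions):
    # Each rank only ever sees its own list; other ranks' data is not shared.
    print(f"rank {rank} simulates agents {ids}")
```

Every agent appears in exactly one partition, which is why any information that crosses a partition boundary has to travel through explicit messages.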

### ADTs
To support GPU computation, `sagesim` uses **CuPy**, which is compatible with both NVIDIA CUDA and AMD ROCm. All agent data is stored in CuPy arrays, organized into what we call *agent data tensors* (ADTs). An ADT is a structured nested list in which each top-level element holds one agent attribute. Each of these attribute lists has one entry per agent assigned to the rank. The structure is as follows:

- **0th element**: A list of agent IDs (e.g., `[0, 1, 2, 3]`)
- **1st element**: A list of neighbors for each agent, represented as sublists  
  (e.g., `[[1, 4, 5, nan, nan], [2, 3, 4, 5, nan], ...]`)
- **2nd element**: User-defined property 1 (e.g., a list of lists of integers)
- **3rd element**: User-defined property 2 (e.g., a list of lists)
- *...additional properties as defined by the user*

This structure compactly represents all data associated with the agents assigned to a given process.

> **Note:** CuPy arrays must be rectangular, so all sublists at the same nesting level must have the same length. Each attribute list already has one entry per agent, but the inner sublists, such as those representing neighbors, must also be uniform in length. To satisfy this, lists with fewer elements must be padded with `nan` to match the length of the longest list. For example, if the maximum number of neighbors any agent has is 5, then all neighbor lists must be padded to length 5 using `nan`, as shown above.
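
The padding rule can be sketched in plain Python as follows. The helper `pad_with_nan` is a hypothetical name used for illustration, not a `sagesim` function. Values are converted to floats because `nan` is a floating-point value.

```python
import math


def pad_with_nan(rows):
    """Pad every row with NaN so all rows match the longest row's length.

    Hypothetical helper for illustration only; not a sagesim function.
    """
    width = max((len(row) for row in rows), default=0)
    return [
        [float(value) for value in row] + [math.nan] * (width - len(row))
        for row in rows
    ]


# Ragged neighbor lists for three agents; the last agent has no neighbors.
neighbors = [[1, 4, 5], [2, 3, 4, 5], []]
padded = pad_with_nan(neighbors)

# Every row now has the same length, so the nested list is rectangular and
# could be handed to cupy.array(padded) to build a 2-D array on the GPU.
print(padded)
```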

### Communication Between Ranks
The agents assigned to a rank are said to be *local* to that rank. In order to stitch these individual ranks into a global whole, `sagesim` uses 






## Customized Model Class
Building a custom model class that subclasses the base `Model` class provided by `sagesim` is the core of using the library, because it gives you access to the built-in `simulate()` method for executing your simulations. There are three (or in some cases two) main components you’ll need to implement when building your custom model class:

#### 1. **Define and Register Breeds**

Each agent in your model must belong to a specific *breed*. To enable this:

- Define a breed class by subclassing the `Breed` class from `sagesim`. Each breed class is responsible for:
   - Registering agent properties using `breed.register_property()` for each property.
   - Defining the **step functions**, which specify how agents behave at each simulation step, and registering each one using `breed.register_step_func()`. Note that each breed can have multiple step functions, each assigned a different execution priority.
- Register the breed inside the model’s `__init__()` method using `model.register_breed()`.

#### 2. **Define and Register the Reduce Function**

`sagesim` supports parallel execution by distributing agents across multiple compute nodes and/or GPUs. After the simulation steps execute, multiple copies of an agent's data (that is, multiple versions of the ADTs) might exist. A *reduce function* is therefore called to handle the logical combination (or "reduction") of those copies into a single consistent version. To enable this:

- Define the problem-specific *reduce function*.
- Register the *reduce function* in the `__init__()` method using `register_reduce_function()`.

This function is critical for ensuring logical correctness in distributed simulations.
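
As a rough illustration of what such a function might look like, the sketch below merges two copies of one agent's data. Both the two-argument signature and the per-agent layout `[agent_id, neighbors, state]` are assumptions made for this example, and the merge rule (keep the larger state value) is just one possible problem-specific choice. The sketch is plain Python; a real reduce function must also respect the kernel constraints described later in this section.

```python
def reduce_agent_data(copy_a, copy_b):
    """Merge two copies of one agent's data into a single consistent version.

    Assumed layout (hypothetical): [agent_id, neighbors, state].
    """
    merged = []
    merged.append(copy_a[0])  # agent ID is the same in both copies
    merged.append(copy_a[1])  # neighbor list is the same in both copies
    merged.append(max(copy_a[2], copy_b[2]))  # example rule: larger state wins
    return merged


# Two ranks hold diverging copies of agent 7: one saw a state change, one did not.
stale_copy = [7, [1, 2], 0]
updated_copy = [7, [1, 2], 1]
merged_copy = reduce_agent_data(stale_copy, updated_copy)
print(merged_copy)
```

A rule like `max` gives the same result regardless of argument order, which is a useful property when the order in which copies are combined is not guaranteed.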


#### 3. **Create and Connect Agents**

Define separate methods on your model class that:

- Specify which breed each agent belongs to and assign initial values for the agent's attributes using `create_agent_of_breed()`.
- Connect two agents, using `connect_agents()`.


#### Others: 
- If you have any global properties, they should be registered in the model class's `__init__()` method using `register_global_property()`.
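
The overall shape of a custom model can be sketched as below. To keep the example runnable without `sagesim` installed, it defines small stand-in `Breed` and `Model` classes. Only the method names come from the text above; every signature, the property name `state`, the global property `rate`, and the step and reduce function bodies are assumptions for illustration, so consult the library itself for the real interfaces.

```python
# Hypothetical stand-ins so this sketch runs without sagesim installed.
# Only the method names come from the text; all signatures are assumptions.
class Breed:
    def __init__(self, name):
        self.name = name
        self.properties = {}  # property name -> default value
        self.step_funcs = {}  # priority -> step function

    def register_property(self, name, default):
        self.properties[name] = default

    def register_step_func(self, func, priority=0):
        self.step_funcs[priority] = func


class Model:
    def __init__(self):
        self.breeds = {}
        self.global_properties = {}
        self.reduce_function = None
        self.agents = []  # one property dict per agent
        self.edges = []   # pairs of connected agent IDs

    def register_breed(self, breed):
        self.breeds[breed.name] = breed

    def register_reduce_function(self, func):
        self.reduce_function = func

    def register_global_property(self, name, value):
        self.global_properties[name] = value

    def create_agent_of_breed(self, breed, properties):
        self.agents.append({**breed.properties, **properties})
        return len(self.agents) - 1  # new agent's ID

    def connect_agents(self, agent_a, agent_b):
        self.edges.append((agent_a, agent_b))


# --- User code: the three components plus a global property ---
def step_func(agent_index, states):
    states[agent_index] = states[agent_index] + 1  # placeholder behavior


def reduce_func(copy_a, copy_b):
    return max(copy_a, copy_b)  # placeholder merge rule


class UserBreed(Breed):
    def __init__(self):
        super().__init__("user_breed")
        self.register_property("state", 0)
        self.register_step_func(step_func, priority=0)


class UserModel(Model):
    def __init__(self):
        super().__init__()
        self._breed = UserBreed()
        self.register_breed(self._breed)
        self.register_reduce_function(reduce_func)
        self.register_global_property("rate", 0.5)

    def create_agent(self, state):
        return self.create_agent_of_breed(self._breed, {"state": state})

    def connect(self, agent_a, agent_b):
        self.connect_agents(agent_a, agent_b)


model = UserModel()
first = model.create_agent(state=0)
second = model.create_agent(state=1)
model.connect(first, second)
```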


#### CuPy Implementation: What It Means to You

`sagesim` uses a **CuPy** implementation to support both NVIDIA CUDA and AMD ROCm GPUs. However, there are important constraints when using **`cupyx.jit.rawkernel`**. Kernel code must be written in a restricted, low-level subset of Python, as many advanced Python features and abstractions are not supported.

As a result, when implementing your own step functions and reduce functions, you must adhere to these limitations. Key restrictions include (but are not limited to):

- NaN checks must be done via inequality to self (e.g., `x != x`). This is an unfortunate limitation of `cupyx`.
- Dictionaries and custom Python objects are not supported.
- `*args` and `**kwargs` are unsupported.
- Nested function definitions are not allowed.
- Use **CuPy** data types and array routines instead of NumPy: [https://docs.cupy.dev/en/stable/reference/routines.html](https://docs.cupy.dev/en/stable/reference/routines.html)
- `for` loops must use the `range` iterator only — no `for-each` style loops.
- `return` statements do not behave reliably.
- `break` and `continue` statements are unsupported.
- Variables cannot be reassigned within `if` or `for` blocks. Declare and assign them at the top level or within new subscopes.
- Negative indexing (e.g., `array[-1]`) may not work as expected; it can access memory outside the logical bounds of the array. Use `len(array) - 1` instead.
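
The following sketch shows a function written in the restricted style described by the list above: an index-based `range` loop, a NaN check by self-comparison, results written into a preallocated output instead of being returned, and no negative indexing. It is plain Python run on the CPU and is not decorated with `cupyx.jit.rawkernel`, so it only illustrates the coding style. The function name and arguments are hypothetical.

```python
import math


def mark_valid_neighbors(neighbors, valid_mask):
    """Write True into valid_mask[i] where neighbors[i] is not NaN padding."""
    for i in range(len(neighbors)):  # index-based loop, no for-each
        # NaN is the only value that is not equal to itself.
        valid_mask[i] = neighbors[i] == neighbors[i]


# A NaN-padded neighbor row, as stored in the ADT.
row = [1.0, 4.0, 5.0, math.nan, math.nan]
mask = [False] * len(row)  # preallocated output, filled in place
mark_valid_neighbors(row, mask)

# Access the last element without negative indexing.
last = row[len(row) - 1]

print(mask)
```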


