# Welcome to the NEST Model Integration Tutorial

This tutorial will guide you through the process of **adding new models** to the **Neural Encoding Simulation Toolkit (NEST)** — a toolbox consisting of trained encoding models of the brain that you can use to generate in silico neural responses to stimuli of your choice.

The code used to create NEST, along with its utility functions, is available on [GitHub][github].

Whether you're contributing a model for **fMRI**, **EEG**, or a **new modality**, this tutorial walks you through each required step, with hands-on examples and best practices.

If you use the code and/or data from this tutorial, please cite:

> *Gifford AT, Bersch D, Roig G, Cichy RM. 2025. The Neural Encoding Simulation Toolkit. In preparation. https://github.com/gifale95/NEST*



[nest]: https://www.alegifford.com/projects/nest/
[data_manual]: https://docs.google.com/document/d/1DeQwjq96pTkPEnqv7V6q9g_NTHCjc6aYr6y3wPlwgDE/edit?usp=drive_link
[github]: https://github.com/gifale95/NEST


---

#### What You Will Learn

We will cover:

1. **Overview**  
2. **Quick Start: Model Implementation Template**  
3. **Detailed Implementation Guide**  
   - 3.1: Determining Your Model's Location  
   - 3.2: Creating the YAML Configuration File  
   - 3.3: Implementing the Model Class  
     - 3.3.1: Model Registration  
     - 3.3.2: Class Initialization and Parameter Validation  
     - 3.3.3: Loading the Model  
     - 3.3.4: Generating Responses  
     - 3.3.5: Accessing Metadata  
     - 3.3.6: Auxiliary Methods  
     - 3.3.7: Complete Implementation Example  
4. **Adding a New Modality**  
5. **Contributing to NEST**



By the end of this tutorial, you will be able to plug your model into NEST and use it to generate in silico neural responses to arbitrary stimuli.

---

## 1. Overview

Adding a new model to NEST requires creating two main files:

1. **A Python model file** — implements the model’s logic and interface  
2. **A YAML configuration file** — describes the model’s parameters and behavior

These two files work together to register your model with NEST and make it accessible through the unified interface.

---

## 2. Quick Start: Model Implementation Template

Here's a barebones version of a Python model file that shows only the required structure.  
We will expand on this template throughout the tutorial to build a fully functional model.

You can use this as a starting point and fill in the details specific to your model.


```python
import os

import yaml

from nest.interfaces.base_model import BaseModelInterface
from nest.core.model_registry import register_model

# Load model info from YAML
def load_model_info():
    path = os.path.join(os.path.dirname(__file__), "..", "model_cards", "your_model_id.yaml")
    with open(path, "r") as f:
        return yaml.safe_load(f)

model_info = load_model_info()

register_model(
    model_id=model_info["model_id"],
    module_path="nest.models.your_modality.your_model_file",
    class_name="YourModelClass",
    modality=model_info["modality"],
    dataset=model_info["dataset"],
    yaml_path=os.path.join(os.path.dirname(__file__), "..", "model_cards", "your_model_id.yaml")
)

class YourModelClass(BaseModelInterface):
    def __init__(self, subject, device="auto", nest_dir=None, **kwargs):
        # Set up model parameters and device
        pass

    def load_model(self):
        # Load model weights and prepare for inference
        pass

    def generate_response(self, stimulus, **kwargs):
        # Generate simulated neural responses
        pass

    def get_metadata(self):
        # Return model metadata
        pass

    @classmethod
    def get_model_id(cls):
        # Return the model ID string
        pass

    def cleanup(self):
        # Free up any resources (e.g., GPU memory)
        pass
```

This is the required structure. Each method must be implemented for your model to work with NEST.
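
As a quick sanity check while filling in the template, the sketch below lists which of the required methods a class does not yet define. The helper `missing_methods` and the `REQUIRED_METHODS` tuple are hypothetical names written for this tutorial; they mirror the template above and are not part of the NEST API.

```python
# Hypothetical helper, not part of NEST: reports which template methods
# a model class has not defined yet.
REQUIRED_METHODS = (
    "load_model",
    "generate_response",
    "get_metadata",
    "get_model_id",
    "cleanup",
)


def missing_methods(cls):
    """Return the names in REQUIRED_METHODS that `cls` does not provide as callables."""
    return [
        name for name in REQUIRED_METHODS
        if not callable(getattr(cls, name, None))
    ]


class IncompleteModel:
    """Toy class that only implements part of the interface."""

    def load_model(self):
        pass

    def generate_response(self, stimulus, **kwargs):
        pass


print(missing_methods(IncompleteModel))
# ['get_metadata', 'get_model_id', 'cleanup']
```

The check only looks for a callable attribute with the right name, so it would not detect a method that is merely inherited as a placeholder from a base class.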


# 3. How to Implement a New Model in NEST

This step-by-step guide walks you through the process of implementing a complete model, starting from the barebones template shown above.

We will cover the following topics:

1. **Determining where your model belongs in the NEST structure**
2. **Creating the YAML configuration file**
3. **Implementing the model class with all required components:**
   - Model registration  
   - Initialization and parameter validation  
   - Model loading  
   - Response generation  
   - Metadata access  
   - Resource cleanup  


## 3.1: Where Does Your Model Belong?

First, determine which **modality** your model belongs to:

- If it's an **fMRI model**, add it to:  
  `nest/models/fmri/`

- If it's an **EEG model**, add it to:  
  `nest/models/eeg/`

- If it's a **new modality**, create a new directory:  
  `nest/models/your_new_modality/`

  > **Note**: If you are adding a new modality, there are a few additional considerations. These are discussed in **Section 4** below.

Your corresponding **YAML configuration file** should go here:  
`nest/models/model_cards/your_model_id.yaml`
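
The layout above can be summarized in a small helper. This is only an illustrative sketch: `model_file_locations` is a hypothetical function, and `your_model_file` and `your_model_id` are the same placeholders used elsewhere in this tutorial.

```python
from pathlib import PurePosixPath


def model_file_locations(modality, model_file, model_id):
    """Return the (model file, YAML card) paths relative to the repository root."""
    models_dir = PurePosixPath("nest") / "models"
    model_path = models_dir / modality / f"{model_file}.py"
    card_path = models_dir / "model_cards" / f"{model_id}.yaml"
    return model_path, card_path


model_path, card_path = model_file_locations("fmri", "your_model_file", "your_model_id")
print(model_path)  # nest/models/fmri/your_model_file.py
print(card_path)   # nest/models/model_cards/your_model_id.yaml
```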


---

## 3.2: Creating the YAML Configuration File

In the NEST toolkit, you'll find a template YAML file at: `nest/models/model_cards/template.yaml`


This template serves as a guide for creating your model's configuration.

Your YAML file is **crucial** because it:

- Provides metadata about your model  
- Defines input/output specifications  
- Documents valid parameters and constraints  
- Is used for parameter validation in your model class  
- Generates model cards for end users  

> Be as detailed as possible — this helps others understand how to work with your model.  
> **You can reference existing model YAML files as examples.**

Here's what you need to include:




```yaml
# Template YAML file for NEST model specification
# Replace placeholder values with actual model information

# Basic metadata
model_id: modality_dataset_model_type  # e.g., fmri_nsd_fwrf
modality: modality  # e.g., fmri, eeg, meg, ...
dataset: dataset_name
features: feature_extraction_method
repeats: single/multi  # whether model generates single or multiple repetitions
subject_specific: true/false  # whether model is subject-specific

# General description of the model
description: |
  Provide a concise but informative description of the model, including:
   - What kind of neural responses it generates
   - What dataset it was trained on
   - The basic approach/architecture
   - Any notable characteristics or limitations
   Keep this to 3-5 sentences for readability.

# Input stimulus information
input:
  type: "numpy.ndarray"  # or other appropriate type
  shape: [shape_description]  # e.g., [batch_size, 3, height, width]
  description: "Brief description of input format"
  constraints:
    - "List any constraints on input values"
    - "e.g., value ranges, size requirements, etc."

# Output information
output:
  type: "numpy.ndarray"  # or other appropriate type
  shape: [shape_description]  # e.g., [batch_size, n_voxels]
  description: "Brief description of output format"
  dimensions:
    - name: "dimension_name"
      description: "What this dimension represents"
    - name: "dimension_name"
      description: "What this dimension represents"
    # Add more dimensions as needed

# Model parameters and their usage
parameters:
  # First parameter (typically subject)
  param_name:
    type: param_type  # e.g., int, str, float
    required: true/false
    valid_values: list_of_valid_values  # or range, or omit if not applicable
    default: default_value  # include if there's a default value
    example: example_value
    description: "Description of what this parameter represents"
    function: "Which function uses this parameter, e.g., get_encoding_model, load_model, ..."
  
  # Add more parameters as needed
  param_name:
    type: param_type
    required: true/false
    valid_values: list_of_valid_values  # or range, or omit if not applicable
    default: default_value  # include if there's a default value
    example: example_value
    description: "Description of what this parameter represents"
    function: "Which function uses this parameter"

# Performance metrics (if needed) and references
performance:
  metrics:
    - name: "metric_name"
      value: "metric_value"
      description: "What this metric represents"
    
    # Add more metrics as needed
    - name: "metric_name"
      value: "metric_value"
      description: "What this metric represents"
  
  plots: "URL_to_performance_plots"  # URL or path to visualizations

# Add References here
references:
    - "Citation for your model or dataset"
```
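
Before wiring the card into your model class, it can help to check that nothing essential is missing. The sketch below works on a plain dictionary, as you would get from `yaml.safe_load()`. The function `check_model_card` and the choice of `REQUIRED_KEYS` are assumptions made for illustration; NEST itself may enforce different rules.

```python
# Top-level sections taken from the template above. Treating all of them as
# mandatory is an assumption of this sketch, not a documented NEST rule.
REQUIRED_KEYS = ("model_id", "modality", "dataset", "description",
                 "input", "output", "parameters")


def check_model_card(card):
    """Return a list of human-readable problems found in a parsed model card."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in card]
    for name, spec in card.get("parameters", {}).items():
        if "type" not in spec:
            problems.append(f"parameter '{name}' has no type")
        if "description" not in spec:
            problems.append(f"parameter '{name}' has no description")
    return problems


# A dict shaped like the output of yaml.safe_load() on a small, incomplete card.
card = {
    "model_id": "fmri_nsd_fwrf",
    "modality": "fmri",
    "dataset": "dataset_name",
    "parameters": {"subject": {"type": "int", "required": True}},
}
for problem in check_model_card(card):
    print(problem)
```

For this incomplete card the check reports the missing `description`, `input`, and `output` sections and the undocumented `subject` parameter.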

## 3.3: Implementing the Model Class

Now we'll build the complete model implementation step by step.  
The required functions must be named **exactly as shown** to work with the `BaseModelInterface`.

You are free to add additional helper functions as needed — but the core methods must be implemented.

---


### 3.3.1: Model Registration

First, set up the model registration code that makes your model discoverable by the NEST toolkit.

This code:

1. Loads your model's configuration from the YAML file  
2. Registers your model with the NEST registry, making it discoverable  
3. Specifies the module path, class name, and modality

```python
import os
import yaml
from nest.core.model_registry import register_model


# Load model info from YAML
def load_model_info():
    yaml_path = os.path.join(os.path.dirname(__file__), "..", "model_cards", "your_model_id.yaml")
    with open(os.path.abspath(yaml_path), "r") as f:
        return yaml.safe_load(f)

# Load model_info once at the top
model_info = load_model_info()

# Register this model with the registry using model_info
register_model(
    model_id=model_info["model_id"],
    module_path="nest.models.your_modality.your_model_file",  # Replace with actual path
    class_name="YourModelClass",
    modality=model_info.get("modality", "your_modality"),
    dataset=model_info.get("dataset", "your_dataset"),
    yaml_path=os.path.join(os.path.dirname(__file__), "..", "model_cards", "your_model_id.yaml")
)
```

---

### 3.3.2: Class Initialization and Parameter Validation

Next, define your model class by inheriting from `BaseModelInterface` and implement the initialization logic.

The initialization method:

1. Stores user-provided parameters (e.g., subject ID, device, NEST directory)  
2. Validates parameters against the specifications in the YAML file  
3. Sets up the compute device (CPU or GPU)  
4. Can process additional model-specific parameters through `**kwargs`


```python
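# Imports used by the class (in addition to those from Section 3.3.1)
import numpy as np
import torch
from typing import Any, Dict, Optional

from nest.core.exceptions import InvalidParameterError, ModelLoadError, StimulusError
from nest.interfaces.base_model import BaseModelInterface


class YourModelClass(BaseModelInterface):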
    """
    Your model description here. Explain what this model does, what
    neural responses it generates, and any other important details.
    """
    
    MODEL_ID = model_info["model_id"]
    # Extract any validation info from model_info
    VALID_SUBJECTS = model_info["parameters"]["subject"]["valid_values"]
    
    def __init__(self, subject: int, device: str = "auto", nest_dir: Optional[str] = None, **kwargs):
        """
        Initialize your model with the required parameters.
        
        Parameters
        ----------
        subject : int
            Subject ID for subject-specific models.
        device : str
            Device to run the model on ('cpu', 'cuda', or 'auto').
        nest_dir : str, optional
            Path to the NEST directory.
        **kwargs
            Additional model-specific parameters.
        """
        self.subject = subject
        self.nest_dir = nest_dir
        self.model = None
        self._validate_parameters()
        
        # Select device
        if device == "auto":
            device = "cuda" if torch.cuda.is_available() else "cpu"
        self.device = device
        
        # Store any additional parameters
        # self.your_param = kwargs.get('your_param', default_value)

    def _validate_parameters(self):
        """
        Validate the input parameters against the model specs.
        """
        if self.subject not in self.VALID_SUBJECTS:
            raise InvalidParameterError(
                f"Subject must be one of {self.VALID_SUBJECTS}, got {self.subject}"
            )
        
        # Add any other parameter validation here
```
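
If your model takes several parameters, validating each one by hand becomes repetitive. One possible approach is to check all of them against the `parameters` section of the YAML file in a single loop. The sketch below assumes the spec uses the `required`, `default`, and `valid_values` fields from the template; `validate_against_spec` is a hypothetical helper, and it raises `ValueError` instead of NEST's `InvalidParameterError` only to stay self-contained.

```python
def validate_against_spec(params, spec):
    """Check user parameters against a YAML-style `parameters` section.

    Returns a new dict with defaults filled in; raises ValueError on problems.
    """
    resolved = {}
    for name, rules in spec.items():
        if name in params:
            value = params[name]
        elif "default" in rules:
            value = rules["default"]
        elif rules.get("required", False):
            raise ValueError(f"Missing required parameter '{name}'")
        else:
            continue  # optional parameter without a default: leave it out
        valid = rules.get("valid_values")
        if valid is not None and value not in valid:
            raise ValueError(f"'{name}' must be one of {valid}, got {value!r}")
        resolved[name] = value
    unknown = set(params) - set(spec)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    return resolved


# Illustrative spec; the valid subject IDs are made up for this example.
spec = {
    "subject": {"type": "int", "required": True, "valid_values": [1, 2, 3]},
    "device": {"type": "str", "required": False, "default": "auto"},
}
print(validate_against_spec({"subject": 2}, spec))
# {'subject': 2, 'device': 'auto'}
```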

---
### 3.3.3: Loading the Model

Next, implement the `load_model()` method, which handles loading model weights and preparing the model for inference.

This method:

1. Constructs the file path to your model weights using a consistent directory structure  
2. Loads the model architecture and weights (implementation will vary based on your model type)  
3. Moves the model to the appropriate device (CPU or GPU)  
4. Sets the model to evaluation mode  
5. Stores the loaded model in an instance attribute (e.g., `self.model`) for use by other methods

> Remember to customize the model loading code for your specific architecture and framework (e.g., PyTorch, TensorFlow, etc.).




```python

    def load_model(self) -> None:
        """
        Load model weights and prepare for inference.
        """
        try:
            # Build paths to model weights
            weights_path = os.path.join(
                self.nest_dir,
                'your_path') # Adjust filename format as needed
            
            # Load your model here
            # Example with PyTorch:
            # self.model = YourModelArchitecture()
            # self.model.load_state_dict(torch.load(weights_path, map_location=torch.device(self.device)))
            # self.model.to(self.device)
            # self.model.eval()
            
            print(f"Model loaded on {self.device} for subject {self.subject}")
        
        except Exception as e:
            raise ModelLoadError(f"Failed to load model: {str(e)}")
```
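
Because `nest_dir` defaults to `None`, it can be useful to fail early with a clear message before attempting to load any weights. The following sketch shows one way to do that. `resolve_weights_path` is a hypothetical helper, and the folder layout and file name in the demonstration are invented; use the layout that matches your own model.

```python
import os
import tempfile


def resolve_weights_path(nest_dir, *relative_parts):
    """Join `nest_dir` with a relative path and verify that the file exists."""
    if nest_dir is None:
        raise ValueError("nest_dir must be set before loading model weights")
    weights_path = os.path.join(nest_dir, *relative_parts)
    if not os.path.isfile(weights_path):
        raise FileNotFoundError(f"No weights file found at {weights_path}")
    return weights_path


# Demonstration with a temporary directory standing in for the NEST directory.
# The folder layout and file name below are made up for this example.
with tempfile.TemporaryDirectory() as nest_dir:
    os.makedirs(os.path.join(nest_dir, "demo_model"))
    open(os.path.join(nest_dir, "demo_model", "weights_sub-01.pt"), "w").close()
    found = resolve_weights_path(nest_dir, "demo_model", "weights_sub-01.pt")
    print(os.path.basename(found))  # weights_sub-01.pt
```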


---
### 3.3.4: Generating Responses

The `generate_response()` method is the core of your model: it produces in silico neural responses from input stimuli.

This method:

1. Validates the input stimulus to ensure it meets requirements  
2. Preprocesses the stimulus if needed (e.g., normalization, resizing)  
3. Runs the model inference, typically in batches to manage memory usage  
4. Collects and formats the response data  
5. Returns the in silico neural responses as a NumPy array  

Customize this method based on your model's specific requirements and output format.




```python
    def generate_response(
        self,
        stimulus: np.ndarray,
        **kwargs) -> np.ndarray:
        """
        Generate in silico neural responses for given stimuli.
        
        Parameters
        ----------
        stimulus : np.ndarray
            Input stimulus array. Typically has shape (batch_size, channels, height, width)
            for image stimuli, but requirements vary by model.
        **kwargs
            Additional model-specific parameters for response generation.
        
        Returns
        -------
        np.ndarray
            Simulated neural responses. Shape depends on your model's output.
        """
        # Validate stimulus
        if not isinstance(stimulus, np.ndarray) or len(stimulus.shape) != 4:
            raise StimulusError(
                "Stimulus must be a 4D numpy array (batch, channels, height, width)"
            )
        
        # Preprocess stimulus if needed
        # preprocessed_stimulus = preprocess(stimulus)
        
        # Generate responses
        # with torch.no_grad():
        #     batch_size = 100  # Adjust as needed
        #     responses = []
        #     
        #     for i in range(0, len(stimulus), batch_size):
        #         batch = torch.from_numpy(stimulus[i:i+batch_size]).to(self.device)
        #         output = self.model(batch)
        #         responses.append(output.cpu().numpy())
        #     
        #     all_responses = np.concatenate(responses, axis=0)
        
        # For now, return dummy data with expected shape
        # Replace this with your actual model inference
        dummy_response = np.zeros((stimulus.shape[0], 100))  # Example shape
        
        return dummy_response
```
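
The batching logic in the commented-out loop can be tested on its own, without a model or a GPU. The sketch below uses a hypothetical generator, `batch_slices`, that yields the index ranges for each batch; a plain list stands in for the stimulus array.

```python
def batch_slices(n_items, batch_size):
    """Yield slices that cover range(n_items) in chunks of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, n_items, batch_size):
        yield slice(start, min(start + batch_size, n_items))


# 250 stimuli with a batch size of 100 give two full batches and one partial one.
stimuli = list(range(250))  # stand-in for an array of 250 stimuli
batch_lengths = [len(stimuli[s]) for s in batch_slices(len(stimuli), 100)]
print(batch_lengths)  # [100, 100, 50]
```

The same slices can be applied to a NumPy array, so the final partial batch is handled without special cases.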

---
### 3.3.5: Accessing Metadata

The `get_metadata()` method provides information about the model and its outputs.

This method:

1. Attempts to load metadata from a predefined location  
2. Returns the metadata as a dictionary  
3. Provides basic information if no metadata file is found  

The metadata may include information about voxel indices, channel information, region details, or other model-specific information.




```python
    def get_metadata(self) -> Dict[str, Any]:
        """
        Return metadata about the model and its outputs.
        
        Returns
        -------
        Dict[str, Any]
            Dictionary containing model metadata.
        """
        # Load metadata file if available
        metadata_path = os.path.join(
            self.nest_dir,
            'your_path')
        
        try:
            metadata = np.load(metadata_path, allow_pickle=True).item()
            return metadata
        except Exception as e:
            # If no metadata file exists, return basic info
            return {
                "model_id": self.MODEL_ID,
                "subject": self.subject,
                # Add any other relevant metadata
            }
```


---
### 3.3.6: Auxiliary Methods

Finally, implement these required auxiliary methods.

The `get_model_id()` method:

1. Returns the unique identifier for your model  
2. Is used by the NEST registry to identify the model  

The `cleanup()` method:

1. Releases resources (especially GPU memory)  
2. Is important for preventing memory leaks when working with multiple models  
3. Should clean up any large objects or references




```python
    @classmethod
    def get_model_id(cls) -> str:
        """
        Return the model's unique identifier.
        
        Returns
        -------
        str
            Model ID string from the YAML config.
        """
        return cls.MODEL_ID
    
    def cleanup(self) -> None:
        """
        Release resources (e.g., GPU memory) when finished.
        """
        if hasattr(self, 'model') and self.model is not None:
            # Free GPU memory if using CUDA
            if hasattr(self.model, 'to'):
                self.model.to('cpu')
            
            # Clear references
            self.model = None
            
            # Force CUDA cache clear if available
            if torch.cuda.is_available():
                torch.cuda.empty_cache()
```
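
To make sure `cleanup()` runs even when inference raises an error, one option is to wrap the model lifecycle in a context manager. This is a sketch of that pattern, not a feature of NEST: `loaded_model` is a hypothetical helper, and `StubModel` only records which methods were called.

```python
from contextlib import contextmanager


@contextmanager
def loaded_model(model):
    """Load `model`, hand it to the caller, and always call cleanup() afterwards."""
    model.load_model()
    try:
        yield model
    finally:
        model.cleanup()


class StubModel:
    """Minimal stand-in that records which lifecycle methods were called."""

    def __init__(self):
        self.calls = []

    def load_model(self):
        self.calls.append("load_model")

    def cleanup(self):
        self.calls.append("cleanup")


stub = StubModel()
try:
    with loaded_model(stub):
        raise RuntimeError("simulated failure during inference")
except RuntimeError:
    pass
print(stub.calls)  # ['load_model', 'cleanup']
```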

---
### 3.3.7: Complete Implementation Example

Putting the pieces together, the full model file could look like this:

```python
import os
import numpy as np
import torch
import yaml
from typing import Dict, Any, Optional

from nest.interfaces.base_model import BaseModelInterface
from nest.core.model_registry import register_model
from nest.core.exceptions import ModelLoadError, InvalidParameterError, StimulusError

# Load model info from YAML
def load_model_info():
    yaml_path = os.path.join(os.path.dirname(__file__), "..", "model_cards", "your_model_id.yaml")
    with open(os.path.abspath(yaml_path), "r") as f:
        return yaml.safe_load(f)

# Load model_info once at the top
model_info = load_model_info()

# Register this model with the registry using model_info
register_model(
    model_id=model_info["model_id"],
    module_path="nest.models.your_modality.your_model_file",  # Replace with actual path
    class_name="YourModelClass",
    modality=model_info.get("modality", "your_modality"),
    dataset=model_info.get("dataset", "your_dataset"),
    yaml_path=os.path.join(os.path.dirname(__file__), "..", "model_cards", "your_model_id.yaml")
)


class YourModelClass(BaseModelInterface):
    """
    Your model description here. Explain what this model does, what
    neural responses it generates, and any other important details.
    """

    MODEL_ID = model_info["model_id"]
    # Extract any validation info from model_info
    VALID_SUBJECTS = model_info["parameters"]["subject"]["valid_values"]

    def __init__(self, subject: int, device: str = "auto", nest_dir: Optional[str] = None, **kwargs):
        """
        Initialize your model with the required parameters.

        Parameters
        ----------
        subject : int
            Subject ID for subject-specific models.
        device : str
            Device to run the model on ('cpu', 'cuda', or 'auto').
        nest_dir : str, optional
            Path to the NEST directory.
        **kwargs
            Additional model-specific parameters.
        """
        self.subject = subject
        self.nest_dir = nest_dir
        self.model = None
        self._validate_parameters()

        # Select device
        if device == "auto":
            device = "cuda" if torch.cuda.is_available() else "cpu"
        self.device = device

        # Store any additional parameters
        # self.your_param = kwargs.get('your_param', default_value)

    def _validate_parameters(self):
        """
        Validate the input parameters against the model specs.
        """
        if self.subject not in self.VALID_SUBJECTS:
            raise InvalidParameterError(
                f"Subject must be one of {self.VALID_SUBJECTS}, got {self.subject}"
            )

        # Add any other parameter validation here

    def load_model(self) -> None:
        """
        Load model weights and prepare for inference.
        """
        try:
            # Build paths to model weights
            weights_path = os.path.join(
                self.nest_dir,
                'your_path') # Adjust filename format as needed

            # Load your model here
            # Example with PyTorch:
            # self.model = YourModelArchitecture()
            # self.model.load_state_dict(torch.load(weights_path, map_location=torch.device(self.device)))
            # self.model.to(self.device)
            # self.model.eval()

            print(f"Model loaded on {self.device} for subject {self.subject}")

        except Exception as e:
            raise ModelLoadError(f"Failed to load model: {str(e)}")

    def generate_response(
        self,
        stimulus: np.ndarray,
        **kwargs) -> np.ndarray:
        """
        Generate in silico neural responses for given stimuli.

        Parameters
        ----------
        stimulus : np.ndarray
            Input stimulus array. Typically has shape (batch_size, channels, height, width)
            for image stimuli, but requirements vary by model.
        **kwargs
            Additional model-specific parameters for response generation.

        Returns
        -------
        np.ndarray
            Simulated neural responses. Shape depends on your model's output.
        """
        # Validate stimulus
        if not isinstance(stimulus, np.ndarray) or len(stimulus.shape) != 4:
            raise StimulusError(
                "Stimulus must be a 4D numpy array (batch, channels, height, width)"
            )

        # Preprocess stimulus if needed
        # preprocessed_stimulus = preprocess(stimulus)

        # Generate responses
        # with torch.no_grad():
        #     batch_size = 100  # Adjust as needed
        #     responses = []
        #
        #     for i in range(0, len(stimulus), batch_size):
        #         batch = torch.from_numpy(stimulus[i:i+batch_size]).to(self.device)
        #         output = self.model(batch)
        #         responses.append(output.cpu().numpy())
        #
        #     all_responses = np.concatenate(responses, axis=0)

        # For now, return dummy data with expected shape
        # Replace this with your actual model inference
        dummy_response = np.zeros((stimulus.shape[0], 100))  # Example shape

        return dummy_response

    def get_metadata(self) -> Dict[str, Any]:
        """
        Return metadata about the model and its outputs.

        Returns
        -------
        Dict[str, Any]
            Dictionary containing model metadata.
        """
        # Load metadata file if available
        metadata_path = os.path.join(
                self.nest_dir,
                'your_path') # Adjust filename format as needed

        try:
            metadata = np.load(metadata_path, allow_pickle=True).item()
            return metadata
        except Exception as e:
            # If no metadata file exists, return basic info
            return {
                "model_id": self.MODEL_ID,
                "subject": self.subject,
                # Add any other relevant metadata
            }

    @classmethod
    def get_model_id(cls) -> str:
        """
        Return the model's unique identifier.

        Returns
        -------
        str
            Model ID string from the YAML config.
        """
        return cls.MODEL_ID

    def cleanup(self) -> None:
        """
        Release resources (e.g., GPU memory) when finished.
        """
        if hasattr(self, 'model') and self.model is not None:
            # Free GPU memory if using CUDA
            if hasattr(self.model, 'to'):
                self.model.to('cpu')

            # Clear references
            self.model = None

            # Force CUDA cache clear if available
            if torch.cuda.is_available():
                torch.cuda.empty_cache()
```

---

# 4. How to Add a New Modality

To extend NEST with a new recording modality (e.g., MEG), follow the steps below:

### 1. Create a Folder
Create a new directory under `nest/models/`:
```bash
nest/models/your_modality/
```

### 2. Add Your Model Files
Inside the new folder, include:
- `your_model.py` &mdash; your model implementation.
- `__init__.py` &mdash; register your model by adding:
  ```python
  import nest.models.your_modality.your_model
  ```

### 3. Add a Model Card
Create a YAML configuration file for your model and place it in:
```bash
nest/models/model_cards/your_model_id.yaml
```

### 4. Specify the Modality
In both `your_model.py` and the YAML config file, define the modality name. For example:
```yaml
modality: "your_modality"
```

### 5. Register the Modality
Finally, update `nest/models/__init__.py` to ensure your modality is loaded:
```python
import nest.models.your_modality
```


## Final Directory Structure

```
nest/
├── models/
│   ├── __init__.py
│   ├── fmri/
│   ├── eeg/
│   ├── your_modality/
│   │   ├── __init__.py
│   │   └── your_model.py
│   └── model_cards/
│       └── your_model_id.yaml
```
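
If you prefer to create the skeleton programmatically, a short script can generate the empty files for you. The sketch below is one possible way to do it: `scaffold_modality` is a hypothetical helper, and it is run against a temporary directory here so that nothing in a real repository is touched.

```python
import tempfile
from pathlib import Path


def scaffold_modality(repo_root, modality, model_file, model_id):
    """Create empty placeholder files for a new modality under `repo_root`."""
    models_dir = Path(repo_root) / "nest" / "models"
    modality_dir = models_dir / modality
    cards_dir = models_dir / "model_cards"
    modality_dir.mkdir(parents=True, exist_ok=True)
    cards_dir.mkdir(parents=True, exist_ok=True)
    created = [
        modality_dir / "__init__.py",
        modality_dir / f"{model_file}.py",
        cards_dir / f"{model_id}.yaml",
    ]
    for path in created:
        path.touch(exist_ok=True)  # never overwrites existing content
    return [path.relative_to(repo_root).as_posix() for path in created]


with tempfile.TemporaryDirectory() as tmp:
    created_files = scaffold_modality(tmp, "your_modality", "your_model", "your_model_id")
for name in created_files:
    print(name)
```

The script only creates empty files. The import lines described in steps 2 and 5 still need to be added by hand.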


---

# 5. Contributing to NEST

We warmly welcome all contributions to the NEST toolbox and appreciate every addition that helps grow the community.

If you would like to submit a model, you can use the checklists below to guide your implementation.

## Code Quality
- Include clear **docstrings** for all public methods.
- Add **type hints** to improve code readability.
- Implement **robust error handling** with informative messages.
- Follow existing **NEST naming conventions**.
- Be thorough with your **YAML configuration** and include as much relevant information as possible.
- If available, feel free to add **performance details**.

## Testing
- Test your model with various **input shapes** and **data types**.
- Verify that **error handling** works as expected.
- Check **resource usage** during and after model execution.
- Ensure all required **metadata** is correctly provided.
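
Part of this checklist can be automated with a small smoke test that runs a model through its lifecycle. The sketch below is an illustration under assumptions: `smoke_test` is a hypothetical helper, `DemoModel` is a stand-in for your own class, and the final check assumes that your model stores its weights in `self.model`, as in the template from this tutorial.

```python
def smoke_test(model):
    """Run a model through its lifecycle and return a list of failed checks."""
    failures = []
    model.load_model()
    try:
        if not isinstance(model.get_model_id(), str):
            failures.append("get_model_id() should return a str")
        if not isinstance(model.get_metadata(), dict):
            failures.append("get_metadata() should return a dict")
    finally:
        model.cleanup()
    if getattr(model, "model", None) is not None:
        failures.append("cleanup() should reset self.model to None")
    return failures


class DemoModel:
    """Stand-in model; replace it with an instance of your own class."""

    def __init__(self):
        self.model = None

    def load_model(self):
        self.model = object()

    def get_model_id(self):
        return "demo_model"

    def get_metadata(self):
        return {"model_id": "demo_model"}

    def cleanup(self):
        self.model = None


print(smoke_test(DemoModel()))  # []
```

An empty list means that all checks passed. Tests of `generate_response()` with different input shapes and data types depend on your model's input specification and are best written separately.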

## How to Contribute

If you would like to contribute your model back to NEST:

1. **Fork** the NEST repository.
2. **Create a branch** from the `development` branch.
3. **Add your model** following this tutorial.
4. **Submit a pull request** with:
   - A clear description of your model.
   - Example code showing how to run your model.
   - Any relevant **citations** or **references**.

We look forward to your contributions and are excited to see the creative ways the community expands NEST!

