In [1]:
from IPython.display import Markdown
from fnn import microns

# Download MICrONS digital twin

In [None]:
target_dir = '/groups/saalfeld/saalfeldlab/vijay/fnn/data/microns_digital_twin/params/'

In [3]:
microns.download_data(directory=target_dir)

[2025-08-27T14:30:43Z] [INFO] fnn.microns: Downloading model parameters and metadata to `/workspace/fnn/data/microns_digital_twin/params/`
100%|██████████| 222M/222M [00:14<00:00, 15.4MB/s] 
100%|██████████| 3.69k/3.69k [00:00<00:00, 2.99MB/s]


# Display README

In [4]:
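import os

# Read the README downloaded alongside the model parameters.
# os.path.join works whether or not target_dir ends with a slash.
with open(os.path.join(target_dir, "README.md"), "r") as f:
    content = f.read()

Markdown(content)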

## Overview

This directory contains supporting files for the data release for:

```
Foundation model of neural activity predicts response to new stimulus types  
```
[https://doi.org/10.1038/s41586-025-08829-y](https://doi.org/10.1038/s41586-025-08829-y)

<br>

For technical details of the model, see the Methods section of the paper (in particular, the Neural network architecture, Perspective module, Modulation module, Core module, and Readout module subsections).  

The Core module section of the Methods lists two different types of recurrent cores (Conv-LSTM and CvT-LSTM).  

The recurrent core used here is **CvT-LSTM**.   

For instructions on how to download the model weights from bossDB and use the model, see the [FNN](https://github.com/cajal/fnn) repo.

## Files 

The directory contains two files:

```
    foundational_model_weights_and_metadata_v1.zip
        Output of the export function in the modeling pipeline (`https://github.com/cajal/foundation/blob/nature_v1/foundation/exports/microns.py`): a compressed ZIP archive of files related to the neural network models and scan data.

    readme_v{i}.md
        This file. If downloaded inside the FNN repo, it may be renamed to README.md.
```

## File Organization

`foundational_model_weights_and_metadata_v1.zip` contains the following files:
- params_core.pt
- params_{session}_{scan_idx}.pt
- units.csv
- scans.csv
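
The layout above can be checked programmatically. The following is a minimal sketch, not part of the FNN package: `classify_members` is a hypothetical helper, and it assumes that `{session}` and `{scan_idx}` appear as plain integers in the file names, as in the `params_4_7.pt` example.

```python
import re

# Scan-specific files follow the params_{session}_{scan_idx}.pt convention.
# Assumption: both placeholders are plain integers.
SCAN_PARAMS = re.compile(r"^params_(\d+)_(\d+)\.pt$")


def classify_members(names):
    """Group archive member names into core, per-scan, metadata, and other files."""
    groups = {"core": [], "scans": {}, "metadata": [], "other": []}
    for name in names:
        base = name.rsplit("/", 1)[-1]  # ignore any folder prefix inside the archive
        match = SCAN_PARAMS.match(base)
        if base == "params_core.pt":
            groups["core"].append(name)
        elif match:
            key = (int(match.group(1)), int(match.group(2)))
            groups["scans"][key] = name
        elif base in ("units.csv", "scans.csv"):
            groups["metadata"].append(name)
        else:
            groups["other"].append(name)
    return groups
```

To apply it to the downloaded archive, the member names could be obtained with `zipfile.ZipFile(path_to_zip).namelist()` from the standard library.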

## File Descriptions

### Parameter Files (*.pt)

`params_core.pt`

    A PyTorch state dictionary file containing the foundation core model parameters. 
    
    This file includes only parameters whose keys start with "core." from the original model state dictionary. 
    
    The core parameters are learned by training on the foundation cohort scans. 
    
    Once learned, the core parameters are frozen and transferred to all other scans.

<br>

`params_{session}_{scan_idx}.pt`

    Scan-specific parameter files containing non-core model parameters. 
    
    The parameters in this file include the readout feature weights, which are tuned specifically for each scan of the mouse.

    A scan is identified by `session` and `scan_idx`.
    
    The naming convention follows the pattern params_{session}_{scan_idx}.pt, where:
        {session}           :      The session identifier
        {scan_idx}          :      The scan index number

<br>

**IMPORTANT**

A single model is specified by a combination of `params_core.pt` and `params_{session}_{scan_idx}.pt`. 

For example, the model for session: 4, scan_idx: 7 would be synthesized by combining `params_core.pt` with `params_4_7.pt`.
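
As a rough illustration of this pairing, the sketch below builds the two file names for a scan and merges two state dictionaries. Both helpers (`param_paths`, `merge_state_dicts`) are hypothetical and not part of the FNN package. The sketch assumes the archive has been extracted into a single directory and that the keys in the two files do not overlap, which follows from the core file holding only keys that start with "core." and the scan file holding the non-core parameters. The FNN repo remains the documented way to load a model.

```python
from pathlib import Path


def param_paths(directory, session, scan_idx):
    """Return the (core, scan-specific) parameter file paths for one scan."""
    directory = Path(directory)
    return directory / "params_core.pt", directory / f"params_{session}_{scan_idx}.pt"


def merge_state_dicts(core_params, scan_params):
    """Merge two state dictionaries, refusing to silently overwrite shared keys."""
    overlap = set(core_params) & set(scan_params)
    if overlap:
        raise ValueError(f"keys present in both files: {sorted(overlap)}")
    return {**core_params, **scan_params}


# "params" is a placeholder directory name used only for illustration.
core_path, scan_path = param_paths("params", session=4, scan_idx=7)
print(core_path.name, scan_path.name)  # params_core.pt params_4_7.pt

# Assumption: with PyTorch installed, each file could be read with
# torch.load(path, map_location="cpu") and the results passed to merge_state_dicts.
```

Raising on overlapping keys, rather than letting one file win, makes a mismatched pair of files visible immediately.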

### Data Files (CSV)

These data files contain metadata for each model.

`units.csv`
    
    A comma-separated values file containing unit metadata for all scans. 
    
    The file includes the following columns:

        session (int64)      :      Session identifier, unique.
        scan_idx (int64)     :      Scan index number, unique per session but reused across sessions.
        unit_id (int64)      :      Identifier for each unit, unique per scan but reused across scans, non-consecutive.
        readout_id (int64)   :      Identifier for each unit, unique per scan but reused across scans, consecutive.
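
To make the relationship between the two identifiers concrete, here is a small sketch that builds a `readout_id` to `unit_id` lookup for one scan and checks that the readout ids are consecutive. The helper names are hypothetical, the sample rows are invented for illustration and are not taken from the real `units.csv`, and the header names are assumed to match the column names listed above.

```python
import csv
import io


def readout_to_unit(units_file, session, scan_idx):
    """Map readout_id -> unit_id for one scan, reading rows from an open CSV file."""
    mapping = {}
    for row in csv.DictReader(units_file):
        if int(row["session"]) == session and int(row["scan_idx"]) == scan_idx:
            mapping[int(row["readout_id"])] = int(row["unit_id"])
    return mapping


def is_consecutive(ids):
    """True if the ids form an unbroken integer run (an empty input counts as True)."""
    ids = sorted(ids)
    return not ids or ids == list(range(ids[0], ids[0] + len(ids)))


# Hypothetical rows for illustration only; not real data.
sample = io.StringIO(
    "session,scan_idx,unit_id,readout_id\n"
    "4,7,12,0\n"
    "4,7,15,1\n"
    "4,7,31,2\n"
    "5,3,12,0\n"
)
mapping = readout_to_unit(sample, session=4, scan_idx=7)
print(mapping)                  # {0: 12, 1: 15, 2: 31}
print(is_consecutive(mapping))  # True
```

With the real file, `open(path_to_units_csv, newline="")` would take the place of the in-memory sample.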

<br>

`scans.csv`
    
    A summary file containing metadata for all scan recordings.
    
    The file contains the following columns:

        session (int64)      :      Session identifier, unique.
        scan_idx (int64)     :      Scan index number, unique per session but reused across sessions.
        units (int64)        :      Number of readout_ids in the scan.
        data_id (str)        :      Hash identifying the scan and preprocessing methods in the model training pipeline.
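
Because `units` is described as the number of readout_ids in a scan, the two CSV files can be cross-checked against each other. The sketch below is one way to do that with the standard library. `check_unit_counts` is a hypothetical helper, the sample rows are invented for illustration, and the header names are assumed to match the column names listed above.

```python
import csv
import io
from collections import Counter


def check_unit_counts(units_file, scans_file):
    """Return scans whose `units` value differs from their row count in units.csv.

    The result maps (session, scan_idx) to (rows counted, units reported).
    """
    counts = Counter(
        (int(r["session"]), int(r["scan_idx"])) for r in csv.DictReader(units_file)
    )
    mismatches = {}
    for r in csv.DictReader(scans_file):
        key = (int(r["session"]), int(r["scan_idx"]))
        if counts[key] != int(r["units"]):
            mismatches[key] = (counts[key], int(r["units"]))
    return mismatches


# Hypothetical rows for illustration only; not real data.
units_rows = (
    "session,scan_idx,unit_id,readout_id\n"
    "4,7,12,0\n"
    "4,7,15,1\n"
    "5,3,12,0\n"
)
scans_rows = (
    "session,scan_idx,units,data_id\n"
    "4,7,2,aaaa\n"
    "5,3,1,bbbb\n"
)
print(check_unit_counts(io.StringIO(units_rows), io.StringIO(scans_rows)))  # {}
```

An empty dictionary means every scan listed in `scans.csv` has the expected number of rows in `units.csv`.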
