Each time a simulation runs, a copy of the fully-collated model is stored. The motivation is to ensure that the simulation is reproducible later. These model snapshots can contain a large amount of redundancy because every part and variable is recorded explicitly. For large-scale models, this will become a problem.
Possible solutions, in descending order of reproducibility, are:
1. Snapshot the individual source models rather than the collated model. To recreate the collated model, treat the snapshot as if it were a stand-alone repo.
2. Snapshot source models to depth n, configured by the user. Deeper levels are drawn from current repos.
3. Assume models from the base repo (or other chosen repos) don't change, but snapshot everything else.
4. Only snapshot the top-level model (depth 1).
5. Don't snapshot. Always use current repos.
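The options above can be summarized as a single policy choice applied while walking the model tree. As a rough sketch (all names here are hypothetical, not actual N2A classes), a snapshot mode could decide which source models get copied and which are left as references into current repos:

```python
from dataclasses import dataclass, field
from enum import Enum

class SnapshotMode(Enum):
    ALL_SOURCES = 1   # option 1: snapshot every source model
    TO_DEPTH_N  = 2   # option 2: snapshot to a user-configured depth
    NON_BASE    = 3   # option 3: assume base-repo models are stable
    TOP_LEVEL   = 4   # option 4: snapshot only the top-level model
    NONE        = 5   # option 5: always use current repos

@dataclass
class Model:
    """Hypothetical stand-in for a source model and its sub-models."""
    name: str
    repo: str
    children: list = field(default_factory=list)

def models_to_snapshot(root, mode, depth=1, stable_repos=("base",)):
    """Walk the model tree (preorder) and collect the names of models
    that must be copied into the snapshot under the given mode."""
    out = []
    def walk(node, level):
        if mode is SnapshotMode.NONE:
            return
        if mode is SnapshotMode.TOP_LEVEL and level > 1:
            return
        if mode is SnapshotMode.TO_DEPTH_N and level > depth:
            return
        if mode is SnapshotMode.NON_BASE and node.repo in stable_repos:
            return
        out.append(node.name)
        for child in node.children:
            walk(child, level + 1)
    walk(root, 1)
    return out
```

For example, under `TOP_LEVEL` only the root model is collected, while `NON_BASE` prunes any subtree rooted in a repo listed as stable. The real merge and inheritance rules are of course richer than a plain tree walk; this only illustrates how the five options differ in what they record.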
How important is reproducibility, and how does it weigh against the cost of snapshotting the model? At a minimum, we need to retrieve the explicit parameters for a given run, which eliminates option 5.
Another issue is memory use for collation. Currently, MPart is used to construct the collated model. However, MPart is designed for active editing, so it carries more information than strictly necessary. We may need a lighter-weight class that simply builds the structure without any ability to backtrack.
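To make the idea concrete, a build-only collator could be little more than a recursive merge into plain dicts, with no parent links, undo stack, or edit metadata. The following is a minimal sketch under simplified assumptions (the `$inherit` key and last-writer-wins override are stand-ins for the real merge rules, and `resolve` is a hypothetical lookup from model name to raw definition):

```python
def collate(model, resolve):
    """Minimal one-pass collation: recursively merge inherited parts
    into a plain dict, keeping no edit history or backtracking state.
    `resolve` maps a model name to its raw {key: value} definition;
    the reserved key "$inherit" names the parent model, and child
    values override inherited ones."""
    raw = resolve(model)
    merged = {}
    parent = raw.get("$inherit")
    if parent:
        # Collate the parent first, then let the child override.
        merged.update(collate(parent, resolve))
    merged.update({k: v for k, v in raw.items() if k != "$inherit"})
    return merged

# Usage sketch with an in-memory "repo":
repo = {
    "HH":     {"V": "-65mV", "g": "0.1"},
    "MyCell": {"$inherit": "HH", "g": "0.2"},
}
collated = collate("MyCell", repo.get)
```

The point is the memory profile: the output is just the merged structure, so peak usage is proportional to the collated model itself rather than to the full editing state MPart carries.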
For huge models (perhaps the whole brain?), we may need to do collation and subsequent compile stages on an HPC system rather than a workstation. (The "size" of a model is determined more by the number of distinct populations than by the number of instances, so a whole-brain model with only 1000 neuron types would still fit easily on a workstation.) MPart may also need to change so it can handle lazy expansion of subtrees during editing.
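Lazy expansion means a node builds its children only on first access, so the editor never pays for subtrees the user hasn't opened. A minimal sketch of the pattern (the `LazyNode` class and `loader` callback are hypothetical, not part of MPart):

```python
class LazyNode:
    """Sketch of lazy subtree expansion: children are constructed only
    on first access and cached afterward, so opening a huge model does
    not force collation of every population up front."""
    def __init__(self, name, loader):
        self.name = name
        self._loader = loader      # callback: node name -> child names
        self._children = None      # None means "not yet expanded"

    @property
    def children(self):
        if self._children is None:
            self._children = [LazyNode(n, self._loader)
                              for n in self._loader(self.name)]
        return self._children
```

Only the nodes actually visited trigger calls to the loader; collapsed subtrees stay as cheap placeholders until the user expands them.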
Implemented options 1, 4 and 5. The user can choose between them with a setting.
Probably won't implement options 2 or 3 unless a clear use-case arises.