Reduce size of model snapshots #116

Closed
frothga opened this issue May 27, 2021 · 1 comment

frothga commented May 27, 2021

Each time a simulation runs, a copy of the fully-collated model is stored. The motivation is to ensure that the simulation is reproducible later. These model snapshots can contain a large amount of redundancy, because every part and variable is recorded explicitly. For large-scale models, this redundancy will become a storage problem.

Possible solutions, in descending order of reproducibility, are:

  1. Snapshot the individual source models rather than the collated model. To recreate the collated model, treat the snapshot as if it were a stand-alone repo.
  2. Snapshot source models to depth n, configured by user. Deeper levels will be drawn from current repos.
  3. Assume models from base repo (or other chosen repos) don't change, but snapshot everything else.
  4. Only snapshot the top-level model (depth 1).
  5. Don't snapshot. Always use current repos.

How important is reproducibility? How does that weigh against the cost of snapshotting the model? At a minimum, we need to retrieve the explicit parameters for a given run, which eliminates option 5.
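For concreteness, a user-selectable strategy covering options 1, 4, and 5 might look something like the sketch below. All names here (the `Snapshot` class, the `Mode` enum, the serialized-source map) are hypothetical illustrations, not the actual N2A code; the point is just how little mechanism the setting requires.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical sketch, not the actual N2A API.
public class Snapshot
{
    public enum Mode
    {
        FULL_SOURCES,  // option 1: save every source model the run depends on
        TOP_LEVEL,     // option 4: save only the top-level model (depth 1)
        NONE           // option 5: save nothing; always use current repos
    }

    // "sources" maps each model name to its serialized text;
    // "topLevel" names the entry-point model of the run.
    public static void save (Map<String,String> sources, String topLevel, Mode mode, Path jobDir) throws IOException
    {
        switch (mode)
        {
            case NONE:  // reproducibility depends entirely on current repo state
                return;
            case TOP_LEVEL:
                Files.writeString (jobDir.resolve ("model"), sources.get (topLevel));
                return;
            case FULL_SOURCES:
                Path dir = Files.createDirectories (jobDir.resolve ("snapshot"));
                for (Map.Entry<String,String> e : sources.entrySet ())
                    Files.writeString (dir.resolve (e.getKey ()), e.getValue ());
        }
    }
}
```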

Another issue is memory use for collation. Currently, MPart is used to construct the collated model. However, MPart is designed for active editing, so it carries more information than strictly necessary. We may need a lighter-weight class that simply builds the structure without any ability to backtrack.
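A minimal sketch of what such a collate-only class could look like, assuming a node structure with an `$inherit` key for resolving parents. Names like `LightCollator`, `Node`, and `Repo.lookup` are made up for illustration; the real classes differ. Because the merge is one-way, nothing is retained for undo or backtracking, which is exactly the memory MPart spends.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: flatten inheritance in one pass, keeping no
// per-node editing bookkeeping. Assumes no inheritance cycles.
public class LightCollator
{
    public interface Repo
    {
        Node lookup (String name);  // fetch a named source model
    }

    public static class Node
    {
        public Map<String,String> values   = new HashMap<> ();
        public Map<String,Node>   children = new HashMap<> ();
    }

    // Produce a flattened copy of "model", resolving "$inherit" recursively.
    public static Node collate (Node model, Repo repo)
    {
        Node result = new Node ();
        String inherit = model.values.get ("$inherit");
        if (inherit != null)
        {
            Node parent = repo.lookup (inherit);
            if (parent != null) merge (result, collate (parent, repo));
        }
        merge (result, model);  // child values override inherited ones
        for (Map.Entry<String,Node> e : result.children.entrySet ())
            e.setValue (collate (e.getValue (), repo));  // resolve inheritance inside sub-parts
        return result;
    }

    private static void merge (Node target, Node source)
    {
        target.values.putAll (source.values);
        for (Map.Entry<String,Node> e : source.children.entrySet ())
        {
            Node t = target.children.computeIfAbsent (e.getKey (), k -> new Node ());
            merge (t, e.getValue ());
        }
    }
}
```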

For huge models (perhaps the whole brain?), we may need to do collation and subsequent compile stages on an HPC system rather than a workstation. (The "size" of a model is more about the number of distinct populations than the number of instances, so a whole-brain model with only 1000 neuron types would still fit easily on a workstation.) MPart may also need to change so it can handle lazy expansion of subtrees during editing.
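A rough sketch of what lazy expansion could mean, assuming a node that only materializes its collated children on first access (all names hypothetical):

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch of lazy subtree expansion: children are not
// collated until one is actually requested, so an editor can open a
// huge model without paying for the whole tree up front.
public class LazyNode
{
    private final Supplier<Map<String,LazyNode>> loader;  // builds children on demand
    private Map<String,LazyNode> children;                // null until expanded

    public LazyNode (Supplier<Map<String,LazyNode>> loader)
    {
        this.loader = loader;
    }

    public LazyNode child (String key)
    {
        if (children == null) children = loader.get ();  // expand on first access
        return children.get (key);
    }
}
```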

@frothga frothga added this to the Release 1.2 milestone Nov 20, 2021

frothga commented Nov 20, 2021

Implemented options 1, 4, and 5. The user can choose among them with a setting.
Probably won't implement options 2 or 3 unless a clear use case arises.

@frothga frothga closed this as completed Nov 20, 2021