Make serialization of slendr models to disk optional (for faster msprime coalescent runs) #112

Merged
merged 10 commits into from
Sep 6, 2022

Conversation

@bodkan bodkan commented Sep 5, 2022

This is an attempt to implement #97. Skipping serialization to disk makes it possible to call our msprime back-end Python code directly, without going through the standard Bunch of Files on Disk (TM) format of slendr models (normally needed to call the SLiM back-end script on the command line). This will, in turn, make coalescent simulations even faster, opening up the possibility of developing model-fitting procedures in other projects (that functionality won't be part of slendr).

bodkan commented Sep 5, 2022

Things are looking good so far. The only change needed was to move the back-end msprime simulation code into its own function. compile_model then gained a serialize = TRUE | FALSE argument (default TRUE), which allows skipping the writing of model configuration files to disk. The msprime() function detects this and, instead of reading files, takes the R data frames holding the model configuration in memory (converted to pandas DataFrames) and passes them straight to the msprime Python coalescent script.
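
To make the in-memory versus on-disk dispatch concrete, here is a minimal Python sketch. It is not the slendr back-end code: the helper name load_table, the ".tsv" file layout, and the use of plain lists of dicts in place of pandas DataFrames are all assumptions made for illustration.

```python
import csv
import os


def load_table(name, in_memory=None, model_dir=None):
    """Return a model table as a list of row dicts.

    Hypothetical helper (not part of slendr): prefer the table already held
    in memory, and only fall back to reading <model_dir>/<name>.tsv when no
    in-memory table was supplied.
    """
    if in_memory is not None:
        # Non-serialized model: use the data as given, no disk access.
        return list(in_memory)
    if model_dir is None:
        raise ValueError("need either an in-memory table or a model directory")
    # Serialized model: read the tab-separated file written at compile time.
    path = os.path.join(model_dir, name + ".tsv")
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

The point of the design is that the simulation function only ever sees tables, so it does not need to know whether the model was serialized.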

More unit tests are needed to make sure I didn't miss some combination of missing gene-flow events, no resize events, etc. Any of these would make some of those data.frames NULL/None, which needs to be taken care of accordingly.
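
One way such tests could look, sketched in Python under stated assumptions: summarize_events is a hypothetical stand-in for the back-end function, and the example rows are made up. The loop covers every present/missing combination of two optional event tables.

```python
from itertools import product


def summarize_events(resizes=None, geneflows=None):
    """Count events per table, treating a missing table (None) as empty.

    Hypothetical stand-in for a back-end function that receives optional
    event tables; it is not the actual slendr or msprime code.
    """
    resizes = [] if resizes is None else resizes
    geneflows = [] if geneflows is None else geneflows
    return {"resizes": len(resizes), "geneflows": len(geneflows)}


# Made-up example rows, one event per table.
RESIZES = [{"pop": "A", "N": 50, "time": 10}]
GENEFLOWS = [{"from": "A", "to": "B", "rate": 0.1, "time": 5}]

# Exercise all four combinations of present/missing tables.
results = {}
for r, g in product([None, RESIZES], [None, GENEFLOWS]):
    summary = summarize_events(resizes=r, geneflows=g)
    assert summary["resizes"] == (0 if r is None else len(r))
    assert summary["geneflows"] == (0 if g is None else len(g))
    results[(r is None, g is None)] = summary
```

Normalizing None to an empty table at the top of the function keeps the rest of the code free of special cases.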

@codecov-commenter

Codecov Report

Merging #112 (0ad6407) into main (b0bc2a0) will increase coverage by 0.21%.
The diff coverage is 85.93%.

@@            Coverage Diff             @@
##             main     #112      +/-   ##
==========================================
+ Coverage   82.09%   82.31%   +0.21%     
==========================================
  Files           6        6              
  Lines        2905     2935      +30     
==========================================
+ Hits         2385     2416      +31     
+ Misses        520      519       -1     
| Impacted Files | Coverage Δ |
|---|---|
| R/compilation.R | 89.31% <85.71%> (+0.77%) ⬆️ |
| R/tree-sequences.R | 87.22% <100.00%> (ø) |


@bodkan bodkan merged commit 34898db into main Sep 6, 2022
@bodkan bodkan deleted the optional-serialization branch September 6, 2022 18:13