
[MRG] fix inconsistent random state assignment between parallel trials #171

Merged
merged 6 commits into jonescompneurolab:master on Sep 15, 2020

Conversation

rythorpe (Contributor)

fixes #136

Running the code here creates this plot with MPIBackend:

[plot: mn_mpi_10trials]

and this plot with JoblibBackend:

[plot: mn_joblib_10trials]

Note that while individual trials are not exactly the same between the two versions (most likely due to an inconsistency in the trajectories of random states), both show 10 distinct trials that emerged from distinct random states. If anyone can readily spot where the discrepancy is occurring, it'd be much appreciated. Otherwise, we might want to address it in another PR since the tests are still passing.

This PR also addresses some of the inconsistent messaging that gets output during a simulation in the terminal.
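For readers without the linked script, here is a minimal sketch of the kind of comparison being run. The context-manager usage of the backends reflects the hnn-core API of this era, but the exact script, parameter file, and process counts are assumptions:

```python
# Hypothetical reproduction of the 10-trial backend comparison above;
# the actual linked script may differ. Assumes the 2020-era hnn-core API.
import os.path as op

import hnn_core
from hnn_core import read_params, Network, simulate_dipole
from hnn_core.parallel_backends import JoblibBackend, MPIBackend

params_fname = op.join(op.dirname(hnn_core.__file__), 'param', 'default.json')
params = read_params(params_fname)
net = Network(params)

# With per-trial random states assigned consistently, each backend should
# produce 10 distinct trials, and the two backends should ideally agree.
with MPIBackend(n_procs=2):
    dpls_mpi = simulate_dipole(net, n_trials=10)
with JoblibBackend(n_jobs=2):
    dpls_joblib = simulate_dipole(net, n_trials=10)
```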

@rythorpe changed the title from "Parallel random states" to "fix inconsistent random state assignment between parallel trials" on Sep 11, 2020

neuron_net = NetworkBuilder(net)
neuron_net.net.trial_idx = trial_idx
Collaborator

Is this equivalent to https://github.com/jonescompneurolab/hnn-core/pull/171/files#diff-ca5f146e027d9dc6ef29ff4db9c21ca2R68? I have a suspicion it is not. Can you check?

Contributor Author

You're right, I messed this up. Fixed in b340fe0.
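For concreteness, a hedged sketch of why the ordering matters, assuming (as the diff above suggests) that NetworkBuilder consumes net.trial_idx in its constructor to seed the external feeds:

```python
# Buggy ordering: NetworkBuilder seeds the feeds at construction time,
# before trial_idx is updated, so every trial reuses the same state.
neuron_net = NetworkBuilder(net)
neuron_net.net.trial_idx = trial_idx  # too late: feeds already seeded

# Fixed ordering: update trial_idx on the Network first, then build,
# so each trial's feeds are seeded from its own index.
net.trial_idx = trial_idx
neuron_net = NetworkBuilder(net)
```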

@@ -353,6 +352,11 @@ def _create_cells_and_feeds(self):
External inputs are not targets.
"""
params = self.net.params
# Re-create external feed param dictionaries
Collaborator

should we add an "XXX" in the comment somewhere? To indicate that this is a hack (right?) that we should get rid of at some point

Contributor Author

Hmm, I'm not sure in this scenario what delineates a hack from a long-term bug fix. We plan to refactor the params and feed modules wrt how feed objects and their respective parameters are passed to Network and NetworkBuilder. Along the way we will end up simplifying the code by e.g. removing the need for calling create_pext() here. That said, I'm happy to update the comment if you think it'd be clearer.

Collaborator

yeah, I agree it might go away, but just leaving an XXX tells future contributors that it's a hack that needs to be fixed.
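For instance, the comment in question could be tagged along these lines (the wording here is illustrative, not the exact comment that was merged):

```python
# XXX: re-creating the external feed param dictionaries here is a
# workaround; remove the call to create_pext() once the planned
# params/feeds refactor makes it unnecessary (see discussion in #171).
```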

@jasmainak (Collaborator)

I have a hunch that digging into this might solve your issue. Then, ideally, we can add a tiny test to check that MPI vs. Joblib gives the same result for 2 trials (with a 3 x 3 network to make it run fast). If it doesn't work, I am fine merging the PR as is if it's good by @blakecaldwell

do make sure to update whats_new.rst. I see that you found a fun way to end your week ;-)

@codecov-commenter commented Sep 14, 2020

Codecov Report

Merging #171 into master will increase coverage by 0.88%.
The diff coverage is 35.71%.


@@            Coverage Diff             @@
##           master     #171      +/-   ##
==========================================
+ Coverage   67.61%   68.50%   +0.88%     
==========================================
  Files          19       19              
  Lines        1930     1959      +29     
==========================================
+ Hits         1305     1342      +37     
+ Misses        625      617       -8     
Impacted Files | Coverage | Δ
--- | --- | ---
hnn_core/mpi_child.py | 0.00% <0.00%> | (ø)
hnn_core/parallel_backends.py | 16.41% <0.00%> | (ø)
hnn_core/network_builder.py | 91.74% <100.00%> | (+3.16%) ⬆️
hnn_core/cell.py | 69.23% <0.00%> | (-5.66%) ⬇️
hnn_core/pyramidal.py | 98.32% <0.00%> | (-0.10%) ⬇️
hnn_core/basket.py | 100.00% <0.00%> | (ø)
hnn_core/tests/test_cell.py | 100.00% <0.00%> | (ø)
hnn_core/tests/test_network.py | 100.00% <0.00%> | (ø)
hnn_core/params.py | 61.37% <0.00%> | (+6.20%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 04ab8bb...b340fe0. Read the comment docs.

@rythorpe (Contributor Author)

Update with b340fe0: the script that generated the plots shown at the beginning of this PR now produces the same results for the Joblib and MPI backends when running the default param set:

[plot: mn_mpi_10trials_02]
[plot: mn_joblib_10trials_02]

@jasmainak (Collaborator)

Amazing! Do you think we could add a tiny test?

  • perhaps one test that checks that the two trials are different
  • and another that checks that the results are consistent between joblib and MPI?

you can use a small network, e.g., params.update({"N_pyr_x": 3, "N_pyr_y": 3}), and maybe a shorter tstop to make it run fast? (a sketch of such tests follows below)

also don't forget to update whats_new.rst!
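A hedged sketch of what those two suggested tests might look like. The imports, the n_trials argument, and the Dipole.data['agg'] access follow the hnn-core API of this era, but the helper and test names here are assumptions, not the tests that were actually merged in b8f5d1d:

```python
import os.path as op

import numpy as np

import hnn_core
from hnn_core import read_params, Network, simulate_dipole
from hnn_core.parallel_backends import JoblibBackend, MPIBackend


def _small_net_params():
    """Shrink the network and simulation so the tests run fast."""
    params_fname = op.join(op.dirname(hnn_core.__file__), 'param',
                           'default.json')
    params = read_params(params_fname)
    params.update({'N_pyr_x': 3, 'N_pyr_y': 3, 'tstop': 40.})
    return params


def test_parallel_trials_distinct():
    """Consecutive trials should come from distinct random states."""
    with JoblibBackend(n_jobs=1):
        dpls = simulate_dipole(Network(_small_net_params()), n_trials=2)
    assert not np.allclose(dpls[0].data['agg'], dpls[1].data['agg'])


def test_backends_consistent():
    """Joblib and MPI backends should produce identical trials."""
    with JoblibBackend(n_jobs=1):
        dpls_joblib = simulate_dipole(Network(_small_net_params()),
                                      n_trials=2)
    with MPIBackend(n_procs=2):
        dpls_mpi = simulate_dipole(Network(_small_net_params()),
                                   n_trials=2)
    for d_j, d_m in zip(dpls_joblib, dpls_mpi):
        assert np.allclose(d_j.data['agg'], d_m.data['agg'])
```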

@rythorpe (Contributor Author)

I've updated whats_new.rst and added your suggested tests @jasmainak. I also ended up rearranging some of the test functions in test_compare_hnn.py in b8f5d1d in order to limit redundant code. Take a look and let me know if the changes are clear and acceptable.

@rythorpe changed the title from "fix inconsistent random state assignment between parallel trials" to "[MRG] fix inconsistent random state assignment between parallel trials" on Sep 15, 2020
@jasmainak merged commit 073ef6c into jonescompneurolab:master on Sep 15, 2020
@jasmainak (Collaborator)

Fantastic, thanks for making the changes. PR looks good to me, merged!

Successfully merging this pull request may close these issues.

discrepancy between simulation trial variability in HNN vs hnn-core