
[MRG] fix inconsistent random state assignment between parallel trials #171

Merged
merged 6 commits into jonescompneurolab:master on Sep 15, 2020

Conversation

rythorpe (Contributor)

fixes #136

Running the code here creates this plot with MPIBackend:

[plot: mn_mpi_10trials]

and this plot with JoblibBackend:

[plot: mn_joblib_10trials]

Note that while individual trials are not exactly the same between the two versions (most likely due to an inconsistency in the trajectories of random states), both show 10 distinct trials that emerged from distinct random states. If anyone can readily spot where the discrepancy is occurring, it'd be much appreciated. Otherwise, we might want to address it in another PR since the tests are still passing.

This PR also addresses some of the inconsistent messaging that gets output during a simulation in the terminal.
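For readers without the linked script, here is a minimal sketch of the kind of comparison being run. The context-manager usage of the backends reflects the hnn-core API of this era, but the exact script, parameter file, and process counts are assumptions:

```python
# Hypothetical reproduction of the 10-trial backend comparison above;
# the actual linked script may differ. Assumes the 2020-era hnn-core API.
import os.path as op

import hnn_core
from hnn_core import read_params, Network, simulate_dipole
from hnn_core.parallel_backends import JoblibBackend, MPIBackend

params_fname = op.join(op.dirname(hnn_core.__file__), 'param', 'default.json')
params = read_params(params_fname)
net = Network(params)

# With per-trial random states assigned consistently, each backend should
# produce 10 distinct trials, and the two backends should ideally agree.
with MPIBackend(n_procs=2):
    dpls_mpi = simulate_dipole(net, n_trials=10)
with JoblibBackend(n_jobs=2):
    dpls_joblib = simulate_dipole(net, n_trials=10)
```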

@rythorpe changed the title from "Parallel random states" to "fix inconsistent random state assignment between parallel trials" on Sep 11, 2020

neuron_net = NetworkBuilder(net)
neuron_net.net.trial_idx = trial_idx
Collaborator

Is this equivalent to https://github.com/jonescompneurolab/hnn-core/pull/171/files#diff-ca5f146e027d9dc6ef29ff4db9c21ca2R68? I have a suspicion it is not. Can you check?

Contributor Author

You're right, I messed this up. Fixed in b340fe0.
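For concreteness, a hedged sketch of why the ordering matters, assuming (as the diff above suggests) that NetworkBuilder consumes net.trial_idx in its constructor to seed the external feeds:

```python
# Buggy ordering: NetworkBuilder seeds the feeds at construction time,
# before trial_idx is updated, so every trial reuses the same state.
neuron_net = NetworkBuilder(net)
neuron_net.net.trial_idx = trial_idx  # too late: feeds already seeded

# Fixed ordering: update trial_idx on the Network first, then build,
# so each trial's feeds are seeded from its own index.
net.trial_idx = trial_idx
neuron_net = NetworkBuilder(net)
```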

@@ -353,6 +352,11 @@ def _create_cells_and_feeds(self):
External inputs are not targets.
"""
params = self.net.params
# Re-create external feed param dictionaries
Collaborator

should we add an "XXX" in the comment somewhere? To indicate that this is a hack (right?) that we should get rid of at some point

Contributor Author

Hmm, I'm not sure in this scenario what delineates a hack from a long-term bug fix. We plan to refactor the params and feed modules wrt how feed objects and their respective parameters are passed to Network and NetworkBuilder. Along the way we will end up simplifying the code by e.g. removing the need for calling create_pext() here. That said, I'm happy to update the comment if you think it'd be clearer.

Collaborator

yeah, I agree it might go away, but just leaving an XXX tells future contributors that it's a hack that needs to be fixed.
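For instance, the comment in question could be tagged along these lines (the wording here is illustrative, not the exact comment that was merged):

```python
# XXX: re-creating the external feed param dictionaries here is a
# workaround; remove the call to create_pext() once the planned
# params/feeds refactor makes it unnecessary (see discussion in #171).
```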

@jasmainak (Collaborator)

I have a hunch that digging into this might solve your issue. Then, ideally, we can add a tiny test to check that MPI vs. Joblib gives the same result for 2 trials (with a 3 x 3 network to make it run fast). If it doesn't work, I am fine merging the PR as is if it's good by @blakecaldwell

do make sure to update whats_new.rst. I see that you found a fun way to end your week ;-)

@codecov-commenter commented Sep 14, 2020

Codecov Report

Merging #171 into master will increase coverage by 0.88%.
The diff coverage is 35.71%.


@@            Coverage Diff             @@
##           master     #171      +/-   ##
==========================================
+ Coverage   67.61%   68.50%   +0.88%     
==========================================
  Files          19       19              
  Lines        1930     1959      +29     
==========================================
+ Hits         1305     1342      +37     
+ Misses        625      617       -8     
Impacted Files | Coverage | Δ
--- | --- | ---
hnn_core/mpi_child.py | 0.00% <0.00%> | (ø)
hnn_core/parallel_backends.py | 16.41% <0.00%> | (ø)
hnn_core/network_builder.py | 91.74% <100.00%> | (+3.16%) ⬆️
hnn_core/cell.py | 69.23% <0.00%> | (-5.66%) ⬇️
hnn_core/pyramidal.py | 98.32% <0.00%> | (-0.10%) ⬇️
hnn_core/basket.py | 100.00% <0.00%> | (ø)
hnn_core/tests/test_cell.py | 100.00% <0.00%> | (ø)
hnn_core/tests/test_network.py | 100.00% <0.00%> | (ø)
hnn_core/params.py | 61.37% <0.00%> | (+6.20%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 04ab8bb...b340fe0. Read the comment docs.

@rythorpe (Contributor Author)

Update with b340fe0: the script that generated the plots shown at the beginning of this PR now produces the same results for the Joblib and MPI backends when running the default param set:

[plot: mn_mpi_10trials_02]
[plot: mn_joblib_10trials_02]

@jasmainak (Collaborator)

Amazing! Do you think we could add a tiny test?

  • perhaps one test that checks that the two trials are different
  • and another that checks that the results are consistent between joblib and MPI?

you can use a small network, e.g., params.update({"N_pyr_x": 3, "N_pyr_y": 3}), and maybe a shorter tstop to make it run fast? (a sketch of such tests follows below)

also don't forget to update whats_new.rst!
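A hedged sketch of what those two suggested tests might look like. The imports, the n_trials argument, and the Dipole.data['agg'] access follow the hnn-core API of this era, but the helper and test names here are assumptions, not the tests that were actually merged in b8f5d1d:

```python
import os.path as op

import numpy as np

import hnn_core
from hnn_core import read_params, Network, simulate_dipole
from hnn_core.parallel_backends import JoblibBackend, MPIBackend


def _small_net_params():
    """Shrink the network and simulation so the tests run fast."""
    params_fname = op.join(op.dirname(hnn_core.__file__), 'param',
                           'default.json')
    params = read_params(params_fname)
    params.update({'N_pyr_x': 3, 'N_pyr_y': 3, 'tstop': 40.})
    return params


def test_parallel_trials_distinct():
    """Consecutive trials should come from distinct random states."""
    with JoblibBackend(n_jobs=1):
        dpls = simulate_dipole(Network(_small_net_params()), n_trials=2)
    assert not np.allclose(dpls[0].data['agg'], dpls[1].data['agg'])


def test_backends_consistent():
    """Joblib and MPI backends should produce identical trials."""
    with JoblibBackend(n_jobs=1):
        dpls_joblib = simulate_dipole(Network(_small_net_params()),
                                      n_trials=2)
    with MPIBackend(n_procs=2):
        dpls_mpi = simulate_dipole(Network(_small_net_params()),
                                   n_trials=2)
    for d_j, d_m in zip(dpls_joblib, dpls_mpi):
        assert np.allclose(d_j.data['agg'], d_m.data['agg'])
```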

@rythorpe (Contributor Author)

I've updated whats_new.rst and added your suggested tests @jasmainak. I also ended up rearranging some of the test functions in test_compare_hnn.py in b8f5d1d in order to limit redundant code. Take a look and let me know if the changes are clear and acceptable.

@rythorpe changed the title from "fix inconsistent random state assignment between parallel trials" to "[MRG] fix inconsistent random state assignment between parallel trials" on Sep 15, 2020
@jasmainak merged commit 073ef6c into jonescompneurolab:master on Sep 15, 2020
@jasmainak (Collaborator)

Fantastic, thanks for making the changes. PR looks good to me, merged!

Successfully merging this pull request may close these issues.

discrepancy between simulation trial variability in HNN vs hnn-core