MRG: make joblib work #44

Merged · 7 commits merged into master on Aug 17, 2019

Conversation

jasmainak (Collaborator) commented:

@blakecaldwell thanks to your PR I think we might be able to get embarrassingly parallel execution working pretty easily! This is a bit WIP but I'm quite positive it will work. More soon ...
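
For context, the "embarrassingly parallel" pattern under discussion is roughly the joblib idiom below. This is a minimal sketch with hypothetical names (`_simulate_trial`, `simulate_trials`), not the exact functions added by this PR:

```python
from joblib import Parallel, delayed


def _simulate_trial(params, trial_idx):
    """Run one independent trial; whatever it returns must be picklable."""
    ...  # build the network and simulate for this trial


def simulate_trials(params, n_trials, n_jobs=1):
    """Run n_trials independent simulations across n_jobs worker processes."""
    return Parallel(n_jobs=n_jobs)(
        delayed(_simulate_trial)(params, idx) for idx in range(n_trials))
```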

codecov-io commented on May 14, 2019:

Codecov Report

Merging #44 into master will decrease coverage by 0.35%.
The diff coverage is 75.4%.


@@            Coverage Diff             @@
##           master      #44      +/-   ##
==========================================
- Coverage   81.87%   81.51%   -0.36%     
==========================================
  Files          14       14              
  Lines        1401     1439      +38     
==========================================
+ Hits         1147     1173      +26     
- Misses        254      266      +12
Impacted Files Coverage Δ
mne_neuron/params.py 79.46% <ø> (-1.24%) ⬇️
mne_neuron/tests/test_compare_hnn.py 100% <100%> (ø) ⬆️
mne_neuron/tests/test_network.py 100% <100%> (ø) ⬆️
mne_neuron/__init__.py 100% <100%> (ø) ⬆️
mne_neuron/feed.py 40.9% <25%> (-0.5%) ⬇️
mne_neuron/network.py 78.8% <33.33%> (-0.48%) ⬇️
mne_neuron/parallel.py 73.07% <66.66%> (-26.93%) ⬇️
mne_neuron/dipole.py 95.96% <95.65%> (-0.15%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 561ed2e...1dd230d.

blakecaldwell (Member) left a comment:

I would be suspicious of NEURON not behaving as you expect (idempotent network instantiation). Tests, as they are currently written, instantiate two Network objects in a single Python process (presumably a single NEURON context). Calling pc.gid_clear() was necessary to avoid a segfault.
32884ad
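
A minimal sketch of the workaround described above, assuming the standard NEURON ParallelContext setup (this is not the exact code from commit 32884ad): clear the gid-to-cell associations left over from a previous Network before building a new one in the same process.

```python
from neuron import h

pc = h.ParallelContext()
pc.gid_clear()  # drop gid associations left over from the previous network
# ... it is now safe to instantiate a second Network in this process
```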

blakecaldwell (Member) commented:

I think moving forward with an embarrassingly parallel option (this PR) and an MPI option makes sense. I may be the only user of the MPI option for a while, but it's how I plan on running optimization with ~600 cores. Dask distributed required development versions and communicates over TCP instead of InfiniBand on Oscar.

So let's continue to look at how both can be used rather than choosing one or the other. From our combined experience, we might be able to develop a scheme more suitable for the general user than what I'm using for short-term research.

jasmainak changed the title from "WIP: make joblib work" to "MRG: make joblib work" on Jun 19, 2019
Quoted code under review:

    The Network object specifying how cells are
    connected.
    if trial_idx != 0:
        params['prng_*'] = trial_idx
jasmainak (Collaborator, Author) commented:

@blakecaldwell does this seem legit to you? For simulating different trials, I just changed the random seed. The way the random seed is changed is different (and simpler) than in HNN -- so the results could be slightly different for n_trials > 1. But is there anything else that needs to change from one trial to the next?

Feel free to fetch this branch locally and try it. I'll wait for your feedback before merging it.
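
For illustration only, the wildcard assignment in the quoted code can be read as setting every `prng_*` seed entry to the trial index; the project's own Params class handles the wildcard itself, and the helper name below is made up:

```python
def _set_trial_seeds(params, trial_idx):
    """Give each trial distinct inputs by resetting every prng_* seed entry."""
    if trial_idx != 0:
        for key in params:
            if key.startswith('prng_'):
                params[key] = trial_idx
    return params
```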

blakecaldwell (Member) commented:

These seem like unnecessary changes at this point. Simplifying schemes like the randomization of inputs appeals to me, but as a practical matter I believe code cleanup should not change the model output.

In PR #55 (which conflicts with this PR) I have already simplified this as far as possible without changing results.
9402e2c#diff-f4f8654c734567d73fec96e0e0dacce5R112

jasmainak (Collaborator, Author) commented on Jun 20, 2019:

sure, although I don't quite understand what is different in the HNN code. It seems to be essentially setting a bunch of seeds -- unless I am missing something? Now you could argue that the particular seeds used by the HNN code are important so that we can compare things. In that case, I think we should set up a test first: save data with 2 or more trials from HNN and compare against the mne-neuron code in the test, and then make changes so that the test passes. That way, I am not afraid of making changes which will potentially break things ...

blakecaldwell (Member) commented:
I fully support a test to compare results from multiple trials. However, creating good tests is time-consuming. Look for a test in the near future as I update the PR.

jasmainak (Collaborator, Author) commented:

sure, if you can save the test data for 2 trials with the default parameters and send it to me, I'll add that as a test. For now, I merged since it does not break anything for a single trial.
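
A sketch of what such a regression test could look like, under hypothetical file names and reusing the `simulate_trials` helper sketched earlier in this thread (this is not the test that was eventually added):

```python
import numpy as np
from numpy.testing import assert_allclose


def test_multiple_trials_match_hnn():
    """Compare per-trial dipole traces against reference data saved from HNN."""
    dipoles = simulate_trials(params, n_trials=2)  # hypothetical helper
    for trial_idx, dpl in enumerate(dipoles):
        reference = np.loadtxt('hnn_dpl_trial%d.txt' % trial_idx)
        assert_allclose(dpl, reference, rtol=1e-8)
```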

jasmainak merged commit 3467f96 into master on Aug 17, 2019
jasmainak (Collaborator, Author) commented:

I'm merging this to move on.

Quoted code under review:

    net.spiketimes = h.Vector()
    net.spikegids = h.Vector()
    net._record_spikes()

blakecaldwell (Member) commented:

Edit: sorry, should have made a new issue. See #78

@jasmainak I'm preparing a PR to add ParallelContext (n_cores > 1, but not compatible with n_jobs > 1). A function similar to _clone_and_simulate will be called, and the net.* functions need to be part of it. Can you explain why these functions need to be separate from Network(params, n_cores)? All of _clone_and_simulate gets pickled, right? Perhaps these net.* calls can be part of an appropriately named method of Network.
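
One way to read the suggested refactor, as a hedged sketch (the method name is made up and the real Network API may differ): fold the three `net.*` calls quoted above into a single Network method so that both the joblib and the ParallelContext code paths can call it.

```python
from neuron import h


class Network:
    # ... existing construction from params ...

    def _reset_spike_recording(self):
        """Create fresh spike containers and attach the spike recorders."""
        self.spiketimes = h.Vector()
        self.spikegids = h.Vector()
        self._record_spikes()
```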
