Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dipole and network (spiking) class write methods #96

Merged
merged 37 commits into from Jun 12, 2020

Conversation

rythorpe
Copy link
Contributor

Two changes have been made: 1) dipole.write() has been fixed to save files according to the user-specified name, and 2) network.write_spikes() has been added so that spiking data for selected trials can be saved to a single .txt file. See the attached example (a modified version of the plot_simulate_evoked.py example) and its corresponding output .txt and .png files that demonstrates reconstruction of the simulated spike event plot using the spike .txt files.
demo.zip

for spike_type,gid_range in self.gid_dict.items():
gidtypes[np.in1d(spikegids,gid_range)] = spike_type

with open(fname,'w') as f:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should use standard functions whenever possible. In this case, I think you could use np.savetxt

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I combined data types (i.e., list of floats and list of strings) which don't easily combine into a single array, the most succinct way to handle the formatting was to use low-level I/O functions. I could potentially convert each list to a formatted string, then convert it to a numpy array, and then save it if you think that'd be better?

Note that HNN originally stored spike.txt files along side param.txt files that stored the gid ranges. I tried to retain the HNN format but also combine all the relevant spiking info into a single file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are proceeding with using hnn-core for HNN. I'd prefer to keep spikes for different trials in separate files, as with HNN. If it's not, I will have to write a function for HNN that reads the single file and then splits into different files. :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is currently setup to take an argument (i.e., trial_idx) that specifies which trials are selected to write to the file, so the write function can be looped over to save all trials separately. The reason for this was to mimic the low-level functionality of the dipole.write() function which also only writes one file at a time. Would it be more helpful if we changed both the dipole.write() and network.write_spikes() functions to output all files with the same naming convention as HNN?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes. This is where a separate Spike (Spikedata) class would be helpful for parity with Dipole: Spikedata.write() and Spikedata.write_all_trials(). That way you can enforce some standard naming (like HNN for better or worse) and don't have the ambiguous case of naming a file with some, but not all trials.

There's more than one way to do it. Just my thoughts. Thanks for your work on this.

Parameters
----------
fname : str
Full path to the output file (.txt)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What motivated the use of a .txt file instead of say an .npz or .mat file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to be consistent with the dipole.write() function and previous HNN I/O convention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since testing new hnn-core output against old hnn output will be important for the foreseeable future, I plead that we leave files the same (.txt). We will have test functions to compare array values after loading from the file, but also being able to do a diff from the command line is useful.

Indices of selected trials. If 'all',
all trials are selected.
"""
if trial_idx=='all':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use None. The problem with 'all' is that string comparison may fail if it's not the right case

@@ -518,3 +518,32 @@ def plot_spikes(self, ax=None, show=True):
if show:
plt.show()
return ax.get_figure()

def write_spikes(self, fname, trial_idx='all'):
"""Write spike times to a file.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you plan to read them back in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out the plot_simulate_evoked.py script in the attached demo.zip folder. Are you thinking we should have a built-in read function as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HNN has the functionality to take a given parameter file, find the spike output in an expected directory and load it. If this functionality is part of hnn-core, HNN can use it. Otherwise, the function will still be part of HNN.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are you in favor of adding a read function in hnn-core or would you rather that be handled by an HNN wrapper?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should add a reader if you're going to have a way to write it :) It's also easier to test that way. You can do a round trip and check if you have what you started with.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since spikes currently exist only as a network class attribute, they either need to be read in through a network class method at/after network initialization or handled as their own object outside of the network class (as with params). In keeping with our desire to make HNN data structures more transparent and tractable (e.g., using .json files for params, json schema for param validation), what do you think of turning spikes into their own object with the following example structure:

{
    <gid_type>: {
        <gid>: [
            <spiketime[0]>,
            <spiketime[1]>,
            ...
            ]
        }
}

We could still maintain the .txt file format convention of HNN output for now, but this would make the spike data more intuitive, modular, and readable into a network instance with proper tests and validation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Humm ... I would keep it simple for now. You can read in a dictionary of lists. The dictionary would contain three keys gid, gid_type and spiketimes. It sounds like this stuff should belong to the _Cell class but I'm sure it's nasty to save it directly and read it back. Let's stick to python built-ins until it becomes apparent if something should go to a separate class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking more, you could probably use a list of NamedTuple. Maybe that simplifies some code, I don't know ... would keep it for another PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, a reader would be helpful. In the future, I like the idea of moving spiketime and spikegids out of the network class. I'm in favor of either keeping things as close to HNN for now or the other extreme of making a new Spike or Spikedata class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The namedtuple class does look super convenient for this - good call. In the short term, I'll just implement a simple read function that validates gids prior to them overwriting the spikes of a network instance (to be compatible with the ranges in network.gid_dict).

Full path to the output file (.txt)
trial_idx : list of int
Indices of selected trials. If 'all',
all trials are selected.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also document what the file actually looks like?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file is formatted in three columns where each row is <spike_time>\t<spike_gid>\t<gid_type>. There is an example in the demo.zip folder. Are you looking for a formal description in the function documentation?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, a formal description in the function docstring


spiketimes = []
spikegids = []
for i in trial_idx:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid use of single letter variable names when possible. They are hard to Ctrl + F

@blakecaldwell
Copy link
Member

Thanks @rythorpe!

@jasmainak
Copy link
Collaborator

@rythorpe also don't forget to update whats_new.rst and api.rst once you have added the reader function.

Indices of selected trials. If None,
all trials are selected.
"""
if trial_idx==None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if trial_idx==None:
if trial_idx is None:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spikegids += self.spikegids[idx]

gidtypes = np.empty_like(spikegids,dtype='<U36')
for spike_type,gid_range in self.gid_dict.items():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need a PEP8 checker. For some reason Travis didn't run on your branch and PEP8 didn't run. Let me rebase your branch so you see the problem.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's weird your branch is up to date. @blakecaldwell any idea why Travis is not running? I saw you do some recent experiments on HNN with Travis.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure... Do you have this repository on your Travis dashboard? I haven't linked it to my account. I've found the Travis dashboard to have useful hints under the "requests" section. If it fails because of a bad .travis.yml, for example, it will indicate it there (doesn't appear to the case here, though).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have it under the dashboard. Maybe @rythorpe just try pushing another commit and it might work ...

@rythorpe
Copy link
Contributor Author

rythorpe commented Apr 2, 2020

I don't know why my latest push hasn't updated the PR yet, but here's a python script that does three round trip tests for spike and dipole read/write functions. I wasn't sure if any of these were appropriate to add as formal tests.

test_write_read.zip

@jasmainak
Copy link
Collaborator

@rythorpe do you see an error when pushing? what's the issue?

@rythorpe
Copy link
Contributor Author

rythorpe commented Apr 5, 2020

@rythorpe do you see an error when pushing? what's the issue?

My latest commit doesn't appear in this thread. Am I missing something?

@jasmainak
Copy link
Collaborator

@rythorpe that was so strange. I'm guessing it was related to this incident. I fixed it for you now by making a new commit, pushing again, and then removing that commit.

@rythorpe rythorpe force-pushed the write_methods branch 2 times, most recently from a0a8b1b to 255f79c Compare April 10, 2020 03:02
@rythorpe
Copy link
Contributor Author

I logged the following error traceback when trying to update the documentation with $ make html from within the hnn-core/doc/ directory. Any idea what is going on?

# Python version: 3.7.7 (CPython)
# Docutils version: 0.16 release
# Jinja2 version: 2.11.1
# Last messages:

# Loaded extensions:
Traceback (most recent call last):
  File "/home/ryan/anaconda3/envs/hnn_core/lib/python3.7/site-packages/sphinx/cmd/build.py", line 275, in build_main
    args.tags, args.verbosity, args.jobs, args.keep_going)
  File "/home/ryan/anaconda3/envs/hnn_core/lib/python3.7/site-packages/sphinx/application.py", line 278, in __init__
    self._init_builder()
  File "/home/ryan/anaconda3/envs/hnn_core/lib/python3.7/site-packages/sphinx/application.py", line 334, in _init_builder
    self.events.emit('builder-inited')
  File "/home/ryan/anaconda3/envs/hnn_core/lib/python3.7/site-packages/sphinx/events.py", line 99, in emit
    results.append(callback(self.app, *args))
  File "/home/ryan/anaconda3/envs/hnn_core/lib/python3.7/site-packages/sphinx_gallery/gen_gallery.py", line 291, in generate_gallery_rst
    gallery_conf = parse_config(app)
  File "/home/ryan/anaconda3/envs/hnn_core/lib/python3.7/site-packages/sphinx_gallery/gen_gallery.py", line 91, in parse_config
    abort_on_example_error, lang, app.builder.name, app)
  File "/home/ryan/anaconda3/envs/hnn_core/lib/python3.7/site-packages/sphinx_gallery/gen_gallery.py", line 215, in _complete_gallery_conf
    "found type %s" % type(backref))
ValueError: The 'backreferences_dir' parameter must be of type str, pathlib.Path or None, found type <class 'bool'>

@jasmainak
Copy link
Collaborator

I think we need to update that in conf.py. There have been some changes in sphinx-gallery ...

@jasmainak
Copy link
Collaborator

You need to rebase @rythorpe ? Are you able to do it yourself?

Also -- I think we need to discuss this PR in person. Maybe on Thursday if you have time?

@rythorpe
Copy link
Contributor Author

rythorpe commented May 5, 2020

You need to rebase @rythorpe ? Are you able to do it yourself?

Also -- I think we need to discuss this PR in person. Maybe on Thursday if you have time?

I can rebase, though I don't understand why? Sure, Thursday works for me.

@jasmainak
Copy link
Collaborator

You need to rebase because there are conflicts now. You edited the same file in both the pull requests.

@jasmainak
Copy link
Collaborator

Please make a back up of your branch before you begin rebasing. It might be a mess if you are trying the first time

@rythorpe
Copy link
Contributor Author

rythorpe commented May 5, 2020

@jasmainak Any idea what is going wrong with the circleci error?

@jasmainak
Copy link
Collaborator

Ignore it for now, I have to configure CircleCI ...

@@ -2,7 +2,7 @@

load_custom_mechanisms()

from .dipole import simulate_dipole
from .dipole import simulate_dipole, import_dipole
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from .dipole import simulate_dipole, import_dipole
from .dipole import simulate_dipole, read_dipole

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also update api.rst

(i.e., as a trial) to the spike-related
attributes of Network instance. If False,
all spikes of the Network instance will
be overwritten.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returns?

@rythorpe
Copy link
Contributor Author

@jasmainak I've added gid_dict validation for Spikes().update_types() so that it checks for overlapping gid ranges. Note that I also shifted read_spikes() to create a Spikes object and then run Spikes().update_types() when supplied with the gid_dict arg (i.e., as opposed to having read_spikes() run its own code to populate the Spikes()._types attribute) to consolidate code. api.rst and whats_new.rst are updated as well.

Comment on lines +642 to +648
times_self = [[round(time, 3) for time in trial]
for trial in self._times]
times_other = [[round(time, 3) for time in trial]
for trial in other._times]
return (times_self == times_other and
self._gids == other._gids and
self._types == other._types)
Copy link
Collaborator

@jasmainak jasmainak Jun 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you did np.allclose(times_self, times_other, atol=1e-3, rtol=0) and self._types == other._types it would do the same thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it in the beginning but didn't care for the fact that np.allclose() uses broadcasting (i.e., it won't control for the case when there has been a formatting discrepancy between self._times and other._times.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

humm ... but you won't have any formatting discrepancy now? Because you do input validation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be any formatting discrepancies due to inputs, but if a bug were to emerge in Network that results in a formatting discrepancy (e.g., if someone were to mess with some square brackets so that the lists append along the first dimension), it would benefit us to have an __eq__() function that is independent. What if we use np.array_equal()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ufff right, the Network class. Okay let's leave it like this for now. We can revisit it later.

@jasmainak
Copy link
Collaborator

@rythorpe just two comments. Many tests added, we're going in the right direction!

LGTM once the two comments are addressed

hnn_core/network.py Outdated Show resolved Hide resolved
hnn_core/network.py Outdated Show resolved Hide resolved
@jasmainak
Copy link
Collaborator

That's it from me, let's try to merge this PR as soon as possible so we can move on :)

Copy link
Collaborator

@cjayb cjayb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two minor comments.

Comment on lines +54 to +58
with tempfile.TemporaryDirectory() as tmp_dir_name:
print(tmp_dir_name)
net.spikes.write(op.join(tmp_dir_name, 'spk_%d.txt'))
spikes = read_spikes(op.join(tmp_dir_name, 'spk_*.txt'))
spikes.plot()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that it's good to demo the use of the write method and read_spikes function. But I wonder if it confuses the new user (likely to look at this example first). The writing to and reading from disk are not needed for plotting. Maybe both? First net.spikes.plot(), then the I/O-bit?

@@ -36,7 +36,7 @@ def test_hnn_core():

# Test spike type counts
spiketype_counts = {}
for spikegid in net.spikegids[0]:
for spikegid in net.spikes.gids[0]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry if this is obvious, but what's the difference between spikes.gids and spikes._gids?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No difference. It's a safety mechanism so that the user doesn't modify spikes.gids.

@cjayb
Copy link
Collaborator

cjayb commented Jun 12, 2020

L really GTM, my coding skills don't warrant deeper nit-picks.

@jasmainak
Copy link
Collaborator

Thanks for taking a look!

@rythorpe the CIs are failing

@rythorpe
Copy link
Contributor Author

@jasmainak it turns out that there was a reason for the nested statement

@rythorpe
Copy link
Contributor Author

@jasmainak last thing, I just added the inputs and outputs section to api.rst as you had recommended earlier

@rythorpe
Copy link
Contributor Author

@jasmainak I think we're good to go. Do you want to move forward with the merge?

@jasmainak
Copy link
Collaborator

Couple of things for future:

Inputs and Outputs should probably be formatted as a subheadheading: https://54-168215891-gh.circle-artifacts.com/0/html/api.html. Also read_params should be under it. We could have another subheading for cells. Anyway, future work :) I'm really done with this PR. Let's merge

@jasmainak jasmainak merged commit 5aa5610 into jonescompneurolab:master Jun 12, 2020
@jasmainak
Copy link
Collaborator

Merged, thanks @rythorpe for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants