[MRG] Add ability to modify cell-cell connectivity #276

Merged
merged 65 commits into from Mar 25, 2021

Conversation

ntolley
Contributor

@ntolley ntolley commented Mar 4, 2021

Closes #169. Mostly declaring my intention to work on this issue.

For the API, I propose the first two parameters be lists of src_gids and target_gids. The remaining parameters defining each synaptic connection will be arrays of shape (n_src_gids, n_target_gids). At the moment this will require defining 5 matrices for the synaptic weight, receptor, location, delay, and space constant.

Down the line I can add helper functions that will generate these matrices for the default network. This will allow users to edit the existing connections rather than starting from scratch.

While I can see that defining these matrices as a user would be extremely unwieldy, I think it comes with the territory of manipulating network connectivity. Once we have the basic functionality down, I am sure we can brainstorm several quality-of-life features that will simplify the process of defining connectivity. Definitely interested in everyone's feedback and suggestions as I dig deeper into this problem.
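As a rough sketch of what that matrix-based proposal might look like (all names, gids, and values here are illustrative assumptions, not actual hnn-core code):

```python
import numpy as np

# Hypothetical example: 2 source gids projecting to 3 target gids.
src_gids = [0, 1]            # e.g. two L2 pyramidal cells
target_gids = [35, 36, 37]   # e.g. three L5 pyramidal cells

n_src, n_target = len(src_gids), len(target_gids)

# One (n_src, n_target) matrix per synaptic parameter, per the proposal.
weight = np.full((n_src, n_target), 5e-4)
delay = np.full((n_src, n_target), 1.0)
lamtha = np.full((n_src, n_target), 3.0)  # space constant
receptor = np.full((n_src, n_target), 'ampa', dtype=object)
location = np.full((n_src, n_target), 'proximal', dtype=object)

# A hypothetical call mirroring the parameter order proposed above:
# net.connect_cells(src_gids, target_gids, weight, receptor, location,
#                   delay, lamtha)
```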

@jasmainak
Collaborator

This is great!! Can you share a few lines of example code to demo the API?

@jasmainak
Collaborator

Another possibility is to do an N x 3 array or a list of tuples ...

[(1, 1, 1e-6),
  (1, 2, 1e-3),
 ....
]

and so on

in the format

(src_gid, dest_gid, weight)
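For illustration, converting a dense weight matrix into that N x 3 list could look something like this (a sketch only; `matrix_to_tuples` is a hypothetical helper, not part of the codebase):

```python
import numpy as np

def matrix_to_tuples(weights, src_gids, target_gids):
    """Convert a (n_src, n_target) weight matrix to (src_gid, dest_gid, weight)
    tuples. Zero entries are treated as 'no connection' and skipped."""
    conns = []
    for i, src in enumerate(src_gids):
        for j, dest in enumerate(target_gids):
            w = weights[i, j]
            if w != 0:
                conns.append((src, dest, w))
    return conns

weights = np.array([[1e-6, 1e-3],
                    [0.0, 2e-3]])
conns = matrix_to_tuples(weights, src_gids=[1, 2], target_gids=[1, 2])
# → [(1, 1, 1e-06), (1, 2, 0.001), (2, 2, 0.002)]
```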

hnn_core/network.py Outdated
@jasmainak jasmainak added this to the 0.2 milestone Mar 5, 2021
@ntolley
Contributor Author

ntolley commented Mar 5, 2021

Another possibility is to do an N x 3 array or list of tuple ...

[(1, 1, 1e-6),
  (1, 2, 1e-3),

I like this a lot! I'll give a pass at implementing this structure under the hood. From the user end it should be fairly straightforward to convert matrices into this format, so the option to use either format could be available.

@jasmainak
Collaborator

so the option to use either format could be available.

maybe, but KISS as much as you can. Whichever is the simplest option, just implement that one. Also, I'd suggest starting by writing the test first (that will fail) if you can and working your way to making the test work. So you do "test-driven development" :)

@cjayb
Collaborator

cjayb commented Mar 5, 2021

So cool to get this started! I've mentioned before that we should be careful not to re-invent the wheel. Something like NeuroML might be useful here, or is it overkill? Even so, the structure might be valuable to understand, especially when talking about extending HNN via NetPyNE or other means.

@jasmainak
Collaborator

jasmainak commented Mar 5, 2021

Netpyne (don't even get me started here ...) and NeuroML just feel like overengineering to me to be honest. We don't even know at this point if these connectivity matrices can be resolved with MEG. Let's not forget that we want to model source-localized MEG signals and not be a generic simulator. I'd add functionality on a use-case basis and in the future incorporate readers for NeuroML/Netpyne if it's appropriate

@cjayb
Collaborator

cjayb commented Mar 5, 2021

I’m pretty sure our views on over-engineering are similar... My point was just that an XML-based syntax might already exist that we could use (perhaps a subset of), rather than creating ad hoc tuples. But I’m +2 on KISS, and whatever makes most sense when actually implementing this stuff!

@jasmainak
Collaborator

jasmainak commented Mar 5, 2021

My point was just that an XML-based syntax might already exist that we could use (perhaps a subset of), rather than creating ad hoc tuples.

okay fair enough. We could look into these for potential inspiration. The tuples data structure was loosely inspired by the .geo file format where they do something like "connect src dest". But whatever the data format in the API (matrices, tuple or any other format), internally we should use the data structure that gives the simplest implementation

@codecov-io

codecov-io commented Mar 5, 2021

Codecov Report

Merging #276 (c97936d) into master (33d7044) will increase coverage by 0.39%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #276      +/-   ##
==========================================
+ Coverage   91.43%   91.83%   +0.39%     
==========================================
  Files          14       14              
  Lines        2137     2228      +91     
==========================================
+ Hits         1954     2046      +92     
+ Misses        183      182       -1     
Impacted Files Coverage Δ
hnn_core/network.py 90.75% <100.00%> (+3.41%) ⬆️
hnn_core/network_builder.py 91.89% <100.00%> (-1.29%) ⬇️
hnn_core/parallel_backends.py 91.94% <0.00%> (ø)
hnn_core/params.py 88.97% <0.00%> (+0.39%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ntolley
Contributor Author

ntolley commented Mar 5, 2021

Absolute miracle the CI's are passing after ravaging the code base like that...

I've cleaned up _connect_celltypes() a bit so that it accepts a list of sources and a list of targets. In its current form, the function proceeds to construct an "all to all" network between the src and target gids.

This is a pretty intermediate step, but I think it will make the subsequent changes go smoother. For the next modification I think the cleanest format will be passing the proposed N x 3 tuple. After that we can focus on the API, which will boil down to convenience functions that convert network definitions (with whatever format we decide) into the N x 3 tuples.

@jasmainak
Collaborator

sounds like a great plan to me ! see how proper testing can make you confident when changing the code ;-)

@ntolley
Contributor Author

ntolley commented Mar 6, 2021

_connect_celltypes() now accepts a variable known as connectivity which is a list containing connection information. It has the form [[src_type, src_gid, target_type, target_gid], ...], an example of one element would be ['L2Pyr', 0, 'L5Pyr', 45]

To create the default network with all to all connectivity between cell classes, I created the function _all_to_all_connect() which simply takes in gid lists and converts them into the format described above. The unique and allow_autapses options were moved there as well to make _connect_celltypes() more flexible.

To @jasmainak's suggestion of keeping things simple, these modifications allow for a very straightforward definition of binary connections. If connectivity doesn't have the connection, it simply isn't instantiated.

The inner items of connectivity could easily be expanded to include extra details like loc and receptor, but I'm actually pretty fond of its current state. Extremely fine-grained control over connections and synaptic properties can still be achieved with repeated calls to _connect_celltypes().

@ntolley
Contributor Author

ntolley commented Mar 6, 2021

Definitely interested to hear everyone's thoughts on this. If the general structure of connectivity seems reasonable, it will be pretty quick to write an API that allows users to define the list manually. Then we can discuss alternative formats that get converted to the connectivity format internally.

@jasmainak
Collaborator

looks great so far! How can we test that it works? I'd worry about the API a bit later.

Make it work, make it nice, make it fast ....

that's the sequence of operations

can we update a test? or an example? even with imperfect API

hnn_core/network.py Outdated
Comment on lines 756 to 757
def connect_cells(self, src_gids, target_gids, weight, receptor, location,
delay, space_constant):
Collaborator

I'm almost tempted to suggest that this should be in the constructor of Network ... but not sure. Let's think of the API once we know it works

src_type, src_gid = conn['src_type'], conn['src_gid']
target_type, target_gid = conn['target_type'], conn['target_gid']
target_cell = self.cells[target_filter[target_gid]]
connection_name = f'{_short_name(src_type)}_'\
Collaborator

Is there a strong reason we keep on using _short_name? It's hard to keep track of which you're supposed to use when... I tried to consistently just use the long forms ('L2_pyramidal') etc. in the new drives-API.

Collaborator

The param files and class names are both short names. I think we should consistently change to the short version in a separate PR (see #148). If we change to the long version, we'll have to also change the param file format (harder)

hnn_core/network_builder.py Outdated
@cjayb
Collaborator

cjayb commented Mar 8, 2021

Very readable code so far, excellent work @ntolley !

@ntolley
Contributor Author

ntolley commented Mar 9, 2021

Thanks for the feedback guys! The most recent change essentially copies all of the connectivity code over to the Network class. Simple enough, but I ran into a frustrating issue that ultimately came down to not using deepcopy() on nc_dict :')

During instantiation of the network, all of the cell-cell connection information is stored under a new attribute net.connectivity_list. I think this is the last "moving things around" change that will be necessary for this PR. Any interface just has to handle editing net.connectivity_list.

As for next steps, I think the simplest addition will be a function that appends new connections to connectivity_list. The tests can simply ensure that the gids exist, and that the values under receptor, location, and nc_dict are valid. Another useful function will be one that just clears connectivity_list.

@ntolley
Contributor Author

ntolley commented Mar 9, 2021

Looks like the macOS build failed but linux passed... praying this is a one-time occurrence

@cjayb
Collaborator

cjayb commented Mar 9, 2021

praying this is a one time occurrence

It might not be, but don't worry about it for now. It's probably just MPI having kittens again. I experienced that too, couldn't fix it, until it fixed itself. Just do yourself the favour of running test_parallel_backends.py on your development machine (whichever OS that may be). You're messing with stuff on the NEURON side of the woods, where parallelisation can be a deal-breaker.

doc/whats_new.rst Outdated Show resolved Hide resolved
@rythorpe
Contributor

Turns out there are a few more edge cases that are invalid due to the way the cells are instantiated (excitatory synapses from basket cells... no duh). I've opted to remove the hard coded check for now instead of rushing to patch this up in the API functions.

The API is still completely functional, but invalid connections won't throw an error until the build phase. I think this will require a more sophisticated warning system that ensures valid cell-cell connections are provided to add_connection(). I'm ok to work at this in a followup PR, but let me know if you guys think this should be addressed before merging.

I agree that we should avoid hard coding checks for this. As you mentioned, checks should validate against the currently available synapses as defined in the various cell objects (allowing us to flexibly change cell classes and network configurations without breaking anything).

Just to clarify, there are only gabaa and gabab synapses at the L2Pyr soma, L5Pyr soma, and L5Pyr distal apical dendrite right? So e.g. a proximal inhibitory synapse on the L5Pyr cell should be one of your edge cases?

@jasmainak
Collaborator

Okay I'm done here. Added some nitpicks mostly :) Are you going to simplify the example in this PR or a follow-up PR? I'll let @rythorpe and @cjayb look through, comment and merge when they're happy.

Contributor

@rythorpe rythorpe left a comment

Once @jasmainak's comments are addressed, let's merge and save the synapse check for a later PR. This is a great addition @ntolley!

@ntolley
Contributor Author

ntolley commented Mar 25, 2021

Are you going to simplify the example in this PR

Oops forgot to commit the change! Should be there now. I believe all the comments have been addressed, let me know if there's anything else! Otherwise feel free to merge when the CI's are green.

@ntolley ntolley changed the title [MRG] Add ability to pass connectivity matrix to define network [MRG] Add ability to modify cell-cell connectivity Mar 25, 2021
@jasmainak jasmainak merged commit d8fe894 into jonescompneurolab:master Mar 25, 2021
@jasmainak
Collaborator

jasmainak commented Mar 25, 2021

Great stuff! Looking forward to the follow-up PR on viz and checks. Thanks @ntolley ! 🍻

@cjayb I hastened the merge a bit to keep the ball rolling. If you have any more critiques, feel free to leave them and I'm sure @ntolley can address them in the next PR!

@jasmainak
Collaborator

Btw @ntolley I think this API is still a little low-level. We should look into the common operations neuro folks may want to do with the connectivity (sparsity? connect layers at a time? specify probability of connections?) and make it so that it can ideally be done in one line ...

@cjayb
Collaborator

cjayb commented Mar 25, 2021

Totally cool @jasmainak , I had noted that this had a lot of eyes on it. Nice work @ntolley! Every time we get stuff out of the Neuron-loops, it gets easier to see where to optimise further.

@ntolley
Contributor Author

ntolley commented Mar 25, 2021

I think this API is still a little low-level.

Totally agree, I think I underestimated the amount of housekeeping that was necessary for this PR. Perhaps these advanced features are a good use case for the future hnn_core-netpyne branch.

We should look into what are the common operations neuro folks may want to do.

From my reading, I think the initial focus should be on probabilistic connection definitions. For example:

src_pop, target_pop = net.gid_ranges('L2_basket'), net.gid_ranges('L5_pyramidal')
net.connect_populations(src_pop, target_pop, probability=0.4, ...)

Sparsity more or less comes for free depending on your definition (since you can specify a small subset with high connection probability, with low probability across the entire network).
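One way to sketch the probabilistic selection under the hood (illustrative only; connect_populations above and this helper are proposals, not existing hnn-core functions):

```python
import random

def sample_connections(src_gids, target_gids, probability, seed=0):
    """Keep each (src, target) pair independently with the given probability."""
    rng = random.Random(seed)  # seeded for reproducible networks
    return [(s, t) for s in src_gids for t in target_gids
            if rng.random() < probability]

# Hypothetical gid ranges: two L2 basket sources, three L5 pyramidal targets.
pairs = sample_connections(range(2), range(35, 38), probability=0.4)
```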

@jasmainak
Collaborator

An alternative is creating a function that returns a list of src_gid and target_gid:

src_gids, target_gids = get_probabilistic_connectivity(net, prob=0.4, src_type='L2_basket', target_type='L5_pyramidal')
src_gids, target_gids = connectivity_matrix_to_list(net, connectivity=connectivity_matrix, src_type='L2_basket', target_type='L5_pyramidal')

net.add_connection(src_gids, target_gids, ...)

you could imagine having different functions for different kinds of connectivities.

what still bothers me about this API is that users still have to understand what the connection locations, receptors, etc. are, which may be low-level for many folks. There needs to be some way to quickly see what the existing connectivity is and update a couple of parameters in there

@cjayb
Collaborator

cjayb commented Mar 26, 2021

For each connection, we have

  1. source cell type
     • uniquely defines: neurotransmitter
  2. target cell synapse (depends on source cell neurotransmitter)
     • variable: location (section name; apical tuft, basal oblique, soma)
     • variable: type (gabaa, gabab, ampa, nmda, ...)
     • variable: lamtha

Did I forget something? Looking at that, I'm thinking that the API could keep the source cell fixed:

# excitatory L2 -> L5 (practically all-to-all -> large lamtha; apical target)
conn = get_probabilistic_connectivity(net, source='L2_pyramidal', target='L5_pyramidal', prob=0.6)
synaptic_weights = {'ampa': 1e-2, 'nmda': 2e-3}
synaptic_delays = {'ampa': 0.1, 'nmda': 1.0}
src_gids, target_gids = connectivity_matrix_to_list(net, connectivity=conn)
net.add_connection(src_gids, target_gids, synaptic_location='apical_tuft', synaptic_weights=synaptic_weights, synaptic_delays=synaptic_delays, lamtha=60)

# inhibitory L2 basket -> L2 pyramidal (very local -> small lamtha; somatic)
conn = get_probabilistic_connectivity(net, source='L2_basket', target='L2_pyramidal', prob=0.9)
synaptic_weights = {'gabaa': 1e-2, 'gabab': 2e-3}
synaptic_delays = {'gabaa': 0.1, 'gabab': 5.0}
src_gids, target_gids = connectivity_matrix_to_list(net, connectivity=conn)
net.add_connection(src_gids, target_gids, synaptic_location='soma', synaptic_weights=synaptic_weights, synaptic_delays=synaptic_delays, lamtha=3)

This is very close to the current API, but with more fine-grained control over target location. I like the idea of using section names instead of 'distal' and 'proximal', because the latter are a little too specific to the default S1 model and don't easily generalise. With this much more control the S1 model could be adapted to approximate other brain areas, without actually altering the network...?

I don't think people should be doing network simulations if they're not willing to put in the work to understand the biology! L2 basket cells are GABAergic, and basket cell synapses are mostly somatic. L2 pyramidal cells project to L1/2/3 synapses on the apical dendrite of L5 pyramidal cells. The neurotransmitter is glutamate, and AMPA and NMDA receptors have very different time constants. Etc. Examples could be written to explain how "distal" and "proximal" inputs relate to cortical structure, and how these two classes have a solid biological interpretation in the specific case of S1 activity.

@jasmainak
Collaborator

It's true that users don't know what is "distal" or "proximal" for new models. But I'm thinking that when you define a new model, you'll have to define what is "distal" and "proximal". That could live as an attribute of Network (suggested by @rythorpe many months ago). Do we really want users to have such fine-grained control as being able to connect to a particular dendrite?

Now, I disagree with the sentiment in this (probably more appropriate discussion over beers?):

I don't think people should be doing network simulations if they're not willing to put in the work to understand the biology

Good scientific software should almost be like a teaching tool. You should be able to start doing things relatively quickly and then refer to the documentation/wikipedia to learn more. I remember when I first tried to use an MEG package as a Masters student, it was FieldTrip and I was overwhelmed by the cfg parameter and all the choices it offered. MNE was way more straightforward as it came with sane defaults and required less tweaking. Scikit-learn is similar as it gives you a fit and transform for all the different algorithms, so you can switch between them without having to understand each algorithm.

Of course, once you got a few plots and start to think more about the data, you better put in the effort to understand what's going on ;-) But that "initial takeoff" is necessary!

@cjayb
Collaborator

cjayb commented Mar 26, 2021

I'm all for onboarding new users with sane APIs! Nested dictionaries are awful. And my investment into hnn-core is to a large extent driven by a motivation to have a cool educational tool :)

I agree that it makes no sense to have an API that allows arbitrary synapse targeting (if you want that, do NetPyNE or go to NEURON directly). I guess I'm questioning whether 'distal' and 'apical' are too specific to a single scenario: the two-pronged thalamic inputs to S1.

How do you envision defining 'distal' and 'proximal' if not on the basis of cell morphology? Right now we have (in pyramidal.py:create_dends())

        self.sect_loc['proximal'] = ['apicaloblique', 'basal2', 'basal3']
        self.sect_loc['distal'] = ['apicaltuft']

That's a method of a superclass, so we don't want folks in here :) So you'd do this?

net = Network(params, distal=['apicaltuft'], proximal=['apicaloblique', 'basal2', 'basal3'])

I'm not sure that's better. In fact, I think it would be worse ;)

@jasmainak
Collaborator

I guess I'm questioning whether 'distal' and 'apical' are too specific to a single scenario: the two-pronged thalamic inputs to S1.

good question whose answer I don't know!! @ntolley @rythorpe @stephanie-r-jones ? Do the other models (A1, V1 ...) you envision also have "distal" and "proximal" inputs?

How do you envision defining 'distal' and 'proximal' if not on the basis of cell morphology?

definitely on the basis of cell morphology. But only once when you import the cell. After it has been imported, you can connect only at these two locations? We need to brainstorm a bit how this whole process can be set up now that we have the drives more or less sorted. Just making this up:

cell = read_cell('blah.swc')
cell.proximal = ['apicaloblique', 'blah']
cell.distal = ['apicaltuft']

net = Network(L2Pyr=cell)

again, how much flexibility do we want to offer with cell types? Knowing that MEG signal usually comes from aggregate of Pyramidal cells ...

@cjayb
Collaborator

cjayb commented Mar 26, 2021

I do like your sketch of a cell, but I actually think (more like feel...) that details of synaptic connectivity will have a far greater effect on MEG simulations than detailed cell morphology will! The aggregate current is affected by synaptic target location. For example, there are definitely trunk-targeting Glu & GABA inputs to several pyramidal populations (at least outside of S1). Surprisingly, excitation at the trunk might actually decouple (shunt) the tuft synapses from the soma, depending on leak-currents through voltage-gated K-channels along the dendrite.

The reason I'm pushing for a little more control over the synaptic locus is that I think it might massively generalise the default network model. To the point where it could perhaps model MEG-responses in other regions too. This is a hypothesis, but a testable one! Now that more realistic network models exist, we could see how "close" to their output our simple network could get. But it's not possible without fiddling with the targets, and I don't think the current two will cut it. The drives would also need to get the same level of detail, of course (sorry!).

To be clear: the "HNN S1 model" has 'somatic', 'distal' and 'proximal' targets, defined as in the published papers. Period. But an "HNN V1 model" might have some other list of pooled connection locations, with a validation paper to boot. Looking forward to all y'all thoughts :)

@cjayb
Collaborator

cjayb commented Mar 26, 2021

I may be getting ahead of myself there... It's not a priority to extend the HNN model! but as we improve the API, we should be mindful of leaving the doors open for such extensions down the line.

@jasmainak
Collaborator

jasmainak commented Mar 28, 2021

I actually think (more like feel...) that details of synaptic connectivity will have a far greater effect on MEG simulations than detailed cell morphology will!

makes sense. We can collaborate on this :)

From a practical coding point of view, I think the difference between allowing dendrites vs aliases 'proximal' etc. might not be too much. I'm fine with either. Let's try to iterate and go one step at a time ... gradient descent rather than analytical solution ;-) @ntolley I suggest a PR with:

  1. viz of connectivity
  2. one or two convenience functions for getting src_gids, target_gids pairs

To be clear, I suggested connectivity_matrix_to_list as another convenience function like get_probabilistic_connectivity. I think it confused @cjayb, so maybe we just go with one function for now ... get_probabilistic_connectivity ... and develop more functions as the need arises.

@stephanie-r-jones
Member

stephanie-r-jones commented Mar 28, 2021

Sorry for the delayed input here.. The process here is important to talk through together. Can we please put this on our agenda for our next HNN meeting?

A few high-level things to keep in mind. HNN will only work with models that have multi-compartment pyramidal neuron dendrites. However, while proximal and distal drives are named based on the pyramidal neuron locations they target, they are not only going to pyramidal neurons, but rather represent lemniscal thalamocortical drive and cortical-cortical drive. The template model doesn't have a layer 4, which is where lemniscal thalamocortical drive would target; instead we model it directly to the proximal dendrites of the pyramidal neurons and to the inhibitory neurons (in our model, point neurons) in layers 2/3 and 5. In expanding the framework to include other networks, a new model might have interneurons with multiple dendrites, and we'd have to define what proximal and distal might mean depending on where the interneurons sit.

In practice, there will be two steps to getting new models to users: (1) defining the location of the proximal and distal drives - this will take a LOT of user knowledge, both in where to put the connections and how to put them there, and will not be undertaken by novice users. (2) After a model is built, we make it a template that can be loaded in, and now a more novice user can simply import the template; the proximal and distal drives will already be defined and they don't need to worry about this level of detail.

The best way to proceed is with an example use case. Salva has an example of a detailed A1 model that we can use to define a logical process. We have an A1 ERP simulation from our HNN template model (Carmen's paper in review) and we can see if we can recreate this simulation using Salva's detailed A1 model. This would be a great way to uncover how much of the detail matters at the macroscale recording level!

@cjayb
Collaborator

cjayb commented Mar 28, 2021

@stephanie-r-jones agreed, this warrants thorough discussion, which can move elsewhere. The semantics of "distal/proximal" are, as you point out, a lot deeper than just dendritic compartment. The synaptic details of the lemniscal pathway might be quite different from those of the geniculate visual pathway, for example. Whenever there's a preprint available on the A1 paper you're preparing, I'd appreciate a heads-up!

In terms of the code, I'm advocating to abstract out all of such top-level semantics. Specifically, we shouldn't have variables/keys/attributes called self.distal (etc.), because these are ambiguous outside of the HNN-S1 context.

@jasmainak
Collaborator

this is great. I think what I'm still missing is a visualization in my head what it means to be an A1 model or an S1 model or a model in the visual cortex. I feel this discussion is a little premature on the software aspect unless we can pin this down first on paper and pencil. Show me 5 different models with diagrams and where you envision the inputs to come from ...

Successfully merging this pull request may close these issues.

allow user to pass connectivity matrix

6 participants