# Definitive guide to neuprint synapses
This notebook serves as a reference for students in my lab who are working with the Hemibrain connectome. It is meant to disambiguate and demystify the various ways that synapses are counted in neuprint.

In [1]:
import os
from neuprint import Client
# Read the token from the environment so it is never stored in the notebook.
c = Client('neuprint.janelia.org', dataset='hemibrain:v1.2.1', token=os.environ['NEUPRINT_APPLICATION_CREDENTIALS'])
c.fetch_version()

# import important stuff here
import numpy as np
import pandas as pd

I will use a clock neuron (an s-LNv) as my example. fetch_neurons returns neuron_df, which contains pre and post columns as well as downstream and upstream columns. Notice that the post count matches the upstream count (both 133), but the downstream count (817) is much higher than the pre count (173). I think this is because synapses here are polyadic: a single pre site can have many post sites opposite it, so the downstream count would be the number of post sites that are downstream of this neuron's pre sites.

In [28]:
bodyID = 2068801704

In [29]:
from neuprint import fetch_neurons

neuron_df, roi_counts_df = fetch_neurons(bodyID)

In [30]:
neuron_df

Unnamed: 0,bodyId,instance,type,pre,post,downstream,upstream,mito,size,status,cropped,statusLabel,cellBodyFiber,somaRadius,somaLocation,roiInfo,notes,inputRois,outputRois
0,2068801704,s-LNv,s-LNv,173,133,817,133,133,965744155,Traced,False,Roughly traced,,581.5,"[4069, 27265, 29472]","{'SNP(R)': {'pre': 156, 'post': 60, 'downstrea...",,"[AME(R), OL(R), PLP(R), POC, PVLP(R), SLP(R), ...","[PLP(R), POC, SLP(R), SMP(R), SNP(R), VLNP(R)]"


If we use fetch_synapses and count the pre and post sites on the neuron's body, the totals match the pre and post counts returned by fetch_neurons. This leads me to believe that the pre and post columns from fetch_neurons are raw counts of the pre and post sites located on the body of the neuron itself.

In [22]:
from neuprint import fetch_synapses, SynapseCriteria as SC
syns = fetch_synapses(bodyID)

In [25]:
syns[syns['type'] == 'post']

Unnamed: 0,bodyId,type,roi,x,y,z,confidence
2,2068801704,post,SLP(R),14962,16835,8866,0.551020
3,2068801704,post,SLP(R),9437,15306,10147,0.533048
27,2068801704,post,PVLP(R),5316,23630,29696,0.991409
28,2068801704,post,PVLP(R),5090,23740,29599,0.580829
29,2068801704,post,PVLP(R),5382,23563,29719,0.851347
...,...,...,...,...,...,...,...
297,2068801704,post,SLP(R),12940,16251,8017,0.949778
298,2068801704,post,SMP(R),17964,18028,7954,0.433000
299,2068801704,post,SLP(R),10986,15873,8655,0.814511
302,2068801704,post,SLP(R),10978,15603,8813,0.693000


In [20]:
syns[syns['type'] == 'pre']

Unnamed: 0,bodyId,type,roi,x,y,z,confidence
0,2068801704,pre,SLP(R),14879,16823,8380,0.991
1,2068801704,pre,SMP(R),15683,17207,8593,0.995
4,2068801704,pre,SMP(R),16795,17746,8560,0.975
5,2068801704,pre,SMP(R),15707,17608,8442,0.993
6,2068801704,pre,SMP(R),16585,17707,8476,0.962
...,...,...,...,...,...,...,...
276,2068801704,pre,SLP(R),12225,15956,8259,0.911
300,2068801704,pre,SLP(R),15576,17061,8642,0.935
301,2068801704,pre,,7283,13979,15207,0.998
304,2068801704,pre,SLP(R),10502,15641,8909,0.844


Explicitly passing primary_only=True gives the same counts. As far as I can tell this is already the default for SynapseCriteria, so that is expected.

In [32]:
from neuprint import fetch_synapses, SynapseCriteria as SC
syns = fetch_synapses(bodyID, SC(primary_only=True))

In [35]:
len(syns[syns['type'] == 'post'])

133

In [36]:
len(syns[syns['type'] == 'pre'])

173

Indeed, fetch_synapses returns 173 pre rows and 133 post rows, which equal the pre and post values from fetch_neurons. Therefore, pre and post from fetch_neurons must be the raw synapse site counts on the neuron body.
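
The same comparison can be done in one step instead of filtering by type twice. This is a minimal sketch, assuming a frame with the type column that fetch_synapses returns. The helper name site_counts is my own, and demo_syns is a stand-in built from the first five rows shown above so that the cell runs without a neuprint connection.

```python
import pandas as pd

def site_counts(syns):
    """Count 'pre' and 'post' rows in a fetch_synapses-style DataFrame."""
    counts = syns['type'].value_counts()
    return {'pre': int(counts.get('pre', 0)), 'post': int(counts.get('post', 0))}

# Stand-in for the fetch_synapses output (rows 0-4 of the tables above).
demo_syns = pd.DataFrame({
    'bodyId': [2068801704] * 5,
    'type': ['pre', 'pre', 'post', 'post', 'pre'],
    'roi': ['SLP(R)', 'SMP(R)', 'SLP(R)', 'SLP(R)', 'SMP(R)'],
})

demo_counts = site_counts(demo_syns)
print(demo_counts)
```

On the real data, site_counts(syns) should agree with the pre and post values in neuron_df (173 and 133 for this neuron).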

The functions that return synaptic weights are where I start to find discrepancies. For inputs to the neuron, fetch_simple_connections and fetch_adjacencies both return a total weight of 113, which is 20 fewer than the post count (and upstream count) from fetch_neurons (133). One possible explanation is that fetch_neurons counts every post site on the body of the neuron, regardless of whether there is anything on the other side of that synapse. In other words, perhaps the connectivity weights count strictly functional synapses.

In [31]:
# inputs to neuron
from neuprint import fetch_simple_connections
inputs = fetch_simple_connections(None,bodyID)
inputs['weight'].sum()

113

In [8]:
from neuprint import fetch_adjacencies
n_df, conn_in_df = fetch_adjacencies(None,bodyID,include_nonprimary=False)
conn_in_df['weight'].sum()

113

fetch_synapse_connections also returns 113 rows, where each row is presumably a functional synapse between two neurons.

In [27]:
from neuprint import fetch_synapse_connections, SynapseCriteria as SC
pre_syn_conns = fetch_synapse_connections(None, bodyID)
pre_syn_conns

  0%|          | 0/113 [00:00<?, ?it/s]

Unnamed: 0,bodyId_pre,bodyId_post,roi_pre,roi_post,x_pre,y_pre,z_pre,x_post,y_post,z_post,confidence_pre,confidence_post
0,297541369,2068801704,SLP(R),SLP(R),14984,16842,8863,14962,16835,8866,0.989,0.551020
1,297541369,2068801704,SLP(R),SLP(R),9460,15294,10154,9437,15306,10147,0.986,0.533048
2,2129848864,2068801704,PVLP(R),PVLP(R),5295,23631,29690,5316,23630,29696,0.989,0.991409
3,2129848864,2068801704,PVLP(R),PVLP(R),5089,23736,29626,5090,23740,29599,0.993,0.580829
4,2129848864,2068801704,PVLP(R),PVLP(R),5366,23547,29721,5382,23563,29719,0.991,0.851347
...,...,...,...,...,...,...,...,...,...,...,...,...
108,386834269,2068801704,SLP(R),SLP(R),12914,16257,8015,12940,16251,8017,0.998,0.949778
109,386833850,2068801704,SMP(R),SMP(R),17968,18047,7988,17964,18028,7954,0.721,0.433000
110,355816896,2068801704,SLP(R),SLP(R),10966,15869,8651,10986,15873,8655,0.988,0.814511
111,355453590,2068801704,SLP(R),SLP(R),10969,15607,8809,10978,15603,8813,0.724,0.693000


We could check to make sure that every row that fetch_synapse_connections returns is accounted for in fetch_synapses...
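
Here is one way that check could look. It is a sketch, assuming the column names shown in the two outputs above (x, y, z and type in the fetch_synapses frame; x_post, y_post, z_post in the fetch_synapse_connections frame). The function name compare_post_sites is my own, and the demo frames reuse three post sites from the tables above plus one made-up site at (0, 0, 0) that has no partner.

```python
import pandas as pd

def compare_post_sites(syns, conns):
    """Match post sites from fetch_synapses against fetch_synapse_connections rows.

    Returns (only_in_syns, only_in_conns), each a DataFrame of x, y, z coordinates.
    """
    post_sites = syns.loc[syns['type'] == 'post', ['x', 'y', 'z']].drop_duplicates()
    conn_sites = (conns[['x_post', 'y_post', 'z_post']]
                  .rename(columns={'x_post': 'x', 'y_post': 'y', 'z_post': 'z'})
                  .drop_duplicates())
    merged = post_sites.merge(conn_sites, how='outer', on=['x', 'y', 'z'], indicator=True)
    only_in_syns = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')
    only_in_conns = merged[merged['_merge'] == 'right_only'].drop(columns='_merge')
    return only_in_syns, only_in_conns

# Stand-in data: three real post sites, one pre site, one made-up post site (0, 0, 0).
demo_syns = pd.DataFrame({
    'type': ['post', 'post', 'post', 'pre', 'post'],
    'x': [14962, 9437, 5316, 14879, 0],
    'y': [16835, 15306, 23630, 16823, 0],
    'z': [8866, 10147, 29696, 8380, 0],
})
demo_conns = pd.DataFrame({
    'bodyId_pre': [297541369, 297541369, 2129848864],
    'x_post': [14962, 9437, 5316],
    'y_post': [16835, 15306, 23630],
    'z_post': [8866, 10147, 29696],
})

only_in_syns, only_in_conns = compare_post_sites(demo_syns, demo_conns)
print(only_in_syns)   # post sites with no connection row
print(only_in_conns)  # connection rows with no matching post site
```

If every connection row is accounted for, only_in_conns is empty. The rows in only_in_syns would then be the post sites that have no partner in the connection table, which is where I would look for the 20 missing inputs.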

For the outputs of this neuron, fetch_simple_connections and fetch_adjacencies both return a total weight of 411, which matches neither the pre count (173) nor the downstream count (817) from fetch_neurons. It is possible that the pre count is simply the number of pre-synaptic sites on the neuron's body, and that the output weights and downstream counts are both some kind of count of the post-synaptic sites on the other side of those pre sites. That would explain why the weights and downstream counts are larger than the pre count, but it is unclear to me why those two counts would differ from each other.

In [12]:
# outputs of neuron
from neuprint import fetch_simple_connections
outputs = fetch_simple_connections(bodyID,None)
outputs['weight'].sum()

411

In [13]:
from neuprint import fetch_adjacencies
n_df, conn_out_df = fetch_adjacencies(bodyID,None,include_nonprimary=False)
conn_out_df['weight'].sum()

411

If we run fetch_synapse_connections for the outputs of the neuron, the returned dataframe has the same number of rows as the total weight from fetch_simple_connections and fetch_adjacencies. Therefore, fetch_simple_connections and fetch_adjacencies return the number of post sites even when the neuron of interest is pre-synaptic. I like that neuprint uses this convention, and we will aim to use it everywhere we have a choice. It reduces the asymmetry introduced by the fact that these synapses are polyadic.

__Whenever presenting synapse counts or weights, always default to the count of post-synaptic sites within the synapse regardless of whether the analysis is focused on a pre-synaptic entity.__

In [None]:
from neuprint import fetch_synapse_connections, SynapseCriteria as SC
post_syn_conns = fetch_synapse_connections(bodyID, None)
post_syn_conns

  0%|          | 0/411 [00:00<?, ?it/s]

Unnamed: 0,bodyId_pre,bodyId_post,roi_pre,roi_post,x_pre,y_pre,z_pre,x_post,y_post,z_post,confidence_pre,confidence_post
0,2068801704,386834269,SLP(R),SLP(R),12811,16275,8141,12807,16292,8125,0.996,0.428457
1,2068801704,5813032924,SLP(R),SLP(R),9944,15478,9490,9950,15495,9489,0.925,0.997977
2,2068801704,325113795,SLP(R),SLP(R),14879,16823,8380,14889,16813,8394,0.991,0.796530
3,2068801704,357944930,SLP(R),SLP(R),12897,16287,8175,12928,16297,8165,0.981,0.416790
4,2068801704,357945095,SLP(R),SLP(R),12227,16085,8284,12228,16083,8299,0.824,0.995000
...,...,...,...,...,...,...,...,...,...,...,...,...
406,2068801704,325122525,SLP(R),SLP(R),9300,15404,10213,9316,15422,10243,0.936,0.567212
407,2068801704,325122525,SMP(R),SMP(R),16379,17621,8663,16331,17603,8652,0.971,0.460560
408,2068801704,325122525,SLP(R),SLP(R),12641,15900,8503,12635,15921,8513,0.996,0.996365
409,2068801704,325122525,SLP(R),SLP(R),11625,15760,8594,11620,15758,8570,0.992,0.522653
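
The post-site convention can be made concrete by rebuilding connection weights from the synapse-level table: one row per post site, grouped by partner. This is a sketch, assuming the column names in the output above. The helper weights_from_synapses is my own name, and demo_conns contains only five of the rows shown (index 0 and 406 to 409).

```python
import pandas as pd

def weights_from_synapses(conns):
    """Count post sites per (bodyId_pre, bodyId_post) pair."""
    return (conns.groupby(['bodyId_pre', 'bodyId_post'])
                 .size()
                 .rename('weight')
                 .reset_index())

# Stand-in for fetch_synapse_connections(bodyID, None): five rows from the output above.
demo_conns = pd.DataFrame({
    'bodyId_pre': [2068801704] * 5,
    'bodyId_post': [386834269, 325122525, 325122525, 325122525, 325122525],
    'x_pre': [12811, 9300, 16379, 12641, 11625],
    'y_pre': [16275, 15404, 17621, 15900, 15760],
    'z_pre': [8141, 10213, 8663, 8503, 8594],
})

demo_weights = weights_from_synapses(demo_conns)
# Number of distinct pre sites that these post sites sit opposite.
n_pre_sites = len(demo_conns[['x_pre', 'y_pre', 'z_pre']].drop_duplicates())
print(demo_weights)
print(n_pre_sites)
```

On the real table, I would expect the summed weight to equal the 411 reported by fetch_simple_connections, while the number of distinct pre sites should be no larger than the pre count of 173 if my reading of the pre column is right.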


So what explains the discrepancy between the pre count, the downstream count, and the weights? I can accept that either the weights or the downstream count represents the number of functional synapses on the post side of the pre sites, but which one?

Since the downstream count isn't explained by including non-primary entries (see below), I'm inclined to treat the downstream count as unreliable for now. I will default to the connection weights when I need a count of output synapses.

__Henceforth, connection weights are to be used for output synapse counts - never downstream counts, which should be considered unreliable until proven otherwise.__
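
To keep the numbers straight, here is the arithmetic behind the discrepancies collected in one place. It uses only values already reported in this notebook; the dictionary keys are my own labels.

```python
# Values reported by fetch_neurons for body 2068801704.
counts = {'pre': 173, 'post': 133, 'downstream': 817, 'upstream': 133}
# Summed weights from fetch_simple_connections / fetch_adjacencies.
weights = {'inputs': 113, 'outputs': 411}

summary = {
    # Post sites on the body that do not show up in the input weights.
    'post_minus_input_weight': counts['post'] - weights['inputs'],
    # Downstream sites that do not show up in the output weights.
    'downstream_minus_output_weight': counts['downstream'] - weights['outputs'],
    # Average post sites per pre site under each candidate count.
    'output_weight_per_pre': round(weights['outputs'] / counts['pre'], 2),
    'downstream_per_pre': round(counts['downstream'] / counts['pre'], 2),
}
print(summary)
```

So the input side is short by 20 sites and the output side by 406, and the two candidate output counts imply roughly 2.4 versus 4.7 post sites per pre site.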

In [37]:
from neuprint import fetch_synapse_connections, SynapseCriteria as SC
post_syn_conns = fetch_synapse_connections(bodyID, None, SC(primary_only=False))
post_syn_conns

  0%|          | 0/411 [00:00<?, ?it/s]

Unnamed: 0,bodyId_pre,bodyId_post,roi_pre,roi_post,x_pre,y_pre,z_pre,x_post,y_post,z_post,confidence_pre,confidence_post
0,2068801704,386834269,"[SLP(R), SNP(R)]","[SLP(R), SNP(R)]",12811,16275,8141,12807,16292,8125,0.996,0.428457
1,2068801704,5813032924,"[SLP(R), SNP(R)]","[SLP(R), SNP(R)]",9944,15478,9490,9950,15495,9489,0.925,0.997977
2,2068801704,325113795,"[SLP(R), SNP(R)]","[SLP(R), SNP(R)]",14879,16823,8380,14889,16813,8394,0.991,0.796530
3,2068801704,357944930,"[SLP(R), SNP(R)]","[SLP(R), SNP(R)]",12897,16287,8175,12928,16297,8165,0.981,0.416790
4,2068801704,357945095,"[SLP(R), SNP(R)]","[SLP(R), SNP(R)]",12227,16085,8284,12228,16083,8299,0.824,0.995000
...,...,...,...,...,...,...,...,...,...,...,...,...
406,2068801704,325122525,"[SLP(R), SNP(R)]","[SLP(R), SNP(R)]",9300,15404,10213,9316,15422,10243,0.936,0.567212
407,2068801704,325122525,"[SMP(R), SNP(R)]","[SMP(R), SNP(R)]",16379,17621,8663,16331,17603,8652,0.971,0.460560
408,2068801704,325122525,"[SLP(R), SNP(R)]","[SLP(R), SNP(R)]",12641,15900,8503,12635,15921,8513,0.996,0.996365
409,2068801704,325122525,"[SLP(R), SNP(R)]","[SLP(R), SNP(R)]",11625,15760,8594,11620,15758,8570,0.992,0.522653
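
Rather than comparing the two outputs by eye, the primary-only and non-primary tables can be compared on their coordinates, since only the ROI columns should differ. This is a sketch, assuming the column names shown above. The helper same_synapses is my own name, and the two demo frames hold just the first two rows of each output.

```python
import pandas as pd

COORDS = ['x_pre', 'y_pre', 'z_pre', 'x_post', 'y_post', 'z_post']

def same_synapses(a, b):
    """True if both frames contain the same set of pre/post coordinate pairs."""
    keys_a = set(a[COORDS].itertuples(index=False, name=None))
    keys_b = set(b[COORDS].itertuples(index=False, name=None))
    return keys_a == keys_b

coords = {
    'x_pre': [12811, 9944], 'y_pre': [16275, 15478], 'z_pre': [8141, 9490],
    'x_post': [12807, 9950], 'y_post': [16292, 15495], 'z_post': [8125, 9489],
}
# primary_only=True gives one ROI name per site; primary_only=False gives a list.
demo_primary = pd.DataFrame({**coords, 'roi_pre': ['SLP(R)', 'SLP(R)']})
demo_all = pd.DataFrame({**coords, 'roi_pre': [['SLP(R)', 'SNP(R)'], ['SLP(R)', 'SNP(R)']]})

print(same_synapses(demo_primary, demo_all))
```

If this returns True on the full tables, then turning off primary_only only changes how ROIs are labelled and cannot account for the gap between 411 and 817.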
