<img src="resources/cropped-SummerWorkshop_Header.png">  

<h1 align="center">Workshop SWDB 2024 </h1> 
<h3 align="center">Day 3 2024 - Neuron Morphology</h3> 
<h3 align="center">Notebook 4: Analyzing Brain Connectivity via Projections of Light Microscopy Neurons</h3> 

<b>This is the color key that I'm using for leaving comments:

<b><font color='green'> Green</font> --> Add image
    
<b><font color='orange'> Orange</font> --> Write up the thing being described
    
<b><font color='red'> Red</font> --> Question
    

Note: This introductory part is wordy, and most of the text is probably unnecessary. Let's trim it down in the next version; it is here for now so that you have an idea of the type of analysis this notebook is geared towards. 

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
    
# Section 1: Introduction


<font size="3.5"> The main objective of this notebook is to analyze the "projections" of Light Microscopy (LM) neurons, meaning the brain regions that their axons and dendrites traverse and the regions where their endpoints receive inputs or send outputs. This type of long-range projection analysis only applies to the LM neurons because the EM neurons were reconstructed within a small piece of tissue fully contained in the visual cortex. In contrast, many LM neurons have axons that project across many brain regions (see Figure 1).  </font>


<div style="text-align: center;">
    <img src='imgs/lm-vs-em.png' style="max-width: 65%; height: auto;">
</div>

<font size="3.5"><b> Figure 1:</b> LM neuron shown in blue and EM neuron shown in purple.

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<font size="3.5"> As an introduction to analyzing the projections of LM neurons, we will explore two open-ended questions related to brain connectivity that scientists consider when analyzing LM neurons. These questions aim to uncover both the fundamental connections and complex network dynamics that define the functional roles of these neurons within the broader neural circuitry. 

<font size="3.5"> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <strong>Question 1:</strong> What do the inputs to a particular brain region look like? </font>

<font size="3.5"> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <strong>Question 2:</strong>  Where else do those inputs send their collaterals?  </font>

<font size="3.5"> Both of these questions involve analyzing a large number of neurons to provide a general overview of how neurons <em>connect</em> different brain regions. Before addressing these questions, we must first understand the meaning of <em>connectivity</em> at the level of a single neuron, then generalize this notion to the level of brain regions. In this context, a neuron connects regions <em>A</em> and <em>B</em> if it has dendritic endpoints in region <em>A</em> and axonal endpoints in region <em>B</em>. The dendritic endpoints receive the neuron's <em>input</em>, whereas the axonal endpoints send its <em>output</em>.

<font size="3.5"> Brain regions being <em>connected</em> refers to the existence of established neural pathways—comprising many neurons—through which signals can be transmitted between the regions. This connectivity enables regions to share information and coordinate activities.
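
<font size="3.5"> The single-neuron definition of connectivity can be sketched in a few lines. This is a minimal illustration on made-up region labels, not data from the dataset; the helper <code>connected_pairs</code> and the toy endpoint lists are assumptions introduced here. </font>

```python
from itertools import product


def connected_pairs(dendrite_regions, axon_regions):
    """Return the set of (A, B) region pairs that one neuron connects.

    A neuron connects A -> B if it has a dendritic endpoint in A (input)
    and an axonal endpoint in B (output).
    """
    return set(product(set(dendrite_regions), set(axon_regions)))


# Toy endpoint regions for one hypothetical neuron (one label per endpoint)
dendrite_endpoint_regions = ["A", "A", "C"]
axon_endpoint_regions = ["B", "B", "D"]

pairs = connected_pairs(dendrite_endpoint_regions, axon_endpoint_regions)
for source, target in sorted(pairs):
    print(f"{source} -> {target}")
```

<font size="3.5"> Repeated endpoints in the same region collapse into a single pair here; counting them instead would give a rough measure of how strongly two regions are linked. </font>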


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<font size="5"> Section 1.1: Strategy for Answering Scientific Questions </font>

<font size="4"> <b>Question 1:</b>
    
<font size="3.5"> First, we need to determine the <em>inputs</em> to a given brain region. Inputs to a brain region are the signals or information arriving from another brain region or a sensory organ. These inputs provide the data that the brain uses to process and respond to various stimuli or tasks. To determine the inputs to a brain region, we need to extract the following information from neurons in the exaspim dataset:  </font>
    
<font size="3.5"><strong> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Task 1:</strong> Which brain regions do the dendritic or axonal endpoints of a given neuron belong to? </font>

<font size="3.5"><strong>  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Task 2:</strong> Find all neurons that have dendritic or axonal endpoints in a given brain region. </font>
    
<font size="3.5"> Note that Sections 2 and 3 describe how to perform these tasks.
<br><br>
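
<font size="3.5"> As a rough sketch of Task 2, one option is to invert a per-neuron list of endpoint regions into a lookup from region to neurons. The neuron names, region labels, and the helper <code>index_by_region</code> below are hypothetical and only illustrate the idea. </font>

```python
from collections import defaultdict

# Hypothetical per-neuron endpoint regions: neuron name -> list of region labels
axon_endpoint_regions = {
    "neuron_1": ["Region A", "Region B"],
    "neuron_2": ["Region B"],
    "neuron_3": ["Region C", "Region A", "Region A"],
}


def index_by_region(endpoint_regions):
    """Map each region to the sorted list of neurons with an endpoint in it."""
    index = defaultdict(set)
    for neuron, regions in endpoint_regions.items():
        for region in regions:
            index[region].add(neuron)
    return {region: sorted(neurons) for region, neurons in index.items()}


region_to_neurons = index_by_region(axon_endpoint_regions)
print(region_to_neurons["Region A"])
```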
    
<font size="4"> <b>Question 2:</b>
    
<font size="3.5"> Once this information arrives at a given brain region, it does not necessarily stay there. Question 2 asks where else the neurons that provide these inputs send their signals. The term "collaterals" refers to the side branches of an axon, which can carry the same output to regions other than the initial one. They are like additional routes along which a message travels to other hubs or regions in the brain.
<br><br>
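
<font size="3.5"> One possible way to approach Question 2, sketched on toy data: select the neurons with at least one axon endpoint in the region of interest, then tally the other regions where their axons end. The dictionary, the region labels, and the helper <code>collateral_targets</code> are assumptions made for illustration. </font>

```python
from collections import Counter

# Hypothetical axon endpoint regions, one label per endpoint
axon_endpoints = {
    "neuron_1": ["Target", "Target", "Region X"],
    "neuron_2": ["Region Y"],
    "neuron_3": ["Target", "Region X", "Region Z"],
}


def collateral_targets(axon_endpoints, target):
    """Count axon endpoints outside `target` for neurons that project to it."""
    counts = Counter()
    for regions in axon_endpoints.values():
        if target in regions:
            counts.update(r for r in regions if r != target)
    return counts


collaterals = collateral_targets(axon_endpoints, "Target")
print(collaterals.most_common())
```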

<font size="3.5" color='orange'><b> Say a little more about how to answer this question </b></font>

<br><br>

<font size="3.5"> In summary, the objective of these questions is to identify what kind of information enters a particular brain region, then explore where these signals are distributed or sent next in the brain.


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<font size="5"> Section 1.2: Analyzing Connectivity with the Common Coordinate Framework (CCF) </font> 

<font size="3.5"> In order to analyze the projections of these neurons, each brain needs to be registered to a standardized template brain space referred to as the Common Coordinate Framework (CCF), see Figure Y. This registration step is important because it enables scientists to analyze and compare neuron reconstructions from multiple brains in an integrated framework. </font>

<font size="3.5" color='red'><b> What should we say about the CCF? Hi Matthew, could you write up a short description here? </b></font>


<font size="3" color='green'><b> Insert image of CCF </b></font>

In [1]:
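import pandas as pd

# Load the CCF structure data as a DataFrame
ccf_structures = pd.read_csv('/data/adult_mouse_ccf_structures.csv')


def get_ccf_name(ccf_id):
    """Return the name of the CCF structure with the given ID.

    Falls back to returning the ID itself if it is not in the table.
    """
    idx = ccf_structures["id"] == ccf_id
    try:
        return ccf_structures.loc[idx, "name"].iloc[0]
    except IndexError:
        # No row matched this ID
        return ccf_id


ccf_structures.head()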

Unnamed: 0,id,name,acronym,hemisphere_id,parent_structure_id,graph_order,structure_id_path,color_hex_triplet
0,1000,extrapyramidal fiber systems,eps,3,1009.0,1218,/997/1009/1000/,CCCCCC
1,223,Arcuate hypothalamic nucleus,ARH,3,157.0,733,/997/8/343/1129/1097/157/223/,FF5D50
2,12998,"Somatosensory areas, layer 6b",SS6b,3,453.0,36,/997/8/567/688/695/315/453/12998/,188064
3,163,"Agranular insular area, posterior part, layer 2/3",AIp2/3,3,111.0,287,/997/8/567/688/695/315/95/111/163/,219866
4,552,"Pontine reticular nucleus, ventral part",PRNv,3,987.0,914,/997/8/343/1065/771/987/552/,FFBA86


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<font size="3.5"> The CCF IDs are stored as integers in a MeshParty skeleton, so we'll use the function "get_ccf_name" to look up the name of the region corresponding to a given ID. Here is a simple example of using this function: </font>

In [2]:
ccf_id = 12998
print(f"The CCF ID '{ccf_id}' represents the '{get_ccf_name(ccf_id)}'\n")


The CCF ID '12998' represents the 'Somatosensory areas, layer 6b'
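
<font size="3.5"> The table also has a "structure_id_path" column which, judging from the rows shown above, lists the IDs from the root of the hierarchy down to the structure itself. The sketch below parses that path to recover a structure's ancestors; the helper <code>get_ancestor_ids</code> and the two-row table are hypothetical and only copy values from the output above. </font>

```python
import pandas as pd

# Two rows copied from the table above; the real file has many more
toy_structures = pd.DataFrame(
    {
        "id": [12998, 223],
        "name": ["Somatosensory areas, layer 6b", "Arcuate hypothalamic nucleus"],
        "structure_id_path": [
            "/997/8/567/688/695/315/453/12998/",
            "/997/8/343/1129/1097/157/223/",
        ],
    }
)


def get_ancestor_ids(structures, ccf_id):
    """Return the IDs on the path from the root down to the parent of `ccf_id`."""
    row = structures.loc[structures["id"] == ccf_id]
    if row.empty:
        return []
    path = row["structure_id_path"].iloc[0]
    ids = [int(part) for part in path.strip("/").split("/")]
    return ids[:-1]  # drop the structure itself


print(get_ancestor_ids(toy_structures, 12998))
```

<font size="3.5"> Rolling layer-specific IDs up to a parent structure in this way can be useful when a distribution is too fine-grained to read. </font>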



<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
    
# Section 2: Analyzing Connectivity of a Single Neuron
    
<font size="3.5"> Let's load the skeleton dataset of LM neurons, then sample a single skeleton and analyze which regions of the brain this neuron connects. </font>

In [3]:
# Imports
from random import sample

from utils.graph_utils import get_ccf_ids
from utils.dataset_utils import number_of_samples, load_lm_datasets

import numpy as np


# Initializations
skel_list = load_lm_datasets()
print("Overview of LM Neuron Dataset...")
print("# Brain Samples:", number_of_samples())
print("# Skeletons:", len(skel_list))


Overview of LM Neuron Dataset...
# Brain Samples: 5
# Skeletons: 1649


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<font size="3.5"> Each skeleton has a node-level attribute called "id" that specifies where a node is located in CCF space. We will use a subroutine called "get_ccf_ids", which is stored in "graph_utils.py". This routine extracts the CCF IDs of the vertices in a given compartment (e.g. axons or dendrites) and, optionally, only of the vertices that are endpoints or branch points; see the documentation for more details. Next, let's look at some simple examples of using "get_ccf_ids". </font>
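
<font size="3.5"> To make the idea concrete, here is a toy version of this kind of filtering on plain NumPy arrays. It is not the implementation in "graph_utils.py": the function <code>toy_get_ccf_ids</code>, its parameters, and the five-vertex tree are all assumptions made for illustration. </font>

```python
import numpy as np


def toy_get_ccf_ids(ccf, compartment, parent, compartment_type=None, end_points_only=False):
    """Select CCF IDs by compartment and, optionally, keep only endpoints.

    ccf, compartment : one value per vertex
    parent           : index of each vertex's parent (-1 for the root)
    """
    ccf = np.asarray(ccf)
    compartment = np.asarray(compartment)
    parent = np.asarray(parent)
    mask = np.ones(len(ccf), dtype=bool)
    if compartment_type is not None:
        mask &= compartment == compartment_type
    if end_points_only:
        # An endpoint is a vertex that is not the parent of any other vertex
        is_parent = np.zeros(len(ccf), dtype=bool)
        is_parent[parent[parent >= 0]] = True
        mask &= ~is_parent
    return ccf[mask]


# Toy tree: vertex 0 is the root, with two branches (0 -> 1 -> 2 and 0 -> 3 -> 4)
ccf = [10, 10, 20, 10, 30]
compartment = [1, 3, 3, 2, 2]
parent = [-1, 0, 1, 0, 3]

print(toy_get_ccf_ids(ccf, compartment, parent, compartment_type=3))
print(toy_get_ccf_ids(ccf, compartment, parent, compartment_type=3, end_points_only=True))
```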

In [4]:
# Sample single skeleton
skel = sample(skel_list, 1)[0]

# Root - CCF Compartment
soma_ccf = get_ccf_ids(skel, compartment_type=1)
print("Soma is in the", get_ccf_name(soma_ccf[0]))

# Axon - CCF Regions
print("\nAxons Traverse these CCF Regions...")
axons_ccf = get_ccf_ids(skel, compartment_type=3)
for ccf_id in np.unique(axons_ccf):
    print("  ", get_ccf_name(ccf_id))

# Dendrite - CCF Compartments
print("\nDendrites Traverse these CCF Regions...")
dendrites_ccf = get_ccf_ids(skel, compartment_type=2)
for ccf_id in np.unique(dendrites_ccf):
    print("  ", get_ccf_name(ccf_id))
    

Soma is in the Central lateral nucleus of the thalamus

Axons Traverse these CCF Regions...
   Lateral habenula
   Mediodorsal nucleus of thalamus
   Central lateral nucleus of the thalamus
   fasciculus retroflexus

Dendrites Traverse these CCF Regions...
   Interfascicular nucleus raphe
   medial longitudinal fascicle
   Midbrain reticular nucleus
   Pontine reticular nucleus
   Lateral habenula
   Midbrain
   superior cerebelar peduncles
   Mediodorsal nucleus of thalamus
   Central lateral nucleus of the thalamus
   Central linear nucleus raphe
   fasciculus retroflexus
   mammillotegmental tract
   Periaqueductal gray
   rubrospinal tract
   Dorsal nucleus raphe
   Parafascicular nucleus
   crossed tectospinal pathway


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">


<font size="3.5"> Let's compute the distribution of CCF regions for the axonal and dendritic endpoints. These statistics are more meaningful because they describe where a neuron receives its inputs and sends its outputs. We can use the same routine "get_ccf_ids" to get these CCF IDs. </font>


In [5]:
# Subroutines
def report_distribution(values, percent_threshold=0):
    """Print the percentage of entries in each CCF region, most common first."""
    ids, cnts = np.unique(values, return_counts=True)
    print("% Vertices   CCF Region")
    for idx in np.argsort(-cnts):
        percent = 100 * cnts[idx] / len(values)
        name = get_ccf_name(ids[idx])
        if percent >= percent_threshold:
            print(f"{round(percent, 2)}      {name}")


# Root - CCF Compartment
soma_ccf = get_ccf_ids(skel, compartment_type=1)
print("Soma is in the", get_ccf_name(soma_ccf[0]))

# Axon Endpoints - CCF Regions
print("\nDistribution of CCF Regions of Axon Endpoints...")
axon_endpoints_ccf = get_ccf_ids(skel, compartment_type=3, vertex_type="end_points")
report_distribution(axon_endpoints_ccf)

# Dendrite Endpoints - CCF Regions
print("\nDistribution of CCF Regions of Dendrite Endpoints...")
dendrite_endpoints_ccf = get_ccf_ids(skel, compartment_type=2, vertex_type="end_points")
report_distribution(dendrite_endpoints_ccf)


Soma is in the Central lateral nucleus of the thalamus

Distribution of CCF Regions of Axon Endpoints...
% Vertices   CCF Region
57.89      Mediodorsal nucleus of thalamus
31.58      Central lateral nucleus of the thalamus
10.53      fasciculus retroflexus

Distribution of CCF Regions of Dendrite Endpoints...
% Vertices   CCF Region
52.94      Midbrain reticular nucleus
23.53      superior cerebelar peduncles
11.76      Periaqueductal gray
5.88      Pontine reticular nucleus
5.88      Midbrain
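
<font size="3.5"> The same kind of tally can also be written with pandas. The sketch below uses made-up region labels rather than the skeleton data, and is only meant to show the <code>value_counts(normalize=True)</code> pattern. </font>

```python
import pandas as pd

# Hypothetical endpoint region labels, one entry per endpoint
endpoint_regions = ["Region A", "Region B", "Region A", "Region C", "Region A"]

percentages = (
    pd.Series(endpoint_regions)
    .value_counts(normalize=True)  # fraction of endpoints per region, most common first
    .mul(100)
    .round(2)
)
print(percentages)
```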


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<font size="3.5"><p><b>Task 1.1:</b>  Find the neurons that project across the fewest and the most brain regions.
    
</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
    
# Section 3: Analyzing Connectivity Between Brain Regions

<font size='3.5'> Next, let's ....</font>

In [19]:
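# Get the CCF ID of each soma as a plain Python scalar
soma_ccf_ids = list()
for skel in skel_list:
    # np.ravel guards against IDs returned as nested or length-1 arrays
    ids = np.ravel(get_ccf_ids(skel, compartment_type=1))
    if ids.size > 0:  # skip skeletons without a labeled soma vertex
        soma_ccf_ids.append(ids[0].item())
print("Distribution of Soma Locations...")
report_distribution(soma_ccf_ids)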

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<font size='3.5'> Let's sample a brain region that contains somas from our dataset, then analyze the projections of these neurons.  </font>
    

<font size="3.5" color='orange'><b> Say a little more </font><b>
    

In [20]:
# Sample a CCF ID, then extract the skeletons whose soma lies in that region
# (np.ravel(...)[0] guards against IDs stored as length-1 arrays)
ccf_id = np.ravel(sample(soma_ccf_ids, 1)[0])[0].item()
skels_subset = [
    skel for skel in skel_list
    if np.ravel(skel.vertex_properties['ccf'][skel.root])[0] == ccf_id
]
print(f"# Skeletons with Soma in {get_ccf_name(ccf_id)}:", len(skels_subset))

# Get CCF regions of endpoints
axon_endpoints_ccf = list()
dendrite_endpoints_ccf = list()
for skel in skels_subset:
    axon_endpoints_ccf.extend(get_ccf_ids(skel, compartment_type=3, vertex_type="end_points").tolist())
    dendrite_endpoints_ccf.extend(get_ccf_ids(skel, compartment_type=2, vertex_type="end_points").tolist())

# Report distribution
print("\nDistribution of Axon Endpoints in CCF Space...")
report_distribution(axon_endpoints_ccf)

print("\nDistribution of Dendrite Endpoints in CCF Space...")
report_distribution(dendrite_endpoints_ccf)
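
<font size="3.5"> To summarize connectivity between regions rather than for one region at a time, one option is a table of soma region versus axon endpoint region. The sketch below builds such a table from a small, made-up list of endpoints with <code>pandas.crosstab</code>; the column names and region labels are assumptions, not part of the dataset. </font>

```python
import pandas as pd

# Hypothetical table with one row per axon endpoint
endpoints = pd.DataFrame(
    {
        "soma_region": ["Region A", "Region A", "Region A", "Region B", "Region B"],
        "endpoint_region": ["Region B", "Region B", "Region C", "Region A", "Region C"],
    }
)

# Rows are soma regions, columns are regions that contain axon endpoints
connectivity = pd.crosstab(endpoints["soma_region"], endpoints["endpoint_region"])
print(connectivity)
```

<font size="3.5"> Each entry counts the axon endpoints in the column region that belong to neurons whose soma lies in the row region. </font>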