<a href="https://colab.research.google.com/github/tmckim/NS479_NeuroTechniquesLab/blob/main/SP26/NS479_Connectome_Interpreter_Plotting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Notebook for Lab01 Connectomics Using Codex
## Using the Connectome Interpreter to Plot Neural Connectivity

Resources:
* Codex: https://flywire.ai/ <br>
* Connectome Interpreter package: https://github.com/YijieYin/connectome_interpreter

---
<a name="save"></a>
## Before you start - Save this notebook! 💾

When you open a Colab notebook from a shared link, you cannot save changes to it. It's best to store a copy of the notebook in your personal Drive (`"File > Save a copy in drive..."`) **before** you do anything else.

The file will open in a new tab in your web browser, and it is automatically named something like: "**Copy of NS479_Connectome_Interpreter_Plotting.ipynb**". <br>

You can rename this to just the title of the assignment "**NS479_Connectome_Interpreter_Plotting.ipynb**". <br>

Keep an informative name (such as the name of the assignment) so that you can find this notebook again after you complete this part of the assignment.

___

**Where does the notebook get saved in Google Drive?**

By default, the notebook will be copied to a folder called “Colab Notebooks” at the root (home directory) of your Google Drive. If you use this for other courses or personal code notebooks, I recommend creating a folder for this course and then moving the assignments AFTER you have completed them.



____
# Learning Objectives
## At the end of this lab, you'll be able to:
* Explore neuron connectivity and statistics in Codex (web browser: https://flywire.ai/) 🔎 📈
* Become familiar with anatomy of *Drosophila* (fruit fly) neurons 🧠 🪰
* Identify different brain regions in the *Drosophila* brain 🔗 🗣
* Analyze connectivity between taste and endocrine cells 🍪 🔊
* Visualize and compare connectivity data using both web-based tools (Codex) and this code notebook 💻 📔
______

# Quick Intro to Jupyter (Colab) Notebooks 📓

This section will introduce you to Jupyter Notebooks 📓 💻, a handy coding environment for learning as well as sharing code with others.

### At the end of this notebook, you'll be able to:
* Recognize the main features of Jupyter Notebooks
* Use Jupyter Notebooks to run Python 3 🐍 code

### About Jupyter Notebooks


Jupyter notebooks are a way to combine executable code, code outputs, and text into one connected file. They run in a web browser. 📶

The <b>'kernel'</b> is the thing that executes your code. It is what connects the notebook (as you see it) with the part of your computer that runs code.

### Types of Cells
Jupyter Notebooks have two types of cells: <b>Markdown</b> (like this one) and <b>Code</b>. Most of the time you won't need to run the Markdown cells; just read through them. However, when we get to a code cell, you need to tell Jupyter to run the lines of code that it contains.

Code cells will be read by the Python interpreter. In other words, the Python kernel will run whatever it recognizes as code within the cell.


In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the cell below by clicking on the 'play arrow button' ▶ in the top left corner, or using the keys: shift + return (mac) or shift + enter (pc)
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# In Python, anything after a "#" is a comment:
# it is a note for human readers and is ignored when the code runs.
# You can run a cell (this box) by pressing shift-enter or shift-return.
# Click in this cell and then press shift and enter simultaneously.
# The print function below displays a message.
print('Nice work!')

______

# Setup

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the cells below to set up your notebook environment
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
#@title # Step 1: Install the Connectome Interpreter package

%%capture
!pip install git+https://github.com/YijieYin/connectome_interpreter.git --no-deps

In [None]:
#@title # Step 2: Other specific visualization packages needed

%%capture
# optional dependency, only needed for any information-flow-related functions: layered_el(), plot_flow_layered_paths()
!pip install navis -U
# optional dependency, needed for any interactive pathway plotting - ** MAKE SURE TO INCLUDE
!pip install pyvis

In [None]:
#@title # Step 3: Import common packages needed

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.colors as mcl
from matplotlib.colors import Normalize
from matplotlib.cm import viridis
import re
import scipy as sp
import plotly.express as px
import seaborn as sns

from connectome_interpreter import *

In [None]:
#@title # Step 4: Connect to your Google Drive to save today's figures directly

# Mount your Google Drive so that figures can be saved to it
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

In [None]:
#@title # Step 5: Set up a folder in Google Drive to save the figures

import os

# ====== EDIT THIS ONE LINE ONLY ======
folder_name = "NS479_Lab01_ConnectomicsFigures"

base_path = "/content/gdrive/MyDrive"
save_path = os.path.join(base_path, folder_name)

# check if folder already exists; if not, make it
if os.path.isdir(save_path):
    print("Given directory exists")
else:
    print("Given directory doesn't exist, making it")
    os.makedirs(save_path, exist_ok=True)

# change into the folder for the session
#os.chdir(save_path)
#print("Changed into directory")

print(f"Figures from today's session will be saved in Google Drive folder called: {folder_name}")
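
The folder setup in Step 5 relies on two standard-library calls: `os.path.join` to build the path and `os.makedirs(..., exist_ok=True)` to create the folder without failing when it already exists. Here is a minimal sketch of that behavior. It uses a temporary folder as a stand-in for `/content/gdrive/MyDrive` (an assumption, so that it runs without Drive mounted), and the `demo_` names are only for illustration.

```python
import os
import tempfile

# Stand-in for "/content/gdrive/MyDrive" so the sketch runs anywhere (assumption).
demo_base = tempfile.mkdtemp()
demo_folder = "NS479_Lab01_ConnectomicsFigures"
demo_path = os.path.join(demo_base, demo_folder)

# exist_ok=True makes this safe to repeat: no error if the folder already exists.
os.makedirs(demo_path, exist_ok=True)
os.makedirs(demo_path, exist_ok=True)

print(os.path.isdir(demo_path))
```

Because of `exist_ok=True`, re-running the setup cell after a restart should not raise an error or overwrite figures that are already in the folder.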

----

# Read in the datasets

We will work with the following two datasets:
- FAFB (**F**emale **A**dult **F**ly **B**rain): [Dorkenwald et al. 2024](https://www.nature.com/articles/s41586-024-07558-y) & [Schlegel et al. 2024](https://www.nature.com/articles/s41586-024-07686-5) <br>
  * [Eckstein et al. 2024](https://www.cell.com/cell/abstract/S0092-8674(24)00307-6) (neurotransmitter prediction)
  * [Yu et al. 2025](https://www.biorxiv.org/content/10.1101/2025.07.11.664377v1) (new synapse detection (used here))
  * [Buhmann et al. 2021](https://www.nature.com/articles/s41592-021-01183-7) (old synapse detection).
- BANC (**B**rain **A**nd **N**erve **C**ord): [Bates et al. 2025](https://www.biorxiv.org/content/10.1101/2025.07.31.667571v1).


In [None]:
#@title Download FAFB data
%%capture
# FAFB
!wget https://github.com/YijieYin/connectome_data_prep/raw/refs/heads/main/data/fafb_all_neuron/fafb_inprop_all_neuron.npz
!wget https://github.com/YijieYin/connectome_data_prep/raw/refs/heads/main/data/fafb_all_neuron/fafb_ad_inprop_all_neuron.npz
!wget https://github.com/YijieYin/connectome_data_prep/raw/refs/heads/main/data/fafb_all_neuron/fafb_syncount_all_neuron.npz


In [None]:
#@title Download BANC data
%%capture
# BANC
!wget https://github.com/YijieYin/connectome_data_prep/raw/refs/heads/main/data/BANC/banc_inprop_all_neuron.npz
!wget https://github.com/YijieYin/connectome_data_prep/raw/refs/heads/main/data/BANC/banc_syncount_all_neuron.npz
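
Because `%%capture` hides everything the download cells print, a failed download can go unnoticed until a later cell cannot find the file. One way to check is sketched below. The helper `missing_files` is hypothetical (it is not part of any package used here), and the demo runs in a temporary folder with a placeholder file. In Colab, the folder to check would be `/content`.

```python
import os
import tempfile

def missing_files(folder, names):
    """Return the names that are absent or empty in `folder` (hypothetical helper)."""
    missing = []
    for name in names:
        path = os.path.join(folder, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing

# Demo in a temporary folder; in Colab the folder would be "/content".
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "banc_inprop_all_neuron.npz"), "wb") as f:
    f.write(b"placeholder bytes")  # stands in for a finished download

expected = ["banc_inprop_all_neuron.npz", "banc_syncount_all_neuron.npz"]
print(missing_files(demo_dir, expected))
```

An empty list means every expected file is present. Any name that is printed should be downloaded again before moving on.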

# Read Type-Function Sheet

In [None]:
#@title # Authenticate with Google to read the spreadsheet with needed info
from google.colab import auth
auth.authenticate_user()

import gspread
from google.auth import default
creds, _ = default()

gc = gspread.authorize(creds)

In [None]:
#@title # Run this to access it
worksheet = gc.open_by_key('1VHCEnurOdb4FDC_NUKZX_BpBckQ9LpKxv0CsK_ObVok')
df = pd.DataFrame(worksheet.sheet1.get_all_records())
type_to_function = dict(zip(df.cell_type, df.known_function))

---

# Read Data Into Notebook Environment

In [None]:
#@title # FAFB
fafb_inprop = sp.sparse.load_npz('/content/fafb_inprop_all_neuron.npz')
fafb_ad_inprop = sp.sparse.load_npz('/content/fafb_ad_inprop_all_neuron.npz')
fafb_syncount = sp.sparse.load_npz('/content/fafb_syncount_all_neuron.npz')

fafb_meta = pd.read_csv('https://raw.githubusercontent.com/YijieYin/connectome_data_prep/refs/heads/main/data/fafb_all_neuron/fafb_all_neuron_meta.csv',
                   index_col=0)

fafb_meta['type_side'] = fafb_meta.cell_type + '_' + fafb_meta.side

# make dictionaries that map indices to fafb_meta info
fafb_idx_to_type = dict(zip(fafb_meta.idx, fafb_meta.cell_type))
fafb_idx_to_type_side = dict(zip(fafb_meta.idx, fafb_meta.type_side))
fafb_idx_to_sign = dict(zip(fafb_meta.idx, fafb_meta.sign))
fafb_idx_to_side = dict(zip(fafb_meta.idx, fafb_meta.side))
fafb_type_to_sign = dict(zip(fafb_meta.cell_type, fafb_meta.sign))
fafb_type_side_to_sign = dict(zip(fafb_meta.type_side, fafb_meta.sign))

In [None]:
#@title # BANC
banc_inprop = sp.sparse.load_npz('/content/banc_inprop_all_neuron.npz')
banc_syncount = sp.sparse.load_npz('/content/banc_syncount_all_neuron.npz')

banc_meta = pd.read_csv('https://raw.githubusercontent.com/YijieYin/connectome_data_prep/refs/heads/main/data/BANC/banc_meta_all_neuron.csv',)
banc_meta['type_side'] = banc_meta.cell_type + '_' + banc_meta.soma_side
banc_idx_to_type = dict(zip(banc_meta.idx, banc_meta.cell_type))
banc_idx_to_side = dict(zip(banc_meta.idx, banc_meta.soma_side))
banc_idx_to_type_side = dict(zip(banc_meta.idx, banc_meta.type_side))
banc_idx_to_sign = dict(zip(banc_meta.idx, banc_meta.sign))
banc_type_to_sign = dict(zip(banc_meta.cell_type, banc_meta.sign))
banc_type_side_to_sign = dict(zip(banc_meta.type_side, banc_meta.sign))

In [None]:
#@title For access to function spreadsheet
if 'type_to_function' in locals():
  # use function if available, otherwise cell_type
  fafb_idx_to_function = {idx: type_to_function[t] if t in type_to_function else t for idx, t in fafb_idx_to_type.items()}
  banc_idx_to_function = {idx: type_to_function[t] if t in type_to_function else t for idx, t in banc_idx_to_type.items()}


  fafb_function_to_sign = {fafb_idx_to_function[idx]: fafb_type_to_sign[t] for idx, t in fafb_idx_to_type.items()}
  banc_function_to_sign = {banc_idx_to_function[idx]: banc_type_to_sign[t] for idx, t in banc_idx_to_type.items()}


  fafb_idx_to_side_function = {idx: fafb_idx_to_side[idx] + ':' + fafb_idx_to_function[idx] for idx in fafb_idx_to_type}
  banc_idx_to_side_function = {idx: banc_idx_to_side[idx] + ':' + banc_idx_to_function[idx] for idx in banc_idx_to_type}
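
The cell above labels each neuron with its known function when the spreadsheet has one, and falls back to the cell type otherwise. The sketch below shows the same fallback on toy data using `dict.get(key, default)`, which is an equivalent and shorter way to write `d[t] if t in d else t`. All names and values starting with `toy_` are made up for illustration.

```python
# Toy stand-ins for the real dictionaries (all values here are made up).
toy_type_to_function = {"TypeA": "example function"}
toy_idx_to_type = {0: "TypeA", 1: "TypeB"}

# dict.get(key, default) returns the default when the key is missing,
# so cell types without an annotation fall back to their own type name.
toy_idx_to_function = {
    idx: toy_type_to_function.get(t, t) for idx, t in toy_idx_to_type.items()
}
print(toy_idx_to_function)
```

Neuron 0 gets the function label because its type is in the spreadsheet dictionary. Neuron 1 keeps its type name because it is not.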



----

# Analysis

## Quick plots to review the datasets and orient ourselves

In [None]:
# Plot of frequency of synapse counts for FAFB and BANC datasets

plt.figure(figsize=(10, 5))
plt.hist(fafb_syncount.data, bins=100, alpha=0.3, color='blue')
plt.hist(banc_syncount.data, bins=100, alpha=0.3, color='red')

plt.yscale('log')
#plt.xscale('log')
plt.xlabel('synapse count')
plt.ylabel('frequency')
plt.legend(['FAFB', 'BANC'])
plt.show()

Conclusion: The FAFB dataset generally has more synapses than the BANC dataset.

In [None]:
# Plot of frequency of input proportions for FAFB and BANC datasets

plt.figure(figsize=(10, 5))
plt.hist(fafb_inprop.data, bins=100, alpha=0.3, color='blue')
plt.hist(banc_inprop.data, bins=100, alpha=0.3, color='red')

plt.yscale('log')
# plt.xscale('log')
plt.xlabel('input proportion')
plt.ylabel('frequency')
plt.legend(['FAFB', 'BANC'])
plt.show()

Conclusion: The distribution of input proportions does not differ noticeably between the two datasets.
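
To make the two histograms easier to relate, here is a small sketch of how an input proportion can be computed from synapse counts. It assumes that "input proportion" means the synapses a neuron receives from one partner divided by all the synapses that neuron receives. The downloaded `inprop` matrices were prepared elsewhere and may be normalized slightly differently. The toy neurons and counts are made up.

```python
# Toy synapse counts as {(pre, post): count}; the numbers are made up.
toy_syncount = {("A", "C"): 30, ("B", "C"): 10, ("A", "D"): 5}

# Total input synapses received by each postsynaptic neuron.
total_input = {}
for (pre, post), count in toy_syncount.items():
    total_input[post] = total_input.get(post, 0) + count

# Input proportion: share of a neuron's input that comes from each partner
# (assumed definition, see the note above).
toy_inprop = {
    (pre, post): count / total_input[post]
    for (pre, post), count in toy_syncount.items()
}
print(toy_inprop)
```

Note that neuron D receives only 5 synapses, yet all of its input comes from A. Under this definition a small synapse count can still be a large input proportion, which is one reason to look at both measures.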

# Connectivity Between Corazonin (CRZ) neurons and Descending Neurons (DNs)

## Dataset 1: FAFB

In [None]:
# FAFB - start by selecting neurons we are interested in (not all 140,000!)

# define the starting neurons
fafb_crz = fafb_meta.idx[fafb_meta.cell_type == 'CRZ']
# define what they connect to
fafb_desc = fafb_meta.idx[(fafb_meta.super_class == 'descending')]

In [None]:
# Find *ALL* paths (connections) based on proportion of input

paths_fafb_all = find_paths_of_length(fafb_inprop, inidx = fafb_crz, # we added our input CRZ neurons here
                             outidx = fafb_desc,                     # we added our output DNs here
                             target_layer_number=1)
paths_fafb_all = group_paths(paths_fafb_all, fafb_idx_to_function, fafb_idx_to_function)
plot_paths(paths_fafb_all, neuron_to_sign=fafb_function_to_sign,
                                 interactive = True)

# This one is not automatically saved -
# move the dots and save manually if you would prefer (not required)

### If you want to save a copy of the figure above, first drag the circles into positions that make it easier to read. Then right-click the image, choose 'Save As', and name the file and save it somewhere you can find it later. Otherwise there is no easy way to save changes made in the interactive mode (we will save a static snapshot below).

In [None]:
# Another way to view the data to see everything

paths_fafb_all.sort_values(by = ['weight'],ascending = False)

In [None]:
# How many are in this list?

print(f"There are {len(paths_fafb_all)} connections shown on the plot")

In [None]:
# Filtered version of above

# This is to save the figure we make
import os
cwd = os.getcwd()
os.chdir(save_path)
# ---- Save ----
fig_name = "Figure01_paths_fafb_filtered"
full_path = os.path.join(save_path, fig_name)

# This is the code to actually make the figure
paths_fafb = find_paths_of_length(fafb_inprop, inidx = fafb_crz, # we added our input CRZ neurons here
                             outidx = fafb_desc,                 # we added our output DNs here
                             target_layer_number=1)
paths_fafb = group_paths(paths_fafb, fafb_idx_to_function, fafb_idx_to_function)
paths_fafb = filter_paths(paths_fafb, 0.01)                     # threshold to get more consistent connections across datasets
plot_paths(paths_fafb, neuron_to_sign=fafb_function_to_sign,
           interactive = False,
           save_plot=True,
           file_name=fig_name)

print(f"Saved static figure to: {fig_name}")

os.chdir(cwd)
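
The filtered figure keeps only connections whose weight passes the threshold given to `filter_paths` (here 0.01, so weak connections are dropped). The sketch below illustrates the idea of thresholding on a toy list of connections. It is not the library's implementation: the helper `keep_strong` is hypothetical, the target names and weights are made up, and whether the real function uses "at least" or "greater than" is an assumption.

```python
# Toy grouped connections; names and weights are made up for illustration.
toy_paths = [
    {"pre": "CRZ", "post": "DN_example_1", "weight": 0.040},
    {"pre": "CRZ", "post": "DN_example_2", "weight": 0.004},
    {"pre": "CRZ", "post": "DN_example_3", "weight": 0.015},
]

def keep_strong(paths, threshold):
    """Keep connections whose weight is at least `threshold` (hypothetical helper)."""
    return [p for p in paths if p["weight"] >= threshold]

strong = keep_strong(toy_paths, 0.01)
print([p["post"] for p in strong])
```

Raising the threshold gives a sparser, easier-to-read plot. Lowering it shows more of the weak connections.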

## Dataset 2: BANC

In [None]:
# BANC - start by selecting neurons we are interested in (not all 160,000!)

# define the starting neurons
banc_crz = banc_meta.idx[banc_meta.cell_type == 'l_NSC_CRZ']
# define what they connect to
banc_desc = banc_meta.idx[(banc_meta.super_class == 'descending')]

In [None]:
# Find *ALL* paths (connections) based on proportion of input

paths_banc_all = find_paths_of_length(banc_inprop, inidx = banc_crz,    # we added our input CRZ neurons here
                             outidx = banc_desc,                        # we added our output DNs here
                             target_layer_number=1)
paths_banc_all = group_paths(paths_banc_all, banc_idx_to_function, banc_idx_to_function)
plot_paths(paths_banc_all, neuron_to_sign=banc_function_to_sign,
           interactive = False)

# This plot is static (interactive = False) and is not automatically saved -
# the filtered version in the next cell is saved to your Google Drive folder

In [None]:
# Filtered version of above

# This is to save the figure we make
cwd = os.getcwd()
os.chdir(save_path)
# ---- Save ----
fig_name = "Figure02_paths_banc_filtered"
full_path = os.path.join(save_path, fig_name)

# This is the code to actually make the figure
paths_banc = find_paths_of_length(banc_inprop, inidx = banc_crz,       # we added our input CRZ neurons here
                             outidx = banc_desc,                       # we added our output DNs here
                             target_layer_number=1)
paths_banc = group_paths(paths_banc, banc_idx_to_function, banc_idx_to_function)
paths_banc = filter_paths(paths_banc, 0.01)                 # threshold to get more consistent connections across datasets
plot_paths(paths_banc, neuron_to_sign=banc_function_to_sign,
           interactive = False,
           save_plot=True,
           file_name=fig_name)

print(f"Saved static figure to: {fig_name}")

os.chdir(cwd)
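
Both saving cells switch into `save_path` with `os.chdir` and switch back at the end. If the plotting code raises an error in between, the final `os.chdir(cwd)` never runs and the notebook stays in the Drive folder. A possible safeguard is sketched below using a small context manager. The helper `working_directory` is hypothetical, and a temporary folder stands in for `save_path`.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily switch to `path`, then always switch back (hypothetical helper)."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)

start_dir = os.getcwd()
demo_dir = tempfile.mkdtemp()  # stands in for save_path

try:
    with working_directory(demo_dir):
        raise RuntimeError("simulated plotting failure")
except RuntimeError:
    pass

print(os.getcwd() == start_dir)
```

In the saving cells, the plotting call would go inside the `with working_directory(save_path):` block in place of the simulated error.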

# Wrapping Up

Before you finish the lab, double-check that you have completed all of the following items for your write-up:


1.   You have saved all the plots and can locate them in your Google Drive

*  Plots are saved here:  <font color="red" size=4> MyDrive/NS479_Lab01_ConnectomicsFigures</font>

2. Go back to the activity worksheet and answer the final questions

---



## Questions ⁉
If you're unsure or have questions, please ask us!


---
<a name="credits"></a>
# Technical Notes & Credits 👏 🧑

The exercises for this notebook were adapted from our colleague Yijie Yin, who developed the [Connectome Interpreter](https://github.com/YijieYin/connectome_interpreter).

*   There are additional code exercises and notebooks if you are interested in working with these datasets further

The preprint describing this tool is available on [bioRxiv](https://www.biorxiv.org/content/10.1101/2025.09.29.679410v2).




