# MERFISH whole brain spatial transcriptomics (part 2b)

Using the cached results from part 2a, we continue our examples by looking at the expression of canonical neurotransmitter transporter genes and the gene Tac2 across the whole brain.

In [13]:
import os
import pandas as pd
import numpy as np
import anndata
import time

In [16]:
input_base = '/allen/programs/celltypes/workgroups/rnaseqanalysis/lydian/ABC_handoff'
input_directory = os.path.join( input_base, 'dataframes', 'MERFISH-C57BL6J-638850','20230630' )

view_directory = os.path.join( input_directory, 'views')
cache_views = False
if cache_views :
    os.makedirs( view_directory, exist_ok=True )

Read in the expanded cell metadata table we created in part 1.

In [17]:
file = os.path.join( input_directory,'views','cell_metadata_with_cluster_annotation.csv')
cell = pd.read_csv(file,dtype={"cell_label":str,"neurotransmitter":str})
cell.set_index('cell_label',inplace=True)

pred = (cell['low_quality_mapping'] == False)
cell = cell[pred]

In [18]:
cell.columns

Index(['brain_section_label', 'cluster_alias', 'average_correlation_score',
       'matrix_prefix', 'donor_label', 'low_quality_mapping', 'donor_genotype',
       'donor_sex', 'x', 'y', 'z', 'neurotransmitter', 'division', 'class',
       'subclass', 'supertype', 'cluster', 'neurotransmitter_color',
       'division_color', 'class_color', 'subclass_color', 'supertype_color',
       'cluster_color'],
      dtype='object')

Read in the gene expression dataframe we created in part 2a.

In [19]:
file = os.path.join( input_directory,'views','example_genes_all_cells_expression.csv')
exp = pd.read_csv(file,dtype={"cell_label":str})
exp.set_index('cell_label',inplace=True)

We define a helper function *aggregate_by_metadata* to compute the average expression of a set of genes for each value of a given metadata category.

In [20]:
def aggregate_by_metadata( df, gnames, value, sort=False ) :
    grouped = df.groupby(value)[gnames].mean()
    if sort :
        grouped = grouped.sort_values(by=gnames[0],ascending=False)
    return grouped
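
As a minimal sketch of what this helper returns, the snippet below applies it to a small made-up table. The cell labels, the gene names `GeneA` and `GeneB`, and all values are hypothetical and only serve to illustrate the grouping and the optional sort on the first gene.

```python
import pandas as pd

def aggregate_by_metadata(df, gnames, value, sort=False):
    # Same logic as the helper above: mean expression per metadata group.
    grouped = df.groupby(value)[gnames].mean()
    if sort:
        grouped = grouped.sort_values(by=gnames[0], ascending=False)
    return grouped

# Hypothetical toy data: four cells, two groups, two made-up genes.
toy = pd.DataFrame(
    {
        "neurotransmitter": ["Glut", "Glut", "GABA", "GABA"],
        "GeneA": [4.0, 6.0, 1.0, 1.0],
        "GeneB": [0.0, 2.0, 3.0, 5.0],
    },
    index=["c1", "c2", "c3", "c4"],
)

# One row per group, one column per gene, sorted by the first gene (GeneA).
toy_agg = aggregate_by_metadata(toy, ["GeneA", "GeneB"], "neurotransmitter", sort=True)
print(toy_agg)
```

With `sort=True`, the rows are ordered by the mean of the first gene in `gnames` only; the remaining genes do not affect the order.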

### Expression of canonical neurotransmitter transporter genes

During analysis, clusters were assigned neurotransmitter identities based on the expression of canonical neurotransmitter transporter genes. In this example, we create a dataframe comprising the expression of the seven solute carrier family genes listed below for all the cells in the dataset. We then group the cells by their assigned neurotransmitter class, compute the mean expression for each group and visualize the result as a colorized table.

The results are similar to those in part 1. Using data from the whole brain, gene Slc17a7 is now most enriched in glutamatergic assigned cells. Gene Slc17a6 is most enriched in noradrenergic, then cholinergic types. Genes Slc6a5, Slc6a3 and Slc6a4 show high specificity to glycinergic, dopaminergic and serotonergic types, respectively.

ntgenes = ['Slc17a7','Slc17a6','Slc17a8','Slc32a1','Slc6a5','Slc6a3','Slc6a4']
filtered = exp[ntgenes]
joined = cell.join( filtered )
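
The join above relies on both tables sharing the `cell_label` index. The following is a small sketch of that behavior, using hypothetical stand-in tables (`toy_cell`, `toy_exp`) and a made-up gene `GeneA`. It illustrates that `DataFrame.join` performs a left join on the index by default, so row order follows the left table and cells without expression values receive `NaN`.

```python
import pandas as pd

# Hypothetical stand-ins for the `cell` and `exp` tables, both indexed by cell_label.
toy_cell = pd.DataFrame(
    {"neurotransmitter": ["Glut", "GABA", "Chol"]},
    index=pd.Index(["c1", "c2", "c3"], name="cell_label"),
)
toy_exp = pd.DataFrame(
    {"GeneA": [5.0, 0.5]},
    # Rows are in a different order and cell c3 is missing on purpose.
    index=pd.Index(["c2", "c1"], name="cell_label"),
)

# Left join on the index: every row of toy_cell is kept, values align by label.
toy_joined = toy_cell.join(toy_exp)
print(toy_joined)
```

Because the later group means are computed with `mean()`, which skips `NaN` values by default, cells without a match would not contribute to the averages.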

In [29]:
agg = aggregate_by_metadata( joined, ntgenes, 'neurotransmitter' )
agg = agg[ntgenes]
agg.style.background_gradient(cmap='Reds')

| neurotransmitter | Slc17a7 | Slc17a6 | Slc17a8 | Slc32a1 | Slc6a5 | Slc6a3 | Slc6a4 |
|---|---|---|---|---|---|---|---|
| Chol | 1.336436 | 1.825008 | 0.908047 | 0.775578 | 0.348834 | 0.102714 | 0.12668 |
| Dopa | 1.124439 | 1.19346 | 0.167337 | 2.669397 | 0.107171 | 4.27548 | 0.121355 |
| GABA | 1.292743 | 0.330417 | 0.189557 | 4.356985 | 0.13847 | 0.073594 | 0.071125 |
| GABA-Glyc | 0.551766 | 0.866311 | 0.15279 | 5.134373 | 4.611229 | 0.072027 | 0.069877 |
| Glut | 5.45857 | 1.465431 | 0.19853 | 0.644799 | 0.1686 | 0.065161 | 0.060973 |
| Glut-GABA | 1.584636 | 1.415066 | 3.167689 | 4.636293 | 0.139741 | 0.203079 | 0.19931 |
| Hist | 0.414171 | 0.273116 | 0.09279 | 0.203777 | 0.042676 | 0.019365 | 0.037957 |
| Nora | 0.350148 | 2.256105 | 0.242355 | 0.509979 | 0.421445 | 0.078692 | 0.047854 |
| Sero | 0.249213 | 0.516299 | 2.996847 | 0.809066 | 0.279549 | 0.087234 | 6.667489 |
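
One way to read off which group is most enriched for each gene is `DataFrame.idxmax`, which returns the row label of the column-wise maximum. The sketch below is self-contained: it re-enters the mean values for five of the genes from the table above, rounded to three decimals, under the hypothetical name `means`. In the notebook itself, calling `agg.idxmax()` on the aggregated table should give the same kind of summary.

```python
import pandas as pd

# Mean expression per neurotransmitter group, copied from the table above
# and rounded to three decimals (five of the seven genes).
means = pd.DataFrame(
    {
        "Slc17a7": [1.336, 1.124, 1.293, 0.552, 5.459, 1.585, 0.414, 0.350, 0.249],
        "Slc17a6": [1.825, 1.193, 0.330, 0.866, 1.465, 1.415, 0.273, 2.256, 0.516],
        "Slc6a5": [0.349, 0.107, 0.138, 4.611, 0.169, 0.140, 0.043, 0.421, 0.280],
        "Slc6a3": [0.103, 4.275, 0.074, 0.072, 0.065, 0.203, 0.019, 0.079, 0.087],
        "Slc6a4": [0.127, 0.121, 0.071, 0.070, 0.061, 0.199, 0.038, 0.048, 6.667],
    },
    index=["Chol", "Dopa", "GABA", "GABA-Glyc", "Glut", "Glut-GABA", "Hist", "Nora", "Sero"],
)

# For each gene (column), the group (row label) with the highest mean expression.
top_group = means.idxmax()
print(top_group)
```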


Grouping expression by brain section shows that each of these genes has a distinct spatial pattern.

In [38]:
agg = aggregate_by_metadata( joined, ntgenes, 'brain_section_label' )
agg = agg.loc[list(reversed(list(agg.index)))]
agg.style.background_gradient(cmap='Reds')

| brain_section_label | Slc17a7 | Slc17a6 | Slc17a8 | Slc32a1 | Slc6a5 | Slc6a3 | Slc6a4 |
|---|---|---|---|---|---|---|---|
| C57BL6J-638850.69 | 1.596082 | 0.102863 | 0.033591 | 3.268463 | 0.013872 | 0.061025 | 0.085603 |
| C57BL6J-638850.68 | 1.701641 | 0.234637 | 0.077904 | 3.005675 | 0.041118 | 0.136117 | 0.05007 |
| C57BL6J-638850.67 | 2.20549 | 0.360153 | 0.112344 | 3.257303 | 0.039684 | 0.194501 | 0.059391 |
| C57BL6J-638850.66 | 2.712561 | 0.463372 | 0.061934 | 3.460905 | 0.052186 | 0.224061 | 0.071751 |
| C57BL6J-638850.64 | 2.32188 | 0.801335 | 0.055074 | 1.929805 | 0.032504 | 0.106636 | 0.045241 |
| C57BL6J-638850.62 | 3.847128 | 0.596643 | 0.104738 | 1.759757 | 0.069106 | 0.10116 | 0.039346 |
| C57BL6J-638850.61 | 4.438801 | 0.457742 | 0.221816 | 1.608314 | 0.085238 | 0.087731 | 0.060141 |
| C57BL6J-638850.60 | 5.213249 | 0.5016 | 0.26211 | 1.621542 | 0.114276 | 0.101325 | 0.0725 |
| C57BL6J-638850.59 | 5.266406 | 0.450183 | 0.186309 | 1.411635 | 0.088505 | 0.052575 | 0.038246 |
| C57BL6J-638850.58 | 5.281095 | 0.353223 | 0.176152 | 1.144399 | 0.108922 | 0.04336 | 0.028584 |
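
The row reversal in the cell above, `agg.loc[list(reversed(list(agg.index)))]`, selects rows by their labels in reverse order. As a small sketch on a hypothetical table (the `sec.*` labels and `GeneA` are made up), the same result can be obtained with positional slicing, assuming the index labels are unique.

```python
import pandas as pd

# Hypothetical aggregated table with one row per section label.
toy_agg = pd.DataFrame(
    {"GeneA": [1.0, 2.0, 3.0]},
    index=["sec.58", "sec.59", "sec.60"],
)

# Label-based reversal, as written in the notebook cell.
by_labels = toy_agg.loc[list(reversed(list(toy_agg.index)))]

# Position-based reversal: step backwards through the rows.
by_position = toy_agg.iloc[::-1]

print(by_labels.equals(by_position))
```

Either form only changes the display order of the rows; the values in each row are unchanged.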


### Expression of Tachykinin 2 (Tac2) in the whole brain

In mice, the tachykinin 2 (Tac2) gene encodes a neuropeptide called neurokinin B (NkB). Tac2 is produced by neurons in specific regions of the brain known to be involved in emotion and social behavior. Based on [ISH data](https://mouse.brain-map.org/experiment/siv?id=77279001&imageId=77284584&initImage=ish&coordSystem=pixel&x=5384.5&y=3832.5&z=3) from the Allen Mouse Brain Atlas, Tac2 is sparsely expressed in the mouse isocortex and densely enriched in specific subcortical regions such as the medial habenula (MH), the amygdala and the hypothalamus.

In this example, we create a dataframe comprising expression values of Tac2 for all cells across the whole brain. As with the single brain section example, grouping expression by neurotransmitter shows that Tac2 is enriched in cholinergic cell types. With the rest of the brain included, we can observe that Tac2 is also enriched in Glut-GABA cell types.

In [39]:
exgenes = ['Tac2']
filtered = exp[exgenes]
joined = cell.join( filtered )
agg = aggregate_by_metadata( joined, exgenes, 'neurotransmitter', True )
agg.style.background_gradient(cmap='Reds')

| neurotransmitter | Tac2 |
|---|---|
| Glut-GABA | 0.996132 |
| Chol | 0.86014 |
| GABA | 0.245813 |
| GABA-Glyc | 0.206716 |
| Glut | 0.183282 |
| Hist | 0.172152 |
| Nora | 0.13576 |
| Dopa | 0.13044 |
| Sero | 0.121097 |


Grouping by class shows that Tac2 is enriched in class "08 MH-LH Glut", whose cells are restricted to the medial (MH) and lateral (LH) habenula and are a mixture of glutamatergic and cholinergic types, and in class "04 CGE GABA", which consists of GABAergic cells originating from the caudal ganglionic eminence (CGE).

In [25]:
agg = aggregate_by_metadata( joined, exgenes, 'class', True ).head(8)
agg.style.background_gradient(cmap='Reds')

| class | Tac2 |
|---|---|
| 08 MH-LH Glut | 3.684035 |
| 04 CGE GABA | 1.193551 |
| 14 CNU-HYa GABA | 0.688624 |
| 11 HY GABA | 0.528664 |
| 10 HY MM Glut | 0.428825 |
| 15 HY Glut | 0.364134 |
| 03 MOB-DG-IMN | 0.223259 |
| 21 P GABA | 0.222121 |


At the next level, grouping by subclass reveals enrichment in highly anatomically localized cell types, in regions such as the medial habenula (MH), bed nuclei of the stria terminalis (BST), spinal nucleus of the trigeminal (SPVC), main olfactory bulb (MOB), central amygdalar nucleus (CEA) and arcuate hypothalamic nucleus (ARH).

In [26]:
agg = aggregate_by_metadata( joined, exgenes, 'subclass', True ).head(15)
agg.style.background_gradient(cmap='Reds')

| subclass | Tac2 |
|---|---|
| 105 BST Tac2 Gaba | 4.452095 |
| 063 MH Tac2 Glut | 4.409363 |
| 095 CEA-BST Crh Gaba | 2.967356 |
| 258 SPVC Nmu Glut | 2.448923 |
| 276 MOB-mi Frmd7 Gaba | 2.177203 |
| 037 Sncg Gaba | 2.050399 |
| 036 Vip Gaba | 1.967017 |
| 094 CEA-AAA-BST Ebf1 Gaba | 1.946971 |
| 113 PVHd-DMH Lhx6 Gaba | 1.541354 |
| 121 ARH-PVp Tbx3 Glut | 1.319788 |
