In this notebook we improve the region coverage that is already built into the metadata.
Cerebellar lobes will be split into their leaf regions (Purkinje, molecular and granular layers).
The OLF glomerular layer will be created from unassigned regions.

In [17]:
import os
import pickle
import pandas as pd
import numpy as np

In [18]:
#Updated Metadata

file = '/gpfs/bbp.cscs.ch/data/project/proj84/csaba/aibs_10x_mouse_wholebrain/metadata/parcellation_to_parcellation_term_membership_extend.csv'
parcellation_annotation = pd.read_csv(file)


Need to create rows in parcellation_annotation

## CRB

Rules for Cerebellum:
- As for neuron types, molecular layers are __purely__ inhibitory? YES
- No glutamatergic cells can be found in the molecular layers? YES
- Astrocytes are common in the molecular layers? It seems YES (verify in the literature)
- Astrocytes can be found in the Purkinje layers? YES, this is where their somata are located. These astrocytes are called Bergmann glia
- Microglia (immune cells) can be found in every layer? It seems YES (verify in the literature)
- Oligos are present in every layer? YES (I need more time to read about it)
- Oligos are rare in the molecular layers? YES (I need more time to read about it)

As a result:
- Purkinje layers will inherit 100% of the Purkinje cells, 100% of the Bergmann glia, and microglia
- Molecular layers (PURELY INH) will not inherit Bergmann glia, Purkinje cells, or GLUT cells
- Granular layers (EXC) will inherit everything except Purkinje cells, Bergmann glia, and GABA cells (Golgi cells are the exception and are kept)

For info see: https://www.frontiersin.org/articles/10.3389/fninf.2019.00037/full
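
The rules above amount to set operations on cluster labels. The following is a minimal sketch on made-up data: the cluster names and the `cells` records are hypothetical and not taken from `CB_cells_small.csv`, and Golgi cells are identified by supertype here for brevity, whereas the notebook excludes them by explicit cluster name.

```python
# Toy records: (cluster, supertype, neurotransmitter). All names are invented.
cells = [
    ("P1", "Purkinje", "GABA"),
    ("B1", "Bergmann", None),
    ("M1", "Microglia", None),
    ("G1", "Granule", "Glut"),
    ("Go1", "Golgi", "GABA-Glyc"),
    ("S1", "Stellate", "GABA"),
    ("A1", "Astro", None),
]

all_clusters = {c for c, _, _ in cells}
purkinje = {c for c, s, _ in cells if "Purkinje" in s}
bergmann = {c for c, s, _ in cells if "Bergmann" in s}
microglia = {c for c, s, _ in cells if "Microglia" in s}
golgi = {c for c, s, _ in cells if "Golgi" in s}
excitatory = {c for c, _, nt in cells if nt == "Glut"}
inhibitory = {c for c, _, nt in cells if nt in {"GABA", "GABA-Glyc", "Glut-GABA"}}

# Purkinje layer: Purkinje cells, Bergmann glia and microglia only.
purkinje_layer = purkinje | bergmann | microglia

# Molecular layer: drop the Purkinje-layer types and all Glut clusters,
# then add microglia back (they were removed with the Purkinje-layer types).
molecular_layer = (all_clusters - purkinje_layer - excitatory) | microglia

# Granular layer: drop the Purkinje-layer types and every inhibitory
# cluster except Golgi, then add microglia back.
granular_layer = (all_clusters - purkinje_layer - (inhibitory - golgi)) | microglia

for name, layer in [("purkinje", purkinje_layer),
                    ("molecular", molecular_layer),
                    ("granular", granular_layer)]:
    print(name, sorted(layer))
```

Working with sets makes the "everything except" rules explicit and avoids the repeated `not in` scans over lists.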

In [19]:
#Load Cerebellum specific cell types (calculated in test_cerebellum.ipynb)
cb_cells = pd.read_csv('/gpfs/bbp.cscs.ch/data/project/proj84/csaba/aibs_10x_mouse_wholebrain/metadata/MERFISH-C57BL6J-638850-CCF/20231215/views/CB_cells_small.csv')
cb_cells.set_index('cell_label',inplace=True)
cb_clusters = np.unique(cb_cells['cluster'])

In [20]:
#Purkinje layer
Purkinje_cells = cb_cells[cb_cells['supertype'].str.contains('Purkinje', na=False)]
purkinjes = np.unique(Purkinje_cells['cluster'])
Bergman_cells = cb_cells[cb_cells['supertype'].str.contains('Bergman', na=False)]
bergmann_glia = np.unique(Bergman_cells['cluster'])
microglia_cells = cb_cells[cb_cells['supertype'].str.contains('Microglia', na=False)]
microglia = np.unique(microglia_cells['cluster'])
excitatory_cells = cb_cells[cb_cells['neurotransmitter'].isin(['Glut'])]
excitatory = np.unique(excitatory_cells['cluster'])
inhibitory_cells = cb_cells[cb_cells['neurotransmitter'].isin(['GABA', 'GABA-Glyc', 'Glut-GABA',])]
inhibitory = np.unique(inhibitory_cells['cluster'])
print(len(inhibitory))
inhibitory_exc_golgi_cells = inhibitory_cells[~inhibitory_cells['cluster'].isin([ '5186 CBX Golgi Gly-Gaba_1',
 '5187 CBX Golgi Gly-Gaba_1',])]
inhibitory_exc_golgi = np.unique(inhibitory_exc_golgi_cells['cluster'])
print(len(inhibitory_exc_golgi))

210
208


In [21]:
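# Purkinje layers: Purkinje cells, Bergmann glia and microglia
l_purkinje_ctypes = [*purkinjes, *bergmann_glia, *microglia] #Done

# Molecular layers: everything except the Purkinje-layer types and the excitatory (Glut) clusters.
# l_purkinje_ctypes already contains the Bergmann glia, so that extra check is redundant but harmless.
l_molecular_ctypes = [ctype for ctype in [*cb_clusters] if ctype not in l_purkinje_ctypes and ctype not in bergmann_glia and ctype not in excitatory]
# Microglia were excluded together with l_purkinje_ctypes, so add them back
l_molecular_ctypes.extend(microglia)

# Granular layers: everything except the Purkinje-layer types and the inhibitory clusters other than Golgi
l_granular_ctypes = [ctype for ctype in [*cb_clusters] if ctype not in l_purkinje_ctypes and ctype not in bergmann_glia and ctype not in inhibitory_exc_golgi]
# Add microglia back here as well
l_granular_ctypes.extend(microglia)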


In [22]:
import sys
sys.path.append('/gpfs/bbp.cscs.ch/data/project/proj84/csaba/aibs_10x_mouse_wholebrain/notebooks/scripts/')

from helper_functions import get_all_filenames, get_csv_filenames, extract_prefix_from_filenames

download_base = '/gpfs/bbp.cscs.ch/data/project/proj84/csaba/aibs_10x_mouse_wholebrain/'
root_folder = f"{download_base}results/density_calculations/"

# Get all regional density data
folder_path = f"{root_folder}csv/"
filenames = get_all_filenames(folder_path)
csv_filenames = get_csv_filenames(folder_path)
prefixes = extract_prefix_from_filenames(csv_filenames)
unique_prefixes = sorted(list(set(prefixes)))

In [23]:
# arb is now a leaf region (find() below returns only its own id)

from voxcell import RegionMap

#region_map_path ='/gpfs/bbp.cscs.ch/project/proj62/csaba/atlas/bbp_prod_files/1.json'
region_map_path = '/gpfs/bbp.cscs.ch/data/project/proj84/atlas_pipeline_runs/2024-05-15T22:44:26+02:00/hierarchy_ccfv3_l23split_barrelsplit.json'
region_map = RegionMap.load_json(region_map_path)

print(region_map.find('arbor vitae', attr='name', with_descendants=True))
print(region_map.find('Dentate nucleus', attr='name', with_descendants=True))
print(region_map.find('Copula pyramidis', attr='name', with_descendants=True))
print(region_map.find('Fastigial nucleus', attr='name', with_descendants=True))
print(region_map.find('Vestibulocerebellar nucleus', attr='name', with_descendants=True))

#find regions not in aibs cerebellum data:
print(region_map.find('Cerebellar nuclei', attr='name', with_descendants=True)) #contains 4 nuclei
print(region_map.get(589508455, "name", with_ascendants=False))
print(region_map.get(212, "name", with_ascendants=False))

{728}
{846}
{1033, 10684, 10685, 10686}
{989}
{589508455}
{519, 91, 989, 846, 589508455}
Vestibulocerebellar nucleus
Main olfactory bulb, glomerular layer


In [24]:
#for this list see density2.5_scaling
crb_keys = ['ANcr1', 'ANcr2', 'CBunassigned', 'CENT2', 'CENT3', 'COPY', 'CUL45', 'DEC', 'DN', 'FL', 'FN', 'FOTU', 'IP', 'LING', 'NOD', 'PFL', 'PRM', 'PYR', 'SIM', 'UVU', 'VeCB', 'arb']

print(len(crb_keys))

# Remove items such as 'arb' from the list: they are leaf regions and have no layers
items_to_remove = ['arb', 'IP', 'DN', 'FN', 'VeCB', 'CBunassigned']

# Remove the items from the list
for item in items_to_remove:
    if item in crb_keys:
        crb_keys.remove(item)

print(len(crb_keys))

# Filter the list of files with cerebellum cell densities
crb_filenames = [filename for filename in csv_filenames if any(filename.startswith(prefix + "_") for prefix in crb_keys)]
len(crb_filenames) == len(crb_keys)


22
16


True
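
The trailing underscore in `startswith(prefix + "_")` matters once the per-layer files exist, because their names start with the same region acronym. A small sketch with an invented file list (the `...granularlayer...` name follows the naming used later in this notebook):

```python
# Invented file list: one lobule file, one per-layer file, one nucleus file.
csv_filenames = [
    "CENT2_density_two_sides.csv",
    "CENT2granularlayer_density_two_sides.csv",
    "DEC_density_two_sides.csv",
    "DN_density_two_sides.csv",
]
crb_keys = ["CENT2", "DEC"]

# Without the underscore, the per-layer file is matched as well.
loose = [f for f in csv_filenames if any(f.startswith(k) for k in crb_keys)]

# With the underscore, only the whole-lobule files are matched.
strict = [f for f in csv_filenames if any(f.startswith(k + "_") for k in crb_keys)]

print(len(loose), len(strict))
```

With the strict filter, `len(crb_filenames) == len(crb_keys)` is a meaningful check that each key maps to exactly one file.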

In [25]:
#This cell proves that all files are physically there!
# Convert lists to sets for efficient operations
unique_prefixes_set = set(unique_prefixes)
crb_keys_set = set(crb_keys)

# Find intersection (elements in both lists)
common_elements = unique_prefixes_set.intersection(crb_keys_set)

# Find difference (elements in crb_keys but not in unique_prefixes)
different_elements = crb_keys_set.difference(unique_prefixes_set)
different_elements

set()

In [26]:
download_base = '/gpfs/bbp.cscs.ch/data/project/proj84/csaba/aibs_10x_mouse_wholebrain/'
root_folder = f"{download_base}results/density_calculations/"
csv_folder = os.path.join(root_folder, 'csv/')

for file in crb_filenames:
    print(file)
    region = file.split('_')[0]
    cb_lobule = pd.read_csv(csv_folder + file, usecols=['cluster', 'density_mm3'])

    # Keep only the clusters allowed in each layer
    cb_purkinje = cb_lobule[cb_lobule['cluster'].isin(l_purkinje_ctypes)]
    cb_molecular = cb_lobule[cb_lobule['cluster'].isin(l_molecular_ctypes)]
    cb_granular = cb_lobule[cb_lobule['cluster'].isin(l_granular_ctypes)]

    # The index name becomes the header of the first CSV column and is reused in the filename
    cb_purkinje.index.name = region + 'purkinjelayer'
    cb_molecular.index.name = region + 'molecularlayer'
    cb_granular.index.name = region + 'granularlayer'

    cb_purkinje.to_csv(csv_folder + cb_purkinje.index.name + '_density_two_sides.csv', index=True)
    cb_molecular.to_csv(csv_folder + cb_molecular.index.name + '_density_two_sides.csv', index=True)
    cb_granular.to_csv(csv_folder + cb_granular.index.name + '_density_two_sides.csv', index=True)

    # Create new filename with .bak extension
    base_name = os.path.splitext(file)[0]  # Gets the filename without the extension
    new_file = base_name + '.bak'

    # Rename the original file: it has been split into 3 csv files, and we don't want it counted twice
    os.rename(csv_folder + file, csv_folder + new_file)
    print(f"Renamed {file} to {new_file}")
# Lobules 4, 5 and 4-5 need no further splitting: 4-5 is just one lobule

ANcr1_density_two_sides.csv
Renamed ANcr1_density_two_sides.csv to ANcr1_density_two_sides.bak
ANcr2_density_two_sides.csv
Renamed ANcr2_density_two_sides.csv to ANcr2_density_two_sides.bak
CENT2_density_two_sides.csv
Renamed CENT2_density_two_sides.csv to CENT2_density_two_sides.bak
CENT3_density_two_sides.csv
Renamed CENT3_density_two_sides.csv to CENT3_density_two_sides.bak
COPY_density_two_sides.csv
Renamed COPY_density_two_sides.csv to COPY_density_two_sides.bak
CUL45_density_two_sides.csv
Renamed CUL45_density_two_sides.csv to CUL45_density_two_sides.bak
DEC_density_two_sides.csv
Renamed DEC_density_two_sides.csv to DEC_density_two_sides.bak
FL_density_two_sides.csv
Renamed FL_density_two_sides.csv to FL_density_two_sides.bak
FOTU_density_two_sides.csv
Renamed FOTU_density_two_sides.csv to FOTU_density_two_sides.bak
LING_density_two_sides.csv
Renamed LING_density_two_sides.csv to LING_density_two_sides.bak
NOD_density_two_sides.csv
Renamed NOD_density_two_sides.csv to NOD_density
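
Because the source files are renamed to `.bak`, re-running the split cell would fail on the missing CSVs. One possible way to make the step safe to repeat is sketched below with the standard library only. The function name `split_region_file`, the toy clusters and the densities are all hypothetical, and unlike the pandas version this sketch does not write an index column.

```python
import csv
import tempfile
from pathlib import Path

SUFFIX = "_density_two_sides.csv"

def split_region_file(csv_dir, region, layer_clusters):
    """Write one CSV per layer for `region`, then rename the source to .bak.

    Returns False and does nothing when the source CSV is absent, which is
    the case after a previous successful run.
    """
    source = Path(csv_dir) / f"{region}{SUFFIX}"
    if not source.exists():
        return False
    with source.open(newline="") as handle:
        rows = list(csv.DictReader(handle))
    for layer, clusters in layer_clusters.items():
        target = Path(csv_dir) / f"{region}{layer}{SUFFIX}"
        with target.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=["cluster", "density_mm3"])
            writer.writeheader()
            writer.writerows(r for r in rows if r["cluster"] in clusters)
    # Rename only after all layer files were written.
    source.rename(source.with_suffix(".bak"))
    return True

# Demo in a temporary directory with invented clusters and densities.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / f"NOD{SUFFIX}").write_text("cluster,density_mm3\nP1,10.0\nG1,20.0\n")
    layers = {"purkinjelayer": {"P1"}, "granularlayer": {"G1"}}
    first_run = split_region_file(tmp, "NOD", layers)
    second_run = split_region_file(tmp, "NOD", layers)
    created = sorted(p.name for p in Path(tmp).iterdir())

print(first_run, second_run, created)
```

The second call returns `False` instead of raising, so the cell can be executed again without touching the already split files.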

```

10684 Copula pyramidis, granular layer
10686 Copula pyramidis, molecular layer
10685 Copula pyramidis, Purkinje layer
10675 Crus 1, granular layer
10676 Crus 1, Purkinje layer
10677 Crus 1, molecular layer
10680 Crus 2, molecular layer
10679 Crus 2, Purkinje layer
10678 Crus 2, granular layer
10723 Declive (VI), granular layer
10724 Declive (VI), Purkinje layer
10725 Declive (VI), molecular layer
```

```
1143 Cerebellar cortex, granular layer
1144 Cerebellar cortex, molecular layer
1145 Cerebellar cortex, Purkinje layer

10684 Copula pyramidis, granular layer
10686 Copula pyramidis, molecular layer
10685 Copula pyramidis, Purkinje layer
10672 Simple lobule, granular layer
10673 Simple lobule, Purkinje layer
10674 Simple lobule, molecular layer
10675 Crus 1, granular layer
10676 Crus 1, Purkinje layer
10677 Crus 1, molecular layer
10680 Crus 2, molecular layer
10679 Crus 2, Purkinje layer
10678 Crus 2, granular layer
10681 Paramedian lobule, granular layer
10682 Paramedian lobule, Purkinje layer
10683 Paramedian lobule, molecular layer
10687 Paraflocculus, granular layer
10688 Paraflocculus, Purkinje layer
10689 Paraflocculus, molecular layer
10690 Flocculus, granular layer
10691 Flocculus, Purkinje layer
10692 Flocculus, molecular layer
10705 Lingula (I), granular layer
10706 Lingula (I), Purkinje layer
10707 Lingula (I), molecular layer
10708 Lobule II, granular layer
10709 Lobule II, Purkinje layer
10710 Lobule II, molecular layer
10711 Lobule III, granular layer
10712 Lobule III, Purkinje layer
10713 Lobule III, molecular layer
10714 Lobule IV, granular layer ***
10715 Lobule IV, Purkinje layer
10716 Lobule IV, molecular layer
10717 Lobule V, granular layer ***
10718 Lobule V, Purkinje layer
10719 Lobule V, molecular layer
10720 Lobules IV-V, granular layer ***
10721 Lobules IV-V, Purkinje layer
10722 Lobules IV-V, molecular layer
10723 Declive (VI), granular layer
10724 Declive (VI), Purkinje layer
10725 Declive (VI), molecular layer
10728 Folium-tuber vermis (VII), molecular layer
10726 Folium-tuber vermis (VII), granular layer
10727 Folium-tuber vermis (VII), Purkinje layer
10729 Pyramus (VIII), granular layer
10730 Pyramus (VIII), Purkinje layer
10731 Pyramus (VIII), molecular layer
10733 Uvula (IX), Purkinje layer
10732 Uvula (IX), granular layer
10734 Uvula (IX), molecular layer
10737 Nodulus (X), molecular layer
10736 Nodulus (X), Purkinje layer
10735 Nodulus (X), granular layer
```

## OLF

Rules for Olfactory areas:

212 'Main olfactory bulb, glomerular layer': created from 507 'Main olfactory bulb' (OLF - unassigned)

Important: the granular layer is the cell-dense one; it is the innermost part of the OLF (coronal cross section).
The glomerular layer is the outermost layer, where one can see empty glomeruli in the region where only axons persist.

In [27]:
# 212 is not part of the original parcellation database; the extended metadata loaded above has a row for it:
parcellation_annotation[parcellation_annotation['label_numbers'] == 212]

Unnamed: 0.1,Unnamed: 0,parcellation_label,parcellation_term_label,parcellation_term_set_label,parcellation_index,voxel_count,volume_mm3,color_hex_triplet,red,green,blue,parcellation_term_name,parcellation_term_acronym,parcellation_term_set_name,term_set_order,term_order,parent_term_label,label_numbers,cluster_as_filename
3440,3440,AllenCCF-Annotation-2020-212,ABC-Ontology-2023-MOB-substructure,AllenCCF-Ontology-2017-SUBS,497,743860.0,0.007439,#9AD2BD,154,210,189,"Main olfactory bulb, glomerular layer",MOBglomerularlayer,substructure,4,252,AllenCCF-Ontology-2017-507,212,MOBglomerularlayer
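
A quick consistency check on the metadata rows printed in this notebook: for the 507 and CENT2 rows, `volume_mm3` equals `voxel_count` times 1e-6, which would correspond to 10 µm isotropic voxels (an assumption inferred from those rows, not stated in the metadata). The sketch below applies that relation to three rows copied from the outputs; the 212 row appears not to follow it, which may be worth checking.

```python
VOXEL_VOLUME_MM3 = 0.01 ** 3  # assumption: 10 µm isotropic voxels

# (voxel_count, volume_mm3) copied from the tables printed in this notebook.
rows = {
    "MOB-unassigned (507)": (4298904.0, 4.298904),
    "CENT2 (976)": (1335660.0, 1.33566),
    "MOBglomerularlayer (212)": (743860.0, 0.007439),
}

def is_consistent(voxel_count, volume_mm3, rel_tol=1e-3):
    """True when volume_mm3 matches voxel_count * voxel volume within rel_tol."""
    expected = voxel_count * VOXEL_VOLUME_MM3
    return abs(expected - volume_mm3) <= rel_tol * expected

inconsistent = [name for name, (n, v) in rows.items() if not is_consistent(n, v)]
print(inconsistent)
```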


In [28]:
parcellation_annotation[parcellation_annotation['parcellation_term_acronym'] == 'MOB-unassigned']

Unnamed: 0.1,Unnamed: 0,parcellation_label,parcellation_term_label,parcellation_term_set_label,parcellation_index,voxel_count,volume_mm3,color_hex_triplet,red,green,blue,parcellation_term_name,parcellation_term_acronym,parcellation_term_set_name,term_set_order,term_order,parent_term_label,label_numbers,cluster_as_filename
3147,3147,AllenCCF-Annotation-2020-507,ABC-Ontology-2023-MOB-substructure,AllenCCF-Ontology-2017-SUBS,497,4298904.0,4.298904,#9AD2BD,154,210,189,"Main olfactory bulb, unassigned",MOB-unassigned,substructure,4,252,AllenCCF-Ontology-2017-507,507,MOBunassigned


In [29]:
from voxcell import RegionMap

#region_map_path ='/gpfs/bbp.cscs.ch/project/proj62/csaba/atlas/bbp_prod_files/1.json'
region_map_path = '/gpfs/bbp.cscs.ch/data/project/proj84/atlas_pipeline_runs/2024-05-15T22:44:26+02:00/hierarchy_ccfv3_l23split_barrelsplit.json'
region_map = RegionMap.load_json(region_map_path)

# Look up the region names of 507 (MOB-unassigned) and 212
print(region_map.get(507, "name", with_ascendants=False))
print(region_map.get(212, "name", with_ascendants=False))

Main olfactory bulb
Main olfactory bulb, glomerular layer


In [30]:
olf_issues = 'MOBunassigned'

In [31]:
download_base = '/gpfs/bbp.cscs.ch/data/project/proj84/csaba/aibs_10x_mouse_wholebrain/'
root_folder = f"{download_base}results/density_calculations/"

olf_1 = pd.read_csv(root_folder + 'csv/' + 'MOBunassigned_density_two_sides.csv', 
                   usecols=['cluster', 'density_mm3'], )

In [32]:
# Name the index after the new region (it becomes the header of the first CSV column)
olf_1.index.name = 'MOBglomerularlayer'

In [33]:
olf_1

MOBglomerularlayer,cluster,density_mm3
0,0494 OB Eomes Ms4a15 Glut_4,12923.701554
1,0571 OB-out Frmd7 Gaba_1,11579.865330
2,0572 OB-out Frmd7 Gaba_1,11193.869819
3,5291 OEC NN_1,10907.947219
4,0569 OB-out Frmd7 Gaba_1,10793.578178
...,...,...
194,5237 Astroependymal NN_1,14.296130
195,0554 OB-in Frmd7 Gaba_3,14.296130
196,1029 NDB-SI-ant Prdm12 Gaba_1,14.296130
197,0745 Pvalb Gaba_4,14.296130


In [34]:
# Save the DataFrame to a CSV file
olf_1.to_csv(root_folder + 'csv/' + 'MOBglomerularlayer_density_two_sides.csv', index=True)
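
Setting `index.name` before `to_csv(index=True)` is what puts the region name into the header of the first column. A minimal sketch with an in-memory buffer and invented values (the cluster names and densities are placeholders):

```python
import io

import pandas as pd

# Invented stand-in for the MOBunassigned density table.
olf = pd.DataFrame({
    "cluster": ["toy cluster A", "toy cluster B"],
    "density_mm3": [120.0, 35.5],
})
olf.index.name = "MOBglomerularlayer"

# Write to memory instead of disk to inspect the header line.
buffer = io.StringIO()
olf.to_csv(buffer, index=True)
header = buffer.getvalue().splitlines()[0]
print(header)

# Reading back with usecols ignores the extra index column.
buffer.seek(0)
roundtrip = pd.read_csv(buffer, usecols=["cluster", "density_mm3"])
print(list(roundtrip.columns))
```

Downstream readers that pass `usecols=['cluster', 'density_mm3']`, as the cells in this notebook do, are unaffected by the extra first column.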

In [35]:
import sys
sys.path.append('/gpfs/bbp.cscs.ch/data/project/proj84/csaba/aibs_10x_mouse_wholebrain/notebooks/scripts/')

from helper_functions import get_all_filenames, get_csv_filenames, extract_prefix_from_filenames

download_base = '/gpfs/bbp.cscs.ch/data/project/proj84/csaba/aibs_10x_mouse_wholebrain/'
root_folder = f"{download_base}results/density_calculations/"

# Get all regional density data
folder_path = f"{root_folder}csv/"
filenames = get_all_filenames(folder_path)
csv_filenames = get_csv_filenames(folder_path)
prefixes = extract_prefix_from_filenames(csv_filenames)
unique_prefixes = sorted(list(set(prefixes)))

In [36]:
len(unique_prefixes)

703

In [37]:
crb_keys = ['ANcr1', 'ANcr2', 'CBunassigned', 'CENT2', 'CENT3', 'COPY', 'CUL45', 'DEC', 'DN', 'FL', 'FN', 'FOTU', 'IP', 'LING', 'NOD', 'PFL', 'PRM', 'PYR', 'SIM', 'UVU', 'VeCB', 'arb']

_ = parcellation_annotation[parcellation_annotation['parcellation_term_acronym'].isin(crb_keys)]
_[_['parcellation_term_set_name']=='substructure']

Unnamed: 0.1,Unnamed: 0,parcellation_label,parcellation_term_label,parcellation_term_set_label,parcellation_index,voxel_count,volume_mm3,color_hex_triplet,red,green,blue,parcellation_term_name,parcellation_term_acronym,parcellation_term_set_name,term_set_order,term_order,parent_term_label,label_numbers,cluster_as_filename
2641,2641,AllenCCF-Annotation-2020-976,AllenCCF-Ontology-2017-976,AllenCCF-Ontology-2017-SUBS,966,1335660.0,1.33566,#FFFC91,255,252,145,Lobule II,CENT2,substructure,4,711,AllenCCF-Ontology-2017-920,976,CENT2
2646,2646,AllenCCF-Annotation-2020-984,AllenCCF-Ontology-2017-984,AllenCCF-Ontology-2017-SUBS,974,2743760.0,2.74376,#FFFC91,255,252,145,Lobule III,CENT3,substructure,4,712,AllenCCF-Ontology-2017-920,984,CENT3
2680,2680,AllenCCF-Annotation-2020-1056,AllenCCF-Ontology-2017-1056,AllenCCF-Ontology-2017-SUBS,1045,5693780.0,5.69378,#FFFC91,255,252,145,Crus 1,ANcr1,substructure,4,740,AllenCCF-Ontology-2017-1017,1056,ANcr1
2685,2685,AllenCCF-Annotation-2020-1064,AllenCCF-Ontology-2017-1064,AllenCCF-Ontology-2017-SUBS,1053,5111852.0,5.111852,#FFFC91,255,252,145,Crus 2,ANcr2,substructure,4,741,AllenCCF-Ontology-2017-1017,1064,ANcr2
3398,3398,AllenCCF-Annotation-2020-912,ABC-Ontology-2023-LING-substructure,AllenCCF-Ontology-2017-SUBS,902,121628.0,0.121628,#FFFC91,255,252,145,Lingula (I),LING,substructure,4,707,AllenCCF-Ontology-2017-912,912,LING
3399,3399,AllenCCF-Annotation-2020-936,ABC-Ontology-2023-DEC-substructure,AllenCCF-Ontology-2017-SUBS,926,3334748.0,3.334748,#FFFC91,255,252,145,Declive (VI),DEC,substructure,4,716,AllenCCF-Ontology-2017-936,936,DEC
3400,3400,AllenCCF-Annotation-2020-944,ABC-Ontology-2023-FOTU-substructure,AllenCCF-Ontology-2017-SUBS,934,1053398.0,1.053398,#FFFC91,255,252,145,Folium-tuber vermis (VII),FOTU,substructure,4,720,AllenCCF-Ontology-2017-944,944,FOTU
3401,3401,AllenCCF-Annotation-2020-951,ABC-Ontology-2023-PYR-substructure,AllenCCF-Ontology-2017-SUBS,941,1248018.0,1.248018,#FFFC91,255,252,145,Pyramus (VIII),PYR,substructure,4,724,AllenCCF-Ontology-2017-951,951,PYR
3402,3402,AllenCCF-Annotation-2020-957,ABC-Ontology-2023-UVU-substructure,AllenCCF-Ontology-2017-SUBS,947,2191534.0,2.191534,#FFFC91,255,252,145,Uvula (IX),UVU,substructure,4,728,AllenCCF-Ontology-2017-957,957,UVU
3403,3403,AllenCCF-Annotation-2020-968,ABC-Ontology-2023-NOD-substructure,AllenCCF-Ontology-2017-SUBS,958,1519098.0,1.519098,#FFFC91,255,252,145,Nodulus (X),NOD,substructure,4,732,AllenCCF-Ontology-2017-968,968,NOD
