# Allen Mouse Common Coordinate Framework (2020 version)

The Allen Mouse Brain Common Coordinate Framework (CCFv3, [Wang et al., 2020](https://doi.org/10.1016/j.cell.2020.04.007)) is a 3D reference space built on an average brain at 10 µm voxel resolution, created from serial two-photon tomography images of 1,675 young adult C57BL/6J mice. Using multimodal reference data, the entire brain was parcellated directly in 3D, labeling every voxel with a brain structure spanning 43 isocortical areas and their layers, 314 subcortical gray matter structures, 81 fiber tracts, and 8 ventricular structures. The 2020 version adds new annotations for layers of Ammon’s horn (CA) and the main olfactory bulb (MOB), and minor modifications of the surrounding fiber tracts.

CCFv3 is used in informatics pipelines and online applications to analyze, visualize and integrate multimodal and multiscale data sets in 3D, and is openly accessible for research use.

The purpose of this notebook is to provide an overview of the data assets and information associated with the Allen CCFv3 through example use cases.

To run this notebook you need to be connected to the internet, and you must have already downloaded the data via the getting started notebook.

In [1]:
import os
import pandas as pd
import numpy as np
import json
import matplotlib.pyplot as plt
import requests
import SimpleITK as sitk
import pathlib

The prerequisite for running this notebook is that the data have been downloaded to a local directory that maintains the organization from the manifest.json. **Change the download_base variable to where you have downloaded the data on your system.**

In [2]:
download_base = '../../abc_download_root'

url = 'https://allen-brain-cell-atlas.s3-us-west-2.amazonaws.com/releases/20230630/manifest.json'
manifest = json.loads(requests.get(url).text)

In [3]:
view_directory = os.path.join( download_base, 
                               manifest['directory_listing']['Allen-CCF-2020']['directories']['metadata']['relative_path'], 
                              'views')
view_directory = pathlib.Path( view_directory )
cache_views = False
if cache_views :
    os.makedirs( view_directory, exist_ok=True )
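
Every file used in this notebook is located the same way: look up its `relative_path` in the manifest and join it to `download_base`. The sketch below illustrates that pattern on a small hand-written dictionary. The dictionary mimics only the keys used in this notebook, its `relative_path` values are made up, and `resolve_path` is a hypothetical helper, not part of any released package.

```python
from pathlib import PurePosixPath

# Toy stand-in for the manifest: only the keys used in this notebook are
# mimicked, and the relative paths are invented for illustration.
toy_manifest = {
    "file_listing": {
        "Allen-CCF-2020": {
            "image_volumes": {
                "annotation_10": {
                    "files": {
                        "nii.gz": {
                            "relative_path": "image_volumes/Allen-CCF-2020/20230630/annotation_10.nii.gz"
                        }
                    }
                }
            }
        }
    }
}

def resolve_path(manifest, base, directory, kind, name, ext):
    """Hypothetical helper: build the local path of one file listed in the manifest."""
    entry = manifest["file_listing"][directory][kind][name]["files"][ext]
    return PurePosixPath(base) / entry["relative_path"]

local_file = resolve_path(toy_manifest, "../../abc_download_root",
                          "Allen-CCF-2020", "image_volumes", "annotation_10", "nii.gz")
print(local_file)
```

With the real manifest, the same nested lookup is what the cells below spell out by hand for each volume and metadata table.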

## Data Overview

### Reference template and parcellations

There are 3 volumetric data files associated with the Allen CCFv3. The "average_template_10" volume is the anatomical template of the CCF, constructed as the shape and intensity average of 1,675 specimen brains. The "annotation_10" volume is the parcellation of the brain with respect to a hierarchical partonomy of anatomical structures. The "annotation_boundary_10" volume is a mask that identifies all the voxels on the boundary of a parcellation, to support data visualization. The volumes are stored in compressed NIfTI (.nii.gz) format. This notebook uses the [SimpleITK](https://simpleitk.org/) library to open the volumes.

In [4]:
volumes = manifest['file_listing']['Allen-CCF-2020']['image_volumes']
volumes

In [5]:
print("reading average_template_10")
rpath = volumes['average_template_10']['files']['nii.gz']['relative_path']
file = os.path.join( download_base, rpath)
average_template_image = sitk.ReadImage( file )
average_template_array = sitk.GetArrayViewFromImage( average_template_image )

print("reading annotation_10")
rpath = volumes['annotation_10']['files']['nii.gz']['relative_path']
file = os.path.join( download_base, rpath)
annotation_image = sitk.ReadImage( file )
annotation_array = sitk.GetArrayViewFromImage( annotation_image )

print("reading annotation_boundary_10")
rpath = volumes['annotation_boundary_10']['files']['nii.gz']['relative_path']
file = os.path.join( download_base, rpath)
annotation_boundary_image = sitk.ReadImage( file )
annotation_boundary_array = sitk.GetArrayViewFromImage( annotation_boundary_image )

We define a helper function to print out some basic metadata about a volume.

In [6]:
# Function to print out image information
def image_info( img ) :
    print('size: ' + str(img.GetSize()) + ' voxels')
    print('spacing: ' + str(img.GetSpacing()) + ' mm' )
    print('direction: ' + str(img.GetDirection()) )
    print('origin: ' + str(img.GetOrigin()))

Each volume is of size 1320 x 800 x 1140 voxels, with a voxel dimension of 10 x 10 x 10 micrometers. The volume is in ASL orientation, such that the first (x) axis is anterior-to-posterior, the second (y) axis is superior-to-inferior (dorsal-to-ventral) and the third (z) axis is left-to-right.

In [7]:
image_info(average_template_image)
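
The size and spacing reported by `image_info` are enough to convert a voxel index into a position in millimeters. The sketch below is a minimal illustration that assumes an identity direction matrix and an origin at zero; the real values are whatever `image_info` prints. The function name `voxel_to_mm` is an assumption introduced here.

```python
def voxel_to_mm(index, spacing=(0.01, 0.01, 0.01), origin=(0.0, 0.0, 0.0)):
    """Convert an (x, y, z) voxel index to millimeters.

    Assumes an identity direction matrix, so each axis is scaled independently.
    """
    return tuple(o + i * s for i, s, o in zip(index, spacing, origin))

# Voxel at the center of the 1320 x 800 x 1140 grid
center_mm = voxel_to_mm((660, 400, 570))
print(center_mm)
```

On a loaded image, `average_template_image.TransformIndexToPhysicalPoint((660, 400, 570))` performs the same conversion using the stored origin and direction.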

In [8]:
spacing = average_template_image.GetSpacing()
voxel_volume = spacing[0] * spacing[1] * spacing[2]
print("voxel volume in mm^3:", "%0.2E" % voxel_volume)
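
Because SimpleITK reports the spacing in millimeters, a 10 micrometer voxel has a volume of 0.01 x 0.01 x 0.01 mm^3. The short sketch below works through that arithmetic with the grid size quoted above; `voxels_to_mm3` is a hypothetical helper name.

```python
size = (1320, 800, 1140)          # voxels along x, y, z
spacing_mm = (0.01, 0.01, 0.01)   # 10 micrometers expressed in mm

voxel_volume_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
total_voxels = size[0] * size[1] * size[2]

def voxels_to_mm3(voxel_count):
    """Convert a voxel count into a volume in cubic millimeters."""
    return voxel_count * voxel_volume_mm3

print("voxels in the full grid:", total_voxels)
print("volume of one million voxels in mm^3:", voxels_to_mm3(1_000_000))
```

One million voxels therefore correspond to roughly one cubic millimeter, which is a handy sanity check for the voxel counts and volumes reported in the metadata tables.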

We define a helper function to visualize the same coronal section of the average template, annotation and boundary volumes.

In [9]:
def plot_section( slice, cmap=plt.cm.Greys_r, fig_width = 6, fig_height = 6 ) :
    fig, ax = plt.subplots()
    fig.set_size_inches(fig_width, fig_height)
    if cmap is not None :
        plt.imshow(slice, cmap=cmap)
    else :
        plt.imshow(slice)
    plt.axis("off")
    return fig, ax

In [10]:
zindex = 720  # index along the last array axis (image x, anterior-to-posterior)
zslice = np.transpose(average_template_array[:,:,zindex])
fig, ax = plot_section(zslice)
res = ax.set_title('average_template')
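
NumPy arrays obtained from SimpleITK are indexed as `[z, y, x]`, the reverse of the `(x, y, z)` order reported by `GetSize()`. Fixing the last array axis therefore selects a position along the anterior-to-posterior axis, which yields a coronal section. The sketch below shows the same slicing on a small toy array; the shape is made up and only stands in for the real volume.

```python
import numpy as np

# Toy volume: image size (x, y, z) = (8, 4, 6), so the array returned by
# SimpleITK would have shape (z, y, x) = (6, 4, 8).
toy_array = np.arange(6 * 4 * 8).reshape(6, 4, 8)

x_index = 3                                        # position along anterior-to-posterior
coronal = np.transpose(toy_array[:, :, x_index])   # (z, y) -> (y, z)

# Rows now run dorsal-to-ventral (y) and columns run left-to-right (z)
print(coronal.shape)
```

The transpose only changes which axis is drawn vertically; the voxel values are untouched.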

In [11]:
zslice = np.transpose(annotation_array[:,:,zindex])
fig, ax = plot_section(zslice)
res = ax.set_title('annotation_10')

In [12]:
zslice = np.transpose(annotation_boundary_array[:,:,zindex])
fig, ax = plot_section(zslice,cmap=plt.cm.Greys)
res = ax.set_title('annotation_boundary_10')
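
To make the idea of a boundary mask concrete, the sketch below marks every pixel of a toy label image whose label differs from a 4-connected neighbor. This is only an illustration of the concept under that assumed definition; it is not necessarily how the released "annotation_boundary_10" volume was generated, and `boundary_mask` is a hypothetical helper.

```python
import numpy as np

def boundary_mask(labels):
    """Mark pixels whose label differs from an up/down/left/right neighbor."""
    mask = np.zeros(labels.shape, dtype=bool)
    # Differences between vertically adjacent pixels mark both pixels of the pair
    diff_rows = labels[1:, :] != labels[:-1, :]
    mask[1:, :] |= diff_rows
    mask[:-1, :] |= diff_rows
    # Differences between horizontally adjacent pixels mark both pixels of the pair
    diff_cols = labels[:, 1:] != labels[:, :-1]
    mask[:, 1:] |= diff_cols
    mask[:, :-1] |= diff_cols
    return mask

toy_labels = np.array([[1, 1, 2],
                       [1, 1, 2],
                       [3, 3, 3]])
toy_boundary = boundary_mask(toy_labels)
print(toy_boundary.astype(int))
```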

### Anatomical structures and parcellation annotation

In [13]:
metadata = manifest['file_listing']['Allen-CCF-2020']['metadata']

### Parcellations

The annotation volume represents a tiling of a set of parcellations. Each row of the parcellation dataframe has a label (a human-readable string that is unique in the database), a parcellation index representing the value in the annotation volume, and the voxel count and volume of that parcellation.

In [14]:
rpath = metadata['parcellation']['files']['csv']['relative_path']
file = os.path.join( download_base, rpath)
parcellation = pd.read_csv(file)
parcellation.set_index('parcellation_index',inplace=True)
print("number of parcellations:",len(parcellation))
parcellation
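
The voxel count and volume of a parcellation can, in principle, be recomputed from the annotation volume by counting how often each index occurs. The sketch below does this for a toy 2D array; the column names `voxel_count` and `volume_mm3` are assumed to match the table above, and the values are made up.

```python
import numpy as np
import pandas as pd

# Toy annotation slice: each value is a parcellation index
toy_annotation = np.array([[0, 1, 1],
                           [2, 2, 1],
                           [2, 2, 0]])

voxel_volume_mm3 = 0.01 ** 3   # 10 micrometer isotropic voxels

# Count occurrences of every index, then scale by the voxel volume
index, count = np.unique(toy_annotation, return_counts=True)
toy_parcellation = pd.DataFrame(
    {"voxel_count": count, "volume_mm3": count * voxel_volume_mm3},
    index=pd.Index(index, name="parcellation_index"),
)
print(toy_parcellation)
```

Running `np.unique` on the full `annotation_array` follows the same pattern but touches more than a billion voxels, so expect it to take noticeably longer.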

### Parcellation term sets

For the purpose of ABC atlas visualization and analysis, we have created a simplified 5-level anatomical hierarchy. Each of these levels is represented as a parcellation term set. Each term set consists of a set of ordered terms. Each term set has a label (a human-readable string that is unique in the database), a name, a description and an order among the term sets.

In [15]:
rpath = metadata['parcellation_term_set']['files']['csv']['relative_path']
file = os.path.join( download_base, rpath)
parcellation_term_set = pd.read_csv(file)
parcellation_term_set.set_index('label',inplace=True)
print("number of term sets:",len(parcellation_term_set))
parcellation_term_set

### Parcellation terms and term set membership

A parcellation term represents an anatomical structure at a single hierarchy level. Each term has a label (a human-readable string that is unique in the database), a name, an acronym, and a reference atlas color given both as a hex triplet and as RGB values.

In [16]:
rpath = metadata['parcellation_term']['files']['csv']['relative_path']
file = os.path.join( download_base, rpath)
parcellation_term = pd.read_csv(file)
parcellation_term.set_index('label',inplace=True)
print("number of terms:",len(parcellation_term))
parcellation_term.head(5)
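
The hex triplet and the RGB values encode the same color. As a minimal sketch of the relationship, the hypothetical helper below converts one into the other; it accepts the triplet with or without a leading `#`, since the exact formatting in the table is not assumed here.

```python
def hex_to_rgb(hex_triplet):
    """Convert a hex triplet such as 'FF8000' or '#FF8000' to an (r, g, b) tuple."""
    h = hex_triplet.lstrip("#")
    if len(h) != 6:
        raise ValueError("expected six hexadecimal digits, got %r" % hex_triplet)
    # Each pair of hex digits is one 0-255 channel
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

print(hex_to_rgb("#FF8000"))   # (255, 128, 0)
```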

A parcellation term is a member of at most one parcellation term set. This membership is represented as a row in the parcellation term set membership dataframe.

In [17]:
rpath = metadata['parcellation_term_set_membership']['files']['csv']['relative_path']
file = os.path.join( download_base, rpath)
parcellation_term_set_membership = pd.read_csv(file)
print("number of memberships:",len(parcellation_term_set_membership))
parcellation_term_set_membership.head(5)

### Parcellation to parcellation term membership

The association between a parcellation and a parcellation term is represented as a "parcellation to parcellation term membership" within the context of an anatomical structure level. It is expected that a parcellation is associated with only one term within a specific term set.

In [18]:
rpath = metadata['parcellation_to_parcellation_term_membership']['files']['csv']['relative_path']
file = os.path.join( download_base, rpath)
parcellation_annotation = pd.read_csv(file)
print("number of memberships:",len(parcellation_annotation))
parcellation_annotation
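
The expectation that a parcellation maps to only one term per term set can be checked with a groupby. The sketch below runs the check on a toy table that reuses the column names from the dataframe above; the labels and indices are made up.

```python
import pandas as pd

toy_membership = pd.DataFrame({
    "parcellation_index":         [1, 1, 2, 2],
    "parcellation_term_set_name": ["division", "structure", "division", "structure"],
    "parcellation_term_label":    ["T-10", "T-20", "T-10", "T-21"],
})

# Number of distinct terms per (parcellation, term set) pair; each should be 1
terms_per_pair = toy_membership.groupby(
    ["parcellation_index", "parcellation_term_set_name"]
)["parcellation_term_label"].nunique()

is_unique = bool((terms_per_pair == 1).all())
print("one term per parcellation and term set:", is_unique)
```

Replacing `toy_membership` with `parcellation_annotation` applies the same check to the downloaded table.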

## Example use cases

### Aggregate parcellations and voxels per term

We can obtain the parcellation count and voxel count for each parcellation term using the pandas groupby function.

In [19]:
# Count the number of parcellations associated with each parcellation term
term_parcellation_count = parcellation_annotation.groupby(['parcellation_term_label'])[['parcellation_index']].count()
term_parcellation_count.columns = ['number_of_parcellations']
term_parcellation_count.sort_values('number_of_parcellations',inplace=True, ascending=False)
term_parcellation_count

In [20]:
# Count the number of voxels associated with each parcellation term
term_voxel_count = parcellation_annotation.groupby(['parcellation_term_label'])[['voxel_count','volume_mm3']].sum()
term_voxel_count.sort_values('voxel_count',inplace=True, ascending=False)
term_voxel_count

In [21]:
# Join counts with the term dataframe
term_with_counts = parcellation_term.join( term_parcellation_count['number_of_parcellations'], how='inner' )
term_with_counts = term_with_counts.join( term_voxel_count[['voxel_count','volume_mm3']] )
term_with_counts[['name','number_of_parcellations','voxel_count','volume_mm3']]
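
The three cells above follow a count, sum, then join pattern. The sketch below replays that pattern on a toy table so the effect of each step is easy to inspect; the labels, names and counts are invented, and only the column names are taken from the dataframes above.

```python
import pandas as pd

toy_annotation = pd.DataFrame({
    "parcellation_index":      [1, 2, 3],
    "parcellation_term_label": ["T-A", "T-A", "T-B"],
    "voxel_count":             [100, 50, 25],
})
toy_term = pd.DataFrame({"name": ["region A", "region B", "unused"]},
                        index=pd.Index(["T-A", "T-B", "T-C"], name="label"))

grouped = toy_annotation.groupby("parcellation_term_label")
counts = grouped["parcellation_index"].count().rename("number_of_parcellations")
voxels = grouped["voxel_count"].sum()

# The inner join drops terms without any parcellation ("T-C" here)
toy_with_counts = toy_term.join(counts, how="inner").join(voxels)
print(toy_with_counts)
```

The inner join is what keeps the result restricted to terms that actually appear in the annotation.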

For convenience, we can cache this view for later reuse.

In [22]:
if cache_views :
    file = os.path.join( view_directory, 'parcellation_term_with_counts.csv')
    term_with_counts.to_csv( file )

### Visualizing parcellation annotation at each hierarchy level

We can explore the relationship and distribution of parcellations between term sets by creating a pivot table using the pandas groupby function. Each row of the resulting dataframe represents a parcellation, each column represents a term set, and the value in the table is the acronym (or, in the second table, the name) of the term that has been associated with the parcellation for that specific term set.

In [23]:
pivot = parcellation_annotation.groupby(['parcellation_index','parcellation_term_set_name'])['parcellation_term_acronym'].first().unstack()
pivot = pivot[parcellation_term_set['name']] # order columns
pivot
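
The `groupby(...).first().unstack()` idiom turns a long table with one row per (parcellation, term set) pair into a wide table with one column per term set. The sketch below shows the reshaping on a toy table; the acronyms are placeholders rather than values read from the data.

```python
import pandas as pd

toy_annotation = pd.DataFrame({
    "parcellation_index":         [1, 1, 2, 2],
    "parcellation_term_set_name": ["organ", "division", "organ", "division"],
    "parcellation_term_acronym":  ["brain", "HPF", "brain", "OLF"],
})

toy_pivot = (toy_annotation
             .groupby(["parcellation_index", "parcellation_term_set_name"])
             ["parcellation_term_acronym"].first().unstack())

# unstack sorts the columns alphabetically, so reorder them from coarse to fine
toy_pivot = toy_pivot[["organ", "division"]]
print(toy_pivot)
```

This is why the cells here reindex the columns with `parcellation_term_set['name']`: it restores the hierarchy order that the alphabetical sort loses.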

In [24]:
name = parcellation_annotation.groupby(['parcellation_index','parcellation_term_set_name'])['parcellation_term_name'].first().unstack()
name = name[parcellation_term_set['name']] # order columns
name

We can also obtain pivot tables of the parcellation term colors in the same way.

In [25]:
color = parcellation_annotation.groupby(['parcellation_index','parcellation_term_set_name'])['color_hex_triplet'].first().unstack()
color = color[parcellation_term_set['name']] # order columns
color.columns = ['%s_color' % x for x in color.columns]
color

In [26]:
channels = {}
for c in ['red','blue','green'] :
    df = parcellation_annotation.groupby(['parcellation_index','parcellation_term_set_name'])[c].first().unstack()
    df = df[parcellation_term_set['name']] # order columns
    df.columns = ['%s_color' % x for x in df.columns]  # same names as the color table
    channels[c] = df

In [27]:
channels['red']

For convenience, we can cache this view for later reuse.

In [28]:
if cache_views :
    
    file = os.path.join( view_directory, 'parcellation_to_parcellation_term_membership_acronym.csv')
    pivot.to_csv( file )
    
    file = os.path.join( view_directory, 'parcellation_to_parcellation_term_membership_name.csv')
    name.to_csv( file )
    
    file = os.path.join( view_directory, 'parcellation_to_parcellation_term_membership_color.csv')
    color.to_csv( file )
    
    for c in channels :
    
        file = os.path.join( view_directory, 'parcellation_to_parcellation_term_membership_%s.csv' % c )
        channels[c].to_csv( file )
    

We define a helper function to colorize each parcellation by its reference atlas color at a given anatomical level.

In [29]:
def colorize( zslice, term_set ) :

    # create a 3d array to store rgb image
    sshape = zslice.shape
    colorized = np.zeros((sshape[0],sshape[1],3),dtype=np.uint8)
    
    for i,c in enumerate(['red','green','blue']) :
        temp = np.zeros((sshape[0],sshape[1]),dtype=np.uint8)
        temp.flat[:] = channels[c].loc[zslice.flat[:],'%s_color'%term_set]
        colorized[:,:,i] = temp
    
    return colorized
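
An alternative to the per-channel `.loc` lookup is a dense lookup table indexed directly by parcellation index. The sketch below illustrates the technique on a toy slice with made-up colors. It assumes that the parcellation indices are small non-negative integers; if they are not, the table becomes impractically large and the `.loc` lookup above is the safer choice.

```python
import numpy as np

# Toy slice of parcellation indices and a made-up color per index
toy_slice = np.array([[0, 5],
                      [5, 9]])
toy_colors = {0: (0, 0, 0), 5: (255, 0, 0), 9: (0, 128, 255)}

# Lookup table with one RGB row per possible index value
lut = np.zeros((max(toy_colors) + 1, 3), dtype=np.uint8)
for index, rgb in toy_colors.items():
    lut[index] = rgb

# Integer-array indexing maps every pixel to its color in a single step
toy_colorized = lut[toy_slice]
print(toy_colorized.shape)
```

The result has the same rows-by-columns-by-3 layout as the output of `colorize`, so it can be passed to `plot_section` in the same way.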

In [30]:
zindex = 720
zslice = np.transpose(annotation_array[:,:,zindex])
fig, ax = plot_section(zslice)
res = ax.set_title('annotation_10')

In [31]:
term_set = 'organ'
colorized = colorize(zslice,term_set)
fig, ax = plot_section(colorized)
res = ax.set_title(term_set)

In [32]:
term_set = 'category'
colorized = colorize(zslice,term_set)
fig, ax = plot_section(colorized)
res = ax.set_title(term_set)

In [33]:
term_set = 'division'
colorized = colorize(zslice,term_set)
fig, ax = plot_section(colorized)
res = ax.set_title(term_set)

In [34]:
term_set = 'structure'
colorized = colorize(zslice,term_set)
fig, ax = plot_section(colorized)
res = ax.set_title(term_set)

In [35]:
term_set = 'substructure'
colorized = colorize(zslice,term_set)
fig, ax = plot_section(colorized)
res = ax.set_title(term_set)