# The Common Coordinate Framework and AllenSDK

The Common Coordinate Framework (CCF) can readily be accessed through the AllenSDK and used to analyze arrays of registered data. 

## Installation
To get started, you can follow [these](https://allensdk.readthedocs.io/en/latest/install.html) instructions to install the AllenSDK:

First, install Python 3.7. Then install the AllenSDK using pip:

    pip install allensdk

You can now run the AllenSDK in a Jupyter notebook or another environment of your choice. Documentation can be found [here](https://allensdk.readthedocs.io/en/latest/).

## The Mouse Connectivity Cache

The average template volume, annotation volume, and ontology can all be accessed through the Mouse Connectivity Cache. This also provides access to published connectivity data registered to the CCF. First, import the MouseConnectivityCache. Along with this, we also import numpy and pandas for data processing, as well as pyplot for visualization.

In [1]:
import numpy as np
import pandas as pd
from allensdk.core.mouse_connectivity_cache import MouseConnectivityCache
from matplotlib import pyplot as plt

In [2]:
%matplotlib notebook

### Accessing the CCF from the MouseConnectivityCache

Instantiate the MouseConnectivityCache (mcc). The default resolution is 25 microns, but 10, 50, and 100 microns are also valid options; here we use 10 microns. You can also specify the location of a manifest file (the manifest_file argument), which keeps track of all downloaded assets, from template and annotation volumes to experimental data. If a manifest file is not provided, one will be created in the current working directory.

In [3]:
mcc = MouseConnectivityCache(resolution=10)

#### Annotated Volume and Reference Space
Now you can get the Reference Space (rsp) at that resolution. This automatically downloads the annotated volume at the specified resolution (10 microns here).

In [4]:
rsp = mcc.get_reference_space()

2025-03-10 06:32:33,529 allensdk.api.api.retrieve_file_over_http INFO     Downloading URL: http://download.alleninstitute.org/informatics-archive/current-release/mouse_ccf/annotation/ccf_2017/annotation_10.nrrd


The annotated volume is a 3D numpy ndarray whose axes correspond to AP, DV, and ML, respectively:

In [5]:
print('Shape of annotated volume:',rsp.annotation.shape)
print('Annotated volume data type:',rsp.annotation.dtype)

Shape of annotated volume: (1320, 800, 1140)
Annotated volume data type: uint32
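
As a quick sanity check, the shape printed above can be converted into the physical extent of the volume. The following is a minimal sketch that uses only the shape and the 10 micron voxel size from this notebook; the helper name `extent_in_microns` is hypothetical and not part of the AllenSDK.

```python
# Shape reported above for the 10 micron annotation volume (AP, DV, ML).
annotation_shape = (1320, 800, 1140)
voxel_size_um = 10  # must match the resolution passed to MouseConnectivityCache


def extent_in_microns(shape, voxel_size):
    """Return the physical size of each axis in microns (hypothetical helper)."""
    return tuple(n * voxel_size for n in shape)


extent = extent_in_microns(annotation_shape, voxel_size_um)
print(dict(zip(('AP', 'DV', 'ML'), extent)))
```

The same physical extent divided by a coarser voxel size gives the shape to expect at that resolution, which is a handy check when switching between resolutions.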


The rsp provides access to the structure tree from the ontology. Given that there are multiple atlases and associated ontologies, we are only interested in the structures belonging to this annotation. Thus, we remove unassigned structures from the structure tree. This returns a structure graph which is more easily viewed as a pandas DataFrame:

In [6]:
sg = rsp.remove_unassigned()

In [7]:
sg = pd.DataFrame(sg)

In [8]:
sg.head()

| | acronym | graph_id | graph_order | id | name | structure_id_path | structure_set_ids | rgb_triplet |
|---|---|---|---|---|---|---|---|---|
| 0 | root | 1 | 0 | 997 | root | [997] | [691663206] | [255, 255, 255] |
| 1 | grey | 1 | 1 | 8 | Basic cell groups and regions | [997, 8] | [112905828, 691663206, 12, 184527634, 11290581... | [191, 218, 227] |
| 2 | CH | 1 | 2 | 567 | Cerebrum | [997, 8, 567] | [112905828, 691663206, 12, 184527634, 11290581... | [176, 240, 255] |
| 3 | CTX | 1 | 3 | 688 | Cerebral cortex | [997, 8, 567, 688] | [112905828, 691663206, 12, 184527634, 11290581... | [176, 255, 184] |
| 4 | CTXpl | 1 | 4 | 695 | Cortical plate | [997, 8, 567, 688, 695] | [112905828, 691663206, 12, 184527634, 11290581... | [112, 255, 112] |


In [9]:
sg.loc[sg['name'] == 'Cerebellum']

| | acronym | graph_id | graph_order | id | name | structure_id_path | structure_set_ids | rgb_triplet |
|---|---|---|---|---|---|---|---|---|
| 696 | CB | 1 | 1014 | 512 | Cerebellum | [997, 8, 512] | [2, 112905828, 691663206, 12, 184527634, 11290... | [240, 240, 128] |


The structures' acronyms, names and IDs can all be linked through the structure graph.

The rsp can also be used to generate section images in the coronal (0), horizontal (1), or sagittal (2) planes:

In [None]:
orientation = {'coronal':0,'horizontal':1,'sagittal':2} # makes orientation arguments more readable
pos_microns = lambda x:x*10 # get_slice_image takes positions in microns, so convert from 10 micron voxel indices

img = rsp.get_slice_image(orientation['sagittal'],pos_microns(500)) # sagittal plane 5000 microns along the ML axis
plt.imshow(img)

The rsp can also generate masks of given structure IDs. The IDs are provided as a list, and by default include all voxels annotated as children of the given structures (direct_only=False). If only the parent level voxels are desired, set direct_only to True.

Here we make a mask of the Cerebral Cortex, which from the structure graph is shown to have the ID 688.

In [None]:
ctx_mask = rsp.make_structure_mask([688], direct_only=False) # 688 is the Cerebral cortex (CTX); 512 would be the Cerebellum

In [None]:
print('Shape of mask:',ctx_mask.shape)
print('Mask data type:',ctx_mask.dtype)
print('Value of masked voxels:',ctx_mask.max())

In [None]:
midpoint = ctx_mask.shape[2] // 2
plt.imshow(ctx_mask[:,:,midpoint])

The mask is the same shape as the specified reference space, and voxels in that space that belong to the Cerebral Cortex are labeled 1.
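
To make the effect of direct_only concrete, here is a toy sketch that builds masks by hand with numpy. It is not how the AllenSDK implements make_structure_mask; the helper `toy_structure_mask` and the voxel layout are made up for illustration, and only the IDs (688 for CTX, its child 695 for CTXpl, and 512 for CB) come from the structure graph above.

```python
import numpy as np

# Toy 2 x 2 x 3 "annotation" volume; each voxel holds a structure ID.
# The layout is invented; 0 stands for voxels outside the brain.
annotation = np.array([[[688, 695, 512],
                        [695, 695,   0]],
                       [[512, 688,   0],
                        [  0, 512, 695]]], dtype=np.uint32)


def toy_structure_mask(annotation, structure_ids):
    """Label voxels whose ID is in structure_ids with 1, all others with 0."""
    return np.isin(annotation, structure_ids).astype(np.uint8)


# Only voxels annotated directly as 688 (analogous to direct_only=True)
direct_mask = toy_structure_mask(annotation, [688])
# Voxels annotated as 688 or its descendant 695 (analogous to direct_only=False)
full_mask = toy_structure_mask(annotation, [688, 695])

print(direct_mask.sum(), full_mask.sum())
```

In the toy volume most cortical voxels carry the child ID, so the parent-only mask is much smaller than the mask that includes descendants.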

#### Average Template Volume
The mcc also provides access to the average template volume, upon which the CCF is constructed. Downloading the template volume puts it in the directory specified by the manifest file when the mcc was instantiated.

In [None]:
avg_temp, meta = mcc.get_template_volume()

In [None]:
print('Shape of average template volume:',avg_temp.shape)
print('Template volume data type:',avg_temp.dtype)
print('Max intensity of template volume:',avg_temp.max())

In [None]:
midpoint = avg_temp.shape[0] // 2 # middle coronal plane of the template volume
print(midpoint)
plt.imshow(avg_temp[midpoint,:,:])

### Analysis Using CCF

After accessing the different components of the CCF, we can now use them for analysis. For example, we can sample voxels from any data array in that space using the masks from rsp. To do this, we can simply get the coordinates of the masked voxels and sample our data array with them.

In [None]:
ctx_mask.nonzero()

In [None]:
avg_temp[ctx_mask.nonzero()]

This resulting array contains the intensity of each voxel in the Cerebral Cortex of the average template volume.
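
The same sampling pattern can be tried on small synthetic arrays. This sketch assumes nothing about the AllenSDK: `data` and `mask` are random and hand-made stand-ins for a registered volume and a structure mask. It shows that indexing with the coordinates from nonzero() and indexing with a boolean mask select the same voxels in the same order.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((4, 5, 6))              # stand-in for a registered volume
mask = np.zeros(data.shape, dtype=np.uint8)
mask[1:3, 2:4, :] = 1                     # stand-in for a structure mask

coords = mask.nonzero()                   # tuple of three index arrays
by_coords = data[coords]                  # sample via voxel coordinates
by_bool = data[mask.astype(bool)]         # sample via a boolean mask

print(len(coords), by_coords.shape)
print(np.array_equal(by_coords, by_bool))
print(by_coords.mean())                   # summary statistic over the masked voxels
```

Either form returns a flat 1D array, so summary statistics such as the mean or sum can be computed directly on the result.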

### Accessing Experiments

The mcc also allows us to access published experiments. We can view these as a pandas DataFrame and filter for the data of interest. For example, we can look at all experiments listed as having primary injections in VISp.

In [None]:
exp = mcc.get_experiments(dataframe=True)

In [None]:
exp[exp.structure_abbrev == 'VISp']

If we decide to look at the projection densities of a given experiment, we can simply use the mcc to download that data, which will automatically be downloaded at our specified resolution.

In [None]:
projden, meta = mcc.get_projection_density(503069254)

In [None]:
print('Shape of projection density:',projden.shape)
print('Projection density data type:',projden.dtype)
print('Max value of projection density:',projden.max())

According to the experiments data frame, this particular experiment has injection coordinates (8690,1440,3090) in microns. Our projden array is in 10 micron space (the resolution we passed to the mcc), so we need to convert our coordinates to voxel indices.

In [None]:
8690/10

In [None]:
plt.imshow(projden[869,:,:]) # indices must be integers; 8690/10 is exactly 869
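
The conversion from micron coordinates to voxel indices can be wrapped in a small helper. This is a minimal sketch; `microns_to_voxel` is a hypothetical name rather than an AllenSDK function, and it rounds to the nearest voxel. It is shown for both the 10 and 25 micron voxel sizes.

```python
def microns_to_voxel(coords_um, voxel_size_um):
    """Convert (AP, DV, ML) micron coordinates to the nearest voxel indices."""
    return tuple(int(round(c / voxel_size_um)) for c in coords_um)


injection_um = (8690, 1440, 3090)  # injection coordinates quoted above

print(microns_to_voxel(injection_um, 10))
print(microns_to_voxel(injection_um, 25))
```

Passing the voxel size explicitly keeps the indexing correct if the cache is later instantiated at a different resolution.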

### Saving Data

The projection volume can also be written locally to an NRRD file, which can then be viewed in ITK-SNAP. The nrrd module is provided by the pynrrd package.

In [None]:
import nrrd

In [None]:
nrrd.write('./projection_density_10_503069254.nrrd',projden) # file name records the 10 micron resolution

### Using Masks

Using the masks we create from rsp, we can parcel out the projection densities by annotated structure. For example, we can look at the sum projection density by structure using a loop. We'll have to account for left and right hemispheres, as the masks automatically include both.

In [None]:
print('Number of structures in structure graph:',len(sg.id))

There are 839 annotated structures, so this approach can be a little slow. If there are specific structures of interest, it may make more sense to simply sample those. For example, we can look at the projections in the Thalamus and all its substructures.

To find the Thalamus's ID, we can use the structure graph.

In [None]:
sg[sg.name == 'Thalamus']

We can also use the structure tree in rsp to find the IDs of all of its descendants (substructures).

In [None]:
print(rsp.structure_tree.descendant_ids([549])[0])

IDs are hard to make sense of, so we can map these back to their acronyms.

In [None]:
id_acronym_map = rsp.structure_tree.get_id_acronym_map() # Dictionary returns IDs given acronyms
acronym_id_map = {v:k for k,v in id_acronym_map.items()} # Flip key:value pairs to get dictionary for acronyms given IDs

In [None]:
print(list(map(acronym_id_map.get,rsp.structure_tree.descendant_ids([549])[0])))
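
To see where descendants come from, note that each structure's structure_id_path lists its ancestors from the root down. The sketch below is a toy re-implementation on the handful of rows shown earlier in the structure graph, not the AllenSDK's own code; `toy_descendant_ids` is a hypothetical helper that counts a structure as its own descendant.

```python
# Rows copied from the structure graph shown earlier (id, acronym, structure_id_path).
structures = [
    {'id': 997, 'acronym': 'root',  'structure_id_path': [997]},
    {'id': 8,   'acronym': 'grey',  'structure_id_path': [997, 8]},
    {'id': 567, 'acronym': 'CH',    'structure_id_path': [997, 8, 567]},
    {'id': 688, 'acronym': 'CTX',   'structure_id_path': [997, 8, 567, 688]},
    {'id': 695, 'acronym': 'CTXpl', 'structure_id_path': [997, 8, 567, 688, 695]},
    {'id': 512, 'acronym': 'CB',    'structure_id_path': [997, 8, 512]},
]


def toy_descendant_ids(structures, ancestor_id):
    """IDs of every structure whose path passes through ancestor_id (itself included)."""
    return [s['id'] for s in structures if ancestor_id in s['structure_id_path']]


id_to_acronym = {s['id']: s['acronym'] for s in structures}
cerebrum = toy_descendant_ids(structures, 567)
print([id_to_acronym[i] for i in cerebrum])
```

The Cerebellum (512) is excluded because its path branches off at 8, before reaching the Cerebrum (567).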

We can now sample our data by thalamic structure and hemisphere. We'll use a hemisphere id of 1 for the left hemisphere and 2 for the right. For the sum of signal in both hemispheres, we'll use a hemisphere id of 3.

In [None]:
sum_projden = []
midline = projden.shape[2] // 2 # voxel index of the midline along the ML axis, valid at any resolution
for ID in rsp.structure_tree.descendant_ids([549])[0]:
    mask = rsp.make_structure_mask([ID])
    
    # left hemisphere
    left = projden[:,:,:midline][mask[:,:,:midline].nonzero()].sum()
    sum_projden.append({'id':ID, 'acronym':acronym_id_map[ID], 'hemisphere':1, 'sum_projection_density':left})
    
    # right hemisphere
    right = projden[:,:,midline:][mask[:,:,midline:].nonzero()].sum()
    sum_projden.append({'id':ID, 'acronym':acronym_id_map[ID], 'hemisphere':2, 'sum_projection_density':right})
    
    # both hemispheres
    sum_projden.append({'id':ID, 'acronym':acronym_id_map[ID], 'hemisphere':3, 'sum_projection_density':left+right})

In [None]:
df = pd.DataFrame(sum_projden)

In [None]:
df.head(10)
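
When many structures are needed, building one mask per structure becomes the bottleneck. One possible alternative, sketched here on toy arrays, is to sum over all annotation labels in a single pass with np.unique and np.bincount. This is an assumption-laden sketch rather than the notebook's method: `sum_by_label` is a hypothetical helper, the voxel layout is invented, and the sums cover only voxels annotated directly with each ID (descendants are not rolled up into their parents).

```python
import numpy as np

# Toy 2 x 2 x 4 volumes; the last axis (ML) is split down the middle.
annotation = np.array([[[688, 688, 512, 512],
                        [688, 512, 512,   0]],
                       [[512, 688, 688, 512],
                        [  0, 688, 512, 512]]])
density = np.full(annotation.shape, 0.5)
density[..., 2:] = 0.25                     # weaker signal in the "right" half


def sum_by_label(annotation, values):
    """Sum values over the voxels sharing each nonzero annotation label."""
    # Relabel IDs to 0..n-1 first, since real structure IDs can be very large.
    labels, inverse = np.unique(annotation, return_inverse=True)
    sums = np.bincount(inverse.ravel(), weights=values.ravel())
    return {int(l): float(s) for l, s in zip(labels, sums) if l != 0}


midline = annotation.shape[2] // 2
left = sum_by_label(annotation[..., :midline], density[..., :midline])
right = sum_by_label(annotation[..., midline:], density[..., midline:])
print(left)
print(right)
```

To recover totals for a parent structure such as the Thalamus, the per-label sums would still need to be added up over its descendant IDs.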

### Accessing Experiment Structure Unionizes

For connectivity data published through the AllenSDK, this per-structure summary (a "structure unionize") has already been computed and can be accessed using the mcc.

In [None]:
struct_union = mcc.get_experiment_structure_unionizes(503069254)

In [None]:
struct_union

## Structure Sets

While there are 839 unique annotations in CCFv3, these annotations represent structures at different levels of the ontology. Structure sets provide a way to access structures at a given level of the ontology. For example, 12 high-level structures have been grouped together in a "coarse" structure set and 316 mid-level structures in a "summary" structure set. These sets of structures can be accessed through their structure_set_ids. To find the available structure sets, the OntologiesApi is needed.

In [None]:
from allensdk.api.queries.ontologies_api import OntologiesApi

In [None]:
oapi = OntologiesApi()

In [None]:
structure_sets = pd.DataFrame(oapi.get_structure_sets())

### Coarse Structure Set

In [None]:
structure_sets[structure_sets.name.map(lambda x:'coarse' in x.lower())]

In [None]:
coarse_structures = pd.DataFrame(rsp.structure_tree.get_structures_by_set_id([2]))

In [None]:
coarse_structures
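
Conceptually, selecting a structure set is a membership test on each structure's structure_set_ids list. The sketch below illustrates that idea on made-up records; the acronyms, the set IDs 100 and 200, and the helper `toy_structures_by_set_id` are all hypothetical, and the matching rule (any overlap with the requested set IDs) is an assumption about the behavior rather than a statement of the AllenSDK implementation.

```python
# Hypothetical structure records; set ID 100 plays the role of a "coarse" set.
structures = [
    {'acronym': 'regionA',  'structure_set_ids': [100, 200]},
    {'acronym': 'regionA1', 'structure_set_ids': [200]},
    {'acronym': 'regionB',  'structure_set_ids': [100]},
    {'acronym': 'regionC',  'structure_set_ids': []},
]


def toy_structures_by_set_id(structures, set_ids):
    """Return structures that belong to at least one of the given sets."""
    wanted = set(set_ids)
    return [s for s in structures if wanted & set(s['structure_set_ids'])]


coarse = toy_structures_by_set_id(structures, [100])
print([s['acronym'] for s in coarse])
```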

### Summary Structure Set

In [None]:
structure_sets[structure_sets.name.map(lambda x:'summary' in x.lower())]

In [None]:
summary_structures = pd.DataFrame(rsp.structure_tree.get_structures_by_set_id([167587189]))

In [None]:
summary_structures

## Sample Analysis: Projection Density

As an example, we can now view the projection density across coarse-level structures for all connectivity experiments with primary injection in MOp.

We have our coarse_structure_ids:

In [None]:
coarse_structure_ids = coarse_structures.id.values

In [None]:
coarse_structure_ids

From our experiments data frame, we can get experiment ids corresponding to MOp experiments.

In [None]:
MOp_experiments = exp[exp.structure_abbrev == 'MOp'].id.values

Using the mcc, we can get structure unionizes for those experiments in the coarse structures.

In [None]:
MOp_exp_unionizes = mcc.get_structure_unionizes(MOp_experiments,structure_ids=coarse_structure_ids,hemisphere_ids=[1,2])

For each experiment, we can identify the injection hemisphere.

In [None]:
MOp_exp_unionizes[MOp_exp_unionizes.is_injection][['experiment_id','hemisphere_id']]

If we are interested in ipsilateral projection densities, we can sample the unionizes results to include only non-injection results in the right hemisphere (hemisphere_id 2); this assumes the injections listed above were made in the right hemisphere. Then we pivot the table to find the projection density in each coarse-level structure for each experiment.

In [None]:
MOp_union_subsample = MOp_exp_unionizes[(MOp_exp_unionizes.hemisphere_id == 2)&(~MOp_exp_unionizes.is_injection)]
projection_density_table = MOp_union_subsample.pivot(index='experiment_id',columns='structure_id',values='projection_density')

In [None]:
projection_density_table
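
The pivot step can be checked on a tiny table first. This sketch uses invented experiment IDs and density values; only the column names match the unionize columns used above.

```python
import pandas as pd

# Long-format records: one row per (experiment, structure) pair. Values are made up.
unionizes = pd.DataFrame({
    'experiment_id': [1, 1, 2, 2],
    'structure_id': [688, 512, 688, 512],
    'projection_density': [0.10, 0.02, 0.30, 0.05],
})

# Wide format: one row per experiment, one column per structure.
table = unionizes.pivot(index='experiment_id',
                        columns='structure_id',
                        values='projection_density')
print(table)
```

Note that pivot raises a ValueError if any (experiment_id, structure_id) pair occurs more than once, which is why the hemisphere and injection filters are applied before pivoting.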

To view this as a heatmap, we get the axes of the table and convert the structure_ids to acronyms for legibility. We also need the matrix of the values in the table.

In [None]:
structures = projection_density_table.columns.map(acronym_id_map)

In [None]:
experiments = projection_density_table.index

In [None]:
projection_density_matrix = projection_density_table.values

In [None]:
fig, ax = plt.subplots()
fig.set_size_inches(len(structures)/3,len(experiments)/3)

im = ax.imshow(projection_density_matrix)

ax.set_xticks(np.arange(len(structures)))
ax.set_yticks(np.arange(len(experiments)))

ax.set_xticklabels(structures, rotation=90)
ax.set_yticklabels(experiments)

ax.xaxis.tick_top()

fig.tight_layout()
plt.show()

## Download Section Images

It is also important to be able to access and view the section images for a given experiment. This can be done through the ImageDownloadApi.

In [None]:
from allensdk.api.queries.image_download_api import ImageDownloadApi
from allensdk.api.queries.mouse_connectivity_api import MouseConnectivityApi # import from the module that defines the class
import os

For a given experiment, we can access all its sections through the experiment ID.

In [None]:
ID = 503069254

We instantiate the ImageDownloadApi to download images and use the MouseConnectivityApi to access relevant data for those images, such as equalization parameters.

In [None]:
mca = MouseConnectivityApi()
ida = ImageDownloadApi()

For a given experiment, each section has a unique section_id, which we can access from its sections frame. Here, we see that this brain has been sectioned into 140 individual sections.

In [None]:
sections_frame = pd.DataFrame(ida.section_image_query(ID))

In [None]:
sections_frame

We are interested in the section_id corresponding to each section, so we extract those values from the data frame.

In [None]:
section_ids = sections_frame[['section_number','id']].values

In [None]:
print(section_ids)

We can access the equalization parameters from the experiment details, which we get from the MouseConnectivityApi (mca). This lets us get equalization ranges for each channel, to pass to the image downloader.

In [None]:
details = mca.get_experiment_detail(ID)
equalization_params = pd.DataFrame(details).T.loc['equalization'].values[0]
equalization_ranges = [equalization_params['red_lower'],equalization_params['red_upper'],equalization_params['green_lower'],
     equalization_params['green_upper'],equalization_params['blue_lower'],equalization_params['blue_upper']]

In [None]:
equalization_ranges

We can now download the images. It can be convenient to put them all in a directory for that particular experiment. Then we download each individual section. In this case, we set downsample=4, which halves the image dimensions four times, so each downloaded image is 16 times smaller along each axis than the original.

In [None]:
os.makedirs('./{0}/sections'.format(ID), exist_ok=True) # also creates the parent experiment directory if it is missing
    
for section_number, section_id in section_ids:
    ida.download_image(section_id,downsample=4,range=equalization_ranges,
                       file_path='./{0}/sections/{1}.jpg'.format(ID,section_number))
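
To estimate the size of a downloaded image, the downsample value can be turned into a scale factor. This sketch assumes that each downsample step halves both image dimensions; the helper `downsampled_size` and the example dimensions are hypothetical.

```python
def downsampled_size(width, height, downsample):
    """Image size after `downsample` successive halvings (integer division)."""
    factor = 2 ** downsample
    return width // factor, height // factor


# Hypothetical full-resolution section image of 32000 x 24000 pixels.
print(downsampled_size(32000, 24000, 4))
print(downsampled_size(32000, 24000, 0))
```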