# MapMySections (https://alleninstitute.github.io/MapMySections/)

Accurately defining the connection between genetic tools and known cell types is a critical step in interpreting the results of experiments that use these tools, from functional assays to potential gene therapies. A detailed cell type characterization of the labeled cells can be achieved through a combination of single-cell RNA sequencing and cell sorting; however, such methods are slow, costly, and prone to bias. A method for inferring cell types directly from fluorescent images, without additional experiments, would be immediately applicable to thousands of existing genetic tools and would greatly improve their utility and interpretability. Entrants are tasked with 1) creating an algorithm that accurately matches fluorescent images of genetic tools to the most likely spatial transcriptomic cell types and/or 2) presenting such an algorithm as part of a user-friendly tool like MapMyCells. Although the images span the whole brain, this challenge focuses on defining cell types in primary visual cortex (VISp).

This challenge includes images and associated cell type specificity for anonymized genetic tools from the Genetics Tools Atlas, a searchable web resource presenting information and data on enhancer-adeno-associated viruses (enhancer AAVs) and mouse transgenes. The atlas summarizes data in multiple modalities, but only coronal sections collected using Serial Two-Photon Tomography (STPT) are directly included in the challenge. Cell type specificity is assessed by applying single-cell RNA sequencing (SMART-Seq v4) to fluorescently labeled cells from each genetic tool and then mapping these cells to the published taxonomy of cell types in the whole mouse brain (Yao et al., 2023), which is available in the Allen Brain Cell Atlas. Qualitative assessments of cell labeling patterns are also provided for many of the genetic tools.

In [1]:
import pandas as pd

In [3]:
workbook = pd.ExcelFile("MapMySections_EntrantData.xlsx")
sheets = workbook.sheet_names
print(sheets)

['README', 'Entrant Information', 'Column Descriptions', 'Training Set', 'Test Set']


In [10]:
df = workbook.parse(sheets[2])
for index, row in df.iterrows():
    print(row["Column Name"], ":", row["Description"], "\n")

MapMySectionsID : A unique, anonymized identifier for each experiment. These identifiers will be linked to actual experiment identifiers in the Genetics Tools Atlas at the conclusion of the challenge 

Genetics Tools Type : Either “Enhancer AAV” or “Mouse transgenic line” to indicating the underlying type of genetic tool. Note that the training and test data sets each include some enhancer AAVs and some mouse transgenic lines. 

STPT Data File Path : Link to the image series in OME-Zarr format. This is the primary data required for the challenge. See Data Challenge page for resources showing how to access and use these data. 

STPT Thumbnail Image : Link to a small thumbnail image of one section of the genetic tool that intersects VISp for quick viewing. These are potentially useful as a sanity check of algorithm results or for inclusion in user-created tools. 

Neuroglancer File Path : Link to same data for visualization using Neuroglancer. See Data Challenge page for resources showin

## Important sections of the dataframe and dataset

### Qualitative Image Assessment

Sets of comma-separated qualitative assessments for labeling strength, labeling density, and labeled cell populations made based on review of epifluorescence image data for the same genetic tool. In cases where more than one assessment is included, cells from all of the listed types were labeled. While these calls do not constitute quantitative validation, they are likely to reflect actual cell type assignments in cases when SMART-Seq v4 data is missing. Note that a value of "Neuron" indicates that the specific neuronal populations targeted could not be accurately assessed in that experiment. 

Note: check this column when SMART-Seq v4 (SSv4) data are missing; in those rows the cell type columns are blank.
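
The assessment strings described above can be split into structured records before use. The following is a minimal sketch, not part of the challenge materials. It assumes that multiple assessments in one cell are separated by semicolons (as the training rows shown in this notebook suggest) and that each assessment lists strength, density, and population in that order. The names `Assessment` and `parse_assessments` are hypothetical.

```python
from typing import NamedTuple


class Assessment(NamedTuple):
    """One qualitative call (hypothetical helper type)."""
    strength: str    # e.g. "strong", "weak", "[not assessed]"
    density: str     # e.g. "dense", "sparse", "[not assessed]"
    population: str  # e.g. "GABAergic", "Astro", "Neuron"


def parse_assessments(value):
    """Split one 'Qualitative Image Assessment' cell into Assessment tuples.

    Assumption: assessments are separated by ';' and each one has exactly
    three comma-separated fields (strength, density, population).
    """
    if not isinstance(value, str) or not value.strip():
        return []  # missing cell (e.g. NaN) yields no assessments
    parsed = []
    for entry in value.split(";"):
        fields = [field.strip() for field in entry.split(",")]
        if len(fields) != 3:
            continue  # skip entries that do not match the assumed layout
        parsed.append(Assessment(*fields))
    return parsed


# A single assessment taken from the training set, and a made-up pair
single = parse_assessments("strong, sparse, GABAergic")
pair = parse_assessments("strong, dense, Astro; weak, sparse, Oligo")
print(single)
print([a.population for a in pair])
```

Applied to the training sheet, this could be used as `df["Qualitative Image Assessment"].apply(parse_assessments)`. Entries that do not have three fields are silently skipped here; raising an error instead may be preferable when validating the data.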

In [6]:
df = workbook.parse(sheets[3])
df.head(7)

Unnamed: 0,MapMySectionsID,Genetics Tools Type,STPT Data File Path,STPT Thumbnail Image,Neuroglancer File Path,CCF Registered Image File Path,Target_Cell_Population,Qualitative Image Assessment,|,ABC.NN,...,Oligo.NN,Peri.NN,Pvalb.Gaba,Pvalb.chandelier.Gaba,SMC.NN,Sncg.Gaba,Sst.Chodl.Gaba,Sst.Gaba,VLMC.NN,Vip.Gaba
0,MMS.training.001,Enhancer AAV,s3://allen-genetic-tools/tissuecyte/1285852977...,https://s3.us-west-2.amazonaws.com/allen-genet...,https://neuroglancer-demo.appspot.com/#!s3://a...,https://s3.us-west-2.amazonaws.com/map-my-sect...,Pax6,"strong, sparse, GABAergic",|,,...,,,,,,,,,,
1,MMS.training.002,Enhancer AAV,s3://allen-genetic-tools/tissuecyte/1283915380...,https://s3.us-west-2.amazonaws.com/allen-genet...,https://neuroglancer-demo.appspot.com/#!s3://a...,https://s3.us-west-2.amazonaws.com/map-my-sect...,Pvalb,"strong, dense, Astro",|,,...,,,,,,,,,,
2,MMS.training.003,Enhancer AAV,s3://allen-genetic-tools/tissuecyte/1121585094...,https://s3.us-west-2.amazonaws.com/allen-genet...,https://neuroglancer-demo.appspot.com/#!s3://a...,https://s3.us-west-2.amazonaws.com/map-my-sect...,Pvalb,"strong, dense, GABAergic; strong, dense, L5_Gl...",|,,...,,,,,,,,,,
3,MMS.training.004,Enhancer AAV,s3://allen-genetic-tools/tissuecyte/1200969502...,https://s3.us-west-2.amazonaws.com/allen-genet...,https://neuroglancer-demo.appspot.com/#!s3://a...,https://s3.us-west-2.amazonaws.com/map-my-sect...,Endo,"strong, sparse, Endo_Peri; weak, dense, Endo_P...",|,0.0,...,0.0,0.0,3.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
4,MMS.training.005,Enhancer AAV,s3://allen-genetic-tools/tissuecyte/1177798305...,https://s3.us-west-2.amazonaws.com/allen-genet...,https://neuroglancer-demo.appspot.com/#!s3://a...,https://s3.us-west-2.amazonaws.com/map-my-sect...,Oligo,"strong, dense, Oligo",|,,...,,,,,,,,,,
5,MMS.training.006,Enhancer AAV,s3://allen-genetic-tools/tissuecyte/1195060569...,https://s3.us-west-2.amazonaws.com/allen-genet...,https://neuroglancer-demo.appspot.com/#!s3://a...,https://s3.us-west-2.amazonaws.com/map-my-sect...,L4_IT,"[not assessed], [not assessed], Neuron; strong...",|,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0
6,MMS.training.007,Enhancer AAV,s3://allen-genetic-tools/tissuecyte/1171819575...,https://s3.us-west-2.amazonaws.com/allen-genet...,https://neuroglancer-demo.appspot.com/#!s3://a...,https://s3.us-west-2.amazonaws.com/map-my-sect...,L6_IT_Car3,"strong, sparse, L6_Glutamatergic; weak, dense,...",|,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
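
In the preview above, the columns to the right of the `|` column hold the cell type values, and several rows are entirely blank there. The sketch below shows one way to flag rows that have any cell type values and to convert the values to per-row fractions. It assumes the `|` column acts as a separator between metadata and cell type columns, and that the values are per-type cell counts; neither is confirmed by the column descriptions shown here. The function names and the toy dataframe are hypothetical.

```python
import pandas as pd


def label_columns(df, separator="|"):
    """Return the columns to the right of the separator column.

    Assumption: every column after '|' is a cell type column.
    """
    cols = list(df.columns)
    return cols[cols.index(separator) + 1:]


def label_fractions(df, separator="|"):
    """Flag rows with cell type values and normalize each row to sum to 1."""
    labels = df[label_columns(df, separator)]
    has_labels = labels.notna().any(axis=1)  # False where every value is NaN
    totals = labels.sum(axis=1)              # NaN values are skipped
    # Rows with a zero total stay NaN instead of dividing by zero
    fractions = labels.div(totals.where(totals > 0), axis=0)
    return has_labels, fractions


# Toy data with the same layout as the training sheet (values are made up)
toy = pd.DataFrame({
    "MapMySectionsID": ["toy.001", "toy.002"],
    "|": ["|", "|"],
    "Pvalb.Gaba": [None, 3.0],
    "Sst.Gaba": [None, 1.0],
})
has_labels, fractions = label_fractions(toy)
print(has_labels.tolist())
print(fractions)
```

Rows where `has_labels` is `False` are the ones for which the qualitative image assessment is the only available indication of cell type.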
