# Query Cell Types Ontology

## Context

- This notebook has been put together for the MMB demo on 2022-05-30: see [slides](https://docs.google.com/presentation/d/1Ib1_8byK0hVuNS-wPbqmeL5Lcf67m_oegC3-czPw5ws/edit#slide=id.g116ba5ed71e_0_8)
- It has been revised following feedback from the meeting held on 2022-07-07: see [slides](https://docs.google.com/presentation/d/1mgCyYjHerLJLV79GM0kqp3_Htxmru_7QGIr5elUETC0/edit#slide=id.g13b4a370a10_0_19) and [JIRA ticket](https://bbpteam.epfl.ch/project/issues/browse/DKE-942)

## Imports

In [387]:
import json
import rdflib
import getpass
import pandas as pd
from rdflib import RDF, RDFS, XSD, OWL, URIRef, BNode, SKOS
import pprint
from kgforge.core import KnowledgeGraphForge

## Setup
Get an authentication token

For now, the [Nexus web application](https://bbp.epfl.ch/nexus/web) can be used to get a token. We are looking into simpler alternatives.

- Step 1: From the opened web page, click on the login button on the right corner and follow the instructions.

![login-ui](./login-ui.png)

- Step 2: At the end you’ll see a token button on the right corner. Click on it to copy the token.

![copy-token](./copy-token.png)


In [406]:
TOKEN = getpass.getpass()
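
One possible way to avoid pasting the token on every run is to read it from an environment variable and fall back to the prompt only when the variable is unset. This is a minimal sketch: the variable name `NEXUS_TOKEN` and the helper `get_token` are illustrative assumptions and are not part of Nexus or `kgforge`.

```python
import getpass
import os


def get_token(env_var="NEXUS_TOKEN", prompt=getpass.getpass):
    """Return the token from the environment, or prompt for it if unset.

    `env_var` is a hypothetical variable name chosen for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if token:
        return token
    # Fall back to the interactive prompt used in the cell above.
    return prompt().strip()
```

With this helper, the cell above would become `TOKEN = get_token()`.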


In [407]:
forge = KnowledgeGraphForge(# "https://raw.githubusercontent.com/BlueBrain/nexus-forge/master/examples/notebooks/use-cases/prod-forge-nexus.yml",
                            "/Users/akkaufma/Desktop/config/prod-forge-nexus.yml",
                            token=TOKEN,
                            searchendpoints={"sparql": {"endpoint": "https://bbp.epfl.ch/neurosciencegraph/data/views/aggreg-sp/dataset"}},
                            # endpoint="https://staging.nise.bbp.epfl.ch/nexus/v1",
                            bucket="bbp/atlas")

## Ontologies

### Set brain region

During the meeting on `2022-05-30`, it was specified that a brain region will serve as the entry point when searching for cell types in the MMB context. Hence, this notebook starts by defining the brain region for which cell types are wanted. Since the most complete cell type information is available for the `Cerebral cortex`, it is set as the default below.

In [529]:
BRAIN_REGION = "Cerebral cortex"
# BRAIN_REGION = "Caudoputamen"
# BRAIN_REGION = "Cerebellum"
# BRAIN_REGION = "Somatosensory areas"

Get brain region id

In [530]:
r = forge.search({"label": BRAIN_REGION}, cross_bucket=True)

In [531]:
brain_region = r[0].id

In [532]:
brain_region

'http://api.brain-map.org/api/v2/data/Structure/688'
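
Note that `r[0].id` raises an `IndexError` when the search returns nothing, and silently picks the first hit when several resources share a label. A small guard can make both cases explicit. This is a sketch only: the helper name `single_id` is hypothetical, and it assumes nothing more than that each result exposes an `id` attribute, as the resources above do. Whether raising on multiple hits is desirable depends on the buckets being searched.

```python
def single_id(results, label):
    """Return the id of the only search hit for `label`, or raise LookupError."""
    if not results:
        raise LookupError(f"No resource found with label {label!r}")
    # Several hits may point at the same resource, so compare distinct ids.
    ids = {res.id for res in results}
    if len(ids) > 1:
        raise LookupError(f"Label {label!r} is ambiguous: {sorted(ids)}")
    return ids.pop()
```

Usage would then be `brain_region = single_id(r, BRAIN_REGION)`.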

### Set species

In [533]:
SPECIES = "Mus musculus"

Get species id

In [534]:
r = forge.search({"label": SPECIES}, cross_bucket=True)

In [535]:
species = r[0].id

In [536]:
species

'http://purl.obolibrary.org/obo/NCBITaxon_10090'

## Queries

### Get brain regions that have neuron t-types available

This query lists the labels of the brain regions for which the knowledge graph has neuron t-types available.

In [537]:
query = f"""

SELECT ?brain_region

WHERE {{
        ?t_type_id subClassOf* <https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType> ;
                  canBeLocatedInBrainRegion ?brain_region_id .        
        ?brain_region_id label ?brain_region .
}}
"""

In [538]:
resources = forge.sparql(query, limit=1000)

In [539]:
df = forge.as_dataframe(resources)

In [540]:
set(df.brain_region)

{'Agranular insular area',
 'Anterior cingulate area',
 'Area prostriata',
 'Cerebellum',
 'Cerebral cortex',
 'Entorhinal area',
 'Entorhinal area, lateral part',
 'Entorhinal area, medial part, dorsal zone',
 'Fasciola cinerea',
 'Field CA1',
 'Field CA2',
 'Field CA3',
 'Hippocampal formation',
 'Hippocampo-amygdalar transition area',
 'Hypothalamus',
 'Induseum griseum',
 'Isocortex',
 'Parasubiculum',
 'Postsubiculum',
 'Presubiculum',
 'Prosubiculum',
 'Retrohippocampal region',
 'Retrosplenial area',
 'Retrosplenial area, ventral part',
 'Subiculum',
 'root'}

### Get brain regions that have neuron m-types available

In [541]:
query = f"""

SELECT ?brain_region

WHERE {{
        ?m_type_id subClassOf* <https://bbp.epfl.ch/ontologies/core/bmo/NeuronMorphologicalType> ;
                  canBeLocatedInBrainRegion ?brain_region_id .        
        ?brain_region_id label ?brain_region .
}}
"""

In [542]:
resources = forge.sparql(query, limit=1000)

In [543]:
df = forge.as_dataframe(resources)

In [544]:
set(df.brain_region)

{'Caudoputamen',
 'Field CA1',
 'Globus pallidus, external segment',
 'Globus pallidus, internal segment',
 'Hippocampal formation',
 'Isocortex',
 'Nucleus accumbens',
 'Pallidum, ventral region',
 'Reticular nucleus of the thalamus',
 'Substantia nigra, compact part',
 'Substantia nigra, reticular part',
 'Subthalamic nucleus',
 'Thalamus',
 'root'}

### Get brain regions that have neuron e-types available

In [545]:
query = f"""

SELECT ?brain_region

WHERE {{
        ?e_type_id subClassOf* <https://bbp.epfl.ch/ontologies/core/bmo/NeuronElectricalType> ;
                  canBeLocatedInBrainRegion ?brain_region_id .        
        ?brain_region_id label ?brain_region .
}}
"""

In [546]:
resources = forge.sparql(query, limit=1000)

In [547]:
df = forge.as_dataframe(resources)

In [548]:
set(df.brain_region)

{'Caudoputamen',
 'Globus pallidus, external segment',
 'Globus pallidus, internal segment',
 'Isocortex',
 'Nucleus accumbens',
 'Pallidum, ventral region',
 'Substantia nigra, compact part',
 'Substantia nigra, reticular part',
 'Subthalamic nucleus',
 'Thalamus',
 'root'}
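
The three queries above differ only in the class IRI. As a sketch, they could be generated from one template. The function name `regions_query` and the constant names are illustrative; the IRIs and property names are the ones used above. `DISTINCT` is added here so that duplicate labels are removed by the endpoint rather than by `set()` afterwards.

```python
# Class IRIs copied from the three queries above.
T_TYPE = "https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType"
M_TYPE = "https://bbp.epfl.ch/ontologies/core/bmo/NeuronMorphologicalType"
E_TYPE = "https://bbp.epfl.ch/ontologies/core/bmo/NeuronElectricalType"


def regions_query(class_iri):
    """Build the 'brain regions that have types of this class' query."""
    return f"""
SELECT DISTINCT ?brain_region

WHERE {{
        ?type_id subClassOf* <{class_iri}> ;
                 canBeLocatedInBrainRegion ?brain_region_id .
        ?brain_region_id label ?brain_region .
}}
"""
```

For example, `forge.sparql(regions_query(M_TYPE), limit=1000)` would stand in for the m-type query above.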

### Get possible t-types for a given brain region

This query lists the t-types for a given brain region (i.e. the `BRAIN_REGION` specified above). The `limit` parameter of the query is set to `1000`; increase it if more results are expected. For reference, the total number of available t-types for `Cerebral cortex` on `2022-07-08` was `252`.

In [549]:
query = f"""

SELECT ?brain_region ?t_type

WHERE {{
        ?t_type_id label ?t_type ;
            subClassOf* <https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType> ;
            canBeLocatedInBrainRegion <{brain_region}> .        
        <{brain_region}> label ?brain_region . 
}}
"""

In [550]:
resources = forge.sparql(query, limit=1000)

In [551]:
df = forge.as_dataframe(resources)

In [552]:
df.head()

Unnamed: 0,brain_region,t_type
0,Cerebral cortex,275_NP PPP
1,Cerebral cortex,289_L6 CT CTX
2,Cerebral cortex,293_L6 CT CTX
3,Cerebral cortex,286_L6 CT CTX
4,Cerebral cortex,281_L6 CT CTX
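
Because the `limit` passed to `forge.sparql` caps the number of rows returned, a result whose length equals the limit may be incomplete. The sketch below re-runs a query with a growing limit until the result is shorter than the limit. `fetch_all` is a hypothetical helper; it assumes that the callable it receives returns a list of at most `limit` results.

```python
def fetch_all(run_query, limit=100, max_limit=100_000):
    """Re-run `run_query(limit)` with a doubled limit until nothing is cut off.

    Stops at `max_limit` even if the result may still be truncated.
    """
    while True:
        results = run_query(limit)
        if len(results) < limit or limit >= max_limit:
            return results
        limit = min(limit * 2, max_limit)
```

In this notebook it could be called as `fetch_all(lambda n: forge.sparql(query, limit=n))`.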


### Get possible met-type combinations plus excitatory/inhibitory category for a given brain region and species

This query returns possible met-type combinations together with the excitatory/inhibitory category for the brain region and species set above. For each m-, e-, t- and transmitter type, the identifier and the current version in the knowledge graph are also returned. For a simplified view that shows only the labels, run the `df.drop()` cell below. The `version` is the revision of a given type in the knowledge graph; it was included following the Cell Types Meeting on 2022-07-07 to help with reproducibility (see the JIRA ticket [DKE-942](https://bbpteam.epfl.ch/project/issues/browse/DKE-942)).

In [553]:
query = f"""

SELECT ?brain_region ?brain_region_version ?species ?species_version ?transmitter ?transmitter_id ?transmitter_version ?t_type ?t_type_id ?t_type_version ?m_type ?m_type_id ?m_type_version ?e_type ?e_type_id ?e_type_version

WHERE {{
        ?t_type_id label ?t_type ;
                canBeLocatedInBrainRegion <{brain_region}> ;
                _rev ?t_type_version .
        
        <{brain_region}> label ?brain_region ;
            _rev ?brain_region_version .
        
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            canHaveTType ?t_type_id ;
            subClassOf* / hasNeurotransmitterType ?transmitter_id ;
            subClassOf* / hasInstanceInSpecies <{species}> .
        
        <{species}> label ?species ;        
            _rev ?species_version .
        
        ?transmitter_id label ?transmitter ;        
            _rev ?transmitter_version .

        ?e_type_id label ?e_type ;
            _rev ?e_type_version ;
            subClassOf* EType ;
            subClassOf* / canHaveMType ?m_type_id ;
            subClassOf* / canHaveTType ?t_type_id .            
}}
"""

In [554]:
resources = forge.sparql(query, limit=1000)

In [555]:
df = forge.as_dataframe(resources)

In [556]:
df.head()

Unnamed: 0,brain_region,brain_region_version,e_type,e_type_id,e_type_version,m_type,m_type_id,m_type_version,species,species_version,t_type,t_type_id,t_type_version,transmitter,transmitter_id,transmitter_version
0,Cerebral cortex,10,NCx_cNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,2,L1_DAC,http://uri.interlex.org/base/ilx_0383192,30,Mus musculus,49,9_Lamp5 Lhx6,https://bbp.epfl.ch/ontologies/core/ttypes/9_L...,4,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,4
1,Cerebral cortex,10,NCx_cNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,2,L1_DAC,http://uri.interlex.org/base/ilx_0383192,30,Mus musculus,49,9_Lamp5 Lhx6,https://bbp.epfl.ch/ontologies/core/ttypes/9_L...,4,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,4
2,Cerebral cortex,10,NCx_cNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,2,L1_DAC,http://uri.interlex.org/base/ilx_0383192,30,Mus musculus,49,12_Lamp5,https://bbp.epfl.ch/ontologies/core/ttypes/12_...,4,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,4
3,Cerebral cortex,10,NCx_cNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,2,L1_DAC,http://uri.interlex.org/base/ilx_0383192,30,Mus musculus,49,12_Lamp5,https://bbp.epfl.ch/ontologies/core/ttypes/12_...,4,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,4
4,Cerebral cortex,10,NCx_cNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,2,L1_DAC,http://uri.interlex.org/base/ilx_0383192,30,Mus musculus,49,6_Lamp5 Lhx6,https://bbp.epfl.ch/ontologies/core/ttypes/6_L...,4,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,4


In [558]:
df.drop(["brain_region_version", "e_type_id", "species_version", "e_type_version", "m_type_id", "m_type_version", "t_type_id", "t_type_version", "transmitter_id", "transmitter_version"], axis=1)

Unnamed: 0,brain_region,e_type,m_type,species,t_type,transmitter
0,Cerebral cortex,NCx_cNAC,L1_DAC,Mus musculus,9_Lamp5 Lhx6,Inhibitory
1,Cerebral cortex,NCx_cNAC,L1_DAC,Mus musculus,9_Lamp5 Lhx6,Inhibitory
2,Cerebral cortex,NCx_cNAC,L1_DAC,Mus musculus,12_Lamp5,Inhibitory
3,Cerebral cortex,NCx_cNAC,L1_DAC,Mus musculus,12_Lamp5,Inhibitory
4,Cerebral cortex,NCx_cNAC,L1_DAC,Mus musculus,6_Lamp5 Lhx6,Inhibitory
...,...,...,...,...,...,...
995,Cerebral cortex,NCx_cAC,L1_LAC,Mus musculus,12_Lamp5,Inhibitory
996,Cerebral cortex,NCx_cAC,L1_LAC,Mus musculus,6_Lamp5 Lhx6,Inhibitory
997,Cerebral cortex,NCx_cAC,L1_LAC,Mus musculus,5_Lamp5 Lhx6,Inhibitory
998,Cerebral cortex,NCx_cAC,L1_LAC,Mus musculus,7_Lamp5 Lhx6,Inhibitory
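
To make use of the version columns for reproducibility, the identifier and revision pairs can be recorded alongside any derived result. The following is a sketch under assumptions: rows are plain dictionaries (for instance from `df.to_dict("records")` on the full dataframe, before the `df.drop()` call), the helper names are hypothetical, and the column names follow the query above.

```python
import json

# Column prefixes that have both an `_id` and a `_version` column in the query.
KINDS = ("t_type", "m_type", "e_type", "transmitter")


def version_manifest(rows, kinds=KINDS):
    """Map each `<kind>_id` found in `rows` to its `<kind>_version`."""
    manifest = {}
    for row in rows:
        for kind in kinds:
            manifest[row[f"{kind}_id"]] = int(row[f"{kind}_version"])
    return manifest


def manifest_json(rows):
    """Serialise the manifest with stable key order so it can be diffed."""
    return json.dumps(version_manifest(rows), indent=2, sort_keys=True)
```

The resulting JSON string can be stored next to the exported results.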


### Get possible t-types for a given brain region and all the brain regions which are part of that brain region

This query returns possible t-types for the brain region one has set above and all the brain regions that are part of that brain region. E.g. if one specifies `Cerebral cortex` as brain region, this query would return t-types from the `Cerebral cortex` but also t-types for the `Isocortex` or the `Hippocampal formation` since they are both part of the `Cerebral cortex`.

In [559]:
query = f"""

SELECT ?brain_region ?brain_region_version ?t_type ?t_type_id ?t_type_version

WHERE {{
        ?t_type_id label ?t_type ;
                subClassOf* <https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType> ;
                canBeLocatedInBrainRegion ?brain_region_id ;
                _rev ?t_type_version .
        
        ?brain_region_id label ?brain_region ;
            ^hasPart* <{brain_region}> ;
            _rev ?brain_region_version .            
}}
"""

In [560]:
resources = forge.sparql(query, limit=500)

In [561]:
df = forge.as_dataframe(resources)

In [562]:
len(set(df.t_type))

359

In [563]:
df

Unnamed: 0,brain_region,brain_region_version,t_type,t_type_id,t_type_version
0,Postsubiculum,12,129_L2/3 IT POST-PRE,https://bbp.epfl.ch/ontologies/core/ttypes/129...,4
1,Postsubiculum,12,132_L2 IT RSPv-POST-PRE,https://bbp.epfl.ch/ontologies/core/ttypes/132...,4
2,Postsubiculum,12,133_L2 IT RSPv-POST-PRE,https://bbp.epfl.ch/ontologies/core/ttypes/133...,4
3,Postsubiculum,12,130_L2/3 IT POST-PRE,https://bbp.epfl.ch/ontologies/core/ttypes/130...,4
4,Postsubiculum,12,131_L2/3 IT POST-PRE,https://bbp.epfl.ch/ontologies/core/ttypes/131...,4
...,...,...,...,...,...
387,Entorhinal area,11,298_L6 CT ENT,https://bbp.epfl.ch/ontologies/core/ttypes/298...,4
388,Entorhinal area,11,299_L6 CT ENT,https://bbp.epfl.ch/ontologies/core/ttypes/299...,4
389,Entorhinal area,11,300_L6b ENT,https://bbp.epfl.ch/ontologies/core/ttypes/300...,4
390,Entorhinal area,11,301_L6b ENT,https://bbp.epfl.ch/ontologies/core/ttypes/301...,4
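
The property path `^hasPart*` in the query above follows `hasPart` links zero or more times, starting from the selected region, so the region itself and every region nested below it match. The sketch below illustrates these semantics in plain Python. The `HAS_PART` dictionary is a hand-written, heavily simplified stand-in for the real parcellation ontology (intermediate levels are omitted), and the function name is hypothetical.

```python
# Toy hierarchy: parent label -> labels of its direct parts (assumption for illustration).
HAS_PART = {
    "Cerebral cortex": ["Isocortex", "Hippocampal formation"],
    "Hippocampal formation": ["Field CA1"],
}


def self_and_descendants(region, has_part=HAS_PART):
    """Return the regions reachable from `region` via zero or more hasPart steps."""
    found = set()
    stack = [region]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(has_part.get(current, []))
    return found
```

Because the path allows zero steps, `self_and_descendants("Isocortex")` contains `Isocortex` itself, which mirrors why t-types attached directly to the selected region are also returned.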


### Get m-types together with their transmitter type (sClass)

In [564]:
query = f"""

SELECT ?transmitter ?m_type

WHERE {{
        ?m_type_id label ?m_type ;
            subClassOf* MType ;
            subClassOf* / hasNeurotransmitterType / label ?transmitter .           
}}
"""

In [565]:
resources = forge.sparql(query, limit=100)

In [566]:
df = forge.as_dataframe(resources)

In [567]:
df.head()

Unnamed: 0,m_type,transmitter
0,CCK-positive basket cell,Inhibitory
1,Parvalbumin-positive basket cell,Inhibitory
2,L1_DAC,Inhibitory
3,L1_HAC,Inhibitory
4,L1_LAC,Inhibitory


### Get m-types of pyramidal cells (mClass)

In [568]:
query = f"""

SELECT ?mClass ?m_type

WHERE {{
        ?m_type_id label ?m_type ;
            subClassOf* MType ;
            subClassOf* <https://neuroshapes.org/PyramidalNeuron> .           
}}
"""

In [569]:
resources = forge.sparql(query, limit=100)

In [570]:
df = forge.as_dataframe(resources)

In [571]:
df.head()

Unnamed: 0,m_type
0,L3_TPC:C
1,L5_TPC:B
2,L5_TPC:A
3,L3_TPC:A
4,L2_TPC:B


### Get m-types of interneurons (mClass)

In [572]:
query = f"""

SELECT ?mClass ?m_type

WHERE {{
        ?m_type_id label ?m_type ;
            subClassOf* MType ;
            subClassOf* <https://neuroshapes.org/Interneuron> .           
}}
"""

In [573]:
resources = forge.sparql(query, limit=100)

In [574]:
df = forge.as_dataframe(resources)

In [575]:
df.head()

Unnamed: 0,m_type
0,L1_NGC-DA
1,L1_NGC-SA
2,L23_LBC
3,L23_NBC
4,L23_SBC


### Get m-types with a given morphology and the morphology definition

This query returns m-types that have a given morphological shape. The cell morphologies were taken from the [Phenotype and Trait Ontology](https://ontobee.org/ontology/PATO) (following a request from Georges Khazen to include the `PATO` definitions of morphologies).
Set `MORPHOLOGY` below to one of the following:

- `standard pyramidal morphology`
- `pyramidal family morphology`
- `tufted pyramidal morphology`
- `basket cell morphology`
- `chandelier cell morphology`
- `neurogliaform morphology`
- `Martinotti morphology`
- `cortical bipolar morphology`
- `bitufted cell morphology`

In [576]:
MORPHOLOGY = "basket cell morphology"
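
Since `MORPHOLOGY` is interpolated into the query as a string literal, a misspelt label simply yields an empty result. A small check against the list above can catch this early. This is a sketch; `MORPHOLOGIES` and `check_morphology` are hypothetical names, and the labels are the ones listed above.

```python
MORPHOLOGIES = (
    "standard pyramidal morphology",
    "pyramidal family morphology",
    "tufted pyramidal morphology",
    "basket cell morphology",
    "chandelier cell morphology",
    "neurogliaform morphology",
    "Martinotti morphology",
    "cortical bipolar morphology",
    "bitufted cell morphology",
)


def check_morphology(label, allowed=MORPHOLOGIES):
    """Return `label` unchanged if it is a known morphology, else raise ValueError."""
    if label not in allowed:
        raise ValueError(f"Unknown morphology {label!r}; expected one of {allowed}")
    return label
```

It could be applied as `MORPHOLOGY = check_morphology("basket cell morphology")`.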

In [577]:
query = f"""

SELECT ?cell ?definition

WHERE {{
        ?cell_id subClassOf* / hasMorphologicalPhenotype ?pato_id ;
                  label ?cell .
        ?pato_id subClassOf* / label "{MORPHOLOGY}" .
        ?parent_pato_id label "{MORPHOLOGY}" ;
                <http://purl.obolibrary.org/obo/IAO_0000115> ?definition .
}}
"""

In [578]:
resources = forge.sparql(query, limit=100)

In [579]:
df = forge.as_dataframe(resources)

In [580]:
df.head()

Unnamed: 0,cell,definition
0,BC,A cell morphology that inheres in multipolar n...
1,Hippocampus basket cell,A cell morphology that inheres in multipolar n...
2,L23_LBC,A cell morphology that inheres in multipolar n...
3,L23_NBC,A cell morphology that inheres in multipolar n...
4,L23_SBC,A cell morphology that inheres in multipolar n...


### Get t-types from a specific paper

The [Cell Types and Missing Data - Version 1](https://docs.google.com/spreadsheets/d/1iUgqPszKkYQgkJlmpQSkeyFWcEoOxovsBkoLPtA3qPg/edit#gid=1180597294) spreadsheet, which served as the source for the `Cell Types Ontology`, lists paper identifiers on its `Notes` sheet. These references were added to the respective t-types.
Set `PAPER` below to one of the following:

- Yao et al. 2021: `https://www.sciencedirect.com/science/article/pii/S0092867421005018?dgcid=rss_sd_all`
- Gokce 2016: `https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC5004635/`
- Kozareva et al. 2021: `https://www.nature.com/articles/s41586-021-03220-z`
- Chen et al. 2017: `https://www.sciencedirect.com/science/article/pii/S2211124717303212?via%3Dihub`
- Kalish et al. 2018: `https://www.pnas.org/content/115/5/E1051`

In [581]:
PAPER = "https://www.sciencedirect.com/science/article/pii/S0092867421005018?dgcid=rss_sd_all"

In [582]:
query = f"""

SELECT ?label ?brain_region

WHERE {{
        ?id seeAlso <{PAPER}> ;
            label ?label .
        OPTIONAL {{ ?id canHaveBrainRegion / label ?brain_region }} .
}}
"""

In [583]:
resources = forge.sparql(query, limit=100)

In [584]:
df = forge.as_dataframe(resources)

In [585]:
df.head()

Unnamed: 0,label
0,4_Meis2 HPF
1,9_Lamp5 Lhx6
2,12_Lamp5
3,3_Meis2
4,6_Lamp5 Lhx6


### Get the generic m-, e- and t-types

One of the requirements specified during the meeting on `2022-05-30` was to have a placeholder class for each of the types. We thus implemented an m-, e- and t-type placeholder class.

In [586]:
query = """

SELECT DISTINCT ?id ?prefLabel ?label

WHERE {
        ?id a Class .
        { ?id prefLabel ?prefLabel .
            FILTER (CONTAINS(STR(?prefLabel), 'Generic')) }
        UNION
        { ?id label ?label .
            FILTER (CONTAINS(STR(?label), 'Generic')) }
    
}
"""

In [587]:
resources = forge.sparql(query, limit=100)

In [588]:
df = forge.as_dataframe(resources)

In [589]:
df

Unnamed: 0,id,prefLabel,label
0,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,Generic Excitatory Neuron EType,
1,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,Generic Excitatory Neuron MType,
2,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,Generic Inhibitory Neuron EType,
3,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,Generic Inhibitory Neuron MType,
4,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,,Generic Neuron Electrical Type
5,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,,Generic Neuron Morphological Type
6,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,,Generic Neuron TType


### Get all cell type combinations and probabilities for a given brain region

This query returns cell type combinations together with their probabilities for a given brain region (i.e. the `BRAIN_REGION` specified above). For demonstration purposes, the `limit` parameter on the query has been set to `1000`. This can be increased to get all available cell type combinations.

In [590]:
query = f"""

SELECT ?brain_region ?m_type ?e_type ?molecular_type ?probability

WHERE {{
        ?probability_id hasTarget / hasSource / canBeLocatedInBrainRegion ?brain_region_id ;
            hasBody / value ?probability ;
            hasTarget ?m_type_target ;
            hasTarget ?e_type_target ;
            hasTarget ?molecular_type_target .
        ?brain_region_id label ?brain_region ;
            ^hasPart* <{brain_region}> .
        ?m_type_target hasSource / a MType ;
            hasSource / label ?m_type .
        ?e_type_target hasSource / a EType ;
            hasSource / label ?e_type .
        ?molecular_type_target hasSource / a NeuronMolecularType ;
            hasSource / label ?molecular_type .
}}
"""

In [591]:
resources = forge.sparql(query, limit=1000)

In [592]:
df = forge.as_dataframe(resources)

In [593]:
df.head()

Unnamed: 0,brain_region,e_type,m_type,molecular_type,probability
0,Isocortex,cNAC,L5_BTC,Sncg+,0.044444444
1,Isocortex,bIR,L23_DBC,SST+,0.346590909
2,Isocortex,cAC,L23_BP,PV+,0.0
3,Isocortex,cAC,L23_BTC,PV+,0.5
4,Isocortex,cAC,L23_ChC,PV+,0.333333333
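
Some of the returned combinations have a probability of `0.0`. As a sketch, such rows can be dropped and the rest ordered by probability. The helper name is hypothetical, and the rows are assumed to be plain dictionaries with the column names of the query above (for instance from `df.to_dict("records")`).

```python
def nonzero_combinations(rows):
    """Drop combinations whose probability is 0 and sort the rest, highest first."""
    kept = [row for row in rows if float(row["probability"]) > 0]
    return sorted(kept, key=lambda row: float(row["probability"]), reverse=True)
```

`float()` is applied defensively in case the endpoint returns the values as strings.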


----

## 2022-08-31 Workshop queries

### Get Atlas Release info

#### Get the atlas release resource
These atlas releases can be explored through the atlas web app:

* dev: https://bluebrainatlas.kcpdev.bbp.epfl.ch/atlas
* prod: https://bbp.epfl.ch/atlas


In [594]:
BBP_Mouse_Brain_Atlas_Release = "https://bbp.epfl.ch/neurosciencegraph/data/4906ab85-694f-469d-962f-c0174e901885" # output of the BBP Annotation Atlas pipeline
Allen_Mouse_CCF_v2_v3_hybrid =  "https://bbp.epfl.ch/neurosciencegraph/data/e2e500ec-fe7e-4888-88b9-b72425315dda" # Csaba 1 version: this atlas release uses the brain parcellation resulting from the hybridization of CCFv2 and CCFv3 and integrates the splitting of layer 2 and layer 3. The average brain template and the ontology are common across CCFv2 and CCFv3.
ALLEN_CCFV3_Atlas_Release =  "https://bbp.epfl.ch/neurosciencegraph/data/831a626a-c0ae-4691-8ce8-cfb7491345d9" # original Allen
ALLEN_CCFV2_Atlas_Release = "https://bbp.epfl.ch/neurosciencegraph/data/dd114f81-ba1f-47b1-8900-e497597f06ac"
BBP_Mouse_Brain_Atlas_Release_staging = "https://bbp.epfl.ch/neurosciencegraph/data/brainatlasrelease/c96c71a8-4c0d-4bc1-8a1a-141d9ed6693d"

atlas_release_id = BBP_Mouse_Brain_Atlas_Release

In [595]:
atlas_release = forge.retrieve(atlas_release_id, version=1)

In [None]:
print(atlas_release)

In [492]:
atlas_release._store_metadata["_rev"]

1

#### Get the atlas hierarchy

In [597]:
parcellation_ontology = forge.retrieve(atlas_release.parcellationOntology.id, cross_bucket=True)

In [None]:
print(parcellation_ontology)

In [495]:
forge.download(parcellation_ontology, "distribution.contentUrl", ".", overwrite=True, cross_bucket=True)

#### Get parcellation (annotation) volume

In [496]:
parcellation_volume = forge.retrieve(atlas_release.parcellationVolume.id)

In [None]:
print(parcellation_volume)

In [498]:
forge.download(parcellation_volume, "distribution.contentUrl", ".", overwrite=True)

#### Get orientation field volume

In [598]:
query = {
          "type":"CellOrientationField", 
          "atlasRelease":{"@id":atlas_release_id},
          "brainLocation":{"brainRegion":{"id":"http://api.brain-map.org/api/v2/data/Structure/997"}} # root brain region
        }
cell_orientation_field = forge.search(query)
print(f"{len(cell_orientation_field)} found")

2 found


In [None]:
print(cell_orientation_field[0])

In [501]:
forge.download(cell_orientation_field, "distribution.contentUrl", ".", overwrite=True)

### Get the me-type density nrrd file for each region (region is an input)

In [600]:
# Some atlas releases might not have associated densities
query = f"""

SELECT ?mtype_label ?etype_label ?nrrd_file ?contentUrl 

WHERE {{
        ?s a METypeDensity ;
            atlasRelease <{atlas_release_id}>; 
            annotation ?mtypeanno ;
            annotation ?etypeanno ;
            distribution ?distribution .
        ?mtypeanno a MTypeAnnotation ;         
            hasBody / label ?mtype_label .
        ?etypeanno a ETypeAnnotation ;         
            hasBody / label ?etype_label .
        ?distribution name ?nrrd_file ;
            contentUrl ?contentUrl .
}}
"""

In [601]:
resources = forge.sparql(query, limit=1000)

In [602]:
df = forge.as_dataframe(resources)

In [603]:
df.head()

Unnamed: 0,contentUrl,etype_label,mtype_label,nrrd_file
0,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/f...,cADpyr,L6_TPC:A,L6_TPC:A_cADpyr.nrrd
1,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/b...,cADpyr,L5_TPC:C,L5_TPC:C_cADpyr.nrrd
2,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/7...,cADpyr,L5_TPC:B,L5_TPC:B_cADpyr.nrrd
3,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/b...,cADpyr,L2_TPC:B,L2_TPC:B_cADpyr.nrrd
4,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/6...,cADpyr,L4_SSC,L4_SSC_cADpyr.nrrd


In [None]:
forge.download(resources, "contentUrl", ".", overwrite=True)
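
To look up the density file of one particular me-type, the result rows can be indexed by their m-type and e-type labels. This is a sketch: `density_index` is a hypothetical helper, and the rows are assumed to be plain dictionaries with the column names of the query above (for instance from `df.to_dict("records")`).

```python
def density_index(rows):
    """Map (mtype_label, etype_label) to the contentUrl of its density file.

    If a pair occurs more than once, the first URL is kept.
    """
    index = {}
    for row in rows:
        key = (row["mtype_label"], row["etype_label"])
        index.setdefault(key, row["contentUrl"])
    return index
```

A lookup then reads `density_index(rows)[("L4_SSC", "cADpyr")]`.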

### Get the list of mtypes for each region (region is an input)

#### The query below gets m-types from the specified brain region plus m-types from all child brain regions (added during the workshop to illustrate the `down the tree` generalisation idea)

In [507]:
BRAIN_REGION = "Cerebral cortex"

Get brain region id

In [508]:
r = forge.search({"label": BRAIN_REGION}, cross_bucket=True)

In [509]:
brain_region = r[0].id

In [510]:
brain_region

'http://api.brain-map.org/api/v2/data/Structure/688'

In [511]:
query = f"""

SELECT ?m_type ?m_type_id ?m_type_version ?brain_region

WHERE {{
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            canBeLocatedInBrainRegion ?brain_region_id .
        
        ?brain_region_id label ?brain_region ;
            ^hasPart* <{brain_region}> ;
            _rev ?brain_region_version . 
}}
"""

In [512]:
resources = forge.sparql(query, limit=1000)

In [513]:
df = forge.as_dataframe(resources)

In [514]:
df.head()

Unnamed: 0,brain_region,m_type,m_type_id,m_type_version
0,Isocortex,L5_TPC:B,http://uri.interlex.org/base/ilx_0381364,33
1,Isocortex,L5_TPC:A,http://uri.interlex.org/base/ilx_0381365,31
2,Isocortex,L3_TPC:A,http://uri.interlex.org/base/ilx_0381366,31
3,Isocortex,L2_TPC:B,http://uri.interlex.org/base/ilx_0381367,33
4,Isocortex,L3_TPC:B,http://uri.interlex.org/base/ilx_0381368,30


In [515]:
set(df.brain_region)

{'Field CA1', 'Hippocampal formation', 'Isocortex'}

#### The query below gets m-types from the specified brain region plus m-types from parent brain regions (added during the workshop to illustrate the `up the tree` generalisation idea)

In [516]:
BRAIN_REGION = "Entorhinal area, medial part, dorsal zone"

Get brain region id

In [517]:
r = forge.search({"label": BRAIN_REGION}, cross_bucket=True)

In [518]:
brain_region = r[0].id

In [519]:
brain_region

'http://api.brain-map.org/api/v2/data/Structure/926'

In [520]:
query = f"""

SELECT ?m_type ?m_type_id ?m_type_version ?brain_region

WHERE {{
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            canBeLocatedInBrainRegion ?brain_region_id .
        
        ?brain_region_id label ?brain_region ;
            hasPart* <{brain_region}> ;
            _rev ?brain_region_version . 
}}
"""

In [521]:
resources = forge.sparql(query, limit=1000)

In [522]:
df = forge.as_dataframe(resources)

In [523]:
df.head()

Unnamed: 0,brain_region,m_type,m_type_id,m_type_version
0,root,GEN_mtype,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,2
1,root,GIN_mtype,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,2
2,Hippocampal formation,SR_PC,https://bbp.epfl.ch/neurosciencegraph/ontologi...,28
3,Hippocampal formation,GCL_GC,http://bbp.epfl.ch/neurosciencegraph/ontologie...,4
4,Hippocampal formation,SL_IS2,http://bbp.epfl.ch/neurosciencegraph/ontologie...,4


In [524]:
set(df.brain_region)

{'Hippocampal formation', 'root'}
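
The `down the tree` and `up the tree` queries differ only in the direction of the `hasPart` path: `^hasPart*` walks from the given region to its parts, while `hasPart*` walks from the given region to the regions that contain it. As a sketch, both can be produced by one function; `m_types_query` and its `direction` parameter are hypothetical names, and the query body is the one used above.

```python
def m_types_query(brain_region_iri, direction="down"):
    """Build the m-type query for a region plus its child ('down') or parent ('up') regions."""
    paths = {"down": "^hasPart*", "up": "hasPart*"}
    if direction not in paths:
        raise ValueError(f"direction must be 'down' or 'up', got {direction!r}")
    path = paths[direction]
    return f"""
SELECT ?m_type ?m_type_id ?m_type_version ?brain_region

WHERE {{
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            canBeLocatedInBrainRegion ?brain_region_id .

        ?brain_region_id label ?brain_region ;
            {path} <{brain_region_iri}> ;
            _rev ?brain_region_version .
}}
"""
```

For example, `forge.sparql(m_types_query(brain_region, direction="up"), limit=1000)` would stand in for the `up the tree` query above.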

### Check whether an m-type is excitatory or inhibitory (m-type is the input)

See `Get m-types together with their transmitter type (sClass)` query above

### Check whether an m-type is pyramidal or interneuron (m-type is the input)

See `Get m-types of pyramidal cells (mClass)` and `Get m-types of interneurons (mClass)` queries above

### Get all brain regions of layer 1 of the neocortex

In [525]:
query = f"""

SELECT ?id ?region

WHERE {{
        ?id hasLayerLocationPhenotype <http://purl.obolibrary.org/obo/UBERON_0005390> ;
            subClassOf* BrainRegion ;
            label ?region
}}
"""

In [526]:
resources = forge.sparql(query, limit=1000)

In [527]:
df = forge.as_dataframe(resources)

In [528]:
df.head()

Unnamed: 0,id,region
0,http://api.brain-map.org/api/v2/data/Structure...,"Primary auditory area, layer 1"
1,http://api.brain-map.org/api/v2/data/Structure...,"Primary somatosensory area, trunk, layer 1"
2,http://api.brain-map.org/api/v2/data/Structure...,"Dorsal auditory area, layer 1"
3,http://api.brain-map.org/api/v2/data/Structure...,"Ventral auditory area, layer 1"
4,http://api.brain-map.org/api/v2/data/Structure...,"Posterior auditory area, layer 1"
