# Query Cell Types Ontology

## Context

- This notebook has been put together for the MMB demo on 2022-05-30: see [slides](https://docs.google.com/presentation/d/1Ib1_8byK0hVuNS-wPbqmeL5Lcf67m_oegC3-czPw5ws/edit#slide=id.g116ba5ed71e_0_8)
- It has been revised following feedback from the meeting held on 2022-07-07: see [slides](https://docs.google.com/presentation/d/1mgCyYjHerLJLV79GM0kqp3_Htxmru_7QGIr5elUETC0/edit#slide=id.g13b4a370a10_0_19) and [JIRA ticket](https://bbpteam.epfl.ch/project/issues/browse/DKE-942)

## Imports

In [191]:
import json
import rdflib
import getpass
import pandas as pd
from rdflib import RDF, RDFS, XSD, OWL, URIRef, BNode, SKOS
import pprint
from kgforge.core import KnowledgeGraphForge

## Setup
Get an authentication token

For now, a token can be obtained from the [Nexus web application](https://bbp.epfl.ch/nexus/web). Simpler alternatives are being looked into.

- Step 1: From the opened web page, click on the login button on the right corner and follow the instructions.

![login-ui](./login-ui.png)

- Step 2: At the end you’ll see a token button on the right corner. Click on it to copy the token.

![copy-token](./copy-token.png)


In [None]:
TOKEN = getpass.getpass()
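One possible alternative to pasting the token at every run is to read it from an environment variable and only fall back to the prompt when the variable is unset. This is a minimal sketch: the variable name `NEXUS_TOKEN` and the helper `get_token` are assumptions made for illustration and are not part of Nexus or `kgforge`.

```python
import getpass
import os


def get_token(env_var="NEXUS_TOKEN", prompt=getpass.getpass):
    """Return the token from the environment, or ask for it interactively.

    `env_var` is a hypothetical variable name chosen for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if token:
        return token
    # Fall back to the interactive prompt used in this notebook.
    return prompt()
```

With this helper, the cell above would become `TOKEN = get_token()`.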

In [194]:
forge = KnowledgeGraphForge("https://raw.githubusercontent.com/BlueBrain/nexus-forge/master/examples/notebooks/use-cases/prod-forge-nexus.yml",
                            token=TOKEN,
                            searchendpoints={"sparql": {"endpoint": "https://bbp.epfl.ch/neurosciencegraph/data/views/aggreg-sp/dataset"}},
                            endpoint="https://staging.nise.bbp.epfl.ch/nexus/v1",
                            bucket="bbp/atlas")

## Ontologies

### Set brain region

During the meeting on `2022-05-30`, it was specified that a brain region will serve as entry point when searching for cell types in the MMB context. Hence, this notebook starts by defining a brain region one wants to get cell types for. Since the most complete cell type information is available for the `Cerebral cortex`, this has been set as the default below.

In [195]:
BRAIN_REGION = "Cerebral cortex"
# BRAIN_REGION = "Cerebellum"
# BRAIN_REGION = "Somatosensory areas"

Get brain region id

In [196]:
r = forge.search({"label": BRAIN_REGION}, cross_bucket=True)

In [197]:
brain_region = r[0].id

In [198]:
brain_region

'http://api.brain-map.org/api/v2/data/Structure/688'
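Indexing the search result with `r[0]` fails with an `IndexError` when no region matches the label, and silently picks the first hit when several different regions match. The sketch below shows one way to make both cases explicit. The helper `pick_resource_id` is hypothetical, and the `SimpleNamespace` object only stands in for the resources returned by `forge.search`.

```python
from types import SimpleNamespace


def pick_resource_id(results, label):
    """Return the single distinct id among the search hits, or raise."""
    if not results:
        raise LookupError(f"No resource found with label {label!r}")
    # The same resource may be returned more than once, so compare ids.
    ids = {r.id for r in results}
    if len(ids) > 1:
        raise LookupError(f"Label {label!r} is ambiguous: {sorted(ids)}")
    return ids.pop()


# Stand-in for the list returned by forge.search({"label": BRAIN_REGION}, ...).
hits = [SimpleNamespace(id="http://api.brain-map.org/api/v2/data/Structure/688")]
brain_region = pick_resource_id(hits, "Cerebral cortex")
```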

## Queries

### Get brain regions which do have neuron t-types available

This query lists the labels of the brain regions for which the knowledge graph has neuron t-types available.

In [199]:
query = f"""

SELECT ?brain_region

WHERE {{
        ?t_type_id subClassOf* <https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType> ;
                  canBeLocatedInBrainRegion ?brain_region_id .        
        ?brain_region_id label ?brain_region .
}}
"""

In [200]:
resources = forge.sparql(query, limit=1000)

In [201]:
df = forge.as_dataframe(resources)

In [202]:
set(df.brain_region)

{'Agranular insular area',
 'Anterior cingulate area',
 'Area prostriata',
 'Cerebellum',
 'Cerebral cortex',
 'Entorhinal area',
 'Entorhinal area, lateral part',
 'Entorhinal area, medial part, dorsal zone',
 'Fasciola cinerea',
 'Field CA1',
 'Field CA2',
 'Field CA3',
 'Hippocampal formation',
 'Hippocampo-amygdalar transition area',
 'Hypothalamus',
 'Induseum griseum',
 'Isocortex',
 'Parasubiculum',
 'Postsubiculum',
 'Presubiculum',
 'Prosubiculum',
 'Retrohippocampal region',
 'Retrosplenial area',
 'Retrosplenial area, ventral part',
 'Subiculum',
 'root'}

### Get brain regions which do have neuron m-types available

In [339]:
query = f"""

SELECT ?brain_region

WHERE {{
        ?m_type_id subClassOf* <https://bbp.epfl.ch/ontologies/core/bmo/NeuronMorphologicalType> ;
                  canBeLocatedInBrainRegion ?brain_region_id .        
        ?brain_region_id label ?brain_region .
}}
"""

In [340]:
resources = forge.sparql(query, limit=1000)

In [341]:
df = forge.as_dataframe(resources)

In [342]:
set(df.brain_region)

{'Caudoputamen',
 'Field CA1',
 'Globus pallidus, external segment',
 'Globus pallidus, internal segment',
 'Hippocampal formation',
 'Isocortex',
 'Nucleus accumbens',
 'Pallidum, ventral region',
 'Reticular nucleus of the thalamus',
 'Substantia nigra, compact part',
 'Substantia nigra, reticular part',
 'Subthalamic nucleus',
 'Thalamus',
 'root'}

### Get brain regions which do have neuron e-types available

In [349]:
query = f"""

SELECT ?brain_region

WHERE {{
        ?e_type_id subClassOf* <https://bbp.epfl.ch/ontologies/core/bmo/NeuronElectricalType> ;
                  canBeLocatedInBrainRegion ?brain_region_id .        
        ?brain_region_id label ?brain_region .
}}
"""

In [350]:
resources = forge.sparql(query, limit=1000)

In [351]:
df = forge.as_dataframe(resources)

In [352]:
set(df.brain_region)

{'Caudoputamen',
 'Globus pallidus, external segment',
 'Globus pallidus, internal segment',
 'Isocortex',
 'Nucleus accumbens',
 'Pallidum, ventral region',
 'Substantia nigra, compact part',
 'Substantia nigra, reticular part',
 'Subthalamic nucleus',
 'Thalamus',
 'root'}
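The three queries above differ only in the superclass IRI, so they could be generated from a single template. The following is a sketch under that assumption. The names `REGION_QUERY_TEMPLATE`, `CELL_TYPE_CLASSES` and `regions_query` are made up for illustration. Unlike the cells above, the template uses `SELECT DISTINCT` so that duplicates are removed by the query rather than by `set(...)`.

```python
# Doubled braces are literal braces once str.format() has been applied.
REGION_QUERY_TEMPLATE = """
SELECT DISTINCT ?brain_region
WHERE {{
    ?type_id subClassOf* <{type_iri}> ;
             canBeLocatedInBrainRegion ?brain_region_id .
    ?brain_region_id label ?brain_region .
}}
"""

# Superclass IRIs copied from the three queries above.
CELL_TYPE_CLASSES = {
    "t": "https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType",
    "m": "https://bbp.epfl.ch/ontologies/core/bmo/NeuronMorphologicalType",
    "e": "https://bbp.epfl.ch/ontologies/core/bmo/NeuronElectricalType",
}


def regions_query(kind):
    """Build the brain-region query for t-, m- or e-types."""
    return REGION_QUERY_TEMPLATE.format(type_iri=CELL_TYPE_CLASSES[kind])
```

In the notebook, the result would be passed on as before, e.g. `forge.sparql(regions_query("m"), limit=1000)`.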

### Get possible t-types for a given brain region

This query lists t-types for a given brain region (i.e. the `BRAIN_REGION` specified above). The `limit` parameter of the query is set to `1000` in the cell below; increase it if the number of available t-types exceeds that value. For reference, the total number of available t-types for `Cerebral cortex` on `2022-07-08` was `252`.

In [203]:
query = f"""

SELECT ?brain_region ?t_type

WHERE {{
        ?t_type_id label ?t_type ;
            subClassOf* <https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType> ;
            canBeLocatedInBrainRegion <{brain_region}> .        
        <{brain_region}> label ?brain_region . 
}}
"""

In [204]:
resources = forge.sparql(query, limit=1000)

In [205]:
df = forge.as_dataframe(resources)

In [206]:
df.head()

Unnamed: 0,brain_region,t_type
0,Cerebral cortex,10_Lamp5
1,Cerebral cortex,100_Sst
2,Cerebral cortex,101_Sst
3,Cerebral cortex,108_Pvalb
4,Cerebral cortex,109_Pvalb


### Get possible met-type combinations plus excitatory/inhibitory category for a given brain region

This query returns possible met-type combinations together with the excitatory/inhibitory category for the brain region set above. For each m-, e-, t- and transmitter type, the identifier and the current version in the knowledge graph are also returned. For a simplified view that shows only the labels, run the `df.drop()` cell below. The `version` is the revision of a given type in the knowledge graph; it was included following the Cell Types Meeting on 2022-07-07 to help with reproducibility (see also the JIRA ticket [DKE-942](https://bbpteam.epfl.ch/project/issues/browse/DKE-942)).

In [207]:
query = f"""

SELECT ?brain_region ?brain_region_version ?transmitter ?transmitter_id ?transmitter_version ?t_type ?t_type_id ?t_type_version ?m_type ?m_type_id ?m_type_version ?e_type ?e_type_id ?e_type_version

WHERE {{
        ?t_type_id label ?t_type ;
                canBeLocatedInBrainRegion <{brain_region}> ;
                _rev ?t_type_version .
        
        <{brain_region}> label ?brain_region ;
            _rev ?brain_region_version .
        
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            canHaveTType ?t_type_id ;
            subClassOf* / hasNeurotransmitterType ?transmitter_id .
        
        ?transmitter_id label ?transmitter ;        
            _rev ?transmitter_version .

        ?e_type_id label ?e_type ;
            _rev ?e_type_version ;
            subClassOf* EType ;
            subClassOf* / canHaveMType ?m_type_id ;
            subClassOf* / canHaveTType ?t_type_id .            
}}
"""

In [208]:
resources = forge.sparql(query, limit=1000)

In [209]:
df = forge.as_dataframe(resources)

In [210]:
df.head()

Unnamed: 0,brain_region,brain_region_version,e_type,e_type_id,e_type_version,m_type,m_type_id,m_type_version,t_type,t_type_id,t_type_version,transmitter,transmitter_id,transmitter_version
0,Cerebral cortex,41,NCx_bNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,1,L1_DAC,http://uri.interlex.org/base/ilx_0383192,75,10_Lamp5,https://bbp.epfl.ch/ontologies/core/ttypes/10_...,52,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,28
1,Cerebral cortex,41,NCx_bNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,1,L1_HAC,http://uri.interlex.org/base/ilx_0383193,75,10_Lamp5,https://bbp.epfl.ch/ontologies/core/ttypes/10_...,52,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,28
2,Cerebral cortex,41,NCx_bNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,1,L1_HAC,http://uri.interlex.org/base/ilx_0383193,75,100_Sst,https://bbp.epfl.ch/ontologies/core/ttypes/100...,52,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,28
3,Cerebral cortex,41,NCx_bNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,1,L1_HAC,http://uri.interlex.org/base/ilx_0383193,75,101_Sst,https://bbp.epfl.ch/ontologies/core/ttypes/101...,52,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,28
4,Cerebral cortex,41,NCx_bNAC,https://bbp.epfl.ch/ontologies/core/bmo/Neocor...,1,L1_DAC,http://uri.interlex.org/base/ilx_0383192,75,11_Lamp5,https://bbp.epfl.ch/ontologies/core/ttypes/11_...,52,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,28


In [211]:
df.drop(["brain_region_version", "e_type_id", "e_type_version", "m_type_id", "m_type_version", "t_type_id", "t_type_version", "transmitter_id", "transmitter_version"], axis=1)

Unnamed: 0,brain_region,e_type,m_type,t_type,transmitter
0,Cerebral cortex,NCx_bNAC,L1_DAC,10_Lamp5,Inhibitory
1,Cerebral cortex,NCx_bNAC,L1_HAC,10_Lamp5,Inhibitory
2,Cerebral cortex,NCx_bNAC,L1_HAC,100_Sst,Inhibitory
3,Cerebral cortex,NCx_bNAC,L1_HAC,101_Sst,Inhibitory
4,Cerebral cortex,NCx_bNAC,L1_DAC,11_Lamp5,Inhibitory
...,...,...,...,...,...
995,Cerebral cortex,NCx_bIR,L23_BP,117_Pvalb,Inhibitory
996,Cerebral cortex,NCx_bIR,L23_BP,118_Pvalb,Inhibitory
997,Cerebral cortex,NCx_bIR,L23_BP,119_Pvalb,Inhibitory
998,Cerebral cortex,NCx_bIR,L23_BP,120_Pvalb,Inhibitory
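The list of columns passed to `df.drop()` above follows a naming pattern: every dropped column ends in `_id` or `_version`. As a sketch, the list can therefore be derived from the column names instead of being typed out. Here `columns` is written out by hand from the output above so that the example is self-contained; in the notebook it would be `list(df.columns)`.

```python
# Column names as returned by the met-type query above.
columns = [
    "brain_region", "brain_region_version",
    "e_type", "e_type_id", "e_type_version",
    "m_type", "m_type_id", "m_type_version",
    "t_type", "t_type_id", "t_type_version",
    "transmitter", "transmitter_id", "transmitter_version",
]

# Identifier and version columns share a suffix; label columns do not.
columns_to_drop = [c for c in columns if c.endswith(("_id", "_version"))]
label_columns = [c for c in columns if c not in columns_to_drop]
```

The simplified view is then `df.drop(columns=columns_to_drop)`.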


### Get possible t-types for a given brain region and all the brain regions which are part of that brain region

This query returns possible t-types for the brain region one has set above and all the brain regions that are part of that brain region. E.g. if one specifies `Cerebral cortex` as brain region, this query would return t-types from the `Cerebral cortex` but also t-types for the `Isocortex` or the `Hippocampal formation` since they are both part of the `Cerebral cortex`.

In [212]:
query = f"""

SELECT ?brain_region ?brain_region_version ?t_type ?t_type_id ?t_type_version

WHERE {{
        ?t_type_id label ?t_type ;
                canBeLocatedInBrainRegion ?brain_region_id ;
                _rev ?t_type_version .
        
        ?brain_region_id label ?brain_region ;
            ^hasPart* <{brain_region}> ;
            _rev ?brain_region_version .            
}}
"""

In [213]:
resources = forge.sparql(query, limit=500)

In [214]:
df = forge.as_dataframe(resources)

In [215]:
len(set(df.t_type))

466

In [216]:
df

Unnamed: 0,brain_region,brain_region_version,t_type,t_type_id,t_type_version
0,Field CA1,41,329_CA1-ProS,https://bbp.epfl.ch/ontologies/core/ttypes/329...,51
1,Field CA1,41,330_CA1-ProS,https://bbp.epfl.ch/ontologies/core/ttypes/330...,51
2,Field CA1,41,331_CA1-ProS,https://bbp.epfl.ch/ontologies/core/ttypes/331...,51
3,Field CA1,41,332_CA1-ProS,https://bbp.epfl.ch/ontologies/core/ttypes/332...,51
4,Field CA1,41,333_CA1-ProS,https://bbp.epfl.ch/ontologies/core/ttypes/333...,51
...,...,...,...,...,...
495,Cerebral cortex,41,74_Sst,https://bbp.epfl.ch/ontologies/core/ttypes/74_Sst,51
496,Cerebral cortex,41,75_Sst,https://bbp.epfl.ch/ontologies/core/ttypes/75_Sst,51
497,Cerebral cortex,41,76_Sst,https://bbp.epfl.ch/ontologies/core/ttypes/76_Sst,51
498,Cerebral cortex,41,77_Sst HPF,https://bbp.epfl.ch/ontologies/core/ttypes/77_...,51
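The property path `^hasPart* <{brain_region}>` matches the selected region itself (zero steps) and every region reachable from it by following `hasPart` one or more times. The sketch below reproduces that traversal in plain Python on a toy hierarchy. The `HAS_PART` mapping is a heavily simplified illustration, not the real parcellation ontology, which has additional intermediate levels.

```python
from collections import deque

# Simplified, illustrative hasPart edges (parent -> direct children).
HAS_PART = {
    "Cerebral cortex": ["Isocortex", "Hippocampal formation"],
    "Hippocampal formation": ["Field CA1"],
}


def region_and_descendants(region, has_part=HAS_PART):
    """Breadth-first walk down hasPart, starting with the region itself."""
    seen = [region]
    queue = deque([region])
    while queue:
        current = queue.popleft()
        for child in has_part.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen
```

For example, `region_and_descendants("Cerebral cortex")` visits `Cerebral cortex`, `Isocortex`, `Hippocampal formation` and `Field CA1`, in that order.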


### Get m-types together with their transmitter type (sClass)

In [217]:
query = f"""

SELECT ?transmitter ?m_type

WHERE {{
        ?m_type_id label ?m_type ;
            subClassOf* MType ;
            subClassOf* / hasNeurotransmitterType / label ?transmitter .           
}}
"""

In [218]:
resources = forge.sparql(query, limit=100)

In [219]:
df = forge.as_dataframe(resources)

In [220]:
df.head()

Unnamed: 0,m_type,transmitter
0,L1_DAC,Inhibitory
1,L1_HAC,Inhibitory
2,L1_LAC,Inhibitory
3,L1_SAC,Inhibitory
4,L4_SSC,Excitatory


### Get m-types of pyramidal cells (mClass)

In [221]:
query = f"""

SELECT ?m_type

WHERE {{
        ?m_type_id label ?m_type ;
            subClassOf* MType ;
            subClassOf* <https://neuroshapes.org/PyramidalNeuron> .           
}}
"""

In [222]:
resources = forge.sparql(query, limit=100)

In [223]:
df = forge.as_dataframe(resources)

In [224]:
df.head()

Unnamed: 0,m_type
0,L3_TPC:C
1,L5_TPC:B
2,L5_TPC:A
3,L3_TPC:A
4,L2_TPC:B


### Get m-types of interneurons (mClass)

In [225]:
query = f"""

SELECT ?m_type

WHERE {{
        ?m_type_id label ?m_type ;
            subClassOf* MType ;
            subClassOf* <https://neuroshapes.org/Interneuron> .           
}}
"""

In [226]:
resources = forge.sparql(query, limit=100)

In [227]:
df = forge.as_dataframe(resources)

In [228]:
df.head()

Unnamed: 0,m_type
0,L5_MC
1,L6_MC
2,L1_DAC
3,L1_HAC
4,L1_LAC


### Get m-types with a given morphology and the morphology definition

This query returns m-types that have a given morphological shape. The cell morphologies were taken from the [Phenotype and Trait Ontology](https://ontobee.org/ontology/PATO), following a request from Georges Khazen to include the `PATO` definitions of morphologies.
Set `MORPHOLOGY` below to one of the following:

- `standard pyramidal morphology`
- `pyramidal family morphology`
- `tufted pyramidal morphology`
- `basket cell morphology`
- `chandelier cell morphology`
- `neurogliaform morphology`
- `Martinotti morphology`
- `cortical bipolar morphology`
- `bitufted cell morphology`

In [229]:
MORPHOLOGY = "basket cell morphology"

In [230]:
query = f"""

SELECT ?cell ?definition

WHERE {{
        ?cell_id subClassOf* / hasMorphologicalPhenotype ?pato_id ;
                  label ?cell .
        ?pato_id subClassOf* / label "{MORPHOLOGY}" .
        ?parent_pato_id label "{MORPHOLOGY}" ;
                <http://purl.obolibrary.org/obo/IAO_0000115> ?definition .
}}
"""

In [231]:
resources = forge.sparql(query, limit=100)

In [232]:
df = forge.as_dataframe(resources)

In [233]:
df.head()

Unnamed: 0,cell,definition
0,BC,A cell morphology that inheres in multipolar n...
1,Hippocampus basket cell,A cell morphology that inheres in multipolar n...
2,L23_LBC,A cell morphology that inheres in multipolar n...
3,L23_NBC,A cell morphology that inheres in multipolar n...
4,L23_SBC,A cell morphology that inheres in multipolar n...


### Get t-types from a specific paper

The [Cell Types and Missing Data - Version 1](https://docs.google.com/spreadsheets/d/1iUgqPszKkYQgkJlmpQSkeyFWcEoOxovsBkoLPtA3qPg/edit#gid=1180597294) spreadsheet, which served as the source for the `Cell Types Ontology`, lists paper identifiers on its `Notes` sheet. These references were added to the respective t-types.
Set `PAPER` below to one of the following:

- Yao et al. 2021: `https://www.sciencedirect.com/science/article/pii/S0092867421005018?dgcid=rss_sd_all`
- Gokce 2016: `https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC5004635/`
- Kozareva et al. 2021: `https://www.nature.com/articles/s41586-021-03220-z`
- Chen et al. 2017: `https://www.sciencedirect.com/science/article/pii/S2211124717303212?via%3Dihub`
- Kalish et al. 2018: `https://www.pnas.org/content/115/5/E1051`

In [234]:
PAPER = "https://www.sciencedirect.com/science/article/pii/S0092867421005018?dgcid=rss_sd_all"

In [235]:
query = f"""

SELECT ?label ?brain_region

WHERE {{
        ?id seeAlso <{PAPER}> ;
            label ?label .
        OPTIONAL {{ ?id canHaveBrainRegion / label ?brain_region }} .
}}
"""

In [236]:
resources = forge.sparql(query, limit=100)

In [237]:
df = forge.as_dataframe(resources)

In [238]:
df.head()

Unnamed: 0,label
0,10_Lamp5
1,100_Sst
2,101_Sst
3,102_Sst HPF
4,103_Sst HPF


### Get the generic m- e- and t-type

One of the requirements specified during the meeting on `2022-05-30` was to have a placeholder class for each of the types. We thus implemented an m- e- and t-type placeholder class.

In [283]:
query = """

SELECT DISTINCT ?id ?prefLabel ?label

WHERE {
        ?id a Class .
        { ?id prefLabel ?prefLabel .
            FILTER (CONTAINS(STR(?prefLabel), 'Generic')) }
        UNION
        { ?id label ?label .
            FILTER (CONTAINS(STR(?label), 'Generic')) }
    
}
"""

In [284]:
resources = forge.sparql(query, limit=100)

In [285]:
df = forge.as_dataframe(resources)

In [286]:
df

Unnamed: 0,id,prefLabel,label
0,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,Generic Excitatory Neuron EType,
1,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,Generic Excitatory Neuron MType,
2,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,Generic Inhibitory Neuron EType,
3,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,Generic Inhibitory Neuron MType,
4,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,,Generic Neuron Electrical Type
5,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,,Generic Neuron Morphological Type
6,https://bbp.epfl.ch/ontologies/core/bmo/Generi...,,Generic Neuron TType


### Get all cell type combinations and probabilities for a given brain region

This query returns cell type combinations and their probabilities for a given brain region (i.e. the `BRAIN_REGION` specified above) and for the brain regions that are part of it. For demonstration purposes, the `limit` parameter on the query has been set to `1000`. This can be increased to get all available cell type combinations.

In [287]:
query = f"""

SELECT ?brain_region ?m_type ?e_type ?molecular_type ?probability

WHERE {{
        ?probability_id hasTarget / hasSource / canBeLocatedInBrainRegion ?brain_region_id ;
            hasBody / value ?probability ;
            hasTarget ?m_type_target ;
            hasTarget ?e_type_target ;
            hasTarget ?molecular_type_target .
        ?brain_region_id label ?brain_region ;
            ^hasPart* <{brain_region}> .
        ?m_type_target hasSource / a MType ;
            hasSource / label ?m_type .
        ?e_type_target hasSource / a EType ;
            hasSource / label ?e_type .
        ?molecular_type_target hasSource / a NeuronMolecularType ;
            hasSource / label ?molecular_type .
}}
"""

In [288]:
resources = forge.sparql(query, limit=1000)

In [289]:
df = forge.as_dataframe(resources)

In [290]:
df.head()

Unnamed: 0,brain_region,e_type,m_type,molecular_type,probability
0,Isocortex,cAC,L23_LBC,SST+,0.117424242
1,Isocortex,cAC,L23_MC,SST+,0.394886364
2,Isocortex,cAC,L23_NBC,SST+,0.0
3,Isocortex,cSTUT,L23_LBC,SST+,0.0
4,Isocortex,bAC,L23_LBC,SST+,0.254456328


----

## 2022-08-31 Workshop queries

### Get Atlas Release info

#### Get the atlas release resource
These atlas releases can be explored through the atlas web app:

* dev: https://bluebrainatlas.kcpdev.bbp.epfl.ch/atlas
* prod: https://bbp.epfl.ch/atlas


In [291]:
BBP_Mouse_Brain_Atlas_Release = "https://bbp.epfl.ch/neurosciencegraph/data/4906ab85-694f-469d-962f-c0174e901885" # output of the BBP Annotation Atlas pipeline
Allen_Mouse_CCF_v2_v3_hybrid = "https://bbp.epfl.ch/neurosciencegraph/data/e2e500ec-fe7e-4888-88b9-b72425315dda" # Csaba 1 version: this atlas release uses the brain parcellation resulting from the hybridization of CCFv2 and CCFv3 and integrates the splitting of layer 2 and layer 3. The average brain template and the ontology are common to CCFv2 and CCFv3.
ALLEN_CCFV3_Atlas_Release =  "https://bbp.epfl.ch/neurosciencegraph/data/831a626a-c0ae-4691-8ce8-cfb7491345d9" # original Allen
ALLEN_CCFV2_Atlas_Release = "https://bbp.epfl.ch/neurosciencegraph/data/dd114f81-ba1f-47b1-8900-e497597f06ac"
BBP_Mouse_Brain_Atlas_Release_staging = "https://bbp.epfl.ch/neurosciencegraph/data/brainatlasrelease/c96c71a8-4c0d-4bc1-8a1a-141d9ed6693d"

atlas_release_id = BBP_Mouse_Brain_Atlas_Release_staging

In [292]:
atlas_release = forge.retrieve(atlas_release_id, version=1)

In [293]:
print(atlas_release)

{
    context: https://bbp.neuroshapes.org
    id: https://bbp.epfl.ch/neurosciencegraph/data/brainatlasrelease/c96c71a8-4c0d-4bc1-8a1a-141d9ed6693d
    type:
    [
        AtlasRelease
        BrainAtlasRelease
        Entity
    ]
    brainTemplateDataLayer:
    {
        id: https://bbp.epfl.ch/neurosciencegraph/data/dca40f99-b494-4d2c-9a2f-c407180138b7
        type: BrainTemplateDataLayer
    }
    contribution:
    [
        {
            type: Contribution
            agent:
            {
                id: https://bbp.epfl.ch/neurosciencegraph/data/7417edf9-2a0a-421d-ab20-4166e688c619
                type:
                [
                    Agent
                    Person
                ]
            }
            hadRole:
            {
                id: https://neuroshapes.org/BrainAtlasPipelineExecutionRole
                label: Brain Atlas Pipeline Executor role
            }
        }
        {
            type: Contribution
            agent:
            {
        

In [294]:
atlas_release._store_metadata["_rev"]

1

#### Get the atlas hierarchy

In [295]:
parcellation_ontology = forge.retrieve(atlas_release.parcellationOntology.id, cross_bucket=True)

In [296]:
print(parcellation_ontology)

{
    context: https://bbp.neuroshapes.org
    id: https://bbp.epfl.ch/neurosciencegraph/data/ontologies/34388d3b-0b88-4deb-9686-6fcd9ef8990e
    type:
    [
        Entity
        Ontology
        ParcellationOntology
    ]
    label: AIBS Mouse CCF Atlas parcellation ontology L2L3 split
    contribution:
    [
        {
            type: Contribution
            agent:
            {
                id: https://bbp.epfl.ch/neurosciencegraph/data/b1e71aec-0e4e-4ce3-aca2-99f1614da975
                type:
                [
                    Agent
                    Person
                ]
            }
            hadRole:
            {
                id: https://neuroshapes.org/BrainAtlasPipelineExecutionRole
                label: Brain Atlas Pipeline Executor role
            }
        }
        {
            type: Contribution
            agent:
            {
                id: https://www.grid.ac/institutes/grid.5333.6
                type: Organization
            }
        

In [297]:
forge.download(parcellation_ontology, "distribution.contentUrl", ".", overwrite=True, cross_bucket=True)

#### Get parcellation (annotation) volume

In [298]:
parcellation_volume = forge.retrieve(atlas_release.parcellationVolume.id)

In [299]:
print(parcellation_volume)

{
    context: https://bbp.neuroshapes.org
    id: https://bbp.epfl.ch/neurosciencegraph/data/volumetricdatalayer/d1ec2987-0519-4010-b3a1-891c05991b31
    type:
    [
        VolumetricDataLayer
        BrainParcellationDataLayer
        Dataset
    ]
    atlasRelease:
    {
        id: https://bbp.epfl.ch/neurosciencegraph/data/brainatlasrelease/c96c71a8-4c0d-4bc1-8a1a-141d9ed6693d
        type:
        [
            AtlasRelease
            BrainAtlasRelease
            Entity
        ]
    }
    brainLocation:
    {
        atlasSpatialReferenceSystem:
        {
            id: https://bbp.epfl.ch/neurosciencegraph/data/allen_ccfv3_spatial_reference_system
            type:
            [
                BrainAtlasSpatialReferenceSystem
                AtlasSpatialReferenceSystem
            ]
        }
        brainRegion:
        {
            id: http://api.brain-map.org/api/v2/data/Structure/997
            label: root
        }
    }
    bufferEncoding: gzip
    componentEncodin

In [300]:
forge.download(parcellation_volume, "distribution.contentUrl", ".", overwrite=True)

#### Get orientation field volume

In [301]:
query = {
          "type":"CellOrientationField", 
          "atlasRelease":{"@id":atlas_release_id},
          "brainLocation":{"brainRegion":{"id":"http://api.brain-map.org/api/v2/data/Structure/997"}} # root brain region
        }
cell_orientation_field = forge.search(query)
print(f"{len(cell_orientation_field)} found")

2 found


In [304]:
print(cell_orientation_field[0])

{
    context: https://bbp.neuroshapes.org
    id: https://bbp.epfl.ch/neurosciencegraph/data/049013d5-e29d-460e-9960-1007046d409b
    type:
    [
        VolumetricDataLayer
        CellOrientationField
        Dataset
    ]
    atlasRelease:
    {
        id: https://bbp.epfl.ch/neurosciencegraph/data/brainatlasrelease/c96c71a8-4c0d-4bc1-8a1a-141d9ed6693d
        type:
        [
            AtlasRelease
            BrainAtlasRelease
            Entity
        ]
    }
    brainLocation:
    {
        atlasSpatialReferenceSystem:
        {
            id: https://bbp.epfl.ch/neurosciencegraph/data/allen_ccfv3_spatial_reference_system
            type:
            [
                BrainAtlasSpatialReferenceSystem
                AtlasSpatialReferenceSystem
            ]
        }
        brainRegion:
        {
            id: http://api.brain-map.org/api/v2/data/Structure/997
            label: root
        }
    }
    bufferEncoding: gzip
    componentEncoding: float32
    contributio

In [305]:
forge.download(cell_orientation_field, "distribution.contentUrl", ".", overwrite=True)

### Get the me-type density nrrd file for each region (region is an input)

In [306]:
# Some atlas releases might not have associated densities
query = f"""

SELECT ?mtype_label ?etype_label ?nrrd_file ?contentUrl 

WHERE {{
        ?s a METypeDensity ;
            atlasRelease <{atlas_release_id}>; 
            annotation ?mtypeanno ;
            annotation ?etypeanno ;
            distribution ?distribution .
        ?mtypeanno a MTypeAnnotation ;         
            hasBody / label ?mtype_label .
        ?etypeanno a ETypeAnnotation ;         
            hasBody / label ?etype_label .
        ?distribution name ?nrrd_file ;
            contentUrl ?contentUrl .
}}
"""

In [307]:
resources = forge.sparql(query, limit=1000)

In [308]:
df = forge.as_dataframe(resources)

In [309]:
df.head()

Unnamed: 0,contentUrl,etype_label,mtype_label,nrrd_file
0,https://staging.nise.bbp.epfl.ch/nexus/v1/file...,cADpyr,L6_TPC:A,L6_TPC:A_cADpyr.nrrd
1,https://staging.nise.bbp.epfl.ch/nexus/v1/file...,cADpyr,L3_TPC:B,L3_TPC:B_cADpyr.nrrd
2,https://staging.nise.bbp.epfl.ch/nexus/v1/file...,cADpyr,L5_TPC:A,L5_TPC:A_cADpyr.nrrd
3,https://staging.nise.bbp.epfl.ch/nexus/v1/file...,cADpyr,L2_TPC:B,L2_TPC:B_cADpyr.nrrd
4,https://staging.nise.bbp.epfl.ch/nexus/v1/file...,cADpyr,L3_TPC:A,L3_TPC:A_cADpyr.nrrd


In [311]:
forge.download(resources, "contentUrl", ".", overwrite=True)

### Get the list of mtypes for each region (region is an input)

#### The below query will get m-types from the specified brain region plus m-types from all child brain regions (this was added during the workshop to illustrate the `down the tree` generalisation idea)

In [312]:
BRAIN_REGION = "Cerebral cortex"

Get brain region id

In [313]:
r = forge.search({"label": BRAIN_REGION}, cross_bucket=True)

In [314]:
brain_region = r[0].id

In [315]:
brain_region

'http://api.brain-map.org/api/v2/data/Structure/688'

In [316]:
query = f"""

SELECT ?m_type ?m_type_id ?m_type_version ?brain_region

WHERE {{
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            canBeLocatedInBrainRegion ?brain_region_id .
        
        ?brain_region_id label ?brain_region ;
            ^hasPart* <{brain_region}> ;
            _rev ?brain_region_version . 
}}
"""

In [317]:
resources = forge.sparql(query, limit=1000)

In [318]:
df = forge.as_dataframe(resources)

In [319]:
df.head()

Unnamed: 0,brain_region,m_type,m_type_id,m_type_version
0,Hippocampal formation,SR_PC,https://bbp.epfl.ch/neurosciencegraph/ontologi...,75
1,Hippocampal formation,GCL_GC,http://bbp.epfl.ch/neurosciencegraph/ontologie...,28
2,Hippocampal formation,SL_IS2,http://bbp.epfl.ch/neurosciencegraph/ontologie...,28
3,Hippocampal formation,SLM_PPA,http://bbp.epfl.ch/neurosciencegraph/ontologie...,28
4,Hippocampal formation,SO_BP,http://bbp.epfl.ch/neurosciencegraph/ontologie...,28


In [320]:
set(df.brain_region)

{'Field CA1', 'Hippocampal formation', 'Isocortex'}

#### The below query will get m-types from the specified brain region plus m-types from parent brain regions (this was added during the workshop to illustrate the `up the tree` generalisation idea)

In [321]:
BRAIN_REGION = "Entorhinal area, medial part, dorsal zone"

Get brain region id

In [322]:
r = forge.search({"label": BRAIN_REGION}, cross_bucket=True)

In [323]:
brain_region = r[0].id

In [324]:
brain_region

'http://api.brain-map.org/api/v2/data/Structure/926'

In [325]:
query = f"""

SELECT ?m_type ?m_type_id ?m_type_version ?brain_region

WHERE {{
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            canBeLocatedInBrainRegion ?brain_region_id .
        
        ?brain_region_id label ?brain_region ;
            hasPart* <{brain_region}> ;
            _rev ?brain_region_version . 
}}
"""

In [326]:
resources = forge.sparql(query, limit=1000)

In [327]:
df = forge.as_dataframe(resources)

In [328]:
df.head()

Unnamed: 0,brain_region,m_type,m_type_id,m_type_version
0,Hippocampal formation,SR_PC,https://bbp.epfl.ch/neurosciencegraph/ontologi...,75
1,Hippocampal formation,GCL_GC,http://bbp.epfl.ch/neurosciencegraph/ontologie...,28
2,Hippocampal formation,SL_IS2,http://bbp.epfl.ch/neurosciencegraph/ontologie...,28
3,Hippocampal formation,SLM_PPA,http://bbp.epfl.ch/neurosciencegraph/ontologie...,28
4,Hippocampal formation,SO_BP,http://bbp.epfl.ch/neurosciencegraph/ontologie...,28


In [329]:
set(df.brain_region)

{'Hippocampal formation', 'root'}
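Going "up the tree" is the mirror image of the previous query: `hasPart* <{brain_region}>` matches the selected region and every region that contains it, directly or indirectly. The sketch below walks a child-to-parent map to show why the result set consists of ancestors only. The `PARENT` mapping is a simplified illustration that skips intermediate levels of the real hierarchy, and `annotated` simply repeats the two regions seen in the output above.

```python
# Illustrative child -> parent edges (the inverse of hasPart), simplified.
PARENT = {
    "Entorhinal area, medial part, dorsal zone": "Entorhinal area",
    "Entorhinal area": "Hippocampal formation",
    "Hippocampal formation": "Cerebral cortex",
    "Cerebral cortex": "root",
}


def region_and_ancestors(region, parent=PARENT):
    """Follow parent links from the region up to the top of the hierarchy."""
    chain = [region]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


# Regions that appear in the query output above.
annotated = {"Hippocampal formation", "root"}
start = "Entorhinal area, medial part, dorsal zone"
matches = [r for r in region_and_ancestors(start) if r in annotated]
```

Only ancestors that actually carry m-type annotations show up in the result, which is why intermediate regions are absent from the set above.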

### Get the etype ratio for each mtype for each region (region and mtype are inputs)

`TODO`: should we add the brain regions for which we want to use this? Generalisation here: From `rat` to `mouse`, from `primary somatosensory cortex` to `Isocortex`

In [330]:
query = f"""

SELECT ?brain_region ?m_type ?e_type ?ratio

WHERE {{
        ?s a ETypeRatio ;
            brainLocation / brainRegion / label ?brain_region ;
            annotation ?mtype_anno ;
            annotation ?etype_anno ;
            series ?ratioSeries .
        ?ratioSeries statistic "ratio" ;
              value ?ratio .
        ?mtype_anno a MTypeAnnotation ;
            hasBody / label ?m_type .
        ?etype_anno a ETypeAnnotation ;
            hasBody / label ?e_type .
}}
"""

In [331]:
resources = forge.sparql(query, limit=1000)

In [332]:
df = forge.as_dataframe(resources)

In [333]:
df.head()

### Get if the mtype is Excitatory or Inhibitory (mtype is input)

See `Get m-types together with their transmitter type (sClass)` query above

### Get if the mtype is pyramidal or interneuron (mtype is input)

See `Get m-types of pyramidal cells (mClass)` and `Get m-types of interneurons (mClass)` queries above

### Get the mini frequency for the mtype (mtype is input)

In [334]:
# TODO: although this was discussed during the workshop, it was later decided that these data are not needed; they will therefore not be added to the knowledge graph

### Get all brain regions of layer 1 of the neocortex

In [335]:
query = f"""

SELECT ?id ?region

WHERE {{
        ?id hasLayerLocationPhenotype <http://purl.obolibrary.org/obo/UBERON_0005390> ;
            subClassOf* BrainRegion ;
            label ?region
}}
"""

In [336]:
resources = forge.sparql(query, limit=1000)

In [337]:
df = forge.as_dataframe(resources)

In [338]:
df.head()

Unnamed: 0,id,region
0,http://api.brain-map.org/api/v2/data/Structure...,"Primary somatosensory area, trunk, layer 1"
1,http://api.brain-map.org/api/v2/data/Structure...,"Primary somatosensory area, lower limb, layer 1"
2,http://api.brain-map.org/api/v2/data/Structure...,"Taenia tecta, dorsal part, layer 1"
3,http://api.brain-map.org/api/v2/data/Structure...,"Taenia tecta, ventral part, layer 1"
4,http://api.brain-map.org/api/v2/data/Structure...,"Parasubiculum, layer 1"
