# Query Cell Types Ontology

## Context

- This notebook has been put together for the MMB demo on 2022-05-30: see [slides](https://docs.google.com/presentation/d/1Ib1_8byK0hVuNS-wPbqmeL5Lcf67m_oegC3-czPw5ws/edit#slide=id.g116ba5ed71e_0_8)
- It has been revised following feedback from the meeting held on 2022-07-07: see [slides](https://docs.google.com/presentation/d/1mgCyYjHerLJLV79GM0kqp3_Htxmru_7QGIr5elUETC0/edit#slide=id.g13b4a370a10_0_19) and [JIRA ticket](https://bbpteam.epfl.ch/project/issues/browse/DKE-942)

## Imports

In [86]:
import json
import rdflib
import getpass
import pandas as pd
from rdflib import RDF, RDFS, XSD, OWL, URIRef, BNode, SKOS
import pprint
from kgforge.core import KnowledgeGraphForge

## Setup
Get an authentication token

For now, the [Nexus web application](https://bbp.epfl.ch/nexus/web) can be used to get a token. We are looking for other simpler alternatives.

- Step 1: From the opened web page, click on the login button on the right corner and follow the instructions.

![login-ui](./login-ui.png)

- Step 2: At the end you’ll see a token button on the right corner. Click on it to copy the token.

![copy-token](./copy-token.png)


In [None]:
TOKEN = getpass.getpass()
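As one possible simpler alternative to pasting the token on every run, the token could be read from an environment variable, with the interactive prompt kept as a fallback. This is only a sketch: the helper `get_token` and the variable name `NEXUS_TOKEN` are assumptions made for illustration and are not defined by Nexus or `kgforge`.

```python
import getpass
import os


def get_token(env_var: str = "NEXUS_TOKEN") -> str:
    """Return the token from the environment, or prompt for it if unset.

    Both the helper and the default variable name are hypothetical.
    """
    token = os.environ.get(env_var, "").strip()
    if token:
        return token
    # Fall back to the interactive prompt used in the cell above.
    return getpass.getpass("Nexus token: ").strip()
```

With this in place, `TOKEN = get_token()` would behave like the cell above whenever the variable is not set.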

In [89]:
forge = KnowledgeGraphForge("https://raw.githubusercontent.com/BlueBrain/nexus-forge/master/examples/notebooks/use-cases/prod-forge-nexus.yml",
                            token=TOKEN,
                            searchendpoints={"sparql": {"endpoint": "https://bbp.epfl.ch/neurosciencegraph/data/views/aggreg-sp/dataset"}},
                            # endpoint="https://staging.nise.bbp.epfl.ch/nexus/v1",
                            
                            bucket="bbp/atlas")

## Ontologies

### Set brain region

During the meeting on `2022-05-30`, it was specified that a brain region will serve as entry point when searching for cell types in the MMB context. Hence, this notebook starts by defining a brain region one wants to get cell types for. Since the most complete cell type information is available for the `Cerebral cortex`, this has been set as the default below.

In [87]:
BRAIN_REGION = "Cerebral cortex"
# BRAIN_REGION = "Cerebellum"
# BRAIN_REGION = "Somatosensory areas"

Get brain region id

In [88]:
r = forge.search({"label": BRAIN_REGION}, cross_bucket=True)

In [89]:
brain_region = r[0].id

In [90]:
brain_region

'http://api.brain-map.org/api/v2/data/Structure/688'
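The lookup above takes `r[0]` directly, which raises a bare `IndexError` when the search finds nothing. A small guard can turn that into a clearer message. The sketch below is illustrative: `first_match_id` is a hypothetical helper, and the `SimpleNamespace` object only stands in for the resources returned by `forge.search`.

```python
from types import SimpleNamespace


def first_match_id(results, label):
    """Return the id of the first search hit, or raise a descriptive error."""
    if not results:
        raise LookupError(
            f"No resource found with label {label!r}; "
            "check the spelling or try cross_bucket=True."
        )
    return results[0].id


# Stand-in for the list returned by forge.search (illustrative only).
fake_results = [
    SimpleNamespace(id="http://api.brain-map.org/api/v2/data/Structure/688")
]
print(first_match_id(fake_results, "Cerebral cortex"))
```

In the notebook this would be used as `brain_region = first_match_id(r, BRAIN_REGION)`.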

## Queries

### Get brain regions which do have neuron t-types available

This query lists the labels of brain regions for which the knowledge graph has neuron t-types available.

In [91]:
query = f"""

SELECT ?brain_region

WHERE {{
        ?t_type_id subClassOf* <https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType> ;
                  canHaveBrainRegion ?brain_region_id .        
        ?brain_region_id label ?brain_region .
}}
"""

In [92]:
resources = forge.sparql(query, limit=1000)

In [93]:
df = forge.as_dataframe(resources)

In [94]:
set(df.brain_region)

{'Agranular insular area',
 'Anterior cingulate area',
 'Area prostriata',
 'Cerebellum',
 'Cerebral cortex',
 'Entorhinal area',
 'Entorhinal area, lateral part',
 'Entorhinal area, medial part, dorsal zone',
 'Fasciola cinerea',
 'Field CA1',
 'Field CA2',
 'Field CA3',
 'Hippocampal formation',
 'Hippocampo-amygdalar transition area',
 'Hypothalamus',
 'Induseum griseum',
 'Isocortex',
 'Parasubiculum',
 'Postsubiculum',
 'Presubiculum',
 'Prosubiculum',
 'Retrohippocampal region',
 'Retrosplenial area',
 'Retrosplenial area, ventral part',
 'Subiculum'}

### Get possible t-types for a given brain region

This query lists t-types for a given brain region (i.e. the `BRAIN_REGION` specified above). The `limit` parameter on the query is set to `1000` in the cell below; increase it if more t-types become available. For reference, the total number of available t-types for `Cerebral cortex` on `2022-07-08` was `252`.

In [95]:
query = f"""

SELECT ?brain_region ?t_type

WHERE {{
        ?t_type_id label ?t_type ;
            subClassOf* <https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType> ;
            canHaveBrainRegion <{brain_region}> .        
        <{brain_region}> label ?brain_region . 
}}
"""

In [96]:
resources = forge.sparql(query, limit=1000)

In [97]:
df = forge.as_dataframe(resources)

In [98]:
df.head()

Unnamed: 0,brain_region,t_type
0,Cerebral cortex,275_NP PPP
1,Cerebral cortex,289_L6 CT CTX
2,Cerebral cortex,293_L6 CT CTX
3,Cerebral cortex,286_L6 CT CTX
4,Cerebral cortex,281_L6 CT CTX
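The queries in this notebook interpolate `brain_region` into an f-string, where doubled braces (`{{`, `}}`) produce the literal braces SPARQL needs. Wrapping that in a function makes it easier to check the identifier before it reaches the query. The following is a minimal sketch, assuming the identifier is always an `http(s)` IRI; `build_t_type_query` is a hypothetical helper, not part of `kgforge`.

```python
from urllib.parse import urlparse

NEURON_T_TYPE = (
    "https://bbp.epfl.ch/ontologies/core/celltypes/NeuronTranscriptomicType"
)


def build_t_type_query(brain_region_iri: str) -> str:
    """Build the t-type query for one brain region IRI (hypothetical helper)."""
    parsed = urlparse(brain_region_iri)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not an http(s) IRI: {brain_region_iri!r}")
    # Reject whitespace and characters that cannot appear inside <...>.
    if any(ch.isspace() or ch in '<>"{}|^`\\' for ch in brain_region_iri):
        raise ValueError(f"Illegal character in IRI: {brain_region_iri!r}")
    # Doubled braces are emitted as single literal braces by the f-string.
    return f"""
SELECT ?brain_region ?t_type
WHERE {{
    ?t_type_id label ?t_type ;
        subClassOf* <{NEURON_T_TYPE}> ;
        canHaveBrainRegion <{brain_region_iri}> .
    <{brain_region_iri}> label ?brain_region .
}}
"""


query = build_t_type_query("http://api.brain-map.org/api/v2/data/Structure/688")
print(query)
```

Passing a label such as `"Cerebral cortex"` instead of the identifier fails early with a `ValueError` rather than producing a query that silently returns nothing.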


### Get possible met-type combinations plus excitatory/inhibitory category for a given brain region

This query returns possible met-type combinations together with the excitatory/inhibitory category for the brain region set above. For each m-, e-, t- and transmitter-type, the identifier and the current version in the knowledge graph are also returned. For a simplified view that shows only the labels, run the `df.drop()` cell below. The `version` indicates the revision of a given type in the knowledge graph; it was included following the Cell Types Meeting on 2022-07-07 to help with reproducibility (see also this JIRA ticket: [DKE-942](https://bbpteam.epfl.ch/project/issues/browse/DKE-942)).

In [99]:
query = f"""

SELECT ?brain_region ?brain_region_version ?transmitter ?transmitter_id ?transmitter_version ?t_type ?t_type_id ?t_type_version ?m_type ?m_type_id ?m_type_version ?e_type ?e_type_id ?e_type_version

WHERE {{
        ?t_type_id label ?t_type ;
                canHaveBrainRegion <{brain_region}> ;
                _rev ?t_type_version .
        
        <{brain_region}> label ?brain_region ;
            _rev ?brain_region_version .
        
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            canHaveTType ?t_type_id ;
            subClassOf* / hasNeurotransmitterType ?transmitter_id .
        
        ?transmitter_id label ?transmitter ;        
            _rev ?transmitter_version .

        ?e_type_id label ?e_type ;
            _rev ?e_type_version ;
            subClassOf* EType ;
            subClassOf* / canHaveMType ?m_type_id ;
            subClassOf* / canHaveTType ?t_type_id .            
}}
"""

In [100]:
resources = forge.sparql(query, limit=1000)

In [101]:
df = forge.as_dataframe(resources)

In [102]:
df.head()

Unnamed: 0,brain_region,brain_region_version,e_type,e_type_id,e_type_version,m_type,m_type_id,m_type_version,t_type,t_type_id,t_type_version,transmitter,transmitter_id,transmitter_version
0,Cerebral cortex,7,dNAC,http://uri.interlex.org/base/ilx_0738205,21,L23_LBC,http://uri.interlex.org/base/ilx_0383202,27,9_Lamp5 Lhx6,https://bbp.epfl.ch/ontologies/core/ttypes/9_L...,1,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,1
1,Cerebral cortex,7,dNAC,http://uri.interlex.org/base/ilx_0738205,21,L23_LBC,http://uri.interlex.org/base/ilx_0383202,27,12_Lamp5,https://bbp.epfl.ch/ontologies/core/ttypes/12_...,1,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,1
2,Cerebral cortex,7,dNAC,http://uri.interlex.org/base/ilx_0738205,21,L23_LBC,http://uri.interlex.org/base/ilx_0383202,27,6_Lamp5 Lhx6,https://bbp.epfl.ch/ontologies/core/ttypes/6_L...,1,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,1
3,Cerebral cortex,7,dNAC,http://uri.interlex.org/base/ilx_0738205,21,L23_LBC,http://uri.interlex.org/base/ilx_0383202,27,5_Lamp5 Lhx6,https://bbp.epfl.ch/ontologies/core/ttypes/5_L...,1,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,1
4,Cerebral cortex,7,dNAC,http://uri.interlex.org/base/ilx_0738205,21,L23_LBC,http://uri.interlex.org/base/ilx_0383202,27,7_Lamp5 Lhx6,https://bbp.epfl.ch/ontologies/core/ttypes/7_L...,1,Inhibitory,https://bbp.epfl.ch/ontologies/core/bmo/Inhibi...,1


In [103]:
df.drop(["brain_region_version", "e_type_id", "e_type_version", "m_type_id", "m_type_version", "t_type_id", "t_type_version", "transmitter_id", "transmitter_version"], axis=1)

Unnamed: 0,brain_region,e_type,m_type,t_type,transmitter
0,Cerebral cortex,dNAC,L23_LBC,9_Lamp5 Lhx6,Inhibitory
1,Cerebral cortex,dNAC,L23_LBC,12_Lamp5,Inhibitory
2,Cerebral cortex,dNAC,L23_LBC,6_Lamp5 Lhx6,Inhibitory
3,Cerebral cortex,dNAC,L23_LBC,5_Lamp5 Lhx6,Inhibitory
4,Cerebral cortex,dNAC,L23_LBC,7_Lamp5 Lhx6,Inhibitory
...,...,...,...,...,...
995,Cerebral cortex,bNAC,L23_NBC,115_Pvalb,Inhibitory
996,Cerebral cortex,cIR,L23_NBC,110_Pvalb,Inhibitory
997,Cerebral cortex,cIR,L23_NBC,115_Pvalb,Inhibitory
998,Cerebral cortex,cAC,L23_SBC,25_Sncg,Inhibitory
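The simplified view above lists every identifier and version column by hand. Since those columns follow the `*_id` / `*_version` naming used in the query, they can also be filtered by suffix. This is a sketch under that naming assumption; `label_columns` is a hypothetical helper.

```python
def label_columns(columns):
    """Keep only the columns that are neither identifiers nor versions."""
    return [c for c in columns if not c.endswith(("_id", "_version"))]


# Column names as returned by the met-type query above.
columns = [
    "brain_region", "brain_region_version",
    "e_type", "e_type_id", "e_type_version",
    "m_type", "m_type_id", "m_type_version",
    "t_type", "t_type_id", "t_type_version",
    "transmitter", "transmitter_id", "transmitter_version",
]
print(label_columns(columns))
# ['brain_region', 'e_type', 'm_type', 't_type', 'transmitter']
```

Applied to the dataframe, `df[label_columns(df.columns)]` selects the same label-only view and keeps working if further `*_id` or `*_version` columns are added to the query.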


### Get possible t-types for a given brain region and all the brain regions which are part of that brain region

This query returns possible t-types for the brain region one has set above and all the brain regions that are part of that brain region. E.g. if one specifies `Cerebral cortex` as brain region, this query would return t-types from the `Cerebral cortex` but also t-types for the `Isocortex` or the `Hippocampal formation` since they are both part of the `Cerebral cortex`.

In [104]:
query = f"""

SELECT ?brain_region ?brain_region_version ?t_type ?t_type_id ?t_type_version

WHERE {{
        ?t_type_id label ?t_type ;
                canHaveBrainRegion ?brain_region_id ;
                _rev ?t_type_version .
        
        ?brain_region_id label ?brain_region ;
            ^hasPart* <{brain_region}> ;
            _rev ?brain_region_version .            
}}
"""

In [105]:
resources = forge.sparql(query, limit=500)

In [106]:
df = forge.as_dataframe(resources)

In [107]:
len(set(df.t_type))

387

In [108]:
df

Unnamed: 0,brain_region,brain_region_version,t_type,t_type_id,t_type_version
0,Dentate gyrus,8,361_DG,https://bbp.epfl.ch/ontologies/core/ttypes/361_DG,1
1,Dentate gyrus,8,363_DG,https://bbp.epfl.ch/ontologies/core/ttypes/363_DG,1
2,Dentate gyrus,8,362_DG,https://bbp.epfl.ch/ontologies/core/ttypes/362_DG,1
3,Dentate gyrus,8,364_DG,https://bbp.epfl.ch/ontologies/core/ttypes/364_DG,1
4,Induseum griseum,8,359_CA2-IG-FC,https://bbp.epfl.ch/ontologies/core/ttypes/359...,1
...,...,...,...,...,...
415,Cerebral cortex,7,5_Lamp5 Lhx6,https://bbp.epfl.ch/ontologies/core/ttypes/5_L...,1
416,Cerebral cortex,7,7_Lamp5 Lhx6,https://bbp.epfl.ch/ontologies/core/ttypes/7_L...,1
417,Cerebral cortex,7,10_Lamp5,https://bbp.epfl.ch/ontologies/core/ttypes/10_...,1
418,Cerebral cortex,7,2_Meis2,https://bbp.epfl.ch/ontologies/core/ttypes/2_M...,1
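The property path `^hasPart*` in the query above matches the chosen brain region itself (zero steps) and every region reachable by following `hasPart` downwards. The sketch below reproduces that traversal in plain Python on a toy hierarchy. The mapping is a simplified assumption for illustration: only the two children of `Cerebral cortex` come from the text above, and the placement of `Field CA1` is invented to show a second level.

```python
# Toy hasPart mapping (parent -> direct children); simplified and partly assumed.
HAS_PART = {
    "Cerebral cortex": ["Isocortex", "Hippocampal formation"],
    "Hippocampal formation": ["Field CA1"],  # assumed for illustration
}


def region_and_descendants(region, has_part):
    """Return the region plus everything reachable through hasPart."""
    seen = set()
    stack = [region]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(has_part.get(current, []))
    return seen


print(sorted(region_and_descendants("Cerebral cortex", HAS_PART)))
```

A region without children, such as `Isocortex` in this toy mapping, yields only itself, which mirrors the zero-length match of the `*` operator.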


### Get m-types together with their transmitter type (sClass)

In [109]:
query = f"""

SELECT ?transmitter ?m_type

WHERE {{
        ?m_type_id label ?m_type ;
            subClassOf* MType ;
            subClassOf* / hasNeurotransmitterType / label ?transmitter .           
}}
"""

In [110]:
resources = forge.sparql(query, limit=100)

In [111]:
df = forge.as_dataframe(resources)

In [112]:
df.head()

Unnamed: 0,m_type,transmitter
0,L23_LBC,Inhibitory
1,L23_NBC,Inhibitory
2,L23_SBC,Inhibitory
3,L4_LBC,Inhibitory
4,L4_NBC,Inhibitory


### Get m-types of pyramidal cells (mClass)

In [113]:
query = f"""

SELECT ?m_type

WHERE {{
        ?m_type_id label ?m_type ;
            subClassOf* MType ;
            subClassOf* <https://neuroshapes.org/PyramidalNeuron> .           
}}
"""

In [114]:
resources = forge.sparql(query, limit=100)

In [115]:
df = forge.as_dataframe(resources)

In [116]:
df.head()

Unnamed: 0,m_type
0,L3_TPC:C
1,L5_TPC:B
2,L5_TPC:A
3,L3_TPC:A
4,L2_TPC:B


### Get m-types of interneurons (mClass)

`TODO`: While we do have interneuron as a class, the subclass relationship remains to be added.

In [121]:
query = f"""

SELECT ?m_type

WHERE {{
        ?m_type_id label ?m_type ;
            subClassOf* MType ;
            subClassOf* <https://neuroshapes.org/Interneuron> .           
}}
"""

In [122]:
resources = forge.sparql(query, limit=100)

In [123]:
df = forge.as_dataframe(resources)

In [124]:
df.head()

### Get m-types with a given morphology and the morphology definition

This query returns m-types which have a given morphological shape. The cell morphologies were taken from the [Phenotype and Trait Ontology](https://ontobee.org/ontology/PATO), following a request from Georges Khazen to include the `PATO` definitions of morphologies.
Set `MORPHOLOGY` below to one of the following:

- `standard pyramidal morphology`
- `pyramidal family morphology`
- `tufted pyramidal morphology`
- `basket cell morphology`
- `chandelier cell morphology`
- `neurogliaform morphology`
- `Martinotti morphology`
- `cortical bipolar morphology`
- `bitufted cell morphology`

In [125]:
MORPHOLOGY = "basket cell morphology"

In [126]:
query = f"""

SELECT ?cell ?definition

WHERE {{
        ?cell_id subClassOf* / hasMorphologicalPhenotype ?pato_id ;
                  label ?cell .
        ?pato_id subClassOf* / label "{MORPHOLOGY}" .
        ?parent_pato_id label "{MORPHOLOGY}" ;
                <http://purl.obolibrary.org/obo/IAO_0000115> ?definition .
}}
"""

In [127]:
resources = forge.sparql(query, limit=100)

In [128]:
df = forge.as_dataframe(resources)

In [129]:
df.head()

### Get t-types from a specific paper

The [Cell Types and Missing Data - Version 1](https://docs.google.com/spreadsheets/d/1iUgqPszKkYQgkJlmpQSkeyFWcEoOxovsBkoLPtA3qPg/edit#gid=1180597294) spreadsheet, which served as the source for the `Cell Types Ontology`, lists paper identifiers on the `Notes` sheet. These references were added to the respective t-types.
Set `PAPER` below to one of the following:

- Yao et al. 2021: `https://www.sciencedirect.com/science/article/pii/S0092867421005018?dgcid=rss_sd_all`
- Gokce 2016: `https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC5004635/`
- Kozareva et al. 2021: `https://www.nature.com/articles/s41586-021-03220-z`
- Chen et al. 2017: `https://www.sciencedirect.com/science/article/pii/S2211124717303212?via%3Dihub`
- Kalish et al. 2018: `https://www.pnas.org/content/115/5/E1051`

In [130]:
PAPER = "https://www.sciencedirect.com/science/article/pii/S0092867421005018?dgcid=rss_sd_all"

In [131]:
query = f"""

SELECT ?label ?brain_region

WHERE {{
        ?id seeAlso <{PAPER}> ;
            label ?label .
        OPTIONAL {{ ?id canHaveBrainRegion / label ?brain_region }} .
}}
"""

In [132]:
resources = forge.sparql(query, limit=100)

In [133]:
df = forge.as_dataframe(resources)

In [134]:
df.head()

Unnamed: 0,label,brain_region
0,128_L2 IT APr,Area prostriata
1,129_L2/3 IT POST-PRE,Postsubiculum
2,129_L2/3 IT POST-PRE,Presubiculum
3,132_L2 IT RSPv-POST-PRE,Postsubiculum
4,132_L2 IT RSPv-POST-PRE,Presubiculum


### Get the m-, e- and t-type placeholders

One of the requirements specified during the meeting on `2022-05-30` was to have a placeholder class for each of the types. We thus implemented an m-, an e- and a t-type placeholder class.

In [135]:
query = """

SELECT ?id ?label

WHERE {
        ?id label ?label .
        FILTER (CONTAINS(STR(?label), 'Placeholder'))
}
"""

In [136]:
resources = forge.sparql(query, limit=100)

In [137]:
df = forge.as_dataframe(resources)

In [138]:
df

Unnamed: 0,id,label
0,https://bbp.epfl.ch/ontologies/core/bmo/MTypeP...,MType Placeholder
1,https://bbp.epfl.ch/ontologies/core/bmo/TTypeP...,TType Placeholder
2,https://bbp.epfl.ch/ontologies/core/bmo/ETypeP...,EType Placeholder


### Get all cell type combinations and probabilities for a given brain region

This query returns cell type combinations for a given brain region (i.e. the `BRAIN_REGION` specified above). For demonstration purposes, the `limit` parameter on the query has been set to `1000`. This can be increased to get all available cell type combinations.

In [139]:
query = f"""

SELECT ?brain_region ?m_type ?e_type ?molecular_type ?probability

WHERE {{
        ?probability_id hasTarget / hasSource / hasSomaLocatedIn ?brain_region_id ;
            hasBody / value ?probability ;
            hasTarget ?m_type_target ;
            hasTarget ?e_type_target ;
            hasTarget ?molecular_type_target .
        ?brain_region_id label ?brain_region ;
            ^hasPart* <{brain_region}> .
        ?m_type_target hasSource / a MType ;
            hasSource / label ?m_type .
        ?e_type_target hasSource / a EType ;
            hasSource / label ?e_type .
        ?molecular_type_target hasSource / a NeuronMolecularType ;
            hasSource / label ?molecular_type .
}}
"""

In [140]:
resources = forge.sparql(query, limit=1000)

In [141]:
df = forge.as_dataframe(resources)

In [142]:
df.head()

Unnamed: 0,brain_region,e_type,m_type,molecular_type,probability
0,Isocortex,cAC,L4_LBC,VIP+,0.462554113
1,Isocortex,cAC,L23_NBC,Serpinf1+,0.0
2,Isocortex,cAC,L23_BP,PV+,0.0
3,Isocortex,cAC,L23_BTC,PV+,0.5
4,Isocortex,cAC,L23_ChC,PV+,0.333333333


----

## 2022-08-31 Workshop queries

### Get Atlas Release info

#### Get the atlas release resource
These atlas releases can be explored through the atlas web app:

* dev: https://bluebrainatlas.kcpdev.bbp.epfl.ch/atlas
* prod: https://bbp.epfl.ch/atlas


In [118]:
BBP_Mouse_Brain_Atlas_Release = "https://bbp.epfl.ch/neurosciencegraph/data/4906ab85-694f-469d-962f-c0174e901885" # output of the BBP Annotation Atlas pipeline
Allen_Mouse_CCF_v2_v3_hybrid = "https://bbp.epfl.ch/neurosciencegraph/data/e2e500ec-fe7e-4888-88b9-b72425315dda" # Csaba 1 version: this atlas release uses the brain parcellation resulting from the hybridization of CCFv2 and CCFv3 and integrates the splitting of layer 2 and layer 3. The average brain template and the ontology are common to CCFv2 and CCFv3.
ALLEN_CCFV3_Atlas_Release = "https://bbp.epfl.ch/neurosciencegraph/data/831a626a-c0ae-4691-8ce8-cfb7491345d9" # original Allen
ALLEN_CCFV2_Atlas_Release = "https://bbp.epfl.ch/neurosciencegraph/data/dd114f81-ba1f-47b1-8900-e497597f06ac"

atlas_release_id = BBP_Mouse_Brain_Atlas_Release

In [97]:
atlas_release = forge.retrieve(atlas_release_id, version=1)

In [None]:
print(atlas_release)

In [99]:
atlas_release._store_metadata["_rev"]

4

#### Get the atlas hierarchy

In [100]:
parcellation_ontology = forge.retrieve(atlas_release.parcellationOntology.id, cross_bucket=True)

In [None]:
print(parcellation_ontology)

In [None]:
forge.download(parcellation_ontology, "distribution.contentUrl", ".", overwrite=True, cross_bucket=True)

#### Get parcellation (annotation) volume

In [None]:
parcellation_volume = forge.retrieve(atlas_release.parcellationVolume.id)

In [None]:
print(parcellation_volume)

In [None]:
forge.download(parcellation_volume, "distribution.contentUrl", ".", overwrite=True)

#### Get orientation field volume

In [None]:
query = {
          "type":"CellOrientationField", 
          "atlasRelease":{"@id":atlas_release_id},
          "brainLocation":{"brainRegion":{"id":"http://api.brain-map.org/api/v2/data/Structure/997"}} # root brain region
        }
cell_orientation_field = forge.search(query)
print(f"{len(cell_orientation_field)} found")

In [None]:
print(cell_orientation_field)

In [67]:
forge.download(cell_orientation_field, "distribution.contentUrl", ".", overwrite=True)

### Get the mtype density nrrd file for each region (region is an input)

`TODO`: Include the `atlas release` and the `brain region`. Can there be multiple? Is it for the whole brain?

In [124]:
# Some atlas releases might not have any associated densities
query = f"""

SELECT ?mtype_label ?nrrd_file ?contentUrl 

WHERE {{
        ?s a MTypeDensity ;
            atlasRelease <{atlas_release_id}>; 
            annotation / hasBody / label ?mtype_label ;
            distribution ?distribution .
        ?distribution name ?nrrd_file ;
            contentUrl ?contentUrl .
}}
"""

In [125]:
resources = forge.sparql(query, limit=1000)

In [126]:
df = forge.as_dataframe(resources)

In [127]:
df.head()

Unnamed: 0,contentUrl,mtype_label,nrrd_file
0,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/d...,L5_TPC:A,[cell_density]L5_TPC:A.nrrd
1,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/5...,L2_IPC,[cell_density]L2_IPC.nrrd
2,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/1...,L4_SSC,[cell_density]L4_SSC.nrrd
3,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/8...,L4_UPC,[cell_density]L4_UPC.nrrd
4,https://bbp.epfl.ch/nexus/v1/files/bbp/atlas/6...,L5_UPC,[cell_density]L5_UPC.nrrd


In [None]:
forge.download(resources, "contentUrl", ".", overwrite=True)
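The file names in the table above follow the pattern `[cell_density]<m-type>.nrrd`, so the m-type can be recovered from a downloaded file name when the dataframe is no longer at hand. The sketch below assumes that every density file follows this pattern, which is only inferred from the five rows shown; `m_type_from_filename` is a hypothetical helper.

```python
import re

# Pattern inferred from the nrrd_file column above (an assumption).
_DENSITY_NAME = re.compile(r"\[cell_density\](?P<m_type>.+)\.nrrd")


def m_type_from_filename(name):
    """Return the m-type label encoded in a density file name, or None."""
    match = _DENSITY_NAME.fullmatch(name)
    return match.group("m_type") if match else None


print(m_type_from_filename("[cell_density]L5_TPC:A.nrrd"))  # L5_TPC:A
print(m_type_from_filename("annotation.nrrd"))  # None
```

Returning `None` for names that do not match keeps unrelated `.nrrd` files in the download directory from being misread as densities.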

### Get the list of mtypes for each region (region is an input)

#### The below query will get m-types from the specified brain region plus m-types from all child brain regions (this was added during the workshop to illustrate the `down the tree` generalisation idea)

In [148]:
BRAIN_REGION = "Cerebral cortex"

Get brain region id

In [149]:
r = forge.search({"label": BRAIN_REGION}, cross_bucket=True)

In [150]:
brain_region = r[0].id

In [151]:
brain_region

'http://api.brain-map.org/api/v2/data/Structure/688'

In [152]:
query = f"""

SELECT ?m_type ?m_type_id ?m_type_version ?brain_region

WHERE {{
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            hasSomaLocatedIn ?brain_region_id .
        
        ?brain_region_id label ?brain_region ;
            ^hasPart* <{brain_region}> ;
            _rev ?brain_region_version . 
}}
"""

In [153]:
resources = forge.sparql(query, limit=1000)

In [154]:
df = forge.as_dataframe(resources)

In [155]:
df.head()

Unnamed: 0,brain_region,m_type,m_type_id,m_type_version
0,Isocortex,L5_TPC:B,http://uri.interlex.org/base/ilx_0381364,30
1,Isocortex,L5_TPC:A,http://uri.interlex.org/base/ilx_0381365,28
2,Isocortex,L3_TPC:A,http://uri.interlex.org/base/ilx_0381366,28
3,Isocortex,L2_TPC:B,http://uri.interlex.org/base/ilx_0381367,30
4,Isocortex,L3_TPC:B,http://uri.interlex.org/base/ilx_0381368,27


In [156]:
set(df.brain_region)

{'Hippocampal formation', 'Isocortex'}

#### The below query will get m-types from the specified brain region plus m-types from parent brain regions (this was added during the workshop to illustrate the `up the tree` generalisation idea)

In [157]:
BRAIN_REGION = "Entorhinal area, medial part, dorsal zone"

Get brain region id

In [158]:
r = forge.search({"label": BRAIN_REGION}, cross_bucket=True)  # cross_bucket=True as in the earlier lookup; without it the search returned no hit

In [159]:
brain_region = r[0].id

In [None]:
brain_region

In [None]:
query = f"""

SELECT ?m_type ?m_type_id ?m_type_version ?brain_region

WHERE {{
        ?m_type_id label ?m_type ;
            _rev ?m_type_version ;
            subClassOf* MType ;
            hasSomaLocatedIn ?brain_region_id .
        
        ?brain_region_id label ?brain_region ;
            hasPart* <{brain_region}> ;
            _rev ?brain_region_version . 
}}
"""

In [None]:
resources = forge.sparql(query, limit=1000)

In [None]:
df = forge.as_dataframe(resources)

In [None]:
df.head()

In [None]:
set(df.brain_region)
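In the `up the tree` query, `hasPart*` is followed from a candidate region down to the chosen one, so it matches the chosen region itself and each of its ancestors. The plain-Python sketch below walks the same chain on a toy child-to-parent mapping. The mapping is a simplified assumption for illustration and omits intermediate levels of the real hierarchy; `region_and_ancestors` is a hypothetical helper.

```python
# Toy child -> parent mapping; simplified and assumed for illustration.
PART_OF = {
    "Entorhinal area, medial part, dorsal zone": "Entorhinal area",
    "Entorhinal area": "Retrohippocampal region",
    "Retrohippocampal region": "Hippocampal formation",
    "Hippocampal formation": "Cerebral cortex",
}


def region_and_ancestors(region, part_of):
    """Return the region followed by its ancestors, nearest first."""
    chain = [region]
    while chain[-1] in part_of:
        chain.append(part_of[chain[-1]])
    return chain


for name in region_and_ancestors(
    "Entorhinal area, medial part, dorsal zone", PART_OF
):
    print(name)
```

Because the chain is ordered nearest first, the first ancestor that has m-types attached would be the most specific region to generalise from.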

### Get the etype ratio for each mtype for each region (region and mtype are inputs)

`TODO`: should we add the brain regions for which we want to use this? Generalisation here: From `rat` to `mouse`, from `primary somatosensory cortex` to `Isocortex`

In [165]:
query = f"""

SELECT ?brain_region ?m_type ?e_type ?ratio

WHERE {{
        ?s a NeuronComposition ;
            brainLocation / brainRegion / label ?brain_region ;
            annotation ?mtype_anno ;
            annotation ?etype_anno ;
            series ?ratioSeries .
        ?ratioSeries statistic "ratio" ;
              value ?ratio .
        ?mtype_anno a MTypeAnnotation ;
            hasBody / label ?m_type .
        ?etype_anno a ETypeAnnotation ;
            hasBody / label ?e_type .
}}
"""

In [166]:
resources = forge.sparql(query, limit=1000)

In [167]:
df = forge.as_dataframe(resources)

In [168]:
df.head()

Unnamed: 0,brain_region,e_type,m_type,ratio
0,primary somatosensory cortex,cSTUT,L5_MC,0.037
1,primary somatosensory cortex,cACint,L5_BP,0.2857
2,primary somatosensory cortex,cNAC,L5_BP,0.1429
3,primary somatosensory cortex,dSTUT,L5_BP,0.0714
4,primary somatosensory cortex,bAC,L23_DBC,0.0588
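The heading names the region and the m-type as inputs, while the query above returns all compositions. One way to apply both inputs afterwards is to filter the returned rows. The sketch below works on the five rows shown above, copied by hand; `e_type_ratios` is a hypothetical helper.

```python
# (brain_region, m_type, e_type, ratio) rows copied from the output above.
ROWS = [
    ("primary somatosensory cortex", "L5_MC", "cSTUT", 0.037),
    ("primary somatosensory cortex", "L5_BP", "cACint", 0.2857),
    ("primary somatosensory cortex", "L5_BP", "cNAC", 0.1429),
    ("primary somatosensory cortex", "L5_BP", "dSTUT", 0.0714),
    ("primary somatosensory cortex", "L23_DBC", "bAC", 0.0588),
]


def e_type_ratios(rows, brain_region, m_type):
    """Map e-type to ratio for one brain region and one m-type."""
    return {
        e_type: ratio
        for region, m, e_type, ratio in rows
        if region == brain_region and m == m_type
    }


print(e_type_ratios(ROWS, "primary somatosensory cortex", "L5_BP"))
# {'cACint': 0.2857, 'cNAC': 0.1429, 'dSTUT': 0.0714}
```

Only the first five rows of the result are used here, so the ratios shown for `L5_BP` are not expected to sum to one.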


### Get if the mtype is Excitatory or Inhibitory (mtype is input)

See `Get m-types together with their transmitter type (sClass)` query above

### Get if the mtype is pyramidal or interneuron (mtype is input)

See `Get m-types of pyramidal cells (mClass)` and `Get m-types of interneurons (mClass)` queries above

### Get the mini frequency for the mtype (mtype is input)

In [169]:
# TODO: these data are not yet integrated in the knowledge graph

### Get all brain regions of layer 1 of the neocortex

In [170]:
query = f"""

SELECT ?id ?region

WHERE {{
        ?id hasLayerLocationPhenotype <http://purl.obolibrary.org/obo/UBERON_0005390> ;
            subClassOf* BrainRegion ;
            label ?region
}}
"""

In [171]:
resources = forge.sparql(query, limit=1000)

In [172]:
df = forge.as_dataframe(resources)

In [173]:
df.head()

Unnamed: 0,id,region
0,http://api.brain-map.org/api/v2/data/Structure...,"Primary auditory area, layer 1"
1,http://api.brain-map.org/api/v2/data/Structure...,"Primary somatosensory area, trunk, layer 1"
2,http://api.brain-map.org/api/v2/data/Structure...,"Dorsal auditory area, layer 1"
3,http://api.brain-map.org/api/v2/data/Structure...,"Ventral auditory area, layer 1"
4,http://api.brain-map.org/api/v2/data/Structure...,"Posterior auditory area, layer 1"
