# Tutorial: Describe Neuroscience Dataset using MINDS

## Initialize and configure

In [None]:
!pip install nexusforge==0.7.0

In [None]:
!pip install allensdk

In [None]:
!pip install neurom[plotly]==3.0.1

In [None]:
!pip install --upgrade nest-asyncio==1.5.1


### Get an authentication token

The [Nexus sandbox application](https://sandbox.bluebrainnexus.io) can be used to get a token:

- Step 1: From the [web page](https://sandbox.bluebrainnexus.io), click on the login button in the top right corner and follow the instructions on screen.

- Step 2: You will then see a `Copy token` button in the top right corner. Click on it to copy the token to the clipboard.


Once a token is obtained, proceed to paste it as the value of the `TOKEN` variable below.

__Important__: A Nexus token is valid for 8 hours. If your working session lasts longer than that, you may need to refresh the value of the token and reinitialize the forge client in the _'Configure a forge client to store, manage and access datasets'_ section below.
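
To see how much of that validity window is left, the token's expiry can be read locally. The following is a minimal sketch, assuming the token is a JSON Web Token (JWT) carrying a standard `exp` claim; this tutorial does not state the token format, so treat that as an assumption. The helper name `token_seconds_left` is hypothetical, and the function only decodes the payload without verifying the signature.

```python
import base64
import json
import time


def token_seconds_left(token, now=None):
    """Return the seconds left before a JWT expires, or None if unreadable."""
    parts = token.split(".")
    if len(parts) != 3:
        return None  # not in header.payload.signature form
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        return None  # payload is not base64-encoded JSON
    if not isinstance(claims, dict) or "exp" not in claims:
        return None  # assumption: expiry is stored in the `exp` claim
    current = time.time() if now is None else now
    return claims["exp"] - current
```

Once `TOKEN` is set, `token_seconds_left(TOKEN)` returning a small or negative number would suggest fetching a new token before registering anything; a return value of `None` means the assumption about the token format did not hold.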

In [None]:
import getpass

In [None]:
TOKEN = getpass.getpass()

··········


### Configure a forge client to store, manage and access datasets

In [None]:
import uuid
import base64
import requests
import json
from pathlib import Path

from kgforge.core import KnowledgeGraphForge
from kgforge.specializations.mappings import DictionaryMapping

from allensdk.api.queries.cell_types_api import CellTypesApi
from allensdk.core.cell_types_cache import CellTypesCache

In [None]:
r = requests.get('https://raw.githubusercontent.com/BlueBrain/nexus/ef830192d4e7bb95f9351c4bdab7b0114c27e2f0/docs/src/main/paradox/docs/getting-started/notebooks/rdfmodel/jsonldcontext.json')
r.raise_for_status()  # stop here if the JSON-LD context could not be fetched
dirpath = './rdfmodel'
Path(dirpath).mkdir(parents=True, exist_ok=True)
# Save the JSON-LD context locally so the forge configuration can find it.
with open(f'{dirpath}/jsonldcontext.json', 'w') as outfile:
    json.dump(r.json(), outfile)

In [None]:
ORG = "github-users"
PROJECT = "cdjcodes"  # Replace with the name of the project that was created automatically when you logged into the Nexus sandbox instance.

In [None]:
forge = KnowledgeGraphForge("https://raw.githubusercontent.com/BlueBrain/nexus/ef830192d4e7bb95f9351c4bdab7b0114c27e2f0/docs/src/main/paradox/docs/getting-started/notebooks/forge.yml",
                            bucket=f"{ORG}/{PROJECT}",
                            endpoint="https://sandbox.bluebrainnexus.io/v1",
                            token=TOKEN)

## Download datasets from Allen Cell Types Database

### Download mouse neuron morphology from the Allen Cell Types Database

We will be downloading mouse neuron morphology data from the [Allen Cell Types Database](https://celltypes.brain-map.org/). The [AllenSDK](https://allensdk.readthedocs.io/en/latest/) can be used for data download.

In [None]:
ALLEN_DIR = "allen_cell_types_database"

In [None]:
ctc = CellTypesCache(manifest_file=f"{ALLEN_DIR}/manifest.json")

In [None]:
MAX_CELLS = 1
SPECIES = CellTypesApi.MOUSE

In [None]:
nm_allen_identifiers = [cell["id"] for cell in ctc.get_cells(species=[SPECIES], require_reconstruction=True)][:MAX_CELLS]
print(f"Selected a mouse neuron with identifier: {nm_allen_identifiers}")

Selected a mouse neuron with identifier: [485909730]


In [None]:
with open(f"{ALLEN_DIR}/cells.json") as f:
    allen_cell_types_metadata = json.load(f)

In [None]:
nm_allen_metadata = [neuron for neuron in allen_cell_types_metadata if neuron["specimen__id"] in nm_allen_identifiers]

In [None]:
print(f"Metadata of the neuron {nm_allen_identifiers}:")
nm_allen_metadata

Metadata of the neuron [485909730]:


[{'cell_reporter_status': 'positive',
  'csl__normalized_depth': 0.478343598387418,
  'csl__x': 8881.0,
  'csl__y': 953.839501299405,
  'csl__z': 7768.22695782726,
  'donor__age': '',
  'donor__disease_state': '',
  'donor__id': 485250100,
  'donor__name': 'Cux2-CreERT2;Ai14-205530',
  'donor__race': '',
  'donor__sex': '',
  'donor__species': 'Mus musculus',
  'donor__years_of_seizure_history': '',
  'ef__adaptation': 0.0323396179505003,
  'ef__avg_firing_rate': 17.8906878969496,
  'ef__avg_isi': 55.895,
  'ef__f_i_curve_slope': 0.25,
  'ef__fast_trough_v_long_square': -49.0000038146973,
  'ef__peak_t_ramp': 2.85131166666667,
  'ef__ri': 213.124960660934,
  'ef__tau': 20.5677674593068,
  'ef__threshold_i_long_square': 70.0,
  'ef__upstroke_downstroke_ratio_long_square': 3.04293347960074,
  'ef__vrest': -76.9283905029297,
  'ephys_inst_thresh_thumb_path': '/api/v2/well_known_file_download/491381130',
  'ephys_thumb_path': '/api/v2/well_known_file_download/485911650',
  'erwkf__id': 491

### Download one mouse neuron morphology reconstructed from the selected neuron

We will be downloading one mouse neuron morphology from the [Allen Cell Types Database](https://celltypes.brain-map.org/) using the [AllenSDK](https://allensdk.readthedocs.io/en/latest/).

In [None]:
for identifier in nm_allen_identifiers:
    ctc.get_reconstruction(identifier)

2022-07-12 09:07:05,983 allensdk.api.api.retrieve_file_over_http INFO     Downloading URL: http://api.brain-map.org/api/v2/well_known_file_download/500961530


### Download one mouse neuron electrophysiology recording from the selected neuron

We will be downloading one mouse neuron electrophysiology recording from the [Allen Cell Types Database](https://celltypes.brain-map.org/) using the [AllenSDK](https://allensdk.readthedocs.io/en/latest/).

In [None]:
for identifier in nm_allen_identifiers:
    ctc.get_ephys_data(identifier)

2022-07-12 09:07:17,093 allensdk.api.api.retrieve_file_over_http INFO     Downloading URL: http://api.brain-map.org/api/v2/well_known_file_download/491316386


## Transform Allen Cell Types Database Metadata to [Neuroshapes' MINDS](https://bbp-nexus.epfl.ch/datamodels/class-schemadataset.html) metadata

### Map the Allen Cell Types Database neuron morphologies metadata to Neuroshapes

In [None]:
allen_nm_mapping = DictionaryMapping.load("https://raw.githubusercontent.com/BlueBrain/nexus/ef830192d4e7bb95f9351c4bdab7b0114c27e2f0/docs/src/main/paradox/docs/getting-started/notebooks/mappings/allen_morphology_dataset.hjson")
nm_allen_resources = forge.map(nm_allen_metadata, allen_nm_mapping, na='')

### Map the Allen Cell Types Database neuron electrophysiology recording to Neuroshapes

In [None]:
allen_ephys_mapping = DictionaryMapping.load("https://raw.githubusercontent.com/BlueBrain/nexus/ef830192d4e7bb95f9351c4bdab7b0114c27e2f0/docs/src/main/paradox/docs/getting-started/notebooks/mappings/allen_ephys_dataset.hjson")
nephys_allen_resources = forge.map(nm_allen_metadata, allen_ephys_mapping, na='')
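
Conceptually, a dictionary mapping pairs each target property with a rule over the flat source record. The sketch below illustrates that idea in plain Python; it is not the `kgforge` implementation, and the rule table is hypothetical rather than copied from the `.hjson` mapping files. It also assumes that `na=''` means "drop values equal to the empty string", which is our reading of the call above rather than something this tutorial states.

```python
def map_allen_record(record, na=""):
    """Turn one flat Allen metadata record into a nested, MINDS-like dict."""
    # Hypothetical rules: target path -> source key in the Allen record.
    rules = {
        ("subject", "identifier"): "donor__id",
        ("subject", "name"): "donor__name",
        ("subject", "sex", "label"): "donor__sex",
        ("subject", "species", "label"): "donor__species",
        ("brainLocation", "coordinatesInBrainAtlas", "valueX"): "csl__x",
        ("brainLocation", "coordinatesInBrainAtlas", "valueY"): "csl__y",
    }
    mapped = {}
    for path, source_key in rules.items():
        value = record.get(source_key, na)
        if value == na:
            continue  # assumption: "not available" values are left out
        node = mapped
        for key in path[:-1]:
            node = node.setdefault(key, {})  # create nested levels on demand
        node[path[-1]] = value
    return mapped


# A few fields taken from the metadata printed earlier in this tutorial.
record = {
    "donor__id": 485250100,
    "donor__name": "Cux2-CreERT2;Ai14-205530",
    "donor__species": "Mus musculus",
    "donor__sex": "",
    "csl__x": 8881.0,
}
mapped = map_allen_record(record)
print(mapped)
```

In this toy version the empty `donor__sex` and the missing `csl__y` simply do not appear in the result, while the remaining keys end up nested under `subject` and `brainLocation`.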

## Register

If the registration fails, try refreshing the access token and reinitializing the forge client in the _'Configure a forge client to store, manage and access datasets'_ section.

### Register the Allen Cell Types Database neuron morphology

In [None]:
nm_allen_resources.id = forge.format("identifier", "neuronmorphologies", str(uuid.uuid4()))

In [None]:
forge.register(nm_allen_resources)

<action> _register_one
<succeeded> True


### Register the Allen Cell Types Database neuron electrophysiology recording

In [None]:
nephys_allen_resources.id = forge.format("identifier", "traces", str(uuid.uuid4()))

In [None]:
forge.register(nephys_allen_resources)

<action> _register_one
<succeeded> True


## Access

### Set filters

In [None]:
_type = "NeuronMorphology"

filters = {"type": _type}

### Run Query

In [None]:
number_of_results = 10  # You can limit the number of results, pass `None` to fetch all the results

data = forge.search(filters, limit=number_of_results)

print(f"{len(data)} dataset(s) of type {_type} found")

1 dataset(s) of type NeuronMorphology found


### Display the results as pandas dataframe

In [None]:
property_to_display = ["id","name","subject","brainLocation.brainRegion.id","brainLocation.brainRegion.label","brainLocation.layer.id","brainLocation.layer.label","contribution","distribution.name","distribution.contentUrl","distribution.encodingFormat"]
reshaped_data = forge.reshape(data, keep=property_to_display)

forge.as_dataframe(reshaped_data)

| | id | brainLocation.brainRegion.id | brainLocation.brainRegion.label | brainLocation.layer | contribution.type | contribution.agent.id | contribution.agent.type | contribution.agent.label | distribution.contentUrl | distribution.encodingFormat | distribution.name | name | subject.type | subject.age.period | subject.identifier | subject.name | subject.species.label | subject.strain.label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | https://bbp.epfl.ch/neurosciencegraph/data/neu... | http://api.brain-map.org/api/v2/data/Structure... | VISp5 | 5 | Contribution | https://www.grid.ac/institutes/grid.417881.3 | Organization | Allen Institute for Brain Science | https://sandbox.bluebrainnexus.io/v1/files/git... | application/swc | reconstruction.swc | Cux2-CreERT2;Ai14-205530.03.02.01 | Subject | Post-natal | 485250100 | Cux2-CreERT2;Ai14-205530 | Mus musculus | Cux2-CreERT2 |
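
The dotted column names in the dataframe above, such as `brainLocation.brainRegion.label`, reflect the nesting of the underlying resource. As a rough illustration of how nested properties become flat columns, here is a small pure-Python sketch; the `flatten` helper is hypothetical and is not how `forge.as_dataframe` is implemented.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into one dict keyed by dotted paths."""
    flat = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))  # descend one level
        else:
            flat[path] = value
    return flat


# A hand-written excerpt using values shown in the dataframe above.
resource = {
    "name": "Cux2-CreERT2;Ai14-205530.03.02.01",
    "brainLocation": {"brainRegion": {"label": "VISp5"}},
    "subject": {"species": {"label": "Mus musculus"}},
}
row = flatten(resource)
print(row)
```

Each leaf value ends up under the path of keys that leads to it, which is why a property requested as `subject` in `property_to_display` shows up as several `subject.*` columns.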


### Download

In [None]:
dirpath = "./downloaded/"
forge.download(data, "distribution.contentUrl", dirpath, overwrite=True)

In [None]:
ls ./downloaded/

reconstruction.swc
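
Before plotting, it can be useful to check what the downloaded file contains. SWC is a plain-text format in which each non-comment line describes one sample point as `id type x y z radius parent`. The sketch below counts points per structure type using the conventional type codes; the `count_swc_points` helper is hypothetical, and the sample lines are made up rather than taken from `reconstruction.swc`.

```python
from collections import Counter

# Conventional SWC structure-type codes; files may contain other codes too.
SWC_TYPES = {1: "soma", 2: "axon", 3: "basal dendrite", 4: "apical dendrite"}


def count_swc_points(lines):
    """Count sample points per structure type in SWC-formatted lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        # Columns: id, type, x, y, z, radius, parent id
        fields = line.split()
        type_code = int(fields[1])
        counts[SWC_TYPES.get(type_code, f"type {type_code}")] += 1
    return counts


# Tiny made-up sample, not the content of the downloaded reconstruction.
sample = """# id type x y z radius parent
1 1 0.0 0.0 0.0 5.0 -1
2 3 1.0 0.0 0.0 0.5 1
3 3 2.0 0.0 0.0 0.5 2
4 2 0.0 -1.0 0.0 0.3 1
""".splitlines()
print(count_swc_points(sample))
```

To apply it to the real file, pass an open file handle instead of `sample`, for example `with open("./downloaded/reconstruction.swc") as f: print(count_swc_points(f))`.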


### Display a result as 3D Neuron Morphology

In [None]:
from neurom import load_morphology
from neurom.view.plotly_impl import plot_morph3d
import IPython

In [None]:
neuron = load_morphology(f"{dirpath}/{data[0].distribution.name}")
plot_morph3d(neuron, inline=False)
IPython.display.HTML(filename='./morphology-3D.html')

## Version the dataset
Tagging a dataset is comparable to `git tag`: it attaches a named version to the current state of the dataset, so that this exact state can be retrieved later.

In [None]:
forge.tag(data, value="releaseV112")

<count> 1
<action> _tag_many
<succeeded> True


In [None]:
# The version argument can be specified to retrieve the dataset at a given tag.

tagged_data = forge.retrieve(id=data[0].id, version="releaseV112")

In [None]:
forge.as_dataframe(tagged_data)

| | id | type | _schemaProject | apicalDendrite | brainLocation.type | brainLocation.brainRegion.id | brainLocation.brainRegion.label | brainLocation.coordinatesInBrainAtlas.valueX | brainLocation.coordinatesInBrainAtlas.valueY | brainLocation.coordinatesInBrainAtlas.valueZ | ... | objectOfStudy.id | objectOfStudy.type | objectOfStudy.label | subject.type | subject.age.period | subject.identifier | subject.name | subject.species.label | subject.strain.label | tag__apical |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | https://bbp.epfl.ch/neurosciencegraph/data/neu... | [Dataset, NeuronMorphology] | https://sandbox.bluebrainnexus.io/v1/projects/... | spiny | BrainLocation | http://api.brain-map.org/api/v2/data/Structure... | VISp5 | 8881.0 | 953.839501 | 7768.226958 | ... | http://bbp.epfl.ch/neurosciencegraph/taxonomie... | ObjectOfStudy | Single Cell | Subject | Post-natal | 485250100 | Cux2-CreERT2;Ai14-205530 | Mus musculus | Cux2-CreERT2 | intact |


In [None]:
data[0].description = "Neuron Morphology from Allen"

In [None]:
forge.update(data[0])

<action> _update_one
<succeeded> True


In [None]:
non_tagged_data = forge.retrieve(id=data[0].id)

In [None]:
forge.as_dataframe(non_tagged_data)

| | id | type | _schemaProject | apicalDendrite | brainLocation.type | brainLocation.brainRegion.id | brainLocation.brainRegion.label | brainLocation.coordinatesInBrainAtlas.valueX | brainLocation.coordinatesInBrainAtlas.valueY | brainLocation.coordinatesInBrainAtlas.valueZ | ... | objectOfStudy.id | objectOfStudy.type | objectOfStudy.label | subject.type | subject.age.period | subject.identifier | subject.name | subject.species.label | subject.strain.label | tag__apical |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | https://bbp.epfl.ch/neurosciencegraph/data/neu... | [Dataset, NeuronMorphology] | https://sandbox.bluebrainnexus.io/v1/projects/... | spiny | BrainLocation | http://api.brain-map.org/api/v2/data/Structure... | VISp5 | 8881.0 | 953.839501 | 7768.226958 | ... | http://bbp.epfl.ch/neurosciencegraph/taxonomie... | ObjectOfStudy | Single Cell | Subject | Post-natal | 485250100 | Cux2-CreERT2;Ai14-205530 | Mus musculus | Cux2-CreERT2 | intact |
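
Because the two dataframes above are truncated, the difference between the tagged and the latest version is hard to see. The toy model below sketches the behaviour we would expect: a tag pins the revision that was current when the tag was created, and later updates only affect what is retrieved without a version. The `VersionedStore` class is purely illustrative and hypothetical; it is not how Nexus or `kgforge` store revisions.

```python
import copy


class VersionedStore:
    """Toy in-memory store: each update adds a revision, tags pin revisions."""

    def __init__(self):
        self._revisions = {}  # resource id -> list of payload snapshots
        self._tags = {}       # (resource id, tag) -> 1-based revision number

    def register(self, rid, payload):
        self._revisions[rid] = [copy.deepcopy(payload)]

    def update(self, rid, payload):
        self._revisions[rid].append(copy.deepcopy(payload))

    def tag(self, rid, value):
        # Pin the tag to whichever revision is the latest right now.
        self._tags[(rid, value)] = len(self._revisions[rid])

    def retrieve(self, rid, version=None):
        revisions = self._revisions[rid]
        if version is None:
            return copy.deepcopy(revisions[-1])  # latest revision
        revision = self._tags[(rid, version)]    # look up the tagged revision
        return copy.deepcopy(revisions[revision - 1])


store = VersionedStore()
store.register("morph-1", {"name": "reconstruction"})
store.tag("morph-1", "releaseV112")
store.update("morph-1", {"name": "reconstruction",
                         "description": "Neuron Morphology from Allen"})

tagged = store.retrieve("morph-1", version="releaseV112")
latest = store.retrieve("morph-1")
print("description" in tagged, "description" in latest)
```

In this model, the resource retrieved through the tag has no `description`, while the untagged retrieval does. Comparing `tagged_data` and `non_tagged_data` attribute by attribute in the notebook is a way to confirm whether the real service behaves the same way.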
