# Store Allen Cell Types Database data in Blue Brain Nexus

The goal of this notebook is to collect, map, ingest, find and download data from the Allen Cell Types Database using Blue Brain Nexus and Neuroshapes.

<img src="../ingest-allen-celltypes-db-in-nexus/assets/nexus_workshop_data_pipeline.png" width="1200">

In [None]:
# TODO: fix the Quick start tutorial link

## Prerequisites

This notebook assumes that:
- you have created a project within the AWS sandbox deployment of Blue Brain Nexus. If not, follow the Blue Brain Nexus [Quick Start tutorial](https://bluebrain.github.io/nexus/docs/tutorial/getting-started/quick-start/index.html)
- the neuroshapes schemas are available in the neuroshapes/schemas project of the AWS sandbox deployment of Blue Brain Nexus

## Overview

You'll work through the following steps:

1. Configure Blue Brain Nexus environment
2. Collect and explore Allen Cell Types Database electrophysiology and neuron morphology data (files and metadata)
3. Store electrophysiology and neuron morphology files in Blue Brain Nexus
4. Map Allen Cell Types Database metadata to Neuroshapes
5. Store mapped Allen Cell Types Database metadata in Blue Brain Nexus
6. Find and download stored data using SPARQL

## Step 1: Configure Blue Brain Nexus environment

Install the required Python packages

In [None]:
# !pip install allensdk
# !pip install -U nexus-sdk
# !pip install rdflib
# !pip install SPARQLWrapper

Import the required Python packages

In [1]:
# TODO: check usage
import requests
import json
import getpass
import pandas as pd
import os
import matplotlib.pyplot as plt

from allensdk.core.cell_types_cache import CellTypesCache
from allensdk.api.queries.cell_types_api import CellTypesApi
from allensdk.core.cell_types_cache import ReporterStatus as RS
from allensdk.core.swc import Marker

from sparqlendpointhelper import SparqlViewHelper
import nexussdk as nexus
import Nexus.Mapper as mapper
import Nexus.Utils as utils
import Nexus.Neuroshapes as neuroshapes

Set up the Blue Brain Nexus sandbox environment

In [2]:
DEPLOYMENT = "https://sandbox.bluebrainnexus.io/v1"

In [3]:
TOKEN = getpass.getpass() # Paste your token here

In [4]:
nexus.config.set_environment(DEPLOYMENT)

In [5]:
nexus.config.set_token(TOKEN)

In [6]:
ORGANIZATION = "tutorialnexus" # For the purpose of this workshop, we will be working in the tutorialnexus organization

In [7]:
PROJECT = "akk" # Paste your project name here

Configure your project in the tutorialnexus organization

In [8]:
utils.configure_project(nexus, DEPLOYMENT, ORGANIZATION, PROJECT)

409 Client Error: Conflict for url: https://sandbox.bluebrainnexus.io/v1/resources/tutorialnexus/akk/_
---
{
  "@context": "https://bluebrain.github.io/nexus/contexts/error.json",
  "@type": "ResourceAlreadyExists",
  "reason": "Resource 'https://akk.neuroshapes.org' already exists."
}
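
The 409 response above reports that the project configuration resource already exists, which is what you would expect when this cell is run a second time. As a minimal sketch, such an error payload could be recognised so that a re-run is treated as harmless. `is_already_exists` is a hypothetical helper written for illustration; it is not part of `nexussdk` or of the workshop's `Nexus.Utils` module.

```python
import json


def is_already_exists(status_code, body):
    """Return True when an error response reports an existing resource.

    Hypothetical helper: it only inspects the HTTP status code and the
    "@type" field of the JSON error payload, as seen in the output above.
    """
    if status_code != 409:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # The body was not valid JSON, so nothing can be concluded from it
        return False
    return isinstance(payload, dict) and payload.get("@type") == "ResourceAlreadyExists"


# Error payload copied from the output above
example_body = """{
  "@context": "https://bluebrain.github.io/nexus/contexts/error.json",
  "@type": "ResourceAlreadyExists",
  "reason": "Resource 'https://akk.neuroshapes.org' already exists."
}"""

print(is_already_exists(409, example_body))
```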


## Step 2: Collect and explore Allen Cell Types Database electrophysiology and neuron morphology data (files and metadata)

We will be working with human and mouse neuron morphology and electrophysiology data from the [Allen Cell Types Database](https://celltypes.brain-map.org/). The data can be downloaded with the [AllenSDK](https://allensdk.readthedocs.io/en/latest/).

In [None]:
ctc = CellTypesCache(manifest_file="./allen_cell_types_db/manifest.json")

We will select all cells for which there is a reconstructed neuron morphology available

In [None]:
allen_cells = ctc.get_cells(require_reconstruction = True)

In [None]:
print("Total number of cells in the Allen Cell Types Database which have ephys and reconstruction data: %d" % len(allen_cells))

We will be downloading a subset of the data from the Allen Cell Types Database (the first 20 cells)

In [None]:
allen_cells_ids = [c["id"] for c in allen_cells][0:20] # TODO: change to all cells with reconstruction?

Download the reconstructed neuron morphology files (file format: swc)

In [None]:
allen_cells_reconstruction = [ctc.get_reconstruction(i) for i in allen_cells_ids]
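
SWC is a plain-text format: each non-comment line describes one sample point with seven whitespace-separated fields (id, type, x, y, z, radius, parent id), and lines starting with `#` are comments. `ctc.get_reconstruction` already returns a parsed morphology object, so the sketch below is only meant to illustrate the file layout. `parse_swc` is a hypothetical helper and the three-point sample is made up, not Allen data.

```python
def parse_swc(text):
    """Parse SWC text into a dict mapping node id to its fields."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines
        if not line or line.startswith("#"):
            continue
        node_id, node_type, x, y, z, radius, parent = line.split()[:7]
        nodes[int(node_id)] = {
            "type": int(node_type),
            "x": float(x),
            "y": float(y),
            "z": float(z),
            "radius": float(radius),
            "parent": int(parent),  # -1 marks a root node
        }
    return nodes


sample_swc = """# made-up three-point example, not real Allen data
1 1 0.0 0.0 0.0 5.0 -1
2 3 1.0 2.0 0.0 0.5 1
3 3 2.0 4.0 0.0 0.5 2
"""

nodes = parse_swc(sample_swc)
roots = [node_id for node_id, node in nodes.items() if node["parent"] == -1]
print(len(nodes), roots)
```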

Download the trace collection files (file format: nwb)

In [None]:
allen_cells_electrophysiology = [ctc.get_ephys_data(i) for i in allen_cells_ids]

Access the cells.json metadata file

In [9]:
allen_cells_metadata = utils.load_json("./allen_cell_types_db/cells.json")

Display the first element from the cells.json file

In [None]:
allen_cells_metadata[0]
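
Beyond looking at a single record, a quick tally over one field gives a feel for the dataset. The sketch below counts records per value of a field. It assumes the records are flat dictionaries and that a species field named `donor__species` is present; check the key names in the record displayed above before relying on it. The sample records are made up for illustration.

```python
from collections import Counter


def count_by_field(records, field):
    """Count how many records share each value of the given field."""
    return Counter(record.get(field, "unknown") for record in records)


# Made-up records; the field names are assumptions, not copied from cells.json
sample_records = [
    {"specimen__id": 1, "donor__species": "Mus musculus"},
    {"specimen__id": 2, "donor__species": "Homo Sapiens"},
    {"specimen__id": 3, "donor__species": "Mus musculus"},
    {"specimen__id": 4},
]

species_counts = count_by_field(sample_records, "donor__species")
print(species_counts)
```

With the real data, the call would be `count_by_field(allen_cells_metadata, "donor__species")`, assuming that key exists.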

Plot a reconstructed neuron morphology

In [None]:
morphology = allen_cells_reconstruction[0]
fig, axes = plt.subplots(1, 2, sharey=True, sharex=True)
axes[0].set_aspect('equal')
axes[1].set_aspect('equal')

# Make a line drawing of x-y and y-z views
for n in morphology.compartment_list:
    for c in morphology.children_of(n):
        axes[0].plot([n['x'], c['x']], [n['y'], c['y']], color='black')
        axes[1].plot([n['z'], c['z']], [n['y'], c['y']], color='black')

axes[0].set_ylabel('y')
axes[0].set_xlabel('x')
axes[1].set_xlabel('z')
plt.show()

## Step 3: Store electrophysiology and neuron morphology files in Blue Brain Nexus

Store the neuron morphologies using the Blue Brain Nexus default storage

In [None]:
morph_files_metadata = dict()

In [None]:
for cell_id in allen_cells_ids:
    morph_files_metadata = utils.store_allen_files(nexus, cell_id=cell_id, data_type="reconstruction", metadata_dict=morph_files_metadata, org_label=ORGANIZATION, project_label=PROJECT)

In [None]:
utils.save_json(morph_files_metadata, "./morph_files_metadata.json")

Store the electrophysiology files using the Blue Brain Nexus default storage

In [None]:
ephys_files_metadata = dict()

In [None]:
for cell_id in allen_cells_ids:
    ephys_files_metadata = utils.store_allen_files(nexus, cell_id=cell_id, data_type="ephys", metadata_dict=ephys_files_metadata, org_label=ORGANIZATION, project_label=PROJECT)

In [None]:
utils.save_json(ephys_files_metadata, "./ephys_files_metadata.json")

Check out the files in [Nexus Web](https://sandbox.bluebrainnexus.io/web/tutorialnexus)

## Step 4: Map Allen Cell Types Database metadata to Neuroshapes

In [10]:
ephys_files_metadata = utils.load_json("ephys_files_metadata.json")
morphs_files_metadata = utils.load_json("morph_files_metadata.json")

Select the metadata of your subset of cells

In [11]:
subset_allen_cells_metadata = list()
for cell in allen_cells_metadata:
    if str(cell["specimen__id"]) in ephys_files_metadata.keys():
        subset_allen_cells_metadata.append(cell)
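
The `str(...)` conversion in the loop above is presumably needed because JSON object keys are always strings, so the cell ids used as keys in the reloaded files metadata no longer compare equal to numeric ids. The same selection can be sketched as a small function over made-up data; `select_subset` and the sample values are hypothetical.

```python
def select_subset(cells, stored_ids):
    """Keep the cells whose id (as a string) appears in stored_ids."""
    wanted = set(stored_ids)  # iterating a dict yields its keys
    return [cell for cell in cells if str(cell["specimen__id"]) in wanted]


# Made-up sample data for illustration
sample_cells = [{"specimen__id": 101}, {"specimen__id": 102}, {"specimen__id": 103}]
sample_files_metadata = {"101": {}, "103": {}}  # string keys, as after a JSON round trip

subset = select_subset(sample_cells, sample_files_metadata)
print([cell["specimen__id"] for cell in subset])
```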

In [12]:
mapping = mapper.Mapper(deployment=DEPLOYMENT, org_label=ORGANIZATION, project_label=PROJECT)

Map the metadata provided by the Allen Cell Types Database to Neuroshapes

In [13]:
metadata_entities = mapping.allencelltypesdb2neuroshapes(PROJECT, subset_allen_cells_metadata)

In [14]:
# TODO: explore the metadata entities

Add experimental protocol information to the metadata entities

In [15]:
experiment = neuroshapes.Experiment(PROJECT) # experiment.experimentalprotocol

In [16]:
ephys_experimental_protocol = experiment.experimentalprotocol(name="Technical White Paper: Electrophysiology",
                                                            at_id="http://help.brain-map.org/download/attachments/8323525/CellTypes_Ephys_Overview.pdf?version=2&modificationDate=1508180425883&api=v2",
                                                            author_id="https://www.grid.ac/institutes/grid.417881.3",
                                                            author_type="Organization",
                                                            description="Protocol used to generate Allen Cell Types Database")

In [17]:
metadata_entities.append(ephys_experimental_protocol)

In [18]:
reconstruction_experimental_protocol = experiment.experimentalprotocol(name="Technical White Paper: Cell Morphology and Histology",
                                                            at_id="http://help.brain-map.org/download/attachments/8323525/CellTypes_Morph_Overview.pdf?version=4&modificationDate=1528310097913&api=v2",
                                                            author_id="https://www.grid.ac/institutes/grid.417881.3",
                                                            author_type="Organization",
                                                            description="Protocol used to generate Allen Cell Types Database")

In [19]:
metadata_entities.append(reconstruction_experimental_protocol)

In [20]:
utils.save_json(metadata_entities, "./metadata_entities.json")
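
`utils.save_json` and `utils.load_json` come from the workshop's `Nexus.Utils` module, whose source is not shown in this notebook. Helpers of this kind could be as simple as the sketch below, which uses only the standard library; the real implementations may differ.

```python
import json
import os
import tempfile


def save_json(data, path):
    """Write data to path as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)


def load_json(path):
    """Read JSON from path and return the decoded Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Round trip through a temporary directory so nothing is left on disk
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metadata_entities.json")
    save_json([{"name": "example"}], path)
    restored = load_json(path)

print(restored)
```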

## Step 5: Store mapped Allen Cell Types Database metadata in Blue Brain Nexus

In [None]:
utils.store_allen_metadata(nexus, ORGANIZATION, PROJECT, metadata_entities, ephys_files_metadata, morphs_files_metadata)

Check out the metadata in [Nexus Web](https://sandbox.bluebrainnexus.io/web/tutorialnexus)

## Step 6: Find and download stored data using SPARQL

Define the properties you want to filter by

In [None]:
data_type = "nsg:ReconstructedNeuronMorphology"
brain_region_layer = "\"layer 5\""
brain_region = "" # TODO: Add option to filter by brain region
apical_dendrite = "\"intact\""

Provide the SPARQL query

In [None]:
sparqlview_endpoint = f"{DEPLOYMENT}/views/{ORGANIZATION}/{PROJECT}/graph/sparql"

In [None]:
nexus_df = utils.query_data(sparqlview_endpoint, data_type, brain_region_layer, apical_dendrite, TOKEN)
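
`utils.query_data` hides both the SPARQL query and the parsing of the response. SPARQL endpoints commonly return the W3C SPARQL 1.1 Query Results JSON format; assuming the view does too, the sketch below shows how such a response could be flattened into rows. `bindings_to_rows` is a hypothetical helper, the sample response is made up, and the column names `entity` and `downloadUrl` are taken from the cells that follow.

```python
def bindings_to_rows(sparql_json):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    variables = sparql_json["head"]["vars"]
    rows = []
    for binding in sparql_json["results"]["bindings"]:
        # A variable may be unbound in a given solution; use None in that case
        rows.append({
            var: binding[var]["value"] if var in binding else None
            for var in variables
        })
    return rows


# Made-up response in the SPARQL 1.1 Query Results JSON layout
sample_response = {
    "head": {"vars": ["entity", "downloadUrl"]},
    "results": {
        "bindings": [
            {
                "entity": {"type": "uri", "value": "https://example.org/entity/1"},
                "downloadUrl": {"type": "uri", "value": "https://example.org/files/1"},
            },
            {
                "entity": {"type": "uri", "value": "https://example.org/entity/2"},
            },
        ]
    },
}

rows = bindings_to_rows(sample_response)
print(rows)
```

A list of dictionaries like `rows` can be passed directly to `pd.DataFrame(rows)` to obtain a table similar to `nexus_df`.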

In [None]:
if nexus_df is not None:
    print("Results stats: ")
    display(nexus_df.describe())
    print("Results : ")
    display(nexus_df.head(5))
    entities = set(nexus_df["entity"])
    print("Number of entities: %s" % (len(entities)))
else:
    print("No result was found")

Download the selected reconstructed neuron morphologies

In [None]:
data_dir = "./Download/"

In [None]:
# Create the download directory if it does not exist yet
os.makedirs(data_dir, exist_ok=True)

In [None]:
download_urls = list(set(nexus_df["downloadUrl"]))

In [None]:
print(f"Number of download links: {len(download_urls)}")
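
`list(set(...))` removes duplicate URLs, but the resulting order is not guaranteed to match the order of the query results. If a stable order matters, for example to compare logs between runs, `dict.fromkeys` is one alternative that keeps the first occurrence of each value. The sketch below uses made-up URLs and a hypothetical helper name.

```python
def unique_in_order(values):
    """Remove duplicates while keeping the first occurrence of each value."""
    return list(dict.fromkeys(values))


# Made-up URLs for illustration
sample_urls = [
    "https://example.org/files/b",
    "https://example.org/files/a",
    "https://example.org/files/b",
]

print(unique_in_order(sample_urls))
```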

In [None]:
for url in download_urls:
    try:
        response = nexus.files.fetch(ORGANIZATION, PROJECT, file_id=url, out_filepath=data_dir)
    except nexus.HTTPError as e:
        # Report which download failed, then show the error returned by Nexus
        print(e)
        print(f"Failed to download: {url}")
        print("----")
        nexus.tools.pretty_print(e.response.json())

In [None]:
# TODO: Fetch one and plot it again

In [None]:
# TODO: Download
#- get one and plot it
#- add the file extension?