## 04.05.2020

This notebook has been made runnable earlier than originally planned.

As a result, it comes with some prerequisites and adaptations that work around features which are not available yet.

Note:
- Mappings have not been reviewed yet ([DKE-143](https://bbpteam.epfl.ch/project/issues/browse/DKE-143))
- **The configuration `bbp-staging-forge.yml` should be completed with the values available at the time of running.**

Prerequisites:
- The Python package `kgforge` is installed.
- This notebook is running from the subdirectory `/examples/notebooks/use-cases/` of the cloned package repository.
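
The two prerequisites above could be checked before running anything else. The following is a minimal sketch using only the standard library; the helper `missing_prerequisites` and the constant `EXPECTED_SUBDIRECTORY` are hypothetical names introduced for illustration and are not part of `kgforge`.

```python
import importlib.util
from pathlib import Path

# Hypothetical constant for this sketch: the last three path components
# of the subdirectory named in the prerequisites.
EXPECTED_SUBDIRECTORY = ("examples", "notebooks", "use-cases")


def missing_prerequisites(working_directory, package="kgforge"):
    """Return a list of human-readable problems (empty if all checks pass)."""
    problems = []
    # find_spec returns None when a top-level package cannot be found.
    if importlib.util.find_spec(package) is None:
        problems.append(f"Python package '{package}' is not installed")
    # Compare only the trailing components, so the clone location does not matter.
    if Path(working_directory).parts[-3:] != EXPECTED_SUBDIRECTORY:
        problems.append("notebook is not running from examples/notebooks/use-cases")
    return problems


for problem in missing_prerequisites(Path.cwd()):
    print("Prerequisite not met:", problem)
```

Nothing is printed when both prerequisites are met.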

Adaptations:
- In a code cell, a commented-out line shows the targeted behaviour, which is not available yet. The line that follows it is the replacement to use in the meantime.

---

## DISCLAIMER

This notebook might not be executable as-is when you try to run it (missing data, access, implementations, ...).

The goal here is to demonstrate what can be done with the framework in specific use cases.

---

## Content

This notebook focuses on demonstrating an end-to-end workflow for the Blue Brain Knowledge Graph.

- Tools installation

- Session configuration

- A - Data Integration
  - Retrieve human neuron morphologies from the Allen Cell Types Database
  - Load the complete metadata of the neuron morphologies from Allen
  - Consult the list of managed data sources and mappings to the Blue Brain Knowledge Graph
  - Map the neuron morphologies from Allen to the Blue Brain Knowledge Graph
  - Verify the created entities from Allen conform to the Blue Brain Knowledge Graph Schema
  - Add the created entities from Allen to the Blue Brain Knowledge Graph


- B - Data Exploration
  - Discover which neuron morphologies are already in the Blue Brain Knowledge Graph
  - Select neuron morphologies in the cortical layer V and with intact apical dendrites
  - Regroup as a dataset a selection of neuron morphologies from the Blue Brain Knowledge Graph
  - Give the first revision of the dataset a human-friendly name


- C - Data Analytics
  - Retrieve a specific dataset at a given version from the Blue Brain Knowledge Graph
  - Download the reconstruction files of the neuron morphologies of the dataset
  - Perform a topological analysis of the neuron morphologies from the dataset
  - Register the analysis result with its provenance into the Blue Brain Knowledge Graph

---

## Tools installation

In [None]:
# ! pip install kgforge allensdk tmd
! pip install allensdk tmd

---

## Session configuration

In [None]:
import getpass

In [None]:
from kgforge.core import KnowledgeGraphForge

In [None]:
from kgforge.specializations.resources import Dataset

Please enter your BBP token:

In [None]:
token = getpass.getpass()
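
When the notebook is run non-interactively, the prompt above cannot be answered. A possible workaround, sketched here under assumptions, is to read the token from an environment variable first and to fall back to the prompt otherwise. The variable name `BBP_TOKEN` and the helper `read_token` are hypothetical and chosen for this sketch only.

```python
import getpass
import os

# Hypothetical variable name, chosen for this sketch only.
TOKEN_ENV_VAR = "BBP_TOKEN"


def read_token(env_var=TOKEN_ENV_VAR, prompt=getpass.getpass):
    """Return the token from the environment, or ask for it without echoing it."""
    value = os.environ.get(env_var, "").strip()
    # Fall back to the interactive prompt when the variable is unset or empty.
    return value if value else prompt("Token: ").strip()
```

With this helper, `token = read_token()` would replace the cell above; the `prompt` parameter exists only so the fallback can be substituted in automated runs.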

In [None]:
# forge = KnowledgeGraphForge("bbp-prod-forge.yml", bucket="bbp/<project>", token=token)
forge = KnowledgeGraphForge("bbp-staging-forge.yml", token=token)

---

## A - Data Integration

### Retrieve human neuron morphologies from the Allen Cell Types Database

In [None]:
from allensdk.core.cell_types_cache import CellTypesCache

In [None]:
from allensdk.api.queries.cell_types_api import CellTypesApi

In [None]:
ALLEN_DIR = "allen_cell_types_database"

In [None]:
ctc = CellTypesCache(manifest_file=f"{ALLEN_DIR}/manifest.json")

In [None]:
human_cells = ctc.get_cells(species=[CellTypesApi.HUMAN], require_reconstruction=True)

In [None]:
CELLS_LIMIT = 2

In [None]:
human_cell_ids = [x["id"] for x in human_cells][:CELLS_LIMIT]

In [None]:
human_cell_reconstructions = [ctc.get_reconstruction(x) for x in human_cell_ids]

### Load the complete metadata of the neuron morphologies from Allen

In [None]:
import json

In [None]:
with open(f"{ALLEN_DIR}/cells.json") as f:
    allen_cell_types_metadata = json.load(f)

In [None]:
human_cell_metadata = [x for x in allen_cell_types_metadata if x["specimen__id"] in human_cell_ids]
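
The filter above tests membership in a list, which is fine for `CELLS_LIMIT = 2`. If the limit were raised substantially, converting the identifiers to a set first would keep each lookup constant-time. The sketch below illustrates this on toy records: only the `specimen__id` key mirrors the metadata used above, while the values and the `note` key are made up.

```python
# Toy records: only the "specimen__id" key mirrors the real metadata; the rest is made up.
toy_metadata = [
    {"specimen__id": 1, "note": "first"},
    {"specimen__id": 2, "note": "second"},
    {"specimen__id": 3, "note": "third"},
]
toy_selected_ids = [3, 1]

# Build the set once, then filter; the result keeps the order of the metadata list.
selected = set(toy_selected_ids)
toy_selection = [x for x in toy_metadata if x["specimen__id"] in selected]

print([x["specimen__id"] for x in toy_selection])  # prints [1, 3]
```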

### Consult the list of managed data sources and mappings to the Blue Brain Knowledge Graph

In [None]:
forge.sources()

In [None]:
DATA_SOURCE = "allen-cell-types-database"

In [None]:
forge.mappings(DATA_SOURCE)

### Map the neuron morphologies from Allen to the Blue Brain Knowledge Graph

In [None]:
subject_mapping = forge.mapping("Subject", DATA_SOURCE)

In [None]:
print(subject_mapping)

In [None]:
patchedcell_mapping = forge.mapping("PatchedCell", DATA_SOURCE)

In [None]:
print(patchedcell_mapping)

In [None]:
neuronmorphology_mapping = forge.mapping("NeuronMorphology", DATA_SOURCE)

In [None]:
print(neuronmorphology_mapping)

In [None]:
mappings = [subject_mapping, patchedcell_mapping, neuronmorphology_mapping]

In [None]:
resources = forge.map(human_cell_metadata, mappings)

### Verify the created entities from Allen conform to the Blue Brain Knowledge Graph Schema

In [None]:
forge.validate(resources)

### Add the created entities from Allen to the Blue Brain Knowledge Graph

In [None]:
forge.register(resources)

---

## B - Data Exploration

In [None]:
# p = forge.paths("NeuronMorphology")
p = forge.paths("ReconstructedPatchedCell")

### Discover which neuron morphologies are already in the Blue Brain Knowledge Graph

In [None]:
# results = forge.search(p.type == "NeuronMorphology")
results = forge.search(p.type == "ReconstructedPatchedCell")

In [None]:
len(results)

In [None]:
DISPLAY_LIMIT = 25

In [None]:
forge.as_dataframe(results[:DISPLAY_LIMIT])

### Select neuron morphologies in the cortical layer V and with intact apical dendrites

In [None]:
neuronmorphologies = forge.search(p.type == "NeuronMorphology",
                                  p.brainLocation.layer.label == "5",
                                  p.apicalDendrite == "intact")

In [None]:
len(neuronmorphologies)

In [None]:
forge.as_dataframe(neuronmorphologies[:DISPLAY_LIMIT])

### Regroup as a dataset a selection of neuron morphologies from the Blue Brain Knowledge Graph

In [None]:
from uuid import uuid4

In [None]:
DATASET_ID = forge.format("identifier", "datasets", str(uuid4()))

In [None]:
# FIXME According to Neuroshapes on 09.08.2019, 'subject', 'brainLocation' are also required.
# FIXME Property 'hasPart' is not part of the Dataset shape at the moment (09.08.2019).
dataset = Dataset(forge,
                  id=DATASET_ID,
                  type=["Dataset", "NeuronMorphology"],
                  name="All layer 5 morphologies with intact apical dendrites",
                  description="Neuron morphologies to be used for Topological Morphology Descriptor analysis")

In [None]:
CONTRIBUTOR_NAME = "Jane Doe"

In [None]:
agent = forge.resolve(CONTRIBUTOR_NAME, scope="entities", target="agents", type="Person")

In [None]:
AGENT_ID = agent.id

In [None]:
dataset.add_contribution(AGENT_ID)

In [None]:
dataset.add_parts(neuronmorphologies)

In [None]:
forge.register(dataset)

### Give the first revision of the dataset a human-friendly name

In [None]:
VERSION_NAME = "v2019-08-20"

In [None]:
forge.tag(dataset, VERSION_NAME)

---

## C - Data Analytics

In [None]:
import tmd

In [None]:
from tmd.view import plot

### Retrieve a specific dataset at a given version from the Blue Brain Knowledge Graph

In [None]:
dataset = forge.retrieve(id=DATASET_ID, version=VERSION_NAME)

### Download the reconstruction files of the neuron morphologies of the dataset

In [None]:
DOWNLOAD_DIR = f"./reconstructions_{VERSION_NAME}/"

In [None]:
dataset.download("parts", DOWNLOAD_DIR)

### Perform a topological analysis of the neuron morphologies from the dataset

In [None]:
pop = tmd.io.load_population(DOWNLOAD_DIR)

In [None]:
phs = [tmd.methods.get_persistence_diagram(x.apical[0]) for x in pop.neurons]

In [None]:
phs_flattened = tmd.analysis.collapse(phs)

#### Visualize the persistence diagram

In [None]:
plot.diagram(phs_flattened)

#### Visualize the persistence barcode

In [None]:
plot.barcode(phs_flattened)

#### Visualize and save the persistence image

In [None]:
ANALYSIS_DIR = "analysis"

In [None]:
ANALYSIS_FILENAME = "persistence_image"

In [None]:
plot.persistence_image(phs_flattened, output_path=ANALYSIS_DIR, output_name=ANALYSIS_FILENAME)

### Register the analysis result with its provenance into the Blue Brain Knowledge Graph

In [None]:
# FIXME According to Neuroshapes on 09.08.2019, 'used' (2 times) and 'generated' are also required.
analysis = Dataset(forge, type=["Dataset", "Analysis"], name="Persistence image")

In [None]:
analysis.add_contribution(AGENT_ID)

In [None]:
analysis.add_derivation(dataset)

In [None]:
analysis.add_distribution(f"./{ANALYSIS_DIR}/{ANALYSIS_FILENAME}")

In [None]:
forge.register(analysis)