# NanoSolveIT KB and eNanoMapper Ontology
This notebook generates visualizations of how eNanoMapper ontology terms are used to describe nanomaterials and measurement variables in the NanoSolveIT Knowledge Base.

# Imports

In [1]:
import numpy as np
import pandas as pd
import re
import plotly.express as px
import preprocess as prep

# Data preprocessing
The data received from the partners is the result of the following query against the NanoSolveIT Knowledge Base:
> Which eNM-ontology terms are used in the NanoSolveIT KB to describe either a Nanomaterial or any Measurement Variable?

The SPARQL query used for this purpose did not return the data in the shape most useful for these visualizations, so some preprocessing was needed. The steps are kept in the separate `preprocess` module (imported above as `prep`) for reference.

In [2]:
data = prep.do_preprocess()
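
The cells further down read the columns `iri`, `label`, `nanomaterial`, `variable`, `dataset` and `oecd_guideline` from `data`, and treat an empty string as "not applicable". The following is a minimal sketch of that assumed shape with a quick column check. The rows are made-up placeholders rather than KB content, and `check_columns` is a hypothetical helper, not part of `preprocess`.

```python
import pandas as pd

# Columns the later cells read from `data` (inferred from the sunburst cell).
EXPECTED_COLUMNS = ["iri", "label", "nanomaterial", "variable", "dataset", "oecd_guideline"]


def check_columns(frame: pd.DataFrame) -> list:
    """Return the expected columns that are missing from `frame`."""
    return [col for col in EXPECTED_COLUMNS if col not in frame.columns]


# Placeholder rows for illustration only; the IRIs and labels are invented.
toy = pd.DataFrame({
    "iri": ["term:A", "term:B"],
    "label": ["label A", "label B"],
    "nanomaterial": ["material 1", ""],
    "variable": ["", "variable 1"],
    "dataset": ["", ""],
    "oecd_guideline": ["", ""],
})

missing = check_columns(toy)
print(missing)
```

Calling `check_columns(data)` right after `prep.do_preprocess()` would give an early warning if the preprocessing output changes shape.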

# Visualization
## Sunburst chart
The following chart uses the [Plotly Sunburst chart](https://plotly.com/python/sunburst-charts/) to visualize how the contents of the NanoSolveIT KB are described by terms in the eNanoMapper ontology. The nanomaterials, measurement variables, datasets, and OECD guidelines are concatenated under the `target` column, and each target is assigned a value of one. From the center outwards, the rings show the root node, the entity type, the term IRI, the term label, and the target.

Click on any ring segment to expand it.

In [5]:
# Preprocess the data to obtain the columns needed for sunburst charting.
# Each subset keeps the rows where one entity column is non-empty and renames that column to "target".
nm = data[data["nanomaterial"] != ""].drop(["variable", "dataset", "oecd_guideline"], axis=1).rename(columns={"nanomaterial": "target"})
nm["type"] = "Nanomaterial"
# Named `variables` rather than `vars` to avoid shadowing the Python built-in
variables = data[data["variable"] != ""].drop(["nanomaterial", "dataset", "oecd_guideline"], axis=1).rename(columns={"variable": "target"})
variables["type"] = "Variable"
oecd = data[data["oecd_guideline"] != ""].drop(["nanomaterial", "dataset", "variable"], axis=1).rename(columns={"oecd_guideline": "target"})
oecd["type"] = "OECD Guideline"
dataset = data[data["dataset"] != ""].drop(["nanomaterial", "oecd_guideline", "variable"], axis=1).rename(columns={"dataset": "target"})
dataset["type"] = "Dataset"
datasb = pd.concat([nm, variables, oecd, dataset])
# Constant root node and a unit weight, so that every target counts once
datasb["eNanoMapper terms"] = "eNanoMapper terms"
datasb["Number of observations"] = 1
# Drawing the figure
fig = px.sunburst(datasb,
                  path=["eNanoMapper terms", "type", "iri", "label", "target"], values="Number of observations",
                  width=1920, height=1080)
fig.show()
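
The filter, drop, rename and concatenate steps in the cell above amount to reshaping a wide table into a long one. As an alternative sketch, and not the approach this notebook uses, the same long format could be produced with `DataFrame.melt`. The toy rows are invented placeholders, and the column names are assumed to match those of `data`.

```python
import pandas as pd

# Placeholder rows with the same assumed columns as `data`.
toy = pd.DataFrame({
    "iri": ["term:A", "term:B", "term:B"],
    "label": ["label A", "label B", "label B"],
    "nanomaterial": ["material 1", "", ""],
    "variable": ["", "variable 1", ""],
    "dataset": ["", "", "dataset 1"],
    "oecd_guideline": ["", "", ""],
})

# Map each source column to the display name used in the "type" ring.
type_names = {
    "nanomaterial": "Nanomaterial",
    "variable": "Variable",
    "oecd_guideline": "OECD Guideline",
    "dataset": "Dataset",
}

# Wide to long: one row per (term, source column) pair.
long_form = toy.melt(
    id_vars=["iri", "label"],
    value_vars=list(type_names),
    var_name="type",
    value_name="target",
)
# Keep only the pairs that actually carry a target.
long_form = long_form[long_form["target"] != ""].copy()
long_form["type"] = long_form["type"].map(type_names)
long_form["Number of observations"] = 1

print(long_form)
```

The row order differs from the concatenated frame, which should not matter for a sunburst chart because segments are grouped by the `path` columns.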


The previous figure does include the nanomaterials described by eNanoMapper terms in the NanoSolveIT Knowledge Base, but they are too few to be visible there. The following chart shows the nanomaterials only. Again, the outer ring represents the nanomaterial being described.

In [4]:
# Root-node column (not used in the path below) and a unit weight per nanomaterial
nm["eNanoMapper terms"] = "eNanoMapper terms"
nm["value"] = 1
# `nm` has no "Number of observations" column, so the chart is weighted by "value"
fig = px.sunburst(nm,
                  path=["type", "iri", "label", "target"], values="value",
                  width=800, height=800)
fig.show()

TBD:
- Add tables, etc. for further visualization.
- Map the terms onto the ontology (perhaps using Protégé).
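
As one possible starting point for the tables item, the long-format frame could be summarized by counting the distinct targets per entity type and term. This is only a sketch on invented placeholder rows, assuming the `type`, `iri`, `label` and `target` columns built for the sunburst chart. The `n_targets` column name is made up for illustration.

```python
import pandas as pd

# Placeholder long-format rows; in the notebook this would be `datasb`.
long_form = pd.DataFrame({
    "type": ["Nanomaterial", "Variable", "Variable", "Dataset"],
    "iri": ["term:A", "term:B", "term:B", "term:B"],
    "label": ["label A", "label B", "label B", "label B"],
    "target": ["material 1", "variable 1", "variable 2", "dataset 1"],
})

# Count distinct targets described by each term, per entity type.
summary = (
    long_form.groupby(["type", "label"])["target"]
    .nunique()
    .reset_index(name="n_targets")
    .sort_values("n_targets", ascending=False)
)

print(summary)
```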