# eNanoMapper API guide

## eNanoMapper database background

- FP7 project eNanoMapper http://www.enanomapper.net/
- eNanoMapper database implementation: AMBIT software http://ambit.sf.net
- publication https://www.beilstein-journals.org/bjnano/articles/6/165
- eNanoMapper prototype database https://data.enanomapper.net 

## Nanosafety data 

- NanoSafety data compiled in eNanoMapper databases: https://search.data.enanomapper.net/ 
- Each project's data is imported into its own eNanoMapper instance, e.g. https://apps.ideaconsult.net/nanoreg1
  - AMBIT REST API 
- An aggregated search view across multiple databases is available at https://search.data.enanomapper.net
  - Solr REST API
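
As a rough illustration of how a per-instance AMBIT request is addressed, the sketch below only assembles a URL and does not contact the server. The `query/study` path and the `media`, `page` and `pagesize` parameters are taken from the request log printed later in this notebook; the helper name `ambit_url` is hypothetical and not part of `solrscope`.

```python
from urllib.parse import urlencode

# Base URL of one eNanoMapper instance, as listed above.
AMBIT_ROOT = "https://apps.ideaconsult.net/nanoreg1"


def ambit_url(root, resource, **params):
    """Build a GET URL for an AMBIT resource (hypothetical helper).

    JSON output and simple paging are assumed as defaults; any keyword
    argument overrides or extends them.
    """
    query = {"media": "application/json", "page": 0, "pagesize": 10}
    query.update(params)
    return "{}/{}?{}".format(root.rstrip("/"), resource.strip("/"), urlencode(query))


facets_url = ambit_url(AMBIT_ROOT, "query/study", pagesize=1000)
print(facets_url)
```

The same helper could address other resources of the instance, for example `ambit_url(AMBIT_ROOT, "substance", search="NPO_401", type="substancetype")`.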


## eNanoMapper data model

![data model](http://ambit.sourceforge.net/enanomapper/templates/images/data_model.png)
http://ambit.sourceforge.net/enanomapper/templates/convertor_how.html

## eNanoMapper database API

Swagger API docs at http://enanomapper.github.io/API/

In [1]:
from solrscope import aa
import ipywidgets as widgets
from ipywidgets import interact, interactive, fixed, interact_manual
from IPython.display import display  # explicit import so display() also works outside Jupyter
import requests
from importlib import reload 
from solrscope import client_solr
from solrscope import client_ambit
from solrscope import annotation
import pandas as pd
import numpy as np
import json
import warnings
warnings.filterwarnings("ignore")

In [2]:
print('Select eNanoMapper database instance:')

#config="enanomapper_private.yaml"
config="enanomapper_public.yaml"
#config="enm_composite_private.yaml"

def search_service_protected(url,apikey):
    return (url,apikey)
def search_service_open(url):
    return (url)    


style = {'description_width': 'initial'}
config,config_servers, config_security, auth_object, msg = aa.parseOpenAPI3(config=config)    
service_widget = widgets.Dropdown(
    options=config_servers['url'],
    description='Service:',
    disabled=False,
    style=style
)
if config_security is None:
    service = interactive(search_service_open,url=service_widget)
else:
    print(msg)
    apikey_widget=widgets.Text(
            placeholder='',
            description=config_security,
            disabled=False,
            style=style
    )    
    service = interactive(search_service_protected,url=service_widget,apikey=apikey_widget)    

display(service)


Select eNanoMapper database instance:



### What is in the database?

In [6]:
service_uri=service_widget.value
if auth_object is not None:
    auth_object.setKey(apikey_widget.value)

cli_facets = client_ambit.AMBITFacets(service_uri)
r = cli_facets.get(page=0,pagesize=1000,auth=auth_object)
if r.status_code==200:

    facets = cli_facets.parse(r.json())    
    print(json.dumps(facets, indent=4))   
else:
    facets = None
    print(r.status_code)

Sending request to https://apps.ideaconsult.net/nanoreg1/query/study params {'media': 'application/json', 'page': 0, 'pagesize': 1000}
[
    {
        "value": "6.1.5. Toxicity to aquatic algae and cyanobacteria",
        "endpoint": "EC_ALGAETOX_SECTION",
        "count": 20,
        "substancescount": -1,
        "uri": "https://apps.ideaconsult.net/nanoreg1/substance?type=endpointcategory&search=EC_ALGAETOX_SECTION",
        "subcategory": "ECOTOX",
        "subcategoryuri": "https://apps.ideaconsult.net/nanoreg1/substance?type=topcategory&search=ECOTOX",
        "bundles": {}
    },
    {
        "value": "6.1.3. Short-term toxicity to aquatic inverterbrates",
        "endpoint": "EC_DAPHNIATOX_SECTION",
        "count": 30,
        "substancescount": -1,
        "uri": "https://apps.ideaconsult.net/nanoreg1/substance?type=endpointcategory&search=EC_DAPHNIATOX_SECTION",
        "subcategory": "ECOTOX",
        "subcategoryuri": "https://apps.ideaconsult.net/nanoreg1/substance?type=

In [7]:
df=pd.DataFrame(facets)
display(df[["subcategory","endpoint","value","count"]])

| | subcategory | endpoint | value | count |
|---|---|---|---|---|
| 0 | ECOTOX | EC_ALGAETOX_SECTION | 6.1.5. Toxicity to aquatic algae and cyanobact... | 20 |
| 1 | ECOTOX | EC_DAPHNIATOX_SECTION | 6.1.3. Short-term toxicity to aquatic inverter... | 30 |
| 2 | ECOTOX | EC_SOILDWELLINGTOX_SECTION | 6.3.1. Toxicity to soil macroorganisms | 12 |
| 3 | P-CHEM | ANALYTICAL_METHODS_SECTION | CHMO_0001075. Analytical Methods | 85 |
| 4 | P-CHEM | ASPECT_RATIO_SHAPE_SECTION | 4.27. Nanomaterial aspect ratio/shape | 13 |
| 5 | P-CHEM | CRYSTALLINE_PHASE_SECTION | 4.25. Nanomaterial crystalline phase | 112 |
| 6 | P-CHEM | CRYSTALLITE_AND_GRAIN_SIZE_SECTION | 4.26. Nanomaterial crystallite and grain size | 6 |
| 7 | P-CHEM | DUSTINESS_SECTION | 4.31. Nanomaterial dustiness | 22 |
| 8 | P-CHEM | ENM_0000081_SECTION | ENM_0000081. Batch Dispersion quality | 168 |
| 9 | P-CHEM | ENM_8000223_SECTION | ENM_8000223. Aerosol characterisation | 29 |
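
The parsed facets are plain dictionaries, so they can be summarised without pandas. The sketch below totals the study counts per top-level category, using a handful of records copied from the listing above; the helper name `count_by_subcategory` is illustrative only.

```python
from collections import defaultdict

# A few facet records copied from the listing above (only the fields used here).
facets_sample = [
    {"subcategory": "ECOTOX", "endpoint": "EC_ALGAETOX_SECTION", "count": 20},
    {"subcategory": "ECOTOX", "endpoint": "EC_DAPHNIATOX_SECTION", "count": 30},
    {"subcategory": "ECOTOX", "endpoint": "EC_SOILDWELLINGTOX_SECTION", "count": 12},
    {"subcategory": "P-CHEM", "endpoint": "DUSTINESS_SECTION", "count": 22},
    {"subcategory": "P-CHEM", "endpoint": "CRYSTALLINE_PHASE_SECTION", "count": 112},
]


def count_by_subcategory(records):
    """Sum the 'count' field of facet records per 'subcategory'."""
    totals = defaultdict(int)
    for record in records:
        totals[record["subcategory"]] += record["count"]
    return dict(totals)


totals = count_by_subcategory(facets_sample)
print(totals)
```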


In [8]:
#endpoints
cli_facets = client_ambit.AMBITFacets(service_uri,key="/experiment_endpoints")

r = cli_facets.get(page=0,pagesize=100,params={"top":"TOX"},auth=auth_object)
if r.status_code==200:

    facets = cli_facets.parse(r.json())    
    #print(json.dumps(facets, indent=4))   
    df=pd.DataFrame(facets)
    display(df)
else:
    facets = None
    print(r.status_code)

Sending request to https://apps.ideaconsult.net/nanoreg1/query/experiment_endpoints params {'top': 'TOX', 'media': 'application/json', 'page': 0, 'pagesize': 100}


| | category | count | endpoint | endpointtype | synonyms | top | unit | value |
|---|---|---|---|---|---|---|---|---|
| 0 | ENM_0000037_SECTION | 191 | PERCENTAGE OF CONTROL | DOSERESPONSE | [http://ncicb.nci.nih.gov/xml/owl/EVS/Thesauru... | TOX | % | ENM_0000037. Oxidative Stress |
| 1 | ENM_0000044_SECTION | 60 | AVERAGE EPITHELIAL HEIGHT PER TISSUE | DOSERESPONSE | [] | TOX | um | ENM_0000044. Barrier integrity |
| 2 | ENM_0000044_SECTION | 58 | AVERAGE OF DEEPEST PENETRATION DEPTH (PER TISSUE) | DOSERESPONSE | [] | TOX | um | ENM_0000044. Barrier integrity |
| 3 | ENM_0000044_SECTION | 65 | INTEGRITY CONTROL QUALITY: MEAN TEER VALUES BE... | AGGREGATED | [] | TOX | ohms/cm2 | ENM_0000044. Barrier integrity |
| 4 | ENM_0000044_SECTION | 65 | MAXIMUM TEER VALUES BEFORE STARTING NPS EXPOSURE | AGGREGATED | [http://purl.enanomapper.org/onto/ENM_8000301] | TOX | ohms/cm2 | ENM_0000044. Barrier integrity |
| 5 | ENM_0000044_SECTION | 57 | MEAN OF BASOLATERAL PAPP THROUGH CELL FREE INS... | AGGREGATED | [] | TOX | cm/s | ENM_0000044. Barrier integrity |
| 6 | ENM_0000044_SECTION | 57 | MEAN OF BASOLATERAL PARACELLULAR MARKER THROUG... | AGGREGATED | [] | TOX | % | ENM_0000044. Barrier integrity |
| 7 | ENM_0000044_SECTION | 222 | MEAN OF BASOLATERAL PARACELLULAR MARKER THROUG... | DOSERESPONSE | [] | TOX | % | ENM_0000044. Barrier integrity |
| 8 | ENM_0000044_SECTION | 222 | MEAN OF PARACELLULAR MARKER PAPP THROUGH EPITH... | DOSERESPONSE | [] | TOX | cm/s | ENM_0000044. Barrier integrity |
| 9 | ENM_0000044_SECTION | 29 | MEAN PARACELLULAR MARKER PAPP THROUGH EPTHELIUM | DOSERESPONSE | [] | TOX | cm/s | ENM_0000044. Barrier integrity |
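
The calls above fetch a single page (`page=0`). When a listing is longer than one page, the `page` and `pagesize` parameters can be stepped through in a loop. The sketch below is a generic paginator under one assumption that the source does not confirm: that a page shorter than `pagesize` marks the end of the listing. `fetch_page` is a hypothetical callable; in this notebook it could wrap `cli_facets.get(...)` followed by `cli_facets.parse(...)`. A stand-in is used here so the example runs offline.

```python
def iter_pages(fetch_page, pagesize=100, max_pages=50):
    """Yield records page by page until a short or empty page is returned.

    fetch_page(page, pagesize) must return a list of records.
    max_pages is a safety cap against endless loops.
    """
    for page in range(max_pages):
        records = fetch_page(page, pagesize)
        yield from records
        if len(records) < pagesize:
            break


def fake_fetch(page, pagesize, total=250):
    """Stand-in for a real request: serves integers 0..total-1 in pages."""
    start = page * pagesize
    return list(range(start, min(start + pagesize, total)))


records = list(iter_pages(fake_fetch, pagesize=100))
print(len(records))
```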


### Substance queries
#### All gold nanoparticles

In [10]:
materialtype="NPO_401"

a = annotation.DictionarySubstancetypes()
print(">>> Looking for {}".format(a.annotate(materialtype)))

service_uri=service_widget.value

cli_materials = client_ambit.AMBITSubstance(service_uri)
r = cli_materials.get(params={'search': materialtype,'type' : 'substancetype'},page=0,pagesize=10,auth=auth_object)
if r.status_code==200:

    substances = cli_materials.parse(r.json())    
    print(json.dumps(substances, indent=4))    
else:
    substances = None
    print(r.status_code)

>>> Looking for gold nanoparticle
Sending request to https://apps.ideaconsult.net/nanoreg1/substance params {'search': 'NPO_401', 'type': 'substancetype', 'media': 'application/json', 'page': 0, 'pagesize': 10}
[
    {
        "URI": "https://apps.ideaconsult.net/nanoreg1/substance/NNRG-9204c38f-ac08-e003-bd75-97239871d602",
        "ownerUUID": "NNRG-04022171-cf37-b07b-b933-a6b8d57883ed",
        "ownerName": "NANoREG",
        "i5uuid": "NNRG-9204c38f-ac08-e003-bd75-97239871d602",
        "name": "Au 13 nm",
        "publicname": "Au@PBPK",
        "format": "TNOEXP",
        "substanceType": "NPO_401",
        "referenceSubstance": {
            "i5uuid": null,
            "uri": "https://apps.ideaconsult.net/nanoreg1/query/compound/search/all?search=null"
        },
        "composition": [],
        "externalIdentifiers": [
            {
                "type": "Material code",
                "id": "Au@PBPK"
            },
            {
                "type": "NANoREG supplier",

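Each substance record is a nested dictionary. The sketch below reduces one record to a flat summary, using the fields visible in the (truncated) output above; the `summarize` helper is hypothetical, and only the first external identifier from the output is reproduced.

```python
# Fields copied from the first record of the output above.
substance_sample = {
    "URI": "https://apps.ideaconsult.net/nanoreg1/substance/NNRG-9204c38f-ac08-e003-bd75-97239871d602",
    "ownerName": "NANoREG",
    "name": "Au 13 nm",
    "publicname": "Au@PBPK",
    "substanceType": "NPO_401",
    "externalIdentifiers": [{"type": "Material code", "id": "Au@PBPK"}],
}


def summarize(substance):
    """Flatten a substance record into a small dictionary (hypothetical helper)."""
    identifiers = {
        item["type"]: item["id"] for item in substance.get("externalIdentifiers", [])
    }
    return {
        "uuid": substance["URI"].rsplit("/", 1)[-1],  # last path segment of the URI
        "name": substance["name"],
        "publicname": substance["publicname"],
        "type": substance["substanceType"],
        "owner": substance["ownerName"],
        "material_code": identifiers.get("Material code"),
    }


summary = summarize(substance_sample)
print(summary)
```
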
#### Retrieve physchem data for selected substances

In [11]:
endpointcategory='PC_GRANULOMETRY_SECTION'
a = annotation.DictionaryEndpointCategory()
print(">>> Looking for {}".format(a.annotate(endpointcategory)))

for substance in substances:
    print(substance['URI'])    
    cli = client_ambit.AMBITSubstanceStudy(substance['URI'])
    r = cli.get(params={'category': endpointcategory,'top' : 'P-CHEM'},page=0,pagesize=10,auth=auth_object)
    #print(r.json())
    print(json.dumps(r.json(), indent=4))    

>>> Looking for http://purl.obolibrary.org/obo/CHMO_0002119
https://apps.ideaconsult.net/nanoreg1/substance/NNRG-9204c38f-ac08-e003-bd75-97239871d602
Sending request to https://apps.ideaconsult.net/nanoreg1/substance/NNRG-9204c38f-ac08-e003-bd75-97239871d602/study params {'category': 'PC_GRANULOMETRY_SECTION', 'top': 'P-CHEM', 'media': 'application/json', 'page': 0, 'pagesize': 10}
{
    "study": [
        {
            "uuid": "NRSZ-55e3f22e-f795-ae96-2ae5-a7c103503358",
            "investigation_uuid": null,
            "assay_uuid": "609caf44-b732-b054-dda5-860569e7e92e",
            "owner": {
                "substance": {
                    "uuid": "NNRG-9204c38f-ac08-e003-bd75-97239871d602"
                },
                "company": {
                    "uuid": "NNRG-04022171-cf37-b07b-b933-a6b8d57883ed",
                    "name": "NANoREG"
                }
            },
            "citation": {
                "title": "Provided",
                "year": "0",
       

#### Substance compositions

In [12]:
reload(client_ambit)
for substance in substances:
  
    print(substance['URI'])    
    cli = client_ambit.AMBITSubstanceComposition(substance['URI'])
    r = cli.get(auth=auth_object)
    compositions = cli.parse(r.json())
    for composition in compositions:
        print("-------------------------------------------------------------------------")
        print(composition['relation'])
        print(composition['proportion'])        
        print(composition['component']['compound']['cas'])
        print(composition['component']['compound']['name'])
        
        cli_cmp = client_ambit.AMBITCompound(root_uri=composition['component']['compound']['URI'],resource=None)
        response = cli_cmp.get(media="chemical/x-mdl-sdfile",pagesize=1)
        
        if response.status_code == 200:
            print(response.text) 
            

https://apps.ideaconsult.net/nanoreg1/substance/NNRG-9204c38f-ac08-e003-bd75-97239871d602
Sending request to https://apps.ideaconsult.net/nanoreg1/substance/NNRG-9204c38f-ac08-e003-bd75-97239871d602/composition params {'media': 'application/json', 'page': 0, 'pagesize': 10}
-------------------------------------------------------------------------
HAS_CORE
{'typical': {'precision': None, 'value': 0.0, 'unit': None}, 'real': {'lowerPrecision': None, 'lowerValue': 0.0, 'upperPrecision': None, 'upperValue': 0.0, 'unit': None}, 'function_as_additive': None}


Sending request to https://apps.ideaconsult.net/nanoreg1/compound/29726 params {'media': 'chemical/x-mdl-sdfile', 'page': 0, 'pagesize': 1}

  CDK     0611191232

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 Au  0  0  0  0  0  0  0  0  0  0  0  0
M  END
$$$$
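
The structure returned with `media="chemical/x-mdl-sdfile"` is an MDL molfile (V2000 connection table). A minimal sketch of reading the atom symbols from it follows, assuming the standard fixed-width layout: three header lines, a counts line, then one line per atom with the element symbol in columns 32 to 34. The function name is illustrative; for real work a cheminformatics toolkit would be the safer choice.

```python
def parse_molfile_atoms(molfile_text):
    """Return (atom symbols, bond count) from a V2000 molfile."""
    lines = molfile_text.splitlines()
    counts = lines[3]              # fourth line is the counts line
    n_atoms = int(counts[0:3])     # fixed-width fields, three characters each
    n_bonds = int(counts[3:6])
    symbols = [line[31:34].strip() for line in lines[4:4 + n_atoms]]
    return symbols, n_bonds


# The gold core structure as printed above.
gold_molfile = "\n".join([
    "",
    "  CDK     0611191232",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 Au  0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "$$$$",
])

symbols, n_bonds = parse_molfile_atoms(gold_molfile)
print(symbols, n_bonds)
```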


#### Investigation results in tabular form

In [13]:
reload(client_ambit)
cli_investigation= client_ambit.AMBITInvestigation(service_uri)
r = cli_investigation.get(params={'search': endpointcategory,'type' : 'bystudytype'},page=0,pagesize=100,auth=auth_object)
if r.status_code==200:

    results = cli_investigation.parse(r.json())    
    print(json.dumps(results, indent=4))    
else:
    results = None
    print(r.status_code)

Sending request to https://apps.ideaconsult.net/nanoreg1/investigation params {'search': 'PC_GRANULOMETRY_SECTION', 'type': 'bystudytype', 'media': 'application/json', 'page': 0, 'pagesize': 100}
[
    {
        "name": "NM-402 (MWCNT 12.7 nm)",
        "publicname": "JRCNM04002a",
        "owner_name": "NANoREG",
        "topcategory": "P-CHEM",
        "endpointcategory": "PC_GRANULOMETRY_SECTION",
        "endpoint": "SIZE",
        "document_uuid": "NRDM-00000000-0000-0000-0000-000000000001",
        "guidance": "DLS",
        "reference": "final test live version",
        "reference_owner": "TNO",
        "idresult": 425931,
        "effectendpoint": "HYDRODYNAMIC DIAMETER",
        "unit": "nm",
        "loValue": 546.33,
        "errQualifier": "SD",
        "err": 0.13,
        "type_s": "study",
        "s_uuid": "NNRG-ea97c99b-e936-7dcf-b048-1ef314545e86",
        "substanceType": "NPO_354",
        "reference_year": 2016,
        "content": "JRC - IHCP",
        "iuuid": "3

In [14]:
df=pd.DataFrame(results)
display(df.head())

Unnamed: 0,_childDocuments_,auuid,content,document_uuid,effectendpoint,endpoint,endpointcategory,err,errQualifier,guidance,...,reference_year,resultgroup,resulttype,s_uuid,studyResultType,substanceType,topcategory,type_s,unit,updated
0,[{'document_uuid': 'NRDM-00000000-0000-0000-00...,D2E2DC2E0AE0FE3D12CDCDEBA31209AE,JRC - IHCP,NRDM-00000000-0000-0000-0000-000000000001,HYDRODYNAMIC DIAMETER,SIZE,PC_GRANULOMETRY_SECTION,0.13,SD,DLS,...,2016,1,Z-AVERAGE,NNRG-ea97c99b-e936-7dcf-b048-1ef314545e86,Measured,NPO_354,P-CHEM,study,nm,2017-08-01 12:00:00
1,[{'document_uuid': 'NRDM-00000000-0000-0000-00...,D2E2DC2E0AE0FE3D12CDCDEBA31209AE,JRC - IHCP,NRDM-00000000-0000-0000-0000-000000000001,GLOBAL MEAN SIZE,SIZE,PC_GRANULOMETRY_SECTION,0.113,SD,DLS,...,2016,1,INTENSITY-WEIGHTED,NNRG-ea97c99b-e936-7dcf-b048-1ef314545e86,Measured,NPO_354,P-CHEM,study,nm,2017-08-01 12:00:00
2,[{'document_uuid': 'NRDM-00000000-0000-0000-00...,D2E2DC2E0AE0FE3D12CDCDEBA31209AE,JRC - IHCP,NRDM-00000000-0000-0000-0000-000000000001,GLOBAL MEAN SIZE,SIZE,PC_GRANULOMETRY_SECTION,,,DLS,...,2016,2,VOLUME-WEIGHTED,NNRG-ea97c99b-e936-7dcf-b048-1ef314545e86,Measured,NPO_354,P-CHEM,study,cm,2017-08-01 12:00:00
3,[{'document_uuid': 'NRDM-00000000-0000-0000-00...,D2E2DC2E0AE0FE3D12CDCDEBA31209AE,JRC - IHCP,NRDM-00000000-0000-0000-0000-000000000001,GLOBAL MEAN SIZE,SIZE,PC_GRANULOMETRY_SECTION,0.153,SD,DLS,...,2016,3,,NNRG-ea97c99b-e936-7dcf-b048-1ef314545e86,Measured,NPO_354,P-CHEM,study,nm,2017-08-01 12:00:00
4,[{'document_uuid': 'NRDM-00000000-0000-0000-00...,D2E2DC2E0AE0FE3D12CDCDEBA31209AE,JRC - IHCP,NRDM-00000000-0000-0000-0000-000000000001,GLOBAL MEAN SIZE,SIZE,PC_GRANULOMETRY_SECTION,0.123,SD,DLS,...,2016,1,INTENSITY-WEIGHTED,NNRG-ea97c99b-e936-7dcf-b048-1ef314545e86,Measured,NPO_354,P-CHEM,study,nm,2017-08-01 12:00:00


# Aggregated search

- Solr-powered free-text and faceted search over several eNanoMapper database instances
- See https://search.data.enanomapper.net (web app) and https://api.ideaconsult.net (API access)


### Service selection

In [None]:
print('Select eNanoMapper aggregated search service:')
style = {'description_width': 'initial'}
config,config_servers, config_security, auth_object, msg = aa.parseOpenAPI3()    
service_widget = widgets.Dropdown(
    options=config_servers['url'],
    description='Service:',
    disabled=False,
    style=style
)
if config_security is None:
    service = interactive(search_service_open,url=service_widget)
else:
    print(msg)
    apikey_widget=widgets.Text(
            placeholder='',
            description=config_security,
            disabled=False,
            style=style
    )    
    service = interactive(search_service_protected,url=service_widget,apikey=apikey_widget)    

display(service)

In [None]:
service_uri=service_widget.value
print("Sending queries to {}".format(service_uri))
if auth_object is not None:
    auth_object.setKey(apikey_widget.value)


### Faceted search 

#### [Facets] Number of substances per project

In [17]:
facets = client_solr.Facets()
query=facets.getQuery(query="*:*",facets=["dbtag_hss"],fq="type_s:substance")
#print(query)
r = client_solr.post(service_uri,query=query,auth=auth_object)
response_json=r.json()
print(response_json)
if r.status_code==200:
    facets.parse(response_json['facets'])
else:
    print(r.status_code)

{'responseHeader': {'zkConnected': True, 'status': 0, 'QTime': 4, 'params': {'q': '*:*', 'json.facet': '{field1: {type:terms,field:dbtag_hss ,limit : -1, mincount:1, missing:true }}', 'fq': 'type_s:substance', 'rows': '0', 'wt': 'json'}}, 'response': {'numFound': 156, 'start': 0, 'docs': []}, 'facets': {'count': 156, 'field1': {'missing': {'count': 0}, 'buckets': [{'val': 'NNRG', 'count': 156}]}}}
	()'_'	156	ALL
		('_',)'NNRG'	156	field1
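
The raw response printed above shows the shape of a Solr JSON facet result: each named facet holds a list of `buckets`, and each bucket has a `val`, a `count` and, for nested facets, further named facets. A minimal sketch of walking that structure is given below; it is not the `solrscope` implementation of `facets.parse`, only an illustration of the traversal.

```python
def walk_facets(node, path=()):
    """Yield (path, value, count) for every bucket in a Solr JSON facet result."""
    for name, child in node.items():
        # Skip scalar entries ('count', 'val') and non-bucket dicts ('missing').
        if not isinstance(child, dict) or "buckets" not in child:
            continue
        for bucket in child["buckets"]:
            yield path, bucket["val"], bucket["count"]
            # Nested facets live inside the bucket itself.
            yield from walk_facets(bucket, path + (bucket["val"],))


# The 'facets' part of the response printed above.
facets_response = {
    "count": 156,
    "field1": {"missing": {"count": 0}, "buckets": [{"val": "NNRG", "count": 156}]},
}

rows = list(walk_facets(facets_response))
print(rows)
```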


#### [Facets] Number of material types per project

In [18]:
query=facets.getQuery(query="*:*",facets=["dbtag_hss","substanceType_hs"],fq="type_s:substance")
#print(query)
r = client_solr.post(service_uri,query=query,auth=auth_object)
response_json=r.json()
if r.status_code==200:
    facets.parse(response_json['facets'])
else:
    print(r.status_code)

	()'_'	156	ALL
		('_',)'NNRG'	156	field2
			('_', 'NNRG')'NPO_1373'	34	field1
			('_', 'NNRG')'NPO_354'	30	field1
			('_', 'NNRG')'NPO_1486'	19	field1
			('_', 'NNRG')'NPO_1892'	13	field1
			('_', 'NNRG')'CHEBI_133349'	8	field1
			('_', 'NNRG')'NPO_1542'	8	field1
			('_', 'NNRG')'CHEBI_3311'	7	field1
			('_', 'NNRG')'ENM_9000006'	5	field1
			('_', 'NNRG')'NPO_1373	'	5	field1
			('_', 'NNRG')'CHEBI:133349'	4	field1
			('_', 'NNRG')'CHEBI_59999'	3	field1
			('_', 'NNRG')'CHEBI:18246'	2	field1
			('_', 'NNRG')'CHEBI_133326'	2	field1
			('_', 'NNRG')'CHEBI_51135'	2	field1
			('_', 'NNRG')'NPO_1550'	2	field1
			('_', 'NNRG')'CHEBI_133333'	1	field1
			('_', 'NNRG')'CHEBI_133337'	1	field1
			('_', 'NNRG')'CHEBI_133340'	1	field1
			('_', 'NNRG')'CHEBI_18246'	1	field1
			('_', 'NNRG')'CHEBI_33418'	1	field1
			('_', 'NNRG')'CHEBI_36973'	1	field1
			('_', 'NNRG')'ENM_9000007'	1	field1
			('_', 'NNRG')'NPO_157'	1	field1
			('_', 'NNRG')'NPO_401'	1	field1
			('_', 'NNRG')'NPO_606'	1	field1
			('_',

In [19]:
a = annotation.DictionarySubstancetypes()
term=a.annotate("NPO_354")
print(term)
term=a.annotate("NPO_1373")
print(term)


multi-walled carbon nanotube
silicon dioxide nanoparticle
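
The material-type listing above contains codes that look like spelling variants of each other (`CHEBI:133349` next to `CHEBI_133349`, and `NPO_1373` with trailing whitespace). Assuming these really denote the same terms, which the source does not state, the counts could be merged after a simple normalisation, as sketched here.

```python
from collections import Counter

# Counts copied from the material-type facet listing above.
raw_counts = [
    ("NPO_1373", 34),
    ("NPO_1373\t", 5),      # same code with trailing whitespace
    ("CHEBI_133349", 8),
    ("CHEBI:133349", 4),    # colon instead of underscore
    ("CHEBI_18246", 1),
    ("CHEBI:18246", 2),
]


def normalize_code(code):
    """Strip surrounding whitespace and use '_' as the prefix separator."""
    return code.strip().replace(":", "_")


merged = Counter()
for code, count in raw_counts:
    merged[normalize_code(code)] += count

print(dict(merged))
```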


#### [Facets] Get all cell types

In [20]:
reload(client_solr)

facets = client_solr.Facets()
query=facets.getQuery(query="*:*",facets=["E.cell_type_s"],fq="type_s:params")
#print(query)
r = client_solr.post(service_uri,query=query,auth=auth_object)
response_json=r.json()
if r.status_code==200:
    facets.parse(response_json['facets'])
else:
    print(r.status_code)


	()'_'	38766	ALL
		('_',)'A549'	9877	field1
		('_',)'CACO-2'	4287	field1
		('_',)'THP-1'	3359	field1
		('_',)'GF'	1369	field1
		('_',)'BEAS-2B'	1192	field1
		('_',)'TK6'	864	field1
		('_',)'RAW 264.7'	710	field1
		('_',)'U937'	588	field1
		('_',)'SAOS'	380	field1
		('_',)'A549:THP1'	327	field1
		('_',)'BAL'	124	field1
		('_',)'V79'	104	field1
		('_',)'LIVER CELLS'	98	field1
		('_',)'HEPG2'	97	field1
		('_',)'PERITONEAL PRIMARY MACROPHAGES'	96	field1
		('_',)'LUNG CELLS'	86	field1
		('_',)'3T3'	72	field1
		('_',)'CAKI-1'	56	field1
		('_',)'BONE MARROW CELLS'	55	field1
		('_',)'CALU-3'	42	field1
		('_',)'HEP3B'	42	field1
		('_',)'NHBE'	36	field1
		('_',)'HMDM'	24	field1
		('_',)'MLN'	12	field1
		('_',)'SPLEEN CELLS'	12	field1
		('_',)'BRAIN CELL'	9	field1
		('_',)'OVARY CELLS'	6	field1
		('_',)'PERIPHERAL BLOOD - LEUKOCYTES'	6	field1
		('_',)'_'	14836	field1


#### [Facets] Get all protocols per endpoint for titanium dioxide nanoparticles (NPO_1486)

In [21]:
fields=["topcategory_s","endpointcategory_s","guidance_s"]
query=facets.getQuery(query="substanceType_s:NPO_1486",fq="type_s:study",facets=fields)
print(query)
r = client_solr.post(service_uri,query=query,auth=auth_object)
print(r.status_code)
if r.status_code==200:
    facets.parse(r.json()['facets'])
else:
    print(r.status_code)

{'q': 'substanceType_s:NPO_1486', 'fq': 'type_s:study', 'wt': 'json', 'json.facet': '{field3: {type:terms,field:topcategory_s ,limit : -1, mincount:1, missing:true , facet:{field2: {type:terms,field:endpointcategory_s ,limit : -1, mincount:1, missing:true , facet:{field1: {type:terms,field:guidance_s ,limit : -1, mincount:1, missing:true }}}}}}', 'rows': 0}
200
	()'_'	8850	ALL
		('_',)'TOX'	5676	field3
			('_', 'TOX')'ENM_0000068_SECTION'	2250	field2
				('_', 'TOX', 'ENM_0000068_SECTION')'CELL COUNT'	357	field1
				('_', 'TOX', 'ENM_0000068_SECTION')'IMPEDANCE ADHERENT CELLS'	291	field1
				('_', 'TOX', 'ENM_0000068_SECTION')'MTS'	277	field1
				('_', 'TOX', 'ENM_0000068_SECTION')'COLONY FORMING'	260	field1
				('_', 'TOX', 'ENM_0000068_SECTION')'ALAMAR BLUE'	251	field1
				('_', 'TOX', 'ENM_0000068_SECTION')'NUCLEAR AREA'	105	field1
				('_', 'TOX', 'ENM_0000068_SECTION')'NUCLEAR INTENSITY'	105	field1
				('_', 'TOX', 'ENM_0000068_SECTION')'LDH'	98	field1
				('_', 'TOX', 'ENM_000006
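
The printed query shows how the list of fields becomes a nested `json.facet` expression: the first field is the outermost facet (`field3`) and the last field the innermost (`field1`). The sketch below rebuilds an equivalent structure as strict JSON with `json.dumps`; it is an illustration of the nesting, not the code behind `facets.getQuery`, and the function name is hypothetical.

```python
import json


def nested_terms_facet(fields):
    """Nest one Solr terms facet per field, first field outermost."""
    facet = None
    # Build from the innermost (last) field outwards.
    for index, field in enumerate(reversed(fields), start=1):
        level = {
            "type": "terms",
            "field": field,
            "limit": -1,      # no limit on the number of buckets
            "mincount": 1,    # skip empty buckets
            "missing": True,  # also count documents without the field
        }
        if facet is not None:
            level["facet"] = facet
        facet = {"field{}".format(index): level}
    return facet


fields = ["topcategory_s", "endpointcategory_s", "guidance_s"]
facet = nested_terms_facet(fields)
query = {
    "q": "substanceType_s:NPO_1486",
    "fq": "type_s:study",
    "wt": "json",
    "rows": 0,
    "json.facet": json.dumps(facet),
}
print(query["json.facet"])
```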

#### [Facets] Get all methods

In [22]:
fields=["topcategory_s","endpointcategory_s","E.method_s","E.sop_reference_s"]
query=facets.getQuery(query="*:*",fq="type_s:params",facets=fields)
print(query)
r = client_solr.post(service_uri,query=query,auth=auth_object)
print(r.status_code)
if r.status_code==200:
    facets.parse(r.json()['facets'])
else:
    print(r.status_code)

{'q': '*:*', 'fq': 'type_s:params', 'wt': 'json', 'json.facet': '{field4: {type:terms,field:topcategory_s ,limit : -1, mincount:1, missing:true , facet:{field3: {type:terms,field:endpointcategory_s ,limit : -1, mincount:1, missing:true , facet:{field2: {type:terms,field:E.method_s ,limit : -1, mincount:1, missing:true , facet:{field1: {type:terms,field:E.sop_reference_s ,limit : -1, mincount:1, missing:true }}}}}}}}', 'rows': 0}
200
	()'_'	38766	ALL
		('_',)'TOX'	26235	field4
			('_', 'TOX')'ENM_0000068_SECTION'	10990	field3
				('_', 'TOX', 'ENM_0000068_SECTION')'MTS CELL VIABILITY ASSAY'	1735	field2
					('_', 'TOX', 'ENM_0000068_SECTION', 'MTS CELL VIABILITY ASSAY')'NANOVALID SOP FOR MTS CELL VIABILITY ASSAY'	714	field1
					('_', 'TOX', 'ENM_0000068_SECTION', 'MTS CELL VIABILITY ASSAY')'CIRCABC WP5'	484	field1
					('_', 'TOX', 'ENM_0000068_SECTION', 'MTS CELL VIABILITY ASSAY')'MTS CELL VIABILITY FOR NP-TREATED A549 CELLS'	195	field1
					('_', 'TOX', 'ENM_0000068_SECTION', 'MTS CE

#### [Facets] Get all material types

In [23]:
fields=["substanceType_hs","publicname_hs","name_hs","dbtag_hss"]
query=facets.getQuery(fq="type_s:substance",facets=fields)
#print(query)
r = client_solr.post(service_uri,query=query,auth=auth_object)
print(r.status_code)
if r.status_code==200:
    facets.parse(r.json()['facets'])
else:
    print(r.status_code)

200
	()'_'	156	ALL
		('_',)'NPO_1373'	34	field4
			('_', 'NPO_1373')'JRCNM02000a'	3	field3
				('_', 'NPO_1373', 'JRCNM02000a')'NM-200 (silica 18.3 nm)'	2	field2
					('_', 'NPO_1373', 'JRCNM02000a', 'NM-200 (silica 18.3 nm)')'NNRG'	2	field1
				('_', 'NPO_1373', 'JRCNM02000a')'NM-200 (Synthetic Amorphous Silica PR-A-02 )'	1	field2
					('_', 'NPO_1373', 'JRCNM02000a', 'NM-200 (Synthetic Amorphous Silica PR-A-02 )')'NNRG'	1	field1
			('_', 'NPO_1373')'JRCNM02003a'	3	field3
				('_', 'NPO_1373', 'JRCNM02003a')'NM-203 (SiO2 13-45 nm)'	1	field2
					('_', 'NPO_1373', 'JRCNM02003a', 'NM-203 (SiO2 13-45 nm)')'NNRG'	1	field1
				('_', 'NPO_1373', 'JRCNM02003a')'NM-203 (Synthetic Amorphous Silica PY-A-04)'	1	field2
					('_', 'NPO_1373', 'JRCNM02003a', 'NM-203 (Synthetic Amorphous Silica PY-A-04)')'NNRG'	1	field1
				('_', 'NPO_1373', 'JRCNM02003a')'NM-203 (silica 24.7 nm)'	1	field2
					('_', 'NPO_1373', 'JRCNM02003a', 'NM-203 (silica 24.7 nm)')'NNRG'	1	field1
			('_', 'NPO_1373')'SiO2@IIT 5

#### [Facets] Get all endpoints for nanotubes

In [24]:
query=facets.getQuery(query="carbon nanotube",facets=["topcategory_s","endpointcategory_s","effectendpoint_s","unit_s"],fq="type_s:study")
#print(query)
r = client_solr.post(service_uri,query=query,auth=auth_object)
print(r.status_code)
#print(r.json()['facets'])
if r.status_code==200:
    facets.parse(r.json()['facets'])
else:
    print(r.status_code)

200
	()'_'	1930	ALL
		('_',)'TOX'	1238	field4
			('_', 'TOX')'TO_GENETIC_IN_VIVO_SECTION'	472	field3
				('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION')'MEDIAN TAIL INTENSITY BAL'	96	field2
					('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION', 'MEDIAN TAIL INTENSITY BAL')'%'	96	field1
				('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION')'MEDIAN TAIL INTENSITY LIVER'	96	field2
					('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION', 'MEDIAN TAIL INTENSITY LIVER')'%'	96	field1
				('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION')'MEDIAN TAIL INTENSITY LUNG'	96	field2
					('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION', 'MEDIAN TAIL INTENSITY LUNG')'%'	96	field1
				('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION')'MEDIAN TAIL INTENSITY SPLEEN'	96	field2
					('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION', 'MEDIAN TAIL INTENSITY SPLEEN')'%'	96	field1
				('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION')'MEDIAN TAIL INTENSITY BLOOD'	88	field2
					('_', 'TOX', 'TO_GENETIC_IN_VIVO_SECTION', 'MEDIAN TAIL INTENSITY BLOOD')'%'	88	field1
			('_', 'TOX

### Retrieve experimental data

#### Physchem example - MWCNT size

In [25]:
reload(client_solr)
study = client_solr.StudyDocuments()
study_filter = {'topcategory_s':'P-CHEM', 'endpointcategory_s':'PC_GRANULOMETRY_SECTION' }
study.setStudyFilter(study_filter)
print(study.getSettings())
#all MWCNT NPO_354
query = study.getQuery(textfilter='substanceType_s:NPO_354',rows=10000)
r = client_solr.post(service_uri,query=query,auth=auth_object)

{'studyfilter': '   topcategory_s:P-CHEM AND endpointcategory_s:PC_GRANULOMETRY_SECTION', 'query_organism': None, 'endpointfilter': None, 'query_guidance': None, 'fields': None}
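
The `studyfilter` setting printed above is an ordinary Solr query string in which the dictionary entries are joined with `AND`. A minimal sketch of that conversion follows; the function names are hypothetical, and quoting values that contain spaces is an addition of this sketch rather than something the output shows. Embedded double quotes are not escaped.

```python
def quote_value(value):
    """Wrap values containing spaces in double quotes (phrase syntax)."""
    text = str(value)
    return '"{}"'.format(text) if " " in text else text


def study_filter_query(criteria):
    """Join field:value pairs with AND, in dictionary order."""
    return " AND ".join(
        "{}:{}".format(field, quote_value(value)) for field, value in criteria.items()
    )


filter_query = study_filter_query(
    {"topcategory_s": "P-CHEM", "endpointcategory_s": "PC_GRANULOMETRY_SECTION"}
)
print(filter_query)
```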


In [26]:
#parse the data
if r.status_code==200:
    study = client_solr.StudyDocuments()
    rows = study.parse(r.json()['response']['docs'])
    df = study.rows2frame(rows)
    rows=None
    uuids = ['uuid.investigation','uuid.assay','uuid.document','uuid.substance']
    df = df.sort_values(by=uuids)  # sort_values returns a new frame, so keep the result
    display(df.head(50))
else:
    print(r.status_code)

Unnamed: 0,db,m.materialprovider,m.public.name,m.substance.name,m.substance.type,p.guidance,p.oht.module,p.oht.section,p.reference,p.reference_year,...,x.params.T.instrumentmodel,x.params.Vial,x.params.Vial_d,x.params.concentration_UNIT,x.params.concentration_d,x.params.guidance,xR.purposeFlag,xR.reliability,xR.studyResultType,xx.QualityRemark
0,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_1,2015,...,,not recorded,,,,DLS,,,Measured,
1,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_1,2015,...,,not recorded,,,,DLS,,,Measured,
2,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_1,2015,...,,not recorded,,,,DLS,,,Measured,
3,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_1,2015,...,,not recorded,,,,DLS,,,Measured,
4,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_2,2015,...,,not recorded,,,,DLS,,,Measured,
5,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_2,2015,...,,not recorded,,,,DLS,,,Measured,
6,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_2,2015,...,,not recorded,,,,DLS,,,Measured,
7,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_2,2015,...,,not recorded,,,,DLS,,,Measured,
8,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_3,2015,...,,not recorded,,,,DLS,,,Measured,
9,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,DLS,P-CHEM,PC_GRANULOMETRY_SECTION,Ecotox dispersion_NM400_3,2015,...,,not recorded,,,,DLS,,,Measured,


In [27]:
#Group by material and endpoint
groups=[]

groups.append("m.public.name")
#groups.append("x.params.E.method")
#groups.append("p.guidance")
groups.append("x.params.MEDIUM")
groups.append("value.endpoint")
groups.append("value.endpoint_type")
groups.append("value.unit")
print(groups)

tmp=df.groupby(by=groups).agg({"value.range.lo" : ["mean","std","count"]}).reset_index()
display(tmp)

['m.public.name', 'x.params.MEDIUM', 'value.endpoint', 'value.endpoint_type', 'value.unit']


| | m.public.name | x.params.MEDIUM | value.endpoint | value.endpoint_type | value.unit | value.range.lo (mean) | value.range.lo (std) | value.range.lo (count) |
|---|---|---|---|---|---|---|---|---|
| 0 | JRCNM04000a | BEGM | GLOBAL MEAN SIZE | INTENSITY-WEIGHTED | nm | 1397.65 | 1970.525794 | 6 |
| 1 | JRCNM04000a | BEGM | HYDRODYNAMIC DIAMETER | Z-AVERAGE | nm | 4504.5 | 3270.368863 | 2 |
| 2 | JRCNM04000a | DMEM INSA | GLOBAL MEAN SIZE | INTENSITY-WEIGHTED | nm | 2710.825 | 2211.633713 | 4 |
| 3 | JRCNM04000a | DMEM INSA | HYDRODYNAMIC DIAMETER | Z-AVERAGE | nm | 601.7 | 244.941789 | 2 |
| 4 | JRCNM04000a | RPMI 1640 glutamax/Hepes 89% + | GLOBAL MEAN SIZE | INTENSITY-WEIGHTED | nm | 1142.323571 | 2404.909784 | 28 |
| 5 | JRCNM04000a | RPMI 1640 glutamax/Hepes 89% + | HYDRODYNAMIC DIAMETER | Z-AVERAGE | nm | 717.118 | 932.453894 | 10 |
| 6 | JRCNM04000a | STOCK | SIZE DISTRIBUTION | MEAN | nm | 199.0 | 0.0 | 7 |
| 7 | JRCNM04000a | STOCK | SIZE DISTRIBUTION | MODE | nm | 152.0 | 0.0 | 7 |
| 8 | JRCNM04000a | Water | GLOBAL MEAN SIZE | | nm | 36.0 | 8.485281 | 2 |
| 9 | JRCNM04000a | Water | GLOBAL MEAN SIZE | INTENSITY-WEIGHTED | nm | 1095.253643 | 1516.256331 | 14 |
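
For readers who want to see what the `groupby(...).agg(["mean", "std", "count"])` step computes, here is a standard-library sketch of the same aggregation. The input rows are hypothetical values invented for the example, not data from the database. As in pandas, the standard deviation is the sample standard deviation, and it is undefined (NaN) for a group with a single value.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical measurements, invented for illustration only.
rows = [
    {"m.public.name": "JRCNM04000a", "x.params.MEDIUM": "Water", "value.range.lo": 30.0},
    {"m.public.name": "JRCNM04000a", "x.params.MEDIUM": "Water", "value.range.lo": 42.0},
    {"m.public.name": "JRCNM04000a", "x.params.MEDIUM": "STOCK", "value.range.lo": 199.0},
]


def group_stats(rows, keys, value_key):
    """Group rows by the given keys and compute mean, sample std and count."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row[value_key])
    stats = {}
    for key, values in groups.items():
        stats[key] = {
            "mean": mean(values),
            # Sample standard deviation needs at least two values.
            "std": stdev(values) if len(values) > 1 else float("nan"),
            "count": len(values),
        }
    return stats


stats = group_stats(rows, ["m.public.name", "x.params.MEDIUM"], "value.range.lo")
for key, value in stats.items():
    print(key, value)
```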


#### Tox example - MWCNT cell viability

In [28]:
reload(client_solr)
study = client_solr.StudyDocuments()
study_filter = {'topcategory_s':'TOX', 'endpointcategory_s':'ENM_0000068_SECTION' }
study.setStudyFilter(study_filter)
print(study.getSettings())
#all MWCNT NPO_354
query = study.getQuery(textfilter='substanceType_s:NPO_354',rows=10000)
r = client_solr.post(service_uri,query=query,auth=auth_object)

{'studyfilter': '   topcategory_s:TOX AND endpointcategory_s:ENM_0000068_SECTION', 'query_organism': None, 'endpointfilter': None, 'query_guidance': None, 'fields': None}


In [29]:
#parse the data
if r.status_code==200:
    study = client_solr.StudyDocuments()
    rows = study.parse(r.json()['response']['docs'])
    df = study.rows2frame(rows)
    rows=None
    uuids = ['uuid.investigation','uuid.assay','uuid.document','uuid.substance']
    df = df.sort_values(by=uuids)  # sort_values returns a new frame, so keep the result
    display(df.head(50))
else:
    print(r.status_code)

Unnamed: 0,db,m.materialprovider,m.public.name,m.substance.name,m.substance.type,p.guidance,p.oht.module,p.oht.section,p.reference,p.reference_year,...,x.params.Material state,x.params.Provider,x.params.T.instrumentmodel,x.params.Vial,x.params.Vial_d,x.params.guidance,xR.purposeFlag,xR.reliability,xR.studyResultType,xx.QualityRemark
0,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_24h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
1,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_24h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
2,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_24h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
3,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_24h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
4,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_24h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
5,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_24h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
6,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_24h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
7,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_48h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
8,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_48h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
9,NNRG,NANoREG,JRCNM04000a,NM-400 (MWCNT 13.6 nm),NPO_354,ALAMAR BLUE,TOX,ENM_0000068_SECTION,NM400_KI_48h,2015,...,,,,JRCNM04000a020019,,ALAMAR BLUE,,,Measured,
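
The parsing cell above takes the document list from the `response.docs` part of the Solr JSON reply. The sketch below illustrates only that access path and the sorting step, using plain pandas. It does not reproduce what `StudyDocuments.parse` or `rows2frame` do internally. The payload and its field names (`id`, `name_s`, `value_d`) are made up for illustration and are not taken from the eNanoMapper index.

```python
import pandas as pd

# Hypothetical payload shaped like a Solr JSON reply; field names and values
# are illustrative assumptions, not real eNanoMapper records.
payload = {
    "responseHeader": {"status": 0},
    "response": {
        "numFound": 2,
        "start": 0,
        "docs": [
            {"id": "doc-b", "name_s": "sample B", "value_d": 2.5},
            {"id": "doc-a", "name_s": "sample A", "value_d": 1.0},
        ],
    },
}

# Same access path as r.json()['response']['docs'] in the cell above
docs = payload["response"]["docs"]
frame = pd.DataFrame(docs)

# sort_values returns a new DataFrame; assign it to keep the sorted order
frame = frame.sort_values(by=["id"]).reset_index(drop=True)
print(frame)
```

Calling `sort_values` without assigning the result (or passing `inplace=True`) leaves the original frame in its previous order.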


In [30]:
# Columns to group by; comment a line in or out to change the grouping
groups = [
    "m.public.name",
    "uuid.assay",
    "uuid.document",
    #"x.params.E.method",
    #"p.guidance",
    "x.params.MEDIUM",
    "x.params.E.cell_type",
    "x.conditions.material",
    "value.endpoint",
    "value.endpoint_type",
    "value.unit",
]
print(groups)

tmp=df.groupby(by=groups).agg({"value.range.lo" : ["mean","std","count"]}).reset_index()
display(tmp)

['m.public.name', 'uuid.assay', 'uuid.document', 'x.params.MEDIUM', 'x.params.E.cell_type', 'x.conditions.material', 'value.endpoint', 'value.endpoint_type', 'value.unit']


Unnamed: 0_level_0,m.public.name,uuid.assay,uuid.document,x.params.MEDIUM,x.params.E.cell_type,x.conditions.material,value.endpoint,value.endpoint_type,value.unit,value.range.lo,value.range.lo,value.range.lo
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,mean,std,count
0,JRCNM04000a,05616713-a032-9c55-e292-e4c2ca51350b,NRTR-00000000-0000-0000-0000-0000000016b7,CCM,HEPG2,,IC50,AGGREGATED,ug/ml,100.000000,,1
1,JRCNM04000a,05616713-a032-9c55-e292-e4c2ca51350b,NRTR-00000000-0000-0000-0000-0000000016b7,CCM,HEPG2,none,PERCENTAGE VIABILITY COMPARED TO CONTROL,DOSERESPONSE,%,100.000000,,1
2,JRCNM04000a,0c9627fb-5586-771e-2c1c-a3899fb49736,NRTR-00000000-0000-0000-0000-0000000016db,DMEM Hi plus NEAA,A549,NM,PERCENTAGE VIABILITY COMPARED TO CONTROL,DOSERESPONSE,%,102.494596,10.566695,6
3,JRCNM04000a,0c9627fb-5586-771e-2c1c-a3899fb49736,NRTR-00000000-0000-0000-0000-0000000016db,DMEM Hi plus NEAA,A549,NM400,IC20,AGGREGATED,ug/mL,100.000000,,1
4,JRCNM04000a,0c9627fb-5586-771e-2c1c-a3899fb49736,NRTR-00000000-0000-0000-0000-0000000016db,DMEM Hi plus NEAA,A549,NM400,IC50,AGGREGATED,ug/mL,100.000000,,1
5,JRCNM04000a,0c9627fb-5586-771e-2c1c-a3899fb49736,NRTR-00000000-0000-0000-0000-0000000016db,DMEM Hi plus NEAA,A549,Positive control,PERCENTAGE VIABILITY COMPARED TO CONTROL,DOSERESPONSE,%,81.937305,28.201351,6
6,JRCNM04000a,16807101-fb57-4ece-3518-81904010fa56,NRTR-00000000-0000-0000-0000-0000000016f7,DMEM Hi plus NEAA,CACO-2,Positive control,,AGGREGATED,ug/mL,19.132538,,1
7,JRCNM04000a,28e7c43c-5494-6f8e-9c52-63328987ee65,NRTR-00000000-0000-0000-0000-000000001658,DMEM Hi plus NEAA,3T3,,IC50,AGGREGATED,ug/ml,69.380000,,1
8,JRCNM04000a,28e7c43c-5494-6f8e-9c52-63328987ee65,NRTR-00000000-0000-0000-0000-000000001658,DMEM Hi plus NEAA,3T3,none,PERCENTAGE VIABILITY COMPARED TO CONTROL,DOSERESPONSE,%,50.000000,,1
9,JRCNM04000a,2f8ab728-02ee-4b6f-62ba-0e1d5ee982ea,NRTR-00000000-0000-0000-0000-000000001657,DMEM Hi plus NEAA,A549,,IC50,AGGREGATED,ug/ml,100.000000,,1
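
The empty `std` cells in the table above belong to groups with `count` equal to 1. pandas computes the sample standard deviation (`ddof=1`) by default, which is undefined for a single value and is reported as NaN. The sketch below reproduces this on a small invented frame and flattens the two-level column header that `agg` produces. The data values are made up, and only the column names mirror the notebook.

```python
import pandas as pd

# Toy data invented for illustration; only the column names follow the notebook.
toy = pd.DataFrame({
    "m.public.name": ["A", "A", "A", "B"],
    "value.endpoint": ["VIABILITY", "VIABILITY", "IC50", "IC50"],
    "value.unit": ["%", "%", "ug/ml", "ug/ml"],
    "value.range.lo": [90.0, 110.0, 100.0, 50.0],
})

keys = ["m.public.name", "value.endpoint", "value.unit"]
summary = (
    toy.groupby(by=keys)
       .agg({"value.range.lo": ["mean", "std", "count"]})
       .reset_index()
)

# Flatten the two-level header: ('value.range.lo', 'mean') -> 'value.range.lo_mean';
# the grouping columns have an empty second level and keep their names.
summary.columns = ["_".join(filter(None, col)) for col in summary.columns]
print(summary)
```

Grouping keys are compared as exact strings, so spellings such as `ug/ml` and `ug/mL` in the table above end up in separate groups.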



## Annotation examples

In [None]:

reload(annotation)
a = annotation.DictionaryEndpoints()
for endpoint in ["CIRCULARITY","FERET_DIAMETER","IC50"]:
    term=a.annotate(endpoint)
    print(endpoint)
    print(term)
    print(a.getLink(term))


In [None]:
a = annotation.DictionaryCells()
for cell in ["3T3","A549"]:
    term=a.annotate(cell)
    print(cell)
    print(term)
    print(a.getLink(term))

In [None]:
a = annotation.DictionaryAssays()
for assay in ["CFE","Alamar blue","TEM","COMET"]:
    term=a.annotate(assay)
    print(assay)
    print(term)
    print(a.getLink(term))


In [None]:
a = annotation.DictionaryEndpointCategory()
term=a.annotate("PC_GRANULOMETRY_SECTION")
print(term)
print(a.getLink(term))

In [None]:
a = annotation.DictionarySpecies()
term=a.annotate("rat")
print(term)
print(a.getLink(term))

In [None]:
a = annotation.DictionarySubstancetypes()
term=a.annotate("NPO_401")
print(term)
