<html>
    <div style = "display: inline-block; width=150px; height=75px;">
            <h2 style="text-align: center">NIMH Data Archive</h2>
        </div>
    <div>
        <div style="display: inline-block; width=150px; height=75px;">
            <img src="https://ndar.nih.gov/images/ndar/circuit_brain_red.png" alt="NIMH data archive image"/>            
        </div>        
	</div>
    <h2>Background</h2>
    <ul>
    <li>Joint initiative supported by NIMH, NICHD, NINDS, and NIEHS</li>
    <li>Contains data from human subjects related to autism (and control subjects) and other NIMH-funded research and clinical trials</li>
    <li>Data are available to the research community through a relatively simple application process</li>
    <li>Begun in late 2006; the first data were received in 2008, and significant data became available in 2012.</li>
    <li>Data submission every 6 months for NIMH awardees</li>
    </ul>
</html>    

<html>
<h2>New Projects (coming soon)</h2>
    <h3>Adolescent Brain Cognitive Development (ABCD) Study</h3>
    <ul>
    <li>Recruit 10,000 healthy children, ages 9 to 10, across the United States and follow them into early adulthood.</li>
    <li>Use advanced brain imaging to observe brain growth with unprecedented precision.</li>
    <li>Examine how biology and environment interact and relate to developmental outcomes such as physical health, mental health, and life achievements, including academic success.</li>
    <li><a href="http://abcd-study.org/">Read more here...</a></li>
    </ul>
    <h3>Human Connectome Project (HCP)</h3>
    <ul>
    <li>Target of 1,200 healthy adults, with an initial release of 900</li>
    <li>Common data acquisition methods, 3T and 7T scanners, diffusion imaging and resting-state fMRI</li>
    <li><a href="http://www.humanconnectome.org/about/project/">Read more here...</a></li></ul>
</html>

<html>
<h2>Harmonization Standards</h2>
    <ul>
    <li><a href="https://ndar.nih.gov/standards.html#guid">GUID</a></li>
    <li><a href="https://ndar.nih.gov/standards.html#clinical">Data Dictionary</a></li>
    </ul>
</html>

<html>
<div>
<h2>Some Numbers</h2>
<table style="float:left; margin-right:10px;">
<thead>
<tr><td>Shared Status</td><td>Data Type</td><td>Subject Count</td></tr>
</thead>
<tbody>
<tr><td>Shared</td><td>Clinical Assessments</td><td>337782</td></tr>
<tr><td>Shared</td><td>Neurosignaling Recordings</td><td>8082</td></tr>
<tr><td>Shared</td><td>Omics</td><td>37676</td></tr>
<tr><td>Private</td><td>Clinical Assessments</td><td>47661</td></tr>
<tr><td>Private</td><td>Neurosignaling Recordings</td><td>12716</td></tr>
<tr><td>Private</td><td>Omics</td><td>24158</td></tr>
</tbody>
</table>
</div>
</html>

<html>
<div>
    <h3>Available APIs <a href="https://data-archive.nimh.nih.gov/API">https://data-archive.nimh.nih.gov/API</a></h3>
    <ul>
    <li>Data Dictionary <a href="https://ndar.nih.gov/swagger">https://ndar.nih.gov/api/datadictionary</a></li>
    <li>Experiment <a href="https://ndar.nih.gov/swagger">https://ndar.nih.gov/api/experiment</a></li>
    <li>Search <a href="https://ndar.nih.gov/api/search">https://ndar.nih.gov/api/search</a></li>
    <li>GUID <a href="https://ndar.nih.gov/api/guid">https://ndar.nih.gov/api/guid</a></li>
    <li>miNDAR <a href="https://stage.nimhda.org/api/mindar">https://stage.nimhda.org/api/mindar</a></li>
    </ul>
    
    </div>
 </html>

<html>
    <div style = "display: inline-block; width=150px; height=75px;">
        <h2>Data Dictionary Webservice</h2>
        <p>Provides the capability for programmatic interrogation of all available data structures/measures, by name, type, source, and category.  The service also provides element-level detail.</p>
        <p>Additionally, a history of changes can be retrieved for both data structures and elements.</p>
    </div>
    <p>
        <a href="https://ndar.nih.gov/swagger">Swagger User Interface</a>
    </p>
</html>

In [None]:
# Programmatically retrieve data from the dictionary service

import requests
import json
shortname = input('Enter a Data Structure shortname:')
r = requests.get('https://ndar.nih.gov/api/datadictionary/datastructure/{}'
                 .format(shortname),
                  headers={'Accept':'application/json'})
structure = json.loads(r.text)

# Get Data Structure change history

r = requests.get('https://ndar.nih.gov/api/datadictionary/datastructure/{}/changes'
                 .format(shortname),
                  headers={'Accept':'application/json'})

changes = json.loads(r.text)
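
The two requests above differ only in the trailing `/changes` path segment. As a minimal sketch, the helper below builds both URLs and percent-encodes the user-supplied short name so that stray spaces or slashes cannot alter the path. The function name `datastructure_url` is hypothetical and not part of the NDA API; the base URL is copied from the cell above.

```python
from urllib.parse import quote

# Base URL taken from the requests in the cell above
BASE_URL = 'https://ndar.nih.gov/api/datadictionary/datastructure'


def datastructure_url(shortname, changes=False):
    """Return the data structure URL, or its change-history URL if changes=True."""
    # safe='' also encodes '/', so the short name stays a single path segment
    url = '{}/{}'.format(BASE_URL, quote(shortname.strip(), safe=''))
    return url + '/changes' if changes else url


print(datastructure_url('image03'))
print(datastructure_url('image03', changes=True))
```

The returned string can be passed to `requests.get` exactly as in the cell above.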

In [None]:
# Show data structure elements that are required, or potentially required (conditional)

for element in structure['dataElements']:
    if element['required'] in ['Required','Conditional']:
        print('elementInfo: {}\n'.format(element))

In [None]:
# Grab File elements from the structure

for element in structure['dataElements']:
    if element['type'] == 'File':
        print('elementName: {}'.format(element['name']))
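
The two filters above share one shape: keep the data elements whose field matches a set of allowed values. A small sketch of that pattern as a reusable function follows. The name `select_elements` is hypothetical, and the sample dictionary is made up; it mirrors only the fields used in the cells above (`name`, `type`, `required`), and values such as `'Recommended'` are illustrative assumptions rather than confirmed service output.

```python
def select_elements(structure, key, allowed):
    """Return the data elements whose `key` field is one of `allowed`."""
    return [e for e in structure.get('dataElements', []) if e.get(key) in allowed]


# Made-up sample; a real response comes from the data dictionary service
sample = {'dataElements': [
    {'name': 'subjectkey', 'type': 'GUID', 'required': 'Required'},
    {'name': 'image_file', 'type': 'File', 'required': 'Conditional'},
    {'name': 'comments', 'type': 'String', 'required': 'Recommended'},
]}

required = select_elements(sample, 'required', {'Required', 'Conditional'})
files = select_elements(sample, 'type', {'File'})
print([e['name'] for e in required])
print([e['name'] for e in files])
```

Using `.get` instead of indexing means an element that lacks the field is skipped rather than raising `KeyError`.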

In [None]:
# Get data element info and changes
from IPython.display import display

class changeHistoryTable():
    
    def __init__(self, rows):
        # rows: change records (dicts) keyed by the header names below
        self.list = rows
        self.headers = ['id','changeDescription','changedDate','elementName','newValue','oldValue','shortName']
    
    def _repr_html_(self):

        html = ["<table width=100%>"]
        html.append("<thead><tr>")
        for header in self.headers:
            html.append("<td>{}</td>".format(header))
        html.append("</tr></thead><tbody>")      

        for row in self.list:
            html.append("<tr>")
            for header in self.headers:
                html.append("<td>{}</td>".format(row[header]))
            html.append("</tr>")
        html.append("</tbody></table>")
        return ''.join(html)

change_list = []
        
for element in structure['dataElements']:
    if element['type'] == 'File':
        r = requests.get('https://ndar.nih.gov/api/datadictionary/dataelement/{}'
                 .format(element['name']),
                  headers={'Accept':'application/json'})
        elementInfo = json.loads(r.text)
        
        r = requests.get('https://ndar.nih.gov/api/datadictionary/dataelement/{}/changes'
                .format(element['name']),
                headers={'Accept':'application/json'})
        changes = json.loads(r.text)
        try:
            change_list.extend(changes['list'])
        except KeyError:
            print('No changes for elementName {}'.format(element['name']))

display(changeHistoryTable(change_list))
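
The table class above inserts service values directly into HTML and indexes every row by every header. A hedged variant is sketched below: it escapes each cell with the standard library's `html.escape` and renders a missing key as an empty cell. The function name `render_rows` and the sample row are hypothetical; whether real change records ever omit a key or contain markup is an assumption, not something the source states.

```python
from html import escape


def render_rows(rows, headers):
    """Build an HTML table, escaping every cell and tolerating missing keys."""
    parts = ['<table width="100%"><thead><tr>']
    parts.extend('<td>{}</td>'.format(escape(h)) for h in headers)
    parts.append('</tr></thead><tbody>')
    for row in rows:
        parts.append('<tr>')
        for h in headers:
            # str() handles numeric ids; a missing key becomes an empty cell
            parts.append('<td>{}</td>'.format(escape(str(row.get(h, '')))))
        parts.append('</tr>')
    parts.append('</tbody></table>')
    return ''.join(parts)


# Made-up change record containing markup characters
table_html = render_rows([{'id': 1, 'newValue': '<b>1;2</b>'}],
                         ['id', 'newValue', 'oldValue'])
print(table_html)
```

The returned string could be used as the return value of `_repr_html_` in a class like the one above.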

<html>
    <div style = "display: inline-block; width=150px; height=75px;">
        <h2>Search Webservice</h2>
        <p>Provides the capability for programmatic search across all NDA sites (NDAR, pediatricMRI, ABCD, Clinical Trials, Human Connectome Project, etc.), enabling users to identify projects with data of potential interest.</p>
        <p>Search content includes experimental information, project descriptions, investigators, grant numbers, page content, and all data elements that make up the thousands of measures defined in the data dictionary.</p>
    </div>
    
    <p>
        <a href="https://ndar.nih.gov/api/search">Swagger User Interface</a>
    </p>
</html>

<html>
<div>
<h2>Full Search</h2>
<p>Search resource that allows for querying a word or phrase and identifying matching content, studies, collections, experiments, and data elements.</p>
</div>
<h2>Data Element Search</h2>
<p>Search resource that allows for specifying attributes of a data element and querying the entire dictionary for matching elements.</p>
<div>
<table style="float:left; margin-right:10px;">
<thead><tr><td>Attribute</td><td>Type</td></tr></thead>
<tbody>
  <tr><td>name</td><td>string</td></tr>
  <tr><td>description</td><td>string</td></tr>
  <tr><td>valueRanges</td><td>string</td></tr>
  <tr><td>notes</td><td>string</td></tr>
  <tr><td>type</td><td>string</td></tr>
  <tr><td>allQuery</td><td>string</td></tr>
</tbody>
</table>
</div>
</html>
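
The attribute table above can be turned into a small guard that builds the JSON request body and rejects misspelled attribute names before anything is sent. This is a sketch only: `build_element_query` is a hypothetical helper, and dropping empty values is an assumption, since the source does not say how the service treats an empty string.

```python
import json

# Attribute names copied from the table above
SEARCH_ATTRIBUTES = {'name', 'description', 'valueRanges', 'notes', 'type', 'allQuery'}


def build_element_query(**attributes):
    """Return a JSON request body, rejecting attributes not listed in the table."""
    unknown = set(attributes) - SEARCH_ATTRIBUTES
    if unknown:
        raise ValueError('Unknown search attributes: {}'.format(sorted(unknown)))
    # Assumption: empty values are omitted rather than sent as empty strings
    return json.dumps({k: v for k, v in attributes.items() if v})


body = build_element_query(description='image file', type='File')
print(body)
```

The resulting string is suitable for the `data=` argument of a `requests.post` call that sets `content-type` to `application/json`.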

In [None]:
# Data Element Search

description = input("Enter a description to query:")
query = {'description': description}
r = requests.post("https://stage.nimhda.org/api/search/nda_sw_removal/dataElementSearch?size=20", 
                  data=json.dumps(query),
                  headers={'content-type':'application/json'})
element_results = json.loads(r.text)
for result in element_results['dataElements']: 
    print("score:{}\nname:{}\ndescription:{}\n".format(result['score'], result['name'], result['description']))

In [None]:
# Here is a programmatic example searching by collection
import requests
import json


class collectionLink():

    def __init__(self, title, id):
        self.id = id
        self.title = title

    def _repr_html_(self):
        collection_link = 'https://ndar.nih.gov/edit_collection.html?id={}'.format(self.id)
        html = ['<a href="{}">{}</a>'.format(collection_link, self.title)]
        return ''.join(html)

    
query = input("Enter your query phrase:")
r = requests.post("https://ndar.nih.gov/api/search/nda_sw_removal/collection/full", query)
collections = json.loads(r.text)
print("\n")
for result in collections['collection']['results']:
    display(collectionLink(result['title'],result['id']))

<html>
    <div style = "display: inline-block; width=150px; height=75px;">
        <h2>GUID Webservice</h2>
        <p>Provides the capability to programmatically access all data submitted across projects for a specific subject.</p>
        <p>This service requires authentication and will only return data that is accessible to your user, based on your privileges and permissions on the data.</p>
    </div>
    <p>
    <a href="https://ndar.nih.gov/api/guid">Swagger User Interface</a>
    </p>
</html>
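
The cells in this section authenticate with HTTP Basic authentication through `requests.auth.HTTPBasicAuth`. As a sketch of what that scheme puts on the wire, the hypothetical helper below builds the `Authorization` header by hand using only the standard library. It encodes the credentials as UTF-8, which is an assumption; for ASCII-only usernames and passwords the choice of encoding makes no difference.

```python
import base64


def basic_auth_header(username, password):
    """Return the Authorization header for HTTP Basic authentication."""
    # The credentials are joined with ':' and base64-encoded, not encrypted
    token = base64.b64encode('{}:{}'.format(username, password).encode('utf-8'))
    return {'Authorization': 'Basic ' + token.decode('ascii')}


print(basic_auth_header('user', 'pass'))
```

Because base64 is trivially reversible, the credentials are only protected by the HTTPS connection, which is why the service URLs in this notebook all use `https://`.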

In [None]:
from getpass import getpass

username = input("What is your NDA username:")
password = getpass("What is your NDA password:")
guid = input("What GUID would you like to access data from:")
r = requests.get("https://ndar.nih.gov/api/guid/{}".format(guid), 
                 auth=requests.auth.HTTPBasicAuth(username, password),
                 headers={'Accept':'application/json'})
print(r.text)

In [None]:
from getpass import getpass

username = input("What is your NDA username:")
password = getpass("What is your NDA password:")
guid = input("What GUID would you like to access data from:")
r = requests.get("https://ndar.nih.gov/api/guid/{}/data?short_name=image03".format(guid), 
                 auth=requests.auth.HTTPBasicAuth(username, password),
                 headers={'Accept': 'application/json'})
guid_data = json.loads(r.text)
#print(guid_data)

In [None]:
# Extract experiment IDs from response

experiments = []
ages = []
for age in guid_data['age']:
    age_value = age['value']
    for row in age['dataStructureRow']:
        for element in row['dataElement']:
            if element['name']=='EXPERIMENT_ID':
                if element['value'] not in experiments:
                    experiments.append(element['value'])

for experiment in experiments:
    print('experiment: {}'.format(experiment))

In [None]:
# The previous two slides identified some fMRI, EEG, or eye-tracking data; this cell shows how to retrieve the experiment details.

query = input("Enter your experiment ID:")
r = requests.get("https://ndar.nih.gov/api/experiment/{}".format(query),
                 headers={'Accept':'application/json'})

experiment = json.loads(r.text)
print(experiment)

In [None]:
# Pull out image files from response
image_files = []
ages = []
for age in guid_data['age']:
    age_value = age['value']
    for row in age['dataStructureRow']:
        for link in row['links']['link']:
            if link['rel']=='data_file':
                image_files.append(link['href'])
                ages.append(age_value)
for i,image in enumerate(image_files):
    print("age:{}, url:{}".format(ages[i],image))
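
The loop above keeps ages and URLs in two parallel lists. A minimal alternative is sketched below: it returns `(age, href)` pairs from one function and skips missing keys instead of raising. The name `data_file_links` is hypothetical, and the sample dictionary is made up; it mirrors only the nesting the cell above relies on (`age`, `dataStructureRow`, `links`, `link`, `rel`, `href`).

```python
def data_file_links(guid_data, rel='data_file'):
    """Return (age, href) pairs for every link whose rel matches."""
    pairs = []
    for age in guid_data.get('age', []):
        for row in age.get('dataStructureRow', []):
            for link in row.get('links', {}).get('link', []):
                if link.get('rel') == rel:
                    pairs.append((age.get('value'), link.get('href')))
    return pairs


# Made-up sample with the same nesting as the GUID response used above
sample = {'age': [{'value': 120, 'dataStructureRow': [
    {'links': {'link': [
        {'rel': 'data_file', 'href': 's3://example-bucket/scan_a.tar.gz'},
        {'rel': 'self', 'href': 'https://example.org/row/1'},
    ]}},
    {'links': {'link': []}},
]}]}

for age_value, href in data_file_links(sample):
    print('age:{}, url:{}'.format(age_value, href))
```

Keeping each age next to its URL removes the need to index one list by the position in the other.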

In [None]:
# Generate FederatedToken
# https://github.com/NDAR/nda_aws_token_generator
from getpass import getpass

url = 'https://ndar.nih.gov/DataManager/dataManager'
username = input('Enter your NIMH Data Archives username:')
password = getpass('Enter your NIMH Data Archives password:')

from nda_aws_token_generator import *
generator = NDATokenGenerator(url)
token = generator.generate_token(username, password)

In [None]:
# Pull image out of S3

file = input("Enter S3 URL:")
#s3://NDAR_Central_2/submission_11013/002000001590/scanVisit__0020__0002/MRI__0001/B0_phase1/Native/Original__0001/DICOM.tar.gz

import boto3
from boto3.s3.transfer import S3Transfer
import botocore
from urllib.parse import urlparse
import os

bucket = urlparse(file).netloc
name = urlparse(file).path
key = name[1:]
l = key.split('/')
l = l[:1]
l = '/'.join(l)
location = os.path.join(os.path.expanduser('~'), 'AWS_downloads', l )
if not os.path.isdir(location):
    os.makedirs(location)
    
file_name = key.split('/')[-1]
file_name = os.path.join(location,file_name)

s3 = boto3.session.Session(aws_access_key_id=token.access_key,
                           aws_secret_access_key=token.secret_key,
                           aws_session_token=token.session)
s3_client = s3.client('s3')
s3transfer = S3Transfer(s3_client)


try:
    print('downloading S3 file %s' % file)
    s3transfer.download_file(bucket, key, file_name)
except botocore.exceptions.ClientError as e:
    print('S3 error: %s' % e)
    


<html>
<p>The downloaded file can now be viewed and analyzed.  <b>Note that the boto3 package can also read the object into memory as a 'string', which can be passed to other functions, and it supports streaming the content.</b></p>
<a href="https://brainbrowser.cbrain.mcgill.ca/volume-viewer">Brain Browser Viewer</a>
</html>
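
As a sketch of the in-memory option mentioned above: an S3 client's `get_object` call returns a response whose `Body` can be read into memory. The helper names `split_s3_url` and `read_s3_object` are hypothetical. A stand-in client is used here so the example runs without credentials; with a real token, the boto3 S3 client created in the download cell would be passed instead.

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split 's3://bucket/path/to/key' into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError('Not an S3 URL: {!r}'.format(url))
    return parsed.netloc, parsed.path.lstrip('/')


def read_s3_object(s3_client, url):
    """Read a whole S3 object into memory and return its bytes."""
    bucket, key = split_s3_url(url)
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response['Body'].read()


# Stand-in objects for demonstration only; they imitate the small part of
# the boto3 client interface that read_s3_object uses.
class FakeBody:
    def read(self):
        return b'example bytes'


class FakeS3Client:
    def get_object(self, Bucket, Key):
        return {'Body': FakeBody()}


# Example URL taken from the comment in the download cell
url = ('s3://NDAR_Central_2/submission_11013/002000001590/scanVisit__0020__0002/'
       'MRI__0001/B0_phase1/Native/Original__0001/DICOM.tar.gz')
bucket, key = split_s3_url(url)
content = read_s3_object(FakeS3Client(), url)
print(bucket, len(content))
```

Reading a large archive fully into memory can be costly; the real response body also accepts a byte count in `read`, so the content can be consumed in chunks.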

<html>
    <div style = "display: inline-block; width=150px; height=75px;">
        <h2>miNDAR Webservice</h2>
        <p>miNDAR is short for mini-NDAR: a remote (Oracle) database that you control and can push data to. The webservice currently provides the capability to POST data through a RESTful web service to a remote miNDAR. This service requires authentication to ensure that you are the miNDAR owner.</p>
    </div>
    <p>
    <a href="https://stage.nimhda.org/api/mindar">Swagger User Interface</a>
    </p>
</html>

In [None]:
# Demo miNDAR import (stage)

from getpass import getpass

#username = input("What is your NDA username:")
#password = getpass("What is your NDA password:")

# Read the XML payload; the with-block closes the file once it has been read
with open('miNDAR POST/test_data_submission_genomics_sample03_5.xml', 'r') as file:
    data = file.read()

r = requests.post("https://stage.nimhda.org/api/mindar/import", 
                 auth=requests.auth.HTTPBasicAuth(username, password),
                 headers={'content-type':'application/xml'},
                 data = data)
print(r.text)
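
Before posting a payload like the one above, it can be worth checking locally that the XML at least parses. The sketch below uses the standard library's `xml.etree.ElementTree`; the function name `is_well_formed` is hypothetical. It tests well-formedness only and says nothing about whether the document matches the format the miNDAR import service expects.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed('<submission><row/></submission>'))
print(is_well_formed('<submission><row></submission>'))
```

A check like this catches truncated or mismatched tags before a request is made.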

