# RADx Data Access Requests in dbGaP
This notebook analyzes the authorized data access requests in dbGaP for datasets submitted by projects of the COVID-19 Rapid Acceleration of Diagnostics [RADx Initiative](https://www.nih.gov/research-training/medical-research-initiatives/radx):

- [RADx Tech](https://www.nih.gov/research-training/medical-research-initiatives/radx/radx-programs#radx-tech)
- [RADx Underserved Populations (RADx-UP)](https://www.nih.gov/research-training/medical-research-initiatives/radx/radx-programs#radx-up)
- [RADx Radical (RADx-rad)](https://www.nih.gov/research-training/medical-research-initiatives/radx/radx-programs#radx-rad)
- [RADx Digital Health Technologies (RADx-DHT)](https://www.nih.gov/news-events/news-releases/nih-awards-contracts-develop-innovative-digital-health-technologies-covid-19)

Author: Peter W. Rose (pwrose@ucsd.edu)
Creation date: 2023-02-23

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None)

In [2]:
# List of studies
# TODO: automate the download of the studies file for any query term.

# To download a list of studies manually, run this query in dbGaP and click the "Save Results" button:
# https://www.ncbi.nlm.nih.gov/gap/advanced_search/?TERM=<query_term>

# Example query:
# https://www.ncbi.nlm.nih.gov/gap/advanced_search/?TERM=radx-rad
# Saved results for each RADx program (uncomment exactly one line):
study = "https://raw.githubusercontent.com/radxrad/dbgap-reporter/main/data/radx-rad_studies.csv"
# study = "https://raw.githubusercontent.com/radxrad/dbgap-reporter/main/data/radx-up_studies.csv"
# study = "https://raw.githubusercontent.com/radxrad/dbgap-reporter/main/data/radx-tech_studies.csv"
# study = "https://raw.githubusercontent.com/radxrad/dbgap-reporter/main/data/radx-dht_studies.csv"

studies = pd.read_csv(study, usecols=["accession", "name", "description", "Study Design", "Study Consent"])
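The commented-out alternatives above all follow the same naming pattern, so the choice of program could be reduced to a single parameter. The following is a minimal sketch under that assumption; `search_url` and `studies_csv_url` are hypothetical helpers that are not part of the original notebook, and the sketch does not automate the "Save Results" step itself.

```python
from urllib.parse import quote

# Base locations copied from the notebook cell above.
SEARCH_URL = "https://www.ncbi.nlm.nih.gov/gap/advanced_search/?TERM="
DATA_URL = "https://raw.githubusercontent.com/radxrad/dbgap-reporter/main/data/"


def search_url(query_term):
    """Return the dbGaP advanced-search URL for a query term (hypothetical helper)."""
    return SEARCH_URL + quote(query_term)


def studies_csv_url(program):
    """Return the URL of the saved study list for a program (hypothetical helper).

    Assumes the file is named "<program>_studies.csv", as in the four URLs above.
    """
    return f"{DATA_URL}{program}_studies.csv"


print(search_url("radx-rad"))
print(studies_csv_url("radx-up"))
```

With such a helper, switching programs would only require changing the argument, for example `study = studies_csv_url("radx-tech")`.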

## Table of Studies

In [3]:
print("Number of studies", studies.shape[0])
studies

Number of studies 48


Unnamed: 0,accession,name,description,Study Design,Study Consent
0,phs002585.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): AICORE-kids,"This work is directed at characterizing pediatric COVID-19 and stratifying incoming patients by projected (future) disease severity. Such stratification has several implications: immediately improving treatment planning, and as disease mechanistic regulatory milestones intended to conform with the Emergency Use Authorization (EUA) programs in effect for SARS-CoV-2 diagnostics. Note for data in RADx",Case Set,GRU --- General research use
1,phs002525.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): SF-RAD: Development and Proof-of-Concept Implementation of the South Florida Miami RADx-rad SARS-CoV-2 Wastewater-Based Surveillance Infrastructure,"The University of Miami (UM), with three primary campuses in Miami, Florida, is geographically spread within one of the worst current COVID-19 hotbeds. UM has deployed an elaborate human surveillance strategies. Working closely with the RADx-rad Data Coordination Center (DCC), this application (SF-RAD) will develop and implement data standards and",Case Set,GRU --- General research use
2,phs002782.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): NIEHS Diagnostic-Prognostic RNAseq,"Infectious disease outbreaks like Coronavirus Disease 2019 (COVID-19) can overwhelm healthcare systems when screening tools are scarce or lacking. In the face of an ongoing COVID-19 pandemic and with single-plex , long queue times, backlogs in COVID-19 diagnoses, and delayed access to specialized treatment for COVID-19 patients. The goal of this RADx funded project",Case Set,GRU --- General research use
3,phs002679.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): Wastewater Detection of COVID-19,"When faced with a pandemic such as SARS-Coronavirus-2 (SAR-CoV-2), the virus responsible for COVID-19, timely risk assessment and action are required to prevent public health impacts to entire communities. Because existing and emerging variants from wastewater, and 3) design platforms for communicating wastewater variant results to the public. Note for data in RADx",Collection,GRU --- General research use
4,phs002600.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): Portable GC Detector for COVID Diagnostics,"The data herein combines GC-MS and GC-DMS analysis of exhaled breath vapor compounds. The intent of this study is to develop a portable GC-DMS system to diagnose SARS-CoV-2 infections from , weight, symptoms at time of sampling, etc., was also collected. Note for data in RADx: Instructions for requesting individual-level data are available on",Collection,GRU --- General research use
5,phs002603.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): Diagnosis of MIS-C in Febrile Children,"The recent emergence of SARS-CoV2 and resultant pandemic of COVID-19 disease has overwhelmed global health systems and led to over 200,000 American deaths to date. While initial reports suggested that diagnostic strategy to distinguish children with MIS-C from children with other causes of fever. Note for data in RADx: Instructions for requesting individual",Prospective Longitudinal Cohort,GRU --- General research use
6,phs002685.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): DNA Star SAS-CoV-2 Rapid Test,"Automated, rapid diagnostics with little sample collection and preparation are needed to identify and trace affected persons in times when hyper-infectious pathogens cause pandemics. Frequent, low cost and highly scalable to results in minutes) and cost effective ( $3 per test). Note for data in RADx: Instructions for requesting individual-level data are available on",Case Set,GRU --- General research use
7,phs002583.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): A Rapid Breathalyzer Diagnostics Platform for COVID-19,"We propose to develop a novel testing platform that detects SARS-CoV-2 virions in a patient's breath. When a person exhales into the COVID breathalyzer, droplets and other emitted particles are , and has accurate reporting with high sensitivity and specificity. Note for data in RADx: Instructions for requesting individual-level data are available",Collection,GRU --- General research use
8,phs002604.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): Tracking the COVID-19 Epidemic in Sewage (TRACES),"Wastewater based testing (WBT) holds great promise for cost-effective population surveillance and transmission tracking of SARS-CoV-2, but optimal sampling modalities and protocols are unknown. Taking advantage of a diverse inner develop point-of-use microfluidics systems for timely WBT. Note for data in RADx: Instructions for requesting individual-level data are available on RADx",Prospective Longitudinal Cohort,GRU --- General research use
9,phs002524.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): Validation of Smart Masks for Surveillance of COVID-19,Vulnerable populations do not just need testing - they need surveillance. The ideal surveillance tool would operate in the background with minimal involvement of the population to be tested; it . Note for data in RADx: Instructions for requesting individual-level data are available on RADx Data Hub at https://radx-hub.nih.gov/home. Apply for data,Collection,GRU --- General research use


In [4]:
def get_download_url(accession):
    """Return the dbGaP URL for downloading the authorized requests of a study."""
    return "https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/GetAuthorizedRequestDownload.cgi?study_id=" + accession
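Because the download URL is built by plain string concatenation, a malformed accession would only surface as a failed download later on. A possible safeguard is to check the accession first. The sketch below assumes the format seen in the studies table (for example `phs002585.v1.p1`: study identifier, version, participant set); `parse_accession` is a hypothetical helper, and the pattern is inferred from those examples rather than taken from a dbGaP specification.

```python
import re

# Pattern inferred from accessions such as "phs002585.v1.p1" (an assumption):
# "phs" + six digits, ".v" + version number, ".p" + participant set number.
ACCESSION_PATTERN = re.compile(r"^(phs\d{6})\.v(\d+)\.p(\d+)$")


def parse_accession(accession):
    """Split an accession into (study_id, version, participant_set).

    Raises ValueError if the accession does not match the assumed format.
    """
    match = ACCESSION_PATTERN.match(accession)
    if match is None:
        raise ValueError(f"Unexpected accession format: {accession!r}")
    study_id, version, participant_set = match.groups()
    return study_id, int(version), int(participant_set)


print(parse_accession("phs002585.v1.p1"))
```

Calling `parse_accession` before `get_download_url` would reject values such as an empty string or a truncated identifier before any request is made.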

## Create a table of approved requests for datasets

In [5]:
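# Columns to keep from each per-study download (tab-separated file).
request_columns = ["Requestor", "Affiliation", "Project", "Date of approval", "Request status",
                   "Public Research Use Statement", "Technical Research Use Statement"]

# Download the authorized requests for each study and tag them with the study they belong to.
frames = []
for _, row in studies.iterrows():
    df = pd.read_csv(get_download_url(row["accession"]), usecols=request_columns, sep="\t")
    df["accession"] = row["accession"]
    df["name"] = row["name"]
    frames.append(df)

# Concatenate once at the end instead of growing the DataFrame inside the loop.
authorized_requests = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()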
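The read, tag, and concatenate pattern used above can be tried without network access. The sketch below feeds two small in-memory TSV payloads through the same steps and then counts the distinct studies per requestor. The accessions, names, and affiliations in `SAMPLE_DOWNLOADS` are made-up placeholders, not dbGaP data, and `combine_requests` is a hypothetical helper.

```python
import io

import pandas as pd

# Made-up placeholder payloads standing in for the per-study downloads.
SAMPLE_DOWNLOADS = {
    "phs000001.v1.p1": (
        "Requestor\tAffiliation\tRequest status\n"
        "Doe, Jane\tExample University\tapproved\n"
    ),
    "phs000002.v1.p1": (
        "Requestor\tAffiliation\tRequest status\n"
        "Doe, Jane\tExample University\tapproved\n"
        "Roe, Richard\tExample Institute\tapproved\n"
    ),
}


def combine_requests(downloads):
    """Read each TSV payload, tag its rows with the accession, and combine them."""
    frames = []
    for accession, text in downloads.items():
        df = pd.read_csv(io.StringIO(text), sep="\t")
        df["accession"] = accession
        frames.append(df)
    return pd.concat(frames, ignore_index=True)


combined = combine_requests(SAMPLE_DOWNLOADS)

# Number of distinct studies each requestor was approved for.
studies_per_requestor = combined.groupby("Requestor")["accession"].nunique()
print(studies_per_requestor)
```

The same `groupby` could be applied to `authorized_requests` to see which requestors hold approvals for more than one study.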

In [6]:
print("Number of authorized requests :", authorized_requests.shape[0])
print("Number of unique requestors   :", len(authorized_requests["Requestor"].unique()))
print("Number of unique studies      :", len(authorized_requests["accession"].unique()))
authorized_requests

Number of authorized requests : 3
Number of unique requestors   : 2
Number of unique studies      : 3


Unnamed: 0,Requestor,Affiliation,Project,Date of approval,Request status,Public Research Use Statement,Technical Research Use Statement,accession,name
0,"Ciofani, Danielle","BROAD INSTITUTE, INC.",Testing Data Access for RADx developer (2),"Dec12, 2022",approved,I will be confirming data access using the Hub.,I am testing data access for select RADx program data. I will be using the Data Hub to access the data.,phs002525.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): SF-RAD: Development and Proof-of-Concept Implementation of the South Florida Miami RADx-rad SARS-CoV-2 Wastewater-Based Surveillance Infrastructure
1,"Miguez, Maria-Jose",FLORIDA INTERNATIONAL UNIVERSITY,System analysis for COVID humoral response,"Feb10, 2023",approved,"Multisystem inflammatory syndrome in children (MIS-C) is a rare condition that emerged a couple weeks after a child is exposed to SARS-CoV-2. MIS-C compromised several organs (e.g. heart, lungs, kidneys, brain, skin, eyes, or gastrointestinal tract), and it can be deadly. Therefore, it is critical to understand who is at risk (e.g. those with allergies). Our long term goal is to improve early diagnosis. Based on our strong preliminary data, we want to validate our findings in a pediatric cohort. The RADx hub will provide a unique opportunity to do so with the large data available. This team is uniquely prepared given our medical and research experience.","Multisystem inflammatory syndrome in children (MIS-C) is a rare condition that emerged a couple weeks after a child is exposed to SARS-CoV-2. MIS-C compromised several organs (e.g. heart, lungs, kidneys, brain, skin, eyes, or gastrointestinal tract), and it can be deadly. Indeed, in a sizable proportion of cases (up to 40%), the patient requires to be admitted to the ICU service. Therefore, it is critical to understand who is at risk (e.g. those with allergic or non-allergic diseases) those with alterations in the IgE network (platelets, eosinophils, specific Th2 or Th17 cytokines). Our long term objective is to develop new strategies for early diagnosis and to identify biomarkers of risk that can be easily deployed globally. Based on our strong preliminary data, we want to validate our findings in a pediatric cohort. The RADx hub will provide a unique opportunity to do so with the large data available. 
This team is uniquely qualified to do such analyses given the medical background of the PI, our ongoing work in COVID both in humans and with animal models, our unique expertise in IgE, and expertise using machine learning and model-based analyses.",phs002781.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): Discovery and Clinical Validation of Host Biomarkers of Disease Severity and Multi-system Inflammatory Syndrome in Children (MIS-C) and COVID-19
2,"Miguez, Maria-Jose",FLORIDA INTERNATIONAL UNIVERSITY,System analysis for COVID humoral response,"Feb10, 2023",approved,"Multisystem inflammatory syndrome in children (MIS-C) is a rare condition that emerged a couple weeks after a child is exposed to SARS-CoV-2. MIS-C compromised several organs (e.g. heart, lungs, kidneys, brain, skin, eyes, or gastrointestinal tract), and it can be deadly. Therefore, it is critical to understand who is at risk (e.g. those with allergies). Our long term goal is to improve early diagnosis. Based on our strong preliminary data, we want to validate our findings in a pediatric cohort. The RADx hub will provide a unique opportunity to do so with the large data available. This team is uniquely prepared given our medical and research experience.","Multisystem inflammatory syndrome in children (MIS-C) is a rare condition that emerged a couple weeks after a child is exposed to SARS-CoV-2. MIS-C compromised several organs (e.g. heart, lungs, kidneys, brain, skin, eyes, or gastrointestinal tract), and it can be deadly. Indeed, in a sizable proportion of cases (up to 40%), the patient requires to be admitted to the ICU service. Therefore, it is critical to understand who is at risk (e.g. those with allergic or non-allergic diseases) those with alterations in the IgE network (platelets, eosinophils, specific Th2 or Th17 cytokines). Our long term objective is to develop new strategies for early diagnosis and to identify biomarkers of risk that can be easily deployed globally. Based on our strong preliminary data, we want to validate our findings in a pediatric cohort. The RADx hub will provide a unique opportunity to do so with the large data available. 
This team is uniquely qualified to do such analyses given the medical background of the PI, our ongoing work in COVID both in humans and with animal models, our unique expertise in IgE, and expertise using machine learning and model-based analyses.",phs002945.v1.p1,Rapid Acceleration of Diagnostics - Radical (RADx-rad): A Data Science Approach to Identify and Manage Multisystem Inflammatory Syndrome in Children (MIS-C) Associated with SARS-CoV-2 Infection and Kawasaki Disease in Pediatric Patients
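The "Date of approval" values in the table above are plain strings such as `Dec12, 2022`, with no space between month and day, so they do not sort chronologically as text. A minimal sketch for converting them is shown below. It assumes every value follows this exact pattern and that English month abbreviations are in effect for `%b`; `parse_approval_date` is a hypothetical helper.

```python
from datetime import date, datetime


def parse_approval_date(text):
    """Parse a 'Date of approval' string such as 'Dec12, 2022' into a date.

    Assumes the "<Mon><day>, <year>" layout seen in the table above.
    """
    return datetime.strptime(text, "%b%d, %Y").date()


approval_dates = [parse_approval_date(s) for s in ["Feb10, 2023", "Dec12, 2022"]]
print(sorted(approval_dates))
```

In pandas, the same format string could be passed as `pd.to_datetime(authorized_requests["Date of approval"], format="%b%d, %Y")`.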
