### Online Asthma datasets for Cohort Representation hackathon
The table below lists asthma studies whose data could be explored as part of the Cohort Representation hackathon. These studies are either already set up or require only a few steps to make them available through both FHIR and Data Connect.

#### Data Connect
At present, the data dictionaries for these studies can be provided via GA4GH Data Connect. Some examples are shown below.

#### FHIR
Synthetic data could be generated, or access to the real data could be obtained.

ResearchStudy resources are available for the same studies through the dbGaP on FHIR server. See https://github.com/ncbi/DbGaP-FHIR-API-Docs/blob/production/quickstart.md

Patient and Observation resources from these studies should also be available. These data are simulated and harmonized. See https://github.com/ncbi/DbGaP-FHIR-API-Docs/blob/production/quickstart.md#other-objects
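
FHIR search results are returned as a Bundle, so a small helper for pulling resources out of a Bundle is often the first thing needed when exploring Patient and Observation data. The following is a minimal sketch using only standard FHIR Bundle fields (`entry`, `resource`, `link`). The example Bundle is hand-written for illustration and is not a response from the dbGaP server; the function names are hypothetical.

```python
def resources_of_type(bundle, resource_type):
    """Return the resources of one type from a FHIR Bundle (a parsed JSON dict)."""
    return [
        entry["resource"]
        for entry in bundle.get("entry", [])
        if entry.get("resource", {}).get("resourceType") == resource_type
    ]


def next_page_url(bundle):
    """Return the URL of the next page of a searchset Bundle, or None if there is none."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None


# Hand-written Bundle for illustration only (assumed content, not real study data).
example_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "link": [{"relation": "self", "url": "https://example.org/fhir/Observation"}],
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "example-1"}},
        {
            "resource": {
                "resourceType": "Observation",
                "id": "example-2",
                "subject": {"reference": "Patient/example-1"},
            }
        },
    ],
}

patients = resources_of_type(example_bundle, "Patient")
observations = resources_of_type(example_bundle, "Observation")
print(len(patients), len(observations), next_page_url(example_bundle))
```

When a Bundle carries a `next` link, the same helpers can be applied to each page in turn until `next_page_url` returns `None`.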



In [12]:
import pandas as pd
df = pd.read_csv("data/asthma_studies.txt", sep="\t")
df

Unnamed: 0,Study name,Subjects,Link,id,Pedigree,Notes
0,An Omics View of Asthma through Monozygotic Twins,74,http:/identifiers.org/dbgap:phs000886.v1.p1,dbgap:phs000886.v1.p1,Y,
1,"NHLBI TOPMed: Study of African Americans, Asth...",2106,http:/identifiers.org/dbgap:phs000921.v4.p1,dbgap:phs000921.v4.p1,Y,
2,NHLBI TOPMed: The Genetics and Epidemiology of...,1527,http:/identifiers.org/dbgap:phs001143.v4.p1,dbgap:phs001143.v4.p1,Y,
3,The EVE Asthma Genetics Consortium: Building U...,296,http:/identifiers.org/dbgap:phs001156.v2.p1,dbgap:phs001156.v2.p1,N,
4,NHLBI TOPMed: Severe Asthma Research Program (...,1882,http:/identifiers.org/dbgap:phs001446.v2.p1,dbgap:phs001446.v2.p1,Y,
5,NHLBI TOPMed: Childhood Asthma Management Prog...,2640,http:/identifiers.org/dbgap:phs001726.v2.p1,dbgap:phs001726.v2.p1,N,The details for this study report that CARE an...
6,NHLBI TOPMed: Pediatric Asthma Controller Tria...,41,http:/identifiers.org/dbgap:phs001730.v2.p1,dbgap:phs001730.v2.p1,N,
7,NHLBI TOPMed: Pathways to Immunologically Medi...,73,http:/identifiers.org/dbgapphs001727.v2.p1,dbgapphs001727.v2.p1,N,
8,"A Large-Scale, Consortium-Based Genomewide Ass...",26475,http:/identifiers.org/ega.study:EGAS00000000077,ega.study:EGAS00000000077,,
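
The `id` values in the table above are not fully uniform: the row with index 7 has no colon after `dbgap`, and the last row is an EGA study rather than a dbGaP one. The `Link` values also use `http:/` with a single slash. The sketch below shows one way to extract the dbGaP accession parts from such strings before using them elsewhere. The regular expression and the function name are assumptions made for illustration, and the three input strings are copied from the table.

```python
import re

# phs<6 digits>.v<version>.p<participant set>, e.g. phs001726.v2.p1
ACCESSION = re.compile(r"(phs\d{6})\.v(\d+)\.p(\d+)")


def parse_dbgap_accession(raw):
    """Return the parts of a dbGaP accession found in `raw`, or None if there is none."""
    match = ACCESSION.search(str(raw))
    if match is None:
        return None
    study, version, participant_set = match.groups()
    return {
        "study": study,
        "version": int(version),
        "participant_set": int(participant_set),
        # Rebuilt compact identifier with the colon always present.
        "curie": f"dbgap:{study}.v{version}.p{participant_set}",
    }


ids = [
    "dbgap:phs000886.v1.p1",
    "dbgapphs001727.v2.p1",  # colon missing in the source table
    "ega.study:EGAS00000000077",  # not a dbGaP accession
]
parsed = [parse_dbgap_accession(value) for value in ids]
for value, result in zip(ids, parsed):
    print(value, "->", result)
```

Returning `None` for non-dbGaP identifiers keeps the EGA study visible as a special case rather than silently dropping it.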


### Example FHIR and Data Connect data for these studies
First, create a Data Connect client. The example below connects to a Data Connect server running locally on port 8089.

In [13]:
from fasp.search import DataConnectClient
cl = DataConnectClient('http://localhost:8089/')
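
For orientation, the GA4GH Data Connect specification defines a small set of REST endpoints that a client like this one would typically wrap. The sketch below only builds the URLs and does not contact any server. Whether the local server at `http://localhost:8089/` exposes all of these endpoints is an assumption, and `data_connect_endpoints` is a hypothetical helper, not part of `fasp`.

```python
from urllib.parse import quote


def data_connect_endpoints(base_url, table_name):
    """Build (HTTP method, URL) pairs for the standard Data Connect endpoints."""
    base = base_url.rstrip("/")
    name = quote(table_name, safe="")  # percent-encode anything unsafe in a path segment
    return {
        "list_tables": ("GET", f"{base}/tables"),
        "table_info": ("GET", f"{base}/table/{name}/info"),
        "table_data": ("GET", f"{base}/table/{name}/data"),
        "search": ("POST", f"{base}/search"),
    }


endpoints = data_connect_endpoints("http://localhost:8089/", "TOPMed.CAMP.LData")
for label, (method, url) in endpoints.items():
    print(f"{label}: {method} {url}")
```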

### NHLBI TOPMed: Childhood Asthma Management Program (CAMP)

dbGaP Study Accession: phs001726.v2.p1

The details for this study report that CARE and ACRN standard variables were used. 

A FHIR ResearchStudy resource for this study is available at https://dbgap-api.ncbi.nlm.nih.gov/fhir/x1/ResearchStudy/phs001726
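
The URL above uses the study accession without its version and participant-set suffix. A minimal sketch of building such a URL from a full accession follows. It assumes that other studies on the server follow the same pattern, which this notebook does not confirm. `fetch_json` is a hypothetical helper that is defined but not called here, since it needs network access.

```python
import json
import urllib.request

FHIR_BASE = "https://dbgap-api.ncbi.nlm.nih.gov/fhir/x1"


def research_study_url(accession, base=FHIR_BASE):
    """Map an accession such as 'phs001726.v2.p1' to a ResearchStudy URL."""
    study_id = accession.split(".")[0]  # keep only the 'phs...' part
    return f"{base}/ResearchStudy/{study_id}"


def fetch_json(url, timeout=30):
    """Fetch a FHIR resource as JSON (requires network access; not called below)."""
    request = urllib.request.Request(url, headers={"Accept": "application/fhir+json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


url = research_study_url("phs001726.v2.p1")
print(url)
```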

Simulated individual-level data are also available as Patient and Observation resources.

The following cells show the data dictionaries for two data tables from this study.

In [14]:
cl.listTableColumns('TOPMed.CAMP.LData', descriptions=True, enums=False)

tx_grp_fup
Treatment group at follow-up visit
_______________________________________
prefev
FEV1 at follow-up visit
_______________________________________
pos_skintest
One or more positive skin test at follow-up visit
_______________________________________
prefvc
FVC at follow-up visit
_______________________________________
saba_tx_fup
Treatment group assigned at follow-up visit was SABA (Short acting B-agonist) arm
_______________________________________
ige
Total IgE at follow-up visit
_______________________________________
eos
Blood eosinophils at follow-up visit
_______________________________________
laba_tx_fup
Treatment group assigned at follow-up was LABA (Long acting B-agonist) arm
_______________________________________
visit_week
Number of weeks in relation to Baseline visit
_______________________________________
ampf
Average AM peak flow (L/min) as recorded in daily diary card since last visit
_______________________________________
npos_core_skintest
Number of core p

In [15]:
cl.listTableColumns('TOPMed.CAMP.CData', descriptions=True, enums=False)

bmiz_baseline
Baseline BMIZ
_______________________________________
pred_bursts_event1_month
Number of months from Baseline to first prednisone burst.  With the exception of a visit at 2 months, vists are four months apart.  At each visit a participant answers whether prednisone was used since the previous visit.  If prednisone had been used, the value of this variable is the number of months that the visit is from Baseline.  For example, if the first burst was used between the four month and the eight month visit, the pred_bursts_event1_month variable has a value of eight assigned.
_______________________________________
edhos_event2_week
Number of weeks from Baseline to second ER visit or hospitalization.  See week assignments in pred_bursts_event1_week.
_______________________________________
edhos_event1_month
Number of months from Baseline to first ER visit or hospitalization.  See month assignments in pred_bursts_event1_month.
_______________________________________
protocol
Prot

In [16]:
cl.listTableInfo('TOPMed.CAMP.LData', verbose=True)

_Schema for tableTOPMed.CAMP.LData_
{
   "name": "TOPMed.CAMP.LData",
   "data_model": {
      "$id": "dbgap:pht000701.v1",
      "properties": {
         "tx_grp_fup": {
            "$id": "dbgap:phv00071005.v1",
            "description": "Treatment group at follow-up visit",
            "type": "encoded value",
            "oneOf": [
               {
                  "const": "0",
                  "title": "No treatment"
               },
               {
                  "const": "1",
                  "title": "Albuterol"
               },
               {
                  "const": "2",
                  "title": "Azithromycin"
               },
               {
                  "const": "3",
                  "title": "Beclomethasone"
               },
               {
                  "const": "4",
                  "title": "Budesonide"
               },
               {
                  "const": "5",
                  "title": "Flunisolide"
               },
           

<fasp.search.data_connect_client.SearchSchema at 0x133671820>
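
Encoded variables such as `tx_grp_fup` list their permitted codes under `oneOf`, each with a `const` (the stored code) and a `title` (the label). The sketch below turns that list into a lookup and uses it to decode values. The schema fragment is copied by hand from the truncated output above, so it contains only the six codes shown there. How to obtain this dict from the returned `SearchSchema` object is not shown in this notebook, and the function names are hypothetical.

```python
# Fragment copied from the (truncated) schema output; further codes may exist.
tx_grp_fup = {
    "$id": "dbgap:phv00071005.v1",
    "description": "Treatment group at follow-up visit",
    "type": "encoded value",
    "oneOf": [
        {"const": "0", "title": "No treatment"},
        {"const": "1", "title": "Albuterol"},
        {"const": "2", "title": "Azithromycin"},
        {"const": "3", "title": "Beclomethasone"},
        {"const": "4", "title": "Budesonide"},
        {"const": "5", "title": "Flunisolide"},
    ],
}


def code_labels(prop):
    """Map each permitted code of an encoded property to its label."""
    return {
        option["const"]: option["title"]
        for option in prop.get("oneOf", [])
        if "const" in option
    }


def decode(prop, values):
    """Replace known codes with labels; leave unknown values unchanged."""
    labels = code_labels(prop)
    return [labels.get(str(value), value) for value in values]


labels = code_labels(tx_grp_fup)
decoded = decode(tx_grp_fup, ["1", "4", 0, "99"])
print(decoded)
```

Leaving unknown values unchanged makes codes that fall outside the listed options easy to spot in the decoded data.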

### Full schema listing

The raw Data Connect schema response for the second table, TOPMed.CAMP.CData, is listed below.

In [17]:
cl.listTableInfo('TOPMed.CAMP.CData', verbose=True)

_Schema for tableTOPMed.CAMP.CData_
{
   "name": "TOPMed.CAMP.CData",
   "data_model": {
      "$id": "dbgap:pht000700.v1",
      "properties": {
         "bmiz_baseline": {
            "$id": "dbgap:phv00070943.v1",
            "description": "Baseline BMIZ",
            "type": "decimal"
         },
         "pred_bursts_event1_month": {
            "$id": "dbgap:phv00070958.v1",
            "description": "Number of months from Baseline to first prednisone burst.  With the exception of a visit at 2 months, vists are four months apart.  At each visit a participant answers whether prednisone was used since the previous visit.  If prednisone had been used, the value of this variable is the number of months that the visit is from Baseline.  For example, if the first burst was used between the four month and the eight month visit, the pred_bursts_event1_month variable has a value of eight assigned.",
            "type": "integer",
            "$unit": "month"
         },
         "edhos_

<fasp.search.data_connect_client.SearchSchema at 0x133671940>
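
A schema like this can be flattened into one row per variable, which is convenient for comparing data dictionaries across studies. The sketch below works on a hand-copied fragment of the output above, limited to the two properties shown in full. The long description of `pred_bursts_event1_month` is left out for brevity, and `summarize_properties` is a hypothetical helper.

```python
# Fragment copied from the (truncated) schema output above.
data_model = {
    "$id": "dbgap:pht000700.v1",
    "properties": {
        "bmiz_baseline": {
            "$id": "dbgap:phv00070943.v1",
            "description": "Baseline BMIZ",
            "type": "decimal",
        },
        "pred_bursts_event1_month": {
            "$id": "dbgap:phv00070958.v1",
            "type": "integer",
            "$unit": "month",
        },
    },
}


def summarize_properties(model):
    """Return one dict per variable with its dbGaP id, type, and unit (if any)."""
    return [
        {
            "variable": name,
            "dbgap_id": prop.get("$id"),
            "type": prop.get("type"),
            "unit": prop.get("$unit"),  # None when the schema gives no unit
        }
        for name, prop in model.get("properties", {}).items()
    ]


rows = summarize_properties(data_model)
for row in rows:
    print(row)
```

The resulting list of dicts can be passed directly to `pd.DataFrame(rows)` for display alongside the study table.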