# Identifying and Extracting Longitudinal Variables using R PIC-SURE API

This tutorial notebook will demonstrate how to identify and extract longitudinal variables using the R PIC-SURE API. Longitudinal variables are defined as containing multiple 'Exam' or 'Visit' descriptions within their concept path. 

In this example, we will find the patient-level data for a lipid-related longitudinal variable within the Framingham Heart Study. We will:
1. Identify which longitudinal variables are associated with the keywords of interest (lipid, triglyceride) and how many exams / visits are associated with each one
2. Select a longitudinal variable of interest from a specific study (Framingham Heart Study)
3. Extract patient-level data into a dataframe where each row represents a patient and each column represents a visit

For a more basic introduction to the R PIC-SURE API, see the `1_PICSURE_API_101.ipynb` notebook.

**Before running this notebook, please be sure to get a user-specific security token. For more information about how to proceed, see the "Get your security token" instructions in the [README.md](https://github.com/hms-dbmi/Access-to-Data-using-PIC-SURE-API/tree/harmonized_lipid_measurements_example/NHLBI_BioData_Catalyst#get-your-security-token).**

## Environment Set-Up

### System Requirements
R >= 3.4
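
The version requirement can be checked before installing anything. The following is a minimal sketch in base R; the variable names `required_version` and `version_ok` are illustrative and not part of the PIC-SURE libraries.

```r
# Minimum R version stated in the system requirements of this notebook
required_version <- "3.4"

# getRversion() returns the running R version as a comparable version object
version_ok <- getRversion() >= required_version

if (!version_ok) {
  stop("This notebook expects R >= ", required_version,
       "; found ", as.character(getRversion()))
}
message("R version ", as.character(getRversion()), " meets the requirement")
```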

### Install Packages

In [1]:
source("R_lib/requirements.R")

installing: 
-  ggplot2 
-  dplyr 
-  tidyr 
-  urltools 
-  devtools 
-  ggrepel 


also installing the dependencies ‘systemfonts’, ‘textshaping’, ‘xopen’, ‘brew’, ‘rex’, ‘crosstalk’, ‘clisymbols’, ‘cyclocomp’, ‘xmlparsedata’, ‘downlit’, ‘ragg’, ‘parsedate’, ‘whoami’, ‘hunspell’, ‘memoise’, ‘pkgbuild’, ‘rcmdcheck’, ‘remotes’, ‘roxygen2’, ‘rversions’, ‘sessioninfo’, ‘BiocManager’, ‘covr’, ‘DT’, ‘foghorn’, ‘gmailr’, ‘lintr’, ‘mockery’, ‘pingr’, ‘pkgdown’, ‘rhub’, ‘spelling’


“installation of package ‘systemfonts’ had non-zero exit status”
“installation of package ‘textshaping’ had non-zero exit status”
“installation of package ‘ragg’ had non-zero exit status”
“installation of package ‘pkgdown’ had non-zero exit status”

Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag

Install the latest R PIC-SURE API libraries from GitHub

In [2]:
Sys.setenv(TAR = "/bin/tar")
options(unzip = "internal")
install.packages("https://cran.r-project.org/src/contrib/Archive/devtools/devtools_1.13.6.tar.gz", repos = NULL, type = "source")
install.packages("https://cran.r-project.org/src/contrib/R6_2.5.0.tar.gz", repos = NULL, type = "source")
install.packages("https://cran.r-project.org/src/contrib/hash_2.2.6.1.tar.gz", repos = NULL, type = "source")
install.packages("urltools", repos = "http://cran.us.r-project.org")
devtools::install_github("hms-dbmi/pic-sure-r-client", force = TRUE)
devtools::install_github("hms-dbmi/pic-sure-r-adapter-hpds", force = TRUE)
devtools::install_github("hms-dbmi/pic-sure-biodatacatalyst-r-adapter-hpds", force = TRUE)

“installation of package ‘/tmp/Rtmpc7DNOP/downloaded_packages/devtools_1.13.6.tar.gz’ had non-zero exit status”
“unable to access index for repository http://cran.us.r-project.org/src/contrib:
  cannot open URL 'http://cran.us.r-project.org/src/contrib/PACKAGES'”
“package ‘urltools’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages”
Downloading GitHub repo hms-dbmi/pic-sure-r-client@HEAD



stringi (1.6.1 -> 1.6.2) [CRAN]


Installing 1 packages: stringi

Updating HTML index of packages in '.Library'

Making 'packages.html' ...
 done



[32m✔[39m  [90mchecking for file ‘/tmp/Rtmpc7DNOP/remotes5f791da6b679/hms-dbmi-pic-sure-r-client-115deb5/DESCRIPTION’[39m[36m[39m
[90m─[39m[90m  [39m[90mpreparing ‘picsure’:[39m[36m[39m
[32m✔[39m  [90mchecking DESCRIPTION meta-information[39m[36m[39m
[90m─[39m[90m  [39m[90mchecking for LF line-endings in source and make files and shell scripts[39m[36m[39m
[90m─[39m[90m  [39m[90mchecking for empty or unneeded directories[39m[36m[39m
[90m─[39m[90m  [39m[90mbuilding ‘picsure_0.1.0.tar.gz’[39m[36m[39m
   


Downloading GitHub repo hms-dbmi/pic-sure-r-adapter-hpds@HEAD




[32m✔[39m  [90mchecking for file ‘/tmp/Rtmpc7DNOP/remotes5f797abd06ad/hms-dbmi-pic-sure-r-adapter-hpds-2cee5ee/DESCRIPTION’[39m[36m[39m
[90m─[39m[90m  [39m[90mpreparing ‘hpds’:[39m[36m[39m
[32m✔[39m  [90mchecking DESCRIPTION meta-information[39m[36m[39m
[90m─[39m[90m  [39m[90mchecking for LF line-endings in source and make files and shell scripts[39m[36m[39m
[90m─[39m[90m  [39m[90mchecking for empty or unneeded directories[39m[36m[39m
[90m─[39m[90m  [39m[90mbuilding ‘hpds_0.1.1.tar.gz’[39m[36m[39m
   


Downloading GitHub repo hms-dbmi/pic-sure-biodatacatalyst-r-adapter-hpds@HEAD




[32m✔[39m  [90mchecking for file ‘/tmp/Rtmpc7DNOP/remotes5f79659c98d8/hms-dbmi-pic-sure-biodatacatalyst-r-adapter-hpds-d019468/DESCRIPTION’[39m[36m[39m
[90m─[39m[90m  [39m[90mpreparing ‘bdc’:[39m[36m[39m
[32m✔[39m  [90mchecking DESCRIPTION meta-information[39m[36m[39m
[90m─[39m[90m  [39m[90mchecking for LF line-endings in source and make files and shell scripts[39m[36m[39m
[90m─[39m[90m  [39m[90mchecking for empty or unneeded directories[39m[36m[39m
[90m─[39m[90m  [39m[90mbuilding ‘bdc_0.1.0.tar.gz’[39m[36m[39m
   


Load user-defined functions

In [3]:
source("R_lib/utils.R")

## Connecting to a PIC-SURE Network
**Again, before running this notebook, please be sure to get a user-specific security token. For more information about how to proceed, see the "Get your security token" instructions in the [README.md](https://github.com/hms-dbmi/Access-to-Data-using-PIC-SURE-API/tree/harmonized_lipid_measurements_example/NHLBI_BioData_Catalyst#get-your-security-token).**

In [4]:
PICSURE_network_URL <- "https://picsure.biodatacatalyst.nhlbi.nih.gov/picsure"
resource_id <- "02e23f52-f354-4e8b-992c-d37c8b9ba140"
token_file <- "token.txt"

In [5]:
token <- scan(token_file, what = "character")
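
A stray space or trailing newline in the token file can cause confusing authentication failures. As a hedged alternative to `scan()`, the sketch below defines a small helper that fails early with a clear message. `read_token()` is a hypothetical function written for this illustration, not part of the PIC-SURE packages, and the demo writes a placeholder token to a temporary file rather than touching `token.txt`.

```r
# Hypothetical helper (not part of the PIC-SURE packages):
# read a token file, strip surrounding whitespace, and fail early if unusable
read_token <- function(path) {
  if (!file.exists(path)) {
    stop("Token file not found: ", path)
  }
  token <- trimws(paste(readLines(path, warn = FALSE), collapse = ""))
  if (!nzchar(token)) {
    stop("Token file is empty: ", path)
  }
  token
}

# Demo on a temporary file containing a placeholder value with extra whitespace
demo_file <- tempfile(fileext = ".txt")
writeLines("  example-token-value  ", demo_file)
demo_token <- read_token(demo_file)
print(demo_token)
```

In the notebook itself this would be called as `token <- read_token(token_file)`.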

In [6]:
myconnection <- picsure::connect(url = PICSURE_network_URL,
                                 token = token)

[1] "02e23f52-f354-4e8b-992c-d37c8b9ba140"
[2] "70c837be-5ffc-11eb-ae93-0242ac130002"


In [7]:
resource <- bdc::get.resource(myconnection,
                               resourceUUID = resource_id)

[1] "Loading data dictionary... (takes a minute)"


## Longitudinal Lipid Variable Example
Example showing how to extract lipid measurements from multiple visits for different cohorts.

### Access the data
First, we will create multiIndex variable dictionaries of all variables that contain 'lipid' or 'triglyceride'. We will then combine these multiIndex variable dictionaries into `lipid_vars`.
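
The helper `get_multiIndex_variablesDict()` is defined in `R_lib/utils.R`, which is not shown in this notebook. Judging only from the `level_0` ... `level_4`, `simplified_name`, and `name` columns in its output, it appears to split each concept path on backslashes. The sketch below illustrates that idea in base R; `split_concept_path()` is a hypothetical function and the real helper may differ.

```r
# Hypothetical illustration of splitting a concept path into levels.
# The actual get_multiIndex_variablesDict() in R_lib/utils.R may work differently.
split_concept_path <- function(path, n_levels = 5) {
  # Split on a literal backslash and drop the empty leading element
  parts <- strsplit(path, "\\", fixed = TRUE)[[1]]
  parts <- parts[nzchar(parts)]
  # Pad with NA (or truncate) so every path yields exactly n_levels entries
  levels <- c(parts, rep(NA_character_, n_levels))[seq_len(n_levels)]
  names(levels) <- paste0("level_", seq_len(n_levels) - 1)
  # The last path element serves as a short, human-readable name
  c(levels, simplified_name = parts[length(parts)], name = path)
}

example_path <- "\\Framingham Cohort ( phs000007 )\\Lab Work\\Blood\\Lipids\\HEMATOCRIT\\"
example_levels <- split_concept_path(example_path)
print(example_levels)
```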

In [11]:
lipid_varDict <- bdc::find.in.dictionary(resource, 'lipid') %>% bdc::extract.entries()
triglyceride_varDict <- bdc::find.in.dictionary(resource, 'triglyceride') %>% bdc::extract.entries()

lipid_multiindex <- get_multiIndex_variablesDict(lipid_varDict)
triglyceride_multiindex <- get_multiIndex_variablesDict(triglyceride_varDict)

In [14]:
lipid_vars <- rbind(lipid_multiindex, triglyceride_multiindex)
lipid_vars

level_0,level_1,level_2,level_3,level_4,simplified_name,name,observationCount,categorical,categoryValues,min,max,HpdsDataType
<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<lgl>,<list>,<dbl>,<dbl>,<chr>
Framingham Cohort ( phs000007 ),Lab Work,Blood,Lipids,DO YOU LIVE WITH CHILDREN,DO YOU LIVE WITH CHILDREN,\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\DO YOU LIVE WITH CHILDREN\,3391,TRUE,"NO , YES, LESS THAN 3 MONTHS PER YEAR, YES, MORE THAN 3 MONTHS PER YEAR",,,phenotypes
Multi-Ethnic Study of Atherosclerosis (MESA) SHARe ( phs000209 ),"MESA Air Exam Main Dataset: The MESA Air Exam is an ancillary exam of the MESA Study. There are no overlaps in subjects in between MESA Classic Exam datasets, MESA Family Exam datasets, and MESA AIR Exam datasets. However, many of the phenotype measurements are shared betweeen the MESA Classic and MESA Air Exams. Variables included in the MESA Air Exam are those from standard questionnaires, clinical and laboratory measurements. The questionnaires include variables of demography, socioeconomic and psychosocial status, medical and family history, medication use, dietary and alcohol intakes, smoking, and physical activity. The clinical measurements include anthropometry, blood pressure, ankle/brachial blood pressure indices, ECG, coronary calcium determination, arterial wave forms, and flow-dependent brachial artery vasodilation. Laboratory measurements include various lipids, cytokines, adhesion molecules, NO, and hemostasis/fibrinolysis markers.",22: MODERATE WORK MET-min/wk M-Su,,,22: MODERATE WORK MET-min/wk M-Su,"\Multi-Ethnic Study of Atherosclerosis (MESA) SHARe ( phs000209 )\MESA Air Exam Main Dataset: The MESA Air Exam is an ancillary exam of the MESA Study. There are no overlaps in subjects in between MESA Classic Exam datasets, MESA Family Exam datasets, and MESA AIR Exam datasets. However, many of the phenotype measurements are shared betweeen the MESA Classic and MESA Air Exams. Variables included in the MESA Air Exam are those from standard questionnaires, clinical and laboratory measurements. The questionnaires include variables of demography, socioeconomic and psychosocial status, medical and family history, medication use, dietary and alcohol intakes, smoking, and physical activity. 
The clinical measurements include anthropometry, blood pressure, ankle/brachial blood pressure indices, ECG, coronary calcium determination, arterial wave forms, and flow-dependent brachial artery vasodilation. Laboratory measurements include various lipids, cytokines, adhesion molecules, NO, and hemostasis/fibrinolysis markers.\22: MODERATE WORK MET-min/wk M-Su\",251,FALSE,,0.0,15120,phenotypes
Framingham Cohort ( phs000007 ),Lab Work,Blood,Lipids,CDI - RHEUMATIC HEART DISEASE,CDI - RHEUMATIC HEART DISEASE,\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\CDI - RHEUMATIC HEART DISEASE\,4063,TRUE,"MAYBE, NO , YES",,,phenotypes
Framingham Cohort ( phs000007 ),Lab Work,Blood,Lipids,ECG - IV BLOCK PATTERN,ECG - IV BLOCK PATTERN,\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\ECG - IV BLOCK PATTERN\,699,TRUE,"INDETERMINATE, LEFT , NO IV BLOCK , RIGHT",,,phenotypes
Framingham Cohort ( phs000007 ),Lab Work,Blood,Lipids,"DURING THE PAST YEAR, HOW OFTEN HAVE YOU PARTICIPATED IN THE FOLLOWING LEISURE TIME ACTIVITIES? --PLAYING TENNIS OR GOLF","DURING THE PAST YEAR, HOW OFTEN HAVE YOU PARTICIPATED IN THE FOLLOWING LEISURE TIME ACTIVITIES? --PLAYING TENNIS OR GOLF","\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\DURING THE PAST YEAR, HOW OFTEN HAVE YOU PARTICIPATED IN THE FOLLOWING LEISURE TIME ACTIVITIES? --PLAYING TENNIS OR GOLF\",272,TRUE,"NEVER , OCCASIONALLY (LESS THAN ONCE A MONTH) , ONCE WEEKLY (1 DAY PER WEEK) , SEVERAL DAYS PER WEEK (2-6 DAYS PER WEEK)",,,phenotypes
Multi-Ethnic Study of Atherosclerosis (MESA) SHARe ( phs000209 ),"MESA Air Exam Main Dataset: The MESA Air Exam is an ancillary exam of the MESA Study. There are no overlaps in subjects in between MESA Classic Exam datasets, MESA Family Exam datasets, and MESA AIR Exam datasets. However, many of the phenotype measurements are shared betweeen the MESA Classic and MESA Air Exams. Variables included in the MESA Air Exam are those from standard questionnaires, clinical and laboratory measurements. The questionnaires include variables of demography, socioeconomic and psychosocial status, medical and family history, medication use, dietary and alcohol intakes, smoking, and physical activity. The clinical measurements include anthropometry, blood pressure, ankle/brachial blood pressure indices, ECG, coronary calcium determination, arterial wave forms, and flow-dependent brachial artery vasodilation. Laboratory measurements include various lipids, cytokines, adhesion molecules, NO, and hemostasis/fibrinolysis markers.",RESULTS OF ECG,,,RESULTS OF ECG,"\Multi-Ethnic Study of Atherosclerosis (MESA) SHARe ( phs000209 )\MESA Air Exam Main Dataset: The MESA Air Exam is an ancillary exam of the MESA Study. There are no overlaps in subjects in between MESA Classic Exam datasets, MESA Family Exam datasets, and MESA AIR Exam datasets. However, many of the phenotype measurements are shared betweeen the MESA Classic and MESA Air Exams. Variables included in the MESA Air Exam are those from standard questionnaires, clinical and laboratory measurements. The questionnaires include variables of demography, socioeconomic and psychosocial status, medical and family history, medication use, dietary and alcohol intakes, smoking, and physical activity. The clinical measurements include anthropometry, blood pressure, ankle/brachial blood pressure indices, ECG, coronary calcium determination, arterial wave forms, and flow-dependent brachial artery vasodilation. 
Laboratory measurements include various lipids, cytokines, adhesion molecules, NO, and hemostasis/fibrinolysis markers.\RESULTS OF ECG\",251,TRUE,COMPLETE,,,phenotypes
Multi-Ethnic Study of Atherosclerosis (MESA) SHARe ( phs000209 ),"MESA Air Exam Main Dataset: The MESA Air Exam is an ancillary exam of the MESA Study. There are no overlaps in subjects in between MESA Classic Exam datasets, MESA Family Exam datasets, and MESA AIR Exam datasets. However, many of the phenotype measurements are shared betweeen the MESA Classic and MESA Air Exams. Variables included in the MESA Air Exam are those from standard questionnaires, clinical and laboratory measurements. The questionnaires include variables of demography, socioeconomic and psychosocial status, medical and family history, medication use, dietary and alcohol intakes, smoking, and physical activity. The clinical measurements include anthropometry, blood pressure, ankle/brachial blood pressure indices, ECG, coronary calcium determination, arterial wave forms, and flow-dependent brachial artery vasodilation. Laboratory measurements include various lipids, cytokines, adhesion molecules, NO, and hemostasis/fibrinolysis markers.","SYMPATHOMIMETICS, ORAL AND INHALED",,,"SYMPATHOMIMETICS, ORAL AND INHALED","\Multi-Ethnic Study of Atherosclerosis (MESA) SHARe ( phs000209 )\MESA Air Exam Main Dataset: The MESA Air Exam is an ancillary exam of the MESA Study. There are no overlaps in subjects in between MESA Classic Exam datasets, MESA Family Exam datasets, and MESA AIR Exam datasets. However, many of the phenotype measurements are shared betweeen the MESA Classic and MESA Air Exams. Variables included in the MESA Air Exam are those from standard questionnaires, clinical and laboratory measurements. The questionnaires include variables of demography, socioeconomic and psychosocial status, medical and family history, medication use, dietary and alcohol intakes, smoking, and physical activity. 
The clinical measurements include anthropometry, blood pressure, ankle/brachial blood pressure indices, ECG, coronary calcium determination, arterial wave forms, and flow-dependent brachial artery vasodilation. Laboratory measurements include various lipids, cytokines, adhesion molecules, NO, and hemostasis/fibrinolysis markers.\SYMPATHOMIMETICS, ORAL AND INHALED\",248,TRUE,"NO , YES",,,phenotypes
Framingham Cohort ( phs000007 ),Lab Work,Blood,Lipids,"EYE EXAMINATION: RETINA, EXAM 1","EYE EXAMINATION: RETINA, EXAM 1","\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\EYE EXAMINATION: RETINA, EXAM 1\",4823,TRUE,"ABNORMAL, GROUP I , ABNORMAL, GROUP II , ABNORMAL, GROUP III , ABNORMAL, GROUP IV , ABNORMAL, GROUP UNSPECIFIED, NORMAL",,,phenotypes
Framingham Cohort ( phs000007 ),Lab Work,Blood,Lipids,HEMATOCRIT,HEMATOCRIT,\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\HEMATOCRIT\,7969,FALSE,,25.0,62,phenotypes
Framingham Cohort ( phs000007 ),Lab Work,Blood,Lipids,ECHO: VALVE-MITRAL-EXCURSION (MM),ECHO: VALVE-MITRAL-EXCURSION (MM),\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\ECHO: VALVE-MITRAL-EXCURSION (MM)\,3604,FALSE,,10.0,37,phenotypes
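
One thing to keep in mind when stacking search results with `rbind()`: a variable whose concept path matches both keywords would appear twice. Whether that happens in the real dictionary is not shown here, so the sketch below uses made-up toy data frames (`lipid_hits` and `trig_hits` are hypothetical) to show how duplicates could be detected and dropped by concept path.

```r
# Toy search results (made-up paths and counts, for illustration only)
lipid_hits <- data.frame(
  name = c("\\Study\\Lipids\\Cholesterol\\", "\\Study\\Lipids\\Triglycerides\\"),
  observationCount = c(10L, 20L),
  stringsAsFactors = FALSE
)
trig_hits <- data.frame(
  name = c("\\Study\\Lipids\\Triglycerides\\", "\\Study\\Labs\\Triglyceride level\\"),
  observationCount = c(20L, 5L),
  stringsAsFactors = FALSE
)

# Stack the two result sets; the shared path now appears twice
combined <- rbind(lipid_hits, trig_hits)

# Keep the first occurrence of each concept path
deduplicated <- combined[!duplicated(combined$name), ]

print(nrow(combined))
print(nrow(deduplicated))
```

Applied to the notebook's objects, the same idea would be `lipid_vars[!duplicated(lipid_vars$name), ]`.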


### Identify the longitudinal lipid variables
This block of code does the following:
- uses the multiindex dataframe containing variables which are related to 'lipid' or 'triglyceride'
- filters for variables with keywords 'exam #' or 'visit #'
- extracts the exam number of each variable into column `exam_number`
- groups variables by study (`level_0`) and longitudinal variable (`longvar`)
- returns a table showing the variables that have more than one exam recorded

In [15]:
longitudinal_lipid_vars <- lipid_vars %>%
    # Filter to variables containing exam # or visit #
    filter((grepl('exam \\d+', name, ignore.case=TRUE) |
          grepl('visit \\d+', name, ignore.case=TRUE))) %>%
    # Save exam # as exam_number and variable without exam # info as longvar
    mutate(exam_number = str_extract(name, regex("(exam \\d+)|(visit \\d+)", ignore_case = TRUE)),
           longvar = tolower(str_replace_all(name, regex('(exam|visit) \\d+', ignore_case = TRUE), 'exam'))) %>%
    # Group by level_0 (study) and longvar
    group_by(level_0, longvar) %>%
    # Count number of exams for each longvar
    summarise(n_exams = n_distinct(exam_number)) %>%
    # Find longvars with 2+ exams (longitudinal variables)
    filter(n_exams > 1) %>% 
    arrange(desc(n_exams))
    
longitudinal_lipid_vars

`summarise()` has grouped output by 'level_0'. You can override using the `.groups` argument.



level_0,longvar,n_exams
<chr>,<chr>,<int>
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\blood analysis: serum cholesterol, exam\",7
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\blood pressure: first examiner, diastolic, exam\",7
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\blood pressure: first examiner, systolic, exam\",7
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\blood pressure: second examiner, diastolic, exam\",7
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\blood pressure: second examiner, systolic, exam\",7
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\ecg: atrioventricular block, exam\",7
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\ecg: myocardial infarction, exam\",7
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\relative weight, exam\",7
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\urinalysis: sugar, exam\",7
Framingham Cohort ( phs000007 ),"\framingham cohort ( phs000007 )\lab work\blood\lipids\weight, exam\",7


*Note: Some variables have capitalization differences, which is why* `longvar` *has been changed to lowercase.*
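
To see the filtering and grouping logic in isolation, here is a minimal base R sketch on a handful of made-up concept paths (the study and variable names are hypothetical). It mirrors the steps of the dplyr pipeline without reproducing it exactly; in particular, this sketch also lowercases the extracted exam label so that "EXAM 1" and "Exam 1" would count as the same exam.

```r
# Made-up concept paths with inconsistent capitalization
paths <- c(
  "\\Study A\\Labs\\Serum cholesterol, EXAM 1\\",
  "\\Study A\\Labs\\Serum Cholesterol, Exam 2\\",
  "\\Study A\\Labs\\SERUM CHOLESTEROL, exam 3\\",
  "\\Study A\\Labs\\Hematocrit\\",
  "\\Study B\\Labs\\Weight, VISIT 1\\"
)
pattern <- "(exam|visit) \\d+"

# Keep only paths that mention an exam or visit number
kept <- paths[grepl(pattern, paths, ignore.case = TRUE)]

# Extract the first exam/visit label from each path (lowercased in this sketch)
exam_number <- tolower(regmatches(kept, regexpr(pattern, kept, ignore.case = TRUE)))

# Replace the exam/visit label with a fixed token and lowercase the path,
# so that all exams of the same variable share one key
longvar <- tolower(gsub(pattern, "exam", kept, ignore.case = TRUE))

# Count distinct exams per variable and keep those with more than one
n_exams <- tapply(exam_number, longvar, function(x) length(unique(x)))
longitudinal <- n_exams[n_exams > 1]
print(longitudinal)
```

Without the `tolower()` call on `longvar`, the three cholesterol paths would form three separate groups of one exam each and none would be reported as longitudinal.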

Now that we know which longitudinal variables are available to us, we can choose a variable of interest and extract the patient- and visit-level data associated with it.

*Note that* `longvar` *is not equivalent to the actual PIC-SURE concept path needed to query for this variable: the exam numbers have been replaced and the text has been lowercased. The table above is therefore not enough on its own to get the variables of interest; we will need to recover the original concept paths from the* `name` *column of* `lipid_vars`*.*

### Isolate variables of interest

In this example, we will choose to further investigate the first longitudinal variable in the `longitudinal_lipid_vars` dataframe we generated above.

In [16]:
my_variable <- longitudinal_lipid_vars$longvar[1]
print(my_variable)

[1] "\\framingham cohort ( phs000007 )\\lab work\\blood\\lipids\\blood analysis: serum cholesterol, exam\\"


To add the longitudinal variable of interest to our PIC-SURE query, we need to find its original concept paths in the multiindex data dictionary we created earlier (`lipid_vars`).

*Note: Some variables have minor text differences. The workaround here is to separate the variable into parts. We split* `longvar` *wherever it says "exam" or "visit" and store the pieces in the variable* `keywords`*. Then we check whether each of these pieces is in the variable name.*

*This workaround does not work for every variable, so be sure to double check that you are selecting all longitudinal variables of interest.*

In [17]:
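# Remove punctuation that gives R trouble
fixed_my_variable <- str_replace_all(my_variable, '[[:punct:]]', '')
# Split fixed_my_variable into separate strings wherever 'exam' or 'visit' appears.
# strsplit() recycles a vector of split patterns across its inputs, so with a single
# input only the first pattern would be used; a regex alternation covers both words.
keywords <- unlist(strsplit(fixed_my_variable, 'exam|visit'))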

keywords

In [18]:
# Filter the multiindex dataframe (lipid_vars) to get the query variables
query_vars <- lipid_vars %>%
                # Remove punctuation from the concept path and make it lowercase
                mutate(new_name = tolower(str_replace_all(name, '[[:punct:]]', '')),
                       test_val = sapply(keywords, # For each string in keywords,
                                         grepl, # see if it is in...
                                         new_name, # the concept path
                                         ignore.case=TRUE),
                       other = apply(test_val, 1, sum)) %>% # Count how many keywords match new_name
                filter(other == length(keywords)) %>% # Keep only rows where all keywords matched new_name
                pull(name) # Return only full concept paths
query_vars

The resulting `query_vars` variable contains the variables we will want to add to our query. 
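
Because the keyword workaround can match more than intended, it may help to try it on a small set of paths first. The sketch below uses base R and five Framingham concept paths that appear elsewhere in this notebook. The helper `clean()` and the extra "exactly one exam label" check are assumptions added for illustration; whether combined variables such as "EXAM 1 OR EXAM 2" should be kept depends on your analysis.

```r
prefix <- "\\Framingham Cohort ( phs000007 )\\Lab Work\\Blood\\Lipids\\"
candidates <- paste0(prefix, c(
  "BLOOD ANALYSIS: SERUM CHOLESTEROL (MG/100 ML)",
  "BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 1 OR EXAM 2",
  "BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 1",
  "BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 2",
  "HEMATOCRIT"
), "\\")
longvar <- "\\framingham cohort ( phs000007 )\\lab work\\blood\\lipids\\blood analysis: serum cholesterol, exam\\"

# Hypothetical helper: strip punctuation and lowercase, as in the workaround
clean <- function(x) tolower(gsub("[[:punct:]]", "", x))

# Split the cleaned longvar on 'exam' or 'visit' and drop blank pieces
keywords <- unlist(strsplit(clean(longvar), "exam|visit"))
keywords <- keywords[nzchar(trimws(keywords))]

# A candidate matches when every keyword occurs in its cleaned path
matches_all <- vapply(clean(candidates), function(p) {
  all(vapply(keywords, grepl, logical(1), x = p, fixed = TRUE))
}, logical(1), USE.NAMES = FALSE)

# Optional stricter check: count exam/visit labels in each original path
n_exam_labels <- lengths(regmatches(
  candidates, gregexpr("(exam|visit) \\d+", candidates, ignore.case = TRUE)
))
strict_paths <- candidates[matches_all & n_exam_labels == 1]

print(data.frame(matches_all, n_exam_labels))
print(strict_paths)
```

In this toy run the keyword match alone also accepts the "(MG/100 ML)" and "EXAM 1 OR EXAM 2" variables, which is worth knowing when you double check the selection.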

### Create & run query
First, we will create a new query object.

In [19]:
my_query <- bdc::new.query(resource = resource)

We will use the `bdc::query.anyof.add()` method. This lets us include all input variables in the output while keeping only patient records that contain at least one non-null value for those variables. See the `1_PICSURE_API_101.ipynb` notebook for a more in-depth explanation of query methods.

In [20]:
bdc::query.anyof.add(query = my_query,
                      keys = lapply(query_vars, as.character))
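
The "any of" behavior described above can be pictured on a small local table. The sketch below is only an analogy in base R on made-up data (the column names `exam_1` and `exam_2` are hypothetical); it does not call the PIC-SURE API.

```r
# Toy patient table: NA means no recorded value for that exam
toy <- data.frame(
  patient_id = 1:4,
  exam_1 = c(133, NA, NA, 150),
  exam_2 = c(171, 233, NA, NA)
)
value_cols <- c("exam_1", "exam_2")

# "Any of": keep a patient if at least one of the selected variables is non-missing
has_any <- rowSums(!is.na(toy[value_cols])) > 0
any_of_result <- toy[has_any, ]
print(any_of_result)
```

Patient 3 has no value for either exam and is dropped; patients 2 and 4 are kept even though each is missing one exam.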

#### Update consent codes if necessary
Uncomment the code below and run it as necessary to restrict your query to certain consent codes.
As written, the commented-out example would restrict the query to the 'phs000179.c2' consent code; substitute the consent code(s) that apply to your study of interest.

In [21]:
# Delete current consents
#bdc::query.filter.delete(query = my_query,
#                      keys = "\\_consents\\")

# Add in consents
#bdc::query.filter.add(query = my_query,
#                      keys = "\\_consents\\",
#                      as.list(c("phs000179.c2")))

We can now run our query:

In [22]:
my_df <- bdc::query.run(my_query, result.type = "dataframe")

Our dataframe contains one column for each exam / visit of the longitudinal variable of interest, with each row representing a patient. To be included in the output, a patient must have at least one reported value for at least one of these exams / visits.

In [23]:
my_df

Patient ID,\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\BLOOD ANALYSIS: SERUM CHOLESTEROL (MG/100 ML)\,"\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 1 OR EXAM 2\","\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 1\","\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 2\","\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 3\","\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 4\","\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 5\","\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 6\","\Framingham Cohort ( phs000007 )\Lab Work\Blood\Lipids\BLOOD ANALYSIS: SERUM CHOLESTEROL, EXAM 7\",\_Parent Study Accession with Subject ID\,\_Topmed Study Accession with Subject ID\,\_consents\
<int>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<chr>,<chr>,<chr>
54641,200,133,133,171,160,201,186,198,155,phs000007.v30_1,,phs000007.c1
54643,241,296,296,306,260,301,307,256,281,phs000007.v30_3,,phs000007.c1
54644,212,155,155,194,189,205,229,217,240,phs000007.v30_4,,phs000007.c1
54646,254,233,,233,273,225,267,280,256,phs000007.v30_7,,phs000007.c1
54652,,209,,209,202,184,281,290,,phs000007.v30_16,,phs000007.c1
54654,187,150,150,178,169,199,202,261,245,phs000007.v30_20,,phs000007.c1
54657,239,232,232,231,249,252,246,270,303,phs000007.v30_27,,phs000007.c1
54659,222,184,,184,165,206,204,215,214,phs000007.v30_29,phs000974.v3_29,phs000007.c2
54664,230,184,,184,226,198,244,217,238,phs000007.v30_39,,phs000007.c2
54667,226,209,,209,245,199,206,227,228,phs000007.v30_44,phs000974.v3_44,phs000007.c1
