# Table of Contents

1. <a href="#REDCap-Overview">REDCap Overview</a>  
    1a. <a href="#Definitions">Definitions</a>  
    1b. <a href="#REDCap-Data-Building-Blocks">REDCap Data Building Blocks</a>  
2. <a href="#Gathered-Survey-Data">Gathered Survey Data</a>  
    2a. <a href="#Records">Records</a>  
    2b. <a href="#Reports">Reports</a>  
    2c. <a href="#Files---optional-attachments-to-individual-records">Files - optional attachments to individual records</a>  
3. <a href="#Metadata">Metadata</a>  
4. <a href="#Data-Validations">Data Validations</a>  
5. <a href="#Appendix">Appendix</a>  
    5a. <a href="#Cannot-be-Imported">Cannot be Imported</a>  
    5b. <a href="#Example-REDCap-Projects-and-Data-Structures">Example REDCap Projects and Data Structures</a>  
    5c. <a href="#Example:-Uploading-Records-from-a-CSV">Example: Uploading Records from a CSV</a>  

# REDCap Overview

REDCap is a web-based application developed by Vanderbilt University to capture data for clinical research and create databases and projects. REDCap revolutionizes the survey development process and empowers users to rapidly create surveys tailored to their specific public health needs. REDCap allows the main administrator of a project to assign specific rights to individuals to access data and create reports. REDCap also provides clear audit trails for tracking the history of data entry and revision. *For more information, see the Daily Dose article on REDCap [here.](https://stateofwa.sharepoint.com/sites/DOH-dailydose/SitePages/Welcome-to-REDCap,-your-tool-for-public-health-data-management-success!.aspx?xsdata=MDV8MDJ8fDY0MjMzNWIyOTQwZjQzNjk0MTRlMDhkYzAwYzYyNjg4fDExZDBlMjE3MjY0ZTQwMGE4YmEwNTdkY2MxMjdkNzJkfDB8MHw2MzgzODYwOTgyMjk3NTYyMzh8VW5rbm93bnxWR1ZoYlhOVFpXTjFjbWwwZVZObGNuWnBZMlY4ZXlKV0lqb2lNQzR3TGpBd01EQWlMQ0pRSWpvaVYybHVNeklpTENKQlRpSTZJazkwYUdWeUlpd2lWMVFpT2pFeGZRPT18MXxMMk5vWVhSekx6RTVPbTFsWlhScGJtZGZUMGRTYkU1RVZtcE5iVVYwVDFSTk5WcHBNREJhYWxrd1RGZEthMDFVUlhSYWFrVjRUbnBzYWs5RVJtcGFSRnBwUUhSb2NtVmhaQzUyTWk5dFpYTnpZV2RsY3k4eE56QXpNREV6TURBMk9URTJ8OWEzZGY2NTg0MTgwNDE5OTQxNGUwOGRjMDBjNjI2ODh8MjI0NGZhYzdiNjc5NGI2YmFhMjU4NDg4NGQ1MGViMTU%3D&sdata=THg5c1g2dy9FajR5RmpMNXBsRUtCWDRKQTZqYUF6eEQyMUJ1dVM2WTVjaz0%3D&ovuser=11d0e217-264e-400a-8ba0-57dcc127d72d%2CEmily.Pearman%40doh.wa.gov&OR=Teams-HL&CT=1703015604587&clickparams=eyJBcHBOYW1lIjoiVGVhbXMtRGVza3RvcCIsIkFwcFZlcnNpb24iOiIyNy8yMzExMDIyNDcwNSIsIkhhc0ZlZGVyYXRlZFVzZXIiOmZhbHNlfQ%3D%3D)*

## Definitions

### Records

Records are the set of information for a unique participant. Each record is composed of a number of fields (pieces of data), which can be spread across multiple instruments per record.  

Importing records writes the provided data into the specified record(s), creating any record that does not already exist. The record_id(s) to be imported can be named explicitly or assigned automatically based on the next available record_id in the REDCap project.

### Fields

Fields are the individual places where data can be recorded (e.g., a question on a survey). For an import to succeed, the column names in the data being imported must match the field names in the REDCap project.
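
One way to catch mismatched column names before an import is to compare them against the field names in the project's data dictionary. The sketch below uses a small hand-built `metadata` data frame as a stand-in for the result of `redcap_metadata_read()`; the field names and the misspelled `lastname` column are illustrative assumptions.

```r
# Stand-in for redcap_metadata_read(...)$data; only field_name is needed here.
metadata <- data.frame(
  field_name = c("record_id", "first_name", "last_name"),
  stringsAsFactors = FALSE
)

# Data to import, with one column name that does not match the project.
df <- data.frame(
  record_id  = c(7, 8),
  first_name = c("John", "Jane"),
  lastname   = c("Doe", "Doe")
)

# Columns present in the data but absent from the data dictionary.
unknown_cols <- setdiff(names(df), metadata$field_name)

if (length(unknown_cols) > 0) {
  message("Columns not found in the project: ", paste(unknown_cols, collapse = ", "))
}
```

Columns such as `redcap_event_name` and the `*_complete` status fields are not listed in the data dictionary, so a real check would need to allow for them.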

### Instruments/Forms

An instrument is a collection of fields used to collect data. Instruments may be referred to as "forms" when filled out by a project user or as "surveys" when filled out by external users (via a web link or email invitation). 

Instruments may be repeating and can be in either longitudinal or non-longitudinal projects. Repeating instruments can be set up to repeat a defined number of times or repeat an indefinite number of times. Each repeat is called an “instance”.

### Events

Events can only be enabled in longitudinal projects. An event may be a time point during your longitudinal project, such as a participant visit or a task to be performed. Events can have multiple instruments. The default is 1 event. 

Events may be repeating or non-repeating. Repeating events can be set up so that all instruments in an event repeat together or only certain instruments are repeated from the event. Each repeat is called an “instance”.

### Arms

Events can be grouped into 'arms'. There may be one or more arms for a project. Arms can be thought of as different groups in a clinical trial (e.g., a test group and a control group). Each arm can have as many events as you wish. Each arm can have the same events or different events with different instruments. The default is 1 arm.

### Metadata

Metadata refers to the project's setup characteristics, including field attributes grouped by instrument assignment. Metadata can be thought of as the project's data dictionary.

### Reports

Reports are a good way to view data from multiple records at once. REDCap has two default reports (A & B) and additional custom reports can be created. Report A displays all data for all records, while Report B can be customized to display data from specified instruments or events. 

## REDCap Data Building Blocks

<img src="../files/REDCap API Tutorial - Building Blocks.png" width=900/>

*For examples of REDCap project set-ups and their expected data structures, see <a href="#Example-REDCap-Projects-and-Data-Structures">Example REDCap Projects and Data Structures</a>.*

In [1]:
library(REDCapR)
library(dplyr)


Attaching package: 'dplyr'


The following objects are masked from 'package:stats':

    filter, lag


The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union




In [5]:
token <- jsonlite::fromJSON('json_api_data.json')$dev_token$'276'
url <- "https://dev-redcap.doh.wa.gov/api/"

# Gathered Survey Data

## Records

Records can be imported into a REDCap project from a data frame in R using `redcap_write_oneshot()`, which writes all records at once, or `redcap_write()`, which sends the records in batches so that large imports do not overwhelm the server.  

If a record_id being imported already exists in the REDCap project, the imported data will overwrite the existing data for that record. A record_id that does not yet exist will create a new record. The `redcap_next_free_record_name()` function returns the next unused record_id. If Data Access Groups (DAGs) are used in the project, the function accounts for the special record name format for users in DAGs, where the unique auto-assigned DAG number is a prefix to the actual record_id (in the form DAG-ID; e.g., for a user assigned to a DAG numbered 1732 with 3 existing records, it returns 1732-4).
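
To illustrate the naming pattern described above, the sketch below computes the next record name from a vector of existing names, with and without a DAG prefix. `next_record_name()` is a hypothetical helper written for this illustration; in practice REDCap determines the value and `redcap_next_free_record_name()` returns it.

```r
# Hypothetical helper: mimics the naming pattern only, not REDCap's own logic.
next_record_name <- function(existing, dag_id = NULL) {
  if (is.null(dag_id)) {
    # Plain numeric record names: take the largest and add one.
    ids <- suppressWarnings(as.integer(existing))
    return(as.character(max(c(0L, ids), na.rm = TRUE) + 1L))
  }
  # DAG users: only names carrying this DAG's prefix are considered.
  prefix <- paste0(dag_id, "-")
  in_dag <- existing[startsWith(existing, prefix)]
  ids    <- as.integer(sub(prefix, "", in_dag, fixed = TRUE))
  paste0(prefix, max(c(0L, ids)) + 1L)
}

no_dag   <- next_record_name(c("7", "8", "9"))
with_dag <- next_record_name(c("1732-1", "1732-2", "1732-3"), dag_id = "1732")
```

With three existing records in DAG 1732, `with_dag` follows the prefix-hyphen-number pattern from the example in the text.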

In [12]:
# Define data to import
df1 <- data.frame(record_id  = c(7,8),
                  first_name = c("John","Jane"),
                  last_name = c("Doe","Doe")
                  )

In [13]:
redcap_write_oneshot(df1, redcap_uri=url, token=token)

ERROR: Error in redcap_write_oneshot(df1, redcap_uri = url, token = token): Assertion on 'token' failed: Must be of type 'character', not 'NULL'.
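
The assertion error above indicates that `token` was `NULL` when the call ran, which is what R returns when a list lookup such as `$dev_token$'276'` finds no matching entry (for example, if the cell defining `token` has not been run or the project ID is absent from the file). A minimal sketch of a guard that fails early with a clearer message is shown below; the nested list stands in for the parsed JSON file, the token value is fake, and `get_token()` is a hypothetical helper, not part of REDCapR.

```r
# Stand-in for jsonlite::fromJSON("json_api_data.json"); the value is fake.
creds <- list(dev_token = list("276" = "0123456789ABCDEF0123456789ABCDEF"))

# Hypothetical helper: look up a token by project ID and stop if it is missing.
get_token <- function(creds, project_id) {
  token <- creds$dev_token[[as.character(project_id)]]
  if (is.null(token) || !is.character(token) || !nzchar(token)) {
    stop("No API token found for project ", project_id, call. = FALSE)
  }
  token
}

token <- get_token(creds, 276)

# A missing project ID now produces an informative error instead of NULL.
msg <- tryCatch(get_token(creds, 999), error = function(e) conditionMessage(e))
```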


In [8]:
df2 <- data.frame(record_id  = 9,
                  first_name = "John",
                  last_name = "Doe"
                  )

In [6]:
redcap_write(df2, redcap_uri=url, token=token)
#optional argument: batch_size = 100 (default)

Starting to update 1 records to be written at 2023-12-29 09:05:33.

Writing batch 1 of 1, with indices 1 through 1.

1 records were written to REDCap in 0.4 seconds.
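
To make the batching concrete, the sketch below splits row indices into groups of at most `batch_size` rows, similar in spirit to the "indices 1 through 1" message above. It is an illustration only and not REDCapR's internal code; `batch_indices()` is a hypothetical helper.

```r
# Hypothetical helper: start and stop row index of each batch.
batch_indices <- function(n_rows, batch_size = 100) {
  stopifnot(n_rows >= 1, batch_size >= 1)
  starts <- seq(1, n_rows, by = batch_size)
  stops  <- pmin(starts + batch_size - 1, n_rows)
  data.frame(batch = seq_along(starts), start = starts, stop = stops)
}

# 250 rows with the default batch size give three batches,
# the last one shorter than the others.
batches <- batch_indices(250, batch_size = 100)
```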



In [7]:
redcap_next_free_record_name(redcap_uri=url, token=token)

The next free record name in REDCap was successfully determined in 0.3 seconds.  The http status code was 200.  Is is 10.



## Files - optional attachments to individual records

File upload fields are a distinct field type in REDCap that accepts a variety of file types, including images and other documents. Unlike the other import methods, importing files only works for one file field for one record at a time. 

If the project is longitudinal (i.e., it has events), the event name must be specified. If the file of interest is in a repeating instance, the instance number must also be specified.

In [10]:
filepath <- '../files/test_file.png'
redcap_file_upload_oneshot(file_name=filepath, record=7, field='test_upload',
                           event='case_intake_arm_1', redcap_uri=url, token=token)

Preparing to upload the file `../files/test_file.png`.

file uploaded to REDCap in 0.5 seconds.



# Metadata

Metadata refers to the project's setup characteristics, including field attributes grouped by instrument assignment. Metadata can be thought of as the project's data dictionary.  

In this example, the metadata is exported using `redcap_metadata_read()` and then re-imported unchanged, so no fields are modified. 

In [9]:
metadata <- redcap_metadata_read(redcap_uri=url, token=token)$data

The data dictionary describing 30 fields was read from REDCap in 0.4 seconds.  The http status code was 200.



In [10]:
head(metadata)

field_name,form_name,section_header,field_type,field_label,select_choices_or_calculations,field_note,text_validation_type_or_show_slider_number,text_validation_min,text_validation_max,identifier,branching_logic,required_field,custom_alignment,question_number,matrix_group_name,matrix_ranking,field_annotation
<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
record_id,demographics,,text,Study ID,,,,,,,,,,,,,
first_name,demographics,Personal Information,text,First Name,,,,,,,,,,,,,
last_name,demographics,,text,Last Name,,,,,,,,,,,,,
phone_num,demographics,,text,Phone Number,,,phone,,,,,,,,,,
zip_code,demographics,,text,ZIP Code,,,integer,10001.0,99999.0,y,,,,,,,
dob,demographics,,text,Date of birth,,,date_mdy,,,y,,,,,,,


In [11]:
redcap_metadata_write(metadata, redcap_uri=url, token=token)

30 fields were written to the REDCap dictionary in 0.8 seconds.



# Data Validations

REDCapR has a few data validation functions that can be used to check your data before importing it to your REDCap project. These validations are not specific to your particular REDCap project; they are general checks that apply to all REDCap projects. For example, you can check whether you have any logical values (TRUE/FALSE), since REDCap will only accept 0/1 integers on import. You can also check for duplicate and unique IDs. You can view more details on these data validation functions [here](https://ouhscbbmc.github.io/REDCapR/reference/validate.html).
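
The kinds of checks described above can also be sketched in base R, which helps show what the validators look for. The code below is a hand-written illustration, not REDCapR's implementation, and the example data frame is made up.

```r
d <- data.frame(
  record_id = c(1, 2, 2),
  consented = c(TRUE, FALSE, TRUE),
  age       = c(34, 51, 51)
)

# 1. Logical columns: convert TRUE/FALSE to 1/0 before import.
logical_cols <- names(d)[vapply(d, is.logical, logical(1))]
d[logical_cols] <- lapply(d[logical_cols], as.integer)

# 2. Duplicate IDs: collect every record_id that appears more than once.
dup_ids <- unique(d$record_id[duplicated(d$record_id)])
```

In a longitudinal project or one with repeating instruments, several rows per `record_id` are expected, so a duplicate check would need to include the event and repeat columns as well.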

# Appendix

## Cannot be Imported 
Field Names <br>
- Importing this would be covered by the `redcap_metadata_write()` function. <br>
- Can be exported using `redcap_variables()`. <br>

Forms/Instruments <br>
- Importing this would be covered by the `redcap_metadata_write()` function. <br>
- Can be exported using `redcap_instruments()` or downloaded using `redcap_instrument_download()`. <br>

Instrument Event Map <br>
- Importing this would be covered by the `redcap_metadata_write()` function. <br> 
- Can be exported using `redcap_event_instruments()`. <br>

Reports <br>
- Cannot be imported. <br>
- Can be exported using `redcap_report()`. <br>

Users <br>
- Cannot be imported with REDCapR. <br> 
- Can be exported using `redcap_users_export()`. <br>

User Roles <br>
- Cannot be imported with REDCapR. <br> 
- Can be exported using `redcap_users_export()`, if applicable. <br>

Data Access Groups (DAGs) <br>
- Cannot be imported with REDCapR. <br> 
- Can be exported using `redcap_dag_read()`. <br>

Logging <br>
- Cannot be imported. <br> 
- Can be exported using `redcap_log_read()`. <br>

Can be imported in REDCap but not REDCapR: [ADD HERE] <br>

## Example REDCap Projects and Data Structures

<img src="../files/REDCap API Tutorial - Examples v2 (non-long).png" width=800/>

<img src="../files/REDCap API Tutorial - Examples v2 (long) pt1.png" width=800/>

<img src="../files/REDCap API Tutorial - Examples v2 (long) pt2.png" width=800/>

The base of every unique key is always the `record_id`. In non-longitudinal projects, there may also be `redcap_repeat_instrument` and `redcap_repeat_instance` columns if repeating instruments are enabled. In longitudinal projects, there will be a `redcap_event_name` column, as well as a `redcap_repeat_instance` column in the case of repeating events and a `redcap_repeat_instrument` column in the case of independently repeating instruments. 

This unique key applies to studies with and without multiple arms. Each value of `redcap_event_name` includes the study arm as a suffix. The suffix will automatically be *"_arm_1"* for longitudinal studies without additional arms.  

These special unique key fields must be appropriately filled out in the data being imported to REDCap.
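
A quick way to confirm that the key columns are filled out consistently is to check that no two rows share the same combination of key values. The sketch below assumes the column names used in REDCap exports (`redcap_event_name`, `redcap_repeat_instrument`, `redcap_repeat_instance`) and uses a few made-up rows for a single record.

```r
d <- data.frame(
  record_id                = c(3, 3, 3, 3),
  redcap_event_name        = c("personal_info_arm_1", "notifications_arm_1",
                               "notifications_arm_1", "notifications_arm_1"),
  redcap_repeat_instrument = c(NA, NA, "close_contacts", "close_contacts"),
  redcap_repeat_instance   = c(NA, NA, 1, 2),
  stringsAsFactors = FALSE
)

# Use whichever key columns are present in this project's data.
key_cols <- intersect(
  c("record_id", "redcap_event_name",
    "redcap_repeat_instrument", "redcap_repeat_instance"),
  names(d)
)

# TRUE for any row whose key repeats an earlier row.
is_duplicate_key <- duplicated(d[key_cols])
has_duplicates   <- any(is_duplicate_key)
```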

## Example: Uploading Records from a CSV

In this example, we have a CSV file named "data_to_import.csv" containing new records to upload.

In [36]:
df_to_import <- read.csv("../files/data_to_import.csv")
head(df_to_import)

,record_id,redcap_event_name,redcap_repeat_instrument,redcap_repeat_instance,redcap_survey_identifier,demographics_timestamp,first_name,last_name,phone_num,zip_code,...,cc_phone,cc_email,close_contacts_complete,supervisor_name,supervisor_email,work_inperson_yesno,work_date,work_contagious,work_contagious_calc,work_information_complete
,<int>,<chr>,<chr>,<int>,<lgl>,<lgl>,<chr>,<chr>,<chr>,<int>,...,<chr>,<chr>,<int>,<chr>,<chr>,<int>,<chr>,<int>,<lgl>,<int>
1,3,personal_info_arm_1,,,,,John,Doe,(999) 999-9999,98105.0,...,,,,,,,,,,
2,3,notifications_arm_1,,,,,,,,,...,,,,Boss,,0.0,,0.0,,2.0
3,3,case_intake_arm_1,,1.0,,,,,,,...,,,,,,,,,,
4,3,notifications_arm_1,close_contacts,1.0,,,,,,,...,(999) 999-9999,fake_email@gmail.com,2.0,,,,,,,
5,3,notifications_arm_1,close_contacts,2.0,,,,,,,...,(999) 999-9999,fake_email@gmail.com,2.0,,,,,,,
6,4,personal_info_arm_1,,,,,Jane,Doe,(999) 999-9999,98105.0,...,,,,,,,,,,


Because this project is longitudinal with repeat instruments and events, there are multiple rows per record.

In [37]:
# view which record_id's are currently being used in the data set to import. 
unique(df_to_import$record_id)

In the data frame we will import, the record IDs are 3-6. However, these IDs already exist in the REDCap project, and importing this data would overwrite the existing records 3-6. To import these as new records, we need to renumber the record IDs. 

In [38]:
# start by getting the next available record_id
next_record <- redcap_next_free_record_name(redcap_uri=url, token=token)

The next free record name in REDCap was successfully determined in 0.4 seconds.  The http status code was 200.  Is is 10.



In [39]:
# Sort by record_id, then number the records sequentially starting at 1
df_to_import <- df_to_import[order(df_to_import$record_id), , drop = FALSE]
df_to_import$seq <- as.numeric(factor(df_to_import$record_id))

In [40]:
# Count the rows per record to confirm the new sequence numbers
df_to_import %>% group_by(record_id, seq) %>% summarize(n = n(), .groups = "drop")


record_id,seq,n
<int>,<dbl>,<int>
3,1,5
4,2,6
5,3,4
6,4,5


In [41]:
# Adjust record IDs to start at the next available record_id
df_to_import$record_id <- as.numeric(df_to_import$seq) + (as.numeric(next_record)-1)
unique(df_to_import$record_id)

The record IDs have been changed to new IDs that don't already exist in the REDCap project.
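
The renumbering arithmetic can be checked on a small example. The sketch below uses made-up IDs and assumes the next free record name is 10, held as a character value (as the `as.numeric()` call above suggests). `match()` against the sorted unique IDs gives the same 1-based sequence as the `factor()` approach.

```r
old_ids     <- c(3, 3, 4, 5, 5, 6)   # several rows per record
next_record <- "10"                  # assumed next free record name

# Position of each ID among the sorted unique IDs: 3 -> 1, 4 -> 2, ...
seq_id <- match(old_ids, sort(unique(old_ids)))

# Shift so that the first record lands on the next free ID.
new_ids <- seq_id + (as.numeric(next_record) - 1)
```

This only works when record names are purely numeric. A DAG-prefixed name such as 1732-4 would become `NA` under `as.numeric()` and would need the prefix handled separately.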

In [42]:
# Remove the seq var that was created above
df_to_import <- df_to_import %>% select(-seq)

Date fields in REDCap are text fields with a date validation applied. Many different date validations/formats can be chosen for a date field. Regardless of the format designated for a field in the REDCap project, all date fields must be imported to REDCap in the Y-M-D format (e.g., 2023-10-10). Below is an example of how to use the project metadata to isolate and format all date fields before importing data. 

In [29]:
# Export metadata
metadata <- redcap_metadata_read(redcap_uri = url, token = token)$data

The data dictionary describing 30 fields was read from REDCap in 2.2 seconds.  The http status code was 200.



In [30]:
head(metadata)

field_name,form_name,section_header,field_type,field_label,select_choices_or_calculations,field_note,text_validation_type_or_show_slider_number,text_validation_min,text_validation_max,identifier,branching_logic,required_field,custom_alignment,question_number,matrix_group_name,matrix_ranking,field_annotation
<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
record_id,demographics,,text,Study ID,,,,,,,,,,,,,
first_name,demographics,Personal Information,text,First Name,,,,,,,,,,,,,
last_name,demographics,,text,Last Name,,,,,,,,,,,,,
phone_num,demographics,,text,Phone Number,,,phone,,,,,,,,,,
zip_code,demographics,,text,ZIP Code,,,integer,10001.0,99999.0,y,,,,,,,
dob,demographics,,text,Date of birth,,,date_mdy,,,y,,,,,,,


Note that the 'text_validation_type_or_show_slider_number' field in the metadata is where the date format is specified. 

In [31]:
metadata$text_validation_type_or_show_slider_number

In [32]:
# Isolate all field_names in the metadata that have any date validation 
date_fields <- metadata %>% filter(grepl("date", text_validation_type_or_show_slider_number)) %>% select(field_name)

In [33]:
# Make a list of all the date fields
date_list <- (date_fields$field_name)
date_list

In [44]:
# mutate across all date fields to get the desired Y-M-D format.  
df_to_import2 <- df_to_import %>%
  mutate(across(all_of(date_list), ~as.Date(., "%m/%d/%Y" )))

In [45]:
print(df_to_import2$test_positive_date)

 [1] NA           NA           "2023-10-10" NA           NA          
 [6] NA           NA           "2023-10-12" "2021-06-07" NA          
[11] NA           NA           NA           NA           NA          
[16] NA           NA           "2023-10-03" NA           NA          
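
The same steps can be tried on a small, self-contained example. The sketch below uses base R only, with a made-up data dictionary and data frame; it assumes the incoming dates are text in month/day/year order.

```r
# Made-up stand-in for the exported data dictionary.
metadata <- data.frame(
  field_name = c("record_id", "dob", "test_positive_date"),
  text_validation_type_or_show_slider_number = c(NA, "date_mdy", "date_ymd"),
  stringsAsFactors = FALSE
)

# Made-up data to import; the empty string represents a missing date.
d <- data.frame(
  record_id          = c(1, 2),
  dob                = c("01/31/1990", ""),
  test_positive_date = c("10/10/2023", "10/12/2023"),
  stringsAsFactors = FALSE
)

# Fields whose validation type mentions "date" (NA validations are not matched).
date_fields <- metadata$field_name[
  grepl("date", metadata$text_validation_type_or_show_slider_number)
]

# Parse month/day/year text, then write it back as YYYY-MM-DD text.
d[date_fields] <- lapply(d[date_fields], function(x) {
  format(as.Date(x, format = "%m/%d/%Y"), "%Y-%m-%d")
})
```

Both fields are converted the same way even though their validation types differ (`date_mdy` and `date_ymd`), because the validation type controls how REDCap displays the date, not how it must be imported. Empty strings become `NA`.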


In [46]:
# Import the new records
redcap_write(df_to_import2, redcap_uri=url, token=token)

Starting to update 20 records to be written at 2024-03-20 10:19:47.

Writing batch 1 of 1, with indices 1 through 20.

4 records were written to REDCap in 2.1 seconds.

