---
title: Importing to REDCap
format: html
knitr:
  opts_chunk: 
    collapse: true
    comment: "#>" 
    R.options:
      knitr.graphics.auto_pdf: true
---

## Project Setup

Install and load the necessary packages and set up the REDCap project connection.

In [2]:
#| echo: false
#| output: false
import os
os.environ['R_HOME'] = f"C:/Users/{os.environ.get('USERNAME')}/Miniconda3/envs/r_python_jl/Lib/R"

In [3]:
#| echo: false
#| output: false
# enables the %%R magic, not necessary if you've already done this
%load_ext rpy2.ipython
# only have to run once to allow the R magic command



::: {.panel-tabset}

#### R

In [17]:
%%capture --no-display 
%%R
library("dplyr")
library("jsonlite")
library("tidyr")
library("REDCapR")
library("knitr")
library("remotes")
library("gt")

In [72]:
%%capture --no-display --no-stdout
%%R

# If the installed REDCapR version is not the expected one,
# detach the package and install the latest version from GitHub
if (packageVersion("REDCapR") != "1.1.9005") {
    detach("package:REDCapR", unload=TRUE)
    remotes::install_github("OuhscBbmc/REDCapR")
    library("REDCapR")
    print(packageVersion("REDCapR"))
} else {
    print("REDCapR package up to date")
}

[1] '1.1.9005'


In [75]:
%%R
# Load API tokens from the json file
token <- jsonlite::fromJSON('./../../json_api_data.json')$dev_token$'309'
url <- "https://dev-redcap.doh.wa.gov/api/"

#### Python

In [6]:
import redcap
import json
import csv
import pandas as pd
import numpy as np
import requests
import tempfile

In [45]:
# Load API tokens from the json file
with open('./../../json_api_data.json') as f:
    key = json.load(f)
token = key['dev_token']['309']
url = key['dev_url']

project = redcap.Project(url, token)

:::

## Records

::: {.panel-tabset}

#### R


`redcap_write_oneshot()` and `redcap_write()`

Records can be imported into a REDCap project from a dataframe in R using `redcap_write_oneshot()`, which writes all records at once, or `redcap_write()`, which can batch the records so the server is not overwhelmed by large imports. Both functions accept either an R dataframe or a tibble containing the data to be imported.

If a record_id being imported already exists in the REDCap project, the imported data will overwrite the existing data for that record. Using a record_id that does not already exist will create a new record. The `redcap_next_free_record_name()` function returns the next unused record_id. If Data Access Groups (DAGs) are used in the project, this function accounts for the special formatting of the record name for users in DAGs, where the unique auto-assigned DAG number is a prefix to the actual record_id (e.g., DAG-ID; for a user assigned to a DAG numbered 1732 with 3 existing records, it will return 1732-4).

In [12]:
%%capture --no-display 
%%R
# Define data to import
df1 <- data.frame(record_id = c(7,8),
                  first_name = c("John","Jane"),
                  last_name = c("Doe","Doe")
                  )

In [20]:
%%capture --no-stdout 
%%R
redcap_write_oneshot(df1, redcap_uri=url, token=token)

$success
[1] TRUE

$status_code
[1] 200

$outcome_message
[1] "2 records were written to REDCap in 0.2 seconds."

$records_affected_count
[1] 2

$affected_ids
[1] "7" "8"

$elapsed_seconds
[1] 0.2243779

$raw_text
[1] ""



In [73]:
%%R
df2 <- data.frame(record_id = 9,
                  first_name = "John",
                  last_name = "Doe"
                  )

In [74]:
%%capture --no-stdout 
%%R
redcap_write(df2, redcap_uri=url, token=token)
#optional argument: batch_size = 100 (default)

$success
[1] TRUE

$status_code
[1] "200"

$outcome_message
[1] "1 records were written to REDCap in 0.7 seconds."

$records_affected_count
[1] 1

$affected_ids
[1] "9"

$elapsed_seconds
[1] 1.260039



In [77]:
%%capture --no-stdout 
%%R
redcap_next_free_record_name(redcap_uri=url, token=token)

[1] "14"


#### Python

`import_records()`

Data can be imported as a pandas dataframe, json, csv, or xml, specified by the `import_format` argument (default is json).

If a record_id being imported already exists in the REDCap project, the imported data will overwrite the existing data for that record. Using a record_id that does not already exist will create a new record. Setting `force_auto_number=True` makes REDCap assign new record_ids to the imported records during import, so existing records are not overwritten. If it is set to `False` and the record_ids to import already exist in REDCap, they will overwrite the existing REDCap records during import.

The `overwrite` argument is set to `'normal'` by default. Under this setting, blank values in the import do not replace data that already exists on a REDCap record. If you want to overwrite existing data with blanks, be sure to use `overwrite='overwrite'`.
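The difference between the two `overwrite` settings can be modelled locally. The sketch below is a plain-Python illustration of the behavior described above, not REDCap's or PyCap's implementation; `merge_record` and its arguments are hypothetical names introduced only for this example.

```python
def merge_record(existing, incoming, overwrite="normal"):
    """Model how one imported row is merged into an existing record.

    Illustrative only: mimics the documented handling of blank values
    under overwrite='normal' versus overwrite='overwrite'.
    """
    if overwrite not in ("normal", "overwrite"):
        raise ValueError("overwrite must be 'normal' or 'overwrite'")
    merged = dict(existing)  # copy so the stored record is not mutated
    for field, value in incoming.items():
        is_blank = value is None or value == ""
        if is_blank and overwrite == "normal":
            # Blank incoming values leave the stored value untouched.
            continue
        merged[field] = "" if is_blank else value
    return merged


existing = {"record_id": "7", "first_name": "John", "last_name": "Doe"}
incoming = {"record_id": "7", "first_name": "Jon", "last_name": ""}

print(merge_record(existing, incoming))
print(merge_record(existing, incoming, overwrite="overwrite"))
```

In the first call the blank `last_name` is ignored and "Doe" is kept; in the second call it is replaced by a blank.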

In [25]:
df_py = [{'record_id': 7,
  'redcap_event_name': 'personal_info_arm_1',
  'redcap_repeat_instrument': '',
  'redcap_repeat_instance': None,
  'first_name': 'John',
  'last_name': 'Doe'},
 {'record_id': 8,
  'redcap_event_name': 'personal_info_arm_1',
  'redcap_repeat_instrument': '',
  'redcap_repeat_instance': None,
  'first_name': 'Jane',
  'last_name': 'Doe'}]

project.import_records(df_py, force_auto_number=True)

{'count': 2}
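The example above sends all rows in a single call. For large imports, the batching idea behind `redcap_write()` in the R tab can be approximated in Python. This is a minimal sketch, assuming the import function is simply called once per chunk; `batched`, `import_in_batches`, and `fake_import` are hypothetical helpers, and `fake_import` stands in for `project.import_records` so the code runs without a server.

```python
def batched(records, batch_size=100):
    """Yield consecutive slices of `records` with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def import_in_batches(import_func, records, batch_size=100):
    """Call `import_func` once per batch and collect the responses."""
    return [import_func(batch) for batch in batched(records, batch_size)]


def fake_import(batch):
    """Stand-in for project.import_records (assumption: no server available)."""
    return {"count": len(batch)}


rows = [{"record_id": i, "first_name": "Test"} for i in range(1, 251)]
responses = import_in_batches(fake_import, rows, batch_size=100)
print(responses)
```

With a real project, the same helper could be called as `import_in_batches(project.import_records, df_py)`.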

:::

## Files - optional attachments to individual records

File uploads are a unique field type in REDCap that accept a variety of file types, including images and other documents. Unlike other import methods, importing files only works for one file field for one record at a time. 

If the project is longitudinal (i.e., it has events), the event name must be specified. If the file field is in a repeating instance, the instance number must also be specified.
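To make the role of the event and instance arguments concrete, the sketch below assembles the form fields that a file import request to the REDCap API would carry. It only builds a dictionary and sends nothing; `build_file_import_payload` is a hypothetical helper, and the token shown is a placeholder.

```python
def build_file_import_payload(token, record, field, event=None, repeat_instance=None):
    """Assemble the form fields for a REDCap 'import file' API request."""
    payload = {
        "token": token,
        "content": "file",
        "action": "import",
        "record": str(record),
        "field": field,
        "returnFormat": "json",
    }
    # Longitudinal projects need the unique event name.
    if event is not None:
        payload["event"] = event
    # Repeating instruments/events need the instance number.
    if repeat_instance is not None:
        payload["repeat_instance"] = str(repeat_instance)
    return payload


payload = build_file_import_payload(
    token="YOUR_API_TOKEN",  # placeholder, not a real token
    record=7,
    field="test_upload",
    event="case_intake_arm_1",
)
print(payload)
```

The file itself travels as a separate multipart part, which is what the R and Python wrappers below take care of.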

::: {.panel-tabset}

#### R

`redcap_file_upload_oneshot()`

In [71]:
%%capture --no-stdout 
%%R
redcap_file_upload_oneshot(file_name='./files/test_file.png', record=7, field='test_upload', event='case_intake_arm_1', redcap_uri=url, token=token)

$success
[1] TRUE

$status_code
[1] 200

$outcome_message
[1] "file uploaded to REDCap in 1.0 seconds."

$records_affected_count
[1] 1

$affected_ids
[1] "7"

$elapsed_seconds
[1] 1.024144

$raw_text
[1] ""



#### Python

`import_file()`

In [28]:
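# Open the file to upload in binary mode and pass the handle to import_file()
with open("./files/test_file.png", "rb") as file_object:
    response = project.import_file(record="7",
                                   field="test_upload",
                                   file_name="test_file.png",
                                   file_object=file_object,
                                   event="case_intake_arm_1")
response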

[{}]

:::

## Metadata

Metadata refers to the project's setup characteristics, including field attributes grouped by instrument assignment. Metadata can be thought of as the project's data dictionary.  

In this example, we will export the project metadata and re-import it so that no changes are made to the project.

::: {.panel-tabset}

#### R

`redcap_metadata_write()`

In [78]:
%%R
metadata <- redcap_metadata_read(redcap_uri=url, token=token)$data

R[write to console]: The data dictionary describing 30 fields was read from REDCap in 0.4 seconds.  The http status code was 200.



::: {.content-hidden when-format="html"}

In [33]:
%%R
tbl<- gt(head(metadata))
gt::gtsave(tbl, filename = 'import_metadata.html', path = "./files/")

:::


<iframe width="100%" height="500" src="./files/import_metadata.html" title="Quarto Documentation"></iframe>

In [79]:
%%capture --no-stdout 
%%R
redcap_metadata_write(metadata, redcap_uri=url, token=token)

$success
[1] TRUE

$status_code
[1] 200

$outcome_message
[1] "30 fields were written to the REDCap dictionary in 0.8 seconds."

$field_count
[1] 30

$elapsed_seconds
[1] 0.7923529

$raw_text
[1] ""



#### Python

`import_metadata()`

In [29]:
metadata = project.metadata
metadata[0]

{'field_name': 'record_id',
 'form_name': 'demographics',
 'section_header': '',
 'field_type': 'text',
 'field_label': 'Study ID',
 'select_choices_or_calculations': '',
 'field_note': '',
 'text_validation_type_or_show_slider_number': '',
 'text_validation_min': '',
 'text_validation_max': '',
 'identifier': '',
 'branching_logic': '',
 'required_field': '',
 'custom_alignment': '',
 'question_number': '',
 'matrix_group_name': '',
 'matrix_ranking': '',
 'field_annotation': ''}

In [30]:
project.import_metadata(to_import=metadata)

30

:::

## Instrument Event Map

::: {.panel-tabset}

#### R

Importing this would be covered by the `redcap_metadata_write()` function. *See <a href="#Limitations-to-Importing">Limitations to Importing</a> for more.*

#### Python

`import_instrument_event_mappings()`

In this example, we will export the project's instrument-event mapping and re-import it so that no changes are made to the project.

In [36]:
instrument_event_mappings = project.export_instrument_event_mappings(format_type='df')
instrument_event_mappings

Unnamed: 0,arm_num,unique_event_name,form
0,1,personal_info_arm_1,demographics
1,1,case_intake_arm_1,symptoms
2,1,case_intake_arm_1,test_information
3,1,notifications_arm_1,close_contacts
4,1,notifications_arm_1,work_information


In [39]:
project.import_instrument_event_mappings(instrument_event_mappings, import_format='df')

5

:::

## Users

::: {.panel-tabset}

#### R

Cannot be imported with REDCapR. Can be imported with the native API or uploaded via CSV in REDCap under the `User Rights` application. *See <a href="#Limitations-to-Importing">Limitations to Importing</a> for more.*

#### Python

`import_users()`

In this example, we will export the project's users and re-import them so that no changes are made to the project. (Note: attempting to import a user already assigned to a user role will result in an error.)
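One way to avoid the error mentioned in the note is to drop users who already hold a role before re-importing. The sketch below assumes both exports are lists of dictionaries (the JSON format) in which an unassigned user has a blank `unique_role_name`; `users_without_roles` and the sample data are hypothetical.

```python
def users_without_roles(users, role_assignments):
    """Return the users that are not assigned to any user role."""
    assigned = {
        row["username"]
        for row in role_assignments
        if row.get("unique_role_name") not in (None, "")
    }
    return [user for user in users if user["username"] not in assigned]


# Hypothetical data shaped like the JSON exports.
users = [
    {"username": "user.one@example.org", "design": 1},
    {"username": "user.two@example.org", "design": 0},
]
role_assignments = [
    {"username": "user.one@example.org", "unique_role_name": "U-EXAMPLE123"},
    {"username": "user.two@example.org", "unique_role_name": ""},
]

importable = users_without_roles(users, role_assignments)
print(importable)
```

Only `user.two@example.org` remains, since the other user already has a role.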

In [50]:
users = project.export_users(format_type='df')
users

Unnamed: 0,username,email,firstname,lastname,expiration,data_access_group,data_access_group_id,design,alerts,user_rights,...,mobile_app,mobile_app_download_data,record_create,record_rename,record_delete,lock_records_all_forms,lock_records,lock_records_customization,forms,forms_export
0,alexey.gilman@doh.wa.gov,Alexey.Gilman@doh.wa.gov,Alexey,Gilman,,,,1,1,1,...,1,0,1,1,1,0,0,1,"demographics:3,symptoms:1,test_information:1,c...","demographics:1,symptoms:1,test_information:1,c..."
1,caitlin.drover@doh.wa.gov,Caitlin.Drover@doh.wa.gov,Caitlin,Drover,,,,1,1,1,...,1,0,1,1,1,0,0,1,"demographics:3,symptoms:1,test_information:1,c...","demographics:1,symptoms:1,test_information:1,c..."
2,emily.pearman@doh.wa.gov,emily.pearman@doh.wa.gov,Emily,Pearman,,,,1,1,1,...,1,1,1,1,1,0,0,0,"demographics:1,symptoms:1,test_information:1,c...","demographics:1,symptoms:1,test_information:1,c..."


In [51]:
project.import_users(users, import_format='df')

3

:::

## User Roles

::: {.panel-tabset}

#### R

Cannot be imported with REDCapR. Can be imported with the native API or uploaded via CSV in REDCap under the `User Rights` application. *See <a href="#Limitations-to-Importing">Limitations to Importing</a> for more.*

#### Python

Roles can be imported using `import_user_roles()` and assigned using `import_user_role_assignment()`. 

In [52]:
user_roles = project.export_user_roles(format_type='df')
user_roles

Unnamed: 0,unique_role_name,role_label,design,alerts,user_rights,data_access_groups,reports,stats_and_charts,manage_survey_participants,calendar,...,mobile_app,mobile_app_download_data,record_create,record_rename,record_delete,lock_records_customization,lock_records,lock_records_all_forms,forms,forms_export
0,U-131Y8RXN3P,Test Role,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,"demographics:0,symptoms:0,test_information:0,c...","demographics:0,symptoms:0,test_information:0,c..."
1,U-1564393FT9,Advanced Role,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,"demographics:0,symptoms:3,test_information:0,c...","demographics:0,symptoms:3,test_information:0,c..."
2,U-5354FA3HYL,Admin,1,1,1,1,1,1,1,1,...,1,0,1,1,1,1,0,0,"demographics:3,symptoms:1,test_information:1,c...","demographics:3,symptoms:1,test_information:1,c..."


Note: the `unique_role_name` is automatically generated by REDCap.

In [55]:
project.import_user_roles(user_roles, import_format='df')

3

In [56]:
user_role_assign = project.export_user_role_assignment(format_type='df')
user_role_assign

Unnamed: 0,username,unique_role_name,data_access_group
0,alexey.gilman@doh.wa.gov,,
1,caitlin.drover@doh.wa.gov,,
2,emily.pearman@doh.wa.gov,,


In [59]:
# np.nan is used because the np.NaN alias was removed in NumPy 2.0
user_role_assign['unique_role_name'] = user_role_assign['unique_role_name'].astype('str').replace('nan', np.nan)
user_role_assign.loc[0,'unique_role_name'] = 'U-5354FA3HYL'
user_role_assign

Unnamed: 0,username,unique_role_name,data_access_group
0,alexey.gilman@doh.wa.gov,U-5354FA3HYL,
1,caitlin.drover@doh.wa.gov,,
2,emily.pearman@doh.wa.gov,,


In [60]:
project.import_user_role_assignment(user_role_assign, import_format='df')

3

:::

## DAGs

::: {.panel-tabset}

#### R

Cannot be imported with REDCapR. Can be imported with the native API or uploaded via CSV in REDCap under the `DAGs` application. *See <a href="#Limitations-to-Importing">Limitations to Importing</a> for more.*

#### Python

DAGs can be imported using `import_dags()` and assigned using `import_user_dag_assignment()`. If the API user is assigned to multiple DAGs, `switch_dag()` switches between them.

In [65]:
dags = project.export_dags(format_type='df')
dags

Unnamed: 0,data_access_group_name,unique_group_name,data_access_group_id
0,Full Access,full_access,2708
1,Limited Access,limited_access,2709


In [67]:
new_dag = [{"data_access_group_name": "Test DAG", "unique_group_name": ""}]
project.import_dags(new_dag)

1

Note: the `unique_group_name` field must be left blank as this is auto-generated by REDCap from the `data_access_group_name`.

In [69]:
dag_mapping = [{"username": 'alexey.gilman@doh.wa.gov', "redcap_data_access_group": "full_access"}]
project.import_user_dag_assignment(dag_mapping)

1

Note: the `redcap_data_access_group` name when importing is the same as `unique_group_name` when exporting DAGs.
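Because the assignment key differs from the display name, it can help to look it up from an export instead of typing it by hand. This is a minimal sketch, assuming the DAG export is a list of dictionaries with the columns shown above; `build_dag_assignment` and the example username are hypothetical.

```python
def build_dag_assignment(username, dag_label, exported_dags):
    """Map a DAG's display name to the key expected when assigning users."""
    lookup = {
        dag["data_access_group_name"]: dag["unique_group_name"]
        for dag in exported_dags
    }
    if dag_label not in lookup:
        raise KeyError(f"No DAG named {dag_label!r} in the export")
    return {"username": username, "redcap_data_access_group": lookup[dag_label]}


# Shaped like the export shown above.
exported_dags = [
    {"data_access_group_name": "Full Access",
     "unique_group_name": "full_access",
     "data_access_group_id": 2708},
    {"data_access_group_name": "Limited Access",
     "unique_group_name": "limited_access",
     "data_access_group_id": 2709},
]

dag_mapping = [build_dag_assignment("user.one@example.org", "Limited Access", exported_dags)]
print(dag_mapping)
```

The resulting list has the same shape as the `dag_mapping` passed to `import_user_dag_assignment()` above.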

:::

# Data Validations

::: {.panel-tabset}

#### R
REDCapR has a few data validation functions that can be used to check your data before importing it to your REDCap project. These validations are not specific to your particular REDCap project; they are general validations that apply to all REDCap projects.  

For example, you can check if you have any boolean values (True/False) since REDCap will only accept a raw data import of 0/1 integers. You can also check for duplicates and unique IDs. You can view more details on these data validation functions [here](https://ouhscbbmc.github.io/REDCapR/reference/validate.html).
:::
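Similar pre-import checks can be sketched in plain Python. The function below is not part of REDCapR or PyCap; `find_import_problems` is a hypothetical helper that flags boolean values and rows that repeat the same key. For longitudinal or repeating data, the key would likely need the event and instance columns as well.

```python
def find_import_problems(rows, key_fields=("record_id",)):
    """Return a list of human-readable problems found in `rows`."""
    problems = []
    seen_keys = set()
    for index, row in enumerate(rows):
        # REDCap expects 0/1 rather than True/False.
        for field, value in row.items():
            if isinstance(value, bool):
                problems.append(f"row {index}: field '{field}' holds a boolean")
        # Rows sharing the same key would overwrite each other on import.
        key = tuple(row.get(field) for field in key_fields)
        if key in seen_keys:
            problems.append(f"row {index}: duplicate key {key}")
        seen_keys.add(key)
    return problems


rows = [
    {"record_id": 7, "symptoms_yesno": True},
    {"record_id": 8, "symptoms_yesno": 0},
    {"record_id": 7, "symptoms_yesno": 1},
]
problems = find_import_problems(rows)
for problem in problems:
    print(problem)
```

An empty list means none of these particular checks failed; it does not guarantee that REDCap will accept the import.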

# Appendix

## Limitations to Importing 

::: {.panel-tabset}

#### R
Field Names <br>
- Importing this would be covered by the `redcap_metadata_write()` function. <br>
- Can be exported using `redcap_variables()`. <br>

Forms/Instruments <br>
- Importing this would be covered by the `redcap_metadata_write()` function. <br>
- Can be exported using `redcap_instruments()` or downloaded using `redcap_instrument_download()`. <br>

Instrument Event Map <br>
- Importing this would be covered by the `redcap_metadata_write()` function. <br> 
- Can be exported using `redcap_event_instruments()`. <br>

Reports <br>
- Cannot be imported. <br>
- Can be exported using `redcap_report()`. <br>

Users <br>
- Cannot be imported with REDCapR. Can be imported with the native API or uploaded via CSV in REDCap under the `User Rights` application. <br> 
- Can be exported using `redcap_users_export()`. <br>

User Roles <br>
- Cannot be imported with REDCapR. Can be imported with the native API or uploaded via CSV in REDCap under the `User Rights` application. <br> 
- Can be exported using `redcap_users_export()`. <br>

Data Access Groups (DAGs) <br>
- Cannot be imported with REDCapR. Can be imported with the native API or uploaded via CSV in REDCap under the `DAGs` application. <br> 
- Can be exported using `redcap_dag_read()`. <br>

Logging <br>
- Cannot be imported. <br> 
- Can be exported using `redcap_log_read()`. <br>


#### Python
Field Names <br>
- Importing this would be covered by the `import_metadata()` function. <br>
- Can be exported using `export_field_names()`. <br>

Forms/Instruments <br>
- Importing non-repeating instruments would be covered by the `import_metadata()` function. Repeating instrument/event settings can be imported using the `import_repeating_instruments_events()` function. <br>
- Can be exported using `export_instruments()` and `export_repeating_instruments_events()` for the settings. <br>

Reports <br>
- Cannot be imported. <br>
- Can be exported using `export_report()`. <br>

Logging <br>
- Cannot be imported. <br> 
- Can be exported using `export_logging()`. <br>

:::

## Example: Uploading Records from a CSV

In this example, we have a csv named "data_to_import.csv", with records to upload.

::: {.panel-tabset}

#### R

In [36]:
%%R
df_to_import <- read.csv("../files/data_to_import.csv")
head(df_to_import)

Unnamed: 0_level_0,record_id,redcap_event_name,redcap_repeat_instrument,redcap_repeat_instance,redcap_survey_identifier,demographics_timestamp,first_name,last_name,phone_num,zip_code,...,cc_phone,cc_email,close_contacts_complete,supervisor_name,supervisor_email,work_inperson_yesno,work_date,work_contagious,work_contagious_calc,work_information_complete
Unnamed: 0_level_1,<int>,<chr>,<chr>,<int>,<lgl>,<lgl>,<chr>,<chr>,<chr>,<int>,...,<chr>,<chr>,<int>,<chr>,<chr>,<int>,<chr>,<int>,<lgl>,<int>
1,3,personal_info_arm_1,,,,,John,Doe,(999) 999-9999,98105.0,...,,,,,,,,,,
2,3,notifications_arm_1,,,,,,,,,...,,,,Boss,,0.0,,0.0,,2.0
3,3,case_intake_arm_1,,1.0,,,,,,,...,,,,,,,,,,
4,3,notifications_arm_1,close_contacts,1.0,,,,,,,...,(999) 999-9999,fake_email@gmail.com,2.0,,,,,,,
5,3,notifications_arm_1,close_contacts,2.0,,,,,,,...,(999) 999-9999,fake_email@gmail.com,2.0,,,,,,,
6,4,personal_info_arm_1,,,,,Jane,Doe,(999) 999-9999,98105.0,...,,,,,,,,,,


Because this project is longitudinal with repeat instruments and events, there are multiple rows per record.

In [37]:
%%R
# view which record_id's are currently being used in the data set to import. 
unique(df_to_import$record_id)

In the dataframe we will import, the record IDs are 3-6. However, these IDs already exist in the REDCap project, and importing this data would overwrite the existing records 3-6. If we want to import these as new records, we will need to renumber the record IDs. 

In [38]:
%%R
# start by getting the next available record_id
next_record <- redcap_next_free_record_name(redcap_uri=url, token=token)

The next free record name in REDCap was successfully determined in 0.4 seconds.  The http status code was 200.  Is is 10.



In [39]:
%%R
### sequence the df_to_import records starting at one
df_to_import <- df_to_import[order(df_to_import$record_id), , drop = FALSE]
df_to_import$seq <- as.numeric(factor(df_to_import$record_id))

In [40]:
%%R
df_to_import %>% group_by(record_id, seq) %>% summarize(n=n())

[1m[22m`summarise()` has grouped output by 'record_id'. You can override using the
`.groups` argument.


record_id,seq,n
<int>,<dbl>,<int>
3,1,5
4,2,6
5,3,4
6,4,5


In [41]:
%%R
# Adjust record IDs to start at the next available record_id
df_to_import$record_id <- as.numeric(df_to_import$seq) + (as.numeric(next_record)-1)
unique(df_to_import$record_id)

The record IDs have been changed to new IDs that don't already exist in the REDCap project.

In [42]:
%%R
# Remove the seq var that was created above
df_to_import <- df_to_import %>% select(-seq)

Date fields in REDCap are character fields with a designated date validation added. There are many different date validations/formats that can be chosen for a date field. All date fields must be imported to REDCap in the Y-M-D format, regardless of the specific date format designated for the field in the REDCap project. Below is an example of how to use the project metadata to isolate and format all date fields before importing data. 

In [29]:
%%R
# Export metadata
metadata <- redcap_metadata_read(redcap_uri = url, token = token)$data

The data dictionary describing 30 fields was read from REDCap in 2.2 seconds.  The http status code was 200.



In [30]:
%%R
head(metadata)

field_name,form_name,section_header,field_type,field_label,select_choices_or_calculations,field_note,text_validation_type_or_show_slider_number,text_validation_min,text_validation_max,identifier,branching_logic,required_field,custom_alignment,question_number,matrix_group_name,matrix_ranking,field_annotation
<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
record_id,demographics,,text,Study ID,,,,,,,,,,,,,
first_name,demographics,Personal Information,text,First Name,,,,,,,,,,,,,
last_name,demographics,,text,Last Name,,,,,,,,,,,,,
phone_num,demographics,,text,Phone Number,,,phone,,,,,,,,,,
zip_code,demographics,,text,ZIP Code,,,integer,10001.0,99999.0,y,,,,,,,
dob,demographics,,text,Date of birth,,,date_mdy,,,y,,,,,,,


Note that the 'text_validation_type_or_show_slider_number' field in the metadata is where the date format is specified. 

In [31]:
%%R
metadata$text_validation_type_or_show_slider_number

In [32]:
%%R
# Isolate all field_names in the metadata that have any date validation 
date_fields <- metadata %>% filter(grepl("date", text_validation_type_or_show_slider_number)) %>% select(field_name)

In [33]:
%%R
# Make a list of all the date fields
date_list <- (date_fields$field_name)
date_list

In [44]:
%%R
# mutate across all date fields to get the desired Y-M-D format.  
df_to_import2 <- df_to_import %>%
  mutate(across(all_of(date_list), ~as.Date(., "%m/%d/%Y" )))

In [45]:
%%R
print(df_to_import2$test_positive_date)

 [1] NA           NA           "2023-10-10" NA           NA          
 [6] NA           NA           "2023-10-12" "2021-06-07" NA          
[11] NA           NA           NA           NA           NA          
[16] NA           NA           "2023-10-03" NA           NA          


In [46]:
%%R
# Import the new records
redcap_write(df_to_import2, redcap_uri=url, token=token)

Starting to update 20 records to be written at 2024-03-20 10:19:47.

Writing batch 1 of 1, with indices 1 through 20.

4 records were written to REDCap in 2.1 seconds.



#### Python

When reading a CSV into a pandas dataframe, pandas converts any integer column with missing data to [float, with NaN inserted](https://stackoverflow.com/questions/39666308/pd-read-csv-by-default-treats-integers-like-floats) in the blank cells. In longitudinal projects, we expect many blank cells: the data is wide, and in each row only the columns relevant to that event/instrument are filled out. Many of REDCap's field types (checkbox, yes/no, radio, and form_complete variables) are integers. Pandas will read these columns as floats with a decimal place added (e.g., 1.0 instead of 1 for 'Yes' in a yes/no field), and importing such values into these integer field types in REDCap will fail.

In [31]:
# Read and view data to import
df_to_import = pd.read_csv("./files/data_to_import.csv")
df_to_import.head()

Unnamed: 0,record_id,redcap_event_name,redcap_repeat_instrument,redcap_repeat_instance,redcap_survey_identifier,demographics_timestamp,first_name,last_name,phone_num,zip_code,...,cc_phone,cc_email,close_contacts_complete,supervisor_name,supervisor_email,work_inperson_yesno,work_date,work_contagious,work_contagious_calc,work_information_complete
0,3,personal_info_arm_1,,,,,John,Doe,(999) 999-9999,98105.0,...,,,,,,,,,,
1,3,notifications_arm_1,,,,,,,,,...,,,,Boss,,0.0,,0.0,,2.0
2,3,case_intake_arm_1,,1.0,,,,,,,...,,,,,,,,,,
3,3,notifications_arm_1,close_contacts,1.0,,,,,,,...,(999) 999-9999,fake_email@gmail.com,2.0,,,,,,,
4,3,notifications_arm_1,close_contacts,2.0,,,,,,,...,(999) 999-9999,fake_email@gmail.com,2.0,,,,,,,


Notice how the redcap_repeat_instance, close_contacts_complete, and work_inperson_yesno are some of the many fields that were converted to float with an added decimal. Importing this dataset as-is will produce errors.

**Solution:** Convert all float columns to the pandas `Int64` datatype.

- `Int64` is a nullable integer datatype in pandas that allows integer columns to contain missing values. For more information, read the documentation [here](https://pandas.pydata.org/docs/reference/api/pandas.Int64Dtype.html).

- Note: Before applying this solution, ensure that there are no numeric fields in your REDCap project that should have decimals (you will not want to convert these variables to `Int64`, since `Int64` cannot hold decimal places). Make sure you are familiar with your project's metadata. All radio, checkbox, yes/no, redcap_repeat_instance, and form_complete variables need to be integers. In REDCap, numeric fields are stored as text fields with optional validation. Any text field in REDCap with no validation, or with 'numeric' as its validation type, allows numbers with decimal places. Text fields with other validation types (e.g., zip code, phone number, integer) will not allow decimal places on import. 

In [32]:
float_list = df_to_import.select_dtypes(include=[np.float64]).columns.values.tolist()
print(float_list)

['redcap_repeat_instance', 'redcap_survey_identifier', 'demographics_timestamp', 'zip_code', 'age', 'ethnicity', 'race', 'gender', 'demographics_complete', 'symptoms_yesno', 'symptoms_exp___1', 'symptoms_exp___2', 'symptoms_exp___3', 'symptoms_exp___4', 'symptoms_exp___5', 'symptoms_exp___6', 'symptoms_exp___7', 'symptoms_exp___8', 'symptoms_exp___9', 'symptoms_exp___10', 'symptoms_exp___11', 'symptom_notes', 'symptoms_complete', 'test_yesno', 'test_positive_yesno', 'prior_covid_yesno', 'test_information_complete', 'close_contacts_complete', 'work_inperson_yesno', 'work_contagious', 'work_contagious_calc', 'work_information_complete']


At this point, if needed, you can remove any variables from this list that you need to keep in float format. 
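The project metadata can help decide which columns to leave as floats. The sketch below is one possible approach, assuming the metadata is a list of dictionaries as returned by `project.metadata`; `columns_safe_for_int64` is a hypothetical helper, and the validation labels treated as decimal-friendly (`""` and `"number"`) are assumptions that should be checked against your own data dictionary.

```python
def columns_safe_for_int64(float_columns, metadata, decimal_validations=("", "number")):
    """Return the float columns that can be cast to Int64.

    Text fields whose validation allows decimals are excluded so they
    stay as floats. Columns absent from the metadata (for example
    redcap_repeat_instance) are kept in the result.
    """
    decimal_fields = {
        field["field_name"]
        for field in metadata
        if field["field_type"] == "text"
        and field["text_validation_type_or_show_slider_number"] in decimal_validations
    }
    return [col for col in float_columns if col not in decimal_fields]


# Hypothetical excerpt of a data dictionary; weight_kg is an invented field.
metadata_example = [
    {"field_name": "zip_code", "field_type": "text",
     "text_validation_type_or_show_slider_number": "integer"},
    {"field_name": "weight_kg", "field_type": "text",
     "text_validation_type_or_show_slider_number": "number"},
    {"field_name": "symptoms_yesno", "field_type": "yesno",
     "text_validation_type_or_show_slider_number": ""},
]
float_columns = ["redcap_repeat_instance", "zip_code", "weight_kg", "symptoms_yesno"]

print(columns_safe_for_int64(float_columns, metadata_example))
```

The returned list could then replace `float_list` in the conversion step.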

In [33]:
df_to_import[float_list] = df_to_import[float_list].astype("Int64")
df_to_import.head()

Unnamed: 0,record_id,redcap_event_name,redcap_repeat_instrument,redcap_repeat_instance,redcap_survey_identifier,demographics_timestamp,first_name,last_name,phone_num,zip_code,...,cc_phone,cc_email,close_contacts_complete,supervisor_name,supervisor_email,work_inperson_yesno,work_date,work_contagious,work_contagious_calc,work_information_complete
0,3,personal_info_arm_1,,,,,John,Doe,(999) 999-9999,98105.0,...,,,,,,,,,,
1,3,notifications_arm_1,,,,,,,,,...,,,,Boss,,0.0,,0.0,,2.0
2,3,case_intake_arm_1,,1.0,,,,,,,...,,,,,,,,,,
3,3,notifications_arm_1,close_contacts,1.0,,,,,,,...,(999) 999-9999,fake_email@gmail.com,2.0,,,,,,,
4,3,notifications_arm_1,close_contacts,2.0,,,,,,,...,(999) 999-9999,fake_email@gmail.com,2.0,,,,,,,


You can now see that the `redcap_repeat_instance` and `close_contacts_complete` fields are in integer format. The `<NA>` shown in the blank cells will not interfere with data import. You can now make any further edits, including numeric transformations on your integer fields. 
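One edit that may be needed, as in the R tab, is renumbering the record IDs so that the import creates new records instead of overwriting existing ones. This is a minimal sketch in plain Python; `renumber_record_ids` is a hypothetical helper, and the next free ID of 10 is an example value (PyCap's `generate_next_record_name()` should be able to supply the real one).

```python
def renumber_record_ids(record_ids, next_free_id):
    """Map each distinct existing ID to a new ID starting at `next_free_id`."""
    distinct_ids = sorted(set(record_ids))
    return {old: next_free_id + offset for offset, old in enumerate(distinct_ids)}


# record_id column with several rows per record, as in a longitudinal export.
record_ids = [3, 3, 3, 4, 4, 5, 6, 6]
mapping = renumber_record_ids(record_ids, next_free_id=10)
new_ids = [mapping[old] for old in record_ids]

print(mapping)
print(new_ids)
```

With a dataframe, the mapping could be applied as `df_to_import["record_id"] = df_to_import["record_id"].map(mapping)`, which keeps all rows of a record on the same new ID.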

In [34]:
# Import data
project.import_records(df_to_import, date_format = 'MDY', import_format = 'df')

{'count': 4}

:::