OpenHumansDataTools

Tools to work with data downloaded from the Open Humans research platform.

Tool #1: Unzip, split based on size if needed, and convert JSON to CSV, for a full batch of data downloaded from OH.

Unzip-Zip-CSVify-OpenHumans-data.sh. Note that this tool was designed for use with the OpenAPS and Nightscout Data Commons, which pulls Nightscout data into Open Humans as JSON files. Other users will need to specify the data file types to use in the second "for" loop. (The first for loop is Nightscout-specific based on the data type and uses an alternative JSON-to-CSV conversion; see the tips for other installation requirements.)

See these tips for help, especially related to the first for loop if you will be using entries.json from Nightscout.

This script calls complex-json2csv and jsonsplit.sh. Both tools are in a package (see repo here) which can be installed by npm (see this).
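For orientation, here is a minimal sketch of the overall flow (unzip each participant's download, then split and convert each JSON file). It is not the actual script: the directory layout, the file types in the loop, and the assumption that complex-json2csv reads JSON on stdin are all illustrative, so check the real tools' usage before adapting it.

```bash
#!/bin/bash
# Illustrative sketch only, not Unzip-Zip-CSVify-OpenHumans-data.sh itself.
# Assumptions: each participant's Open Humans download is a .zip in the
# current directory, and complex-json2csv reads JSON on stdin and writes
# CSV to stdout.
for zipfile in *.zip; do
  participant="${zipfile%.zip}"
  echo "Starting participant $participant"
  unzip -o "$zipfile" -d "$participant"
  # The second "for" loop: adjust these file types to match your own data.
  for type in profile treatments devicestatus; do
    json="$participant/${participant}_${type}.json"
    [ -f "$json" ] || continue
    complex-json2csv < "$json" > "${json%.json}.csv"
  done
done
```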

Progress output from the script, in its current form, looks like:

########
########_entries.json
########_entries.csv
Starting participant ########
Extracted ########_profile.json; splitting it...
.
Grouping split records into valid json...
-
Creating CSV files...
=
Participant ########: profile CSV files created:       1
Extracted ########_treatments.json; splitting it...
..............
Grouping split records into valid json...
--------------
Creating CSV files...
==============
Participant ########: treatments CSV files created:      14
Extracted ########_devicestatus.json; splitting it...
...................................
Grouping split records into valid json...
-----------------------------------
Creating CSV files...
===================================
Participant ########: devicestatus CSV files created:      35

Tool #2: Unzip, merge, and create an output file from multiple data files in an OH download

Unzip-merge-output.sh. Note that this tool was designed for use with the OpenAPS and Nightscout Data Commons, which pulls Nightscout data into Open Humans as JSON files. Other users will need to specify the data file types to use in the second "for" loop, but can use this script as a template for taking various pieces of data from multiple files (e.g. timezone from devicestatus and BG data from entries) and creating one file, complete with headers, ready for data analysis.
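As a rough illustration of the merge idea (not the actual script), the sketch below pulls one field from a devicestatus CSV and the BG rows from an entries CSV and writes a single headed output file. The file names and column positions are assumptions; adjust them to match your own CSVs.

```bash
# Illustrative sketch only; file names and column positions are assumed.
echo "participant_id,timestamp,bg,timezone" > merged-output.csv
for dir in */; do
  id="${dir%/}"
  # Take a timezone value (assumed column 3) from devicestatus,
  # and timestamp + BG (assumed columns 1 and 2) from entries.
  tz=$(sed -n 2p "${dir}devicestatus.csv" | cut -d, -f3)
  tail -n +2 "${dir}entries.csv" | cut -d, -f1,2 | \
    awk -F, -v id="$id" -v tz="$tz" '{print id "," $1 "," $2 "," tz}' >> merged-output.csv
done
```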

Per the headers in the example file provided with this script, I have Excel formulas that calculate whether each data point falls in a control period, an intervention period, or neither; the hour of day, to determine whether it is daytime or nighttime; and (once the looping start date is manually added to the file) the number of days spent looping and the number of days of data in the upload, which are used to set the control/intervention time frames per the project protocol.

Mock data in the output file, along with additional calculations for various variables as defined by a project protocol:

Example output file with mock data and formulas embedded for calculating these other fields

Tool #3: Examples and descriptions of the four data file types from Nightscout

NS-data-types.md attempts to explain the nuances of each of the four data file types, and what each contains: profile, entries, devicestatus, and treatments.

Tool #4: Pull ISF from device status

Requires csvkit, so run `sudo pip install csvkit` before running this script. It also assumes your NS data files are already in CSV format, via Tool #1 (Unzip-Zip-CSVify-OpenHumans-data.sh).

Note: depending on your install of six, you may get an attribute error. Following this rabbit hole about the error, various combinations of the solutions outlined in this Stack Overflow article may help.

The devicestatus-pull-isf-timestamp.sh script, when successful, pulls ISF and timestamp to enable further ISF analysis.

The output file looks like this: Example of ISF timestamp puller
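For context, the core of the idea can be expressed with csvkit's csvcut, roughly as below. The column names are placeholders, not necessarily the ones the script uses; run csvcut -n on one of your devicestatus CSVs to see the real flattened header names produced by Tool #1.

```bash
# Rough sketch of the ISF/timestamp pull; column names are placeholders.
# Use `csvcut -n <file>` to list the real column names in your CSVs.
echo "isf,timestamp" > isf-timestamp.csv
for csv in *_devicestatus*.csv; do
  csvcut -c "openaps/suggested/ISF,timestamp" "$csv" | tail -n +2 >> isf-timestamp.csv
done
```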

Tool #5: Assess amount of looping data

There are two methods for assessing amounts of data.

  • You can use howmuchBGdata.sh to see how much time's worth of BG entries someone has. However, this doesn't necessarily represent the amount of looping data.
  • Or, you can use howmuchdevicestatusdata.sh to see how much looping data someone has in the Data Commons (OpenAPS only for now; someone can add Loop assessment later using the same principle).

Before running howmuchdevicestatusdata.sh, you'll first need to run devicestatustimestamp.sh to pull the timestamps out into a separate file. You'll also need csvkit (see Tool #4 for details). Both scripts may need chmod +x <filename> before running on your machine.
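To show the general idea (not the actual scripts), the sketch below estimates how many days of data a timestamp file covers by comparing its first and last timestamps. It assumes GNU date, a header row, and the timestamp in column 1; the file naming is also just a placeholder.

```bash
# Illustrative sketch: days of data per participant from a timestamp CSV.
# Assumes GNU date (`date -d`), a header row, and the timestamp in column 1.
for f in *_devicestatus-timestamps.csv; do
  first=$(sed -n 2p "$f" | cut -d, -f1)
  last=$(tail -n 1 "$f" | cut -d, -f1)
  days=$(( ( $(date -d "$last" +%s) - $(date -d "$first" +%s) ) / 86400 ))
  echo "${f%%_*},$days" >> looping-data-days.csv
done
```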

Output on the command line of devicestatustimestamp.sh: Example from command line output of devicestatustimestamp.sh

Then, run howmuchdevicestatusdata.sh, and the output in the command line also shows which files are being processed: Example from command line output of howmuchdevicestatusdata.sh

The output of howmuchdevicestatusdata.sh is a CSV.

  • Because someone may have multiple uploads, there may be multiple lines for a single person. You can use Excel to de-duplicate these (or see the command-line sketch below).
  • Loop users will show up as 0 (until someone updates the script to also pull loop/enacted/timestamp). You may want to remove these before averaging to estimate the Data Commons' total looping data.

Example CSV output of howmuchdevicestatusdata.sh
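If you prefer the command line over Excel, a rough sketch like the one below can drop exact duplicate rows and zero-data rows. The file names and the assumption that the data amount is in column 2 are placeholders, and it will not consolidate genuinely different uploads from the same person, so some manual review is still needed.

```bash
# Rough command-line alternative to Excel de-duplication. File names and the
# column holding the data amount (assumed column 2) are placeholders; this
# only removes exact duplicates and zero rows, not differing uploads per person.
{ head -n 1 howmuchdevicestatusdata-output.csv
  tail -n +2 howmuchdevicestatusdata-output.csv | sort -u | awk -F, '$2 != 0'
} > howmuchdevicestatusdata-deduped.csv
```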

TODO for Tool 5:

  1. add Loop/enacted/timestamp to also assess Loop users
  2. add a script version that includes both BG and looping data in the same output CSV

Tool #6: Outcomes

This script (outcomes.sh) assesses the start/end of the BG and looping data to calculate time spent low (default: <70 mg/dL), time in range (default: 70-180 mg/dL), time spent high (default: >180 mg/dL), the number of high readings, and the average glucose for the time frame where there is entries data and the person is looping.

Tl;dr - this analyzes the time frame after looping began.
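As a sketch of the arithmetic only (outcomes.sh itself also lines the BG data up with the looping start and end dates), the awk snippet below computes the default low/in-range/high splits and the average glucose from a CSV, assuming a header row and the BG value in mg/dL in column 2.

```bash
# Sketch of the time-in-range arithmetic; assumes a header row and the BG
# value (mg/dL) in column 2 of entries.csv. Thresholds match the defaults.
awk -F, 'NR > 1 {
  total++; sum += $2
  if ($2 < 70) low++
  else if ($2 > 180) high++
  else inrange++
}
END {
  printf "time low (<70): %.1f%%\n", 100*low/total
  printf "time in range (70-180): %.1f%%\n", 100*inrange/total
  printf "time high (>180): %.1f%% (%d readings)\n", 100*high/total, high
  printf "average glucose: %.1f mg/dL\n", sum/total
}' entries.csv
```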
