# Utils Notebook

This notebook illustrates the use of some of the utility functions in utils.py. Some cells must be run in a Python virtual environment; for setup, see:

https://github.com/onnela-lab/beiwe/wiki/mano

https://github.com/harvard-nrg/mano

In [1]:
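# import utils.py as a module; provide the correct path here
# (importlib replaces the deprecated imp module, which was removed in Python 3.12)
import os
import importlib.util
spec = importlib.util.spec_from_file_location("utils", os.path.expanduser("~/Dropbox/JP/beiwe/code/utils.py"))
utils = importlib.util.module_from_spec(spec)
spec.loader.exec_module(utils)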

### Download data from Beiwe

This section downloads data from the Beiwe back-end. For help, see the mano links at the top of this notebook.

JP Onnela / June 30, 2018

In [2]:
help(utils.download_beiwe_data)

Help on function download_beiwe_data in module utils:

download_beiwe_data(study_id, data_streams, output_folder, time_end=None, time_start=None)
    Download Beiwe data to a local directory. If time_start is not specified, extraction will
    start one week before time_end. If time_end is not specified, today's date is used as default.
    
    Args:
        study_id (str): Beiwe study ID, e.g. "5a3a856203d3c42e31329970"
        data_streams (list): Names of data streams as strings to download
        output_folder (str): Downloaded data will be stored here
        time_end (str): End date of extraction in the format YYYY-MM-DD
        time_start (str): Start date of extraction in the format YYYY-MM-DD
    
    Returns:
        active_users (list): List of Beiwe user IDs for whom data directories were returned
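
The default date window described in the docstring can be sketched as follows. This is a minimal illustration, not the code in utils.py: `default_window` is a hypothetical helper, and the one-week default and the timestamp format are inferred from the docstring and the log output shown in this notebook.

```python
from datetime import date, datetime, time, timedelta

def default_window(time_start=None, time_end=None, days=7):
    """Return (start, end) as ISO timestamps, applying the documented defaults.

    Hypothetical helper for illustration only; not part of utils.py.
    """
    # If no end date is given, fall back to today at midnight.
    if time_end is None:
        end = datetime.combine(date.today(), time.min)
    else:
        end = datetime.strptime(time_end, "%Y-%m-%d")
    # If no start date is given, go back one week from the end date.
    if time_start is None:
        start = end - timedelta(days=days)
    else:
        start = datetime.strptime(time_start, "%Y-%m-%d")
    return start.isoformat(), end.isoformat()

print(default_window(time_end="2018-06-30"))
# ('2018-06-23T00:00:00', '2018-06-30T00:00:00')
```

With `time_end="2018-06-30"` and no `time_start`, the window matches the extraction range reported in the download log of this notebook.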



In [3]:
# study names and IDs
studies = {1: ("UCSD_Little_UNI_HIV_Pos", "5a3a856203d3c42e31329970"),
           2: ("UCSD_Little_UNI_HIV_Neg", "5a3a85db03d3c42e3132998a"),
           3: ("UCSD_Little_RDS Study", "5a3a862503d3c42e3217bb90")}

In [4]:
# specify study, data streams and output folder
study_no = 2
(study_name, study_id) = studies[study_no]
print("  Processing study %s (study ID %s)." % (study_name, study_id))
data_streams = ["power_state", "survey_answers"]
output_folder = "/tmp/beiwe-data/"

# proceed with download
subjects = utils.download_beiwe_data(study_id, data_streams, output_folder)

  Processing study UCSD_Little_UNI_HIV_Neg (study ID 5a3a85db03d3c42e3132998a).
enter keyring passphrase: ········
  Extracting data from 2018-06-23T00:00:00 to 2018-06-30T00:00:00 to /tmp/beiwe-data/.
  Downloading data for user 4k2zibq6.
  Downloading data for user h9wtyf9o.
  Downloading data for user jf69m2b8.
  Downloading data for user w2fa1uu5.
  Downloading data for user fvmuymh7.
  Downloading data for user xwrpz42o.
  Downloading data for user ofuhx7fg.
  Downloading data for user 47zghzxj.
  Downloading data for user e2loxwqb.
  Downloading data for user 86h2jjyz.
  Downloading data for user 5biz9f21.
  Downloading data for user csd2q1bf.
  Downloading data for user uzd11em8.
  Downloading data for user lzhywkhd.
  Downloading data for user 6isorvks.
  Downloading data for user 7wl2dltq.
  Downloading data for user nsabskmk.
  Downloading data for user vy5p4iva.
  Downloading data for user s8x47qv3.
  Downloading data for user 5drn99jp.
  Downloading data for user xns16dvw.


### Check the size of passive data files and survey files

This section prints the size of the passive data streams and surveys specified by the user. It assumes the download cells above have been run, since it reuses `output_folder`, `subjects`, and `data_streams`.

JP Onnela / June 27, 2018; June 30, 2018

In [5]:
help(utils.check_file_size)

Help on function check_file_size in module utils:

check_file_size(data_dir, dates, subjects, surveys, data_streams)
    Function to loop over all specified dates, subjects, data streams and surveys.
    Prints out the file sizes for each.
    
    Args:
        data_dir (str): Location of the directory called "data" as downloaded from Beiwe
        dates (list): List of dates to check
        subjects (list): List of Beiwe subject IDs to check
        surveys (list): List of Beiwe survey IDs to check
        data_streams (list): List of strings corresponding to different data streams:
            "accelerometer", "bluetooth", "calls", "gps", "identifiers", 
            "app_log", "power_state", "survey_answers", "survey_timings", 
            "texts", "audio_recordings", "wifi", "proximity", "gyro", 
            "magnetometer", "devicemotion", "reachability", "ios_log", "image_survey"
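
One way such a per-date size check might work is sketched below. This is not the implementation in utils.py: `stream_size_bytes` is a hypothetical helper, and the directory layout (`<data_dir>/<subject>/<stream>/` with file names that start with the date) is an assumption about how the downloaded data is organized.

```python
import os

def stream_size_bytes(data_dir, subject, stream, date):
    """Sum the sizes of one subject's files for one stream on one date.

    Hypothetical helper for illustration only; not part of utils.py.
    Assumed layout: <data_dir>/<subject>/<stream>/<YYYY-MM-DD ...> files,
    possibly nested in subdirectories (e.g. one per survey ID).
    """
    stream_dir = os.path.join(data_dir, subject, stream)
    total = 0
    # os.walk yields nothing if stream_dir does not exist, so the total stays 0.
    for root, _dirs, files in os.walk(stream_dir):
        for name in files:
            if name.startswith(date):
                total += os.path.getsize(os.path.join(root, name))
    return total

# Example call; the result depends on what has been downloaded locally.
print(stream_size_bytes("/tmp/beiwe-data/", "47zghzxj", "power_state", "2018-06-28"))
```

Returning 0 for a missing directory mirrors the report format in this notebook, where streams without data are listed with a size of 0 bytes.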



In [6]:
# specify data directory, dates, subjects, surveys
data_dir = output_folder
dates = ["2018-06-28"]
# subjects: reuse the list returned by utils.download_beiwe_data above
surveys = ["5aa8606103d3c4391076ba52"]

utils.check_file_size(data_dir, dates, subjects, surveys, data_streams)

Date: 2018-06-28
-----------------
Subject: fvmuymh7
  power_state total file size is 22362 bytes.
  survey_answers total file size is 0 bytes.
  5aa8606103d3c4391076ba52 survey total file size is 0 bytes.

Subject: csd2q1bf
  power_state total file size is 0 bytes.
  survey_answers total file size is 0 bytes.
  5aa8606103d3c4391076ba52 survey total file size is 0 bytes.

Subject: 47zghzxj
  power_state total file size is 11960 bytes.
  survey_answers total file size is 0 bytes.
  5aa8606103d3c4391076ba52 survey total file size is 501 bytes.

Subject: 5drn99jp
  power_state total file size is 25542 bytes.
  survey_answers total file size is 0 bytes.
  5aa8606103d3c4391076ba52 survey total file size is 0 bytes.

Subject: 4k2zibq6
  power_state total file size is 11579 bytes.
  survey_answers total file size is 0 bytes.
  5aa8606103d3c4391076ba52 survey total file size is 0 bytes.
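
Because the report is printed rather than returned, spotting subjects with no data means reading the text. The sketch below parses a report in the format shown above into a dictionary and lists subjects whose entries are all 0 bytes. `parse_size_report` is a hypothetical helper, not part of utils.py, and it assumes the report text has already been captured as a string (for example by copying the cell output).

```python
import re

def parse_size_report(text):
    """Parse the printed size report into {subject: {item: size_in_bytes}}.

    Hypothetical helper for illustration only; not part of utils.py.
    """
    sizes = {}
    subject = None
    pattern = re.compile(r"(\S+)(?: survey)? total file size is (\d+) bytes\.")
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Subject:"):
            subject = line.split(":", 1)[1].strip()
            sizes[subject] = {}
            continue
        match = pattern.match(line)
        if match and subject is not None:
            sizes[subject][match.group(1)] = int(match.group(2))
    return sizes

# Two entries copied from the report above.
report = """\
Subject: fvmuymh7
  power_state total file size is 22362 bytes.
  survey_answers total file size is 0 bytes.
  5aa8606103d3c4391076ba52 survey total file size is 0 bytes.

Subject: csd2q1bf
  power_state total file size is 0 bytes.
  survey_answers total file size is 0 bytes.
  5aa8606103d3c4391076ba52 survey total file size is 0 bytes.
"""

sizes = parse_size_report(report)
# Subjects for whom every listed stream and survey is empty on this date.
empty = [s for s, items in sizes.items() if all(v == 0 for v in items.values())]
print(empty)  # ['csd2q1bf']
```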


