Reinforcement Learning for SMS Messaging to Improve Medication Adherence - Roybal

Import necessary packages

In [8]:
import sys
import time
from azure.cognitiveservices.personalizer import PersonalizerClient
from azure.cognitiveservices.personalizer.models import RankRequest
from msrest.authentication import CognitiveServicesCredentials
import pandas as pd
import numpy as np
import math
from datetime import datetime
from collections import Counter
import string
import pickle
import json

Imports from Local Python Code Files

In [4]:
from input.data_input_functions import import_Pillsy, import_redcap, load_dict_pickle
from preprocessing.pillsy import get_pillsy_study_ids, find_rewards, update_pt_dict
from ranking.driverRank import run_ranking, write_sms_history
from rewarding.driverReward import send_rewards, get_reward_update
from preprocessing.redcap import update_patient_dict_redcap
from output.data_output_functions import export_pt_dict_pickle

Start Program Timer & Set Up MS Azure Personalizer Client

In [23]:
start_time = time.time()

Personalizer Keys:
In the Microsoft Azure Dashboard, navigate to our bwh-pharmacoepi-roybal-dev-use2-cog Cognitive Services page.
 Within the Keys and Endpoint section, copy either Key 1 or Key 2 to enter as the Personalizer Key.

In [24]:
from getpass import getpass

# getpass keeps the key from being echoed into the saved notebook output.
personalizer_key = getpass("ENTER PERSONALIZER KEY: \n")

ENTER PERSONALIZER KEY: 
[redacted]


Personalizer Endpoint:
https://bwh-pharmacoepi-roybal-dev-use2-cog.cognitiveservices.azure.com/

In [25]:
personalizer_endpoint = input("ENTER PERSONALIZER ENDPOINT: \n")

ENTER PERSONALIZER ENDPOINT: 
https://bwh-pharmacoepi-roybal-dev-use2-cog.cognitiveservices.azure.com/
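
To cut down on manual entry, the key and endpoint could be read from environment variables, with a prompt as the fallback. This is a minimal sketch only: the helper `get_setting` and the variable names `PERSONALIZER_KEY` and `PERSONALIZER_ENDPOINT` are assumptions for illustration and are not part of the trial code.

```python
import os
from getpass import getpass


def get_setting(env_name, prompt, secret=False, environ=None):
    """Return a setting from the environment, or prompt for it if unset.

    Hypothetical helper: env_name is whatever variable name the team agrees on.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(env_name, "").strip()
    if value:
        return value
    # Fall back to interactive entry; getpass avoids echoing secrets.
    ask = getpass if secret else input
    return ask(prompt).strip()
```

With this in place, the two prompts could become `get_setting("PERSONALIZER_KEY", "ENTER PERSONALIZER KEY: \n", secret=True)` and `get_setting("PERSONALIZER_ENDPOINT", "ENTER PERSONALIZER ENDPOINT: \n")`.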


Instantiates PersonalizerClient

In [26]:
client = PersonalizerClient(personalizer_endpoint, CognitiveServicesCredentials(personalizer_key))

If we've already initiated the trial, we will have:
* Pre-existing patient dataset in need of reward updates
* Pillsy data from yesterday to determine reward

Example Pillsy Inputs:
* MAC:  /Users/lilybessette/Dropbox (Partners HealthCare)/SHARED -- REINFORCEMENT LEARNING/Pillsy/PILLSY_Full Sample_CF.csv
* PC:   C:\Users\lg436\Dropbox (Partners HealthCare)\SHARED -- REINFORCEMENT LEARNING\Pillsy\PILLSY_Full Sample_CF.csv

Example Patient Data Inputs:
* MAC:  /Users/lilybessette/Dropbox (Partners HealthCare)/SHARED -- REINFORCEMENT LEARNING/PatientData/
* PC:   C:\Users\lg436\Dropbox (Partners HealthCare)\SHARED -- REINFORCEMENT LEARNING\PatientData
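
The Mac and PC examples differ only in the home directory and the path separator, so one way to avoid typing full paths is to build them with `pathlib`. This is a sketch under the assumption that the shared Dropbox folder always sits directly under the user's home directory, as it does in both examples; `shared_path` is a hypothetical helper, not an existing project function.

```python
from pathlib import Path

# Folder names copied from the example inputs above.
SHARED_FOLDER = Path("Dropbox (Partners HealthCare)") / "SHARED -- REINFORCEMENT LEARNING"


def shared_path(*parts, home=None):
    """Build a path inside the shared Dropbox folder under the home directory."""
    home = Path.home() if home is None else Path(home)
    return home.joinpath(SHARED_FOLDER, *parts)


# Example: the Pillsy sample file, resolved for whoever is running the notebook.
pillsy_csv = shared_path("Pillsy", "PILLSY_Full Sample_CF.csv")
```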

*To Do: Need to make example patient data pickle*

In [None]:
pt_dict = load_dict_pickle()
new_pillsy_data = import_Pillsy()

# Extract list of study_id's in the Pillsy Data.
pillsy_study_ids_list = get_pillsy_study_ids(new_pillsy_data)
# From the Pillsy data, compute the Rewards to send to Personalizer for each patient's Rank calls from yesterday's run.
pt_dict_with_reward, pt_dict_without_reward = find_rewards(new_pillsy_data, pillsy_study_ids_list, pt_dict)

# Using this updated Pillsy joined Patient data, we format the rewards to Personalizer into a dataframe
rewards_to_send = get_reward_update(pt_dict_with_reward)

# Combines the patients with and without rewards to update the patient dataset.
pillsy_updated_pt_dict = update_pt_dict(pt_dict_with_reward, pt_dict_without_reward)

send_rewards(rewards_to_send, client)

Whether or not this is the trial initiation, we process the new patients in the trial and call Personalizer
to rank action features to find the correct text message to send today.

In [None]:
redcap_data = import_redcap()
redcap_updated_pt_dict = update_patient_dict_redcap(redcap_data, pillsy_updated_pt_dict)
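final_pt_dict = {}

# Rank every patient in the RedCap-updated dictionary (iterating over the
# empty final_pt_dict would skip ranking entirely) and keep the ranked result.
for study_id, patient in redcap_updated_pt_dict.items():
    ranked_patient = run_ranking(patient, client)
    final_pt_dict[study_id] = ranked_patient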

# TODO: write CSV for SMS export.
# Function input: patient dictionary -> function exports a CSV to the "SMSHistory" folder.
# Columns: 'record_id'

write_sms_history(final_pt_dict)
# TODO: write a run function to export a CSV of the SMS output.
export_pt_dict_pickle(final_pt_dict)
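
The SMS export described in the comments above could look roughly like the following. This is a sketch, not the actual `write_sms_history`: it assumes the patient dictionary maps each study_id to a plain dict, and the `sms_text` key, the output file name, and the function name `write_sms_history_sketch` are all hypothetical. Only the 'record_id' column and the "SMSHistory" folder come from the notes.

```python
import csv
from datetime import date
from pathlib import Path


def write_sms_history_sketch(pt_dict, out_dir="SMSHistory", run_date=None):
    """Write one CSV row per patient and return the path of the file written.

    Assumes pt_dict maps study_id -> dict with an 'sms_text' key (hypothetical).
    """
    run_date = date.today() if run_date is None else run_date
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{run_date.isoformat()}_sms_history.csv"
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["record_id", "sms_text"])
        writer.writeheader()
        for study_id, patient in pt_dict.items():
            # Patients without a chosen message get an empty cell.
            writer.writerow({"record_id": study_id,
                             "sms_text": patient.get("sms_text", "")})
    return out_path
```

Prefixing the file name with the run date keeps one file per daily run and matches the date-first naming convention proposed in the To Do list.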

To Do List:
* General -
    * Remove the manual option for whether this is the trial start or not; perhaps figure out a way to reduce human choice here
    * Export of log files for rank & reward as pickles - do we want these as CSVs? JULIE ANSWER - Yes we do
    * Export of patient_dict as CSV as well as pickle
    * Data processing - timezone timestamp issue for datetime data from Pillsy
    * Automate file import methods based on naming convention of:
        * date_redcap.csv
        * date_pillsy.csv
        * date_patient_dict.pickle
* Pillsy -
    * [x] Lily-DONE YAY- Pseudo code taken event decisions for Joe to code up (did it before, but we made some changes to algorithm)
    * [x] Lily-DONE YAY - Redo processing algorithm for taken events pseudo code handwritten in notes
    * [x] Lily-DONE YAY - Fix timing of when this is run - i.e. subset to time frame
    * [x] Lily-DONE YAY Need variable to store the last time this was run - i.e. should this be user entered or stored?
    * [x] Lily-DONE YAY  i.e. not from time of last run to current time; will be from last run to defined time of 'early today'
    * [ ] Need to define what 'early today' is, i.e. 2am onwards? I don't think we defined this yet. Question sent to Elad/Marco/Julie again; JULIE ANSWER - algorithm runs from last run to 12am, and anything from 12am to current time = early use
    * [x] Lily-DONE-YAY- Account for if patient already took med that day
        * i.e. run the above algorithm on data from the morning of the run, so that early_rx_use_before_sms is true if adherence = 1 already
* RedCap -
    * [ ] Need to work on import method
    * Need to properly encode categorical variables as strings rather than numeric
    * Do we need to account for the RedCap data changing for a patient who is already receiving text messages, i.e. updates to baseline variables coming from RedCap? JULIE ANSWER - Yes (only update for Pillsy Rx and Num 2x meds)
* SMS -
    * For each patient:
        * Use 1,0's from ranking that are stored in "FEATURE"_sms as 0,1 (2 for neg framing) to compute SMS choice
        * Store sms choice by factor_set and text_number
        * Retrieve String of that chosen text_number and replace X with total_dichot_adherence_past7
    * Export file once all patients are updated
* Ranking/Context Features -
    * How to json dump without null values because we don't want these in the model
    * Double checking json dump is working appropriately and looks correct
    * How to encode the rank decisions from each stepwise rank call to Personalizer
        * How to encode framing = pos in the history rank call
        * i.e. do we want this as a namespace or is it okay as a free standing variable in the json features list
* Add ons -
    * Making this more user friendly than a Jupyter Notebook / Command Line
    * Hooks into Pillsy/RedCap for data retrieval
    * Hooks into SMS Platform to automate text sending
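
For the "json dump without null values" item under Ranking/Context Features, one option is to strip missing values before serializing. This is a minimal sketch using only the standard library; the helper names and the example feature names are hypothetical, and it treats both `None` and NaN (which is how pandas usually represents missing numbers) as null.

```python
import json
import math


def _is_null(value):
    """True for None and for float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_nulls(value):
    """Recursively remove null entries from nested dicts and lists."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if not _is_null(v)}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value if not _is_null(v)]
    return value


def dumps_without_nulls(features):
    """Serialize features to JSON after dropping null entries."""
    return json.dumps(drop_nulls(features))


# Hypothetical context features for one patient.
example = {"age": 61, "sex": None, "history": [{"framing": "pos", "x": float("nan")}]}
print(dumps_without_nulls(example))
```

Note that a dict or list whose entries are all null is kept as an empty container rather than removed; whether that is acceptable for the model is a separate decision.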