## Finding Effective Treatments

Many of the analyses that we want to perform hinge on being able to determine which treatments or tags affect how a user feels.
In this document we will go over a few different ways to measure this.

There are two major challenges that we need to deal with.  The first is that the effects we are trying to measure may have unmeasured delays or cycles.  For example, a person who begins taking Synthroid for a thyroid condition is not expected to see improvement for several weeks.
The second major challenge is that this data is inherently noisy, as is expected of user-reported data.  Users rarely log in every single day, and they may not give perfectly accurate representations of their symptoms.

We should also be aware that we may be dealing with some biases.  For example, people who are drawn to use the app are likely suffering more than the average person, are more likely to be in a younger age group, and have likely already had some trouble with diagnosis.

## Correlation

The first thing we will try in finding effective treatments is a simple correlation between treatments/tags and the severity of conditions/symptoms.

We will take a look at the top 10 Conditions, and see whether each user shows a correlation between being on a treatment or tag on a specific day and their Condition rating for that day.  The code in this section works through Depression as the example.
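As a rough illustration of how the top 10 Conditions could be picked, the sketch below ranks conditions by the number of distinct users tracking them. It runs on a small made-up frame rather than the real CSV, and it assumes that conditions are stored with `trackable_type == 'Condition'` (the data shown in this notebook only confirms the `'Treatment'` and `'Tag'` values). The helper name `top_conditions` is hypothetical.

```python
import pandas as pd

# Toy stand-in for the tracking data. The 'Condition' value of
# trackable_type is an assumption; all rows here are invented.
toy = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3],
    "trackable_type": ["Condition", "Treatment", "Condition",
                       "Condition", "Condition", "Tag"],
    "trackable_name": ["Depression", "Synthroid", "Depression",
                       "Crohn's disease", "Anxiety", "stress"],
})


def top_conditions(frame, n=10):
    """Rank conditions by the number of distinct users who track them."""
    conditions = frame[frame["trackable_type"] == "Condition"]
    users_per_condition = conditions.groupby("trackable_name")["user_id"].nunique()
    return users_per_condition.sort_values(ascending=False).head(n)


top = top_conditions(toy)
print(top)
```

Ranking by distinct users rather than by raw row count keeps a single heavy logger from pushing a rare condition into the list.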

In [None]:
import numpy as np
import pandas as pd
from scipy.stats import pearsonr
import warnings
warnings.filterwarnings('ignore')

#When you can get this down to 0.05 and still get some results, then we're getting somewhere
P_THRESHOLD = 0.1

df = pd.read_csv("flaredown_trackable_data_080316.csv")
df['checkin_date'] = pd.to_datetime(df['checkin_date'])

just_depressed_users = df.groupby(['user_id', 'checkin_date']).filter(lambda x: 'Depression' in x['trackable_name'].values)
def add_depression_score(x):
    return x[x['trackable_name'] == 'Depression']['trackable_value'].values[0]
depression_days = just_depressed_users.groupby(['user_id', 'checkin_date'])

#create a table of depression scores by user/day
depression_scores = depression_days.apply(add_depression_score)
depression_scores = depression_scores.reset_index()
depression_scores.columns = ["user_id", "checkin_date", "depression_score"]

# keep only treatment and tag rows (DataFrame.append was removed in pandas 2.0)
just_depressed_users = just_depressed_users[just_depressed_users['trackable_type'].isin(['Treatment', 'Tag'])]
just_depressed_users = pd.get_dummies(just_depressed_users, columns=['trackable_name'])

just_depressed_users = just_depressed_users.merge(depression_scores, on=['user_id','checkin_date'])
just_depressed_users['depression_score'] = pd.to_numeric(just_depressed_users['depression_score'])

print("writing file")
userlist = set(just_depressed_users['user_id'])
for userid in userlist:
    user = just_depressed_users[just_depressed_users['user_id'] == userid]
    # pearsonr needs at least two observations
    if len(user) < 2:
        continue
    # assumes the dummy columns run from position 5 up to, but not including,
    # the final depression_score column
    for column in user.columns[5:-1]:
        corr, p = pearsonr(user['depression_score'], user[column])
        if p < P_THRESHOLD:
            print("found correlation")

What makes this difficult is that we are trying to determine whether a treatment/tag is working for an individual, not for the whole group, which means that our sample size for each user depends on how many days they have logged in the app.  This leads to fairly low sample sizes and unacceptably high p-values, especially for this shotgun approach of testing every treatment and tag at once.  We also need to consider whether the treatment makes the user feel better the same day.  What if a treatment takes a week to kick in?  Or what if the treatment, such as Humira for Crohn's, functions on a two-week cycle?
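One way to probe the delay question is to correlate today's score with the treatment flag from a few days earlier and see which lag gives the strongest relationship. The sketch below is only an illustration on invented data: the function name `lagged_correlations`, the column names `on_treatment` and `score`, and the one-row-per-day layout are all assumptions, not part of the analysis above.

```python
import pandas as pd


def lagged_correlations(daily, treatment_col, score_col, max_lag=14):
    """Correlate today's score with the treatment flag from `lag` days earlier.

    `daily` is assumed to have a DatetimeIndex with at most one row per day.
    """
    # Insert missing calendar days as NaN rows so a shift of k rows means k days.
    daily = daily.sort_index().asfreq("D")
    results = {}
    for lag in range(max_lag + 1):
        shifted = daily[treatment_col].shift(lag)
        # Series.corr computes Pearson's r and skips pairs that contain NaN.
        results[lag] = shifted.corr(daily[score_col])
    return pd.Series(results, name="correlation")


# Toy data: the treatment is logged every third day and the score
# drops two days after each dose.
days = pd.date_range("2016-07-01", periods=12, freq="D")
toy = pd.DataFrame(
    {"on_treatment": [1, 0, 0] * 4, "score": [4, 4, 1] * 4},
    index=days,
)
by_lag = lagged_correlations(toy, "on_treatment", "score", max_lag=3)
print(by_lag)
```

In the toy data the most negative correlation appears at a lag of two days, which is the delay that was built in. On real data each extra lag is another test, so scanning many lags makes the sample size problem described above worse, not better.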

As can be seen above, the vast majority of users do not get any correlations with an acceptable p-value, so this strict method of determining whether a treatment/tag is working can only be considered inconclusive.
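Part of why a shotgun approach is hard on p-values is that every extra treatment or tag tested for a user is another chance of a false positive. A common, conservative adjustment is the Bonferroni correction, which divides the significance level by the number of tests. The sketch below shows the arithmetic on invented p-values; the function names and the treatment/tag labels are hypothetical, and this correction is not something the analysis above applies.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant(p_values, alpha=0.05):
    """Keep only the tests whose p-value beats the corrected threshold."""
    if not p_values:
        return {}
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {name: p for name, p in p_values.items() if p < threshold}


# Invented p-values for one user's treatments and tags.
toy_p_values = {
    "Treatment A": 0.001,
    "Tag B": 0.04,
    "Treatment C": 0.2,
    "Tag D": 0.012,
}
kept = significant(toy_p_values)
print(kept)
```

With four tests the per-test threshold becomes 0.05 / 4 = 0.0125, so a result at p = 0.04 that would pass an uncorrected 0.05 cutoff is no longer counted.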

This way of thinking may prove fruitful when the app has been around for longer, but at this time the number of logged days per user is too low.
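To put a number on "too low", it can help to count how many distinct days each user has checked in and to see how many users clear a minimum. The sketch below does this on a small invented frame using the `user_id` and `checkin_date` columns from the data above; the cutoff `MIN_DAYS` is a hypothetical value chosen only so the toy example has something to filter.

```python
import pandas as pd

# Invented check-ins; several rows can share a user and a day because each
# trackable is logged on its own row.
toy = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "checkin_date": pd.to_datetime([
        "2016-07-01", "2016-07-01", "2016-07-02",
        "2016-07-01", "2016-07-05", "2016-07-03",
    ]),
})

MIN_DAYS = 2  # hypothetical cutoff, far too small for a real analysis

# Number of distinct check-in days per user.
days_logged = toy.groupby("user_id")["checkin_date"].nunique()

# Users with enough logged days to be worth testing.
enough_data = days_logged[days_logged >= MIN_DAYS]
print(days_logged)
print(enough_data)
```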


## Comparison of Before and After Treatment

In order to simplify this metric, let's take a look at the day that a treatment started.  If a user suffers from a condition, and then starts logging a new treatment or tag, does the condition get better after that point?

We will need to identify the date that a treatment or tag started, and then calculate the mean condition score before that date and the mean condition score after that date.  Hopefully the score after the date will be improved (lower).

Using this metric helps us get around a few quirks of the data.  Since we are taking the mean, we can largely ignore any cyclical nature of the condition.  It also accounts for users who have started taking a treatment but are not consistently logging it.
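The before/after comparison described above can be sketched for a single user and a single treatment as follows. This is a minimal illustration on invented data, assuming the treatment start is the earliest date on which the treatment was logged and that the start day itself counts as "after". The function name `before_after_means` and its inputs are hypothetical.

```python
import pandas as pd


def before_after_means(scores, treatment_dates):
    """Compare the mean condition score before and after a treatment starts.

    scores: Series of condition scores indexed by check-in date.
    treatment_dates: dates on which the treatment was logged.
    """
    # Assumption: the treatment starts on the first day it was logged.
    start = min(treatment_dates)
    before = scores[scores.index < start]
    after = scores[scores.index >= start]
    if before.empty or after.empty:
        return None  # nothing to compare against
    return {
        "start": start,
        "before": before.mean(),
        "after": after.mean(),
        "change": after.mean() - before.mean(),
    }


# Invented scores for one user over eight days (lower is better).
days = pd.date_range("2016-07-01", periods=8, freq="D")
scores = pd.Series([4, 3, 4, 3, 2, 1, 2, 1], index=days)

# The treatment is only logged on two days, but counts from the first one.
logged = [pd.Timestamp("2016-07-05"), pd.Timestamp("2016-07-07")]

result = before_after_means(scores, logged)
print(result)
```

A negative `change` means the mean score dropped after the start date. Because only the first logged date is used, gaps in later logging do not affect the split.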