# Data Analysis

Questions:

- Does X treatment affect Y symptom positively/negatively/not at all? What are the most strongly-correlated symptoms and treatments?
- Are there subsets within our current diagnoses that could more accurately represent symptoms and predict effective treatments?
- Can we reliably predict what triggers a flare for a given user or all users with a certain condition?
- Could we recommend treatments more effectively based on similarity between users, rather than on specific symptoms and conditions? (Similar to Netflix-style recommendations, but for treatments.)
- Can we quantify a patient’s level of disease activity based on their symptoms? How different is it from our existing measures?
- Can we predict which symptom should be treated to have the greatest effect on a given illness?
- How accurately can we guess a condition based on a user’s symptoms?
- Can we detect new interactions between treatments?

Source: [Flaredown Autoimmune Symptom Tracker (Kaggle)](https://www.kaggle.com/flaredown/flaredown-autoimmune-symptom-tracker?select=export.csv)

In [1]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

In [33]:
df = pd.read_csv(r'C:\Users\Chunna\Documents\Data_Analyst_Portfolio\Chronic_Illness\Chronic_Illness.csv', low_memory=False)
df

Unnamed: 0,user_id,age,sex,country,checkin_date,trackable_id,trackable_type,trackable_name,trackable_value
0,QEVuQwEABlEzkh7fsBBjEe26RyIVcg==,,,,2015-11-26,1069,Condition,Ulcerative colitis,0
1,QEVuQwEAWRNGnuTRqXG2996KSkTIEw==,32.0,male,US,2015-11-26,1069,Condition,Ulcerative colitis,0
2,QEVuQwEA+WkNxtp/qkHvN2YmTBBDqg==,2.0,female,CA,2017-04-28,3168,Condition,pain in left upper arm felt like i was getting...,4
3,QEVuQwEA+WkNxtp/qkHvN2YmTBBDqg==,2.0,female,CA,2017-04-28,3169,Condition,hip pain when gettin up,3
4,QEVuQwEA+WkNxtp/qkHvN2YmTBBDqg==,2.0,female,CA,2017-04-28,3170,Condition,pain in hand joints,4
...,...,...,...,...,...,...,...,...,...
7976218,QEVuQwEAtlfm8VyoxZ9biWjDHb74gQ==,22.0,female,GB,2019-12-04,1,Tag,tired,
7976219,QEVuQwEAtlfm8VyoxZ9biWjDHb74gQ==,22.0,female,GB,2019-12-04,2,Tag,stressed,
7976220,QEVuQwEAtlfm8VyoxZ9biWjDHb74gQ==,22.0,female,GB,2019-12-04,9002,Food,soup,
7976221,QEVuQwEAtlfm8VyoxZ9biWjDHb74gQ==,22.0,female,GB,2019-12-04,9139,Food,yogurt,
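
The `Unnamed: 0` column in the output above usually means the CSV was written with its row index included. A minimal sketch of one way to load such a file, assuming that is the case here: `index_col=0` reuses the first column as the index and `parse_dates` converts `checkin_date` to a datetime. The in-memory sample and the `u1`/`u2`/`u3` user IDs are hypothetical stand-ins, not rows from the real export.

```python
import io

import pandas as pd

# Hypothetical three-row sample with the same columns as the export.
# The empty first header field is what pandas would name "Unnamed: 0".
sample_csv = io.StringIO(
    ",user_id,age,sex,country,checkin_date,trackable_id,"
    "trackable_type,trackable_name,trackable_value\n"
    "0,u1,,,,2015-11-26,1069,Condition,Ulcerative colitis,0\n"
    "1,u2,32.0,male,US,2015-11-26,1069,Condition,Ulcerative colitis,0\n"
    "2,u3,22.0,female,GB,2019-12-04,1,Tag,tired,\n"
)

# Use the first column as the index and parse the check-in date.
sample = pd.read_csv(sample_csv, index_col=0, parse_dates=["checkin_date"])

print(sample.dtypes)
```

With the real file, the same two keyword arguments could be added to the existing `pd.read_csv` call; whether that is appropriate depends on how the CSV was originally saved.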


In [24]:
# Count null cells across all columns (this counts cells, not rows)
null_count = df.isnull().sum().sum()
null_count

1666202

In [25]:
# Count rows
row_count = df.shape[0]
row_count

7976223

In [29]:
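# Null cells per 100 rows. null_count sums nulls over every column,
# so this is not the percentage of rows (or of cells) that are null.
round(null_count / row_count * 100, 2)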

20.89
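
The ratio above divides a count of null cells by a count of rows, so it is best read as "null cells per 100 rows". A minimal sketch of three alternative measures follows: the null share per column, the share of rows with at least one null, and the share of all cells that are null. The `toy` frame and its values are invented for illustration; only the column names come from the dataset.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real export; the values are made up.
toy = pd.DataFrame({
    "user_id": ["a", "b", "c", "d"],
    "age": [np.nan, 32.0, 2.0, 22.0],
    "sex": [None, "male", "female", "female"],
    "trackable_type": ["Condition", "Condition", "Condition", "Tag"],
    "trackable_value": ["0", "0", "4", None],
})

# Percentage of null values in each column.
null_share_per_column = toy.isnull().mean() * 100

# Percentage of rows that contain at least one null.
rows_with_null_pct = toy.isnull().any(axis=1).mean() * 100

# Percentage of all cells (rows x columns) that are null.
null_cell_pct = toy.isnull().sum().sum() / toy.size * 100

print(null_share_per_column.round(2))
print(round(rows_with_null_pct, 2), round(null_cell_pct, 2))
```

The per-column view is often the most useful of the three, because it shows whether the nulls are concentrated in a few fields such as `age` or `trackable_value`.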

In [30]:
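# Listwise (case) deletion: drop every row that has at least one null.
# dropna() returns a new DataFrame; df is unchanged unless the result is assigned.
df.dropna()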

Unnamed: 0,user_id,age,sex,country,checkin_date,trackable_id,trackable_type,trackable_name,trackable_value
1,QEVuQwEAWRNGnuTRqXG2996KSkTIEw==,32.0,male,US,2015-11-26,1069,Condition,Ulcerative colitis,0
2,QEVuQwEA+WkNxtp/qkHvN2YmTBBDqg==,2.0,female,CA,2017-04-28,3168,Condition,pain in left upper arm felt like i was getting...,4
3,QEVuQwEA+WkNxtp/qkHvN2YmTBBDqg==,2.0,female,CA,2017-04-28,3169,Condition,hip pain when gettin up,3
4,QEVuQwEA+WkNxtp/qkHvN2YmTBBDqg==,2.0,female,CA,2017-04-28,3170,Condition,pain in hand joints,4
5,QEVuQwEA+WkNxtp/qkHvN2YmTBBDqg==,2.0,female,CA,2017-04-28,3171,Condition,numbness in right hand,2
...,...,...,...,...,...,...,...,...,...
7976213,QEVuQwEAtlfm8VyoxZ9biWjDHb74gQ==,22.0,female,GB,2019-12-04,3368,Symptom,difficulty getting up,4
7976214,QEVuQwEAtlfm8VyoxZ9biWjDHb74gQ==,22.0,female,GB,2019-12-04,153,Symptom,Neck pain,2
7976215,QEVuQwEAtlfm8VyoxZ9biWjDHb74gQ==,22.0,female,GB,2019-12-04,242,Symptom,Fatigue,3
7976216,QEVuQwEAtlfm8VyoxZ9biWjDHb74gQ==,22.0,female,GB,2019-12-04,1026,Symptom,Poor concentration,3
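
In the rows displayed earlier, the `Tag` and `Food` entries have an empty `trackable_value`, and the tail of the `dropna()` result shows only `Symptom` rows. That suggests listwise deletion may remove those trackable types entirely. A minimal sketch of a narrower alternative, assuming only missing demographics should disqualify a row, is to pass `subset` to `dropna`. The `toy` frame is a hypothetical stand-in and the choice of subset columns is an assumption.

```python
import numpy as np
import pandas as pd

# Toy stand-in mirroring the pattern in the output: Tag rows have no value.
toy = pd.DataFrame({
    "age": [np.nan, 32.0, 22.0, 22.0],
    "sex": [None, "male", "female", "female"],
    "country": [None, "US", "GB", "GB"],
    "trackable_type": ["Condition", "Condition", "Symptom", "Tag"],
    "trackable_name": ["Ulcerative colitis", "Ulcerative colitis", "Fatigue", "tired"],
    "trackable_value": ["0", "0", "3", None],
})

# Listwise deletion: any null in any column removes the row (the Tag row is lost).
listwise = toy.dropna()

# Narrower deletion: only rows with missing demographics are removed.
demographics_only = toy.dropna(subset=["age", "sex", "country"])

print(listwise["trackable_type"].value_counts())
print(demographics_only["trackable_type"].value_counts())
```

Comparing `value_counts()` on `trackable_type` before and after deletion on the real data would show whether any category disappears.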
