**Cleaning of Home Health Care Data and Creating Single Spreadsheet**
***
The goal of this cleaning is to join the home health care ratings spreadsheet with the additional patient-reported ratings into a single spreadsheet.<br>
The long column names are then replaced with short ones for ease of use.<br>
Finally, the cleaned ratings are merged with the cleaned readmissions data into one spreadsheet for analysis.


In [1]:
# Importing Necessary Tools
import pandas as pd
import numpy as np

In [2]:
# Import DataFrames for Cleaning
df_rate = pd.read_csv('HHC_Rate.csv')
df_add = pd.read_csv('HHC_Report.csv')
df_cleaned = pd.read_csv('Readmissions_Cleaned.csv')

In [3]:
df_rate.head()

Unnamed: 0,State,Quality of Patient Care Star Rating,How often the home health team began their patients’ care in a timely manner,How often the home health team taught patients (or their family caregivers) about their drugs,How often the home health team checked patients’ risk of falling,How often the home health team checked patients for depression,How often the home health team made sure that their patients have received a flu shot for the current flu season.,How often the home health team made sure that their patients have received a pneumococcal vaccine (pneumonia shot).,"With diabetes, how often the home health team got doctor’s orders, gave foot care, and taught patients about foot care",How often the home health team checked patients for pain,...,How often the home health team checked patients for the risk of developing pressure sores (bed sores),How often patients got better at walking or moving around,How often patients got better at getting in and out of bed,How often patients got better at bathing,How often patients had less pain when moving around,How often patients’ breathing improved,How often patients’ wounds improved or healed after an operation,How often patients got better at taking their drugs correctly by mouth,How often home health patients had to be admitted to the hospital,"How often patients receiving home health care needed urgent, unplanned care in the ER without being admitted"
0,AK,2.5,76.7,94.5,99.4,93.9,64.7,74.4,96.1,97.5,...,99.4,55.6,54.0,61.4,55.1,58.8,80.4,45.8,14.2,17.3
1,AL,3.5,94.3,97.8,99.6,98.4,72.4,73.3,96.9,99.6,...,99.5,72.9,66.7,77.3,75.6,74.5,91.3,62.5,17.5,12.2
2,AR,3.0,94.3,95.9,99.5,98.6,75.0,78.4,96.7,99.3,...,99.2,71.1,67.8,76.1,71.9,71.7,91.8,61.1,17.4,13.4
3,AZ,3.5,93.4,96.9,99.4,97.3,74.9,77.4,97.2,98.7,...,98.9,63.1,61.6,70.7,67.8,71.3,85.8,55.2,14.9,13.5
4,CA,3.5,90.3,97.2,99.1,97.7,73.6,74.8,97.0,98.9,...,98.5,66.1,62.0,71.4,75.2,71.9,91.9,55.5,14.8,11.9


In [4]:
df_add.head()

Unnamed: 0,State,Percent of patients who reported that their home health team gave care in a professional way,Percent of patients who reported that their home health team communicated well with them,"Percent of patients who reported that their home health team discussed medicines, pain, and home safety with them",Percent of patients who gave their home health agency a rating of 9 or 10 on a scale from 0 (lowest) to 10 (highest),"Percent of patients who reported YES, they would definitely recommend the home health agency to friends and family",Number of completed Surveys,Response rate
0,AK,89,85,81,82,81,,
1,AL,91,89,87,89,84,,
2,AR,90,87,86,87,83,,
3,AZ,87,84,80,81,75,,
4,CA,85,83,82,80,74,,


In [5]:
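# As You Can See, the Column Names Are Too Long. We'll Take Care of That After We Join the
# Two Tables on 'State' and Drop the Last Two NaN-Filled Columns.
# Note: dropna(axis=1) Drops Every Column That Contains Any NaN, Not Just the Last Two.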

df = pd.merge(df_rate,df_add, on = 'State')
df = df.dropna(axis=1)
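
The join-and-drop step above relies on two defaults: `pd.merge` performs an inner join, and `dropna(axis=1)` removes any column that has at least one missing value. The following is a minimal sketch of a slightly stricter variant, using made-up toy frames (`rate`, `add`) in place of `df_rate` and `df_add`. It assumes each state should appear exactly once per table, which the notebook does not state explicitly.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for df_rate and df_add (all values are made up)
rate = pd.DataFrame({"State": ["AK", "AL", "PR"], "star": [2.5, 3.5, 3.0]})
add = pd.DataFrame({
    "State": ["AK", "AL"],
    "recommend": [81, 84],
    "Response rate": [np.nan, np.nan],
})

# Inner join keeps only states present in both tables;
# validate raises MergeError if 'State' is duplicated on either side
merged = pd.merge(rate, add, on="State", how="inner", validate="one_to_one")

# how="all" drops only columns that are entirely NaN, so a column with a
# single missing value is kept (plain dropna(axis=1) would drop it)
merged = merged.dropna(axis=1, how="all")

print(merged)
```

Whether `how="all"` or the default is preferable depends on the data: the default is simpler when every partially missing column is unwanted, while `how="all"` guards against losing a rating column over one blank cell.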

In [6]:
# Checking to Ensure Only Two Columns Were Dropped. Expected Count Is 29 Columns.
# Also Checking to Ensure US Territories Were Dropped in the Merge. Expected Count Is 51 Rows.
df.shape


(51, 29)
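
The shape check above is done by eye. As a sketch, the same expectations could be turned into assertions so the notebook stops if the inputs change. The helper name `check_frame` is hypothetical, and the toy frame only stands in for the merged `df`.

```python
import pandas as pd

def check_frame(df, expected_shape, key="State"):
    """Raise AssertionError if the shape, key uniqueness, or completeness is off."""
    assert df.shape == expected_shape, f"shape is {df.shape}, expected {expected_shape}"
    assert df[key].is_unique, f"duplicate values in {key!r}"
    assert not df.isna().any().any(), "missing values remain"

# Toy frame standing in for the merged DataFrame (values are made up)
toy = pd.DataFrame({"State": ["AK", "AL"], "star": [2.5, 3.5]})
check_frame(toy, (2, 2))
```

In this notebook the call would presumably be `check_frame(df, (51, 29))`.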

In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 51 entries, 0 to 50
Data columns (total 29 columns):
State                                                                                                                     51 non-null object
Quality of Patient Care Star Rating                                                                                       51 non-null float64
How often the home health team began their patients’ care in a timely manner                                              51 non-null float64
How often the home health team taught patients (or their family caregivers) about their drugs                             51 non-null float64
How often the home health team checked patients’ risk of falling                                                          51 non-null float64
How often the home health team checked patients for depression                                                            51 non-null float64
How often the home health team made sure that 

In [8]:
# Creating List and Renaming Columns
col = ['state','star_rating','timeliness','rx_ed','fall_risk','depression_check','flu_shot',
      'pneumonia_shot','d_foot_care','pain_check','pain_treat','heart_f_treat','p_sore_action',
      'p_sore_intreat','p_sore_check','move_buff','in_out_bed_buff','bathing_buff','move_pain_debuff',
      'breathing_buff','healing_buff','oral_rx_buff','hospital_admit','urgent_noadmit',
      'pos_prof_report','pos_communicate_report','pos_treatment_ed_report','nine_to_ten_rate',
      'would_recommend']
df.columns=col
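
Assigning a list to `df.columns` depends on the list having exactly the right length and order. A possible alternative, sketched here on a toy frame with only two of the original headers, is an explicit old-to-new mapping passed to `rename`. The `mapping` dictionary below is illustrative and would need one entry per column in the real data.

```python
import pandas as pd

# Toy frame with two of the long source headers (values are made up)
toy = pd.DataFrame({
    "State": ["AK"],
    "Quality of Patient Care Star Rating": [2.5],
})

# Explicit old -> new mapping, so column order no longer matters
mapping = {
    "State": "state",
    "Quality of Patient Care Star Rating": "star_rating",
}

# errors="raise" fails loudly if a key in the mapping is not a column
renamed = toy.rename(columns=mapping, errors="raise")
print(list(renamed.columns))
```

If the list approach is kept, a one-line guard such as `assert len(col) == df.shape[1]` before the assignment gives a clearer error than a length mismatch inside pandas.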

In [9]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 51 entries, 0 to 50
Data columns (total 29 columns):
state                      51 non-null object
star_rating                51 non-null float64
timeliness                 51 non-null float64
rx_ed                      51 non-null float64
fall_risk                  51 non-null float64
depression_check           51 non-null float64
flu_shot                   51 non-null float64
pneumonia_shot             51 non-null float64
d_foot_care                51 non-null float64
pain_check                 51 non-null float64
pain_treat                 51 non-null float64
heart_f_treat              51 non-null float64
p_sore_action              51 non-null float64
p_sore_intreat             51 non-null float64
p_sore_check               51 non-null float64
move_buff                  51 non-null float64
in_out_bed_buff            51 non-null float64
bathing_buff               51 non-null float64
move_pain_debuff           51 non-null float64
breat

In [10]:
# Final Completely Merged DataFrame
final = pd.merge(df_cleaned,df, on='state')
final.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 51 entries, 0 to 50
Data columns (total 37 columns):
Unnamed: 0                 51 non-null int64
state                      51 non-null object
hospital_count             51 non-null int64
readmission_ratio          51 non-null float64
discharges                 51 non-null float64
predicted_rate             51 non-null float64
expected_rate              51 non-null float64
readmissions               51 non-null float64
excessive_count            51 non-null int64
star_rating                51 non-null float64
timeliness                 51 non-null float64
rx_ed                      51 non-null float64
fall_risk                  51 non-null float64
depression_check           51 non-null float64
flu_shot                   51 non-null float64
pneumonia_shot             51 non-null float64
d_foot_care                51 non-null float64
pain_check                 51 non-null float64
pain_treat                 51 non-null float64
heart_f_tre
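
An inner merge discards unmatched keys without reporting them. One way to see which states would be lost, sketched on made-up toy frames rather than `df_cleaned` and `df`, is an outer merge with `indicator=True`.

```python
import pandas as pd

# Toy stand-ins for df_cleaned and df (all values are made up)
left = pd.DataFrame({"state": ["AK", "AL", "GU"], "hospital_count": [8, 85, 1]})
right = pd.DataFrame({"state": ["AK", "AL"], "star_rating": [2.5, 3.5]})

# Outer join keeps every key; indicator=True adds a "_merge" column
# whose values are "both", "left_only", or "right_only"
audit = pd.merge(left, right, on="state", how="outer", indicator=True)

# Keys that an inner join would silently discard
dropped = audit.loc[audit["_merge"] != "both", "state"].tolist()
print(dropped)
```

An empty `dropped` list would confirm that both tables cover the same set of states.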

In [11]:
# Removing Extra Unnamed: 0 Index Column
del final['Unnamed: 0']

In [12]:
# Save and Print Final DataFrame Heading
final.to_csv('hhc_readmit_final.csv')
final.head()

Unnamed: 0,state,hospital_count,readmission_ratio,discharges,predicted_rate,expected_rate,readmissions,excessive_count,star_rating,timeliness,...,breathing_buff,healing_buff,oral_rx_buff,hospital_admit,urgent_noadmit,pos_prof_report,pos_communicate_report,pos_treatment_ed_report,nine_to_ten_rate,would_recommend
0,AK,8,0.969563,5019.0,530.2,548.7,606.0,11,2.5,76.7,...,58.8,80.4,45.8,14.2,17.3,89,85,81,82,81
1,AL,85,1.017475,95303.0,5351.5,5308.2,15305.0,188,3.5,94.3,...,74.5,91.3,62.5,17.5,12.2,91,89,87,89,84
2,AR,45,1.032275,61703.0,2973.3,2879.7,9965.0,127,3.0,94.3,...,71.7,91.8,61.1,17.4,13.4,90,87,86,87,83
3,AZ,63,0.988116,76353.0,3930.2,3990.3,10290.0,104,3.5,93.4,...,71.3,85.8,55.2,14.9,13.5,87,84,80,81,75
4,CA,297,1.000689,303151.0,19823.2,19733.7,49252.0,580,3.5,90.3,...,71.9,91.9,55.5,14.8,11.9,85,83,82,80,74
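
The `Unnamed: 0` column removed earlier is the kind of column `read_csv` creates when a file was saved with its row index. By default `to_csv` writes the index, so `hhc_readmit_final.csv` will likely show the same extra column when it is read back. A minimal sketch of avoiding that, using an in-memory buffer and a toy frame instead of the real file:

```python
import io
import pandas as pd

# Toy frame standing in for the final DataFrame (values are made up)
toy = pd.DataFrame({"state": ["AK", "AL"], "star_rating": [2.5, 3.5]})

# index=False leaves the row index out of the file
buffer = io.StringIO()
toy.to_csv(buffer, index=False)

# Reading the text back yields the original columns and no "Unnamed: 0"
buffer.seek(0)
round_trip = pd.read_csv(buffer)
print(list(round_trip.columns))
```

Applied here, that would mean `final.to_csv('hhc_readmit_final.csv', index=False)`, assuming no downstream step relies on the saved index.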
