# **Adult Obesity and Physical Inactivity in Delaware**





# **Procedural Overview**
- After following the directions, you will have created a subset of the County Public Health Data.
- The instructions outline how to obtain a dataset, create and merge new subsets, and export the new subset for further analysis.
- The final subset will let users analyze a key public health relationship in Delaware: the completed CSV file compares adult obesity to physical inactivity in Delaware.


# **Getting Started**
The first set of instructions will aid you in obtaining the original dataset so you can begin to code and manipulate the data.
1. Create a folder in Google Drive.
  - Name the folder something that you will recognize later.
2. Click this [link](https://drive.google.com/file/d/1mn2YSn3mK3lf6-cAkoyHzPPXe1b9CIiZ/view?usp=drive_link) to access the County Public Health Data set.
  - Download the data set to your computer.
3. Upload the County Public Health Data set to the folder you just created.




# **System Requirements**
We will use Python, a computer programming language, to write code. The County Public Health Data now needs to be loaded into Python for further use. To do so, follow the directions below.
1. Create a new notebook in Google Colab and name it something representative of your dataset.
2. To mount your Google Drive in the Google Colab notebook, run the following code.


In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


3. Python has many packages that add extra functionality. We will be using the "pandas" package, which works with tables of data. To import pandas (as pd) along with NumPy (as np), run the following code.

In [None]:
import numpy as np
import pandas as pd

4. Next, we will create our dataframe object.
  - pd.read_csv reads the CSV file into a dataframe
  - CHDATA represents the County Public Health Data; however, you can name the dataframe whatever you want
  - Substitute "English 105" with the name of the Google Drive folder that holds the CSV file, and make sure the filename matches the file you uploaded
  - Run the following code

In [None]:
CHDATA=pd.read_csv('gdrive/MyDrive/English 105/CountyHealthData_2014-2015 (5).csv')
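
Once the file is loaded, a quick look at its size and column names helps confirm that pandas read it as expected. The sketch below is only an illustration: it reads a three-row stand-in (values copied from the Delaware output shown later in this guide) from text instead of from Drive, and `sample` is a made-up name. On your own data, you would call the same attributes on CHDATA.

```python
import io
import pandas as pd

# Stand-in for the real file: a few columns and rows copied from the
# Delaware output shown later in this guide.
sample_csv = io.StringIO(
    "State,County,Year,Adult obesity,Physical inactivity\n"
    "DE,Kent County,1/1/2014,0.338,0.276\n"
    "DE,Kent County,1/1/2015,0.327,0.274\n"
    "DE,New Castle County,1/1/2014,0.267,0.227\n"
)
sample = pd.read_csv(sample_csv)

print(sample.shape)             # (number of rows, number of columns)
print(sample.columns.tolist())  # the column names pandas found
print(sample.head())            # the first few rows
```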

# **Creating the Subset**
First, two new subsets must be created that represent Delaware's adult obesity and physical inactivity.
1. The subsets focus on the state of Delaware; therefore, the first step is to write code that selects only Delaware's rows.
2. To isolate Delaware, name the column to filter on, in this case "State", and the value to keep, "DE" for Delaware.
3. Run the code below.

In [None]:
CHDATA[CHDATA["State"] == "DE"]

Unnamed: 0,State,Region,Division,County,FIPS,GEOID,SMS Region,Year,Premature death,Poor or fair health,...,Drug poisoning deaths,Uninsured adults,Uninsured children,Health care costs,Could not see doctor due to cost,Other primary care providers,Median household income,Children eligible for free lunch,Homicide rate,Inadequate social support
610,DE,South,South Atlantic,Kent County,10001,10001,Region 23,1/1/2014,8338.0,0.147,...,9.55,0.131,0.044,9261.0,0.107,53.0,51695,0.382,3.93,0.187
611,DE,South,South Atlantic,Kent County,10001,10001,Region 23,1/1/2015,7886.0,0.147,...,13.97,0.125,0.033,9398.0,0.107,57.0,53811,0.441,4.3,
612,DE,South,South Atlantic,New Castle County,10003,10003,Region 23,1/1/2014,7270.0,0.11,...,12.13,0.127,0.043,9272.0,0.095,94.0,62086,0.407,7.49,0.166
613,DE,South,South Atlantic,New Castle County,10003,10003,Region 23,1/1/2015,7335.0,0.11,...,14.03,0.122,0.033,9081.0,0.095,105.0,63033,0.411,7.7,
614,DE,South,South Atlantic,Sussex County,10005,10005,Region 23,1/1/2014,7689.0,0.146,...,11.74,0.169,0.06,9472.0,0.108,52.0,48684,0.493,2.99,0.173
615,DE,South,South Atlantic,Sussex County,10005,10005,Region 23,1/1/2015,7150.0,0.146,...,14.59,0.154,0.038,9476.0,0.108,57.0,50207,0.475,3.3,
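
The filter above works in two stages, which can be easier to follow when written out separately. This is a minimal sketch on a tiny stand-in table; the "XX" row is a placeholder rather than real data, and `sample` and `is_delaware` are illustrative names.

```python
import pandas as pd

# Small stand-in table; "XX" is a placeholder state code, not real data.
sample = pd.DataFrame({
    "State": ["DE", "XX", "DE"],
    "County": ["Kent County", "Placeholder County", "Sussex County"],
})

# Stage 1: the comparison produces one True/False value per row.
is_delaware = sample["State"] == "DE"
print(is_delaware.tolist())

# Stage 2: indexing with that mask keeps only the rows marked True.
print(sample[is_delaware])
```

Note that the comparison is case-sensitive and the column name must match exactly, so "State" and "DE" need to be typed as they appear in the data.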


4. In order to use the new subset, save the Delaware subset under a new name.
- This is also where we save the data using .copy() to avoid future run-ins with the SettingWithCopyWarning, which pandas can raise when you modify a subset that is still tied to the original dataframe.

In [None]:
DE_subset = CHDATA[CHDATA["State"] == "DE"].copy()
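
To see what .copy() buys us, the sketch below changes a copied subset and then checks that the original table is untouched. It uses a stand-in table with values from this guide; the percentage column is a hypothetical change made only for illustration, and `sample` and `de_copy` are made-up names.

```python
import pandas as pd

# Stand-in for CHDATA with values taken from the Delaware rows in this guide.
sample = pd.DataFrame({
    "State": ["DE", "DE", "DE"],
    "Adult obesity": [0.338, 0.327, 0.267],
})

# .copy() gives the subset its own independent data.
de_copy = sample[sample["State"] == "DE"].copy()

# Hypothetical change: add the rate expressed as a percentage.
de_copy["Adult obesity (%)"] = de_copy["Adult obesity"] * 100

print(de_copy.columns.tolist())  # the copy has the new column
print(sample.columns.tolist())   # the original is unchanged
```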

5. Using the new subset, we now create a table isolating one variable related to Delaware.
- .iloc allows pandas to select certain rows and columns by position (positions are counted from 0). In this case, we want to isolate column 14, which represents adult obesity.
- The colon before the comma tells pandas to include all rows.

In [None]:
DE_subset.iloc[:,14]

Unnamed: 0,Adult obesity
610,0.338
611,0.327
612,0.267
613,0.259
614,0.299
615,0.307


6. We run the same code for column 16, which represents physical inactivity.

In [None]:
DE_subset.iloc[:,16]

Unnamed: 0,Physical inactivity
610,0.276
611,0.274
612,0.227
613,0.216
614,0.262
615,0.261
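
Column positions such as 14 and 16 depend on the exact layout of the file, so it can be safer to select columns by name. The sketch below is one possible alternative, not a required step; it uses a stand-in table built from the Delaware values above, and `de_sample` is an illustrative name.

```python
import pandas as pd

# Stand-in for DE_subset, built from the values shown in this guide.
de_sample = pd.DataFrame({
    "County": ["Kent County", "Kent County", "New Castle County",
               "New Castle County", "Sussex County", "Sussex County"],
    "Adult obesity": [0.338, 0.327, 0.267, 0.259, 0.299, 0.307],
    "Physical inactivity": [0.276, 0.274, 0.227, 0.216, 0.262, 0.261],
})

# Look up the position of a column, which is the number .iloc expects.
position = de_sample.columns.get_loc("Adult obesity")

# Selecting by position and selecting by name return the same values.
by_position = de_sample.iloc[:, position]
by_name = de_sample["Adult obesity"]
print(by_position.equals(by_name))

# A list of names selects several columns at once.
both = de_sample[["Adult obesity", "Physical inactivity"]]
print(both)
```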


# **Merging Data**
1. Now, it is time to combine the two data subsets we have created.
2. We will create two new dataframes to merge, series0 and series1, each holding the State column and one of the two variables.
3. Display both series0 and series1 to see what the new tables look like.


In [None]:
series0 = pd.DataFrame({"State" : ["DE","DE","DE","DE","DE","DE"],
                   "Adult obesity" : [0.338,0.327,0.267,0.259,0.299,0.307]}).copy()

In [None]:
series1 = pd.DataFrame({"State" : ["DE","DE","DE","DE","DE","DE"],
                   "Physical inactivity" : [0.276,0.274,0.227,0.216,0.262,0.261]}).copy()

In [None]:
series1

Unnamed: 0,State,Physical inactivity
0,DE,0.276
1,DE,0.274
2,DE,0.227
3,DE,0.216
4,DE,0.262
5,DE,0.261


In [None]:
series0

Unnamed: 0,State,Adult obesity
0,DE,0.338
1,DE,0.327
2,DE,0.267
3,DE,0.259
4,DE,0.299
5,DE,0.307


4. To merge, we are going to use pd.concat to combine series0 and series1.
- By default, pd.concat stacks the second table underneath the first, so each row has a value for only one of the two variables and the other is left blank (NaN).
- Don't forget to copy.

In [None]:
merged_DE = pd.concat([series0,series1]).copy()

In [None]:
merged_DE

Unnamed: 0,State,Adult obesity,Physical inactivity
0,DE,0.338,
1,DE,0.327,
2,DE,0.267,
3,DE,0.259,
4,DE,0.299,
5,DE,0.307,
0,DE,,0.276
1,DE,,0.274
2,DE,,0.227
3,DE,,0.216
4,DE,,0.262
5,DE,,0.261
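
Typing the values by hand works for six rows, but the same two tables can also be pulled straight from the Delaware subset, which avoids copying errors. The sketch below assumes a stand-in for DE_subset with the index values 610 to 615 seen earlier; `de_sample`, `obesity`, and `inactivity` are illustrative names.

```python
import pandas as pd

# Stand-in for DE_subset, keeping the original row labels 610-615.
de_sample = pd.DataFrame(
    {
        "State": ["DE"] * 6,
        "Adult obesity": [0.338, 0.327, 0.267, 0.259, 0.299, 0.307],
        "Physical inactivity": [0.276, 0.274, 0.227, 0.216, 0.262, 0.261],
    },
    index=[610, 611, 612, 613, 614, 615],
)

# Pick the columns by name; reset_index(drop=True) renumbers the rows
# from 0 so the result matches the hand-typed tables.
obesity = de_sample[["State", "Adult obesity"]].reset_index(drop=True)
inactivity = de_sample[["State", "Physical inactivity"]].reset_index(drop=True)

print(obesity)
print(inactivity)
```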
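
If you would rather have both variables on the same row, so that each row can be compared directly, one option is to pass axis=1 to pd.concat. This is a sketch of that alternative rather than the method used in this guide; the two tables are redefined here so the example runs on its own, and `side_by_side` is an illustrative name.

```python
import pandas as pd

series0 = pd.DataFrame({"State": ["DE"] * 6,
                        "Adult obesity": [0.338, 0.327, 0.267, 0.259, 0.299, 0.307]})
series1 = pd.DataFrame({"State": ["DE"] * 6,
                        "Physical inactivity": [0.276, 0.274, 0.227, 0.216, 0.262, 0.261]})

# axis=1 places the tables next to each other, matching rows by index.
# Only the value column of series1 is used so "State" is not duplicated.
side_by_side = pd.concat([series0, series1["Physical inactivity"]], axis=1)

print(side_by_side)
print(side_by_side.isna().sum().sum())  # number of blank (NaN) cells
```

This only lines up correctly because both tables list the rows in the same order with the same index values.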


# **Exporting Subset**
To better use and analyze the new data subset, we will create a CSV file.
1. We will use .to_csv() to export the data.
2. The parentheses include the filename and index=False, which excludes the column of row indices from the exported file. The file is written to the notebook's current working directory.

In [None]:
merged_DE.to_csv("merged_DE.csv", index=False)
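
A quick way to check the export is to read the file back in and look at its columns. The sketch below is illustrative only: it writes a small stand-in table to a temporary folder instead of your working directory, and `table`, `out_path`, and `check` are made-up names.

```python
import os
import tempfile
import pandas as pd

# Stand-in for the table being exported, using values from this guide.
table = pd.DataFrame({
    "State": ["DE"] * 6,
    "Adult obesity": [0.338, 0.327, 0.267, 0.259, 0.299, 0.307],
})

# Write to a temporary folder so the example leaves no files behind.
out_path = os.path.join(tempfile.mkdtemp(), "merged_DE.csv")
table.to_csv(out_path, index=False)

# Read the file back and inspect it.
check = pd.read_csv(out_path)
print(check.columns.tolist())  # no extra index column when index=False
print(len(check))              # number of rows written
```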

Congrats! You have now created a subset of the County Public Health Data!