# Obesity Among Children and Adolescents 2-19 Analysis
This notebook analyzes obesity among children and adolescents, using age, sex, year, race, and other factors to determine which characteristics make individuals aged 2-19 more likely to be obese.
## Project Introduction
With obesity at an all-time high in the United States, it is important to understand which individuals are most at risk. Obesity is a pressing public health concern with far-reaching implications for everyone. This analysis focuses on obesity among children and adolescents aged 2-19 years, exploring data on selected population characteristics. The data was made available by the Centers for Disease Control and Prevention (CDC) and covers 1988-2018. It reports percent of poverty level, race, Hispanic origin, age, and sex for each age range in a given survey period. One question we are investigating is whether income is the leading factor in obesity. Although children and many adolescents do not have income, we can use the poverty level to ask whether poverty is the leading factor in obesity among individuals aged 2-19. Other questions that can be answered are whether the trend in obesity is the same across all races, and whether younger age ranges are more affected by obesity.
## Any Changes
As of now, no changes have been made to the scope of the project since the check-in. The current scope of the project is 
## Data Cleaning - Obesity Among Children and Adolescents 2-19 Analysis


In [154]:
import pandas as pd
import plotly.express as px
from obesity_children import final_clean, remove_single_unique, remove_num_columns

### Before Cleaning Data Set
The first five rows of the original dataset are printed below.

In [147]:
obesity_children = pd.read_csv("Obesity_among_children_and_adolescents_aged_2_19_years__by_selected_characteristics__United_States.csv")
obesity_children.head()


Unnamed: 0,INDICATOR,PANEL,PANEL_NUM,UNIT,UNIT_NUM,STUB_NAME,STUB_NAME_NUM,STUB_LABEL_NUM,STUB_LABEL,YEAR,YEAR_NUM,AGE,AGE_NUM,ESTIMATE,SE,FLAG
0,Obesity among children and adolescents aged 2-...,2-19 years,1,"Percent of population, crude",1,Total,0,0.0,2-19 years,1988-1994,1,2-19 years,0.0,10.0,0.5,
1,Obesity among children and adolescents aged 2-...,2-19 years,1,"Percent of population, crude",1,Total,0,0.0,2-19 years,1999-2002,2,2-19 years,0.0,14.8,0.7,
2,Obesity among children and adolescents aged 2-...,2-19 years,1,"Percent of population, crude",1,Total,0,0.0,2-19 years,2001-2004,3,2-19 years,0.0,16.3,0.8,
3,Obesity among children and adolescents aged 2-...,2-19 years,1,"Percent of population, crude",1,Total,0,0.0,2-19 years,2003-2006,4,2-19 years,0.0,16.3,0.9,
4,Obesity among children and adolescents aged 2-...,2-19 years,1,"Percent of population, crude",1,Total,0,0.0,2-19 years,2005-2008,5,2-19 years,0.0,16.2,0.9,


In [136]:
obesity_single_unique = remove_single_unique(obesity_children)
obesity_clean = remove_num_columns(obesity_single_unique)
sex_race_his_obesity, race_his_obesity, total_obesity, poverty_obesity, sex_obesity, age_obesity = final_clean(obesity_clean)

### After Cleaning Data Set
The functions above were called to clean the data. Below are the cleaned datasets split into categories based on population characteristics.
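
The `obesity_children` module is not shown in this notebook, so the following is only a minimal sketch of what cleaning steps like these could look like, inferred from the function names and from the columns that survive in the outputs below. The helper names `drop_constant_columns`, `drop_num_columns`, and `split_by_stub_name` are hypothetical stand-ins, not the module's actual functions, and the sample rows reuse a few values printed elsewhere in this notebook.

```python
import pandas as pd


def drop_constant_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns that hold a single distinct value (NaN counts as a value)."""
    keep = [col for col in df.columns if df[col].nunique(dropna=False) > 1]
    return df[keep]


def drop_num_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Drop the numeric code columns whose names end in '_NUM'."""
    return df.drop(columns=[col for col in df.columns if col.endswith("_NUM")])


def split_by_stub_name(df: pd.DataFrame) -> dict:
    """Return one DataFrame per STUB_NAME category, keyed by category name."""
    return {name: group.copy() for name, group in df.groupby("STUB_NAME")}


# Small illustrative sample; values are copied from the outputs in this notebook.
sample = pd.DataFrame({
    "UNIT": ["Percent of population, crude"] * 4,
    "STUB_NAME": ["Total", "Total", "Sex", "Sex"],
    "STUB_LABEL": ["2-19 years", "2-19 years", "Male", "Male"],
    "YEAR": ["1988-1994", "1999-2002", "1988-1994", "1999-2002"],
    "YEAR_NUM": [1, 2, 1, 2],
    "ESTIMATE": [10.0, 14.8, 10.2, 15.5],
    "SE": [0.5, 0.7, 0.7, 0.8],
})

cleaned = drop_num_columns(drop_constant_columns(sample))
groups = split_by_stub_name(cleaned)

print(list(cleaned.columns))
print(sorted(groups))
```

In this sketch, `UNIT` is removed because it never varies and `YEAR_NUM` is removed because it only duplicates `YEAR` as a numeric code, which matches the pattern of columns missing from the cleaned outputs.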

In [137]:
print(sex_race_his_obesity.head())

         PANEL                         STUB_NAME  \
80  2-19 years  Sex and race and Hispanic origin   
81  2-19 years  Sex and race and Hispanic origin   
82  2-19 years  Sex and race and Hispanic origin   
83  2-19 years  Sex and race and Hispanic origin   
84  2-19 years  Sex and race and Hispanic origin   

                                           STUB_LABEL       YEAR         AGE  \
80  Male: Not Hispanic or Latino: Black or African...  1988-1994  2-19 years   
81  Male: Not Hispanic or Latino: Black or African...  1999-2002  2-19 years   
82  Male: Not Hispanic or Latino: Black or African...  2001-2004  2-19 years   
83  Male: Not Hispanic or Latino: Black or African...  2003-2006  2-19 years   
84  Male: Not Hispanic or Latino: Black or African...  2005-2008  2-19 years   

    ESTIMATE   SE  
80      10.6  0.8  
81      16.0  0.9  
82      16.0  1.0  
83      17.4  1.0  
84      17.8  1.3  


In [138]:
print(race_his_obesity.head())

         PANEL                 STUB_NAME                          STUB_LABEL  \
30  2-19 years  Race and Hispanic origin  Not Hispanic or Latino: White only   
31  2-19 years  Race and Hispanic origin  Not Hispanic or Latino: White only   
32  2-19 years  Race and Hispanic origin  Not Hispanic or Latino: White only   
33  2-19 years  Race and Hispanic origin  Not Hispanic or Latino: White only   
34  2-19 years  Race and Hispanic origin  Not Hispanic or Latino: White only   

         YEAR         AGE  ESTIMATE   SE  
30  1988-1994  2-19 years       9.2  0.7  
31  1999-2002  2-19 years      12.6  1.0  
32  2001-2004  2-19 years      15.1  1.2  
33  2003-2006  2-19 years      14.7  1.3  
34  2005-2008  2-19 years      14.1  1.4  


In [149]:
print(poverty_obesity.head())


          PANEL                 STUB_NAME  STUB_LABEL       YEAR         AGE  \
171  2-19 years  Percent of poverty level  Below 100%  1988-1994  2-19 years   
172  2-19 years  Percent of poverty level  Below 100%  1999-2002  2-19 years   
173  2-19 years  Percent of poverty level  Below 100%  2001-2004  2-19 years   
174  2-19 years  Percent of poverty level  Below 100%  2003-2006  2-19 years   
175  2-19 years  Percent of poverty level  Below 100%  2005-2008  2-19 years   

     ESTIMATE   SE  
171      12.6  1.2  
172      17.6  1.1  
173      17.9  1.3  
174      18.9  1.4  
175      19.9  1.4  


160

In [168]:
print(total_obesity)
print("\n\n----------------------------------------------\n\n")
total_obesity.info()
print(len(total_obesity))
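def convert_time(time):
    """Expand a 'start-end' label such as '1988-1994' into yearly timestamps."""
    # Split the label into its start and end years.
    time_range = time.split('-')
    # 'YS' (year start) yields one timestamp per year, on January 1.
    return pd.date_range(start=time_range[0], end=time_range[1], freq='YS')


# Display the converted values; the result is not stored back in total_obesity.
total_obesity["YEAR"].apply(convert_time)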

        PANEL STUB_NAME  STUB_LABEL       YEAR         AGE  ESTIMATE   SE
0  2-19 years     Total  2-19 years  1988-1994  2-19 years      10.0  0.5
1  2-19 years     Total  2-19 years  1999-2002  2-19 years      14.8  0.7
2  2-19 years     Total  2-19 years  2001-2004  2-19 years      16.3  0.8
3  2-19 years     Total  2-19 years  2003-2006  2-19 years      16.3  0.9
4  2-19 years     Total  2-19 years  2005-2008  2-19 years      16.2  0.9
5  2-19 years     Total  2-19 years  2007-2010  2-19 years      16.8  0.7
6  2-19 years     Total  2-19 years  2009-2012  2-19 years      16.9  0.6
7  2-19 years     Total  2-19 years  2011-2014  2-19 years      17.0  0.7
8  2-19 years     Total  2-19 years  2013-2016  2-19 years      17.8  0.8
9  2-19 years     Total  2-19 years  2015-2018  2-19 years      18.9  0.8


----------------------------------------------


<class 'pandas.core.frame.DataFrame'>
Index: 10 entries, 0 to 9
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype 

0    DatetimeIndex(['1988-01-01', '1989-01-01', '19...
1    DatetimeIndex(['1999-01-01', '2000-01-01', '20...
2    DatetimeIndex(['2001-01-01', '2002-01-01', '20...
3    DatetimeIndex(['2003-01-01', '2004-01-01', '20...
4    DatetimeIndex(['2005-01-01', '2006-01-01', '20...
5    DatetimeIndex(['2007-01-01', '2008-01-01', '20...
6    DatetimeIndex(['2009-01-01', '2010-01-01', '20...
7    DatetimeIndex(['2011-01-01', '2012-01-01', '20...
8    DatetimeIndex(['2013-01-01', '2014-01-01', '20...
9    DatetimeIndex(['2015-01-01', '2016-01-01', '20...
Name: YEAR, dtype: object
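
The conversion above produces one `DatetimeIndex` per row, which can be awkward to plot or sort on. One possible alternative, offered only as a sketch, is to reduce each survey period to its start year, end year, and midpoint. The function name `year_range_bounds` is an assumption introduced here for illustration and is not part of the project code.

```python
def year_range_bounds(label: str) -> tuple:
    """Turn a 'start-end' label such as '1988-1994' into (start, end, midpoint)."""
    start, end = (int(part) for part in label.split("-"))
    return start, end, (start + end) / 2


# Survey periods taken from the YEAR column printed above.
periods = ["1988-1994", "1999-2002", "2015-2018"]
for period in periods:
    print(period, year_range_bounds(period))
```

A single numeric midpoint per period gives a plain x-axis value for a trend line, at the cost of hiding that the periods overlap and differ in length.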

In [148]:
print(sex_obesity.head())


         PANEL STUB_NAME STUB_LABEL       YEAR         AGE  ESTIMATE   SE
10  2-19 years       Sex       Male  1988-1994  2-19 years      10.2  0.7
11  2-19 years       Sex       Male  1999-2002  2-19 years      15.5  0.8
12  2-19 years       Sex       Male  2001-2004  2-19 years      17.3  0.9
13  2-19 years       Sex       Male  2003-2006  2-19 years      17.0  1.1
14  2-19 years       Sex       Male  2005-2008  2-19 years      16.8  1.0


80

In [142]:
print(age_obesity.head())

         PANEL STUB_NAME STUB_LABEL       YEAR        AGE  ESTIMATE   SE
230  2-5 years       Age  2-5 years  1988-1994  2-5 years       7.2  0.7
231  2-5 years       Age  2-5 years  1999-2002  2-5 years      10.3  1.2
232  2-5 years       Age  2-5 years  2001-2004  2-5 years      12.4  1.2
233  2-5 years       Age  2-5 years  2003-2006  2-5 years      12.5  1.0
234  2-5 years       Age  2-5 years  2005-2008  2-5 years      10.5  0.9
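
The introduction asks whether the obesity trend differs between groups. As a rough sketch of how that comparison could be set up, the example below pivots a handful of rows copied from the `head()` outputs above and computes the change in prevalence between two survey periods. It uses only one label per characteristic, so it illustrates the technique rather than answering the question, and the `CHANGE` column name is an assumption introduced here.

```python
import pandas as pd

# Rows copied from the printed outputs above (1988-1994 and 2005-2008 only).
rows = [
    ("Total", "2-19 years", "1988-1994", 10.0),
    ("Total", "2-19 years", "2005-2008", 16.2),
    ("Sex", "Male", "1988-1994", 10.2),
    ("Sex", "Male", "2005-2008", 16.8),
    ("Percent of poverty level", "Below 100%", "1988-1994", 12.6),
    ("Percent of poverty level", "Below 100%", "2005-2008", 19.9),
    ("Age", "2-5 years", "1988-1994", 7.2),
    ("Age", "2-5 years", "2005-2008", 10.5),
]
combined = pd.DataFrame(rows, columns=["STUB_NAME", "STUB_LABEL", "YEAR", "ESTIMATE"])

# One row per group, one column per survey period.
wide = combined.pivot(index="STUB_LABEL", columns="YEAR", values="ESTIMATE")

# Change in obesity prevalence between the two periods, in percentage points.
wide["CHANGE"] = (wide["2005-2008"] - wide["1988-1994"]).round(1)

print(wide.sort_values("CHANGE", ascending=False))
```

Applied to the full cleaned DataFrames, the same pivot would place every label of a characteristic side by side, although the standard errors in `SE` would also need to be considered before reading small differences as real.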


## Exploratory Data Analysis (EDA) 


## Visualizations

## Reflections 

## Next Steps