## Data Cleaning (Continued)

The goals:
1. To calculate the **estimated number** of survey participants per location
2. To calculate the **per capita ratio** of participants per region

#### How to:

To get the count for an indicator, multiply its percentage, expressed as a proportion, by the total number of older adults in the location. In the cells below, the `Population` column serves as that total.

For example, if the percentage is \(9.4\%\) and the population is approximately \(50\) million, the count would be \(50,000,000 \times 0.094 = 4,700,000\).
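The arithmetic above can be sketched in plain Python. The function name `estimated_count` is a hypothetical helper introduced only for illustration; it is not part of the notebook and simply restates "percentage times population".

```python
def estimated_count(percentage: float, population: int) -> float:
    """Estimate a count from a percentage (e.g. 9.4 for 9.4%) and a population.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    # Convert the percentage to a proportion, then scale by the population.
    return (percentage / 100) * population


# Worked example from the text: 9.4% of roughly 50 million people.
count = estimated_count(9.4, 50_000_000)
print(f"{count:,.0f}")  # 4,700,000
```

The pandas cells that follow apply the same two steps column-wise: divide by 100, then multiply by the population.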

In [1]:
import pandas as pd

In [2]:
alzh_df = pd.read_excel('Alzheimers_Healthy_Aging_Clean_dataset.xlsx', sheet_name='All')

In [3]:
alzh_df.drop(columns='Column1', inplace=True)  # drop the unneeded 'Column1' column

In [4]:
alzh_df['Geographical Area'].unique()

array(['United States, DC & Territories', 'Northeast', 'South', 'West',
       'Midwest'], dtype=object)

In [5]:
alzh_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30489 entries, 0 to 30488
Data columns (total 24 columns):
 #   Column                      Non-Null Count  Dtype         
---  ------                      --------------  -----         
 0   YearStart                   30489 non-null  datetime64[ns]
 1   RowId                       30489 non-null  object        
 2   YearEnd                     30489 non-null  int64         
 3   LocationAbbr                30489 non-null  object        
 4   Geographical Area           30489 non-null  object        
 5   Population                  30489 non-null  int64         
 6   Class                       30489 non-null  object        
 7   Topic                       30489 non-null  object        
 8   Question                    30489 non-null  object        
 9   Data_Value_Type             30489 non-null  object        
 10  Data_Value_Interpolated     30489 non-null  float64       
 11  Data_Value                  30066 non-null  float64   

In [6]:
alzh_df

Unnamed: 0,YearStart,RowId,YearEnd,LocationAbbr,Geographical Area,Population,Class,Topic,Question,Data_Value_Type,...,Data_Value_Footnote,Low_Confidence_Limit,High_Confidence_Limit,StratificationCategory1,Stratification1,StratificationCategory2,Stratification2,StratificationCategoryID2,StratificationID2,Type
0,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Fewer than 50 States reporting,40.2,79.6,Age Group,65 years or older,Race/Ethnicity,Hispanic,RACE,HIS,All
1,2018-01-01,BRFSS~2018~2018~59~Q39~TGC04~AGE~RACE,2018,US,"United States, DC & Territories",330135836,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Fewer than 50 States reporting,24.5,54.7,Age Group,50-64 years,Race/Ethnicity,Hispanic,RACE,HIS,All
2,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Fewer than 50 States reporting,37.7,74.5,Age Group,Overall,Race/Ethnicity,Native Am/Alaskan Native,RACE,NAA,All
3,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Fewer than 50 States reporting,33.5,46.7,Age Group,50-64 years,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,All
4,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Fewer than 50 States reporting,23.0,47.1,Age Group,Overall,Race/Ethnicity,Asian/Pacific Islander,RACE,ASN,All
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
30484,2019-01-01,BRFSS~2019~2019~9002~Q44~TOC12~AGE~GENDER,2019,MDW,Midwest,68329004,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,,17.9,21.0,Age Group,65 years or older,Gender,Male,GENDER,MALE,Region
30485,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~OVERALL,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,,20.7,22.7,Age Group,65 years or older,,,OVERALL,OVERALL,Region
30486,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~RACE,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,Regional estimates may not represent all state...,42.6,50.3,Age Group,Overall,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,Region
30487,2015-01-01,BRFSS~2015~2015~9002~Q44~TOC12~AGE~GENDER,2015,MDW,Midwest,67860583,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,,24.6,27.4,Age Group,65 years or older,Gender,Female,GENDER,FEMALE,Region


In [7]:
alzh_df.isnull().sum()

YearStart                         0
RowId                             0
YearEnd                           0
LocationAbbr                      0
Geographical Area                 0
Population                        0
Class                             0
Topic                             0
Question                          0
Data_Value_Type                   0
Data_Value_Interpolated           0
Data_Value                      423
Data_Value_Alt                  423
Data_Value_Footnote_Symbol    12358
Data_Value_Footnote           12358
Low_Confidence_Limit            634
High_Confidence_Limit           634
StratificationCategory1           0
Stratification1                   0
StratificationCategory2        3870
Stratification2                3870
StratificationCategoryID2         0
StratificationID2                 0
Type                              0
dtype: int64

### 1. To calculate the **estimated number** of survey participants per location

In [9]:
alzh_df.Data_Value_Interpolated.unique()  # check the distinct Data_Value_Interpolated values

array([ 61.9 ,  38.5 ,  57.1 ,  39.9 ,  34.  ,  23.4 ,  28.4 ,  31.8 ,
        37.2 ,  60.3 ,  32.3 ,  34.4 ,  41.  ,  15.9 ,  32.8 ,  47.3 ,
        31.7 ,  36.2 ,  39.3 ,  32.6 ,  62.4 ,  32.  ,  41.5 ,  25.6 ,
        33.2 ,  35.3 ,  28.6 ,  27.1 ,  31.9 ,  33.1 ,  38.7 ,  34.8 ,
        37.8 ,  34.6 ,  29.7 ,  35.9 ,  29.9 ,  22.4 ,  31.3 ,  33.3 ,
        30.7 ,  38.6 ,  35.7 ,  30.3 ,  34.1 ,  25.5 ,  44.6 ,  35.4 ,
        43.1 ,  29.3 ,  39.4 ,  34.2 ,  47.2 ,  37.5 ,  51.2 ,  32.5 ,
        27.7 ,  66.5 ,  30.5 ,  47.9 ,  69.  ,  33.4 ,  39.8 ,  32.4 ,
        45.6 ,  31.6 ,  46.9 ,  23.2 ,  29.  ,  39.1 ,  36.3 ,  30.9 ,
        34.5 ,  27.3 ,  35.2 ,  24.4 ,  27.5 ,  45.5 ,  36.5 ,  27.2 ,
        45.3 ,  28.7 ,  48.1 ,  38.  ,  49.  ,  52.  ,  70.5 ,  44.8 ,
        34.3 ,  57.  ,  35.1 ,  88.5 ,  30.1 ,  31.2 ,  55.2 ,  53.9 ,
        38.4 ,  33.  ,  36.8 ,  40.2 ,  28.8 ,  45.2 ,  29.8 ,  38.2 ,
        26.9 ,  52.7 ,  43.5 ,  30.2 ,  29.6 ,  33.8 ,  40.6 ,  36.1 ,
      

In [10]:
alzh_df.Data_Value_Interpolated / 100  # preview the percentage-to-proportion conversion before adding it to the DataFrame

0        0.619
1        0.385
2        0.571
3        0.399
4        0.340
         ...  
30484    0.194
30485    0.217
30486    0.465
30487    0.259
30488    0.230
Name: Data_Value_Interpolated, Length: 30489, dtype: float64

In [11]:
alzh_df['Data_Value_Interpolated_over100'] = alzh_df['Data_Value_Interpolated'] / 100  # percentage as a proportion

In [12]:
alzh_df[['Data_Value_Interpolated','Data_Value_Interpolated_over100']]

Unnamed: 0,Data_Value_Interpolated,Data_Value_Interpolated_over100
0,61.9,0.619
1,38.5,0.385
2,57.1,0.571
3,39.9,0.399
4,34.0,0.340
...,...,...
30484,19.4,0.194
30485,21.7,0.217
30486,46.5,0.465
30487,25.9,0.259


In [13]:
# ParCount_df['Geographical Area'] = alzh_df['Geographical Area'] #adding Geo location column

In [14]:
# ParCount_df['Population'] = alzh_df['Population'] #adding population column

In [15]:
alzh_df #sanity check

Unnamed: 0,YearStart,RowId,YearEnd,LocationAbbr,Geographical Area,Population,Class,Topic,Question,Data_Value_Type,...,Low_Confidence_Limit,High_Confidence_Limit,StratificationCategory1,Stratification1,StratificationCategory2,Stratification2,StratificationCategoryID2,StratificationID2,Type,Data_Value_Interpolated_over100
0,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,40.2,79.6,Age Group,65 years or older,Race/Ethnicity,Hispanic,RACE,HIS,All,0.619
1,2018-01-01,BRFSS~2018~2018~59~Q39~TGC04~AGE~RACE,2018,US,"United States, DC & Territories",330135836,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,24.5,54.7,Age Group,50-64 years,Race/Ethnicity,Hispanic,RACE,HIS,All,0.385
2,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,37.7,74.5,Age Group,Overall,Race/Ethnicity,Native Am/Alaskan Native,RACE,NAA,All,0.571
3,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,33.5,46.7,Age Group,50-64 years,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,All,0.399
4,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,23.0,47.1,Age Group,Overall,Race/Ethnicity,Asian/Pacific Islander,RACE,ASN,All,0.340
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
30484,2019-01-01,BRFSS~2019~2019~9002~Q44~TOC12~AGE~GENDER,2019,MDW,Midwest,68329004,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,17.9,21.0,Age Group,65 years or older,Gender,Male,GENDER,MALE,Region,0.194
30485,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~OVERALL,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,20.7,22.7,Age Group,65 years or older,,,OVERALL,OVERALL,Region,0.217
30486,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~RACE,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,42.6,50.3,Age Group,Overall,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,Region,0.465
30487,2015-01-01,BRFSS~2015~2015~9002~Q44~TOC12~AGE~GENDER,2015,MDW,Midwest,67860583,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,24.6,27.4,Age Group,65 years or older,Gender,Female,GENDER,FEMALE,Region,0.259


In [16]:
alzh_df['Est.Count_Per_Location'] = alzh_df['Data_Value_Interpolated_over100'] * alzh_df['Population']  # estimated count = proportion x population

In [17]:
alzh_df['Est.Count_Per_Location'] = alzh_df['Est.Count_Per_Location'].astype('int64')  # cast to whole numbers (the fractional part is truncated)
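Casting a float column with `astype('int64')` drops the fractional part rather than rounding it. The sketch below shows the difference in plain Python for row 0 of the data (61.9%, population 335,109,142); `int()` truncates positive values the same way the cast does. Whether truncation or rounding is the better choice here is left open; calling `Series.round()` before the cast would be one option if rounding is wanted.

```python
proportion = 0.619          # Data_Value_Interpolated_over100 for row 0
population = 335_109_142    # Population for row 0

raw = proportion * population   # roughly 207,432,558.9

truncated = int(raw)    # fractional part dropped, like astype('int64')
rounded = round(raw)    # nearest whole number

print(truncated)  # 207432558, the value shown in Est.Count_Per_Location
print(rounded)    # 207432559
```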

In [18]:
alzh_df

Unnamed: 0,YearStart,RowId,YearEnd,LocationAbbr,Geographical Area,Population,Class,Topic,Question,Data_Value_Type,...,High_Confidence_Limit,StratificationCategory1,Stratification1,StratificationCategory2,Stratification2,StratificationCategoryID2,StratificationID2,Type,Data_Value_Interpolated_over100,Est.Count_Per_Location
0,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,79.6,Age Group,65 years or older,Race/Ethnicity,Hispanic,RACE,HIS,All,0.619,207432558
1,2018-01-01,BRFSS~2018~2018~59~Q39~TGC04~AGE~RACE,2018,US,"United States, DC & Territories",330135836,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,54.7,Age Group,50-64 years,Race/Ethnicity,Hispanic,RACE,HIS,All,0.385,127102296
2,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,74.5,Age Group,Overall,Race/Ethnicity,Native Am/Alaskan Native,RACE,NAA,All,0.571,191347320
3,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,46.7,Age Group,50-64 years,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,All,0.399,134658159
4,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,47.1,Age Group,Overall,Race/Ethnicity,Asian/Pacific Islander,RACE,ASN,All,0.340,114746301
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
30484,2019-01-01,BRFSS~2019~2019~9002~Q44~TOC12~AGE~GENDER,2019,MDW,Midwest,68329004,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,21.0,Age Group,65 years or older,Gender,Male,GENDER,MALE,Region,0.194,13255826
30485,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~OVERALL,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,22.7,Age Group,65 years or older,,,OVERALL,OVERALL,Region,0.217,14783511
30486,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~RACE,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,50.3,Age Group,Overall,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,Region,0.465,31678953
30487,2015-01-01,BRFSS~2015~2015~9002~Q44~TOC12~AGE~GENDER,2015,MDW,Midwest,67860583,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,27.4,Age Group,65 years or older,Gender,Female,GENDER,FEMALE,Region,0.259,17575890


### 2. To calculate the **per capita ratio** of participants per region

#### How to:

To get the per capita ratio for a US region, divide the indicator's value (count or percentage) by the population of that region. For a percentage, this gives a decimal that represents the proportion per person; for a count, it gives the average count per person.

You can then multiply this result by 100,000 to get a more readable number per 100,000 people. 

For example:
1. Total cases in the Southeast = \(50,000\)
2. Population of the Southeast = \(10,000,000\)

**Per capita cases** = \(50,000 / 10,000,000 = 0.005\) cases per person

**Per capita cases per 100,000** = \(0.005 \times 100,000 = 500\) cases per 100,000 people
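The worked example can be checked with a short plain-Python sketch. The helper names `per_capita` and `per_100k` are hypothetical and exist only for this illustration.

```python
def per_capita(cases: float, population: int) -> float:
    """Cases per person (hypothetical helper, not part of the notebook)."""
    return cases / population


def per_100k(cases: float, population: int) -> float:
    """Cases per 100,000 people (hypothetical helper)."""
    return per_capita(cases, population) * 100_000


# Numbers from the example: 50,000 cases in a population of 10,000,000.
ratio = per_capita(50_000, 10_000_000)
rate = per_100k(50_000, 10_000_000)

print(ratio)        # 0.005
print(round(rate))  # 500
```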

In [19]:
alzh_df['Per Capita Ratio'] = alzh_df['Est.Count_Per_Location']/alzh_df['Population']

In [20]:
alzh_df

Unnamed: 0,YearStart,RowId,YearEnd,LocationAbbr,Geographical Area,Population,Class,Topic,Question,Data_Value_Type,...,StratificationCategory1,Stratification1,StratificationCategory2,Stratification2,StratificationCategoryID2,StratificationID2,Type,Data_Value_Interpolated_over100,Est.Count_Per_Location,Per Capita Ratio
0,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Age Group,65 years or older,Race/Ethnicity,Hispanic,RACE,HIS,All,0.619,207432558,0.619
1,2018-01-01,BRFSS~2018~2018~59~Q39~TGC04~AGE~RACE,2018,US,"United States, DC & Territories",330135836,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Age Group,50-64 years,Race/Ethnicity,Hispanic,RACE,HIS,All,0.385,127102296,0.385
2,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Age Group,Overall,Race/Ethnicity,Native Am/Alaskan Native,RACE,NAA,All,0.571,191347320,0.571
3,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Age Group,50-64 years,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,All,0.399,134658159,0.399
4,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Age Group,Overall,Race/Ethnicity,Asian/Pacific Islander,RACE,ASN,All,0.340,114746301,0.340
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
30484,2019-01-01,BRFSS~2019~2019~9002~Q44~TOC12~AGE~GENDER,2019,MDW,Midwest,68329004,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,Age Group,65 years or older,Gender,Male,GENDER,MALE,Region,0.194,13255826,0.194
30485,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~OVERALL,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,Age Group,65 years or older,,,OVERALL,OVERALL,Region,0.217,14783511,0.217
30486,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~RACE,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,Age Group,Overall,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,Region,0.465,31678953,0.465
30487,2015-01-01,BRFSS~2015~2015~9002~Q44~TOC12~AGE~GENDER,2015,MDW,Midwest,67860583,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,Age Group,65 years or older,Gender,Female,GENDER,FEMALE,Region,0.259,17575890,0.259


In [21]:
alzh_df['Per 100,000 People'] = alzh_df['Per Capita Ratio']*100000
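Because the estimated count is itself the proportion multiplied by the population, dividing it by the population again returns approximately the original proportion, and the per-100,000 figure is roughly the percentage times 1,000. The sketch below illustrates this on a two-row toy frame with values copied from rows 0 and 30484 of the data; the frame name `toy` and the shortened column names are illustrative only. The small shortfall from a round number comes from the integer cast of the count.

```python
import pandas as pd

# Two rows copied from the notebook output (rows 0 and 30484).
toy = pd.DataFrame({
    "Population": [335_109_142, 68_329_004],
    "Data_Value_Interpolated": [61.9, 19.4],
})

# Same steps as the notebook: proportion x population, cast to int64.
toy["count"] = (toy["Data_Value_Interpolated"] / 100 * toy["Population"]).astype("int64")

# Dividing the count by the population gives back (almost) the proportion.
toy["per_capita"] = toy["count"] / toy["Population"]
toy["per_100k"] = toy["per_capita"] * 100_000

print(toy[["Data_Value_Interpolated", "per_capita", "per_100k"]])
```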

In [22]:
alzh_df

Unnamed: 0,YearStart,RowId,YearEnd,LocationAbbr,Geographical Area,Population,Class,Topic,Question,Data_Value_Type,...,Stratification1,StratificationCategory2,Stratification2,StratificationCategoryID2,StratificationID2,Type,Data_Value_Interpolated_over100,Est.Count_Per_Location,Per Capita Ratio,"Per 100,000 People"
0,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,65 years or older,Race/Ethnicity,Hispanic,RACE,HIS,All,0.619,207432558,0.619,61899.999732
1,2018-01-01,BRFSS~2018~2018~59~Q39~TGC04~AGE~RACE,2018,US,"United States, DC & Territories",330135836,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,50-64 years,Race/Ethnicity,Hispanic,RACE,HIS,All,0.385,127102296,0.385,38499.999740
2,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Overall,Race/Ethnicity,Native Am/Alaskan Native,RACE,NAA,All,0.571,191347320,0.571,57099.999976
3,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,50-64 years,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,All,0.399,134658159,0.399,39899.999917
4,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Overall,Race/Ethnicity,Asian/Pacific Islander,RACE,ASN,All,0.340,114746301,0.340,33999.999959
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
30484,2019-01-01,BRFSS~2019~2019~9002~Q44~TOC12~AGE~GENDER,2019,MDW,Midwest,68329004,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,65 years or older,Gender,Male,GENDER,MALE,Region,0.194,13255826,0.194,19399.998864
30485,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~OVERALL,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,65 years or older,,,OVERALL,OVERALL,Region,0.217,14783511,0.217,21699.999300
30486,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~RACE,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,Overall,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,Region,0.465,31678953,0.465,46499.999758
30487,2015-01-01,BRFSS~2015~2015~9002~Q44~TOC12~AGE~GENDER,2015,MDW,Midwest,67860583,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,65 years or older,Gender,Female,GENDER,FEMALE,Region,0.259,17575890,0.259,25899.998531


In [23]:
alzh_df['Per Capita Ratio_Per'] = (alzh_df['Data_Value_Interpolated'] / alzh_df['Population']) * 100  # percentage divided by population, scaled by 100

In [24]:
alzh_df

Unnamed: 0,YearStart,RowId,YearEnd,LocationAbbr,Geographical Area,Population,Class,Topic,Question,Data_Value_Type,...,StratificationCategory2,Stratification2,StratificationCategoryID2,StratificationID2,Type,Data_Value_Interpolated_over100,Est.Count_Per_Location,Per Capita Ratio,"Per 100,000 People",Per Capita Ratio_Per
0,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Race/Ethnicity,Hispanic,RACE,HIS,All,0.619,207432558,0.619,61899.999732,0.000018
1,2018-01-01,BRFSS~2018~2018~59~Q39~TGC04~AGE~RACE,2018,US,"United States, DC & Territories",330135836,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Race/Ethnicity,Hispanic,RACE,HIS,All,0.385,127102296,0.385,38499.999740,0.000012
2,2020-01-01,BRFSS~2020~2020~59~Q39~TGC04~AGE~RACE,2020,US,"United States, DC & Territories",335109142,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Race/Ethnicity,Native Am/Alaskan Native,RACE,NAA,All,0.571,191347320,0.571,57099.999976,0.000017
3,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,All,0.399,134658159,0.399,39899.999917,0.000012
4,2022-01-01,BRFSS~2022~2022~59~Q39~TGC04~AGE~RACE,2022,US,"United States, DC & Territories",337489121,Caregiving,Intensity of caregiving among older adults,Average of 20 or more hours of care per week p...,Percentage,...,Race/Ethnicity,Asian/Pacific Islander,RACE,ASN,All,0.340,114746301,0.340,33999.999959,0.000010
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
30484,2019-01-01,BRFSS~2019~2019~9002~Q44~TOC12~AGE~GENDER,2019,MDW,Midwest,68329004,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,Gender,Male,GENDER,MALE,Region,0.194,13255826,0.194,19399.998864,0.000028
30485,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~OVERALL,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,,,OVERALL,OVERALL,Region,0.217,14783511,0.217,21699.999300,0.000032
30486,2017-01-01,BRFSS~2017~2017~9002~Q44~TOC12~AGE~RACE,2017,MDW,Midwest,68126781,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,Race/Ethnicity,"Black, non-Hispanic",RACE,BLK,Region,0.465,31678953,0.465,46499.999758,0.000068
30487,2015-01-01,BRFSS~2015~2015~9002~Q44~TOC12~AGE~GENDER,2015,MDW,Midwest,67860583,Overall Health,Severe joint pain among older adults with arth...,Severe joint pain due to arthritis among older...,Percentage,...,Gender,Female,GENDER,FEMALE,Region,0.259,17575890,0.259,25899.998531,0.000038


In [25]:
alzh_df['Per Capita Ratio_Per'].astype('float')  # display only; the result is not assigned back to the DataFrame

0        0.000018
1        0.000012
2        0.000017
3        0.000012
4        0.000010
           ...   
30484    0.000028
30485    0.000032
30486    0.000068
30487    0.000038
30488    0.000034
Name: Per Capita Ratio_Per, Length: 30489, dtype: float64

In [26]:
alzh_df.to_csv('Alzheimers_Healthy_Aging_Dataset_withCalc.csv')  # export the result to CSV (the index is written as the first column)