# **Ellie White - Final Data Science Project**
### Analysis of Global Deaths from Dementia 2000-2021


## Importing libraries and loading data

In [None]:
#Connecting to Google Drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import plotly.express as px
import plotly.graph_objects as go
import datetime as dt
import pandas as pd
import os
data_path = "/content/drive/MyDrive/Intro_Data_Science/Final_Project/FinalProjectData"
os.chdir(data_path)

In [None]:
# Loading the data files
ogdf_alz = pd.read_csv("deaths-from-alzheimers-other-dementias.csv")
ogdf_pop = pd.read_csv("population-with-un-projections.csv")

In [None]:
ogdf_alz.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4422 entries, 0 to 4421
Data columns (total 4 columns):
 #   Column                                                                    Non-Null Count  Dtype  
---  ------                                                                    --------------  -----  
 0   Entity                                                                    4422 non-null   object 
 1   Code                                                                      4290 non-null   object 
 2   Year                                                                      4422 non-null   int64  
 3   Total deaths from alzheimer disease and other dementias among both sexes  4422 non-null   float64
dtypes: float64(1), int64(1), object(2)
memory usage: 138.3+ KB


After inspecting the characteristics of the data frame, I noticed that some rows lacked codes. I inspected the codeless rows below:

In [None]:
# Checking why some entities don't have codes
ogdf_alz[ogdf_alz['Code'].isna()]

Unnamed: 0,Entity,Code,Year,Total deaths from alzheimer disease and other dementias among both sexes
22,Africa,,2000,35149.220
23,Africa,,2001,36481.430
24,Africa,,2002,38289.440
25,Africa,,2003,39873.992
26,Africa,,2004,41280.300
...,...,...,...,...
3691,South America,,2017,43309.350
3692,South America,,2018,46520.040
3693,South America,,2019,52220.188
3694,South America,,2020,53350.690


It seems that continents don't have assigned codes. That explains the missing codes. I then inspected the population data frame and found similar trends.

In [None]:
ogdf_pop.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38656 entries, 0 to 38655
Data columns (total 5 columns):
 #   Column                                                 Non-Null Count  Dtype  
---  ------                                                 --------------  -----  
 0   Entity                                                 38656 non-null  object 
 1   Code                                                   35938 non-null  object 
 2   Year                                                   38656 non-null  int64  
 3   Population - Sex: all - Age: all - Variant: estimates  18944 non-null  float64
 4   Population - Sex: all - Age: all - Variant: medium     19712 non-null  float64
dtypes: float64(2), int64(1), object(2)
memory usage: 1.5+ MB


In [None]:
ogdf_pop[ogdf_pop['Code'].isna()]

Unnamed: 0,Entity,Code,Year,Population - Sex: all - Age: all - Variant: estimates,Population - Sex: all - Age: all - Variant: medium
151,Africa (UN),,1950,227776420.0,
152,Africa (UN),,1951,232556973.0,
153,Africa (UN),,1952,237539543.0,
154,Africa (UN),,1953,242685066.0,
155,Africa (UN),,1954,248005819.0,
...,...,...,...,...,...
36839,Upper-middle-income countries,,2096,,2.090600e+09
36840,Upper-middle-income countries,,2097,,2.074385e+09
36841,Upper-middle-income countries,,2098,,2.058232e+09
36842,Upper-middle-income countries,,2099,,2.042159e+09


Similar findings here: broad regions, continents, and income groups lack codes. That's fine because I'm focusing on individual countries for my analysis.
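As a quick way to confirm which entities lack codes, the check can be reduced to a list of unique names. This is a minimal sketch on a toy data frame (the rows are illustrative stand-ins, not the real files), and `codeless_entities` and `countries_only` are hypothetical names introduced here.

```python
import pandas as pd

# Toy stand-in for the deaths data; rows are illustrative only.
toy = pd.DataFrame({
    "Entity": ["Africa", "Africa", "Afghanistan", "South America"],
    "Code": [None, None, "AFG", None],
    "Year": [2000, 2001, 2000, 2000],
})

# Unique entities with no code (expected to be aggregates, not countries)
codeless_entities = sorted(toy.loc[toy["Code"].isna(), "Entity"].unique())

# Keep only rows that carry a code
countries_only = toy.dropna(subset=["Code"])

print(codeless_entities)
print(len(countries_only))
```

Dropping codeless rows would not remove aggregates that do carry a code. The notebook later filters out 'World' by name, which suggests that at least one such aggregate is present.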

In [None]:
ogdf_alz.head()

Unnamed: 0,Entity,Code,Year,Total deaths from alzheimer disease and other dementias among both sexes
0,Afghanistan,AFG,2000,902.83
1,Afghanistan,AFG,2001,922.18
2,Afghanistan,AFG,2002,992.61
3,Afghanistan,AFG,2003,1075.51
4,Afghanistan,AFG,2004,1132.83


In [None]:
ogdf_pop.head()

Unnamed: 0,Entity,Code,Year,Population - Sex: all - Age: all - Variant: estimates,Population - Sex: all - Age: all - Variant: medium
0,Afghanistan,AFG,1950,7776180.0,
1,Afghanistan,AFG,1951,7879343.0,
2,Afghanistan,AFG,1952,7987784.0,
3,Afghanistan,AFG,1953,8096703.0,
4,Afghanistan,AFG,1954,8207954.0,


The population dataset starts in 1950, but the Alzheimer's dataset starts in 2000. I will slice the data frames and limit my analysis to 2000-2021, the range of the Alzheimer's data.

In [None]:
df_pop = ogdf_pop[(ogdf_pop['Year'] >= 2000) & (ogdf_pop['Year'] <= 2021)]
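The same year filter can also be written with `Series.between`, which includes both endpoints by default. The sketch below uses a toy data frame with made-up values. The `.copy()` is an optional addition, not part of the original cell, that makes the slice an independent data frame so later in-place edits do not trigger chained-assignment warnings.

```python
import pandas as pd

# Toy population table; values are illustrative only.
toy_pop = pd.DataFrame({
    "Entity": ["A"] * 5,
    "Year": [1950, 1999, 2000, 2021, 2022],
    "Population": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# between() is inclusive on both ends, matching (>= 2000) & (<= 2021)
in_range = toy_pop[toy_pop["Year"].between(2000, 2021)].copy()

print(in_range["Year"].tolist())
```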

## Merging Alzheimer's and population datasets

To calculate the dementia mortality rate by country, I merged the Alzheimer's and population datasets. I merged on 'Entity', 'Year', and 'Code', which were shared between the datasets.

In [None]:
# Merge the population data onto the Alzheimer's data now that the year ranges are consistent
df_alzpop = pd.merge(ogdf_alz, df_pop, on=["Entity", "Year", "Code"], how="inner")
# how="inner" keeps only rows whose Entity, Year, and Code appear in both data frames
# If the key columns had different headers, left_on and right_on would be used instead of on
df_alzpop.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4290 entries, 0 to 4289
Data columns (total 6 columns):
 #   Column                                                                    Non-Null Count  Dtype  
---  ------                                                                    --------------  -----  
 0   Entity                                                                    4290 non-null   object 
 1   Code                                                                      4290 non-null   object 
 2   Year                                                                      4290 non-null   int64  
 3   Total deaths from alzheimer disease and other dementias among both sexes  4290 non-null   float64
 4   Population - Sex: all - Age: all - Variant: estimates                     4290 non-null   float64
 5   Population - Sex: all - Age: all - Variant: medium                        0 non-null      float64
dtypes: float64(3), int64(1), object(2)
memory usage: 201.2+ KB
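One way to guard a merge like this against accidentally duplicated keys is the `validate` argument of `pd.merge`. The sketch below uses toy frames with made-up values and is not a step from the original analysis. With `validate="one_to_one"`, pandas raises `pd.errors.MergeError` if a key combination repeats on either side.

```python
import pandas as pd

keys = ["Entity", "Year", "Code"]

# Toy frames; values are illustrative only.
deaths = pd.DataFrame({
    "Entity": ["A", "A", "B"],
    "Code": ["AAA", "AAA", "BBB"],
    "Year": [2000, 2001, 2000],
    "Deaths": [1.0, 2.0, 3.0],
})
pop = pd.DataFrame({
    "Entity": ["A", "A", "B"],
    "Code": ["AAA", "AAA", "BBB"],
    "Year": [2000, 2001, 2000],
    "Population": [100.0, 110.0, 200.0],
})

# Succeeds because every (Entity, Year, Code) key is unique on both sides
merged = pd.merge(deaths, pop, on=keys, how="inner", validate="one_to_one")

# Duplicate one population row to show the check failing
pop_dup = pd.concat([pop, pop.iloc[[0]]], ignore_index=True)
try:
    pd.merge(deaths, pop_dup, on=keys, how="inner", validate="one_to_one")
    duplicate_detected = False
except pd.errors.MergeError:
    duplicate_detected = True

print(len(merged), duplicate_detected)
```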


In [None]:
df_alzpop.rename(columns={'Total deaths from alzheimer disease and other dementias among both sexes': 'Total_Deaths_Alzheimer_Dementia'}, inplace=True)
df_alzpop.rename(columns={'Population - Sex: all - Age: all - Variant: estimates': 'Total_Pop_Est'}, inplace=True)
df_alzpop.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4290 entries, 0 to 4289
Data columns (total 6 columns):
 #   Column                                              Non-Null Count  Dtype  
---  ------                                              --------------  -----  
 0   Entity                                              4290 non-null   object 
 1   Code                                                4290 non-null   object 
 2   Year                                                4290 non-null   int64  
 3   Total_Deaths_Alzheimer_Dementia                     4290 non-null   float64
 4   Total_Pop_Est                                       4290 non-null   float64
 5   Population - Sex: all - Age: all - Variant: medium  0 non-null      float64
dtypes: float64(3), int64(1), object(2)
memory usage: 201.2+ KB


In [None]:
# Dropping the 'medium' column because it only has projections beyond this time period.
df_alzpop.drop(columns='Population - Sex: all - Age: all - Variant: medium', inplace=True)

## Calculating Alzheimer's mortality rate by population size

Mortality rate is commonly calculated per 10,000 or per 100,000 individuals. I calculated dementia mortality per 100,000 individuals because dementia deaths are rare relative to the size of a population.

In [None]:
# Adding calculation for deaths from Alzheimer's and dementia by pop
df_alzpop['Deaths_per_100k'] = (df_alzpop['Total_Deaths_Alzheimer_Dementia']/df_alzpop['Total_Pop_Est'])*100000
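As a sanity check on the formula, the calculation can be reproduced by hand for a single row. This sketch uses the Afghanistan 2000 values shown in this notebook's tables (902.83 deaths, population 20,130,334). The variable names are introduced here for illustration.

```python
# Values taken from the Afghanistan, 2000 row displayed in this notebook
deaths = 902.83
population = 20130334.0

# Deaths per 100,000 people = (deaths / population) * 100,000
rate_per_100k = deaths / population * 100_000

print(round(rate_per_100k, 4))
```

The result should agree with the `Deaths_per_100k` value listed for that row (about 4.48).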

In [None]:
df_alzpop

Unnamed: 0,Entity,Code,Year,Total_Deaths_Alzheimer_Dementia,Total_Pop_Est,Deaths_per_100k
0,Afghanistan,AFG,2000,902.83,20130334.0,4.484923
1,Afghanistan,AFG,2001,922.18,20284303.0,4.546274
2,Afghanistan,AFG,2002,992.61,21378123.0,4.643111
3,Afghanistan,AFG,2003,1075.51,22733053.0,4.731041
4,Afghanistan,AFG,2004,1132.83,23560656.0,4.808143
...,...,...,...,...,...,...
4285,Zimbabwe,ZWE,2017,888.44,14812484.0,5.997914
4286,Zimbabwe,ZWE,2018,910.54,15034457.0,6.056354
4287,Zimbabwe,ZWE,2019,916.96,15271377.0,6.004436
4288,Zimbabwe,ZWE,2020,1034.04,15526887.0,6.659674


## Generating choropleth map figure

To visualize my analysis, I used AI to help generate an animated choropleth map showing dementia mortality over time.

In [127]:
# Used AI for help generating basic choropleth map and showing the year at the top left. Heavily modified the code for improved readability and design of the figure.
fig = px.choropleth(
    df_alzpop,
    locations='Entity',
    locationmode='country names',
    color='Deaths_per_100k',
    hover_name='Entity',
    animation_frame='Year',
    color_continuous_scale=px.colors.sequential.Agsunset,
    range_color=(df_alzpop['Deaths_per_100k'].min(), df_alzpop['Deaths_per_100k'].max()),
    title='Dementia Mortality Rate (2000-2021)'
)

# Consolidate all layout updates into a single call for clarity and consistency
fig.update_layout(
    height=500,
    width=1000,
    autosize=False, # Disable autosizing
    coloraxis_colorbar=dict(
        title=dict(text='Deaths per 100k'),
        len=0.7 # Make the color bar shorter
    ),
    sliders=[dict(
        y=-0.17, # Vertical position of the slider (0 is bottom, 1 is top)
        x=0.05, # Horizontal position
        xanchor='left',
        yanchor='bottom',
        currentvalue=dict(visible=False) # Hide the automatic 'Year = ' label
    )],
    updatemenus=[dict(
        y=0.03, # Vertical position of the play/pause buttons, slightly above slider
        x=0.05, # Horizontal position
        xanchor='left',
        yanchor='bottom'
    )],
    title=dict(y=0.9, x=0.27) # Move title to top
)

fig.show()

## Exploratory scatterplot

I made this exploratory animated scatterplot to better see overall trends in dementia-related deaths.

In [None]:
fig_scatter = px.scatter(
    df_alzpop[df_alzpop['Entity'] != 'World'], # Filter out rows where Entity is 'World' for easier visibility of trends
    x='Total_Pop_Est',
    y='Total_Deaths_Alzheimer_Dementia',
    animation_frame='Year',
    hover_name='Entity',
    #size='Total_Pop_Est', # Make the size of points reflect population
    color='Deaths_per_100k', # Color points by deaths per population
    log_x=True, # Use a log scale for population for better distribution visibility
    log_y=True,
    title='Animated Scatter Plot of Dementia Deaths vs. Population (2000-2021)'
)

fig_scatter.show()

I noticed that Monaco stood out. I looked up Monaco and found that it is a very small, very wealthy country with a very high percentage of people aged 65+. I decided to compare GDP per capita with Alzheimer's deaths per 100,000 people.

## Loading and merging global GDP per capita data

In [None]:
# Loading the data files
ogdf_gdp = pd.read_csv("gdp-per-capita-worldbank.csv")


In [None]:
ogdf_gdp.info()
df_gdp = ogdf_gdp[(ogdf_gdp['Year'] >= 2000) & (ogdf_gdp['Year'] <= 2021)]
df_gdp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7311 entries, 0 to 7310
Data columns (total 5 columns):
 #   Column                                               Non-Null Count  Dtype  
---  ------                                               --------------  -----  
 0   Entity                                               7311 non-null   object 
 1   Code                                                 6876 non-null   object 
 2   Year                                                 7311 non-null   int64  
 3   GDP per capita, PPP (constant 2021 international $)  7236 non-null   float64
 4   World regions according to OWID                      272 non-null    object 
dtypes: float64(1), int64(1), object(3)
memory usage: 285.7+ KB
<class 'pandas.core.frame.DataFrame'>
Index: 4611 entries, 0 to 7307
Data columns (total 5 columns):
 #   Column                                               Non-Null Count  Dtype  
---  ------                                               -------------- 

In [None]:
# Merging the GDP dataset onto the AlzPop dataset
df_all = pd.merge(df_alzpop, df_gdp, on=["Entity", "Year", "Code"], how="inner")
# how="inner" keeps only rows whose Entity, Year, and Code appear in both data frames
# If the key columns had different headers, left_on and right_on would be used instead of on
df_all.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4079 entries, 0 to 4078
Data columns (total 8 columns):
 #   Column                                               Non-Null Count  Dtype  
---  ------                                               --------------  -----  
 0   Entity                                               4079 non-null   object 
 1   Code                                                 4079 non-null   object 
 2   Year                                                 4079 non-null   int64  
 3   Total_Deaths_Alzheimer_Dementia                      4079 non-null   float64
 4   Total_Pop_Est                                        4079 non-null   float64
 5   Deaths_per_100k                                      4079 non-null   float64
 6   GDP per capita, PPP (constant 2021 international $)  4079 non-null   float64
 7   World regions according to OWID                      0 non-null      object 
dtypes: float64(4), int64(1), object(3)
memory usage: 255.1+ KB


I noticed that the df_all dataset had fewer rows than either the df_alzpop or df_gdp datasets. I used AI for help displaying the dropped rows.

#### Inspecting dropped rows from the merge

In [None]:
# Used AI for help counting and displaying dropped rows
# Perform a left merge to identify rows from df_alzpop that didn't have a match in df_gdp
merged_left = pd.merge(df_alzpop, df_gdp, on=['Entity', 'Year', 'Code'], how='left', suffixes=('_alzpop', '_gdp'))

# Find rows where the GDP data is missing (meaning no match was found in df_gdp)
missing_gdp_rows = merged_left[merged_left['GDP per capita, PPP (constant 2021 international $)'].isnull()]

print(f"Number of rows from df_alzpop that did not find a match in df_gdp: {len(missing_gdp_rows)}")
print("Sample of rows from df_alzpop that were dropped due to no GDP data:")
display(missing_gdp_rows.head())

# See the unique entities that were dropped
print("\nUnique entities from df_alzpop with no matching GDP data:")
display(missing_gdp_rows['Entity'].unique())

Number of rows from df_alzpop that did not find a match in df_gdp: 211
Sample of rows from df_alzpop that were dropped due to no GDP data:


Unnamed: 0,Entity,Code,Year,Total_Deaths_Alzheimer_Dementia,Total_Pop_Est,Deaths_per_100k,"GDP per capita, PPP (constant 2021 international $)",World regions according to OWID
858,Cook Islands,COK,2000,1.58,15876.0,9.952129,,
859,Cook Islands,COK,2001,1.55,15259.0,10.15794,,
860,Cook Islands,COK,2002,1.51,14994.0,10.070695,,
861,Cook Islands,COK,2003,1.56,15049.0,10.366137,,
862,Cook Islands,COK,2004,1.64,15103.0,10.85877,,



Unique entities from df_alzpop with no matching GDP data:


array(['Cook Islands', 'Cuba', 'Djibouti', 'Eritrea', 'Monaco', 'Niue',
       'North Korea', 'South Sudan', 'Venezuela', 'Yemen'], dtype=object)

The countries above were dropped, for some or all years, because they lacked GDP data.

In [None]:
# Perform a left merge using df_gdp as the left DataFrame to identify rows that didn't have a match in df_alzpop
merged_gdp_left = pd.merge(df_gdp, df_alzpop, on=['Entity', 'Year', 'Code'], how='left', suffixes=('_gdp', '_alzpop'))

# Finding rows where the Alzheimer's death data is missing (meaning no match was found in df_alzpop)
missing_alz_rows = merged_gdp_left[merged_gdp_left['Total_Deaths_Alzheimer_Dementia'].isnull()]

print(f"Number of rows from df_gdp that did not find a match in df_alzpop: {len(missing_alz_rows)}")
print("Sample of rows from df_gdp that were dropped due to no Alzheimer's data:")
display(missing_alz_rows.head())

# See the unique entities that were dropped
print("\nUnique entities from df_gdp with no matching Alzheimer's data:")
display(missing_alz_rows['Entity'].unique())

Number of rows from df_gdp that did not find a match in df_alzpop: 532
Sample of rows from df_gdp that were dropped due to no Alzheimer's data:


Unnamed: 0,Entity,Code,Year,"GDP per capita, PPP (constant 2021 international $)",World regions according to OWID,Total_Deaths_Alzheimer_Dementia,Total_Pop_Est,Deaths_per_100k
176,Aruba,ABW,2000,39244.83,,,,
177,Aruba,ABW,2001,40505.53,,,,
178,Aruba,ABW,2002,39846.062,,,,
179,Aruba,ABW,2003,39832.58,,,,
180,Aruba,ABW,2004,41834.926,,,,



Unique entities from df_gdp with no matching Alzheimer's data:


array(['Aruba', 'Bermuda', 'Cayman Islands', 'Curacao',
       'East Asia and Pacific (WB)', 'Europe and Central Asia (WB)',
       'European Union (27)', 'Faroe Islands', 'Greenland',
       'High-income countries', 'Hong Kong', 'Kosovo',
       'Latin America and Caribbean (WB)', 'Low-income countries',
       'Lower-middle-income countries', 'Macao',
       'Middle East, North Africa, Afghanistan and Pakistan (WB)',
       'North America (WB)', 'Palestine', 'Puerto Rico',
       'Sint Maarten (Dutch part)', 'South Asia (WB)',
       'Sub-Saharan Africa (WB)', 'Turks and Caicos Islands',
       'United States Virgin Islands', 'Upper-middle-income countries'],
      dtype=object)

The regions and countries above were dropped because they lacked Alzheimer's data.
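The two left merges above can also be combined into one pass. This minimal sketch uses an outer merge with `indicator=True`, which adds a `_merge` column marking each row as `left_only`, `right_only`, or `both`. The toy frames and the names `audit`, `only_left`, and `only_right` are illustrative assumptions, not the real data.

```python
import pandas as pd

keys = ["Entity", "Year", "Code"]

# Toy frames; values are illustrative only.
left = pd.DataFrame({
    "Entity": ["A", "B", "Monaco"],
    "Code": ["AAA", "BBB", "MCO"],
    "Year": [2000, 2000, 2000],
    "Deaths_per_100k": [5.0, 6.0, 300.0],
})
right = pd.DataFrame({
    "Entity": ["A", "B", "Aruba"],
    "Code": ["AAA", "BBB", "ABW"],
    "Year": [2000, 2000, 2000],
    "GDP per capita": [1000.0, 2000.0, 39000.0],
})

# The outer merge keeps every row from both sides; _merge records its origin
audit = pd.merge(left, right, on=keys, how="outer", indicator=True)

only_left = audit.loc[audit["_merge"] == "left_only", "Entity"].tolist()
only_right = audit.loc[audit["_merge"] == "right_only", "Entity"].tolist()
counts = audit["_merge"].value_counts()

print(only_left, only_right)
print(counts)
```

Rows marked `both` are the ones an inner merge would keep. The other two groups correspond to the two dropped-row lists inspected above.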

In [None]:
df_all.head()

Unnamed: 0,Entity,Code,Year,Total_Deaths_Alzheimer_Dementia,Total_Pop_Est,Deaths_per_100k,"GDP per capita, PPP (constant 2021 international $)",World regions according to OWID
0,Afghanistan,AFG,2000,902.83,20130334.0,4.484923,1617.8264,
1,Afghanistan,AFG,2001,922.18,20284303.0,4.546274,1454.1108,
2,Afghanistan,AFG,2002,992.61,21378123.0,4.643111,1774.3087,
3,Afghanistan,AFG,2003,1075.51,22733053.0,4.731041,1815.9282,
4,Afghanistan,AFG,2004,1132.83,23560656.0,4.808143,1776.9182,


In [None]:
# Dropping this unnecessary column
df_all.drop(columns='World regions according to OWID', inplace=True)

In [None]:
# Renaming for conciseness
df_all.rename(columns={'GDP per capita, PPP (constant 2021 international $)': 'GDP per capita'}, inplace=True)

In [None]:
# Reviewing the df_all dataframe with updated column names
df_all.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4079 entries, 0 to 4078
Data columns (total 7 columns):
 #   Column                           Non-Null Count  Dtype  
---  ------                           --------------  -----  
 0   Entity                           4079 non-null   object 
 1   Code                             4079 non-null   object 
 2   Year                             4079 non-null   int64  
 3   Total_Deaths_Alzheimer_Dementia  4079 non-null   float64
 4   Total_Pop_Est                    4079 non-null   float64
 5   Deaths_per_100k                  4079 non-null   float64
 6   GDP per capita                   4079 non-null   float64
dtypes: float64(4), int64(1), object(2)
memory usage: 223.2+ KB


In [None]:
# For troubleshooting and data inspection purposes - exporting data to examine in Excel
#df_all.to_excel('finalprojectcompletedata.xlsx')

## Generating GDP vs dementia mortality scatterplot

In [None]:
# Used AI for initial help generating this animated scatterplot, then heavily modified the code for improved readability and design.
fig_scattergdp = px.scatter(
    df_all[~df_all['Entity'].isin(['World', 'Djibouti'])], # Filter out rows where Entity is 'World' or 'Djibouti' for easier visibility of trends
    x='GDP per capita',
    y='Deaths_per_100k',
    animation_frame='Year',
    hover_name='Entity',
    size='Total_Pop_Est', # Make the size of points reflect population
    title='GDP per capita vs dementia mortality rate (2000-2021)',
    color_discrete_sequence=['purple']
)

# Add black outline to points
fig_scattergdp.update_traces(marker_line_color='black', marker_line_width=1.5)

# Updating axes ranges
fig_scattergdp.update_xaxes(title_text='GDP per capita', range = [0, df_all['GDP per capita'].max()])
fig_scattergdp.update_yaxes(title_text='Dementia deaths per 100k', range = [0, df_all['Deaths_per_100k'].max()])

# Set background color to white
fig_scattergdp.update_layout(
    plot_bgcolor='white',
    xaxis=dict(
        showgrid=True,
        gridcolor='lightgray'
    ),
    yaxis=dict(
        showgrid=True,
        gridcolor='lightgray'
    ),
    title=dict(y=0.88, x=0.03) # Move title down
)

fig_scattergdp.show()

I was curious to see how the strength of correlation looked over time, so I did some exploratory statistics.

#### Exploratory statistics

In [None]:
# Calculating Pearson's r and p-value across all years combined
from scipy.stats import pearsonr, spearmanr
r_val, p_val_pearson = pearsonr(df_all['GDP per capita'], df_all['Deaths_per_100k'])

#Printing to inspect the values
print(r_val)
print(p_val_pearson)

0.5482537784437254
7.13233e-319


In [None]:
# How do the strength and significance of the correlation change over time?
# Used AI to make function to calculate correlation for each year
def calc_yearly_corr(df):
    # Guard against undersized groups: each year should contain many countries
    if len(df) < 20:
        # NaN (rather than None) keeps the result columns numeric
        return pd.Series({'r_value': float('nan'), 'p_value': float('nan')})

    # If the group is large enough, go ahead with the calculations
    r_val, p_val = pearsonr(df['GDP per capita'], df['Deaths_per_100k'])
    return pd.Series({'r_value': r_val, 'p_value': p_val})

# Apply the function to each year of df_all
df_stats = df_all.groupby('Year').apply(calc_yearly_corr).reset_index()

In [None]:
# Inspecting the stats df
df_stats.head()

Unnamed: 0,Year,r_value,p_value
0,2000,0.472621,1.102463e-11
1,2001,0.498214,5.34962e-13
2,2002,0.50649,1.903178e-13
3,2003,0.507991,1.573111e-13
4,2004,0.503651,2.72157e-13


In [None]:
df_stats.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 22 entries, 0 to 21
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   Year     22 non-null     int64  
 1   r_value  22 non-null     float64
 2   p_value  22 non-null     float64
dtypes: float64(2), int64(1)
memory usage: 660.0 bytes


In [None]:
# Generating a line graph to view strength of correlation over time
stats_linegraph = px.line(df_stats, x='Year', y='r_value')
stats_linegraph.update_yaxes(range=[0, 1])
stats_linegraph.show()

In [None]:
print(df_stats)

    Year   r_value       p_value
0   2000  0.472621  1.102463e-11
1   2001  0.498214  5.349620e-13
2   2002  0.506490  1.903178e-13
3   2003  0.507991  1.573111e-13
4   2004  0.503651  2.721570e-13
5   2005  0.506972  1.790528e-13
6   2006  0.512223  9.149426e-14
7   2007  0.521057  2.880394e-14
8   2008  0.537281  3.153401e-15
9   2009  0.540416  2.028311e-15
10  2010  0.531167  7.359159e-15
11  2011  0.531631  6.904733e-15
12  2012  0.539685  2.248963e-15
13  2013  0.558271  1.243228e-16
14  2014  0.572968  1.278618e-17
15  2015  0.585016  1.817058e-18
16  2016  0.599655  1.518363e-19
17  2017  0.614673  1.039008e-20
18  2018  0.617847  5.786473e-21
19  2019  0.624011  1.821709e-21
20  2020  0.606503  4.548732e-20
21  2021  0.611429  1.877188e-20
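Pearson's r measures linear association, so a rank-based check can be a useful complement when the relationship might be monotonic but curved. The sketch below is an illustration on toy data with made-up values, not a result from this analysis. It computes Spearman's rho per year as the Pearson correlation of the ranks, using only pandas. The function name `spearman_by_rank` is hypothetical.

```python
import pandas as pd

# Toy data: two years, four entities each; values are illustrative only.
toy = pd.DataFrame({
    "Year": [2000] * 4 + [2001] * 4,
    "GDP per capita": [1000.0, 5000.0, 20000.0, 60000.0,
                       1100.0, 5200.0, 21000.0, 61000.0],
    "Deaths_per_100k": [4.0, 10.0, 30.0, 90.0,
                        5.0, 9.0, 35.0, 95.0],
})

def spearman_by_rank(group):
    # Spearman's rho is Pearson's r computed on the ranks of each variable
    gdp_rank = group["GDP per capita"].rank()
    rate_rank = group["Deaths_per_100k"].rank()
    return gdp_rank.corr(rate_rank)

# One rho value per year
yearly_rho = {int(year): spearman_by_rank(g) for year, g in toy.groupby("Year")}

print(yearly_rho)
```

Because the toy values rise together in both years, rho is 1 for each year. Real data would give lower values.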


In [None]:
# Sort df_alzpop to see which countries had highest rates of dementia mortality
df_alzpop[df_alzpop['Year'] == 2021].sort_values('Deaths_per_100k', ascending=False).head(50)

Unnamed: 0,Entity,Code,Year,Total_Deaths_Alzheimer_Dementia,Total_Pop_Est,Deaths_per_100k
2507,Monaco,MCO,2021,115.39,38548.0,299.341081
4069,United Kingdom,GBR,2021,88684.47,67668790.0,131.056685
1363,Finland,FIN,2021,6855.49,5541069.0,123.721434
1055,Denmark,DNK,2021,5296.7,5856774.0,90.437159
3717,Sweden,SWE,2021,9372.44,10416130.0,89.980025
2705,Netherlands,NLD,2021,15613.16,17730560.0,88.057882
2881,Norway,NOR,2021,4660.68,5408082.0,86.179906
4091,United States,USA,2021,286214.06,340161400.0,84.140654
3101,Portugal,PRT,2021,8484.71,10390960.0,81.65475
681,Canada,CAN,2021,30787.2,38454060.0,80.062291


## Downloading HTML figures for GitHub

In [None]:
import os
print(f"Your file will be saved here: {os.getcwd()}")

Your file will be saved here: /content/drive/MyDrive/Intro_Data_Science/Final_Project/FinalProjectData


In [122]:
# Save the figure to an HTML file in your current directory
file_name = "choropleth_alzheimers.html"
fig.write_html(file_name)