## Health Care Spending in OECD Countries
### As the Percentage of GDP Between 1970 and 2017
Data: [OECD](https://stats.oecd.org/index.aspx?DataSetCode=HEALTH_STAT#), Visualization: [Cansin Cagan Acarer](https://cacarer.com/)

I originally created this visualization in JavaScript ([see it on Observable](https://observablehq.com/@cansin/health-care-spending-in-oecd-countries)). As a learning exercise, I recreated it here in Python with the [bar_chart_race](https://github.com/dexplo/bar_chart_race) function, using the same CSV file and transforming it as necessary.

In [1]:
import pandas as pd
df = pd.read_csv("oecd-data.csv")
df.head()

,date,name,category,value
0,1970-01-01,Canada,Canada,6.353
1,1970-01-01,United States,United States,6.232
2,1970-01-01,Germany,United States,5.713
3,1970-01-01,Sweden,Sweden,5.506
4,1970-01-01,France,France,5.184


#### Transforming the data into the format the bar_chart_race function expects:
Looking at how the CSV file was read into the dataframe, we see three issues:

In [2]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1728 entries, 0 to 1727
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   date      1728 non-null   object 
 1   name      1728 non-null   object 
 2   category  1728 non-null   object 
 3   value     1728 non-null   float64
dtypes: float64(1), object(3)
memory usage: 54.1+ KB


Issue 1: Dates are read as strings. We need to convert them to datetime64 so that formatting them later on is easier:

In [3]:
# Parse the date strings into datetime64[ns] values
df['date'] = pd.to_datetime(df['date'])
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1728 entries, 0 to 1727
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype         
---  ------    --------------  -----         
 0   date      1728 non-null   datetime64[ns]
 1   name      1728 non-null   object        
 2   category  1728 non-null   object        
 3   value     1728 non-null   float64       
dtypes: datetime64[ns](1), float64(1), object(2)
memory usage: 54.1+ KB
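
As a minimal sketch of what the conversion buys us, the snippet below parses a few date strings and then formats them by year. The toy series stands in for the `date` column and is an assumption for illustration, not the real file:

```python
import pandas as pd

# Toy stand-in for the 'date' column of oecd-data.csv (illustrative values).
dates = pd.Series(["1970-01-01", "1971-01-01", "2017-01-01"])

# Parse with an explicit format so that malformed rows raise an error.
parsed = pd.to_datetime(dates, format="%Y-%m-%d")

# Once parsed, the .dt accessor exposes date parts and strftime-style formatting.
years = parsed.dt.year.tolist()
labels = parsed.dt.strftime("%Y").tolist()
print(parsed.dtype, years, labels)
```

The `'%Y'` pattern used here is the same one passed as `period_template` in the chart call further down.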


Issue 2: The value column contains percentage values (e.g. 6.2 for 6.2%), but we need fractions (e.g. 0.062 for 6.2%) because the `%` templates used in the chart call below multiply each value by 100 when formatting it:

In [4]:
df["value"] = df["value"]/100
df.head()

,date,name,category,value
0,1970-01-01,Canada,Canada,0.06353
1,1970-01-01,United States,United States,0.06232
2,1970-01-01,Germany,United States,0.05713
3,1970-01-01,Sweden,Sweden,0.05506
4,1970-01-01,France,France,0.05184
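
To see why the division matters, here is a small sketch using Python's standard format mini-language. It assumes the `'{x:.2%}'` and `'{x:.0%}'` templates passed to the chart later are applied the way `str.format` applies them:

```python
# 6.353 means 6.353% of GDP, as in the first row of the raw CSV.
raw_percent = 6.353
fraction = raw_percent / 100

# The '%' presentation type multiplies by 100 and appends a percent sign.
bar_label = "{x:.2%}".format(x=fraction)        # '6.35%'
tick_label = "{x:.0%}".format(x=fraction)       # '6%'
wrong_label = "{x:.2%}".format(x=raw_percent)   # '635.30%'
print(bar_label, tick_label, wrong_label)
```

Formatting the raw percentage directly would inflate every label by a factor of 100.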


Issue 3: We need to reshape the dataframe from long to wide format, with dates as rows and countries as columns, because this is the format the bar_chart_race function expects.

In [5]:
df = df.pivot_table(index='date', columns='name', values='value')
df.head()

date,Australia,Austria,Belgium,Canada,Chile,Czech Republic,Denmark,Estonia,Finland,France,...,Poland,Portugal,Slovak Republic,Slovenia,Spain,Sweden,Switzerland,Turkey,United Kingdom,United States
1970-01-01,0.0,0.04838,0.03836,0.06353,0.0,0.0,0.0,0.0,0.04986,0.05184,...,0.0,0.02261,0.0,0.0,0.03145,0.05506,0.04901,0.0,0.03971,0.06232
1971-01-01,0.04548,0.0477,0.03922,0.06611,0.0,0.0,0.07677,0.0,0.05286,0.0,...,0.0,0.02422,0.0,0.0,0.03584,0.05944,0.0512,0.0,0.04017,0.06366
1972-01-01,0.04548,0.04765,0.04076,0.06528,0.0,0.0,0.07768,0.0,0.05381,0.0,...,0.0,0.02988,0.0,0.0,0.03782,0.06046,0.05086,0.0,0.04088,0.06499
1973-01-01,0.04512,0.04857,0.04355,0.06207,0.0,0.0,0.07585,0.0,0.05231,0.0,...,0.0,0.03231,0.0,0.0,0.03699,0.0602,0.05161,0.0,0.04019,0.06517
1974-01-01,0.05114,0.05086,0.04419,0.06091,0.0,0.0,0.08291,0.0,0.05168,0.0,...,0.0,0.03456,0.0,0.0,0.04028,0.0626,0.05512,0.0,0.04647,0.06836
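
The reshape is easier to follow on a tiny example. The sketch below builds a four-row long-format frame from the Canada and Sweden values shown above and pivots it the same way. The frame itself is a toy stand-in, not the full dataset:

```python
import pandas as pd

# Toy long-format data in the same shape as the CSV after the first two fixes.
long_df = pd.DataFrame({
    "date": pd.to_datetime(["1970-01-01", "1970-01-01", "1971-01-01", "1971-01-01"]),
    "name": ["Canada", "Sweden", "Canada", "Sweden"],
    "value": [0.06353, 0.05506, 0.06611, 0.05944],
})

# One row per date, one column per country; cells hold the spending fraction.
wide_df = long_df.pivot_table(index="date", columns="name", values="value")
print(wide_df)
```

Note that `pivot_table` aggregates duplicate date/country pairs with the mean by default, so it is worth checking that each pair appears only once in the source data.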


#### Calling the Build Function
We import [bar_chart_race](https://github.com/dexplo/bar_chart_race) and call its `bar_chart_race` function to save the chart as a GIF file.

In [6]:
%%capture --no-display
import bar_chart_race as bcr

bcr.bar_chart_race(
    df=df,
    filename='result.gif',
    n_bars=8,
    title='Health Care Spending as the Percentage of GDP',
    orientation='h',
    sort='desc',
    fixed_order=False,
    fixed_max=True,
    interpolate_period=False,
    period_template='%Y',
    colors='dark12',
    bar_size=.95,
    bar_textposition='inside',
    bar_texttemplate='{x:.2%}',
    bar_label_font=7,
    tick_label_font=7,
    tick_template='{x:.0%}',
    shared_fontdict=None,
    scale='linear',
    fig=None,
    writer=None,
    # fig_kwargs={'figsize': (6, 3.5), 'dpi': 144},
    filter_column_colors=False,
)

Note 1: This function requires ffmpeg to be installed on your machine ([see how-to here for Windows](https://www.wikihow.com/Install-FFmpeg-on-Windows)).

Note 2: The version of [bar_chart_race](https://github.com/dexplo/bar_chart_race) available through pip is not the most recent one. You can install the latest version directly from the repo with `python -m pip install git+https://github.com/dexplo/bar_chart_race`.
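
Regarding Note 1, a quick way to check whether ffmpeg is reachable before running the chart cell is to look it up on the PATH. This is a minimal sketch using only the standard library. The helper name `find_ffmpeg` is made up for illustration, and the check only confirms that an executable called `ffmpeg` can be found, not that it works:

```python
import shutil


def find_ffmpeg():
    """Return the path of the ffmpeg executable on PATH, or None if it is absent."""
    return shutil.which("ffmpeg")


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print("ffmpeg was not found on PATH; install it before building the chart.")
else:
    print(f"ffmpeg found at {ffmpeg_path}")
```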