<a href="https://www.kaggle.com/code/mikedelong/python-eda-with-maps-and-a-slider?scriptVersionId=143422324" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [None]:
import pandas as pd
iso_df = pd.read_csv(filepath_or_buffer='/kaggle/input/country-mapping-iso-continent-region/continents2.csv',
                    usecols=['name', 'alpha-3', 'region'],).rename(columns={'name': 'Countries'})
iso_df.head()

In [None]:
country_fixes = \
{'Ant.& Barb.' : 'Antigua and Barbuda',
 'Bosnia & Herz.' : 'Bosnia And Herzegovina',
 'Brunei' : 'Brunei Darussalam',
 'Burma' : 'Myanmar',
 'C.A. Republic' : 'Central African Republic',
 'Cape Verde' : 'Cabo Verde',
 'Czechia' : 'Czech Republic',
 'DR Congo' : 'Congo (Democratic Republic Of The)',
 'Domin. Rep.' : 'Dominican Republic',
 'Eq. Guinea' : 'Equatorial Guinea',
 'G.-Bissau' : 'Guinea Bissau',
 'Ivory Coast' : "Côte D'Ivoire",
 'Micronesia' : 'Micronesia (Federated States of)',
 'North Macedonia' : 'Macedonia',
 'Papua N.G.': 'Papua New Guinea',
 'R. of Congo' : 'Congo',
 'S.T.&Principe' : 'Sao Tome and Principe',
 'Solomon Isl.' : 'Solomon Islands',
 'St. Vincent & ...' : 'Saint Vincent and the Grenadines',
 'Swaziland' : 'Eswatini',
 'Tr.&Tobago' : 'Trinidad and Tobago',
 'UA Emirates' : 'United Arab Emirates',
 'UK' : 'United Kingdom',
 'USA' : 'United States'}
drop_columns = ['Global rank', 'Available data']
print('loaded country fixes and drop columns')

Choropleth maps locate countries by ISO alpha-3 code. Several country names in the forecast data do not match the names in the ISO table, so we rename them before merging.
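
One way to find which names still need a fix is a merge with `indicator=True`, which labels each row by the frame it came from. The sketch below uses two tiny made-up frames rather than the real CSVs, and `find_unmatched` is a hypothetical helper that is not part of this notebook.

```python
import pandas as pd

def find_unmatched(left: pd.DataFrame, right: pd.DataFrame, on: str = 'Countries') -> list:
    """Return the sorted values of `on` that appear in `left` but not in `right`."""
    merged = left[[on]].merge(right=right[[on]], on=on, how='left', indicator=True)
    # Rows tagged 'left_only' found no partner in `right`
    return sorted(merged.loc[merged['_merge'] == 'left_only', on].unique())

# Toy stand-ins for the forecast and ISO tables (illustrative values only)
toy_forecast = pd.DataFrame({'Countries': ['USA', 'France', 'Burma']})
toy_iso = pd.DataFrame({'Countries': ['United States', 'France', 'Myanmar']})

unmatched = find_unmatched(toy_forecast, toy_iso)
print(unmatched)
```

Each name this reports is a candidate key for a mapping like `country_fixes`.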

In [None]:
inflation_df = pd.read_csv(filepath_or_buffer='/kaggle/input/imf-forecast-dataset/Inflation forecast.csv').drop(columns=drop_columns)
inflation_df['Countries'] = inflation_df['Countries'].replace(to_replace=country_fixes)
inflation_df.head()

In [None]:
from numpy import log10
df = inflation_df.merge(right=iso_df, on='Countries', how='inner')
# log10 is undefined for zero or negative forecasts; any such rows become -inf or NaN
df['log10_forecast'] = log10(df['Inflation forecast, 2023'])
df.info()

In [None]:
from plotly.express import histogram
histogram(data_frame=df, x='Inflation forecast, 2023', hover_name='Countries', color='region')

The distribution is strongly right-skewed: a few countries have forecast inflation above 35%.

In [None]:
from plotly.express import choropleth
choropleth(data_frame=df, locations='alpha-3', color='Inflation forecast, 2023', hover_name='Countries')

On the map, the skew shows up as a nearly uniform dark blue with a few countries in other colors.

In [None]:
histogram(data_frame=df, x='log10_forecast', hover_name='Countries', color='region')

The base-10 log of forecast inflation looks much closer to a normal distribution.
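
The effect of the log transform can also be checked numerically by comparing sample skewness before and after. This is a minimal sketch on made-up, strictly positive numbers, not on the forecast data itself.

```python
import numpy as np
import pandas as pd

# Made-up, right-skewed sample standing in for the forecast column
toy = pd.Series([2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 40.0, 100.0, 300.0])

raw_skew = toy.skew()             # skewness of the raw values
log_skew = np.log10(toy).skew()   # skewness after the base-10 log transform
print(f'raw skew: {raw_skew:.2f}, log10 skew: {log_skew:.2f}')
```

A skewness closer to zero after the transform is consistent with the more symmetric histogram.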

In [None]:
choropleth(data_frame=df, locations='alpha-3', color='log10_forecast', hover_name='Countries',
           hover_data=['Inflation forecast, 2023'], color_continuous_scale='Reds')

Coloring by the log spreads the countries more evenly across the color scale, and adding the raw forecast to the hover data keeps the original values readable.
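
The color bar on a log-colored map is labeled in log units, which are harder to read than percentages. One possible remedy is to place ticks at the logs of round values and label them in the original units. The helper below is a hypothetical sketch; the name `log_colorbar_ticks` and the candidate tick values are assumptions for illustration.

```python
import math

def log_colorbar_ticks(values, candidates=(1, 2, 5, 10, 20, 50, 100, 200, 500)):
    """Return (tickvals, ticktext) for a log10-colored scale, labeled in original units.

    `values` are the raw data; only positive entries are considered.
    `candidates` are assumed "nice" tick labels, kept if they fall inside the data range.
    """
    positive = [v for v in values if v > 0]
    low, high = min(positive), max(positive)
    chosen = [c for c in candidates if low <= c <= high]
    tickvals = [math.log10(c) for c in chosen]   # positions on the log10 scale
    ticktext = [f'{c}%' for c in chosen]         # labels in original units
    return tickvals, ticktext

tickvals, ticktext = log_colorbar_ticks([1.5, 4.0, 9.0, 35.0, 250.0])
print(ticktext)
```

With a Plotly Express figure `fig`, these lists could be applied with something like `fig.update_layout(coloraxis_colorbar=dict(tickvals=tickvals, ticktext=ticktext))`.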

In [None]:
budget_df = pd.read_csv(filepath_or_buffer='/kaggle/input/imf-forecast-dataset/Budget balance forecast.csv').drop(columns=drop_columns)
current_df = pd.read_csv(filepath_or_buffer='/kaggle/input/imf-forecast-dataset/Current account forecast.csv').drop(columns=drop_columns)
growth_df = pd.read_csv(filepath_or_buffer='/kaggle/input/imf-forecast-dataset/Economic growth forecast.csv').drop(columns=drop_columns)
investment_df = pd.read_csv(filepath_or_buffer='/kaggle/input/imf-forecast-dataset/Investment forecast.csv').drop(columns=drop_columns)
unemployment_df = pd.read_csv(filepath_or_buffer='/kaggle/input/imf-forecast-dataset/Unemployment rate forecast.csv').drop(columns=drop_columns)
for item in [budget_df, current_df, growth_df, investment_df, unemployment_df]:
    print(item.shape)

In [None]:
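frames = [budget_df, current_df, growth_df, inflation_df, investment_df, unemployment_df]
# Apply the name fixes to every frame before merging: inflation_df was already renamed above,
# so merging it with unrenamed frames would silently drop inflation values for the renamed countries
frames = [item.assign(Countries=item['Countries'].replace(to_replace=country_fixes)) for item in frames]
all_df = frames[0]
for item in frames[1:]:
    all_df = all_df.merge(right=item, on='Countries', how='left')
all_df = all_df.merge(on='Countries', how='inner', right=iso_df)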
all_df.head()

In [None]:
from plotly.express import imshow
imshow(img=all_df.corr(numeric_only=True))

Our numeric columns do not appear to be strongly correlated, either positively or negatively.
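
A heatmap gives a visual impression. To read off the strongest relationships directly, the correlation matrix can be flattened into a ranked list of pairs. This is a sketch on made-up numbers, and `strongest_pairs` is a hypothetical helper that is not part of this notebook.

```python
import numpy as np
import pandas as pd

def strongest_pairs(frame: pd.DataFrame, top: int = 3) -> pd.Series:
    """Return the `top` column pairs with the largest absolute Pearson correlation."""
    corr = frame.corr(numeric_only=True)
    # Keep only the strict upper triangle so each pair appears once and the diagonal is dropped
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    order = pairs.abs().sort_values(ascending=False).index
    return pairs.reindex(order).head(top)

# Made-up numbers for illustration only
toy = pd.DataFrame({'a': [1.0, 2.0, 3.0, 4.0, 5.0],
                    'b': [2.0, 4.0, 6.0, 8.0, 10.0],
                    'c': [5.0, 3.0, 4.0, 1.0, 2.0]})
result = strongest_pairs(toy)
print(result)
```

On the merged data this would be called as `strongest_pairs(all_df)`. Low absolute values at the top of the list would support the reading of the heatmap.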

In [None]:
from plotly.express import scatter
scatter(data_frame=all_df, x='Economic growth forecast, 2023', y='Current account forecast, 2023', color='region', hover_name='Countries',
       trendline='ols', trendline_scope='overall')

In [None]:
columns = ['Budget balance forecast, 2023', 'Current account forecast, 2023', 'Economic growth forecast, 2023', 'Inflation forecast, 2023', 
           'Investment forecast, 2023', 'Unemployment rate forecast, 2023',]
print(sorted(columns))

In [None]:
from plotly.express import scatter_matrix
scatter_matrix(data_frame=all_df[columns],)

This is what a scatter matrix (pairplot) looks like when correlations are low.

Let's make a big map of all the data before we go.

In [None]:
from plotly.offline import init_notebook_mode
from plotly.offline import iplot

def make_plot_data() -> dict:
    # One choropleth trace per forecast column; only the first starts visible, matching active=0 on the slider
    data = [dict(type='choropleth', locations=all_df['alpha-3'],
                 z=all_df[column],
                 hovertext=all_df['Countries'],
                 visible=(index == 0)) for index, column in enumerate(columns)]
    # Each slider step shows exactly one trace and hides the others
    steps = [dict(method='restyle', args=['visible', [other == column for other in columns]], label=column) for column in columns]
    layout = dict(geo=dict(scope='world'), sliders=[dict(active=0, pad={'t': 1}, steps=steps)], )
    return dict(data=data, layout=layout)

init_notebook_mode()
iplot(figure_or_data=make_plot_data())
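
The slider works by sending one list of booleans per step, with a single `True` marking the trace to show. A minimal sketch of those masks, using shortened made-up labels and a hypothetical `visibility_masks` helper:

```python
def visibility_masks(names):
    """Return one boolean list per name; mask i is True only at position i."""
    return [[other == name for other in names] for name in names]

masks = visibility_masks(['Budget', 'Growth', 'Inflation'])
for mask in masks:
    print(mask)
```

Each mask has the same length as the list of traces, so moving the slider swaps which single map is drawn.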