**Tax on Personal Income** is defined as the taxes levied on the net income (gross income minus allowable tax reliefs) and capital gains of individuals. This indicator relates to government as a whole (all government levels) and is measured in percentage both of GDP and of total taxation.

https://data.oecd.org/tax/tax-on-personal-income.htm#:~:text=Tax%20on%20personal%20income%20is,GDP%20and%20of%20total%20taxation.

In this kernel I'll try to answer the following questions:
1. Which countries have the highest and lowest taxes on personal income?
2. Over the last few years, has there been a trend in personal income taxes?

I'll try to answer these questions through clear visualizations.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt # visualization package
import seaborn as sns # visualization package based on matplotlib

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

%matplotlib inline

In [None]:
it_data = pd.read_csv('/kaggle/input/oecd-tax-on-personal-income/Data Set-3.csv')
it_data.head()

The DataFrame appears to have 698 rows and 8 columns, and the 'Flag Codes' column contains only null values.

In [None]:
display(it_data.shape)
it_data.info()
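
A quick way to confirm that a column such as 'Flag Codes' is entirely null is to test each column directly instead of reading it off `info()`. The sketch below uses a small made-up frame named `toy` (a hypothetical stand-in for `it_data`, with invented values), so it runs without the Kaggle input file.

```python
import numpy as np
import pandas as pd

# Toy stand-in for it_data; names mirror the dataset, values are invented.
toy = pd.DataFrame({
    "LOCATION": ["AAA", "AAA", "BBB"],
    "TIME": [2000, 2001, 2000],
    "Value": [9.5, 9.7, 4.1],
    "Flag Codes": [np.nan, np.nan, np.nan],
})

# Collect the columns in which every entry is missing.
all_null_cols = [col for col in toy.columns if toy[col].isna().all()]

# Drop those columns; toy.dropna(axis=1, how="all") is a one-step alternative.
toy_clean = toy.drop(columns=all_null_cols)

print(all_null_cols)
print(toy_clean.shape)
```

The same two lines could be applied to `it_data` if the empty column gets in the way later on.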

In [None]:
# Average 'Value' per country across all available years
location_taxrate = it_data.pivot_table(values=['Value'], index=['LOCATION'], aggfunc='mean').reset_index()
location_taxrate.sort_values(by=['Value'], ascending=True, inplace=True)
location_taxrate.rename(columns={'Value': 'Personal Income Tax as % of GDP', 'LOCATION': 'Country'}, inplace=True)
# Drop the 'OAVG' aggregate row (presumably the OECD average) so only individual countries remain
location_taxrate = location_taxrate[location_taxrate['Country'] != 'OAVG'].reset_index(drop=True)
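
For comparison, the per-country average can also be written as a single `groupby` chain that removes the aggregate row before averaging. This is only a sketch on made-up data: `toy` and its numbers are hypothetical, and only the column names and the 'OAVG' code come from the dataset.

```python
import pandas as pd

# Toy stand-in for it_data with two countries and the OAVG aggregate.
toy = pd.DataFrame({
    "LOCATION": ["AAA", "AAA", "BBB", "BBB", "OAVG", "OAVG"],
    "Value": [10.0, 12.0, 4.0, 6.0, 8.0, 8.5],
})

country_mean = (
    toy[toy["LOCATION"] != "OAVG"]           # keep individual countries only
    .groupby("LOCATION", as_index=False)["Value"]
    .mean()                                   # average over all years
    .sort_values("Value")                     # lowest average first
    .rename(columns={"LOCATION": "Country",
                     "Value": "Personal Income Tax as % of GDP"})
    .reset_index(drop=True)
)

print(country_mean)
```

Filtering first makes it explicit that the aggregate never enters the averages.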

In [None]:
g = sns.catplot(data=location_taxrate, y='Country', x='Personal Income Tax as % of GDP', height=7.5, aspect=1.5, kind='bar', palette='coolwarm')
g.set_xticklabels(rotation=0)
# Dashed line: unweighted mean of the per-country averages
plt.axvline(location_taxrate['Personal Income Tax as % of GDP'].mean(), ymin=0, ymax=1, color='#000000', linestyle='--', label='OECD avg personal income tax (% of GDP)')
plt.legend()
plt.title('Countries vs OECD Avg')
plt.show()

In [None]:
# Rows: countries, columns: years
over_years = it_data.pivot_table(columns='TIME', index='LOCATION', values='Value')
# Fill gaps from the nearest available year (a missing first year takes the first reported value)
over_years = over_years.bfill(axis=1).ffill(axis=1)
# Relative change between the first and last year; the rename assumes the last year is 2018
over_years_change = over_years.pct_change(periods=it_data['TIME'].max() - it_data['TIME'].min(), axis=1).dropna(axis=1, how='all').rename(columns={2018:'% Change since 2000'})
over_years_change.sort_values(by=['% Change since 2000'], inplace=True)
over_years_change.reset_index(inplace=True)
# Convert the fraction to a percentage
over_years_change['% Change since 2000'] = over_years_change['% Change since 2000'] * 100
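
The change computed above is effectively "last year versus first year" for each country, so it can also be sketched by comparing the first and last columns directly. The frame `toy` below is made up for illustration (hypothetical countries and values). Note that the country with no value in the first year ends up being compared against its first reported year, because the gap is back-filled.

```python
import numpy as np
import pandas as pd

# Toy wide table: rows are countries, columns are years.
toy = pd.DataFrame(
    {2000: [10.0, np.nan], 2001: [11.0, 4.0], 2002: [12.0, 5.0]},
    index=pd.Index(["AAA", "BBB"], name="LOCATION"),
)

# Fill gaps along each row: backward first, then forward.
filled = toy.bfill(axis=1).ffill(axis=1)

# Percentage change between the first and the last column.
first, last = filled.iloc[:, 0], filled.iloc[:, -1]
pct = ((last - first) / first * 100).rename("% Change")

print(pct.sort_values())
```

Written this way, the result does not depend on the year columns being contiguous or on the last year being hard-coded.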

In [None]:
l = sns.catplot(data=over_years_change, x='LOCATION', y='% Change since 2000', height=7.5, aspect=1.5, kind='bar', palette='coolwarm')
l.set_xticklabels(rotation=90)
l.set(xlabel='Countries')
plt.title('% Change since 2000 in personal income tax (as % of GDP)')
plt.show()