# Introduction

The euro really began its journey in 1999, when 11 countries (Austria, Belgium, Finland, France, Germany, Ireland, Italy, Luxembourg, the Netherlands, Portugal and Spain) fixed their exchange rates and created a new currency, with monetary policy passed to the European Central Bank. Today the euro is more than 20 years old.

Currently, the euro (€) is the official currency of 19 out of 27 EU member countries which together constitute the Eurozone, officially called the euro area.
* Euro area member countries  
Although all EU countries are part of the Economic and Monetary Union (EMU), 19 of them have replaced their national currencies with the single currency – the euro. These EU countries form the euro area, also known as the eurozone: **Austria, Belgium, Cyprus, Estonia, Finland, France, Germany, Greece, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, the Netherlands, Portugal, Slovakia, Slovenia, Spain**.
* Non-euro area member countries  
These are countries that have not yet adopted the euro but will join once they have met the necessary conditions. Most of them are member states that acceded to the Union in 2004, 2007 and 2013, after euro banknotes and coins entered circulation in 2002: **Bulgaria, Croatia, Czech Republic, Hungary, Poland, Romania, Sweden**.
* Member countries with an opt-out  
Occasionally, member states can negotiate an opt-out from European Union legislation or treaties and agree not to participate in certain policy areas. For the single currency, this is the case for Denmark, which kept its former currency after becoming a member of the EU. This list includes only **Denmark**.
* Outside the EU  
The euro is also the sole currency of **Montenegro** and **Kosovo**.  

source: [Which countries use the euro?](https://europa.eu/european-union/about-eu/euro/which-countries-use-euro_en)

Watch a short video about the history of the euro before we start the exploratory analysis:

In [None]:
from IPython.display import HTML

HTML('<center><iframe width="700" height="400" src="https://www.youtube.com/embed/dIUktr3Zpyk" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>')

# Read data

* import necessary modules: `pandas` (and `pandas.plotting`), `matplotlib` and `seaborn`; 
* register pandas formatters and converters with matplotlib;
* read the dataset `euro-daily-hist_1999_2020.csv`, parse the dates in the first column and show 5 sample rows.

In [None]:
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('white')

In [None]:
df_cur = pd.read_csv("/kaggle/input/euro-exchange-daily-rates-19992020/euro-daily-hist_1999_2020.csv", parse_dates=["Period\\Unit:"])
df_cur.sample(5)

# Preprocess data

### 1. Set index and rename columns
* clean headers and delete `[]` as well as unnecessary spaces;
* set datetime column as index (DatetimeIndex) of a dataframe;
* rename the `Period\Unit:` header to `DateSeries`.

In [None]:
names = str.maketrans('', '', '[]')
df_cur.columns = df_cur.columns.str.translate(names)
df_cur.columns = df_cur.columns.str.strip()
df_cur.set_index('Period\\Unit:', inplace=True)
df_cur.index.rename('DateSeries', inplace = True)
df_cur.info()
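
The header clean-up can be tried in isolation. The sketch below is only an illustration: the frame `raw` and its column labels are hypothetical and merely imitate the bracketed style of the dataset's headers.

```python
import pandas as pd

# Hypothetical headers imitating the bracketed style of the raw file
raw = pd.DataFrame(columns=["Period\\Unit:", "[US dollar ]", "[UK pound sterling ]"])

# Translation table that deletes the characters "[" and "]"
table = str.maketrans("", "", "[]")

# Remove the brackets first, then trim the leftover whitespace
cleaned = raw.columns.str.translate(table).str.strip()
print(list(cleaned))
```

Stripping after translating matters here: the space sits inside the brackets, so it only becomes trailing whitespace once the brackets are gone.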

### Dataframe:  
* has one DatetimeIndex (`DateSeries`) and 40 data columns, one per currency;
* not all 40 data columns were parsed as `float64`. The `object` dtype indicates that a column contains text, so these columns have to be converted to numeric values before they can be analysed or plotted;
* there are null values in some columns, for example Iceland krona, Greek drachma etc.
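
To see which columns ended up with the `object` dtype without reading the whole `info()` output, `select_dtypes` can help. This is a minimal sketch on a toy frame; the name `toy`, its values and the `-` placeholder are assumptions made for illustration, not taken from the dataset.

```python
import pandas as pd

# Toy frame: one numeric column and one text column (values are made up)
toy = pd.DataFrame({
    "US dollar": [1.17, 1.18, 1.16],
    "Greek drachma": pd.Series(["340.75", "-", "340.10"], dtype=object),
})

# Columns pandas stored as text and columns it stored as floats
object_cols = toy.select_dtypes(include="object").columns.tolist()
float_cols = toy.select_dtypes(include="float64").columns.tolist()

print("object columns:", object_cols)
print("float64 columns:", float_cols)
```

The same two `select_dtypes` calls should work unchanged on `df_cur`.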

### 2. Convert columns data to numeric
* convert all series to `float64`; values that cannot be parsed are set to NaN (`errors='coerce'`).

In [None]:
cols = list(df_cur)
df_cur[cols] = df_cur[cols].apply(pd.to_numeric, errors='coerce')
df_cur.info()
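
Because coercion silently turns unparseable text into NaN, it can be worth counting how many values were lost that way. The sketch below works on a toy frame; `toy` and its contents, including the `-` placeholder, are assumptions for illustration.

```python
import pandas as pd

# Toy text columns: one genuine gap (None) and one non-numeric placeholder ("-")
toy = pd.DataFrame({
    "US dollar": pd.Series(["1.17", "1.18", None], dtype=object),
    "Greek drachma": pd.Series(["340.75", "-", "340.10"], dtype=object),
})

missing_before = toy.isna().sum()

# Same call as in the notebook: unparseable entries become NaN
converted = toy.apply(pd.to_numeric, errors="coerce")

# NaN values that were introduced by the conversion itself
newly_missing = converted.isna().sum() - missing_before

print(converted.dtypes)
print(newly_missing)
```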

### 3. Process NaN values

* count NaN values in every column:

In [None]:
df_cur.isnull().sum(axis = 0)

* find rows with only NaN values: print them by index (DateSeries) and delete these rows:

In [None]:
# Index labels (dates) of the rows in which every currency is NaN
n = df_cur.index[df_cur.isnull().all(axis=1)]
print(n)
print('Number of NaN rows: {}'.format(len(n)))

In [None]:
df_cur = df_cur.drop(n)
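
If the list of dropped dates is not needed, `dropna(how='all')` is a shorter route to the same result. A small sketch on made-up data (the frame `toy` is hypothetical) comparing both approaches:

```python
import numpy as np
import pandas as pd

# Made-up rates; the middle row has no values at all
toy = pd.DataFrame(
    {
        "US dollar": [1.17, np.nan, 1.16],
        "UK pound sterling": [0.90, np.nan, np.nan],
    },
    index=pd.to_datetime(["2020-01-02", "2020-01-01", "2019-12-31"]),
)

# Approach used in the notebook: collect the all-NaN labels, then drop them
all_nan_dates = toy.index[toy.isnull().all(axis=1)]
by_index = toy.drop(all_nan_dates)

# Shorter alternative: drop rows only when every column is NaN
by_dropna = toy.dropna(how="all")

print(by_index.equals(by_dropna))
```

Rows that are only partly empty, like the last one here, are kept by both approaches.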

* You can fill the missing values of a currency that no longer exists with the last value it had before it was replaced by the euro (€): Cypriot pound (2007), Estonian kroon (2011), Greek drachma (2002), Lithuanian litas (2015), Latvian lats (2014), Maltese lira (2008), Slovenian tolar (2007), Slovak koruna (2009). For now this cell is commented out.


In [None]:
#df_cur = df_cur.bfill()  # same effect as fillna(method='backfill'), which recent pandas versions deprecate
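
Whether a backward or a forward fill carries the last quoted value depends on how the rows are sorted. The sketch below uses a toy series sorted newest first; that ordering, the name `s` and the values are assumptions for illustration, so check the sort order of `df_cur` before filling.

```python
import numpy as np
import pandas as pd

# Toy series sorted newest-first: the currency stops being quoted in 2007
s = pd.Series(
    [np.nan, np.nan, 239.64, 239.50],
    index=pd.to_datetime(["2007-01-03", "2007-01-02", "2006-12-29", "2006-12-28"]),
    name="Slovenian tolar",
)

# Newest-first data: a backward fill pulls the older value up into the gaps
filled_desc = s.bfill()

# Oldest-first data: a forward fill pushes the last value into the gaps
filled_asc = s.sort_index().ffill()

print(filled_desc)
print(filled_asc)
```

Both results contain the same values per date; only the direction of the fill differs with the sort order.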

### Get final description of every column in a dataframe:

In [None]:
df_cur.describe(include='all')

# Melt data
* Change the structure of the dataframe:   
     reset the index and melt the currency columns into two columns: `Currency name`, holding the currency, and `Value`, holding its rate.

In [None]:
df_cur1 = df_cur.reset_index()
df_melted=df_cur1.melt(id_vars=['DateSeries'], var_name='Currency name', value_name='Value')
df_melted.head(5)
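
The reshaping is easier to follow on a tiny example. This sketch uses a made-up two-currency frame (`wide` and its values are hypothetical) and also shows that `pivot` reverses the melt.

```python
import pandas as pd

# Made-up wide frame: one row per date, one column per currency
wide = pd.DataFrame(
    {"US dollar": [1.12, 1.11], "UK pound sterling": [0.85, 0.84]},
    index=pd.DatetimeIndex(["2020-01-02", "2020-01-03"], name="DateSeries"),
)

# Wide -> long: one row per (date, currency) pair
long = wide.reset_index().melt(
    id_vars=["DateSeries"], var_name="Currency name", value_name="Value"
)
print(long)

# Long -> wide again, to confirm no information was lost
wide_again = long.pivot(index="DateSeries", columns="Currency name", values="Value")
print(wide_again)
```

The long format is what `sns.lineplot` expects when a `hue` column is used.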

# Example: EUR/USD and EUR/GBP

* Create a new dataframe `dataUSDGBP` containing only the values of the US dollar and the UK pound sterling and get 5 sample rows;
* Plot both currencies on one graph as rates per euro.

In [None]:
dataUSDGBP = df_melted.loc[(df_melted['Currency name'] == 'US dollar') | (df_melted['Currency name'] == 'UK pound sterling')]
dataUSDGBP.sample(5)

In [None]:
fig = plt.figure(figsize=(15,8))
plt.grid(which='major', linewidth = 2)
plt.minorticks_on()
plt.grid(which='minor', linewidth = 0.5)
sns.lineplot(x='DateSeries', y='Value', hue='Currency name', data = dataUSDGBP)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.);

The graph shows that the EUR/GBP rate stays below 1 over the whole period, so 1 GBP was never less expensive than 1 EUR, while the EUR/USD rate is above 1 for most of it, so 1 USD often was.
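
A reading taken from a plot can be confirmed numerically. The sketch below shows one way to do it on made-up long-format data; the frame `toy` and its values are assumptions and do not come from the dataset.

```python
import pandas as pd

# Made-up long-format rates per 1 EUR
toy = pd.DataFrame({
    "Currency name": ["US dollar"] * 3 + ["UK pound sterling"] * 3,
    "Value": [0.85, 1.20, 1.45, 0.62, 0.80, 0.95],
})

# Lowest and highest rate per currency
summary = toy.groupby("Currency name")["Value"].agg(["min", "max"])

# Share of days on which 1 EUR bought more than 1 unit of the currency
share_above_one = (
    toy.assign(above=toy["Value"] > 1).groupby("Currency name")["above"].mean()
)

print(summary)
print(share_above_one)
```

Running the same two statements on `dataUSDGBP` should give the actual figures: a maximum below 1 means the currency was never cheaper than the euro.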

### Find MAXIMUM and MINIMUM 
* Five largest and five smallest values for USD (US dollar) over the years (each sorted in descending order):

In [None]:
# Select the USD rows and index them by date in one step (no in-place change on a slice)
dataUSD = dataUSDGBP.loc[dataUSDGBP['Currency name'] == 'US dollar'].set_index('DateSeries')
print('------USD: 5 largest values by dates------')
print(dataUSD['Value'].nlargest().sort_values(ascending = False))
print('------USD: 5 smallest values by dates-----')
print(dataUSD['Value'].nsmallest().sort_values(ascending = False))

* Five largest and five smallest values for GBP (UK pound sterling) over the years (each sorted in descending order):

In [None]:
# Select the GBP rows and index them by date in one step (no in-place change on a slice)
dataGBP = dataUSDGBP.loc[dataUSDGBP['Currency name'] == 'UK pound sterling'].set_index('DateSeries')
print('------GBP: 5 largest values by dates------')
print(dataGBP['Value'].nlargest().sort_values(ascending = False))
print('------GBP: 5 smallest values by dates-----')
print(dataGBP['Value'].nsmallest().sort_values(ascending = False))
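
When only the single extreme and its date are needed, `idxmax` and `idxmin` return the index label directly. A minimal sketch on a made-up date-indexed series (the name `usd` and the values are hypothetical):

```python
import pandas as pd

# Made-up rates, for illustration only
usd = pd.Series(
    [1.10, 1.59, 0.83, 1.20],
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"]),
    name="Value",
)

# Dates on which the highest and lowest rate occurred
peak_date = usd.idxmax()
trough_date = usd.idxmin()

print(f"max {usd.max()} on {peak_date.date()}")
print(f"min {usd.min()} on {trough_date.date()}")
```

The same calls should apply to `dataUSD['Value']` and `dataGBP['Value']`, since both are indexed by `DateSeries`.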

# Example: EUR/SIT
* Check a currency that no longer exists: the Slovenian tolar was replaced by the euro in 2007;
* plot the graph.

In [None]:
dataSIT = df_melted.loc[(df_melted['Currency name'] == 'Slovenian tolar')]
dataSIT;

In [None]:
fig = plt.figure(figsize=(15,8))

plt.grid(which='major', linewidth = 2)
plt.minorticks_on()
plt.grid(which='minor', linewidth = 0.5)
sns.lineplot(x='DateSeries', y='Value', hue='Currency name', data = dataSIT)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.);
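
For a discontinued currency it can be useful to find the period in which it was actually quoted. The sketch below uses a toy series; `sit`, its dates and its values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Toy series: no quotes after the end of 2006 (made-up values)
sit = pd.Series(
    [np.nan, np.nan, 239.64, 239.65],
    index=pd.to_datetime(["2007-01-03", "2007-01-02", "2006-12-29", "2006-12-28"]),
    name="Value",
)

# Keep only the days with a quote, then read off the first and last date
quoted = sit.dropna()
first_quote = quoted.index.min()
last_quote = quoted.index.max()

print(f"quoted from {first_quote.date()} to {last_quote.date()} ({len(quoted)} days)")
```

On the melted data, `dataSIT.dropna(subset=['Value'])` would restrict the frame to the same quoted period.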

# Conclusion

So we carried out a small EDA on the dataset [Daily Exchange Rates per Euro 1999-2020](https://www.kaggle.com/lsind18/euro-exchange-daily-rates-19992020) 💶 💶 💶   That's all!

### Please upvote my notebook if you find it useful or fork it 🙋🎓
Feel free to give any suggestions to improve my code.
## To make further predictions, don't forget to normalize the data!
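
As a starting point, two common ways to normalize are min-max scaling and z-score standardization. This is a minimal sketch on made-up rates (the frame `rates` is hypothetical); which method fits best depends on the model used later.

```python
import pandas as pd

# Made-up rates on very different scales
rates = pd.DataFrame({
    "US dollar": [1.0, 1.2, 1.4],
    "Japanese yen": [100.0, 120.0, 140.0],
})

# Min-max scaling: every column is mapped onto the range [0, 1]
min_max = (rates - rates.min()) / (rates.max() - rates.min())

# Z-score standardization: every column gets mean 0 and standard deviation 1
z_score = (rates - rates.mean()) / rates.std()

print(min_max)
print(z_score)
```

When a model is evaluated on held-out data, the scaling parameters (minimum, maximum, mean, standard deviation) would normally be computed on the training part only.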