# Analysis and Exploration of Daily Exchange Rates per Euro 1999-2023

## Table of Contents

1. [**Introduction**](#1)
    - Project Description
    - Data Description
2. [**Acquiring and Loading Data**](#2)
    - Importing Libraries and Notebook Setup
    - Loading Data
    - Basic Data Exploration
    - Areas to Fix
3. [**Data Preprocessing**](#3)
4. [**Exploratory Data Analysis**](#4)
5. [**Conclusion**](#5)
    - Insights
    - Suggestions
    - Possible Next Steps
6. [**Epilogue**](#6) 
    - References
    - Versioning

---

# 1

## Introduction

![Dataset Cover](dataset-cover.jpg)

### Project Description

**Goal/Purpose:** 

What is this project about? What is the goal/purpose of this project? Why is it important for someone to read this notebook?
This project demonstrates the skills covered so far in Dataquest's Data Analysis in Python course, up to and including this assignment (Part 2: Intermediate Python and Pandas, Section 3: Telling Stories Using Data Visualization and Information Design, Guided Project: Storytelling Data Visualization on Exchange Rates).

The dataset describes daily Euro exchange rates between 4 January 1999 and 18 April 2023. The Euro (€) is the official currency in most of the countries of the European Union.

<p>&nbsp;</p>

**Questions to be Answered:**

- Question 1
- Question 2
- Question 3...

<p>&nbsp;</p>

**Assumptions/Methodology/Scope:** 

Briefly describe assumptions, processing steps, and the scope of this project.

<p>&nbsp;</p>

### Data Description

**Content:** 

This dataset is a CSV file of 6284 data points which contains daily euro foreign exchange rates observed on major foreign exchange trading venues at a certain point in time for 40 currencies. The rates are usually updated around 16:00 CET on every working day, except on TARGET closing days.
<p>&nbsp;</p>

**Description of Attributes:** 

Each column represents a different currency: Australian dollar, Bulgarian lev, Brazilian real, Canadian dollar, Swiss franc, Chinese yuan renminbi, Cypriot pound, Czech koruna, Danish krone, Estonian kroon, UK pound sterling, Greek drachma, Hong Kong dollar, Croatian kuna, Hungarian forint, Indonesian rupiah, Israeli shekel, Indian rupee, Iceland krona, Japanese yen, Korean won, Lithuanian litas, Latvian lats, Maltese lira, Mexican peso, Malaysian ringgit, Norwegian krone, New Zealand dollar, Philippine peso, Polish zloty, Romanian leu, Russian rouble, Swedish krona, Singapore dollar, Slovenian tolar, Slovak koruna, Thai baht, Turkish lira, US dollar, South African rand.

Notes: 
- The following currencies no longer exist because their countries have since adopted the euro (year of adoption in parentheses): Cypriot pound (2008), Estonian kroon (2011), Greek drachma (2001), Lithuanian litas (2015), Latvian lats (2014), Maltese lira (2008), Slovenian tolar (2007), Slovak koruna (2009).
- Since 2002, the Bulgarian lev has been pegged at a rate of 1 € = 1.9558 leva.

<p>&nbsp;</p>

**Acknowledgements:** 

This dataset is hosted on Kaggle. It was scraped by Daria Chemkaeva and is available on [Kaggle](https://www.kaggle.com/datasets/lsind18/euro-exchange-daily-rates-19992020?resource=download). The dataset is updated frequently; the copy used in this notebook was downloaded on 21 April 2023.

---

# 2

## Acquiring and Loading Data
### Importing Libraries and Notebook Setup

In [1]:
# Ignore warnings if needed
import warnings
warnings.filterwarnings('ignore')

# Data manipulation
import datetime
import numpy as np
import pandas as pd
import pandas.api.types as ptypes

# Visualizations
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Pandas settings
pd.options.display.max_columns = None
pd.options.display.max_colwidth = 60
pd.options.display.float_format = '{:,.3f}'.format

# Visualization settings
from matplotlib import rcParams
plt.style.use('fivethirtyeight')
rcParams['figure.figsize'] = (16, 5)   
rcParams['axes.spines.right'] = False
rcParams['axes.spines.top'] = False
rcParams['font.size'] = 12
# rcParams['figure.dpi'] = 300
rcParams['savefig.dpi'] = 300
plt.rc('xtick', labelsize=11)
plt.rc('ytick', labelsize=11)
custom_palette = ['#003f5c', '#444e86', '#955196', '#dd5182', '#ff6e54', '#ffa600']
custom_hue = ['#004c6d', '#346888', '#5886a5', '#7aa6c2', '#9dc6e0', '#c1e7ff']
custom_divergent = ['#00876c', '#6aaa96', '#aecdc2', '#f1f1f1', '#f0b8b8', '#e67f83', '#d43d51']
sns.set_palette(custom_palette)
%config InlineBackend.figure_format = 'retina'

### Loading Data

In [2]:
# Load DataFrame
file = 'euro-daily-hist_1999_2022.csv'
euro = pd.read_csv(file)

### Basic Data Exploration

#### Number of Rows and Columns

In [3]:
# Show rows and columns count
print(f"Rows count: {euro.shape[0]}\nColumns count: {euro.shape[1]}")

Rows count: 6284
Columns count: 41


#### Display First and Last Rows

In [24]:
# Look at first 5 rows
euro.head()

Unnamed: 0,Period\Unit:,[Australian dollar ],[Bulgarian lev ],[Brazilian real ],[Canadian dollar ],[Swiss franc ],[Chinese yuan renminbi ],[Cypriot pound ],[Czech koruna ],[Danish krone ],[Estonian kroon ],[UK pound sterling ],[Greek drachma ],[Hong Kong dollar ],[Croatian kuna ],[Hungarian forint ],[Indonesian rupiah ],[Israeli shekel ],[Indian rupee ],[Iceland krona ],[Japanese yen ],[Korean won ],[Lithuanian litas ],[Latvian lats ],[Maltese lira ],[Mexican peso ],[Malaysian ringgit ],[Norwegian krone ],[New Zealand dollar ],[Philippine peso ],[Polish zloty ],[Romanian leu ],[Russian rouble ],[Swedish krona ],[Singapore dollar ],[Slovenian tolar ],[Slovak koruna ],[Thai baht ],[Turkish lira ],[US dollar ],[South African rand ]
0,2023-04-18,1.6276,1.9558,5.3899,1.4679,0.9831,7.5436,,23.373,7.4513,,0.88143,,8.6129,,371.68,16319.23,3.9937,89.9955,149.7,146.89,1445.35,,,,19.7174,4.865,11.4675,1.7637,61.73,4.618,4.936,,11.2955,1.4614,,,37.623,21.281,1.0972,19.9299
1,2023-04-17,1.6356,1.9558,5.3861,1.4673,0.9812,7.5433,,23.345,7.452,,0.88373,,8.62,,371.7,16299.22,4.0005,90.0607,149.7,146.97,1444.13,,,,19.7526,4.8558,11.364,1.7717,61.449,4.6288,4.942,,11.3163,1.4615,,,37.753,21.284,1.0981,19.8937
2,2023-04-14,1.6309,1.9558,5.441,1.4725,0.9827,7.5761,,23.341,7.451,,0.8844,,8.6797,,373.68,16291.79,4.0426,90.3595,149.7,146.6,1438.43,,,,19.9598,4.8673,11.402,1.7588,61.122,4.6435,4.942,,11.3455,1.4665,,,37.66,21.422,1.1057,19.9352
3,2023-04-13,1.6343,1.9558,5.4117,1.4759,0.9804,7.5758,,23.271,7.4509,,0.88058,,8.6468,,374.55,16224.93,4.0277,90.1665,149.1,146.81,1441.15,,,,19.9235,4.8466,11.457,1.7624,60.972,4.632,4.944,,11.3886,1.4592,,,37.548,21.291,1.1015,19.971
4,2023-04-12,1.6377,1.9558,5.4635,1.4728,0.9853,7.5183,,23.421,7.4506,,0.88038,,8.5737,,376.23,16253.32,4.0138,89.6875,149.1,146.09,1448.1,,,,19.7972,4.8193,11.4745,1.7649,60.291,4.6631,4.939,,11.348,1.4538,,,37.391,21.098,1.0922,20.133


In [25]:
# Look at last 5 rows
euro.tail()

Unnamed: 0,Period\Unit:,[Australian dollar ],[Bulgarian lev ],[Brazilian real ],[Canadian dollar ],[Swiss franc ],[Chinese yuan renminbi ],[Cypriot pound ],[Czech koruna ],[Danish krone ],[Estonian kroon ],[UK pound sterling ],[Greek drachma ],[Hong Kong dollar ],[Croatian kuna ],[Hungarian forint ],[Indonesian rupiah ],[Israeli shekel ],[Indian rupee ],[Iceland krona ],[Japanese yen ],[Korean won ],[Lithuanian litas ],[Latvian lats ],[Maltese lira ],[Mexican peso ],[Malaysian ringgit ],[Norwegian krone ],[New Zealand dollar ],[Philippine peso ],[Polish zloty ],[Romanian leu ],[Russian rouble ],[Swedish krona ],[Singapore dollar ],[Slovenian tolar ],[Slovak koruna ],[Thai baht ],[Turkish lira ],[US dollar ],[South African rand ]
6279,1999-01-08,1.8406,,,1.7643,1.6138,,0.58187,34.938,7.4433,15.6466,0.7094,324.0,9.0302,,250.15,9321.63,,,80.99,130.09,1366.73,4.6643,0.6654,0.4419,11.4414,4.4295,8.59,2.1557,44.295,4.0363,1.314,27.2075,9.165,1.9537,188.84,42.56,42.559,0.372,1.1659,6.7855
6280,1999-01-07,1.8474,,,1.7602,1.6165,,0.58187,34.886,7.4431,15.6466,0.70585,324.4,9.0131,,250.09,9218.77,,,81.06,129.43,1337.16,4.6548,0.6627,0.4413,11.5511,4.4203,8.6295,2.1531,44.436,4.0165,1.309,26.9876,9.18,1.9436,188.8,42.765,42.1678,0.37,1.1632,6.8283
6281,1999-01-06,1.882,,,1.7711,1.6116,,0.582,34.85,7.4452,15.6466,0.7076,324.72,9.101,,250.67,9337.68,,,81.54,131.42,1359.54,4.6994,0.6649,0.442,11.4705,4.4637,8.7335,2.189,44.872,4.0065,1.317,27.4315,9.305,1.9699,188.7,42.778,42.6949,0.372,1.1743,6.7307
6282,1999-01-05,1.8944,,,1.7965,1.6123,,0.5823,34.917,7.4495,15.6466,0.7122,324.7,9.1341,,250.8,9314.51,,,81.53,130.96,1373.01,4.7174,0.6657,0.4432,11.596,4.4805,8.7745,2.2011,44.745,4.0245,1.317,26.5876,9.4025,1.9655,188.775,42.848,42.5048,0.373,1.179,6.7975
6283,1999-01-04,1.91,,,1.8004,1.6168,,0.58231,35.107,7.4501,15.6466,0.7111,327.15,9.1332,,251.48,9433.61,,,81.48,133.73,1398.59,4.717,0.6668,0.4432,11.6446,4.4798,8.855,2.2229,45.51,4.0712,1.311,25.2875,9.4696,1.9554,189.045,42.991,42.6799,0.372,1.1789,6.9358


#### Check Data Types

In [27]:
# Show data types
euro.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6284 entries, 0 to 6283
Data columns (total 41 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Period\Unit:              6284 non-null   object 
 1   [Australian dollar ]      6284 non-null   object 
 2   [Bulgarian lev ]          5882 non-null   object 
 3   [Brazilian real ]         6016 non-null   object 
 4   [Canadian dollar ]        6284 non-null   object 
 5   [Swiss franc ]            6284 non-null   object 
 6   [Chinese yuan renminbi ]  6016 non-null   object 
 7   [Cypriot pound ]          2346 non-null   object 
 8   [Czech koruna ]           6284 non-null   object 
 9   [Danish krone ]           6284 non-null   object 
 10  [Estonian kroon ]         3130 non-null   object 
 11  [UK pound sterling ]      6284 non-null   object 
 12  [Greek drachma ]          520 non-null    object 
 13  [Hong Kong dollar ]       6284 non-null   object 
 14  [Croatia

- 38 columns are **strings**
- 3 columns are **floats**
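
Because the `info()` listing above is cut off, the dtype tally can also be derived directly. The following is a minimal sketch on a hypothetical three-column frame (not the real data); on the full dataset the same two calls would be applied to `euro`.

```python
import pandas as pd

# Hypothetical miniature frame: two string columns and one float column.
toy = pd.DataFrame({
    'Period\\Unit:': ['2023-04-18', '2023-04-17'],
    '[US dollar ]': ['1.0972', '-'],
    '[Turkish lira ]': [21.281, 21.284],
})

# Count columns per dtype, then list the names of the float columns.
dtype_counts = toy.dtypes.value_counts()
float_cols = toy.select_dtypes(include='float').columns.tolist()
print(dtype_counts)
print(float_cols)
```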

#### Check Missing Data

In [28]:
# Print percentage of missing values
missing_percent = euro.isna().mean().sort_values(ascending=False)
print('---- Percentage of Missing Values (%) -----')
if missing_percent.sum():
    print(missing_percent[missing_percent > 0] * 100)
else:
    print(None)

---- Percentage of Missing Values (%) -----
[Greek drachma ]           91.725
[Slovenian tolar ]         66.820
[Cypriot pound ]           62.667
[Maltese lira ]            62.667
[Slovak koruna ]           58.498
[Estonian kroon ]          50.191
[Iceland krona ]           38.304
[Latvian lats ]            37.874
[Lithuanian litas ]        33.816
[Bulgarian lev ]            6.397
[Croatian kuna ]            5.458
[Russian rouble ]           4.615
[Brazilian real ]           4.265
[Indian rupee ]             4.265
[Chinese yuan renminbi ]    4.265
[Israeli shekel ]           4.265
[Romanian leu ]             0.987
[Turkish lira ]             0.987
dtype: float64
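
Note that `isna()` only counts true `NaN` entries. The summary statistics later in this section show `-` as the most frequent value in most string columns, so the figures above may understate the gaps. A hedged sketch that also counts the placeholder, using a hypothetical toy frame rather than the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical four-row frame mixing NaN and the "-" placeholder.
toy = pd.DataFrame({
    '[US dollar ]': ['1.0972', '-', '1.1057', '1.1015'],
    '[Greek drachma ]': [np.nan, np.nan, np.nan, '324.0'],
})

# A cell counts as missing if it is NaN or holds the "-" placeholder.
is_missing = toy.isna() | toy.eq('-')
missing_percent = is_missing.mean() * 100
print(missing_percent.sort_values(ascending=False))
```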


#### Check for Duplicate Rows

In [29]:
# Show number of duplicated rows
print(f"No. of entirely duplicated rows: {euro.duplicated().sum()}")

# Show duplicated rows
euro[euro.duplicated()]

No. of entirely duplicated rows: 0


Unnamed: 0,Period\Unit:,[Australian dollar ],[Bulgarian lev ],[Brazilian real ],[Canadian dollar ],[Swiss franc ],[Chinese yuan renminbi ],[Cypriot pound ],[Czech koruna ],[Danish krone ],[Estonian kroon ],[UK pound sterling ],[Greek drachma ],[Hong Kong dollar ],[Croatian kuna ],[Hungarian forint ],[Indonesian rupiah ],[Israeli shekel ],[Indian rupee ],[Iceland krona ],[Japanese yen ],[Korean won ],[Lithuanian litas ],[Latvian lats ],[Maltese lira ],[Mexican peso ],[Malaysian ringgit ],[Norwegian krone ],[New Zealand dollar ],[Philippine peso ],[Polish zloty ],[Romanian leu ],[Russian rouble ],[Swedish krona ],[Singapore dollar ],[Slovenian tolar ],[Slovak koruna ],[Thai baht ],[Turkish lira ],[US dollar ],[South African rand ]


#### Check Uniqueness of Data

In [30]:
# Print the number of unique values
num_unique = euro.nunique().sort_values()
print('---- Number of Unique Values -----')
print(num_unique)

---- Number of Unique Values -----
[Estonian kroon ]              2
[Bulgarian lev ]             106
[Greek drachma ]             323
[Maltese lira ]              426
[Danish krone ]              485
[Cypriot pound ]             498
[Lithuanian litas ]          771
[Latvian lats ]             1078
[Slovenian tolar ]          1377
[Iceland krona ]            1945
[Slovak koruna ]            2014
[Croatian kuna ]            2305
[Canadian dollar ]          3065
[Swiss franc ]              3215
[Australian dollar ]        3636
[Japanese yen ]             3679
[US dollar ]                3729
[UK pound sterling ]        3778
[Singapore dollar ]         3827
[Norwegian krone ]          3909
[Czech koruna ]             3969
[New Zealand dollar ]       4013
[Hungarian forint ]         4448
[Polish zloty ]             4545
[Romanian leu ]             4666
[Swedish krona ]            4850
[Malaysian ringgit ]        4994
[Israeli shekel ]           5088
[Chinese yuan renminbi ]    5283
[Turkish

#### Check Data Range

In [31]:
# Print summary statistics
euro.describe(include='all')

Unnamed: 0,Period\Unit:,[Australian dollar ],[Bulgarian lev ],[Brazilian real ],[Canadian dollar ],[Swiss franc ],[Chinese yuan renminbi ],[Cypriot pound ],[Czech koruna ],[Danish krone ],[Estonian kroon ],[UK pound sterling ],[Greek drachma ],[Hong Kong dollar ],[Croatian kuna ],[Hungarian forint ],[Indonesian rupiah ],[Israeli shekel ],[Indian rupee ],[Iceland krona ],[Japanese yen ],[Korean won ],[Lithuanian litas ],[Latvian lats ],[Maltese lira ],[Mexican peso ],[Malaysian ringgit ],[Norwegian krone ],[New Zealand dollar ],[Philippine peso ],[Polish zloty ],[Romanian leu ],[Russian rouble ],[Swedish krona ],[Singapore dollar ],[Slovenian tolar ],[Slovak koruna ],[Thai baht ],[Turkish lira ],[US dollar ],[South African rand ]
count,6284,6284,5882.0,6016,6284,6284,6016,2346.0,6284.0,6284,3130.0,6284,520.0,6284,5941,6284,6284,6016,6016,3877.0,6284,6284,4159.0,3904.0,2346.0,6284,6284,6284,6284,6284,6284,6222.0,5994,6284,6284,2085.0,2608,6284,6222.0,6284,6284
unique,6284,3636,106.0,5420,3065,3215,5283,498.0,3969.0,485,2.0,3778,323.0,5749,2305,4448,6188,5088,5696,,3679,5842,771.0,1078.0,426.0,6071,4994,3909,4013,5443,4545,,5705,4850,3827,1377.0,2014,5477,,3729,6036
top,2023-04-18,-,1.9558,-,-,-,-,0.5842,27.021,-,15.6466,-,340.75,-,-,-,-,-,-,,-,-,3.4528,0.696,0.4293,-,-,-,-,-,-,,-,-,-,239.5,-,-,,-,-
freq,1,62,4549.0,61,62,62,61,108.0,120.0,62,3074.0,62,9.0,62,61,62,62,62,61,,62,62,2732.0,194.0,690.0,62,62,62,62,62,62,,62,62,62,44.0,48,62,,62,62
mean,,,,,,,,,,,,,,,,,,,,107.72,,,,,,,,,,,,3.967,,,,,,,3.83,,
std,,,,,,,,,,,,,,,,,,,,34.223,,,,,,,,,,,,0.877,,,,,,,4.161,,
min,,,,,,,,,,,,,,,,,,,,68.07,,,,,,,,,,,,1.291,,,,,,,0.37,,
25%,,,,,,,,,,,,,,,,,,,,83.62,,,,,,,,,,,,3.549,,,,,,,1.727,,
50%,,,,,,,,,,,,,,,,,,,,89.26,,,,,,,,,,,,4.279,,,,,,,2.211,,
75%,,,,,,,,,,,,,,,,,,,,137.9,,,,,,,,,,,,4.553,,,,,,,3.925,,


### Areas to Fix
**Data Types**
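- Of the 38 **string** columns, the 37 exchange-rate columns should be converted to **floats** for calculations. The summary statistics show `-` as the most frequent value in most of these columns, so that placeholder has to be treated as missing before conversion.
- The remaining string column, `Period\Unit:` (column 0), should be converted to a **datetime** format.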
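
One possible way to handle both points is at load time, via the `na_values` and `parse_dates` parameters of `pd.read_csv`. The sketch below is an illustration only: it reads a tiny inline sample (values taken from the first rows shown above, with one `-` inserted for demonstration) instead of the real CSV file.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for the CSV file; the "-" entry
# is inserted here purely to illustrate the placeholder handling.
sample_csv = io.StringIO(
    "Period\\Unit:,[US dollar ],[Japanese yen ]\n"
    "2023-04-18,1.0972,146.89\n"
    "2023-04-17,-,146.97\n"
    "2023-04-14,1.1057,146.6\n"
)

# na_values turns "-" into NaN so the rate columns parse as floats;
# parse_dates converts the date column to datetime64.
sample = pd.read_csv(sample_csv, na_values=['-'], parse_dates=['Period\\Unit:'])
print(sample.dtypes)
```

Converting after loading works equally well; handling it in `read_csv` simply keeps the raw strings out of the DataFrame from the start.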

**Missing Data**
- As expected, some currencies are missing significant amounts of data. These currencies will be explored on a case-by-case basis, selected by a to-be-set threshold on the overall proportion of missing data.
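
A minimal sketch of how such a threshold could split the columns. The percentages are a subset copied from the missing-value output above; the 30% cutoff is a placeholder assumption, since the notebook has not yet fixed a value.

```python
import pandas as pd

# Missing-value percentages copied from the output above (subset).
missing_percent = pd.Series({
    '[Greek drachma ]': 91.725,
    '[Estonian kroon ]': 50.191,
    '[Lithuanian litas ]': 33.816,
    '[Bulgarian lev ]': 6.397,
    '[Turkish lira ]': 0.987,
})

# Hypothetical cutoff: the notebook has not yet fixed a threshold.
MISSING_THRESHOLD = 30.0

needs_review = missing_percent[missing_percent > MISSING_THRESHOLD].index.tolist()
keep = missing_percent[missing_percent <= MISSING_THRESHOLD].index.tolist()
print('Review case by case:', needs_review)
print('Keep as-is:', keep)
```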

**Duplicate Rows**
- No duplicated rows

**Uniqueness of Data**
- Currencies with low uniqueness will be cross-checked against the list of currencies with missing data to determine whether they are significant enough to include in this analysis.
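
The cross-check could look roughly like the sketch below. The counts and percentages are a subset copied from the outputs above (columns absent from the missing-value listing are entered as 0.0); the cutoff of 500 unique values is an arbitrary assumption for illustration.

```python
import pandas as pd

# Values copied from the uniqueness and missing-value outputs above (subset).
# Columns absent from the missing-value listing are entered as 0.0.
summary = pd.DataFrame(
    {
        'n_unique': [2, 106, 323, 485, 3729],
        'missing_pct': [50.191, 6.397, 91.725, 0.0, 0.0],
    },
    index=['[Estonian kroon ]', '[Bulgarian lev ]', '[Greek drachma ]',
           '[Danish krone ]', '[US dollar ]'],
)

# Hypothetical cutoff for "low uniqueness"; not a value set by the notebook.
UNIQUE_CUTOFF = 500

low_unique = summary[summary['n_unique'] < UNIQUE_CUTOFF]
# Low uniqueness that missing data does not explain deserves a closer look.
unexplained = low_unique[low_unique['missing_pct'] == 0].index.tolist()
print(low_unique)
print('Low uniqueness despite complete data:', unexplained)
```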

**Data Range**
- Will need to be observed again after converting the datatypes

---

# 3

## Data Preprocessing

Here you can add sections like:

- Renaming columns
- Drop Redundant Columns
- Changing Data Types
- Dropping Duplicates
- Handling Missing Values
- Handling Unreasonable Data Ranges
- Feature Engineering / Transformation

Use `assert` where possible to show that preprocessing is done.

### Rename Columns

In [11]:
# # Rename columns
# columns_to_rename = {}
# df.rename(columns=columns_to_rename, inplace=True)

In [12]:
# # Verify columns are renamed
# df.columns

### Changing Data Types

In [32]:
euro.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6284 entries, 0 to 6283
Data columns (total 41 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Period\Unit:              6284 non-null   object 
 1   [Australian dollar ]      6284 non-null   object 
 2   [Bulgarian lev ]          5882 non-null   object 
 3   [Brazilian real ]         6016 non-null   object 
 4   [Canadian dollar ]        6284 non-null   object 
 5   [Swiss franc ]            6284 non-null   object 
 6   [Chinese yuan renminbi ]  6016 non-null   object 
 7   [Cypriot pound ]          2346 non-null   object 
 8   [Czech koruna ]           6284 non-null   object 
 9   [Danish krone ]           6284 non-null   object 
 10  [Estonian kroon ]         3130 non-null   object 
 11  [UK pound sterling ]      6284 non-null   object 
 12  [Greek drachma ]          520 non-null    object 
 13  [Hong Kong dollar ]       6284 non-null   object 
 14  [Croatia

In [33]:
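# Convert columns to the right data types.
# The date column is excluded from the float cast; including it raises
# ValueError: could not convert string to float: '2023-04-17'
date_col = 'Period\\Unit:'
rate_cols = euro.columns.drop(date_col)
euro[rate_cols] = euro[rate_cols].apply(pd.to_numeric, errors='coerce')  # '-' placeholders become NaN
euro[date_col] = pd.to_datetime(euro[date_col], format='%Y-%m-%d')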

In [18]:
# # Verify conversion
# assert ptypes.is_string_dtype(col)
# assert ptypes.is_numeric_dtype(col)
# cols_to_check = []
# assert all(ptypes.is_datetime64_any_dtype(df[col]) for col in cols_to_check)
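
The verification template above could be filled in along the following lines. This is a sketch on a hypothetical two-row frame, assuming the `-` placeholder is coerced to `NaN` during conversion; on the real data the same assertions would run against `euro`.

```python
import pandas as pd
import pandas.api.types as ptypes

# Hypothetical two-row frame standing in for the real data.
toy = pd.DataFrame({
    'Period\\Unit:': ['2023-04-18', '2023-04-17'],
    '[US dollar ]': ['1.0972', '-'],
})

date_col = 'Period\\Unit:'
rate_cols = toy.columns.drop(date_col)

# Convert: non-numeric entries such as "-" become NaN, dates become datetime64.
toy[rate_cols] = toy[rate_cols].apply(pd.to_numeric, errors='coerce')
toy[date_col] = pd.to_datetime(toy[date_col], format='%Y-%m-%d')

# Verify conversion
assert ptypes.is_datetime64_any_dtype(toy[date_col])
assert all(ptypes.is_numeric_dtype(toy[col]) for col in rate_cols)
```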

### Dropping Duplicates

In [19]:
# # Drop entirely duplicated rows
# df.drop_duplicates(inplace=True, ignore_index=True)

In [20]:
# # Verify rows dropped
# assert df.duplicated().sum()==0

### Handling Missing Values

### Handling Unreasonable Data Ranges

In [21]:
# # Drop affected rows
# df = df.loc[~((df['A'] == 0) | (df['B'] > 100))].reset_index()

In [22]:
# # Verify rows dropped
# len(df)

### Feature Engineering / Transformation

---

# 4

## Exploratory Data Analysis

Here is where your analysis begins. You can add different sections based on your project goals.

### Exploring `Column Name`

In [23]:
# Code and visualization

**Observations**
- Ob 1
- Ob 2
- Ob 3

---

# 5

## Conclusion

### Insights 
State the insights/outcomes of your project or notebook.

### Suggestions

Make suggestions based on insights.

### Possible Next Steps
Areas to expand on:
- (if there is any)

---

# 6

## Epilogue

### References

This is how we use inline citation[<sup id="fn1-back">[1]</sup>](#fn1).

[<span id="fn1">1.</span>](#fn1-back) _subject (date)._ Title. Available at: https://website.com (Accessed: Date). 

> Use [https://www.citethisforme.com/](https://www.citethisforme.com/) to create citations.

### Versioning
Notebook and insights by (author).
- Version: 1.0.0
- Date: 