# <center>Class 5</center>

## Plotting

### Getting Data: I/O and Manipulation

Load exchange rate data from JSON.

In [None]:
import json
import os

In [None]:
file = os.path.join(os.pardir, 'data', 'currency_rates.json')

In [None]:
with open(file, 'r') as rp:
    dc_ccies = json.load(rp)

Note the difference between the two methods (see the JSON part of Class 2):
- `json.loads()` **takes a string** and parses it into a Python object (here, a dictionary).
- `json.load()` **reads from an open file object** and parses its contents. (Under the hood it reads the file and calls `json.loads()` on the text.)
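
The relationship between the two functions can be checked with a small sketch. The JSON text and the in-memory file (`io.StringIO`) below are made-up stand-ins for `currency_rates.json`, not the actual course data.

```python
import io
import json

# Made-up JSON text standing in for the contents of a file
raw_text = '{"Date": [1577836800, 1577923200], "EURUSD": [1.12, 1.11]}'

# json.loads() parses a string that is already in memory
from_string = json.loads(raw_text)

# json.load() reads from a file-like object and then parses what it read
file_like = io.StringIO(raw_text)
from_file = json.load(file_like)

print(type(from_string))
print(from_string == from_file)
```

Both calls should produce the same dictionary; the only difference is where the text comes from.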

In [None]:
type(dc_ccies)

In [None]:
dc_ccies.keys()

In [None]:
for key in dc_ccies.keys():
    print(key, type(dc_ccies[key]))

#### Exercise

- Print the first ten elements of each list. 
- Find the data type of the list elements. (Elements within each of these lists are of the same data type.)

In [None]:
for key in dc_ccies.keys():
    print(key, dc_ccies[key][0:10])

for key in dc_ccies.keys():
    print(key, type(dc_ccies[key][0]))

In [None]:
dc_ccies['Date'][0]

The `Date` key contains UNIX timestamps. Use ***list comprehension*** to convert these timestamps to `datetime.date` objects.

In [None]:
# Some help
import datetime

x = dc_ccies['Date'][0]
print(x)
print(datetime.date.fromtimestamp(x))

In [None]:
dc_ccies['Date'] = [datetime.date.fromtimestamp(x) for x in dc_ccies['Date']]
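
One caveat: `datetime.date.fromtimestamp()` interprets the timestamp in the machine's local time zone, so a value close to midnight can land on different dates on different machines. If the timestamps are meant as UTC (an assumption; the data file does not say), a minimal sketch of a time-zone-independent conversion could look like this. The sample timestamps are made up for illustration.

```python
import datetime

# Made-up UNIX timestamps (seconds since 1970-01-01 00:00:00 UTC)
sample_timestamps = [0, 86400, 1577836800]

# Convert to an aware datetime in UTC first, then keep only the date part
utc_dates = [
    datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc).date()
    for ts in sample_timestamps
]
print(utc_dates)
```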

### Plotting with Matplotlib

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

`matplotlib` is the primary charting library of Python. It is a massive library that offers so much that it can easily become overwhelming. Creating a basic chart is fairly simple, but sometimes even a little customization requires a deep dive into the API.

One of the reasons we cover matplotlib here is that many other libraries are built on the matplotlib API, and plotting charts directly from Pandas dataframes is easier if we have a basic understanding of matplotlib's mechanics. There are other popular charting packages, such as `seaborn` and `Plotly`, but we think that a real Pythonista should be able to work with matplotlib objects.

A good summary of the hows and whys of matplotlib can be found here: [https://heartbeat.comet.ml/introduction-to-matplotlib-data-visualization-in-python-d9143287ae39](https://heartbeat.comet.ml/introduction-to-matplotlib-data-visualization-in-python-d9143287ae39).

There are two ways of creating a matplotlib plot.

**1. the functional approach**

In [None]:
x = range(0, 10)
y = [i ** 2 for i in x]

plt.plot(x,y)
plt.title('x-square')
plt.xlabel('x')
plt.ylabel('y')
plt.show()

In [None]:
plt.subplot(1,2,1) # nrows, ncols, index of the next plot starting with index 1 from the top left and increasing to the right
plt.plot(x, y, 'r--') # 'r' stands for red, '--' stands for dash
plt.title('x-square')
plt.subplot(1,2,2)
plt.plot(y, x, 'g*-')
plt.title('x-root');  # a trailing semicolon suppresses the text output of the last line; the inline backend displays the plot either way

Matplotlib color options can be found here: [https://matplotlib.org/stable/gallery/color/named_colors.html](https://matplotlib.org/stable/gallery/color/named_colors.html)
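
Format strings such as `'r--'` and `'g*-'` pack up to three settings into one argument: color, marker, and line style. As a rough sketch, the same line can be requested with explicit keyword arguments, which is often easier to read. The data is the same x-squared example.

```python
import matplotlib.pyplot as plt

x = range(0, 10)
y = [i ** 2 for i in x]

# Short form: green ('g'), star markers ('*'), solid line ('-')
short_line, = plt.plot(x, y, 'g*-')

# Long form: the same settings spelled out as keyword arguments
long_line, = plt.plot(x, y, color='g', marker='*', linestyle='-')

# Both Line2D objects should report the same style settings
print(short_line.get_color(), short_line.get_marker(), short_line.get_linestyle())
print(long_line.get_color(), long_line.get_marker(), long_line.get_linestyle())
```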

**2. the object-oriented API**

A plot has two key components: `Figure` and `Axes`.

The `Figure` is the top-level container that acts as the window or page on which everything is drawn. It can contain multiple independent plots (`Axes`), a suptitle (a centered title for the whole figure), a legend, a color bar, etc.

In [None]:
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.9, 0.9])

The `Axes` is the area on which we plot our data and any labels/ticks associated with it. Each Axes has an X-Axis and a Y-Axis.

In [None]:
x = range(0, 10)
y = [i ** 2 for i in x]

fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.9, 0.9]) # left, bottom, width, height (range 0 to 1)

axes.plot(x, y, 'r')

axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('Simple x-squared as an OOP Plot');
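
For a regular grid of plots, `plt.subplots()` is a common object-oriented counterpart to the `plt.subplot()` calls of the functional approach: it returns the `Figure` together with its `Axes` objects. A minimal sketch using the same x-squared data:

```python
import matplotlib.pyplot as plt

x = range(0, 10)
y = [i ** 2 for i in x]

# One row, two columns; each Axes is drawn on independently
fig, (ax_left, ax_right) = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))

ax_left.plot(x, y, 'r--')
ax_left.set_title('x-square')

ax_right.plot(y, x, 'g*-')
ax_right.set_title('x-root')

# Centered title for the whole Figure, above the individual Axes titles
fig.suptitle('Two Axes in one Figure')

# The Figure keeps a list of the Axes it contains
print(len(fig.axes))
```

Compared with `fig.add_axes()`, this leaves the positioning of each `Axes` to matplotlib, which is usually what we want unless we need an inset.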

We can do a plot within a plot.

In [None]:
fig = plt.figure(figsize = (6,10))

axes1 = fig.add_axes([0, 0, 0.8, 0.8]) # main axes
axes2 = fig.add_axes([0.1, 0.4, 0.4, 0.3]) # inset axes: left and bottom of the lower-left corner, width, height

# main figure
axes1.plot(x, y, 'r')
axes1.set_xlabel('x')
axes1.set_ylabel('y')
axes1.set_title('square')

# inset
axes2.plot(y, x, 'g')
axes2.set_xlabel('y')
axes2.set_ylabel('x')
axes2.set_title('root');

Charting currency movements, basic plot. 

In [None]:
date = dc_ccies['Date']
fxrate = dc_ccies['EURUSD']

fig = plt.figure(figsize=(10,6)) # figsize = width, height in inches
ax = fig.add_axes([0,0,1,1])
ax.set_title('EURUSD')
ax.plot(date, fxrate);

Adding additional chart elements.
- y-axis limits
- legends

In [None]:
date = dc_ccies['Date']
fxrate = dc_ccies['EURUSD']

fig = plt.figure(figsize=(8,5))
ax = fig.add_axes([0,0,1,1])
ax.set_title('EURUSD')
ax.plot(date, fxrate, label = 'EURUSD closing prices')
ax.set_ylim(0.9,1.3)
plt.legend(loc = 'upper left');

- average line

In [None]:
import numpy as np

In [None]:
date = dc_ccies['Date']
fxrate = dc_ccies['EURUSD']
meanrate = np.mean(fxrate)

fig, ax = plt.subplots(figsize = (10,6)) # plt.subplots() creates the Figure and its Axes in one call
ax.hlines(y = meanrate, xmin = date[0], xmax = date[-1], linestyle = '--', label = 'avg')
ax.plot(date, fxrate, label = 'EURUSD', linewidth = 2)
ax.set_ylim(0.9,1.3)
ax.set_title('EURUSD against its average')
plt.legend();
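
`ax.hlines()` needs explicit start and end points in data coordinates. When the line should simply span the full width of the plot, `ax.axhline()` is an alternative. The sketch below uses made-up rates rather than the course data.

```python
import datetime

import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data (hypothetical values, not from currency_rates.json)
start = datetime.date(2024, 1, 1)
sample_dates = [start + datetime.timedelta(days=i) for i in range(5)]
sample_rates = [1.08, 1.10, 1.09, 1.12, 1.11]
sample_mean = np.mean(sample_rates)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(sample_dates, sample_rates, label='sample rate', linewidth=2)

# axhline spans the full width of the Axes, whatever the x-limits are
avg_line = ax.axhline(y=sample_mean, linestyle='--', color='gray', label='avg')
ax.legend()
```

The trade-off is that `axhline()` cannot start or stop at a particular date, so `hlines()` remains the tool of choice when the line should cover only part of the period.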

#### Exercise

- Plot the same graph but for the average use only the last 200 days of data and position the average line accordingly.
- The average line should be black. 

In [None]:
# Your solution goes here

- secondary y-axis

In [None]:
date = dc_ccies['Date']
eurusd = dc_ccies['EURUSD']
usdjpy = dc_ccies['USDJPY']


fig, ax1 = plt.subplots(figsize = (10,6))

ax1.plot(date, eurusd, color = 'k')
ax1.xaxis_date()
ax1.set_ylabel("EURUSD", color = 'k')
ax2 = ax1.twinx()
ax2.plot(date, usdjpy, color = "firebrick")
ax2.set_ylabel("USDJPY", color = "firebrick")
plt.title('EURUSD and USDJPY, past two years');
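
With two y-axes, each `Axes` keeps its own list of labeled lines, so a legend created on one of them lists only that axis's lines. One way to get a single legend is to collect the handles from both axes. A sketch with made-up numbers:

```python
import matplotlib.pyplot as plt

# Made-up sample data (hypothetical values)
days = list(range(5))
sample_eurusd = [1.08, 1.10, 1.09, 1.12, 1.11]
sample_usdjpy = [148.0, 147.5, 149.2, 150.1, 149.8]

fig, ax1 = plt.subplots(figsize=(6, 4))
ax1.plot(days, sample_eurusd, color='k', label='EURUSD')

ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
ax2.plot(days, sample_usdjpy, color='firebrick', label='USDJPY')

# Gather the labeled lines from both Axes and build one legend from them
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
legend = ax2.legend(handles1 + handles2, labels1 + labels2, loc='upper left')
```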

#### Exercise

Construct a plot within a plot.
- The main plot is USDJPY for the whole period.
- The inset plot is positioned in the upper-left section of the main plot and shows EURUSD for the last 200 days of data.
- Add a chart title and x- and y-axis labels to both plot elements.

In [None]:
date = dc_ccies['Date']
usdjpy = dc_ccies['USDJPY']
eurusd = dc_ccies['EURUSD']

fig = plt.figure(figsize = (10,6))

axes1 = fig.add_axes([0, 0, 1, 1]) # main axes
axes2 = fig.add_axes([0.075, 0.55, 0.40, 0.4]) # inset axes

# main figure
axes1.plot(date, usdjpy)
axes1.set_xlabel('date')
axes1.set_ylabel('USDJPY')
axes1.set_title('USDJPY exchange rate')


# inset
axes2.plot(date[-200:], eurusd[-200:], color = 'black')
axes2.set_xlabel('date', fontsize = 8)
axes2.set_ylabel('EURUSD',  fontsize = 8)
axes2.set_title('EURUSD exchange rate', fontsize = 8);

In [None]:
import pandas as pd

In [None]:
df = pd.DataFrame(dc_ccies)

In [None]:
df.head()

In [None]:
df.info()
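
Pandas plotting builds on matplotlib: `DataFrame.plot()` draws on a matplotlib `Axes` and returns it, so everything covered so far still applies. A minimal sketch with a small made-up dataframe (the column names mirror the course data, the values are hypothetical):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data (hypothetical values)
sample_df = pd.DataFrame({
    'Date': pd.date_range('2024-01-01', periods=5, freq='D'),
    'EURUSD': [1.08, 1.10, 1.09, 1.12, 1.11],
})

fig, ax = plt.subplots(figsize=(6, 4))

# Passing ax= tells pandas to draw on an Axes we created ourselves
returned_ax = sample_df.plot(x='Date', y='EURUSD', ax=ax)

# The returned object is a regular matplotlib Axes
returned_ax.set_title('EURUSD (sample data)')
returned_ax.set_ylabel('rate')
```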

#### Histograms

- histograms of daily price changes

In [None]:
df['EURUSD_pct_chg'] = df.EURUSD.pct_change(periods=1)
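
`pct_change(periods=1)` computes the relative change from the previous row, that is `price[t] / price[t-1] - 1`, expressed as a fraction rather than in percent. The first row has no previous value, so it becomes `NaN`. A small check on made-up prices:

```python
import pandas as pd

# Made-up prices: +10% on the second day, then -10% on the third
prices = pd.Series([100.0, 110.0, 99.0])

auto = prices.pct_change(periods=1)

# The same calculation written out by hand
manual = prices / prices.shift(1) - 1

print(auto.tolist())
print(manual.tolist())
```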

In [None]:
fig = plt.figure(figsize=(8,5))
ax = fig.add_axes([0,0,1,1])
ax.set_title('EURUSD Daily Pct Price Changes')
ax.hist(df.EURUSD_pct_chg, bins = 50);

- spacing between the bars + horizontal grids

In [None]:
fig = plt.figure(figsize=(8,5))
ax = fig.add_axes([0,0,1,1])
ax.set_title('EURUSD Daily Pct Price Changes')
ax.hist(df.EURUSD_pct_chg, bins = 50, rwidth= 0.9)
plt.grid(axis = 'y', linestyle='--', linewidth=1);

- add more in-between ticks for the x-axis

In [None]:
np.linspace(-0.05, 0.05, 5)
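
`np.linspace(start, stop, num)` returns `num` evenly spaced values including both end points, so the spacing is `(stop - start) / (num - 1)`. The same function can also generate histogram bin edges, in which case `num` values define `num - 1` bins. A quick sketch:

```python
import numpy as np

# Five tick positions between -5% and +5%, end points included
ticks = np.linspace(-0.05, 0.05, 5)
step = ticks[1] - ticks[0]

# 51 bin edges define 50 bins
edges = np.linspace(-0.03, 0.02, 51)
n_bins = len(edges) - 1

print(ticks)
print(step, n_bins)
```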

In [None]:
fig = plt.figure(figsize=(10,6))
ax = fig.add_axes([0,0,1,1])
ax.set_title('EURUSD Daily Pct Price Changes')
ax.hist(df.EURUSD_pct_chg, bins = 50, rwidth= 0.9)
ax.set_xticks(np.linspace(-0.05, 0.10, 33))
plt.grid(axis = 'y', linestyle='--', linewidth=1);

- format x-axis labels as percent
- define your own bins

In [None]:
import matplotlib as mpl

In [None]:
fig = plt.figure(figsize=(10,6))
ax = fig.add_axes([0,0,1,1])
ax.set_title('EURUSD Daily Pct Price Changes')
ax.hist(df.EURUSD_pct_chg, bins = np.linspace(-0.03, 0.02, 51) , rwidth= 0.9) # redefining bins
ax.set_xticks(np.linspace(-0.03, 0.02, 11))
ax.xaxis.set_major_formatter(mpl.ticker.PercentFormatter(xmax=1.0)) # the data are fractions, so 1.0 corresponds to 100%
plt.grid(axis = 'y', linestyle='--', linewidth=1);
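
`PercentFormatter` multiplies each tick value by `100 / xmax` before appending the percent sign, so `xmax` should be the data value that corresponds to 100%. For returns stored as fractions, as `pct_change()` produces, that value is `1.0`. The sketch below checks this with the formatter's `format_pct()` method on a made-up value; `decimals=1` is fixed here only to make the output predictable.

```python
from matplotlib.ticker import PercentFormatter

fraction = 0.02  # a 2% daily change stored as a fraction

# Data in fractions: 1.0 means 100%
as_fraction = PercentFormatter(xmax=1.0, decimals=1)

# Data already in percentage points: 100.0 means 100%
as_percent_points = PercentFormatter(xmax=100.0, decimals=1)

label_a = as_fraction.format_pct(fraction, display_range=1)
label_b = as_percent_points.format_pct(fraction * 100, display_range=1)

print(label_a, label_b)
```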

**Analytical question** (requires domain knowledge or some critical thinking):
- The distribution of the price changes looks roughly normal, except for the left tail. Why do we have that extra negative value? Is it an anomaly or something inherent in the underlying data-generating process?