<img src="https://matplotlib.org/_static/logo2_compressed.svg" width="25%" height="25%" />

# Matplotlib - Unit 04 - Customizing your plots

## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%202%20-%20Unit%20Objective.png"> Unit Objectives

* Customize your plots by adding titles and legends, changing the plot layout, adjusting line, color and marker styles, adding horizontal or vertical lines, changing the colormap, updating the grid, or adding annotations.
* Save your plots in high resolution



---

## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%204%20-%20Import%20Package%20for%20Learning.png"> Import Package for Learning

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns # used for loading datasets

---

## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Customizing your plots

We will study approaches to customizing your plot, like:
* Mix multiple plot types
* Title, legend and layout
* Line style, color and marker
* Horizontal and Vertical Lines
* Grid
* Annotation
* Save a plot

---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Mix multiple plot types in a Figure

We will consider the penguins dataset. It has records for 3 different species of penguins, collected from 3 islands in the Palmer Archipelago, Antarctica.

df = sns.load_dataset('penguins').sample(50, random_state=1)
df.head(3)

Imagine you need 3 plots in a Figure:
  * A bar plot showing the species distribution
  * A pie plot showing the proportion of each island in the dataset
  * A scatter plot showing the relationship between '`flipper_length_mm`' and '`body_mass_g`'

* We will create a Figure and multiple Axes using `plt.subplots()`. For each Axes, we plot the respective chart type
  * Note: the generated Figure will miss some important visualization elements, like legend, title, axis labels and Figure layout. We will cover those in the next section. The objective of this exercise is to understand that we can use multiple different plot types in a Figure

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(13,5))

categorical_count = df['species'].value_counts()
axes[0].bar(x=categorical_count.index, height=categorical_count)


categorical_count = df.value_counts('island',normalize=True)
axes[1].pie(x=categorical_count, labels=categorical_count.index)


axes[2].scatter(data=df, x='flipper_length_mm', y= 'body_mass_g')

plt.show()

Alternatively, you can work with NumPy arrays
  * Consider 3 arrays
    * x is made with `np.linspace()`, which returns evenly spaced numbers over a specified interval
    * y1 is the sine of x, made with `np.sin()`
    * y2 is x times the sine of x, also made with `np.sin()`

x = np.linspace(start=0, stop=10, num=500)
y1 = np.sin(x)
y2 = x * np.sin(x)

We will create a Figure with 3 Axes. We draw a line plot of y1 and y2 in the first Axes, a histogram of y1 in the second, and a histogram of y2 in the third.

fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(8,10))

axes[0].plot(x, y1, label='sin(x)')
axes[0].plot(x, y2, label='x * sin(x)')
axes[0].legend()

axes[1].hist(y1, label='sin(x)')
axes[1].legend()

axes[2].hist(y2, label='x * sin(x)')
axes[2].legend()

plt.show()
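
When several Axes receive the same kind of plot, the repeated calls can be folded into a loop. The following is a minimal sketch of that idea, not the only way to do it: the `series` dictionary, the `ax` loop variable and the `bins=20` setting are choices made here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(start=0, stop=10, num=500)

# Map each legend label to the array it describes (illustrative helper dict)
series = {'sin(x)': np.sin(x), 'x * sin(x)': x * np.sin(x)}

# One Axes per series, stacked vertically
fig, axes = plt.subplots(nrows=len(series), ncols=1, figsize=(8, 6))

# zip pairs each Axes with one (label, values) entry
for ax, (label, values) in zip(axes, series.items()):
    ax.hist(values, bins=20, label=label)
    ax.legend()

plt.show()
```

Adding a third array to `series` would be enough to get a third histogram, since the number of rows is derived from the dictionary.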

---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Titles, Legend and Tight Layout

* You can add a title and axis labels
* In the example below, we consider random data generated with NumPy.
* In a Figure with 1 Axes, you will write before `plt.show()`:
  * `plt.xlabel()` to set the x-axis label. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.xlabel.html)
  * `plt.ylabel()` to set the y-axis label. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.ylabel.html)
  * `plt.title()` to set the title. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.title.html)

np.random.seed(seed=19)
x = np.random.randn(50)

plt.plot(x) 
plt.xlabel('X-Axis')
plt.ylabel('Y-Axis')
plt.title('Plot Title Here')
plt.show()
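
The same labels can also be set on an Axes object. As a small sketch, assuming you create the Figure with `plt.subplots()`, `Axes.set()` accepts several properties in a single call. The variable name `ax` is a choice made here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(seed=19)
x = np.random.randn(50)

fig, ax = plt.subplots()
ax.plot(x)

# Axes.set() takes properties as keyword arguments and applies them together
ax.set(xlabel='X-Axis', ylabel='Y-Axis', title='Plot Title Here')

plt.show()
```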

---

For the next few examples, we created an auxiliary function that returns `fig` and `axes`, which in this case is **a Figure with 4 plots**.
  * It draws mathematical functions (sine, cosine), made with NumPy arrays

def MultiplePlots(): 
  np.random.seed(seed=50)
  x = np.linspace(0, 10, 500)

  fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10,4))
  axes[0,0].plot(x, np.sin(x) )
  axes[0,1].plot(x, x * np.sin(x) )
  axes[1,0].plot(x, x * np.sin(x**2) )
  axes[1,1].plot(x, np.cos(2*x) * np.sin(x) )

  return fig, axes

Call the function to see what it draws

fig, axes = MultiplePlots()
plt.show()

---

Customize the Figure and Axes
* `MultiplePlots()` returns `fig` and `axes`, so you can customize them

  * To access the Axes you want, use the bracket notation we are already familiar with.
  * `.set_title()` sets the title of the given Axes. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.set_title.html)
  * `.set_xlabel()` sets the x-axis label. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.set_xlabel.html)
  * `.set_ylabel()` sets the y-axis label. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.set_ylabel.html)

fig, axes = MultiplePlots()

axes[0,0].set_title('sin(x)')
axes[0,0].set_xlabel('Time')
axes[0,0].set_ylabel('Level')

axes[0,1].set_title('x * sin(x)')
axes[0,1].set_xlabel('Time')
axes[0,1].set_ylabel('Level')

axes[1,0].set_title('x * sin(x^2)')
axes[1,0].set_xlabel('Time')
axes[1,0].set_ylabel('Level')

axes[1,1].set_title('cos(2x) * sin(x)')
axes[1,1].set_xlabel('Time')
axes[1,1].set_ylabel('Level')

plt.show()

You may have noticed in the previous figure that the **x-axis values from the upper Axes overlap with the titles of the lower Axes**.
* You could:
  * Increase the Figure size, using the `figsize` parameter, or
  * Add `plt.tight_layout()`, so the plots no longer overlap. The function documentation is found [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.tight_layout.html)

fig, axes = MultiplePlots()

axes[0,0].set_title('sin(x)')
axes[0,0].set_xlabel('Time')
axes[0,0].set_ylabel('Level')

axes[0,1].set_title('x * sin(x)')
axes[0,1].set_xlabel('Time')
axes[0,1].set_ylabel('Level')

axes[1,0].set_title('x * sin(x^2)')
axes[1,0].set_xlabel('Time')
axes[1,0].set_ylabel('Level')

axes[1,1].set_title('cos(2x) * sin(x)')
axes[1,1].set_xlabel('Time')
axes[1,1].set_ylabel('Level')

plt.tight_layout()   #### added plt.tight_layout()
plt.show()

When your Figure has multiple plots, you can add a title to each plot and a **title to the Figure**, by applying the `.suptitle()` method to your Figure.
  * The function documentation is found [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.suptitle.html)

fig, axes = MultiplePlots()

axes[0,0].set_title('sin(x)')
axes[0,0].set_xlabel('Time')
axes[0,0].set_ylabel('Level')

axes[0,1].set_title('x * sin(x)')
axes[0,1].set_xlabel('Time')
axes[0,1].set_ylabel('Level')

axes[1,0].set_title('x * sin(x^2)')
axes[1,0].set_xlabel('Time')
axes[1,0].set_ylabel('Level')

axes[1,1].set_title('cos(2x) * sin(x)')
axes[1,1].set_xlabel('Time')
axes[1,1].set_ylabel('Level')


fig.suptitle('Different types of functions', fontsize=16, y=1.1) #### added title for the Figure
plt.tight_layout()
plt.show()
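
The cells above repeat the same title and label calls for every Axes. As a minimal sketch, assuming you rebuild the four curves directly instead of calling `MultiplePlots()`, the repetition can be folded into a loop over `axes.flat`. The `curves` list and the `ax` loop variable are names introduced here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 500)

# Title and y-values for each plot, listed row by row (illustrative helper list)
curves = [
    ('sin(x)', np.sin(x)),
    ('x * sin(x)', x * np.sin(x)),
    ('x * sin(x^2)', x * np.sin(x**2)),
    ('cos(2x) * sin(x)', np.cos(2 * x) * np.sin(x)),
]

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 4))

# axes.flat iterates over the 2x2 array of Axes as one flat sequence
for ax, (title, y) in zip(axes.flat, curves):
    ax.plot(x, y)
    ax.set_title(title)
    ax.set_xlabel('Time')
    ax.set_ylabel('Level')

fig.suptitle('Different types of functions', fontsize=16, y=1.1)
plt.tight_layout()
plt.show()
```

The loop keeps the shared labels in one place, so changing 'Time' or 'Level' only requires one edit.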

You can add and customize your legend
  * Consider 2 arrays plotted in the same Axes
  * The first is the mathematical function sin(x), and the second is x * sin(x)
  * Note the `label` argument in `plt.plot()`, where you set the name you want to give to that particular plot.
  * You will notice the labels are not displayed, because we have not called `plt.legend()` yet.

x = np.linspace(0, 10, 500)
plt.plot(x, np.sin(x), label='sin(x)')
plt.plot(x, x * np.sin(x), label='x * sin(x)')
plt.show()


We have already seen a legend drawn with `.legend()`. Now we consider additional arguments of `plt.legend()`
  * `loc` sets the legend location in the Axes: '`upper left`', '`upper right`', '`lower left`', '`lower right`', '`center`', or '`best`', which lets Matplotlib choose the position
  * `title` sets the legend title
  * `frameon` is a `True` / `False` flag that indicates whether you want a frame around the legend

x = np.linspace(0, 10, 500)
plt.plot(x, np.sin(x), label='sin(x)')
plt.plot(x, x * np.sin(x),label='x * sin(x)')
plt.legend(loc='upper left', title='Legend', frameon=False)
plt.show()
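
If no position inside the Axes leaves the lines unobstructed, one option is to anchor the legend just outside the plotting area. The sketch below combines `loc` with the `bbox_to_anchor` argument of `plt.legend()`. The offset `(1.02, 1)` and the legend title are values chosen here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 500)
plt.plot(x, np.sin(x), label='sin(x)')
plt.plot(x, x * np.sin(x), label='x * sin(x)')

# loc names the corner of the legend box that is pinned to bbox_to_anchor;
# (1.02, 1) is in Axes coordinates, i.e. slightly right of the top-right corner
legend = plt.legend(loc='upper left', bbox_to_anchor=(1.02, 1), title='Functions')

plt.tight_layout()  # make room for the legend inside the Figure
plt.show()
```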


When you are using `plt.subplots()`, with either 1 Axes or multiple Axes, you select the Axes and then call `.legend()` on it
  * The documentation is found [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.legend.html)

* The example below shows a Figure with 1 Axes

x = np.linspace(0, 10, 500)
fig, axes = plt.subplots()
axes.plot(x, np.sin(x), label='sin(x)')
axes.plot(x, x * np.sin(x),label='x * sin(x)')
axes.legend()
plt.show()

---

The example below shows a Figure with 2 Axes
  * It considers data from NumPy arrays, using `np.linspace()` and `np.sin()`

x = np.linspace(0, 10, 500)
y1 = np.sin(x)
y2 = x * np.sin(x)

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10,5))

axes[0].hist(x=y1,label='Histogram of sin(x)')
axes[0].legend(loc='best', frameon=False)

axes[1].plot(x, y1, label='sin(x)')
axes[1].plot(x, y2, label='x * sin(x)')
axes[1].legend(loc='lower left')


plt.show()

---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Line Style, Color and Marker

In the plotting functions we studied, there are arguments to set **line style and color**

  * First find the argument that sets the color; in `plt.plot()` it is `color`. You can use the basic options, like **['b', 'g', 'r', 'c', 'm', 'y', 'k']**, or write the [hexadecimal](https://htmlcolorcodes.com/) value of your desired color. Don't forget to add `#` before the hex code when passing it to Matplotlib
  * Then find the argument that sets the style; in `plt.plot()` it is `linestyle`. This [link](https://matplotlib.org/3.0.3/gallery/lines_bars_and_markers/line_styles_reference.html) shows the options for it: **[ '-' , '--' , '-.' , ':' ]**
  * In addition, you can set the line width with the `linewidth` parameter.

* The rule of thumb for customizing is to double-check the documentation of the plotting function, so you know the proper argument names

x = np.linspace(0, 10, 500)
fig, axes = plt.subplots(figsize=(8,4))
axes.plot(x, np.sin(x), color='m', linewidth=4 , linestyle=':', label='sin(x)')
axes.plot(x, x * np.sin(x),color='#B35946', linestyle='-.', label='x * sin(x)')
axes.legend()
plt.show()
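
As a brief sketch of two other notations that `.plot()` accepts: a format string such as `'r--'` combines a color letter and a line style in one positional argument, and named colors such as `'tab:blue'` can be passed to `color`. The cosine curve and the variable names `line_a` and `line_b` are introduced here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 500)
fig, ax = plt.subplots(figsize=(8, 4))

# 'r--' is a format string: red ('r') and dashed ('--')
line_a, = ax.plot(x, np.sin(x), 'r--', label='sin(x)')

# Keyword arguments with a named color from the default palette
line_b, = ax.plot(x, np.cos(x), color='tab:blue', linestyle='-', linewidth=2, label='cos(x)')

ax.legend()
plt.show()
```

Format strings are compact, while keyword arguments are easier to read when several properties are set at once.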


You can set the **marker** style
  * The marker options are found [here](https://matplotlib.org/stable/api/markers_api.html#module-matplotlib.markers)
  * Again, double-check the documentation of the plotting function, so you know the proper argument name. In this case, `.scatter()` uses `marker` to set the marker.

x = np.linspace(0, 10, 500)
fig, axes = plt.subplots(figsize=(10,6))
axes.scatter(x=x,y= np.random.randn(500),marker='2')
plt.show()
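
Markers are not limited to scatter plots. As a minimal sketch, `.plot()` also accepts `marker`, together with `markersize` and `markevery`, so a line can show a marker on only some of its points. The values used below are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 50)
fig, ax = plt.subplots(figsize=(8, 4))

# Draw a solid line with a circle marker on every 5th data point
line, = ax.plot(x, np.sin(x), marker='o', markersize=6, markevery=5,
                linestyle='-', label='sin(x)')

ax.legend()
plt.show()
```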

---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Add horizontal and vertical lines

You can add horizontal and vertical lines to your Figure to highlight something you are interested in
  * If your Figure has a single plot, use
    * `plt.axhline()` to add a horizontal line. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.axhline.html)
    * `plt.axvline()` to add a vertical line. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.axvline.html)

  * Naturally, you can customize the line with the aspects we have learned so far, like color or line style

x = np.linspace(0, 10, 500)
plt.plot(x, np.sin(x))
plt.axhline(y=0.5, color='r', linestyle='dashed',linewidth=2)
plt.axvline(x=8, color='g', linestyle=':')

plt.show()

If your Figure has multiple Axes, you will select a given Axes and use:
  * `.axvline()` to add a vertical line. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.axvline.html)
  * `.axhline()` to add a horizontal line. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.axhline.html)

x = np.linspace(0, 10, 500)
y1 = np.random.randn(500)
y2 = np.cos(x)


fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10,6))
axes[0].scatter(x=x, y=y1)
axes[0].axvline(x=8,color='#D1349C', linestyle='-')

axes[1].plot(x,y2)
axes[1].axhline(y=0.5, color='g')

plt.show()
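
A reference line is often more useful when its position comes from the data and it appears in the legend. The sketch below draws a horizontal line at the mean of a random series and shades a vertical band with `.axvspan()`. The band limits (50 to 100) and the labels are arbitrary values chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(seed=0)
y = np.random.randn(200)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(y, color='gray', linewidth=1)

# Horizontal line at the sample mean, labelled so it shows up in the legend
mean_line = ax.axhline(y=y.mean(), color='r', linestyle='--',
                       label=f'mean = {y.mean():.2f}')

# Shaded vertical band between two x positions
span = ax.axvspan(xmin=50, xmax=100, color='g', alpha=0.2, label='region of interest')

ax.legend()
plt.show()
```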

---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Colormap

Let's use the mpg dataset. It has car records with their respective data on mileage, cylinders, horsepower, origin and name.

df = sns.load_dataset('mpg')
df = df.head(50)
print(df.shape)
df.head(3)

If you draw, for example, a scatter plot in Matplotlib and use the parameter `c` to color the dots according to a variable, the plot will not show a colorbar by default, so you cannot relate the colors to the variable's values.

plt.figure(figsize=(10,6))
plt.scatter(data=df, x='weight',y='acceleration',c='mpg',cmap='inferno')
plt.show()

You should add `plt.colorbar()` to display a bar for the color variable. The gallery for the Matplotlib colormap reference is found [here](https://matplotlib.org/stable/gallery/color/colormap_reference.html#sphx-glr-gallery-color-colormap-reference-py)

plt.figure(figsize=(10,6))
plt.scatter(data=df, x='weight',y='acceleration',c='mpg',cmap='inferno')
plt.colorbar()
plt.show()
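
When working with `fig` and `axes`, the colorbar can be attached to a specific Axes and given a label. This is a minimal sketch that uses synthetic arrays instead of the mpg dataset, so it runs without downloading data. The numbers are made up for illustration and do not come from the dataset.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-ins for the dataset columns (assumed values, not real data)
np.random.seed(seed=1)
weight = np.random.uniform(1500, 5000, size=50)
acceleration = np.random.uniform(8, 22, size=50)
mpg = 45 - 0.007 * weight + np.random.randn(50)

fig, ax = plt.subplots(figsize=(10, 6))

# scatter returns the object that the colorbar needs to read colors from
points = ax.scatter(x=weight, y=acceleration, c=mpg, cmap='inferno')

cbar = fig.colorbar(points, ax=ax)
cbar.set_label('mpg')  # names the variable the colors encode

ax.set(xlabel='weight', ylabel='acceleration')
plt.show()
```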

---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Grid

With the pyplot interface, you can change grid properties with `plt.grid()`, which acts on the current Axes. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.grid.html)
* In this case, we only changed the line style. More options are available in the documentation

x = np.linspace(0, 10, 500)
y = np.random.randn(500)
plt.scatter(x,y)
plt.grid(True, linestyle='-.') 
plt.show()

You can change grid properties at the Axes level with `.grid()`. The function documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.grid.html)
* In this case, we set a line style for the grid and applied it only to the y axis

x = np.linspace(0, 10, 500)
y = np.random.randn(500)

fig, axes = plt.subplots()
axes.scatter(x=x,y=y)
axes.grid(True, linestyle='-.', axis='y')
plt.show()
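
The `which` argument of `.grid()` selects major or minor grid lines, so each set can be styled separately. The sketch below turns on minor ticks with `.minorticks_on()` and draws a lighter grid for them. The line widths and transparency are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 500)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Minor ticks are off by default; minor grid lines need them
ax.minorticks_on()

# Solid lines for the major grid, faint dotted lines for the minor grid
ax.grid(True, which='major', linestyle='-', linewidth=0.8)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.6)

plt.show()
```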

---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Annotation

You can annotate your plot to convey specific information with `plt.text()`. The documentation for this function, which is used with the pyplot interface, is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.text.html)

x = np.linspace(0, 10, 500)
plt.plot(x, np.sin(x))

plt.text(x=2, y=0, s='Text Annotation', fontsize=12, c='r')
plt.show()

* The documentation for the equivalent function at the Axes level, `.text()`, can be found [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.text.html)

x = np.linspace(0, 10, 500)
fig, axes = plt.subplots(nrows=1, ncols=2)
axes[0].plot(x, np.sin(x))
axes[0].text(x=2, y=0, s='Text Annotation', fontsize=12, c='r')
plt.show()
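
When the text should point at a specific data point, `.annotate()` can draw an arrow from the text to that point. The following is a minimal sketch that marks the largest sampled value of sin(x). The text offset, the y-limits and the variable names `peak` and `note` are choices made here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 500)
y = np.sin(x)
peak = np.argmax(y)  # index of the largest sampled value

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, y)

# xy is the point the arrow points to; xytext is where the text is placed
note = ax.annotate('Maximum',
                   xy=(x[peak], y[peak]),
                   xytext=(x[peak] + 1.5, 0.5),
                   arrowprops=dict(arrowstyle='->', color='r'),
                   fontsize=12, color='r')

ax.set_ylim(-1.2, 1.4)  # leave some headroom above the curve
plt.show()
```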

---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Save your plot

You can save your plot with `plt.savefig()`. The documentation is [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html)
* The first argument is the file path, including the image name, where the image should be stored. We also set `bbox_inches='tight'`, which trims excess whitespace around the figure, and `dpi=150`, to generate a higher-resolution image.
* You draw your plot first, then you run the `plt.savefig()` command.
* In this case, the image name is "created_image.png", and it is saved in the current working directory (here, the root directory of your application).
* When saving an image, you do not need to call `plt.show()`, since your objective is not to display the image. If you do call it, call it after `plt.savefig()`, because saving after `plt.show()` can produce a blank image.
* When using a Jupyter notebook, the image will still appear as a cell output. This is not due to `plt.show()`; the notebook's inline backend renders any figure that is still open when the cell finishes.

x = np.linspace(0, 10,500)
plt.plot(x, np.sin(3*x) * x)
plt.title("Nice title!")
plt.tight_layout()
plt.savefig('created_image.png', bbox_inches='tight', dpi=150)

Check the root folder of your application to find your new image!

! ls
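
The Figure object has its own `.savefig()` method, which is convenient when you hold a reference to `fig`. The sketch below saves into a separate folder and confirms from Python that the file was written. The folder name `outputs` and the file name `sine_wave.png` are hypothetical and can be replaced with your own.

```python
from pathlib import Path

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical output location; created if it does not exist yet
output_dir = Path('outputs')
output_dir.mkdir(exist_ok=True)
image_path = output_dir / 'sine_wave.png'

x = np.linspace(0, 10, 500)
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, np.sin(3 * x) * x)
ax.set_title('Nice title!')

# Save through the Figure object, with the same options as plt.savefig()
fig.savefig(image_path, bbox_inches='tight', dpi=150)
plt.close(fig)  # close the Figure so it is not drawn again later

# Confirm the file exists and is not empty
print(image_path.exists(), image_path.stat().st_size > 0)
```

Closing the Figure after saving also keeps memory usage down when many images are generated in a loop.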

---