# Basic Plotting using Matplotlib

Matplotlib is a 2D plotting library for Python. It started with the aim of bringing MATLAB-like plotting features to Python, and it offers a rich <a href="https://matplotlib.org/tutorials/introductory/sample_plots.html">list of plotting types</a>.

## Introduction and Setup

### Matplotlib Figure Hierarchy

<img src="attachment:image.png" width=300 />

### Anatomy of Figure

<img src="attachment:image.png" width=400 />

### Setup Notebook

`%matplotlib` is a magic function that configures how Matplotlib presents graphs in a Jupyter Notebook.

There are quite a number of options, but the following 2 are the most commonly used.
* `%matplotlib inline`: draws static images and stores them in the notebook.
* `%matplotlib notebook`: draws interactive plots, with zoom and resizing features, embedded within the notebook.

Import libraries `pandas` and `matplotlib.pyplot`.
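A typical first cell might look like the sketch below. The aliases `pd` and `plt` are the common convention and are assumed here, not taken from the original notebook.

```python
# In a Jupyter notebook, run the magic first (it is not valid in a plain .py script):
# %matplotlib inline

import pandas as pd
import matplotlib.pyplot as plt
```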

## 1. Pandas Basic Plotting API

The `plot()` and `plot.xx()` methods in Pandas are wrapper functions that call Matplotlib functions.
* They are friendlier to use.
* But they offer only part of Matplotlib's functionality.

### Trigonometry

Initialize x and y values.

Create a dataframe from x and y.
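As a minimal sketch of these two steps, assuming the data are sine and cosine values over one period (the range, the number of points, and the column names `x`, `sin`, `cos` are all assumptions):

```python
import numpy as np
import pandas as pd

# Assumed range: 100 evenly spaced points over one full period
x = np.linspace(0, 2 * np.pi, 100)

# Assumed column names; the original notebook may use different ones
df = pd.DataFrame({"x": x, "sin": np.sin(x), "cos": np.cos(x)})
```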

Plot the graph. By default, `plot()` draws every column as a line with the `index` as the x-axis, which is not what we want.

We can specify the columns for `x` and `y`. We can also set the title of the graph.
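For illustration, the call could look like this. The dataframe is rebuilt so the snippet runs on its own, and its column names and the title text are assumptions.

```python
import numpy as np
import pandas as pd

x = np.linspace(0, 2 * np.pi, 100)
df = pd.DataFrame({"x": x, "sin": np.sin(x), "cos": np.cos(x)})

# Use column "x" for the x-axis and draw the two listed columns as lines
ax = df.plot(x="x", y=["sin", "cos"], title="Sine and Cosine")
```

Passing a list to `y` draws one line per listed column against the same `x` column.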

### Environment Data (Line Graph)

These 3 CSV files were downloaded from the https://data.gov.sg website.
* air-pollutant-carbon-monoxide.csv
* air-pollutant-ozone.csv
* air-pollutant-sulphur-dioxide.csv

Load the 3 CSV files into their respective dataframes.
* Set the index column.
* Rename columns that have long names.

Merge the 3 dataframes together on their index, which is the year.
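The load, rename, and merge steps might be sketched as follows. Inline text stands in for the real files, and every column name and value in it is made up for illustration; check the headers of the downloaded CSVs before adapting it. The helper `load()` is hypothetical.

```python
import io
import pandas as pd

# Placeholder CSV text standing in for the real files; names and values are invented
co_csv = "year,carbon_monoxide_long_name\n2000,1.0\n2001,2.0\n"
o3_csv = "year,ozone_long_name\n2000,100.0\n2001,110.0\n"
so2_csv = "year,sulphur_dioxide_long_name\n2000,20.0\n2001,22.0\n"

def load(text, old_name, new_name):
    """Read one CSV, index it by year, and shorten the long column name."""
    df = pd.read_csv(io.StringIO(text), index_col="year")
    return df.rename(columns={old_name: new_name})

co = load(co_csv, "carbon_monoxide_long_name", "co")
o3 = load(o3_csv, "ozone_long_name", "o3")
so2 = load(so2_csv, "sulphur_dioxide_long_name", "so2")

# Merge on the shared index (year), two dataframes at a time
env = co.merge(o3, left_index=True, right_index=True)
env = env.merge(so2, left_index=True, right_index=True)
```

With the real files, `io.StringIO(text)` would be replaced by the file path.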

Plot all 3 columns in the same graph.
* As the 3 series have different ranges, they are not suited to sharing the same y-axis.

It is better to plot them on different subplots.

Fine-tune the plot with the following parameters.
* Use the `xticks` parameter to specify the ticks on the x-axis so that it doesn't show decimal values.
* Use `rot` to rotate the x-axis tick labels by some degrees so that they don't overlap each other.
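A sketch combining subplots with these two parameters is shown below. The dataframe holds made-up values, and the short column names are assumptions.

```python
import pandas as pd

# Made-up values for illustration only
env = pd.DataFrame(
    {"co": [1.0, 2.0, 1.5], "o3": [100.0, 110.0, 105.0], "so2": [20.0, 22.0, 21.0]},
    index=pd.Index([2000, 2001, 2002], name="year"),
)

# subplots=True gives each column its own axes (and its own y-axis);
# xticks pins the ticks to whole years, rot tilts the tick labels
axes = env.plot(subplots=True, xticks=list(env.index), rot=45, figsize=(6, 6))
```

With `subplots=True`, `plot()` returns an array of axes, one per column, instead of a single axes object.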

### Student Marks (Bar Chart and Boxplot)

Load the dataset from the tab-separated file `data/class1_test1.tsv`.
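Because the file is tab-separated, `read_csv()` needs `sep="\t"`. In the sketch below, inline text stands in for the real file; the column names and marks are invented.

```python
import io
import pandas as pd

# Stand-in for data/class1_test1.tsv; column names and marks are invented
tsv_text = "name\tenglish\tmaths\tscience\nAlice\t70\t80\t90\nBob\t60\t65\t70\n"

# sep="\t" tells read_csv that fields are separated by tabs
df = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# With the real file: df = pd.read_csv("data/class1_test1.tsv", sep="\t")
```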

#### Average Marks of Students

Find the average mark of each student.
* Set `axis=1` so that the mean is computed across the columns of each row, giving one value per student.
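A small sketch of the row-wise mean, using invented marks indexed by student name:

```python
import pandas as pd

# Invented marks, indexed by student name
marks = pd.DataFrame(
    {"english": [70, 60], "maths": [80, 65], "science": [90, 70]},
    index=["Alice", "Bob"],
)

# axis=1 averages across the columns, giving one value per row (student);
# the default axis=0 would instead give one average per subject
avg = marks.mean(axis=1)

ax = avg.plot.bar(title="Average Marks")
```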

#### Average and All Subjects
Can we plot the marks of all subjects together with the average mark?

Add an `Average` column to the dataframe.

Plot the dataframe with all columns.
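One possible way to do this, again with invented marks:

```python
import pandas as pd

# Invented marks, indexed by student name
marks = pd.DataFrame(
    {"english": [70, 60], "maths": [80, 65], "science": [90, 70]},
    index=["Alice", "Bob"],
)

# The mean is computed from the subject columns before the new column is attached
marks["Average"] = marks.mean(axis=1)

# One group of bars per student, one bar per column (subjects plus Average)
ax = marks.plot.bar(rot=0)
```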

#### Concatenate Dataframes

Concatenate the two dataframes and set the index of the result to `name`.
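The text does not say which two dataframes are meant, so the sketch below assumes two hypothetical dataframes with the same columns, each holding a different group of students.

```python
import pandas as pd

# Two hypothetical dataframes with identical columns and invented marks
df1 = pd.DataFrame({"name": ["Alice", "Bob"], "maths": [80, 65]})
df2 = pd.DataFrame({"name": ["Carol", "Dave"], "maths": [75, 90]})

# Stack the rows of both dataframes, then use the name column as the index
combined = pd.concat([df1, df2]).set_index("name")
```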

### Olympics Medals (Bar Chart)

There are `NaN` values in the dataframe. Let's replace them with 0.

Convert the `Total` column from float to integer.

Keep only the rows related to Gold medals.

Sort them by the `Total` column in descending order.

Select only the top 10 countries with the most Gold medals.

Plot a bar graph and keep the returned axes in `ax`.
* Use it to set the x-axis and y-axis labels.
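The steps above can be sketched end to end as follows. The data are invented; `NOC` and `Total` are mentioned in this notebook, while the `Medal` column name and the axis label texts are assumptions.

```python
import numpy as np
import pandas as pd

# Invented long-format data; the "Medal" column name is an assumption
df = pd.DataFrame({
    "NOC": ["AAA", "AAA", "BBB", "BBB", "CCC"],
    "Medal": ["Gold", "Silver", "Gold", "Silver", "Gold"],
    "Total": [5.0, 3.0, np.nan, 4.0, 9.0],
})

df = df.fillna(0)                          # replace NaN with 0
df["Total"] = df["Total"].astype(int)      # float -> int
gold = df[df["Medal"] == "Gold"]           # keep Gold rows only
top = gold.sort_values("Total", ascending=False).head(10)

# plot.bar returns the axes, which can then be used to set the labels
ax = top.plot.bar(x="NOC", y="Total", legend=False)
ax.set_xlabel("Country")
ax.set_ylabel("Gold medals")
```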

### Save Figure

Charts can be saved using the `savefig()` method of the Figure object.
* Get the figure object from the axes.
* Tighten the layout so that all labels are inside the figure.
* Save the figure.
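A minimal sketch of these three steps, using a throwaway chart; the output file name `chart.png` and the temporary directory are assumptions:

```python
import os
import tempfile
import pandas as pd

# A throwaway bar chart, just to have something to save
ax = pd.Series([3, 1, 2], index=["a", "b", "c"]).plot.bar()

fig = ax.get_figure()      # the Figure that owns these axes
fig.tight_layout()         # adjust spacing so labels fit inside the figure

# Hypothetical output location; the image format follows the file extension
out_path = os.path.join(tempfile.gettempdir(), "chart.png")
fig.savefig(out_path)
```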

## 2. Matplotlib Plotting

Matplotlib provides 2 sets of APIs with the same functionality.
* Pyplot is a state-based API whose functions act on an implicit current figure and axes.
* The object-oriented API provides a more flexible way of plotting using explicit Figure and Axes objects.
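To make the difference concrete, here is the same line drawn in both styles. The data and titles are arbitrary.

```python
import matplotlib.pyplot as plt

# Pyplot style: functions act on an implicit "current" figure and axes
plt.figure()
plt.plot([1, 2, 3], [1, 4, 9])
plt.title("pyplot style")
implicit_ax = plt.gca()    # the axes that pyplot was drawing on

# Object-oriented style: keep explicit references and call their methods
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [1, 4, 9])
ax.set_title("object-oriented style")
```

The explicit style tends to be easier to follow once a figure has several axes, because every call names the axes it changes.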

### Trigonometry

Create a figure containing a single axes.
* Each line requires 2 series (x and y) and 1 optional format string for the colour, marker and line style.
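A sketch with the object-oriented API, assuming the same sine and cosine data as in the Pandas section (the value range and format strings are illustrative choices):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()   # one figure containing a single axes

# Each line is given as x, y and an optional format string:
# "r-" is a solid red line, "b--" is a dashed blue line
lines = ax.plot(x, np.sin(x), "r-", x, np.cos(x), "b--")
ax.legend(lines, ["sin", "cos"])
```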

### Environment Data

### Olympics Medals

Use `pivot_table()` to create the `Gold`, `Silver` and `Bronze` columns.

Reset the index and set `NOC` as the index.

Convert the data type of the medal columns to integer.

Sort the dataframe by medal counts.

Get the top 10 countries.

Plot the graph.
* Set a color for each bar.
* Use `ax` to change the `xlabel` and `ylabel`.

Set `stacked=True` to stack the bars.
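The whole sequence might look like the sketch below. The input data are invented, the `Medal` column name, the colors, and the label texts are assumptions, and sorting by Gold, then Silver, then Bronze is one possible reading of "sort by medals".

```python
import pandas as pd

# Invented long-format data; the "Medal" column name is an assumption
df = pd.DataFrame({
    "NOC": ["AAA", "AAA", "AAA", "BBB", "BBB", "CCC"],
    "Medal": ["Gold", "Silver", "Bronze", "Gold", "Silver", "Gold"],
    "Total": [5.0, 3.0, 1.0, 2.0, 4.0, 9.0],
})

# One row per NOC, one column per medal type; missing combinations become 0
medals = df.pivot_table(index="NOC", columns="Medal", values="Total",
                        aggfunc="sum", fill_value=0)

# Redundant for this toy data, but mirrors the reset/set index step in the text
medals = medals.reset_index().set_index("NOC")

# Order the medal columns and convert them to integers
medals = medals[["Gold", "Silver", "Bronze"]].astype(int)

# Assumed sort order: Gold first, then Silver, then Bronze, all descending
top = medals.sort_values(["Gold", "Silver", "Bronze"], ascending=False).head(10)

# One color per medal column; stacked=True stacks the bars for each country
ax = top.plot.bar(color=["gold", "silver", "peru"], stacked=True)
ax.set_xlabel("Country")
ax.set_ylabel("Medals")
```

Setting `stacked=False` (the default) would draw the three medal bars side by side for each country.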