# Plotting in matplotlib

Python has many add-on libraries for making static or dynamic visualizations, but I’ll be mainly focused on [matplotlib](https://matplotlib.org/).

matplotlib supports various GUI backends on all operating systems and can export visualizations to all of the common vector and raster graphics formats (PDF, SVG, JPG, PNG, BMP, GIF, etc.).

A handy overview of the library's capabilities is provided by the official [cheatsheets and handouts](https://matplotlib.org/cheatsheets/).

The simplest way to output plots is inline in the Jupyter notebook. To set this up, execute the following statement in a notebook cell:

In [None]:
%matplotlib inline

With matplotlib, we use the following import convention:

In [None]:
import matplotlib.pyplot as plt

To build the examples we also need NumPy and pandas:

In [None]:
import numpy as np
import pandas as pd

Now let's make the simplest possible graph:

In [None]:
data = np.arange(10)
plt.plot(data) # use 'plot' function from the plt library on data object.

For a slightly more advanced plot, we can use the example from the first matplotlib cheatsheet:

In [None]:
X = np.linspace(0, 4*np.pi, 100)
Y = np.cos(X)
plt.plot(Y)

In [None]:
plt.plot(X,Y)
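
The two calls above differ only in their x-values. As a minimal sketch (the names `demo_fig`, `demo_ax`, `line_y`, and `line_xy` are introduced here purely for illustration): with a single argument, matplotlib plots the values against their positions 0, 1, 2, and so on, while with two arguments the first array supplies the x-coordinates.

```python
import numpy as np
import matplotlib.pyplot as plt

X = np.linspace(0, 4 * np.pi, 100)
Y = np.cos(X)

demo_fig, demo_ax = plt.subplots()
(line_y,) = demo_ax.plot(Y)      # x defaults to the positions 0..99
(line_xy,) = demo_ax.plot(X, Y)  # x is taken from X, i.e. 0..4*pi

print(line_y.get_xdata()[-1])   # last position on the x-axis
print(line_xy.get_xdata()[-1])  # last value of X
```

This is why the first plot has an x-axis running to about 100 and the second to about 12.6.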

## Figures and Subplots

Plots in matplotlib reside within a `Figure` object. You can create a new figure with `plt.figure`:

In [None]:
fig = plt.figure()

`plt.figure` has a number of options; notably, `figsize` will guarantee the figure has a certain size and aspect ratio if saved to disk.
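
As a small sketch of the `figsize` option (the name `wide_fig` and the 8 × 4 size are arbitrary choices for illustration), the size is given as a (width, height) pair in inches and can be read back from the figure:

```python
import matplotlib.pyplot as plt

# figsize is (width, height) in inches; 8 x 4 gives a 2:1 aspect ratio
wide_fig = plt.figure(figsize=(8, 4))
width, height = wide_fig.get_size_inches()
print(width, height)
```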

You can’t make a plot with a blank figure. You have to create one or more `subplots` using `add_subplot`:

In [None]:
ax1 = fig.add_subplot(2, 2, 1)

This means that the figure should be 2 × 2 (so up to four plots in total), and we’re selecting the first of four subplots (numbered from 1).

In [None]:
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)

In [None]:
fig

In [None]:
ax3.plot(np.random.standard_normal(50).cumsum(), color="black", linestyle="dashed")
fig

The additional options instruct matplotlib to plot a black dashed line. The objects returned by `fig.add_subplot` here are `AxesSubplot` objects (plain `Axes` objects in recent matplotlib releases), and you can plot directly on the other empty subplots by calling each one’s instance methods:

In [None]:
ax1.hist(np.random.standard_normal(100), bins=20, color="black", alpha=0.3)
ax2.scatter(np.arange(30), np.arange(30) + 3 * np.random.standard_normal(30))
fig

The style option `alpha=0.3` sets the transparency of the overlaid plot.

**Assignment**

Using [this cheatsheet](https://matplotlib.org/cheatsheets/cheatsheets.pdf), modify the code below in the following way:

1. Change the line style to dash-dot and add markers
2. In the histogram, change the number of bins to 30 and the color to IndianRed
3. In the scatter plot, change the transparency to 0.2 and the color to lime

In [None]:
fig = plt.figure() # reset figure otherwise new plots would be added to existing ones

In [None]:
ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 4)

ax1.hist(np.random.standard_normal(100), bins=20, color="black", alpha=0.3)
ax2.scatter(np.arange(30), np.arange(30) + 3 * np.random.standard_normal(30))
ax3.plot(np.random.standard_normal(50).cumsum(), color="black", linestyle="dashed")
fig

To make creating a grid of subplots more convenient, matplotlib includes a `plt.subplots` function that creates a new figure and returns a NumPy array containing the created subplot objects:

In [None]:
fig, axes = plt.subplots(2, 3)

axes

The `axes` array can then be indexed like a two-dimensional array; for example, `axes[0, 1]` refers to the subplot in the top row at the center. You can also indicate that subplots should have the same x- or y-axis using `sharex` and `sharey`, respectively. This can be useful when you're comparing data on the same scale; otherwise, matplotlib autoscales plot limits independently. 
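
A minimal sketch of shared axes (the names `grid_fig`, `grid_axes`, and `rng`, and the histogram data, are illustrative assumptions): with `sharex=True` and `sharey=True`, changing the limits on one subplot changes them on every other subplot in the grid.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable

grid_fig, grid_axes = plt.subplots(2, 3, sharex=True, sharey=True)
for row in range(2):
    for col in range(3):
        grid_axes[row, col].hist(rng.standard_normal(500), bins=50,
                                 color="black", alpha=0.5)

# Because the axes are shared, setting the limits on the top-center subplot
# also sets them on all the others.
grid_axes[0, 1].set_xlim(-4, 4)
print(grid_axes[1, 2].get_xlim())
```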

## Ticks, Labels, and Legends

Most kinds of plot decorations can be accessed through functions in the pyplot interface as well as through methods on matplotlib axes objects. The pyplot interface includes functions like `plt.xlim` and `plt.xticks`: `plt.xlim` controls the plot range, and `plt.xticks` controls the tick locations and labels. They can be used in two ways:

- Called with no arguments, they return the current parameter value (e.g., `plt.xlim()` returns the current x-axis plotting range)

- Called with parameters, they set the parameter value (e.g., `plt.xlim([0, 10])` sets the x-axis range to 0 to 10)

All such functions act on the active or most recently created `AxesSubplot`. Each corresponds to two methods on the subplot object itself; in the case of `xlim`, these are `ax.get_xlim` and `ax.set_xlim`.
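
A minimal sketch of this getter/setter pattern, using the pyplot function `plt.xlim` and the corresponding axes methods `get_xlim` and `set_xlim` (the names `lim_fig` and `lim_ax` are introduced only for this example):

```python
import numpy as np
import matplotlib.pyplot as plt

lim_fig, lim_ax = plt.subplots()
lim_ax.plot(np.arange(10))

plt.xlim([0, 5])       # pyplot function: acts on the current (most recent) axes
current = plt.xlim()   # called with no arguments: returns the current range

lim_ax.set_xlim([0, 8])       # explicit setter on the axes object
explicit = lim_ax.get_xlim()  # explicit getter on the axes object
print(current, explicit)
```

The explicit `get_`/`set_` methods make it clear which subplot is being changed, which matters once a figure holds more than one.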

### Setting the title, axis labels, ticks, and tick labels

To illustrate customizing the axes, I’ll create a simple figure and plot a random walk:

In [None]:
fig, ax = plt.subplots() # empty brackets mean we want only one plot
ax.plot(np.random.standard_normal(1000).cumsum());

To change the x-axis ticks, it’s easiest to use `set_xticks` and `set_xticklabels`. The former instructs matplotlib where to place the ticks along the data range; by default these locations will also be the labels. But we can set any other values as the labels using `set_xticklabels`:

In [None]:
ticks = ax.set_xticks([0, 250, 500, 750, 1000])
labels = ax.set_xticklabels(["one", "two", "three", "four", "five"], rotation=30, fontsize=8)

The rotation option sets the x tick labels at a 30-degree rotation. Lastly, `set_xlabel` gives a name to the x-axis, and `set_title` is the subplot title:

In [None]:
ax.set_xlabel("Stages")
ax.set_title("My first matplotlib plot")

Modifying the y-axis consists of the same process, substituting `y` for `x` in this example. The axes class has a `set` method that allows batch setting of plot properties. From the prior example, we could also have written:

In [None]:
ax.set(title="My first matplotlib plot", xlabel="Stages")
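
The y-axis analogue mentioned above might look like the following sketch. The names `y_fig` and `y_ax`, the tick positions, and the label texts are arbitrary illustration values, not part of the example figure.

```python
import numpy as np
import matplotlib.pyplot as plt

y_fig, y_ax = plt.subplots()
y_ax.plot(np.random.standard_normal(1000).cumsum())

# Same calls as for the x-axis, with "y" substituted for "x".
y_ax.set_yticks([-20, 0, 20])
y_ax.set_yticklabels(["low", "zero", "high"], rotation=30, fontsize=8)
y_ax.set_ylabel("Position")
```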

You have made so many improvements to the figure but do you still remember how to show it?

In [None]:
fig

### Adding legends

Legends are another critical element for identifying plot elements. There are a couple of ways to add one. The easiest is to pass the label argument when adding each piece of the plot:

In [None]:
fig, ax = plt.subplots() # because we didn't declare three subplots all plots are shown in one figure
ax.plot(np.random.randn(1000).cumsum(), color="black", label="one");
ax.plot(np.random.randn(1000).cumsum(), color="black", linestyle="dashed", label="two");
ax.plot(np.random.randn(1000).cumsum(), color="black", linestyle="dotted", label="three");

**Assignment**

Plot the same data in three separate subplots, starting from the code below:

In [None]:
fig1, ax1 = plt.subplots(3,1) 

# YOUR CODE HERE
raise NotImplementedError()

Once the labels are set, call `ax.legend()` to automatically create a legend:

In [None]:
ax.legend()
fig

The `loc` legend option tells matplotlib where to place the legend. The default is `"best"`, which tries to choose a location that is most out of the way; see the docstring (with `ax.legend?`) for the other choices. To exclude one or more elements from the legend, pass no label or `label="_nolegend_"`.
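
As a minimal sketch of both options (the names `leg_fig`, `leg_ax`, and `legend`, and the choice of `"upper left"`, are illustrative assumptions), the third line below is drawn but omitted from the legend, and the legend is pinned to a fixed corner:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)  # seeded only to make the sketch repeatable

leg_fig, leg_ax = plt.subplots()
leg_ax.plot(rng.standard_normal(100).cumsum(), color="black", label="one")
leg_ax.plot(rng.standard_normal(100).cumsum(), color="black",
            linestyle="dashed", label="two")
# This line is drawn but left out of the legend.
leg_ax.plot(rng.standard_normal(100).cumsum(), color="black",
            linestyle="dotted", label="_nolegend_")

legend = leg_ax.legend(loc="upper left")  # fixed corner instead of "best"
print([text.get_text() for text in legend.get_texts()])
```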

### Annotations and Text

In addition to the standard plot types, you may wish to draw your own plot annotations, which could consist of text, arrows, or other shapes. You can add annotations and text using the `ax.text`, `ax.arrow`, and `ax.annotate` methods. `ax.text` draws text at given coordinates (x, y) on the plot with optional custom styling:

In [None]:
ax.text(0, 35, "Beginning",
        family="monospace", fontsize=10, color="grey")
ax.arrow(100, 33, -100,-30, color="grey") 
ax.annotate("Look here!", xy=(0, -5), xytext=(20, -20), arrowprops=dict(arrowstyle="->"))
fig

### Saving Plots to File

You can save the active figure to a file using the figure object’s `savefig` instance method. For example, to save an SVG version of a figure, you need only type:

```
fig.savefig("figpath.svg")
```

The file type is inferred from the file extension. So if you used `.pdf` instead, you would get a PDF.
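
A minimal sketch of the extension-based inference, writing the same figure in three formats to a temporary directory. The directory, the names `save_fig`, `save_ax`, `out_dir`, and `saved`, and the `dpi=150` setting are assumptions made for this example.

```python
import tempfile
from pathlib import Path

import numpy as np
import matplotlib.pyplot as plt

save_fig, save_ax = plt.subplots()
save_ax.plot(np.arange(10))

out_dir = Path(tempfile.mkdtemp())  # throwaway directory for this sketch
saved = []
for extension in ("svg", "pdf", "png"):
    path = out_dir / f"figpath.{extension}"
    # dpi sets the resolution of raster output such as PNG
    save_fig.savefig(path, dpi=150)
    saved.append(path)

print([p.name for p in saved])
```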

### matplotlib Configuration

matplotlib comes configured with color schemes and defaults that are geared primarily toward preparing figures for publication. Fortunately, nearly all of the default behavior can be customized via global parameters governing figure size, subplot spacing, colors, font sizes, grid styles, and so on. One way to modify the configuration programmatically from Python is to use the `plt.rc` function; for example, to set the global default figure size to be 10 × 10, you could enter:

```
plt.rc("figure", figsize=(10, 10))
```

All of the current configuration settings are found in the `plt.rcParams` dictionary, and they can be restored to their default values by calling the `plt.rcdefaults()` function.
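
A short sketch of that round trip (the variable names `changed` and `restored` are introduced only for illustration): the value set through `plt.rc` shows up in `plt.rcParams`, and `plt.rcdefaults()` resets it.

```python
import matplotlib.pyplot as plt

plt.rc("figure", figsize=(10, 10))
changed = list(plt.rcParams["figure.figsize"])   # value after the change

plt.rcdefaults()                                 # back to matplotlib's defaults
restored = list(plt.rcParams["figure.figsize"])  # value after the reset

print(changed, restored)
```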

The first argument to `rc` is the component you wish to customize, such as `"figure"`, `"axes"`, `"xtick"`, `"ytick"`, `"grid"`, `"legend"`, or many others. After that can follow a sequence of keyword arguments indicating the new parameters:

```
plt.rc("font", family="monospace", weight="bold", size=8)
```
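
The same options can also be collected in a dictionary and unpacked into the call. This is a minimal sketch; the name `font_options` is an assumption of this example, and the final `plt.rcdefaults()` is there only so the change does not leak into later plots.

```python
import matplotlib.pyplot as plt

# Collect the options in a dictionary and unpack it with ** when calling rc.
font_options = {"family": "monospace", "weight": "bold", "size": 8}
plt.rc("font", **font_options)

font_size = plt.rcParams["font.size"]
font_weight = plt.rcParams["font.weight"]
print(font_size, font_weight)

plt.rcdefaults()  # undo the change so later plots are unaffected
```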

## Plotting pandas

Series and DataFrame have a `plot` attribute for making some basic plot types. By default, `plot()` makes line plots.

A Series object's index is passed to matplotlib for plotting on the x-axis, though you can disable this by passing `use_index=False`. The x-axis ticks and limits can be adjusted with the `xticks` and `xlim` options, and the y-axis ticks and limits with `yticks` and `ylim`. See Table 1 for a partial listing of plot options.

Table 1.

| Argument    | Description |
|-------------|-------------|
| `label`     | Label for plot legend |
| `ax`        | matplotlib subplot object to plot on; if nothing passed, uses active matplotlib subplot |
| `style`     | Style string, like `"ko--"`, to be passed to matplotlib |
| `alpha`     | The plot fill opacity (from 0 to 1) |
| `kind`      | Can be `"area"`, `"bar"`, `"barh"`, `"density"`, `"hist"`, `"kde"`, `"line"`, `"box"`, or `"pie"`; for DataFrame also `"scatter"` and `"hexbin"` |
| `figsize`   | Size of the figure object to create |
| `logx`      | Pass `True` for logarithmic scaling on the x axis; pass `"sym"` for symmetric logarithm that permits negative values |
| `logy`      | Pass `True` for logarithmic scaling on the y axis; pass `"sym"` for symmetric logarithm that permits negative values |
| `title`     | Title to use for the plot |
| `use_index` | Use the object index for tick labels |
| `rot`       | Rotation of tick labels (0 through 360) |
| `xticks`    | Values to use for x-axis ticks |
| `yticks`    | Values to use for y-axis ticks |
| `xlim`      | x-axis limits (e.g., `[0, 10]`) |
| `ylim`      | y-axis limits |
| `grid`      | Display axis grid (off by default) |
| `xlabel`    | Name to use for the xlabel on x-axis. Default uses index name as xlabel |
| `ylabel`    | Name to use for the ylabel on y-axis. Default is no label |
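
A minimal sketch combining a few of these options on a Series. The data and the names `s_demo`, `ax_index`, and `ax_position` are illustrative assumptions; the two `plt.figure()` calls only make sure each plot lands on a fresh figure.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

s_demo = pd.Series(np.random.standard_normal(10).cumsum(),
                   index=np.arange(0, 100, 10))

# Default: the index (0, 10, ..., 90) supplies the x-values.
plt.figure()
ax_index = s_demo.plot(style="ko--", title="With index", grid=True)

# use_index=False: the positions 0..9 are used instead.
plt.figure()
ax_position = s_demo.plot(use_index=False, title="Without index")
```

Each call returns the matplotlib axes it drew on, so the result can be customized further with the axes methods shown earlier.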

Most of pandas’s plotting methods accept an optional `ax` parameter, which can be a matplotlib subplot object. This gives you more flexible placement of subplots in a grid layout.
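
For instance, the following sketch places two pandas bar plots into a grid created with `plt.subplots`. The data and the names `placed_fig`, `placed_axes`, `values`, `top`, and `bottom` are assumptions made for this example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

placed_fig, placed_axes = plt.subplots(2, 1)
values = pd.Series(np.random.uniform(size=8), index=list("abcdefgh"))

# Each call draws into the subplot passed as `ax` and returns that same object.
top = values.plot.bar(ax=placed_axes[0], color="black", alpha=0.7)
bottom = values.plot.barh(ax=placed_axes[1], color="black", alpha=0.7)
```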

DataFrame’s `plot` method plots each of its columns as a different line on the same subplot, creating a legend automatically:

In [None]:
df = pd.DataFrame(np.random.standard_normal((10, 4)).cumsum(0),
    columns=["A", "B", "C", "D"],
    index=np.arange(0, 100, 10))
df.plot()

DataFrame has a number of options allowing some flexibility for how the columns are handled, for example, whether to plot them all on the same subplot or to create separate subplots:

Table 2.

| Argument       | Description |
|----------------|-------------|
| `subplots`     | Plot each DataFrame column in a separate subplot |
| `layout`       | 2-tuple (rows, columns) providing layout of subplots |
| `sharex`       | If `subplots=True`, share the same x-axis, linking ticks and limits |
| `sharey`       | If `subplots=True`, share the same y-axis |
| `legend`       | Add a subplot legend (`True` by default) |
| `sort_columns` | Plot columns in alphabetical order; by default uses existing column order (deprecated in pandas 1.5 and removed in 2.0) |
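
A minimal sketch of the subplot-related options, using random data like the earlier `df` example. The names `df_demo` and `panel_axes` are introduced here for illustration; note that the keyword is spelled `layout`.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df_demo = pd.DataFrame(np.random.standard_normal((10, 4)).cumsum(0),
                       columns=["A", "B", "C", "D"],
                       index=np.arange(0, 100, 10))

# One subplot per column, arranged in a 2 x 2 grid with a shared x-axis.
panel_axes = df_demo.plot(subplots=True, layout=(2, 2), sharex=True)
print(panel_axes.shape)
```

With `subplots=True` the call returns an array of axes rather than a single one, so individual panels can be reached by indexing, as with `plt.subplots`.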

In [None]:
# Preparing dataset for later use
pov = pd.read_csv("C:\\Users\\iwo.augustynski\\Downloads\\share-of-population-in-extreme-poverty.csv", parse_dates=["Year"])

s = pov.Code.unique()  # take only unique values from the 'Code' column
s = np.random.choice(s, size=10, replace=False)  # draw 10 distinct codes at random

pov_sample = pov[pov.Code.isin(s)]  # keep only the rows for the sampled countries

In [None]:
to_figure = pov_sample.pivot(columns="Entity", index="Year", values="$2.15 a day - share of population below poverty line")

In [None]:
to_figure

**Assignment**

Read `to_figure.plot?` and plot the following figures with the `to_figure` data:

1. One line plot with all countries

2. One line plot with all countries, with a grid

3. Line plots in individual subplots

4. Bar plots in subplots with rotated xticks

5. Box plot with rotated xticks

6. The first plot with the x label removed and a y label added

7. The box plot with a title added



In [None]:
#ad 1.



In [None]:
#ad 2.



In [None]:
#ad 3.



In [None]:
#ad 4.



In [None]:
#ad 5.



In [None]:
#ad 6.
