# Matplotlib summary

> **Reference:**  
> - [Matplotlib - Python Data Visualization](https://matplotlib.org/stable/tutorials/introductory/pyplot.html)
> - [Matplotlib - Official Documentation](https://matplotlib.org/stable/api/matplotlib_configuration_api.html)

- Once the graph is built, calling `plt.show()` displays it.
- In a Jupyter Notebook, however, adding `%matplotlib inline` at the top of the notebook renders the figures automatically each time a cell that calls a `pyplot` command is executed.

    <div style="border: 1px solid black; padding: 10px; width: auto; height: auto;">
    <pre>plt.plot(df['x_values'], df['y_values'], c='r', marker = 'o', linewidth=5, label="label name")</pre>
    </div>

    or

    <div style="border: 1px solid black; padding: 10px; width: auto; height: auto;">
    <pre>df.plot(x='x_values', y='y_values', title='Plot title', figsize=(10, 6))</pre>
    </div>
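
As a minimal sketch of the two calls above, the following builds a small DataFrame and draws it once through `pyplot` and once through the pandas plotting interface. The data values are made up for illustration; only the column names `x_values` and `y_values` come from the snippets above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data; only the column names come from the snippets above.
df = pd.DataFrame({"x_values": [1, 2, 3, 4], "y_values": [1, 4, 9, 16]})

# pyplot interface: pass the data itself.
lines = plt.plot(df["x_values"], df["y_values"],
                 c="r", marker="o", linewidth=5, label="label name")
plt.legend()  # shows the label given above

# pandas interface: pass column names; pandas creates a new figure here.
ax2 = df.plot(x="x_values", y="y_values", title="Plot title", figsize=(10, 6))

# plt.show()  # needed in a plain script, not with %matplotlib inline
```

Both calls return objects (`Line2D` instances and an `Axes`, respectively) that can be customized further before the figure is displayed.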

Matplotlib also lets you change the style of the elements on a graph, which helps distinguish them and can carry extra meaning (for example, through a specific color).

For each (x, y) pair, an optional third argument, the format string, specifies the color and style of the line.

**Line style:** Available line styles are:

| Line styles | Description |
|-------------|-------------|
| `-` | continuous line |
| `--` | dashed line |
| `:` | dotted line |
| `-.` | mix of dashes and dots |

**Colors:** Available colors are:

| Colors |  Description |
|--------|--------------|
| `b` |  blue |
| `g` |  green |
| `r` | red |
| `c` | cyan |
| `m` | magenta |
| `y` | yellow |
| `k` | black |
| `w` | white |

The color and the line style can be combined in a single string. For example, the default format string is `b-`, which is a solid blue line.

**Marker:** Markers can be added to lines or used instead of them. The marker symbols below can be concatenated with the colors above.

| Marker | Description |
|--------|-------------|
| `.`   | point marker |
| `,` | pixel marker |
| `o` | circle marker |
| `v` | triangle_down marker |
| `^` | triangle_up marker |
| `<` | triangle_left marker |
| `>` | triangle_right marker |
| `1` | tri_down marker |
| `2` | tri_up marker |
| `3` | tri_left marker |
| `4` | tri_right marker |
| `s` | square marker |
| `p` | pentagon marker |
| `*` | star marker |
| `h` | hexagon1 marker |
| `H` | hexagon2 marker |
| `+` | plus marker |
| `x` | x marker |
| `D` | diamond marker |
| `d` | thin_diamond marker |
| `_` | hline marker |

The style of a line chart can be specified in a single character string, in this order: color, line style, marker. For example:

- `r-*` for a red solid line with star markers,
- `y:d` for a yellow dotted line with thin diamond markers.
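
The format strings above can be tried out with a short sketch like the one below. The data are arbitrary; the third line uses a format string without a line style (`k^`), which draws markers only.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]  # arbitrary illustration data

fig, ax = plt.subplots()
# Red solid line with star markers.
(red_line,) = ax.plot(x, [1, 2, 3, 4], "r-*", label="r-*")
# Yellow dotted line with thin diamond markers.
(yellow_line,) = ax.plot(x, [4, 3, 2, 1], "y:d", label="y:d")
# Black triangle_up markers, no connecting line.
(markers_only,) = ax.plot(x, [2, 2, 2, 2], "k^", label="k^")
ax.legend()
```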

**Multiple plots:**
- `df.plot(x= col1, y = [col2,col3], style = ["r--", "c-+"], title = 'Title')`
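
A possible concrete version of this call is sketched below, assuming `col1`, `col2` and `col3` are column names of a DataFrame. The data are invented for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical DataFrame; the column names are placeholders.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4],
    "col2": [1, 4, 9, 16],
    "col3": [16, 9, 4, 1],
})

# One style string per y column: red dashed line, cyan solid line with plus markers.
ax = df.plot(x="col1", y=["col2", "col3"], style=["r--", "c-+"], title="Title")
```

Each entry of `style` is applied to the y column at the same position in the list.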

**Barplot:**
- `plt.bar(x, height, width=0.8, bottom=None, *, align='center', data=None, **kwargs)`
- The `plt.bar()` function in Matplotlib is used to create a bar plot. Here's an explanation of its parameters:

  - `x`: The x-coordinates of the bars.
  - `height`: The heights of the bars.
  - `width`: The width of the bars. Default is 0.8.
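  - `bottom`: The y-coordinate of the bottom of the bars. Default is None, which is treated as 0.
  - `align`: The alignment of the bars with the x-coordinates. Options are 'center' and 'edge'.
  - `data`: An optional indexable object (for example a dict or a `DataFrame`). When given, string arguments such as `x` and `height` are looked up in it. Default is None.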
  - `**kwargs`: Additional keyword arguments that control the appearance of the bars, such as color, edgecolor, etc.

- This function allows you to create vertical bar plots where the height of each bar represents the value of the data. The `x` parameter specifies the x-coordinates of the bars, and the `height` parameter specifies the heights of the bars. Other parameters allow you to customize the appearance and alignment of the bars.
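
The sketch below illustrates these parameters with invented categories and values. The second call uses `bottom` to stack a second series on top of the first.

```python
import matplotlib.pyplot as plt

# Invented categories and values for illustration.
categories = ["A", "B", "C"]
values = [3, 7, 5]
extra = [1, 2, 1]

# Basic vertical bars, centered on their x positions.
bars = plt.bar(categories, values, width=0.5, align="center",
               color="c", edgecolor="k")

# Stacked bars: the second series starts where the first one ends.
stacked = plt.bar(categories, extra, width=0.5, bottom=values, color="m")
```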

**Scatterplot:**
- `plt.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, *, edgecolors=None, plotnonfinite=False, data=None, **kwargs)`

- The `plt.scatter()` function in Matplotlib is used to create a scatter plot. The parameters are:

  - `x`: The x-coordinates of the data points.
  - `y`: The y-coordinates of the data points.
  - `s`: The size of the markers. This can be a scalar or an array of the same length as `x` and `y`, controlling the size of each marker individually.
  - `c`: The color of the markers. This can be a single color format string, a sequence of color specifications, or an array containing the color of each marker.
  - `marker`: The marker style. This can be a string or a path specifying the marker shape.
  - `cmap`: The colormap used to map scalar values to colors.
  - `norm`: A normalization instance that scales data values to the interval [0, 1].
  - `vmin`, `vmax`: The minimum and maximum scalar values for mapping to colors.
  - `alpha`: The transparency of the markers.
  - `linewidths`: The width of the marker edges.
  - `edgecolors`: The color of the marker edges.
  - `plotnonfinite`: Whether to plot points whose `c` value is non-finite (NaN or infinite). If False, those points are masked out.
  - `data`: An optional indexable object (for example a dict or a `DataFrame`). When given, string arguments such as `x` and `y` are looked up in it.

- These parameters allow you to customize various aspects of the scatter plot, such as the marker size, color, style, transparency, and more.
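
As a sketch of how several of these parameters combine, the example below draws random points whose size and color vary per point. The data and the choice of the `viridis` colormap are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Random illustration data with a fixed seed.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = rng.normal(size=50)
sizes = 20 + 100 * rng.random(50)   # one marker size per point
values = x + y                      # scalar mapped to a color per point

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=sizes, c=values, cmap="viridis",
                    vmin=-3, vmax=3, alpha=0.7, marker="o",
                    edgecolors="k", linewidths=0.5)
fig.colorbar(points, ax=ax, label="x + y")  # legend for the color mapping
```

Because `c` holds scalars rather than color names, `cmap`, `vmin` and `vmax` control how those scalars are turned into colors.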


**Histogram:**
- `plt.hist(x, bins=None, range=None, density=False, weights=None, cumulative=False, bottom=None, histtype='bar', align='mid', orientation='vertical', rwidth=None, log=False, color=None, label=None, stacked=False, *, data=None, **kwargs)`

- The `plt.hist()` function in Matplotlib is used to create a histogram. Here's an explanation of its parameters:

  - `x`: The data to be plotted.
  - `bins`: The number of bins to use or an array specifying the bin edges.
  - `range`: The range of values to be binned. If not provided, it is inferred from the data.
  - `density`: If True, the histogram is normalized to form a probability density.
  - `weights`: An array of weights, one per data point.
  - `cumulative`: If True, a cumulative histogram is computed.
  - `bottom`: The y-coordinate of the bottom of the bars.
  - `histtype`: The type of histogram to draw. Options include 'bar', 'barstacked', 'step', 'stepfilled'.
  - `align`: The alignment of the bars with the bin edges. Options include 'left', 'mid', 'right'.
  - `orientation`: The orientation of the histogram. Options include 'vertical' and 'horizontal'.
  - `rwidth`: The relative width of the bars as a fraction of the bin width.
  - `log`: If True, the histogram axis is set to a logarithmic scale.
  - `color`: The color of the bars.
  - `label`: The label for the histogram.
  - `stacked`: If True, multiple data are stacked on top of each other.

- These parameters allow you to customize various aspects of the histogram, such as the number of bins, range, normalization, bar appearance, and more.
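
A minimal sketch, using randomly generated data, shows what `hist` returns and what `density=True` does: the bar heights are scaled so that the total area of the bars equals 1.

```python
import matplotlib.pyplot as plt
import numpy as np

# Random illustration data with a fixed seed.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=1000)

fig, ax = plt.subplots()
# hist returns the bar heights, the bin edges and the drawn patches.
counts, edges, patches = ax.hist(data, bins=20, range=(-4, 4), density=True,
                                 color="g", rwidth=0.9, label="sample")
ax.legend()

# With density=True, heights times bin widths sum to 1.
area = float((counts * np.diff(edges)).sum())
```

With `bins=20`, there are 20 bar heights and 21 bin edges.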

**Box plot:** 
- `plt.boxplot(x, notch=None, sym=None, vert=None, whis=None, positions=None, widths=None, patch_artist=None, bootstrap=None, usermedians=None, conf_intervals=None, meanline=None, showmeans=None, showcaps=None, showbox=None, showfliers=None, boxprops=None, labels=None, flierprops=None, medianprops=None, meanprops=None, capprops=None, whiskerprops=None, manage_ticks=True, autorange=False, zorder=None, capwidths=None, *, data=None)`
- The `plt.boxplot()` function in Matplotlib is used to create a box plot. Here's an explanation of its parameters:

  - `x`: The data to be plotted as a box plot.
  - `notch`: Whether to draw a notch around the median. Default is None.
  - `sym`: The symbol used to represent outliers. Default is None.
  - `vert`: Whether to draw vertical or horizontal box plots. Default is None.
  - `whis`: The proportion of the IQR past the low and high quartiles to extend the plot whiskers. Default is None, which corresponds to 1.5.
  - `positions`: The positions of the boxes along the x-axis. Default is None.
  - `widths`: The widths of the boxes. Default is None.
  - `patch_artist`: Whether to fill the box with color. Default is None.
  - `bootstrap`: The number of bootstrap resamples used to compute the confidence intervals around the median of notched box plots. Default is None (no bootstrapping).
  - `usermedians`: Custom values for the medians. Default is None.
  - `conf_intervals`: Custom confidence intervals. Default is None.
  - `meanline`: Whether to show a line for the mean. Default is None.
  - `showmeans`: Whether to show the mean. Default is None.
  - `showcaps`: Whether to show the caps. Default is None.
  - `showbox`: Whether to show the box. Default is None.
  - `showfliers`: Whether to show the outliers. Default is None.
  - `boxprops`: Properties for the boxes. Default is None.
  - `labels`: Labels for the boxes. Default is None.
  - `flierprops`: Properties for the outliers. Default is None.
  - `medianprops`: Properties for the medians. Default is None.
  - `meanprops`: Properties for the means. Default is None.
  - `capprops`: Properties for the caps. Default is None.
  - `whiskerprops`: Properties for the whiskers. Default is None.
  - `manage_ticks`: Whether to manage ticks. Default is True.
  - `autorange`: If True and the 25th and 75th percentiles of the data are equal, the whiskers are extended to the minimum and maximum of the data. Default is False.
  - `zorder`: The z-order of the box plot. Default is None.
  - `capwidths`: The widths of the caps. Default is None.
  - `data`: An optional indexable object (for example a dict or a `DataFrame`). When given, a string passed as `x` is looked up in it. Default is None.

- This function allows you to create box plots to visualize the distribution of data and detect outliers. You can customize various aspects of the plot, including the appearance of the boxes, whiskers, caps, medians, means, and outliers.
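
The sketch below draws three notched, filled boxes from randomly generated groups and inspects the dictionary that `boxplot` returns. The data are invented. Only parameters listed above are used, and `labels` and `vert` are left out here because their names appear to vary between Matplotlib versions.

```python
import matplotlib.pyplot as plt
import numpy as np

# Three random groups with different means (illustration data).
rng = np.random.default_rng(0)
groups = [rng.normal(loc=m, scale=1.0, size=100) for m in (0, 1, 2)]

fig, ax = plt.subplots()
result = ax.boxplot(groups, notch=True, whis=1.5, patch_artist=True,
                    showmeans=True, positions=[1, 2, 3], widths=0.5)

# patch_artist=True makes the boxes fillable patches.
for box in result["boxes"]:
    box.set_facecolor("c")
```

The returned dictionary groups the drawn artists by role (`boxes`, `medians`, `whiskers`, `caps`, `fliers`, `means`), which makes it possible to restyle them after plotting.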

<div style="border: 1px solid black; padding: 10px; width: auto; height: auto;">
<pre>%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns</pre>
</div>
