# Plots and Charts

A plot is the most intuitive way to demonstrate what you have discovered from data. 
The ability to produce well-designed plots is arguably more important than any 
other skill you might learn from this course.

The main package for generating plots in Python is ```matplotlib```.
This notebook demonstrates how to plot several basic types of diagrams:
*histogram*, *bar chart*, *pie chart*, *line plot* and *scatter plot*.

### A. Histogram

A histogram is a bar chart that shows the distribution of a data series.
The syntax for plotting a histogram is:
```python
matplotlib.pyplot.hist(data)
```
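
As a minimal sketch of that call, the block below plots the sample data used throughout this section with every setting left at its default. The names ```counts```, ```edges``` and ```patches``` are our own labels for the three values that ```hist``` returns, and the axis labels are added only for readability.

```python
import matplotlib.pyplot as plt

# Sample data used throughout this section
y = [1,1,1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5]

# hist() returns the bar heights, the bin edges and the drawn bar objects
counts, edges, patches = plt.hist(y)

# Axis labels are optional (our own addition)
plt.xlabel("value")
plt.ylabel("count")
```

In a notebook the figure is rendered when the cell finishes; in a plain script you would typically call ```plt.show()``` afterwards.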

In [None]:
import matplotlib.pyplot as plt

# Data
y = [1,1,1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5]

# Plot a simple histogram


In the above example, we did not specify any parameters so ```matplotlib``` decides everything for us.
The parameters we work with most often are ```bins```, ```range``` and ```density```.
- ```bins``` specifies how many bars we want the histogram to have. 
    Alternatively we can specify the edge values of each bar. 
- ```range``` specifies the range of values the histogram covers.
- ```density``` specifies whether the bar heights are scaled so that the area under the histogram sums to 1.
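
The following sketch shows the effect of ```bins``` and ```range``` side by side. Using ```plt.subplots``` to place three histograms in one figure is our own choice for comparison, and the variable names are illustrative.

```python
import matplotlib.pyplot as plt

y = [1,1,1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5]

# Three axes in one row so the variants can be compared
fig, axes = plt.subplots(1, 3, figsize=(12, 3))

# bins as an integer: two equal-width bars spanning the data (1 to 5)
counts_two, edges_two, _ = axes[0].hist(y, bins=2)

# bins as a list: explicit edges, giving bars [1, 3), [3, 5) and [5, 10]
counts_edges, edges_list, _ = axes[1].hist(y, bins=[1, 3, 5, 10])

# range: only values between 2 and 4 are counted
counts_range, edges_range, _ = axes[2].hist(y, range=(2, 4))
```

Note that every bar includes its left edge and excludes its right edge, except the last bar, which includes both.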

In [None]:
# Histogram with only two bars


In [None]:
# Histogram with bars starting at 1, 3 and 5, the last bar ending at 10


In [None]:
# Histogram covering only the data between 2 and 4


### B. Bar Chart

We could produce the same diagram with a vanilla bar chart, 
but that would require us to count the repeated values ourselves.
```python
matplotlib.pyplot.bar(values,count)
```
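
A minimal sketch of this approach, with the counts tallied by hand from the sample data ```y = [1,1,1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5]```:

```python
import matplotlib.pyplot as plt

# Distinct values in the sample data and how often each occurs (counted by hand)
values = [1, 2, 3, 4, 5]
count = [5, 3, 4, 5, 3]

# One bar per value, with height equal to its count
bars = plt.bar(values, count)
plt.xlabel("value")
plt.ylabel("count")
```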

In [None]:
# Plot the same histogram using a bar chart


Counting the number of repeated values by hand is obviously 
impractical for even moderately long data series, 
so in practice you will utilize functions such as ```numpy.histogram()```.
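
One way to do the counting with ```numpy.histogram()``` is sketched below. Placing the bin edges at 0.5, 1.5, ..., 5.5 so that each integer value gets its own bin is an assumption that suits this particular data set, not a general recipe.

```python
import numpy as np
import matplotlib.pyplot as plt

y = [1,1,1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5]

# One bin per integer value: edges at 0.5, 1.5, ..., 5.5
edges = np.arange(0.5, 6.5, 1.0)
count, _ = np.histogram(y, bins=edges)

# The bin centres are the integer values 1 to 5
values = (edges[:-1] + edges[1:]) / 2

plt.bar(values, count)
```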

### C. Pie Chart

A pie chart is another common choice to show the distribution of a data series.
```python
matplotlib.pyplot.pie(wedge_sizes)
```
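
As a sketch, the hand-counted frequencies of the sample data can serve as wedge sizes. The ```labels``` argument and the call to ```plt.axis("equal")``` are optional extras we add for readability.

```python
import matplotlib.pyplot as plt

# Distinct values in the sample data and how often each occurs
values = [1, 2, 3, 4, 5]
count = [5, 3, 4, 5, 3]

# Each wedge's angle is proportional to its count; labels name the wedges.
# The first item returned by pie() is the list of wedge objects.
labels = [str(v) for v in values]
wedges = plt.pie(count, labels=labels)[0]

# Equal aspect ratio keeps the pie circular
plt.axis("equal")
```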

In [None]:
# Plot a simple pie chart


### D. Line Plot

A line plot draws a line joining all data points.
It is most suitable for data that comes in sequence, such as time series data.

To simply plot all data sequentially, use:
```python
matplotlib.pyplot.plot(y)
```
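
A minimal sketch using the sample data from this section. The name ```lines``` is our own label for the list of line objects that ```plot``` returns.

```python
import matplotlib.pyplot as plt

y = [1,1,1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5]

# With a single argument, the values are plotted against their index 0, 1, 2, ...
lines = plt.plot(y)
```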

In [None]:
# Simple line chart


We can also specify the horizontal values:
```python
matplotlib.pyplot.plot(horizontal_values, vertical_values)
```
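
The sketch below pairs the sample data with horizontal values that count down from 20 to 1. Both sequences must have the same length.

```python
import matplotlib.pyplot as plt

# Horizontal values 20, 19, ..., 1 and the sample data as vertical values
x = list(range(20, 0, -1))
y = [1,1,1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5]

# The first argument gives the horizontal positions, the second the vertical ones
lines = plt.plot(x, y)
```

Because ```x``` is decreasing, the first data point is drawn at the right-hand end of the line.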

In [None]:
# Line chart with horizontal values specified
x = [20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1]


There are a lot of settings we can change: color, line style, etc.
See https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.plot.html#matplotlib.pyplot.plot
for details.
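
For instance, a red dashed line could be drawn as sketched below, using the ```color``` and ```linestyle``` keyword arguments of ```plot```.

```python
import matplotlib.pyplot as plt

y = [1,1,1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5]

# color and linestyle are keyword arguments of plot();
# the format string "r--" is an equivalent shorthand: plt.plot(y, "r--")
lines = plt.plot(y, color="red", linestyle="--")
```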
    

In [None]:
# Dashed line in red


### E. Scatter Plot

A scatter plot shows the relationship between two variables by plotting 
data points on a 2D diagram.
It is useful when the two data series are not sequential in nature, 
so joining the points with a line would be misleading.
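
A minimal sketch using ```matplotlib.pyplot.scatter```, with the ```x``` and ```y``` lists from the earlier sections reused as the two variables. Drawing the default and the cross-marker version side by side with ```plt.subplots``` is our own choice for comparison.

```python
import matplotlib.pyplot as plt

# Two data series of equal length
x = [20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1]
y = [1,1,1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5]

fig, (left, right) = plt.subplots(1, 2, figsize=(9, 3))

# Default style: one round marker per (x, y) pair
dots = left.scatter(x, y)

# marker="x" draws crosses instead
crosses = right.scatter(x, y, marker="x")
```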

In [None]:
# Simple scatter plot


In [None]:
# Change the marker style to crosses
