# Plotting

So far, the only output from our programs has been
printed text.  However, if you are using Python
for data science, statistics, applied mathematics, or many
other applications of computing, you often want to do more
than just compute and print one or more values.  Typically,
you may want to visualize the result of your simulation,
computation, or analysis.  To visualize results, we
will make use of Python libraries that allow for easy
creation of visualizations.


## Libraries
Up until now, we've focused primarily on the basics of Python -- 
knowing and using the
fundamental constructs available in Python and understanding how
to combine them to solve a problem computationally.  While this
fundamental understanding is crucial, in Python it's very
common to make use of other libraries that:
* provide specialized functionality not included in base Python
* make other tasks that could be done with base Python easier

We've already seen a couple of libraries, namely `math` and `random`.
These are both examples of libraries included
in base Python.  However, one of the great things about Python
for applied computing is the sheer number of other, specialized
libraries available.  As of 2020, there are over 250,000 projects
available for Python on the Python Package Index -- the standard
repository for distributing libraries.  On most cloud-based systems
for running Python (like CoCalc or Colab), many of the most
popular libraries are already installed.  So, to use them we
simply type `import libraryname`.  In some cases, if the library
name is really long, you may see an alternate form of this
import statement:  `import libraryname as someshortname`, where
`someshortname` is typically a 2-5 letter abbreviation for the library's
name.
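
As a small illustration of the two import forms, the sketch below uses the two libraries already seen.  The short name `rnd` is an arbitrary choice made for this example, not a standard convention.

```python
import math
import random as rnd  # "rnd" is an arbitrary short name picked for this example

root = math.sqrt(16)       # call a function through the full library name
rnd.seed(0)                # seed so repeated runs give the same roll
roll = rnd.randint(1, 6)   # call a function through the short name

print(root, roll)
```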

## Matplotlib Library

The most common library used for visualizing results in Python
is Matplotlib.  It is great for producing a wide variety of 
visualizations with combinations of:
* line and scatter plots
* bar charts and histograms
* box plots
* contour plots
* vector fields
* images
* heat maps
* simple plot animations
* simple 3D plots

Any plot can be customized with labels, legends, colors, annotations,
and other stylistic choices to allow you to create professional
quality visualizations.

### Help with Matplotlib
This section walks through and demonstrates a variety of plots
you can make with matplotlib, but there are far more than can
be covered here.
Matplotlib has good
[documentation](https://matplotlib.org/api/pyplot_api.html)
and a large
[gallery of examples](https://matplotlib.org/1.5.1/gallery.html)
that illustrates plots and the code used to create them.
Often, it is helpful to find an example in the gallery of what
sort of plot you are looking for and then view the code to create
the plot and adapt it to your specific use.

### Importing
To get started with matplotlib, we first need to import it.
The matplotlib library consists of multiple modules (essentially multiple small parts).  This is something we didn't see with
the `random` or `math` libraries.  To access functions within a
specific module in a library, we would need to call them with
`libraryname.modulename.functionname()`.  As you can imagine,
this can get quite long and be intrusive to type.  Instead,
if you are going to access primarily functions from one module,
we can import it with `import libraryname.modulename as shortname`.
Then, we can just use `shortname.functionname()` when we wish
to call a function. For matplotlib, you typically really only
need to call functions from the `pyplot` module, so we use the
following import statement:
```
import matplotlib.pyplot as plt
```
The name `plt` is optional, but it is
the standard convention for `matplotlib`.  Most major libraries
have standard short names that are used by the vast majority of
people that write code, and sticking with the standard makes it
easier for others to read your code.

In Jupyter notebooks (like this one), we typically add a
"magic command" that makes the plots fit a little nicer
directly in the notebooks.  Magic commands are lines
that start with `%`.  The magic command
typically used with matplotlib is
```
%matplotlib inline
```

We'll start by running both the magic command and our import statement.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

### Plotting Overview
There are technically two different ways to interact
with matplotlib:
1. pyplot interface:  Calling a function makes some
   change to the current figure such as:
   * create a new figure (or switch the current figure)
   * add a plot to the figure
   * plot some lines in current plot ("axes")
2. object oriented API:  This is a more formal approach
   that explicitly creates "axes" objects and calls their
   methods.

We'll focus on (1), but you may see (2) if you look at
other resources or the matplotlib gallery.
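
To help recognize the second style when it appears elsewhere, here is a minimal side-by-side sketch.  The data values are made up for illustration, and `plt.subplots()` is just one common way to create the figure and axes objects explicitly.

```python
import matplotlib.pyplot as plt

values = [1, 4, 9, 16]  # made-up data for illustration

# (1) pyplot interface: each call acts on the "current" figure and axes
plt.figure()
plt.plot(values)
plt.title('pyplot interface')

# (2) object oriented API: create the figure and axes explicitly,
#     then call methods on the axes object
fig, ax = plt.subplots()
ax.plot(values)
ax.set_title('object oriented API')

plt.show()
```

Both figures show the same line; only the way the plotting commands are issued differs.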

### Standard Line/Scatter Plot
The simplest way to plot is to use the `plot` function.
This function can be called in many different ways (because
many of the arguments are optional).  Some of the
most common are:
* `plt.plot(yvals)` - takes list or array of $y$-values, assumes
   the $x$-values are integers ranging from 0 to `len(yvals)-1`
* `plt.plot(xvals,yvals)` - plots line of points given by y versus x
   for list/array of $y$-values and $x$-values
* `plt.plot(xvals, yvals, stylestring)` - same as previous but lets
   you customize the line style by specifying color, whether the points should
   be marked with dots, whether they should be connected by a line, etc.

There are also many other arguments you can pass as keyword arguments to `plot`.
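
The call forms listed above can be sketched with a few made-up numbers.  The variable names and values here are illustrative only; the sketch also reads back the $x$-values matplotlib chose when none were given.

```python
import matplotlib.pyplot as plt

yvals = [3, 1, 4, 1, 5]          # made-up values for illustration
xvals = [10, 20, 30, 40, 50]     # made-up x-values of the same length

# Form 1: only y-values, so matplotlib supplies x = 0, 1, ..., len(yvals)-1
line_default, = plt.plot(yvals)
default_xs = [int(x) for x in line_default.get_xdata()]

# Forms 2 and 3: explicit x-values, optionally followed by a style string
plt.figure()
line_explicit, = plt.plot(xvals, yvals)
plt.plot(xvals, yvals, 'r.')     # red small dots at the same points, no connecting line

print(default_xs)
plt.show()
```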

#### Starting Simple
Let's start by looking at the simplest example by plotting
the population (from the census data) in Detroit from
$1910, 1920, \ldots, 2000, 2010$.

In [None]:
detroit_populations = [465766, 993678, 1568662, 1623452, 1849568, 1670144, 1511482, 1203339, 1027974, 951270, 713777]
plt.plot(detroit_populations)
plt.show()

Note that this is just the population data, with the $x$-axis
ranging from 0 to 10.  We can also provide $x$-values
as well when calling `plt.plot`.  Since we know the
years start at 1910 and go up to 2010 by 10 each time,
we can easily make the list of years with `range`.

In [None]:
detroit_years = list(range(1910,2011,10))

plt.plot(detroit_years, detroit_populations)
plt.show()

#### Changing Line Style and Color
That plot looks okay, but a little plain.  This
is because the default plot style simply
connects the data points with a line.
You can make it look better by customizing the
color, line style, line width, etc.  Some of the
most common keyword arguments to modify the style
are:
* color: color of the line (the keyword can be abbreviated as `c`)
    * `'r'` = red
    * `'b'` = blue
    * `'m'` = magenta
    * `'c'` = cyan
    * `'g'` = green
    * `'k'` = black
    * `'y'` = yellow
* marker: marker to use for data points:
    * `'x'` = letter x
    * `'+'` = plus sign
    * `'o'` = normal dot
    * `'.'` = small dot
    * `'s'` = square
    * `''` = no marker
* linestyle: how to display line connecting points
    * `'-'` = solid line
    * `':'` = dotted line
    * `'--'` = dashed line
    * `'-.'` = alternating dash-dot
    * `''` = no line between data points
* linewidth: width of line, default value is 1.5, larger values are thicker lines
* markersize: size of marker, default value is 6, larger values are larger markers

The full list of arguments to plot can be seen in the
[documentation for the plot function](https://matplotlib.org/3.2.1/api/_as_gen/matplotlib.pyplot.plot.html).

Let's look at an example:

In [None]:
detroit_years = list(range(1910,2011,10))

plt.plot(detroit_years, detroit_populations, color='c', marker='o', linestyle='--', linewidth=2.5, markersize=8)
plt.show()

Since these are common, they can often be abbreviated in a couple of ways:
1. shortened keyword:
    * can use `c=` instead of `color=`
    * can use `ls=` instead of `linestyle=`
    * can use `lw=` instead of `linewidth=`
    * can use `ms=` instead of `markersize=`
2. provide a shorthand format string:
    * lets you specify color, marker, and linestyle in one argument,
      for example `'co--'`
        * first character = color
        * next character = marker
        * remaining one or two characters = linestyle
    * provided as a short string directly after the data arguments

For example, using the shortened keywords:

In [None]:
plt.plot(detroit_years, detroit_populations, c='c', marker='o', ls='--', lw=2.5, ms=8)
plt.show()

To generate the same figure using a shorthand format string, 
we can use:

In [None]:
plt.plot(detroit_years, detroit_populations, 'co--', lw=2.5, ms=8)
plt.show()
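
One way to convince yourself that the keyword form and the format string describe the same style is to read the properties back from the line objects that `plt.plot` returns.  This is only a sketch with made-up data; the colors are converted with `matplotlib.colors.to_rgba` before comparing, since a color may be stored in different but equivalent forms.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

years = [1990, 2000, 2010]   # made-up data for illustration
values = [3, 5, 4]

# Same style written two ways: full keywords and a shorthand format string
line_kw, = plt.plot(years, values, color='c', marker='o', linestyle='--')
line_fmt, = plt.plot(years, values, 'co--')

# Compare the resulting line properties; colors are normalized to RGBA first
same_color = to_rgba(line_kw.get_color()) == to_rgba(line_fmt.get_color())
same_marker = line_kw.get_marker() == line_fmt.get_marker()
same_linestyle = line_kw.get_linestyle() == line_fmt.get_linestyle()

print(same_color, same_marker, same_linestyle)
plt.show()
```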

#### Adding Titles and Labels

The previous plots had no labels, which really is
not very helpful if you are trying to create useful,
professional looking illustrations.  We can
add labels and a title to any matplotlib figure
(including other types of charts) with:

* `plt.xlabel('desired x-axis label')`
* `plt.ylabel('desired y-axis label')`
* `plt.title('desired title')`

Let's look at adding labels to our previous plot:

In [None]:
plt.plot(detroit_years, detroit_populations, 'co--', lw=2.5, ms=8)
plt.xlabel('census year')
plt.ylabel('population')
plt.title('Detroit Census Population')
plt.show()

#### Multiple Lines

To add more than one line to a plot, you can
simply make multiple calls to `plt.plot()` with
the different data (and potentially different styles)
you wish to plot.  However, any time you plot
multiple lines on a single plot, you should include
a legend to distinguish the different lines.
To indicate the line name that should be used
in the legend, we can pass
one additional keyword argument, `label`, to the
call to the `plot` function.  Then,
we tell matplotlib to put a legend on our plot by
calling `plt.legend()`.  

In [None]:
chicago_populations = [2185283, 2701705, 3376438, 3396808, 3620962, 3550404, 3366957, 3005072, 2783911, 2896016, 2695598]
chicago_years = list(range(1910,2011,10))
plt.plot(detroit_years, detroit_populations, 'co--', label='Detroit')
plt.plot(chicago_years, chicago_populations, 'ms-', label='Chicago')
plt.legend()
plt.show()

Sometimes the default location for the legend
may not be ideal (covering an important part
of the plot).  We can pass an optional keyword
argument, `loc` to `plt.legend()` to specify
where in the plot the legend should be placed.
The full list of locations is available in the
[legend documentation](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.legend.html).  One of the most
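useful choices for location is `'best'`, which is also the
default and attempts to pick, among the nine possible
locations, the one that overlaps the plotted data least.
You can instead name a specific location such as `'upper left'`.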

In [None]:
chicago_populations = [2185283, 2701705, 3376438, 3396808, 3620962, 3550404, 3366957, 3005072, 2783911, 2896016, 2695598]
chicago_years = list(range(1910,2011,10))
plt.plot(detroit_years, detroit_populations, 'co--', label='Detroit')
plt.plot(chicago_years, chicago_populations, 'mo-', label='Chicago')
plt.legend(loc='upper left')
plt.show()

### Other Axis Scales

All of these plots used linear axis scales.  Sometimes,
you may want to visualize something using logarithmic
scales on one or both of the axes.  The choice of axis
scale is typically dependent on the data and what you hope
to show.

For instance, much of the behavior of data
that grows exponentially is obscured
on a plot with linear axes.  Data that grows exponentially
is often better illustrated on a semilog scale with a linear
$x$-axis and a logarithmic $y$-axis.

Similarly, if your data follows a power law, the data
will appear as a straight line on a plot with logarithmic
scales on both axes.

We can create visualizations in matplotlib with other
axis scales by replacing `plt.plot` with one of the
following functions:

* `plt.semilogx(xvals, yvals)` = logarithmic $x$-axis and linear
  $y$-axis
* `plt.semilogy(xvals, yvals)` = linear $x$-axis and
  logarithmic $y$-axis
* `plt.loglog(xvals, yvals)` = logarithmic $x$-axis and
  logarithmic $y$-axis

These functions can also take the same additional arguments
as `plot` to specify line formatting.
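
To see why a power law becomes a straight line on log-log axes: if $y = a x^k$, then $\log y = \log a + k \log x$, which is a line with slope $k$ in the logged variables.  The sketch below uses a made-up power law ($a = 3$, $k = 2$) and checks the slope numerically before plotting with `plt.loglog`.

```python
import math
import matplotlib.pyplot as plt

# Made-up power-law data: y = 3 * x**2
power_xs = [1, 2, 4, 8, 16, 32]
power_ys = [3 * x**2 for x in power_xs]

# Slope between consecutive points after taking logs of both coordinates
log_xs = [math.log10(x) for x in power_xs]
log_ys = [math.log10(y) for y in power_ys]
slopes = [(log_ys[i + 1] - log_ys[i]) / (log_xs[i + 1] - log_xs[i])
          for i in range(len(power_xs) - 1)]
print(slopes)  # every slope should be very close to the exponent, 2

plt.loglog(power_xs, power_ys, 'bo-')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Power law on log-log axes')
plt.show()
```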

In [None]:
ys = [2**i for i in range(1,14)]
xs = [i for i in range(1,14)]
plt.semilogy(xs, ys,'bo-')
plt.show()

### Bar Charts

Matplotlib also provides the ability to 
create bar charts.  To create a bar chart,
we can use one of two functions:
* `plt.bar(names, values)` - vertical bar plot
* `plt.barh(names, values)` - horizontal bar plot

#### Examples

In [None]:
sec8housing = [5528, 1141, 1386, 963, 981, 2271, 1427]
cities = ['Detroit', 'Saginaw', 'Plymouth', 'Flint', 'Wyoming', 'Grand Rapids', 'Lansing']
plt.bar(cities, sec8housing)
plt.show()

In [None]:
sec8housing = [5528, 1141, 1386, 963, 981, 2271, 1427]
cities = ['Detroit', 'Saginaw', 'Plymouth', 'Flint', 'Wyoming', 'Grand Rapids', 'Lansing']
plt.barh(cities, sec8housing)
plt.show()

As with `plot`, there are other optional arguments
that can be used to modify the plot, such as changing
colors, bar width, opacity, $y$-axis scale, edge color, etc.  
The full list of
available options to `plt.bar()` can be found in the
[documentation for bar function](https://matplotlib.org/3.3.2/api/_as_gen/matplotlib.pyplot.bar.html).

As with `plot`, we can also add titles and labels with the same functions
we used previously.

#### Example

In [None]:
plt.bar(cities, sec8housing, color='c', alpha=0.3, width=.5, edgecolor='k', lw=1.3)
plt.title('Section 8 Housing in MI Cities')
plt.xlabel('City')
plt.ylabel('Quantity')
plt.show()

#### Stacked Bar Charts

If you wish to display multiple pieces of quantitative
data that make sense to present both individually
and in total, it can make sense to use a stacked
bar chart.  We can create a stacked
bar chart in matplotlib by calling `plt.bar()` multiple
times with the different data, using the `bottom` keyword
argument when plotting the top bars to specify
that the bottom of the top bars should be at the values of
the bottom data.

In [None]:
plt.bar(cities, sec8housing, color='r', alpha=0.6, label='section 8')
lowrent = [4391, 605, 108, 1248, 197, 447, 834]
plt.bar(cities, lowrent, color='c', alpha=0.5, bottom=sec8housing, label='low rent')
plt.title('Low Rent and Section 8 Housing in MI Cities')
plt.xlabel('City')
plt.ylabel('Quantity')
plt.legend()
plt.show()

There are many other functions that can be called to slightly
modify the appearance of a plot, far more than we can cover
here.  For instance, in the above plot, the city names are overlapping.
One way we can correct that is to rotate the tick labels
by calling `plt.xticks()`.  This is a multi-purpose function
that works for all plot types including line/scatter plots, bar charts, etc.
and allows the tick labels and rotation to be customized.

The cell below repeats the stacked bar chart with the tick
labels (cities) rotated by 45 degrees so they do not overlap.

In [None]:
plt.bar(cities, sec8housing, color='r', alpha=0.6, label='section 8')
lowrent = [4391, 605, 108, 1248, 197, 447, 834]
plt.bar(cities, lowrent, color='c', alpha=0.5, bottom=sec8housing, label='low rent')
plt.title('Low Rent and Section 8 Housing in MI Cities')
plt.xlabel('City')
plt.ylabel('Quantity')
plt.legend()
plt.xticks(rotation=45)
plt.show()
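
The `bottom` idea extends to more than two layers: each new layer sits on the running total of the layers beneath it.  Here is a minimal sketch with made-up categories and values (none of these numbers come from the housing data above).

```python
import matplotlib.pyplot as plt

groups = ['A', 'B', 'C']   # made-up category names
first = [3, 5, 2]          # made-up values for each layer
second = [1, 2, 4]
third = [2, 1, 1]

# The third layer must start at the combined height of the first two
base_for_third = [f + s for f, s in zip(first, second)]

plt.bar(groups, first, label='first')
plt.bar(groups, second, bottom=first, label='second')
top_bars = plt.bar(groups, third, bottom=base_for_third, label='third')
plt.legend()
plt.show()
```

Forgetting to add the earlier layers together would draw the third layer on top of the first only, overlapping the second.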

### Histograms

Histograms are very similar to bar charts, but differ in
that they display the number of values in each of a set of
bins.  Matplotlib will both compute the bins and visualize
the histogram with:
* `plt.hist(values)`
* `plt.hist(values, bins)`

The second call is exactly the same as the first except
it allows you to specify the bins by choosing:
* the number of bins if `bins` is an integer
* the bin edges if `bins` is a list
* the binning strategy if `bins` is a string

As with `bar`, there are many different keyword arguments
you can pass when calling the function to customize the histogram.
#### Example

In [None]:
monthly_precip_gr = [2.73,1.99,2.9,3.43,2.23,4.05,5.96,4.36,3.92,3.41,7.32,7.13,2.62,4.04,3.65,1.2,3.27,4.09,4.22,2.84,4.75,2.6,3.07,2.67]
plt.hist(monthly_precip_gr)
plt.title('Histogram of Total Monthly Precipitation in Grand Rapids')
plt.xlabel('Precipitation (inches)')
plt.ylabel('Frequency')
plt.show()
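
The three forms of `bins` can be compared by looking at what `plt.hist` returns: the counts in each bin, the bin edges, and the drawn bars.  The sketch below uses a small made-up data set, and `'auto'` is one of the strategy names accepted for string-valued `bins`.

```python
import matplotlib.pyplot as plt

sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 7]   # made-up values for illustration

# Integer: number of equal-width bins between the smallest and largest value
counts_n, edges_n, _ = plt.hist(sample, bins=4)

# List: explicit bin edges (4 bins: 0-2, 2-4, 4-6, 6-8)
plt.figure()
counts_e, edges_e, _ = plt.hist(sample, bins=[0, 2, 4, 6, 8])

# String: let a named strategy choose the bins
plt.figure()
counts_s, edges_s, _ = plt.hist(sample, bins='auto')

print(len(counts_n), len(edges_n))   # n bins always have n + 1 edges
print(list(counts_e))
plt.show()
```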

### Boxplot

Box and whisker plots are very common in statistics.
We can create a box plot with the `boxplot` function,
passing it the list of values:
* `plt.boxplot(values)` - vertical boxplot
* `plt.boxplot(values, vert=False)` - horizontal boxplot

#### Examples

In [None]:
rent_prices = [350, 400, 450, 425, 375, 500, 480, 550, 650, 700]
plt.boxplot(rent_prices)
plt.show()

In [None]:
rent_prices_allendale = [350, 400, 450, 425, 375, 500, 480, 550, 650, 700]
plt.boxplot(rent_prices_allendale, vert=False)
plt.show()

The `values` can be a list of lists to plot more than one
boxplot on the same figure.  You can also specify a label
(to replace the 1 on the left in the previous plot), which
is used to label the different boxplots when there is
more than one.  For example,

In [None]:
rent_prices_allendale = [350, 400, 450, 425, 375, 500, 480, 550, 650, 700]
rent_prices_gr = [500, 550, 700, 850, 1000, 750, 600, 650, 800]
all_prices = [rent_prices_allendale, rent_prices_gr]
cities = ['Allendale', 'Grand Rapids']
plt.boxplot(all_prices, vert=False, labels=cities)
plt.show()
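
As a rough sanity check of what a boxplot draws, the line inside the box marks the median of the data.  The sketch below reads that line back from the dictionary `plt.boxplot` returns and compares it with the median from Python's `statistics` module, reusing the rent values from the first boxplot example.

```python
import statistics
import matplotlib.pyplot as plt

rent_prices = [350, 400, 450, 425, 375, 500, 480, 550, 650, 700]

# boxplot returns a dictionary of the drawn pieces ('medians', 'boxes', ...)
parts = plt.boxplot(rent_prices)

# For a vertical boxplot, the median line is horizontal, so its y-value is the median
median_from_plot = float(parts['medians'][0].get_ydata()[0])
median_from_stats = statistics.median(rent_prices)

print(median_from_plot, median_from_stats)
plt.show()
```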

### Images
Matplotlib also has some basic functionality for:
* loading images
* interpolating pixels (think smoothing)
* visualizing images in different colormaps

More advanced image manipulation typically relies
on additional advanced packages we have not discussed
yet, so for now we'll just look at how to load
and show an image.

We can load an image with the `imread(filename)` function,
where `filename` is a string containing the name of the file.
This is the first (and only) function we'll look at that is
not in `matplotlib.pyplot`.  This function is instead
in the `image` module, so we must first import `matplotlib.image`.  Since this is also longer than we
wish to type, we import it as `mpimg`.

Once the image is loaded, we can call `plt.imshow` with
the loaded image as an argument to visualize it.  We
can optionally pass in values for the keyword arguments
`cmap` or `interpolation` to change how the image is displayed.

#### Example

In [None]:
import matplotlib.image as mpimg
img = mpimg.imread('../media/dog.jpg')
plt.imshow(img)
plt.show()

Note that above we used what's known as a *path* in place of the filename.  We saw these previously when reading in files.  This one is slightly different in that the folder `media` is not in our current folder, but one level above.  We indicate one level above by adding `../` to the start, followed by the folder name it's in, followed by the filename itself.
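
The `cmap` and `interpolation` keyword arguments mentioned above can be tried without an image file.  This sketch assumes a tiny made-up grid of numbers standing in for a grayscale image, passed to `plt.imshow` as a list of lists; `'gray'` and `'nearest'` are just two of the available choices.

```python
import matplotlib.pyplot as plt

# A made-up 4x4 grayscale "image": each number is one pixel's brightness
grid = [[0, 1, 2, 3],
        [1, 2, 3, 4],
        [2, 3, 4, 5],
        [3, 4, 5, 6]]

# cmap picks the colors used for the values; interpolation='nearest'
# draws each pixel as a sharp square instead of smoothing between pixels
image = plt.imshow(grid, cmap='gray', interpolation='nearest')
plt.title('Synthetic image with a gray colormap')
plt.show()
```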

### Multiple Plots in One Figure

Each of the previous visualizations has been a single plot
(with potentially a couple of lines, but all on a single "axes"
in matplotlib terminology).  Often, we want multiple subplots
in a single figure (so they are still grouped, but not plotted
on the same $x$-$y$ axis).  To do this, we first create a "figure"
by calling `plt.figure()`.  We've been working with a "figure" all along,
just not one we explicitly created.

Optionally, we could have passed a specific figure number as an argument
to `plt.figure()`, which can be useful if you are switching between figures
you wish to plot on.  When given a figure number, if the figure already exists
it will be made active, otherwise a new figure will be created.
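
A short sketch of switching between numbered figures, using made-up data.  Calling `plt.figure(1)` a second time returns the existing figure rather than creating a new one.

```python
import matplotlib.pyplot as plt

fig_one = plt.figure(1)        # create figure 1 and make it active
plt.plot([1, 2, 3])

fig_two = plt.figure(2)        # create figure 2 and make it active
plt.plot([3, 2, 1])

reactivated = plt.figure(1)    # figure 1 already exists, so it becomes active again
plt.title('Drawn on figure 1')

is_same_figure = reactivated is fig_one
print(is_same_figure)
plt.show()
```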

Once we have a figure, we need to add "subplots" (aka the individual
$x$-$y$ axes we want to plot on).  We specify the subplot we want to make active
by calling 
```
plt.subplot(nrows, ncols, index)
```
where:
* `nrows` is the number of rows of subplots we wish to have
* `ncols` is the number of columns of subplots we wish to have
* `index` is the index of the subplot we want to make active.  `index` starts
  at 1 in the upper left and increases going left-to-right and top-to-bottom (like
  reading a book).

Once a given subplot is active, we simply call the desired plotting function
to create the visualization.  The plotting function can be the same
or different in each subplot.

**Note: often the axis labels end up overlapping.  This can almost always be fixed
by calling `plt.tight_layout()` at the end.**

#### Example

Consider the population data for Detroit and Chicago.  Instead of plotting
both of them on the same axes, we could choose to plot them on separate axes in
the same figure.

In [None]:
# Create the figure
plt.figure()

# Plot Detroit Data
plt.subplot(2,1,1) # 2 rows, 1 column, 1st axis
plt.xlabel('Census Year')
plt.ylabel('Population')
plt.title("Detroit Population Over Time")
plt.plot(detroit_years, detroit_populations, 'co--', label='Detroit')

# Plot Chicago Data
plt.subplot(2,1,2) # 2 rows, 1 column, 2nd axis
plt.xlabel('Census Year')
plt.ylabel('Population')
plt.title("Chicago Population Over Time")
plt.plot(chicago_years, chicago_populations, 'mo--', label='Chicago')

plt.tight_layout() # Important for spacing
plt.show()
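
To make the `index` ordering concrete, the sketch below fills a 2-by-2 grid and titles each subplot with its index.  The plotted lines are made up and only serve to give each subplot some content.

```python
import matplotlib.pyplot as plt

fig = plt.figure()

# Indices 1..4 run left-to-right, then top-to-bottom
for index in range(1, 5):
    plt.subplot(2, 2, index)          # 2 rows, 2 columns, activate subplot `index`
    plt.plot([0, 1], [0, index])      # made-up line so each subplot has content
    plt.title('index = ' + str(index))

plt.tight_layout()                    # keep titles and tick labels from overlapping
titles = [ax.get_title() for ax in fig.axes]
print(titles)
plt.show()
```

Subplots 1 and 2 appear in the top row and subplots 3 and 4 in the bottom row.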