# Data Visualization with PyLab

Pylab is a module for graphical data visualization: it turns data into graphs. It is part of the matplotlib library and is helpful for data analysis of all kinds. At the end of this lesson, after you learn how to use the basic commands, we will use it to visualize orders of growth within our code.

In [None]:
import pylab as plt

This is useful anytime you need to visualize data.

Let's use pylab to graph and visualize the computational complexity of algorithms. In the following code, we create 5 lists. `mySamples` will serve as the points for the x-axis. `myLinear`, `myQuadratic`, `myCubic` and `myExponential` will serve as the y-values.

In [None]:
mySamples = []
myLinear = []
myQuadratic = []
myCubic = []
myExponential = []

Let's fill up the lists so we can make our graphs.

In [None]:
for i in range(0, 30):
    mySamples.append(i)
    myLinear.append(i)
    myQuadratic.append(i**2)
    myCubic.append(i**3)
    myExponential.append(1.5**i) #2**i grows so fast it is too hard to visualize and compare to these other growth types
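
The same five lists can also be built without an explicit loop. The following is a minimal sketch using list comprehensions; it is meant to produce the same values as the loop above, with the same base of 1.5 for the exponential list.

```python
# Build the sample points and the growth values with list comprehensions.
mySamples = list(range(0, 30))
myLinear = [i for i in mySamples]
myQuadratic = [i**2 for i in mySamples]
myCubic = [i**3 for i in mySamples]
myExponential = [1.5**i for i in mySamples]  # same base of 1.5 as in the loop

# Every list has one value per sample point.
print(len(mySamples), myQuadratic[3], myCubic[3])  # 30 9 27
```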

## Create a Plot Graph

#### `plt.plot(x, y)`

The simplest way to plot data on a visual graph is with `plt.plot(x,y)`. This method takes two lists of equal length and creates a data visualization.

In [None]:
#run this cell
plt.plot(mySamples, myLinear)

If you make multiple `plt.plot(x, y)` calls, they will all be drawn in a single window.

In [None]:
# plot out the remaining orders of growth: myCubic, myQuadratic, myExponential:



Click the three dots below to see the answer.

In [None]:
plt.plot(mySamples, myLinear)
plt.plot(mySamples, myQuadratic)
plt.plot(mySamples, myCubic)
plt.plot(mySamples, myExponential)

It's not really useful to compare exponential growth with the other types of growth on a single graph. Exponential growth moves too fast, and the other growth types can't be seen clearly. Let's put each line on its own graph.

## Create a Figure

#### `plt.figure('figure name')`

You can create multiple graphs to display data with the `.figure()` method shown above. Once you create a new figure (aka plot graph, aka window) you can label it, label the axes, switch out data being graphed, change line colors and more.

In [None]:
# Run This Cell:

plt.figure('Linear')
plt.plot(mySamples, myLinear)
#plt.figure('Quadratic')            # uncomment the method and run again.
plt.plot(mySamples, myQuadratic)

This puts the `plt.plot(x, y)` calls immediately under `.figure()` into one window, and assigns that window a string name. In this example case, the name is `Linear`. You can reopen, and read or edit `Linear` later.

This allows you to output to multiple windows at the same time. If this is not used, all plots will be shown in the same window.

In [None]:
plt.plot(mySamples, myCubic)
plt.plot(mySamples, myExponential)
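
To check how named figures behave, here is a small sketch. It assumes nothing beyond `plt.figure()` and `plt.plot()`; the short data lists are made up for illustration. Calling `plt.figure()` with a name that already exists selects that figure again instead of creating a new one.

```python
import pylab as plt

# A new name creates a figure; plot calls that follow are drawn in it.
first = plt.figure('Linear')
plt.plot([0, 1, 2], [0, 1, 2])

# A different name creates a second, separate figure.
second = plt.figure('Quadratic')
plt.plot([0, 1, 2], [0, 1, 4])

# Using the first name again switches back to the existing figure.
again = plt.figure('Linear')

print(again is first)    # True
print(second is first)   # False
```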

## Label Axes

#### `plt.xlabel('x-axis label')`

#### `plt.ylabel('y-axis label')`

These two functions label the axes. The ordering of the functions below is important: call `plt.figure()` first, so that the labels and the plot are applied to that figure.

In [None]:
plt.figure('Linear Growth')
plt.xlabel('sample points')
plt.ylabel('linear function')
plt.plot(mySamples, myLinear)

In [None]:
# Relabel the graph and axes for the other 3 orders of growth.
plt.figure('Quadratic Growth')
plt.xlabel('sample points')
plt.ylabel('quadratic function')
plt.plot(mySamples, myQuadratic)

plt.figure('Cubic Growth')
plt.xlabel('sample points')
plt.ylabel('cubic function')
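plt.plot(mySamples, myCubic)

plt.figure('Exponential Growth')
plt.xlabel('sample points')
plt.ylabel('exponential function')
plt.plot(mySamples, myExponential)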

## Add a Title

We can label the graph itself with the `.title('Graph Title')` method. Let's do that:

In [None]:
# Run this code. Then comment out the second line and run it again to compare:
plt.figure('Linear')
plt.title('Linear Growth')
plt.plot(mySamples, myLinear)

In [None]:
# Add titles for the other 3 growth charts and print the graphs out:
# Click the 3 dots below to see the answer

In [None]:
plt.figure('Quadratic')
plt.title('Quadratic Growth')
plt.plot(mySamples, myQuadratic)

plt.figure('Cubic')
plt.title('Cubic Growth')
plt.plot(mySamples, myCubic)

plt.figure('Exponential')
plt.title('Exponential Growth')
plt.plot(mySamples, myExponential)


## Clear the Windows

#### `plt.clf()`

If you look at the plots above, the same colors are used again and again: blue first, then orange, then green, then red. Each new line drawn in a figure takes the next color in this cycle. We have also been reusing the same named pylab figures created with `plt.figure()`, so plot values, axis labels, and titles are getting redefined in the same windows.

Since figures are getting reused, things can get messy. For example, you may reuse `plt.figure('Linear Growth')` multiple times and you may need to be sure you clear out data it is holding from a previous call.

In [None]:
# Follow the instructions below.

plt.figure('Linear')

# Uncomment the line below and run the cell. Not surprisingly, it adds a title to the graph.
#plt.title('Linear Growth')

# Uncomment the line below and run the cell. A new orange line will be added, directly on top of the blue line.
#plt.plot(mySamples, myLinear)

# Uncomment the line below and run the cell. This clears out all previous modifications to the preceding figure.
#plt.clf()

plt.plot(mySamples, myLinear)
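
The effect of `plt.clf()` can also be checked without looking at the picture. The sketch below uses a hypothetical figure name, `'clf demo'`, and made-up data; it counts the axes held by the figure before and after clearing.

```python
import pylab as plt

fig = plt.figure('clf demo')      # hypothetical figure name for this sketch
plt.title('Old title')
plt.plot([0, 1, 2], [0, 1, 2])
print(len(fig.axes))              # 1: the figure holds one set of axes

plt.clf()                         # remove everything drawn in the current figure
print(len(fig.axes))              # 0: the title, the line and the axes are gone

plt.plot([0, 1, 2], [0, 1, 2])    # drawing again starts from a clean figure
print(len(plt.gca().lines))       # 1
```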

## Comparing Plots

When the scales of plots are extremely different, it is really hard to compare them. There are two ways to deal with this:
- set the axis limits explicitly (in the examples above, pylab chose them for us automatically)
- plot multiple functions on the same graph (which we also tried, and found it hard to compare data, because exponential growth is so fast)

When we compared the graphs by putting them in the same window, exponential growth increased so quickly that it was not possible to see the changes in linear and quadratic growth. When we put the graphs in separate windows, the comparison was also misleading because the y-axis ranges are so different (the linear y-axis went to about 30 and the exponential y-axis went somewhere over 120,000).
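
To see how far apart the scales are, the following sketch computes the largest value each growth function reaches over the same 30 sample points used in this lesson. The dictionary name `largest` is only an illustration.

```python
# Largest value each growth function reaches for i = 0..29.
samples = range(0, 30)
largest = {
    'linear': max(samples),
    'quadratic': max(i**2 for i in samples),
    'cubic': max(i**3 for i in samples),
    'exponential': max(1.5**i for i in samples),
}

# Print the values in aligned columns so the gap in scale is easy to see.
for name, value in largest.items():
    print(f"{name:<12}{value:>12,.0f}")
```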

Instead of using defaults, let's manually adjust these two parameters to make the data more easily comparable.

#### Change Limits on the Axes

Changing the limits is one way to make dramatically different data more comparable. In the cell below, we use `plt.ylim(bottom, top)` to set the range of the y-axis.

**Note:** you could also set the range on the x-axis with `plt.xlim(start_int, end_int)`
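
As a rough sketch of both calls together, the example below restricts the x-axis and the y-axis of a quadratic plot and then reads the limits back. The figure name `'limits demo'` is hypothetical. Calling `plt.xlim()` or `plt.ylim()` with no arguments returns the current limits.

```python
import pylab as plt

plt.figure('limits demo')     # hypothetical figure name for this sketch
plt.clf()
plt.plot(range(30), [i**2 for i in range(30)])

plt.xlim(0, 10)               # show only the first ten sample points
plt.ylim(0, 100)              # 10**2 is 100, so the visible curve still fits

# With no arguments, the same functions report the limits in use.
x_limits = plt.xlim()
y_limits = plt.ylim()
print(x_limits, y_limits)
```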

In [None]:
plt.figure('Linear')
plt.clf()
plt.title('Linear Growth')
plt.ylim(0, 1000)                # In case it wasn't obvious, this sets the y-axis from 0 to 1000.
plt.plot(mySamples, myLinear)

# Remove the triple quotes from around the code below and run the cell again
'''
plt.figure('Quadratic Growth')
plt.clf()
plt.ylim(0,1000)
plt.title('Quadratic Growth')
plt.plot(mySamples, myQuadratic)
'''

In the example above we put the plots into separate figures but specified the same y-limit for each. We could also put them in the same graph/window/figure.

In [None]:
plt.figure('lin quad')
plt.clf()
plt.title('Linear vs Quadratic Growth')
plt.ylim(0,1000)
plt.plot(mySamples, myLinear)
plt.plot(mySamples, myQuadratic)

Since these two orders of growth are similar enough, we didn't actually have to adjust the y-limit. This leads us to another strategy for comparing data: overlaying plots with similar growth.

#### Overlaying Plots with Similar Growth

It can be useful to put plots with similar orders of growth into their own graphs/figures/windows. Doing this, we can compare:

- linear to quadratic
- quadratic to cubic
- cubic to exponential

In [None]:
plt.figure('lin v quad')
plt.clf()
plt.title('Linear vs. Quadratic')
plt.plot(mySamples, myLinear)
plt.plot(mySamples, myQuadratic)


plt.figure('quad cub')
plt.clf()
plt.title('Quadratic vs Cubic')
plt.plot(mySamples, myQuadratic)
plt.plot(mySamples, myCubic)


plt.figure('cub expo')
plt.clf()
plt.title('Cubic vs Exponential')
plt.plot(mySamples, myCubic)
plt.plot(mySamples, myExponential)

***Now you can overlay plots in the same window, put them in separate windows, or adjust y-limits to make sense of the data as you see fit for any given occasion.***

### Adding Legends

To assign a legend to a graph, you must define a label in each `plt.plot(x_data, y_data, label='name')` call and then call the `.legend()` method. Read the code cell below closely:

In [None]:
plt.plot(mySamples, myLinear, label = 'linear')
plt.plot(mySamples, myQuadratic, label = 'quadratic')
plt.legend()

#### Specify Legend Location

You can specify the region in the graph that you would like the legend to appear with the `loc` parameter.

```python
plt.legend(loc = 'upper left')
```
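
The following sketch puts the two steps together and reads the legend entries back, which is one way to confirm that each `label` ended up in the legend. The figure name `'legend demo'` and the short data lists are assumptions made for this example.

```python
import pylab as plt

plt.figure('legend demo')     # hypothetical figure name for this sketch
plt.clf()
plt.plot([0, 1, 2, 3], [0, 1, 2, 3], label='linear')
plt.plot([0, 1, 2, 3], [0, 1, 4, 9], label='quadratic')

# legend() returns the legend object, so its entries can be inspected.
legend = plt.legend(loc='upper left')
entries = [text.get_text() for text in legend.get_texts()]
print(entries)                # ['linear', 'quadratic']
```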

In [None]:
# Add a legend for a graph with the cubic function 


In [None]:
plt.plot(mySamples, myCubic, label = 'cubic')
plt.legend(loc = 'upper left')

In [None]:
# Graph exponential growth and create a legend:


## Control Display Parameters

Let's look at how to change line color and line width, turn lines into dots or dashes, create subplots in a plot, and so on.

#### Change Color and Shape

In the following code, we change the appearance of a line with `'ro'`:

```python
plt.plot(mySamples, myLinear, 'ro', label='linear')
```
Notice `'ro'` in this line of code. `r` makes the plot red, and `o` draws each data point as a circle instead of connecting the points with a line.


Read more about all the line customization options [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html). At the end of the page, there are tables of all the values you could use in this parameter.

In [None]:
#Run Cell. Then add an 'o' to the end of string 'r'.

plt.plot(mySamples, myLinear, 'r', label='linear')
plt.legend()

**Let's see a few more examples:**


```python
plt.plot(mySamples, myLinear, 'b^', label='linear')
```
`b` makes the plot blue, and `^` draws each data point as a triangle.

```python
plt.plot(mySamples, myLinear, 'y--', label='linear')
```
`y` makes the line yellow, `--` makes it into dashes.
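
One way to see what a format string does is to ask the resulting line object for its properties. The sketch below assumes a hypothetical figure name, `'format demo'`, and made-up data. `plt.plot()` returns a list of line objects, one per plotted data set.

```python
import pylab as plt

plt.figure('format demo')     # hypothetical figure name for this sketch
plt.clf()
x = list(range(10))

# Each plot call returns a list with one line object; unpack it directly.
(circles,) = plt.plot(x, x, 'ro')
(triangles,) = plt.plot(x, [2 * i for i in x], 'b^')
(dashes,) = plt.plot(x, [3 * i for i in x], 'y--')

# The format string is split into a color, a marker and a line style.
for line in (circles, triangles, dashes):
    print(line.get_color(), line.get_marker(), line.get_linestyle())
```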


In [None]:
plt.plot(mySamples, myLinear, 'b--', label='linear')
#plt.plot(mySamples, myQuadratic, 'y-', label='Quadratic')
plt.legend()

#### Line Width

You can change line width of a plot like so:

In [None]:
plt.plot(mySamples, myLinear, 'b--', label='linear', linewidth = 2)
plt.plot(mySamples, myQuadratic, 'y--', label='quadratic', linewidth = 6)

#### Subplots

We can put multiple graphs into columns or rows in a window.

In [None]:
plt.figure('lin quad')
plt.clf()
plt.subplot(211)
plt.title('Linear vs. Quadratic')
plt.plot(mySamples, myLinear, 'b--', label='linear', linewidth = 2)
plt.legend()
plt.subplot(212)
plt.plot(mySamples, myQuadratic, 'y--', label='quadratic', linewidth = 6)
plt.legend()

Each digit in `subplot(212)` controls the layout:

- num rows
- num columns
- location to use

So `subplot(212)` displays a window with 2 rows of graphs and 1 column, and fills the second position (the bottom row) with a plot (the plot given beneath it).
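
The three digits can also be passed as separate arguments, which is easier to read. The sketch below uses a hypothetical figure name, `'subplot demo'`, and made-up data; it builds a layout of 2 rows and 1 column and checks that the first position sits above the second.

```python
import pylab as plt

fig = plt.figure('subplot demo')   # hypothetical figure name for this sketch
plt.clf()

top = plt.subplot(2, 1, 1)         # 2 rows, 1 column, position 1 (like 211)
plt.plot([0, 1, 2], [0, 1, 2])

bottom = plt.subplot(2, 1, 2)      # 2 rows, 1 column, position 2 (like 212)
plt.plot([0, 1, 2], [0, 1, 4])

print(len(fig.axes))               # 2
# Positions are counted from the top, so position 1 is drawn higher up.
print(top.get_position().y0 > bottom.get_position().y0)   # True
```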

In [None]:
# Plot Cubic and Exponential growth as 2 columns and one row.



In [None]:
plt.figure('cub exp')
plt.clf()
plt.subplot(121)
plt.title('Cubic vs. Expo')
plt.plot(mySamples, myCubic, 'b--', label='cubic', linewidth = 2)
plt.legend()
plt.subplot(122)
plt.plot(mySamples, myExponential, 'y--', label='exponential', linewidth = 6)
plt.legend()

## Changing Scales

One last way to handle comparisons between plots with different orders of growth is to change scales on the graph.

In [None]:
#Run this cell. Then uncomment plt.yscale() down below and rerun.

plt.figure('cub expo')
plt.clf()
plt.title('Cubic vs. Exponential')
plt.plot(mySamples, myCubic, 'ko', label='cubic', linewidth = 1)
plt.plot(mySamples, myExponential, 'g^', label='exponential', linewidth = 1)
#plt.yscale('log')
plt.legend()

Take note that the tick labels on the y-axis change when `plt.yscale('log')` is applied: the ticks are now spaced by powers of 10.
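
A short sketch of the scale change follows. The figure name `'scale demo'` is hypothetical, and the sample points start at 1 here because 0 has no logarithm. Reading the scale back from the current axes shows the switch from linear to logarithmic.

```python
import pylab as plt

plt.figure('scale demo')           # hypothetical figure name for this sketch
plt.clf()
x = list(range(1, 30))             # start at 1: log(0) is undefined
plt.plot(x, [i**3 for i in x], 'ko', label='cubic')
plt.plot(x, [1.5**i for i in x], 'g^', label='exponential')

before = plt.gca().get_yscale()
plt.yscale('log')                  # switch the y-axis to a logarithmic scale
after = plt.gca().get_yscale()
plt.legend()

print(before, after)               # linear log
```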

In Lesson 4-2, we will use pylab to graph the computational complexity of several coding problems.