# Working with multivariate data
So far, most of our work has been with single-variable data, which can easily be plotted with a line chart or bar chart.  Here we'll look at some ways to manipulate and plot "multivariate" data, i.e., data sets which have multiple independent variables.


## N-dimensional arrays
As you may remember, NumPy was designed with multi-dimensional data in mind.  That's why the core NumPy type is called `ndarray` --- it's an N-dimensional array!

We can create a 2-dimensional array manually just by using a list of lists:

In [None]:
import numpy as np

a = np.array([[10,  9, 10, 10, 12, 11,  9, 15, 13],
              [ 2,  0,  4,  6, 12, 11, 14, 27, 24],
              [ 5,  6, 10,  8, 14, 14, 18,  9, 19],
              [23, 16, 11, 14,  7,  1,  2,  0, 25]])

print(type(a))

What do you get as the result of the following operations?  What are they doing?
* `a.shape` (note that there are no parentheses)
* `a.T` (no parentheses here either)
* `len(a)` (what about `len(a.T)`?)
* `np.sum(a)`
* `np.sum(a, axis=0)` (what about `axis=1`?)
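
If you want to check your reasoning on a smaller case first, here is a minimal sketch using a made-up 2x3 array. The name `small` and its values are illustrative assumptions, not part of the exercise data. It runs the same operations so you can see which axis each one acts on.

```python
import numpy as np

# Hypothetical 2x3 array, used only for illustration
small = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(small.shape)            # (rows, columns)
print(small.T.shape)          # the transpose swaps the two axes
print(len(small))             # length of the first axis only
print(np.sum(small))          # sum of every element
print(np.sum(small, axis=0))  # collapses the rows: one sum per column
print(np.sum(small, axis=1))  # collapses the columns: one sum per row
```

Try predicting each printed value before running the cell, then do the same for `a`.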

*Challenge*: 
* Create a 3-dimensional `ndarray`.
* Which of the above operations work on your array?  Do they do anything differently?

In [None]:
# Your code here...

Just as with a 1D NumPy array, we can slice a 2D array to get individual rows, columns, or chunks of the data.  Experiment with the following slices, and make some notes about how slicing works in multiple dimensions.

* `a[0]`
* `a[0, 3]`
* `a[3, 0]`
* `a[:, 2]`
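
As a reference while you experiment, the sketch below applies the same kinds of slices to a made-up array whose values encode their own position. The name `grid` and its contents are assumptions chosen for illustration: the value at row `i`, column `j` is `10*i + j`, which makes it easy to see what each slice picked out.

```python
import numpy as np

# Hypothetical 3x4 array: the value at [i, j] is 10*i + j
grid = np.array([[ 0,  1,  2,  3],
                 [10, 11, 12, 13],
                 [20, 21, 22, 23]])

row = grid[1]            # one index selects a whole row
item = grid[1, 2]        # two indices select one element (row 1, column 2)
col = grid[:, 2]         # ':' keeps every row, so this is column 2
block = grid[0:2, 1:3]   # ranges on both axes give a 2x2 sub-array

print(row, item, col, block, sep="\n")
```

The first index always refers to the row and the second to the column, which is why `a[0, 3]` and `a[3, 0]` are different elements.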

*Challenge*:
* Create an array `b` which is a copy of `a` but with all values > 10 replaced with 10.
* Do the above with two lines of code or fewer.

In [None]:
# Your code here...

## Multidimensional plots

Since paper (and computer screens) are 2-dimensional, we have to get clever if we want to show more than one variable as a function of another.  There are lots of ways to do this, as discussed in class.  We'll do a couple of examples here, and encourage you to check out the [Matplotlib examples gallery](https://matplotlib.org/stable/gallery/index.html) for inspiration as to what is possible.


We can create multiple charts on one figure using `subplot()`:

In [None]:
import matplotlib.pyplot as plt

N = 4
for i in range(N):
    plt.subplot(N, 1, i+1)
    plt.plot(a[i])
plt.show()


* Experiment with the parameters to `subplot()`.  What do they do?
* How does your ability to draw comparisons change as you change the size and shape of the plots?
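
One way to explore the effect of layout on comparisons is `plt.subplots()`, which creates the whole grid of axes at once and can share an axis between them. The sketch below is only an illustration under assumed data: the array `data` and the figure size are made up, and this is an alternative to the `subplot()` loop above rather than a replacement for it.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 3 series of 5 points each
data = np.array([[ 1,  3,  2,  5,  4],
                 [10, 30, 20, 50, 40],
                 [ 2,  2,  3,  3,  4]])

# One row of three plots, sharing the y-axis so heights are comparable
fig, axes = plt.subplots(1, 3, figsize=(9, 3), sharey=True)
for ax, series in zip(axes, data):
    ax.plot(series)
plt.show()
```

Try switching to `sharey=False` or to a 3x1 layout and notice how your impression of the relative sizes of the series changes.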

*Challenge*:  Try drawing multiple bar charts instead of line charts.

## Showing images

If we have two-dimensional data, we can display it as an image (and if we have higher-dimensional data, we can always take 2D slices of it) using `imshow()`.

In [None]:
plt.imshow(a)
plt.show()

The colors are pretty, but what is this showing?  Take a minute to figure out how this plot is generated and what it's telling you.


The matplotlib documentation has a [whole page about colormaps](https://matplotlib.org/stable/tutorials/colors/colormaps.html).  There's also an excellent discussion of the development of the [default colormap, called "viridis"](http://bids.github.io/colormap/).
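
To see what a colormap does, it can help to draw a tiny array with a named colormap and a colorbar. The following is a minimal sketch; the array `values` is made up for illustration, and `"gray"` is just one of the colormap names listed on the page linked above.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical 2x3 array, used only for illustration
values = np.array([[0, 1, 2],
                   [3, 4, 5]])

fig, ax = plt.subplots()
im = ax.imshow(values, cmap="gray")  # override the default colormap
fig.colorbar(im, ax=ax)              # legend mapping colors back to values
plt.show()
```

Swapping `"gray"` for another colormap name changes only the colors, not the underlying data.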

And wait, [what's this](https://www.youtube.com/watch?v=QlwatKpla8s&t=48)?
