## Data and Plotting (featuring NumPy and Matplotlib)



## Objectives



-   Use `matplotlib` to generate plots
-   Generate *arrays* of numerical values
-   Save and load data in CSV format



## Your first plot



Let's start with the import statements:



In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

`matplotlib` is a library with a great deal of plotting functionality, probably more than you will ever need. `NumPy` (Numerical Python) is similarly feature rich, with far too much to cover here. We will start with minimal examples and encourage you to look up additional functions as you find a need for them.

Two comments on the cell above. First, the `%` marks a special command for IPython, called a magic function (the IPython documentation has a more complete list). Second, we introduced the `as` keyword, which lets us refer to a library or module by a nickname: after `import numpy as np`, we can write `np` instead of `numpy`.
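To make the `as` keyword concrete, here is a small sketch showing that the nickname and the full module name refer to the same thing. Importing `numpy` a second time under its full name is done only for illustration; in practice you would pick one.

```python
import numpy            # full name
import numpy as np      # same module, shorter nickname

# Both names point at the same module object
print(np is numpy)

# so these two expressions give the same value
print(numpy.pi)
print(np.pi)
```

The nickname `np` is only a convention, but it is the one used in nearly all NumPy examples, so sticking with it makes other people's code easier to read.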



In [1]:
x_array = np.linspace(0, 2 * np.pi, 101) # create an array of 101 numbers equally spaced between 0 and 2pi
y_array = np.sin(x_array) # calculate the sine of each of the points
plt.plot(x_array, y_array)

Now that we've got something working, let's dig a little deeper.

By the way, you are highly encouraged to copy cells or insert new ones to play around with the variables. One helpful tool is the `?` feature. Try running the following cell:



In [1]:
np.sin?

This should bring up *documentation* describing how the function works, which arguments it expects (usually including optional parameters), and what it returns.
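The `?` syntax only works inside IPython and Jupyter. As a hedged alternative, the sketch below uses Python's built-in `help` function and the `__doc__` attribute, which show the same text and also work in a plain Python script. The variable name `doc` is just an illustration.

```python
import numpy as np

# help() is built into Python and prints the same documentation that `?` displays
help(np.sin)

# The documentation text is also available as an ordinary string
doc = np.sin.__doc__
print(doc[:200])  # show only the first 200 characters
```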



## NumPy arrays



In the previous notebook we talked about lists as a way to store multiple objects. NumPy arrays are a closely related datatype that makes working with numbers much easier. The simplest explanation is that an array is like a vector. One distinction from a list is that all the elements of an array have the same type.

For example, multiplying a vector by 2 multiplies each of its elements by 2, and arrays behave the same way.



In [1]:
my_array = np.array([1, 2, 3, 4, 5])
print(2 * my_array)

Not all things in Python are this intuitive. While generally it doesn't hurt to experiment and there are plenty of nice surprises, occasionally you will find some seemingly odd behavior.

For example, what happens if you multiply a list of numbers by 2?



In [1]:
my_list = [1, 2, 3, 4, 5]
print(2 * my_list)

As you can see, it might not be what we expected, but it can certainly be helpful in other contexts.
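For a side-by-side comparison, the sketch below shows the list behavior next to two ways of actually doubling each element. The list comprehension may be new syntax if it has not come up yet; it is included here only as a point of comparison with the array version.

```python
import numpy as np

my_list = [1, 2, 3, 4, 5]

# Multiplying a list by 2 repeats it
repeated = 2 * my_list
print(repeated)          # [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

# Doubling each element of a list takes a loop or a list comprehension
doubled_list = [2 * value for value in my_list]
print(doubled_list)      # [2, 4, 6, 8, 10]

# Converting the list to an array gives element-wise arithmetic
doubled_array = 2 * np.array(my_list)
print(doubled_array)     # [ 2  4  6  8 10]
```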


A very nice feature of arrays is that NumPy functions such as `np.sin` are applied to each element automatically, so you do not need to write a loop. A related feature is *broadcasting*: when an array is combined with a single number, as in `2 * my_array`, the number is applied to every element. We made use of element-wise functions in our first plot.



In [1]:
# Calculate sine of some common angles (in radians)
print(np.sin(0))
print(np.sin(np.pi/2))

# Applying sine to every element of an array at once
print(x_array)
print(np.sin(x_array))

Note how passing an array to `np.sin` returns an array whose entries are the sines of the entries of `x_array`.
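The following sketch separates the two ideas at play here: applying a function to every element, and combining an array with a single number. The array `angles` and the other variable names are made up for this example.

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])

# A single number is combined with every element: each angle is doubled, then shifted by 1
scaled = 2 * angles + 1

# Two arrays of the same length combine element by element
total = angles + scaled

# A function like np.cos is applied to each element
cosines = np.cos(angles)
print(np.round(cosines, 3))
```

Because of floating point rounding, `np.cos(np.pi / 2)` is a tiny number rather than exactly zero, which is why the result is rounded before printing.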



### Indexing arrays



Getting elements of an array works very similarly to the case of lists. To access specific elements, we can use the indexing techniques we learned earlier.

Using the `my_array` variable from earlier:



In [1]:
print(my_array[0]) # remember that the first element is at index 0!
print(my_array[-1]) # Should print 5
print(my_array[:2]) # When leaving off the start index in a slice, it starts from the beginning

### Multi dimensional arrays



Note: numpy also has rich functionality built in for higher dimensional arrays. Most commonly, you might encounter 2d arrays since these can represent matrices and tables. Their use is just beyond the scope of this introductory material but there are plenty of resources online.
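As a brief taste only, here is a minimal sketch of a 2d array built from a list of rows. The values and the name `table` are placeholders chosen for illustration.

```python
import numpy as np

# A 2d array built from a list of rows (2 rows, 3 columns)
table = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(table.shape)   # (2, 3)
print(table[0, 1])   # row 0, column 1 -> 2
print(table[:, 0])   # first column -> [1 4]
```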



## Generating data



In Python, there are many useful ways to generate data, a few of which we've already seen.

If you want to generate an array of $n$ points between $a$ and $b$, the function `np.linspace(a, b, n)` will do that for you. This is what we did to create the $x$ points for our sine plot.

Similarly, if you want to generate an array of points separated by a fixed interval, you can call the function `np.arange(start, stop, interval)`. Note that the array ends before reaching `stop`, so `stop` itself is not included.
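The difference between the two functions is easiest to see on a small example. The sketch below uses the interval from 0 to 1, chosen arbitrarily; note how `np.linspace` includes the endpoint while `np.arange` stops short of it.

```python
import numpy as np

# 5 equally spaced points from 0 to 1, endpoint included: 0, 0.25, 0.5, 0.75, 1
points = np.linspace(0, 1, 5)
print(points)

# Steps of 0.25 starting at 0 and stopping before 1: 0, 0.25, 0.5, 0.75
steps = np.arange(0, 1, 0.25)
print(steps)

print(len(points), len(steps))  # 5 4
```

A rule of thumb: reach for `np.linspace` when you know how many points you want, and `np.arange` when you know the spacing.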



### Exercise



Use `np.linspace`, and then separately `np.arange`, to generate an array with the following values: `[1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5]`



In [1]:
# Insert code here

## Saving Data



Saving data is critical for science. Whether the data is freshly generated or the result of some computation, saving it is an important step.

Different formats serve different purposes, so how you save depends on the type of data. Plain text and comma-separated values (CSV) files are probably the most common. The data we use most often in physics is numerical, so we will use the CSV format, which can also be viewed easily in other programs. Luckily, numpy provides a function for that.



In [1]:
my_integers = np.array([1, 2, 3, 4, 5])
my_squares = my_integers**2   # remember the funny syntax for exponents
filename = "saved_squares.csv"
np.savetxt(filename, my_squares, delimiter=',')

### Where does it get saved?



Much like your own computer, the computer this notebook is running on has a *file system*. This is a fancy way of referring to the folders and files, such as `Documents` or `Downloads`. When you save data in this way, it gets put in the same folder that the notebook lives in. You can save it in a different folder by specifying the *path* to that folder.
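As a minimal sketch of saving to a different folder, the example below creates a subfolder and writes the file there. The folder name `data` is a made-up example, and `pathlib` from the Python standard library is one of several ways to build paths.

```python
from pathlib import Path
import numpy as np

# "data" is a made-up folder name used only for this example
folder = Path("data")
folder.mkdir(exist_ok=True)      # create the folder if it does not exist yet

# Build the path data/saved_squares.csv and save an array there
filepath = folder / "saved_squares.csv"
np.savetxt(filepath, np.array([1, 4, 9, 16, 25]), delimiter=',')

print(filepath.resolve())        # full path to the saved file
```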



## Loading Data



Saving data is pretty useless if you don't have a way to access it later. Luckily, numpy provides a corresponding function for loading.



In [1]:
loaded_data = np.loadtxt(filename, delimiter=',')  # use the same delimiter as when saving
print(loaded_data)

It's really that simple!



Note: there are many utilities for loading data from files, depending on the application. Because we are primarily dealing with numerical data, the `savetxt` and `loadtxt` routines will probably be most helpful. For more general purposes, consider looking at the Python documentation on reading and writing files.
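Files often hold more than one column. The sketch below, with made-up values and a hypothetical filename `two_columns.csv`, saves two arrays side by side and loads them back. Here the `delimiter` argument matters when loading, since each line contains a comma.

```python
import numpy as np

times = np.array([0.0, 1.0, 2.0])
readings = np.array([0.0, 0.5, 2.0])   # made-up measurements

# Stack the two 1d arrays as columns of a 2d table and save it with a header line
table = np.column_stack((times, readings))
np.savetxt("two_columns.csv", table, delimiter=',', header="time,reading")

# The header is written as a comment line starting with '#', which loadtxt skips;
# unpack=True returns one array per column
loaded_times, loaded_readings = np.loadtxt("two_columns.csv", delimiter=',', unpack=True)
print(loaded_times)
print(loaded_readings)
```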



### Exercise: Plotting motion



Using the position function written in notebook 1, plot the motion of a rabbit that starts at 0.1 m/s and accelerates at a rate of 0.05 m/s<sup>2</sup> over the course of 10 seconds.



In [1]:
# generate an array of times using linspace or arange
# define the position function
# plt.plot(t_array, pos_array)

### Exercise: Plotting Random Numbers



Use the function `np.random.uniform` to generate 100 (x, y) coordinate pairs. Then use `plt.scatter` to plot them.



### Exercise: Documentation



Reading documentation can be daunting at first, but it is extremely helpful for unlocking new functions to add to your toolbelt. The format isn't the most beginner friendly, but it is predictable enough that each new page takes less time to read than the last.



Try looking up the documentation for `np.linspace` and `np.arange`.



In [1]:
np.linspace?

What do the documentation pages for these two functions have in common?



### Exercise: Plot Customization



Use the documentation of `plt.plot` to figure out how to modify our first plot. As a starting point, try customizing the line to be red and dashed. Then, you can play with it to match your own preferred style.

