# Introduction to NumPy
by Maxwell Margenot

Part of the Quantopian Lecture Series.

Notebook released under the Creative Commons Attribution 4.0 License.

NumPy is a core numerical package for Python that is used throughout the Quantopian platform. It integrates closely with Pandas, another tool we will be covering in the lecture series. NumPy adds support for multi-dimensional arrays and for mathematical functions that operate on them, which makes it easy to perform linear algebra calculations. This lecture is a collection of linear algebra examples computed using NumPy.

In [1]:
import numpy as np
import matplotlib.pyplot as plt

### Basic NumPy arrays

In [2]:
stock_list = [3.5, 5, 2, 8, 4.2]

We can make an array by passing the list to `np.array()`:

In [20]:
returns = np.array(stock_list)
print(returns, type(returns))

[3.5 5.  2.  8.  4.2] <class 'numpy.ndarray'>


You'll notice that the type of our array is `ndarray`, not just `array`. The name is short for "N-dimensional array", because NumPy arrays can be created with multiple dimensions. If we pass `np.array()` a list of lists, it will create a 2-dimensional array. If we pass a list of lists of lists, it will create a 3-dimensional array, and so on.

In [21]:
A = np.array([[1, 2], [3, 4]])
print(A, type(A))

[[1 2]
 [3 4]] <class 'numpy.ndarray'>


We can access the dimensions of an array by looking at its `shape` attribute.

In [22]:
print(A.shape)

(2, 2)
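
As a small sketch of the "list of lists of lists" case mentioned earlier, the array below (the name `cube` and its values are made up for illustration) has three dimensions. Its `shape` has one entry per dimension, and `ndim` reports the number of dimensions.

```python
import numpy as np

# Hypothetical 3-dimensional array: 2 blocks, each with 2 rows and 3 columns.
cube = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
])

print(cube.shape)  # (2, 2, 3)
print(cube.ndim)   # 3
```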


Arrays are indexed in much the same way as lists in Python. Indexing begins at $0$ and ends at $n - 1$, where $n$ is the length of the array.

In [23]:
print(returns[0], returns[len(returns) - 1])

3.5 4.2
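
Like lists, arrays also accept negative indices, which count back from the end. A minimal sketch, rebuilding the same `returns` array so the snippet runs on its own:

```python
import numpy as np

returns = np.array([3.5, 5, 2, 8, 4.2])

# -1 refers to the last element, -2 to the one before it.
last = returns[-1]            # same element as returns[len(returns) - 1]
second_to_last = returns[-2]

print(last, second_to_last)
```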


We can take a slice of an array using a colon, just like in a list.

In [24]:
print(returns[1:3])

[5. 2.]


A slice of an array, like a slice of a list, selects the elements starting at the first index given and going up to (but not including) the second index given.
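
The other list-style slice forms should work the same way on arrays. The sketch below (variable names are illustrative only) shows an omitted start, an omitted stop, and a step.

```python
import numpy as np

returns = np.array([3.5, 5, 2, 8, 4.2])

first_three = returns[:3]    # indices 0, 1, 2
from_index_2 = returns[2:]   # index 2 through the end
every_other = returns[::2]   # indices 0, 2, 4

print(first_three, from_index_2, every_other)
```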

In the case of multidimensional arrays, many of the same conventions with slicing and indexing hold. We can access the first column of a 2-dimensional array like so:

In [10]:
print(A[:, 0])

[1 3]


And the first row of a 2-dimensional array like so:

In [25]:
print(A[0, :])

[1 2]


Notice that each slice of the array returns yet another array!

In [12]:
print(type(A[0,:]))

<class 'numpy.ndarray'>
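
One consequence worth knowing: a basic slice is a view onto the original array's data rather than a copy, so writing to the slice also changes the original. The sketch below illustrates this on a fresh copy of `A`; calling `.copy()` is one way to get an independent array.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

# A basic slice shares memory with the array it came from.
first_row = A[0, :]
first_row[0] = 99          # this also modifies A[0, 0]

# An explicit copy is independent of the original.
row_copy = A[1, :].copy()
row_copy[0] = -1           # A[1, 0] is left unchanged

print(A)
```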


Passing only one index to a 2-dimensional array returns the row at that index, so `A[0]` is another way to write `A[0, :]`.

Passing one index per dimension returns only the individual element at that position.

In [13]:
print(A[1, 1])

4
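
To put the two access patterns side by side, here is a short sketch using the same `A`. Chained indexing (`A[1][1]`) reaches the same element, by first selecting the row and then indexing into it.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

second_row = A[1]      # single index: the whole row, equivalent to A[1, :]
element = A[1, 1]      # one index per dimension: a single element
chained = A[1][1]      # row first, then the element within that row

print(second_row, element, chained)
```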


#### Array functions

Functions built into NumPy can be called directly on arrays. Most are applied element-wise, in the same way that scalar multiplication is. For example, if we call `np.log()` on an array, the natural logarithm of each element is taken.

In [26]:
print(np.log(returns))

[1.25276297 1.60943791 0.69314718 2.07944154 1.43508453]
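
To make "element-wise" concrete, the sketch below compares `np.log()` on the array with an explicit Python loop over the original list using `math.log`. The two should agree up to floating-point tolerance.

```python
import math

import numpy as np

stock_list = [3.5, 5, 2, 8, 4.2]
returns = np.array(stock_list)

# One call on the whole array...
logs_vectorized = np.log(returns)

# ...matches taking the logarithm of each element in turn.
logs_loop = [math.log(x) for x in stock_list]

print(np.allclose(logs_vectorized, logs_loop))  # True
```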


Some functions return a single value because they treat the array as a collection (similar to a list) and reduce it to one number. For example, the `mean()` function does exactly what you would expect, calculating the mean of an array.

In [27]:
print(np.mean(returns))

4.54


Similarly, the `max()` function returns the maximum element of an array.

In [28]:
print(np.max(returns))

8.0
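
On multi-dimensional arrays, these reducing functions also take an optional `axis` argument that reduces along one dimension instead of over the whole array. A brief sketch on the 2-dimensional `A` from earlier:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

overall_mean = np.mean(A)         # mean of all four elements: 2.5
column_means = np.mean(A, axis=0) # one mean per column: [2., 3.]
row_means = np.mean(A, axis=1)    # one mean per row: [1.5, 3.5]
column_maxes = np.max(A, axis=0)  # largest value in each column: [3, 4]

print(overall_mean, column_means, row_means, column_maxes)
```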


For further reading on the universal functions in NumPy, check out the [documentation](https://docs.scipy.org/doc/numpy/user/quickstart.html#universal-functions).

### Return to the returns

Now let's modify our returns array with scalar values. If we add a scalar value to an array, it will be added to every element of the array. If we multiply an array by a scalar value, every element of the array will be multiplied by it. If we do both, both will happen!

In [17]:
returns*2 + 5

array([12. , 15. ,  9. , 21. , 13.4])
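
This is a place where arrays and plain lists behave differently. The sketch below contrasts the array expression with the equivalent list comprehension, and shows that `*` on a list repeats the list rather than scaling its elements.

```python
import numpy as np

stock_list = [3.5, 5, 2, 8, 4.2]
returns = np.array(stock_list)

# Array arithmetic applies to every element.
scaled = returns * 2 + 5

# The same calculation on a plain list needs an explicit loop.
scaled_list = [x * 2 + 5 for x in stock_list]

# Multiplying a list by 2 concatenates it with itself instead.
repeated = stock_list * 2

print(np.allclose(scaled, scaled_list))  # True
print(len(repeated))                     # 10
```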

NumPy also has functions specifically built to operate on arrays. Let's take the mean and standard deviation of this group of returns.

In [19]:
print("Mean: ", np.mean(returns), "Std Dev: ", np.std(returns))

Mean:  4.54 Std Dev:  1.9915822855207364
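
Note that `np.std()` computes the population standard deviation by default (it divides by $n$). If a sample standard deviation (dividing by $n - 1$) is wanted instead, the `ddof` parameter can be set to 1. A minimal sketch of the difference:

```python
import numpy as np

returns = np.array([3.5, 5, 2, 8, 4.2])

population_std = np.std(returns)       # default ddof=0, divides by n
sample_std = np.std(returns, ddof=1)   # divides by n - 1

print("Population std dev:", population_std)
print("Sample std dev:", sample_std)
```

For a small group of observations like this one, the two values differ noticeably, so it is worth being explicit about which one is intended.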
