We're going to start with a package called [_NumPy_](https://docs.scipy.org/doc/numpy/reference/). NumPy is the foundational Python package for numerical work: it provides math functions beyond what basic Python offers and stores data in an analysis-friendly form. We'll make use of NumPy throughout this prep course and the bootcamp.

You should have installed NumPy in your local environment in the previous unit. If you don't yet have NumPy installed, install it now with `pip install numpy`.

Once NumPy is installed, we have to import the package into our current environment to actually use it. We do this with an `import` statement. When importing a package you can also give it a shorter alias to use when calling its functions. Many packages have a standard shorthand, which we will follow. The shorthand for `numpy` is simply `np`. You can import the package and set the alias like this:


In [1]:
import numpy as np



Now that the package is installed on our machine and imported into the environment, we're ready to start working with NumPy.

Before we do that, however, a note about where import statements belong. An import statement works at any point in a script or in any cell of a notebook. However, [Python style](https://www.python.org/dev/peps/pep-0008/#imports) recommends placing imports at the top of the script or, in a notebook, in the first cell (most notebooks use the first cell just for this purpose). This makes it easy to check that the necessary packages are installed and keeps them all in a single place.
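
As a small illustration of that convention, here is a sketch of how the top of a script might look. The variable names (`radii`, `areas`) are made up for this example; the grouping of standard-library imports before third-party ones follows the same style guide.

```python
# Standard-library imports come first, then third-party packages.
import math

import numpy as np

# Everything below can rely on the imports above.
radii = np.array([1.0, 2.0, 3.0])
areas = math.pi * np.square(radii)
print(areas)
```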


## Arrays

As we said above, NumPy is the fundamental package for storing and manipulating mathematical data in Python. NumPy primarily accomplishes this with a new data structure: the _array_.

A NumPy array can be thought of as a Python list with additional mathematical functionality and properties. Like nested lists, an array can have multiple dimensions. A one-dimensional array works like an ordered sequence of values. Arrays use bracket notation `[` `]` to access items by index, just like lists and strings. We can create an array by calling `np.array()` and passing in a sequence of values. A list, for example:



In [2]:
x = np.array([0, 1, 2, 3])
x

array([0, 1, 2, 3])
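
To make the bracket notation concrete, here is a short sketch of indexing and slicing the same array. The names `first`, `last`, and `middle` are chosen just for this example.

```python
import numpy as np

x = np.array([0, 1, 2, 3])

first = x[0]      # first item
last = x[-1]      # negative indexes count from the end
middle = x[1:3]   # slicing returns an array containing items 1 and 2

print(first, last, middle)   # 0 3 [1 2]
print(len(x), x.shape)       # 4 (4,)
```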

Remember that you can run and re-run these code cells individually. A good shortcut for running a cell is pressing <kbd>shift</kbd> + <kbd>enter</kbd> from within the cell.

You can add dimensions to your array either by passing nested lists (a list of lists) to `np.array()` or by generating a one-dimensional array with `np.arange()` and reshaping it. Here are two ways to generate the same thing:

In [3]:
w = np.array([[0, 1, 2, 3],[4, 5, 6, 7]])
w

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [4]:
y = np.arange(8).reshape(2, 4)
y

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

`np.arange()` ("arange" is short for "array range") works similarly to the basic Python `range()` function: it generates a sequence of integers, starting at 0 by default and incrementing by 1. But instead of returning a `range` object it returns an array. We take advantage of that by calling the `.reshape()` [array method](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.reshape.html) to turn the initial eight-item, one-dimensional array into a two-dimensional array with two rows of four items each.
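
The sketch below explores these two calls a little further. It assumes nothing beyond NumPy itself; the names `evens` and `grid` are made up for illustration.

```python
import numpy as np

# Like range(), arange() accepts optional start, stop, and step arguments.
evens = np.arange(2, 10, 2)
print(evens)            # [2 4 6 8]

# reshape() needs the new shape to hold exactly the same number of items.
grid = np.arange(8).reshape(2, 4)
print(grid.shape)       # (2, 4)
print(grid[1, 2])       # row 1, column 2 -> 6

# A mismatched shape raises a ValueError rather than padding or truncating.
try:
    np.arange(8).reshape(3, 3)
except ValueError as err:
    print("reshape failed:", err)
```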

Feel free to play with it. Using `arange()` and `.reshape()` together is a common way to create arrays.


## Element-wise and Aggregator functions

Now that you've seen the basic data structure of NumPy we can introduce a few other functionalities of this package. NumPy's primary value is the ability to do more sophisticated arithmetic than basic Python can do out of the box. NumPy allows you to do these computations in two ways: with *element-wise* functions that process array elements one at a time and then return a new array, and with *aggregator* functions that process the array into a single value the function returns.

Let's play with a simple array and look at what's possible. First, some element-wise functions that return a new array:

In [5]:
x = np.array([0, 1, 2, 3])

# Square each value.
print(np.square(x))

# Square root of each value.
print(np.sqrt(x))

# Cosine of each value.
print(np.cos(x))


[0 1 4 9]
[ 0.          1.          1.41421356  1.73205081]
[ 1.          0.54030231 -0.41614684 -0.9899925 ]


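Note that these functions return arrays of the same length as the input array. This is similar to the built-in Python function `map()`, which applies a function to every item of a collection, except that `map()` returns a lazy iterator while NumPy returns an array. Use element-wise functions when you want to transform each individual element in an array and get back a collection of all the results.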
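
For comparison, here is a minimal sketch of the same squaring operation done with `map()` and with NumPy. The variable names are chosen for this example only.

```python
import numpy as np

values = [0, 1, 2, 3]
x = np.array(values)

# Plain Python: map() is lazy, so wrap it in list() to see the results.
squared_list = list(map(lambda v: v ** 2, values))

# NumPy: the function is applied to every element in one call.
squared_array = np.square(x)

print(squared_list)    # [0, 1, 4, 9]
print(squared_array)   # [0 1 4 9]

# Arithmetic operators on arrays are element-wise too.
print(x * 2 + 1)       # [1 3 5 7]
```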

Here are some aggregator functions that aggregate the elements of an array and return a single value:

In [6]:
x = np.array([0, 1, 2, 3])

# Find the maximum value.
print(np.max(x))

# Find the minimum value.
print(np.min(x))

# Find the mean of the input array.
print(np.mean(x))

# Find the standard deviation of the input array.
print(np.std(x))

3
0
1.5
1.11803398875


Note that these aggregator functions return single values, rather than arrays. That is what an _aggregator_ does. It takes a set of multiple data values and condenses them (or aggregates them) into a single value according to some rule. So `np.min()` returns the minimum value of all the data given to it, `np.mean()` the mean, and so on.
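
To see what "according to some rule" means in practice, the sketch below rebuilds the mean and standard deviation by hand and compares them with NumPy's results. It assumes the default behavior of `np.std()`, which computes the population standard deviation; `manual_mean` and `manual_std` are names made up for this example.

```python
import numpy as np

x = np.array([0, 1, 2, 3])

# Mean: the sum of the values divided by how many there are.
manual_mean = np.sum(x) / len(x)

# Standard deviation (NumPy's default): the square root of the mean
# squared deviation from the mean.
manual_std = np.sqrt(np.mean(np.square(x - manual_mean)))

print(manual_mean, np.mean(x))
print(manual_std, np.std(x))
```

Notice that the hand-written standard deviation combines element-wise steps (subtracting, squaring) with aggregating steps (`np.mean()`), which is how the two kinds of functions are typically used together.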

These are some of the basic functions of NumPy, but there are many more. We'll continue to use the package throughout the course and learn more as we go. If you'd like to know more now you can look through the basic [NumPy documentation](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html).

