# Numpy Tutorial

`numpy` is a popular Python library for linear algebra and working with matrices. This notebook walks through some basic features of `numpy`.

## Introduction

In [None]:
import numpy as np

`numpy` has many math functions and constants built in. For example, `numpy` stores the value of pi:

In [None]:
np.pi

3.141592653589793

`numpy` is also able to generate random numbers using the `randint` function.

For example, to generate 2 random integers from 0 up to (but not including) 10, we write:

In [None]:
np.random.randint(0, 10, 2)

array([7, 9])

We can seed the random number generator so that it always produces the same output (much like using `random_state=1` with `sklearn`).


In [None]:
np.random.seed(1337)
np.random.randint(0, 10, 2)

array([7, 8])
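As a minimal sketch of what seeding buys us, the snippet below resets the same seed twice and checks that both draws match. The variable names `first` and `second` are illustrative only.

```python
import numpy as np

np.random.seed(1337)
first = np.random.randint(0, 10, 2)

np.random.seed(1337)  # reset the generator to the same seed
second = np.random.randint(0, 10, 2)

# Both draws come from the same starting state, so they are identical
print(np.array_equal(first, second))
```

Without the second call to `np.random.seed`, the generator would continue from where it left off, and the two draws would generally differ.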

`numpy` is great at managing lists of data efficiently using arrays. To generate an array of 4 zeros we write:

In [None]:
np.zeros(4)

array([0., 0., 0., 0.])

Similarly, we can generate an array of only 1's using the `ones` function.

In [None]:
np.ones(4)

array([1., 1., 1., 1.])
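Both functions also accept a shape tuple and a `dtype` argument. The short sketch below assumes we want a 2x3 grid of zeros and an integer array of ones; the names `grid` and `int_ones` are made up for illustration.

```python
import numpy as np

grid = np.zeros((2, 3))           # a tuple gives a multi-dimensional shape
int_ones = np.ones(4, dtype=int)  # the default dtype is float; request integers here

print(grid.shape)      # (2, 3)
print(int_ones.dtype)  # an integer dtype (exact width depends on the platform)
```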

Another powerful function is the `arange` function (not to be confused with the word arrange), which generates evenly spaced numbers from a start value up to, but not including, a stop value. To generate an array of 0,1,2,3,4,5 we use the following:

In [None]:
np.arange(0, 6)

array([0, 1, 2, 3, 4, 5])
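`arange` also takes an optional step, and the start can be omitted. The following sketch shows both forms; `evens` and `short` are illustrative names.

```python
import numpy as np

evens = np.arange(2, 10, 2)  # start=2, stop=10 (excluded), step=2
short = np.arange(6)         # start defaults to 0

print(evens)  # [2 4 6 8]
print(short)  # [0 1 2 3 4 5]
```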

In addition to one-dimensional arrays, `numpy` can store data in more than one dimension. To create a 2x3 matrix from an array, we use the `reshape` function:

In [None]:
a = np.arange(0, 6)
a = a.reshape(2, 3)
a

array([[0, 1, 2],
       [3, 4, 5]])

Note that `reshape` only works when the total number of elements matches the requested shape. Here `a` has 6 elements, so reshaping it to 4x5 (20 elements) raises a `ValueError`.

In [None]:
# a.reshape(4, 5)
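To see the failure without stopping the notebook, one option is to catch the exception. The sketch below also shows `-1`, which asks `numpy` to infer one dimension from the array size. The names `reshape_failed` and `inferred` are illustrative.

```python
import numpy as np

a = np.arange(0, 6)

try:
    a.reshape(4, 5)  # 6 elements cannot fill 4 * 5 = 20 slots
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)

# -1 tells numpy to work out that dimension: 6 elements / 3 rows = 2 columns
inferred = a.reshape(3, -1)
print(inferred.shape)  # (3, 2)
```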

We can "undo" a reshape by using the `flatten` function.

In [None]:
a.flatten()

array([0, 1, 2, 3, 4, 5])

Note that `reshape` and `flatten` do **not** change the original array; they return a new array object, so `a` is still 2x3 after calling `a.flatten()`:

In [None]:
a

array([[0, 1, 2],
       [3, 4, 5]])
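One subtlety worth knowing: `reshape` returns a view of the same data when it can, while `flatten` always returns a copy. The sketch below uses `np.shares_memory` to check this for a small contiguous array; the names `original`, `reshaped`, and `flat` are illustrative.

```python
import numpy as np

original = np.arange(0, 6)
reshaped = original.reshape(2, 3)  # a view here, since the data is contiguous
flat = reshaped.flatten()          # always a copy

print(original.shape)                       # (6,), the shape is untouched
print(np.shares_memory(original, reshaped)) # True
print(np.shares_memory(reshaped, flat))     # False

flat[0] = 99         # only changes the copy
reshaped[0, 0] = 42  # also visible through `original`, because it is a view
print(original[0])   # 42
```

So the shape of the original array never changes, but writing into a reshaped view can change the underlying values.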

## Operations on arrays

`numpy` also allows us to apply functions to entire arrays at once. For example, we can take the log of every element in an array using the `np.log` function. Because `a` contains 0, `numpy` emits a `RuntimeWarning` and returns `-inf` for that element:

In [None]:
np.log(a)

array([[      -inf, 0.        , 0.69314718],
       [1.09861229, 1.38629436, 1.60943791]])
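If the `-inf` is expected, one way to silence the divide-by-zero warning is the `np.errstate` context manager. The sketch below also shows `np.log1p`, which computes log(1 + x) and so is defined at 0. Whether either approach suits your data is a judgment call; the names `logs` and `shifted` are illustrative.

```python
import numpy as np

a = np.arange(0, 6).reshape(2, 3)

# Temporarily ignore divide-by-zero warnings inside this block only
with np.errstate(divide="ignore"):
    logs = np.log(a)

print(np.isneginf(logs[0, 0]))  # True: log(0) is still -inf

# log1p(x) = log(1 + x), so the zero entry maps to 0.0 instead of -inf
shifted = np.log1p(a)
print(shifted[0, 0])  # 0.0
```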

The `argmin` function returns the index of the minimum value along a given axis.

Note that axes can be difficult to understand at first. If your array is two-dimensional, axis=0 runs down the rows (so a reduction over axis=0 gives one result per column) and axis=1 runs across the columns (one result per row).

Let's use `argmin` to determine the index of the minimum element inside of the following array.

In [None]:
print(a.flatten())
np.argmin(a)

[0 1 2 3 4 5]


0

Note that when you omit the `axis` argument, `argmin` calculates the index using the flattened array.
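A flattened index can be converted back into row and column coordinates with `np.unravel_index`. The sketch below uses a made-up array `m` whose minimum is not in the first position, so the conversion is easier to see.

```python
import numpy as np

# Hypothetical example array; its minimum (1) sits at row 1, column 1
m = np.array([[4, 9, 2],
              [7, 1, 8]])

flat_index = np.argmin(m)                         # index into the flattened array
row, col = np.unravel_index(flat_index, m.shape)  # back to 2-D coordinates

print(flat_index)  # 4
print(row, col)    # 1 1
```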

Now let's try the same thing but using `axis=0`

In [None]:
print(a)
np.argmin(a, axis=0)

[[0 1 2]
 [3 4 5]]


array([0, 0, 0])

`argmin` returns an array with three elements, one per column, where each element is the row index that holds the minimum element of that column.

What happens if we use `axis=1`?

In [None]:
print(a)
np.argmin(a, axis=1)

[[0 1 2]
 [3 4 5]]


array([0, 0])

Likewise, we can use `argmax` to find the indices of the maximal elements along an axis.

In [None]:
np.argmax(a, axis=0)

array([1, 1, 1])

We can also apply aggregation functions along an axis. Let's compute the sum of the elements in each column.

In [None]:
np.sum(a, axis=0)

array([3, 5, 7])
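For comparison, here is a small sketch of the same array summed along each axis and with no axis at all. The variable names are illustrative.

```python
import numpy as np

a = np.arange(0, 6).reshape(2, 3)

col_sums = np.sum(a, axis=0)  # collapse the rows: one sum per column
row_sums = np.sum(a, axis=1)  # collapse the columns: one sum per row
total = np.sum(a)             # no axis: sum over every element

print(col_sums)  # [3 5 7]
print(row_sums)  # [ 3 12]
print(total)     # 15
```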

Another potentially useful function is `bincount`, which counts how many times each non-negative integer occurs in the array. For example, suppose we declared an array `b`:

In [None]:
b = np.array([0, 1, 1, 3, 2, 1, 7])
print("Index: 0  1  2  3  4  5  6  7")
np.bincount(b)

Index: 0  1  2  3  4  5  6  7


array([1, 3, 1, 1, 0, 0, 0, 1])

By default, the length of the array returned by `bincount` is one greater than the largest element in the input array.
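If a fixed number of bins is needed, `bincount` accepts a `minlength` argument that pads the result with zeros. A minimal sketch, assuming we want at least 10 bins (the names `counts` and `padded` are illustrative):

```python
import numpy as np

b = np.array([0, 1, 1, 3, 2, 1, 7])

counts = np.bincount(b)                # length is max(b) + 1 = 8
padded = np.bincount(b, minlength=10)  # padded with zeros up to length 10

print(len(counts))  # 8
print(padded)       # [1 3 1 1 0 0 0 1 0 0]
```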

## Comparing Arrays

Finally, we can use `numpy` to compare arrays for equality using the `array_equal` function:

In [None]:
c = np.array([1, 2])
d = np.array([2, 1])
np.array_equal(c, d)

False

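When comparing floats, exact equality is often too strict because of rounding error, so it is often necessary to use the `allclose` function, which checks equality within a tolerance: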

In [None]:
e = np.array([1e10, 1e-8])
f = np.array([1.00001e10, 1e-9])
np.allclose(e, f)

True
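The result depends on the tolerances. `allclose` checks `|e - f| <= atol + rtol * |f|` for every element, with defaults `rtol=1e-05` and `atol=1e-08`. The sketch below uses `np.isclose` to see the per-element result and then tightens the tolerances to show the comparison failing; the names `elementwise` and `strict` are illustrative.

```python
import numpy as np

e = np.array([1e10, 1e-8])
f = np.array([1.00001e10, 1e-9])

# Element-by-element version of allclose, using the default tolerances
elementwise = np.isclose(e, f)
print(elementwise)  # [ True  True]

# With a much tighter relative tolerance and no absolute tolerance,
# neither pair is considered close any more
strict = np.allclose(e, f, rtol=1e-9, atol=0.0)
print(strict)  # False
```

Note that the default `atol` makes any two values smaller than about `1e-8` compare as close, which may not be what you want for very small numbers.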