# Numpy

In this section we'll briefly introduce the numpy library
and its core data structures, and give a hint
of the vast capabilities it provides.
To start, we'll need to import numpy.
Most often professionals will `import numpy as np`. 
The `import x as y` notation just allows us to rename a library for convenience.
Since we'll use numpy so often, it's easier if we only have to type `np` each time.
Then we can spend more of our money on ice cream and less on arthritis medication.

In [2]:
import numpy as np

Now we can access all kinds of functions, objects, and special numbers from the numpy library.

In [4]:
print(np.e)
print(np.pi)

2.718281828459045
3.141592653589793


We can also access a host of numpy functions, including those we've already encountered in the standard `math` library.

In [7]:
print(np.ceil(np.e))
print(np.sin(np.pi/2))

3.0
1.0
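As a quick check of the overlap with the standard `math` library, here is a minimal sketch comparing the two on scalar inputs. It only uses constants and functions that exist in both libraries; the one difference it highlights is the return type of `ceil`.

```python
import math
import numpy as np

# The constants are the same floating-point values in both libraries.
print(np.pi == math.pi)
print(np.e == math.e)

# Both ceil functions round up, but math.ceil returns a Python int
# while np.ceil returns a floating-point value.
print(math.ceil(math.e), type(math.ceil(math.e)))
print(np.ceil(np.e), type(np.ceil(np.e)))

# The trigonometric functions agree on scalar inputs.
print(math.isclose(math.sin(math.pi / 2), np.sin(np.pi / 2)))
```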

## Numpy's ndarray

At the heart of numpy is a powerful data object for manipulating n-dimensional arrays. 
In short, Python's lists aim for extreme flexibility, but that flexibility comes at the expense of speed.
Imagine that we had a two-dimensional array:

In [9]:
matrix1 = [[0,1,2,3,4,5,6,7,8,9],[10,11,12,13,14,15,16,17,18,19],[20,21,22,23,24,25,26,27,28,29]]
print(matrix1)

[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]]


Now at this size, for most purposes, this list is fine. But say, for example, that we wanted to calculate the average value in the list. How might we do it? We'd likely have to iterate over all rows and all columns of the list.

In [13]:
total = 0
denominator = len(matrix1) * len(matrix1[0])
for row in range(len(matrix1)):
    for col in range(len(matrix1[0])):
        total += matrix1[row][col]
print(total/denominator)

14.5


This is problematic for a couple of reasons. First, it's a lot of code to do a very standard numerical calculation. Second, because Python's lists are flexible enough to store arbitrary objects, the numbers in them are comparatively slow to access. It might seem fast to us (many thousands of accesses per second!) but compared to the same operation using arrays implemented efficiently in C, this is a glacial speed.

Numpy gives us extremely optimized tools for storing and manipulating arrays whose elements are all constrained to have the same type (selected from a restricted list of allowed types).
We can get started by creating a numpy ndarray:

In [19]:
matrix2 = np.array([[0,1,2,3,4,5,6,7,8,9],[10,11,12,13,14,15,16,17,18,19],[20,21,22,23,24,25,26,27,28,29]])
print(matrix2)

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]]


We can access an ndarray's data type via the `.dtype` attribute:

In [20]:
matrix2.dtype

dtype('int64')
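To see how the single shared type is chosen, here is a small sketch with made-up example arrays (the names `ints`, `floats`, and `explicit` are illustrative only). Note that the default integer type can depend on the platform and numpy version, so `int64` is typical but not guaranteed.

```python
import numpy as np

ints = np.array([1, 2, 3])        # all integers -> an integer dtype
floats = np.array([1, 2, 3.5])    # a single float promotes every element to float
explicit = np.array([1, 2, 3], dtype=np.float32)  # request a type explicitly

print(ints.dtype)      # an integer type, commonly int64
print(floats.dtype)    # float64
print(explicit.dtype)  # float32
```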

It's now easy to access standard numerical functions, either via the numpy library functions or via the corresponding methods of the ndarray object:

In [22]:
print(np.mean(matrix2))
print(matrix2.mean())

14.5
14.5
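To get a rough feel for the speed difference between a Python loop and numpy's built-in mean, the sketch below times both on a larger input. The array size, the helper name `list_mean`, and the repeat count are arbitrary choices made for illustration, and the measured times will vary from machine to machine.

```python
import timeit
import numpy as np

n = 1_000_000                 # arbitrary size chosen for illustration
py_list = list(range(n))      # plain Python list holding 0 .. n-1
np_array = np.arange(n)       # np.arange builds an ndarray holding 0 .. n-1

def list_mean(values):
    """Average computed with an explicit Python loop."""
    total = 0
    for v in values:
        total += v
    return total / len(values)

# Time five repetitions of each approach.
list_time = timeit.timeit(lambda: list_mean(py_list), number=5)
numpy_time = timeit.timeit(lambda: np_array.mean(), number=5)

print(list_mean(py_list), np_array.mean())  # both compute the same average
print(f"list loop: {list_time:.4f}s, numpy: {numpy_time:.4f}s")
```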


## Shape and Axes

An ndarray has one or more axes. For a 2nd-order array (i.e. a matrix),
we have two axes to work with. We can learn the size of an ndarray along each of its axes via the `.shape` attribute:


In [23]:
matrix2.shape

(3, 10)
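As a small illustration of how the shape grows with the number of axes, the sketch below builds arrays with one, two, and three axes from nested lists. The array names are made up for this example; `.ndim` and `.size` are standard ndarray attributes giving the number of axes and the total number of elements.

```python
import numpy as np

vector = np.array([1, 2, 3])                               # one axis
matrix = np.array([[1, 2, 3], [4, 5, 6]])                  # two axes
tensor = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])    # three axes

for name, arr in [("vector", vector), ("matrix", matrix), ("tensor", tensor)]:
    # ndim = number of axes, shape = length along each axis, size = element count
    print(name, arr.ndim, arr.shape, arr.size)
```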

## Creating ndarrays

There are many ways to create ndarrays. 
We already saw one: by using the `np.array(...)` function
on a Python list consisting of only numerical values.
We can also create arrays filled with ones or zeros
by passing the appropriate shape to the `np.ones()` and `np.zeros()` functions respectively.

In [46]:
print("Using ones: \n", np.ones((4,10)))
print("\nUsing zeros: \n", np.zeros((3,6)))

Using ones: 
 [[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]

Using zeros: 
 [[0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]]
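Two further options, not covered in the text above, are `np.arange` and the `reshape` method; the sketch below uses them to rebuild the 3-by-10 matrix from earlier without typing out every value. It also shows that `np.ones` produces floats by default (hence the `1.` in the output above) unless a `dtype` is passed. The variable names are illustrative only.

```python
import numpy as np

# np.arange(30) yields the integers 0 .. 29 as a one-axis array;
# reshape(3, 10) arranges them into 3 rows of 10 columns.
rebuilt = np.arange(30).reshape(3, 10)
print(rebuilt)

# np.ones and np.zeros default to floating point; dtype overrides that.
float_ones = np.ones((2, 3))
int_ones = np.ones((2, 3), dtype=int)
print(float_ones.dtype, int_ones.dtype)
```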


## Accessing values

We can access elements by punching in the indices along each axis separated by commas. 
Note that `my_array[i,j]` returns the same element as `my_array[i][j]`.

In [28]:
print(matrix2[1,6])
print(matrix2[1][6])

16
16
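The `[i][j]` form works because a single index selects a whole row, which is then indexed again. The sketch below makes that explicit and also shows slicing with `:`, which goes slightly beyond what the text covers. The array is rebuilt here with `np.arange(30).reshape(3, 10)` so the example is self-contained, and the names `row`, `column`, and `block` are illustrative.

```python
import numpy as np

# Same values as the 3-by-10 matrix used in this section.
matrix2 = np.arange(30).reshape(3, 10)

row = matrix2[1]            # a single index selects an entire row
column = matrix2[:, 6]      # ':' keeps every row, 6 picks one column
block = matrix2[0:2, 0:3]   # first two rows, first three columns

print(row)
print(row[6] == matrix2[1, 6])  # indexing the row again gives the same element
print(column)
print(block)
```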


## Updating values
Just as with any Python list, we can update a value by assigning to the element at a given index:

In [30]:
matrix2[1,6] = 314
print(matrix2)

[[  0   1   2   3   4   5   6   7   8   9]
 [ 10  11  12  13  14  15 314  17  18  19]
 [ 20  21  22  23  24  25  26  27  28  29]]
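One difference from Python lists is worth a sketch: because every element of an ndarray shares one type, an assigned value is cast to that type rather than stored as-is. The small array `a` below is a made-up example; it also shows that assigning a scalar to a whole row fills every element of that row.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # integer dtype

a[0, 0] = 9.99   # the float is cast to the array's integer type
print(a[0, 0])   # the fractional part is dropped, leaving 9

a[1] = 0         # a scalar assigned to a row fills the whole row
print(a)
```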


## Aggregating over a specific axis

For many operations, we can apply the operation selectively along a specific axis by passing the desired axis as a named argument. For example, say that we want to compute the sum of our matrix, but only over each column:

In [31]:
np.sum(matrix2, axis=0)

array([ 30,  33,  36,  39,  42,  45, 346,  51,  54,  57])

Note that the original array had shape `(3, 10)`. The resulting array has shape `(10,)`. That's because we eliminated the first axis (`axis=0`) by summing over it.
If instead we sum over `axis=1`, we'll expect a result of shape `(3,)`.

In [35]:
np.sum(matrix2, axis=1)

array([ 45, 443, 245])
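The same `axis` argument is accepted by other aggregations as well. The sketch below applies `mean`, `max`, and `sum` to a fresh 3-by-10 array (rebuilt with `np.arange`, so it does not include the 314 update above) and prints the resulting shapes. The `keepdims=True` option, which is not discussed in the text, keeps the aggregated axis with length 1 instead of removing it.

```python
import numpy as np

m = np.arange(30).reshape(3, 10)   # values 0 .. 29 in 3 rows of 10

col_means = m.mean(axis=0)               # one mean per column -> shape (10,)
row_max = m.max(axis=1)                  # one maximum per row -> shape (3,)
row_sums = m.sum(axis=1, keepdims=True)  # aggregated axis kept -> shape (3, 1)

print(col_means.shape, col_means)
print(row_max.shape, row_max)
print(row_sums.shape)
print(row_sums)
```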