# 03 NumPy

In [1]:
import numpy as np

The core of the `numpy` package is the array class, `ndarray`. Let's examine that first. We can make an array out of a sequence, like a list, by passing it to the `np.array` function.

In [2]:
d = [1, 2, 3, 4, 5]
np.array(d)

array([1, 2, 3, 4, 5])

### Data types

Unlike lists, arrays must be homogeneous: every element has the same data type. The data type of the array is upcast so that it can represent all of the data. So, if even one element is a float, all elements will be converted to floats.

In [3]:
d = [1, 2, 3.1415, 4, 5]
np.array(d)

array([ 1.    ,  2.    ,  3.1415,  4.    ,  5.    ])

You can query the data type by examining the `dtype` attribute of the array.

In [4]:
d = [1, 2, 3.1415, 4, 5]
arr = np.array(d)
arr.dtype

dtype('float64')
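As a quick check of the upcasting rule, the sketch below builds a few arrays from mixed lists and looks at the resulting `dtype`. The variable names are only for illustration, and the exact integer width is assumed to depend on the platform, so only its kind is printed.

```python
import numpy as np

ints = np.array([1, 2, 3])            # all integers -> an integer dtype
mixed = np.array([1, 2, 3.1415])      # one float -> every element becomes a float
cplx = np.array([1, 2.5, 3 + 4j])     # one complex -> every element becomes complex

print(ints.dtype.kind)    # 'i' (a signed integer type)
print(mixed.dtype)        # float64
print(cplx.dtype)         # complex128

# The dtype can also be requested explicitly when the array is created.
forced = np.array([1, 2, 3], dtype=float)
print(forced.dtype)       # float64
```

Requesting the type with `dtype=` is useful when a list of integers should be treated as floats from the start.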

As we have seen before, arrays are like multidimensional sequences. We can create a 2D array by supplying a list of lists as the argument.

In [5]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr

array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

### Array attributes

Arrays have a few other important attributes. Note that attributes never have parentheses after them; methods always do.

In [6]:
arr.size          # The number of elements in the array

6

In [7]:
arr.shape         # The shape of the array (i.e., the size of each dimension)

(2, 3)

In [8]:
arr.ndim          # The number of dimensions of the array

2
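The three attributes are tied together, which the following sketch checks on the same 2x3 array (rebuilt here under the illustrative name `a` so the block runs on its own).

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.]])

# ndim is the length of the shape tuple
print(a.ndim == len(a.shape))               # True

# size is the product of the entries of shape
print(a.size == a.shape[0] * a.shape[1])    # True

# len() of an array gives only the size of the first dimension
print(len(a))                               # 2
```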

### Setting array shape

You can set the `array.shape` attribute to change the shape of the array. Setting this attribute does not change the elements of the array, or how they are stored in memory, just how the array is seen. The new shape must hold exactly the same number of elements as the old one.

In [12]:
arr.shape = (3, 2)
arr

array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.]])

In [15]:
arr.shape = (6,)   # a 1D array with six elements; not the same as (6, 1), which is 2D
arr

array([ 1.,  2.,  3.,  4.,  5.,  6.])

In [19]:
arr.shape = (1, 6)
arr   # Note the *two* pairs of square brackets in the output: this array is 2D.

array([[ 1.,  2.,  3.,  4.,  5.,  6.]])
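Two points about setting `shape` are easy to miss, so here is a small sketch of them: a shape of `(6,)` is 1D while `(6, 1)` is 2D, and a shape that does not account for every element is rejected. The names `flat` and `col` are made up for this example.

```python
import numpy as np

flat = np.arange(6.0)          # 0., 1., ..., 5. with shape (6,)
col = np.arange(6.0)
col.shape = (6, 1)             # 2D: six rows, one column

print(flat.ndim, col.ndim)     # 1 2

# The new shape must account for every element; otherwise NumPy raises ValueError.
try:
    flat.shape = (4, 2)        # 8 slots for 6 elements
except ValueError as err:
    print("cannot set shape:", err)

print(flat.shape)              # (6,)  (unchanged after the failed assignment)
```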

### Array methods

Arrays have a number of methods. Let's take a look at the `mean` method as an example.

In [21]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])  # reset the array to our 2x3 array.

arr.mean()        # The mean of all of the elements in the array

3.5

`mean` takes the optional argument `axis`, which can be used to take the mean along a single axis of the array. Just like with indexing, the axes are referenced in a zero-based system; `axis=0` means the first dimension.

In [11]:
arr.mean(axis=0)  # The mean along the first dimension (down each column)

array([ 2.5,  3.5,  4.5])

In this case, there are two rows in the first dimension, and `arr.mean(axis=0)` takes the average in the 'row' direction, resulting in a 1D array that is the average of each column.
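To make the effect of `axis=0` concrete, the sketch below compares `mean(axis=0)` on the 2x3 array with the same average worked out by hand from the two rows. The names `a`, `col_mean`, and `by_hand` are illustrative only.

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.]])   # shape (2, 3)

col_mean = a.mean(axis=0)     # collapse the first dimension (the two rows)
print(col_mean.shape)         # (3,)
print(col_mean.tolist())      # [2.5, 3.5, 4.5]

# The same numbers, computed by averaging the first and second rows directly
by_hand = (a[0] + a[1]) / 2
print(np.array_equal(col_mean, by_hand))     # True
```

The axis that is named in the call is the one that disappears from the shape of the result.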

---
### *Exercise*

> Find the mean of the array in the 'column' direction, along `axis=1`.

> Use the `sum` method of the array class to get the sum of the numbers in each column. The result should be a 1D array with three elements.

---

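You can also use the `reshape` method to change the shape of an array. Unlike assigning to `shape`, `reshape` returns an array with the new shape and leaves the shape of the original array unchanged.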

In [20]:
arr.shape

(2, 3)
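A minimal sketch of `reshape` in use follows, again on a freshly built 2x3 array so that it does not depend on earlier cells. The names `a`, `b`, and `c` are placeholders.

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.]])

b = a.reshape(3, 2)        # reshape returns an array with the new shape
print(a.shape)             # (2, 3)  (the original is left as it was)
print(b.shape)             # (3, 2)

c = a.reshape(-1)          # -1 asks NumPy to work out that dimension from the size
print(c.shape)             # (6,)
```

Because the original array keeps its shape, `reshape` is often a safer choice than assigning to `shape` when the same data is needed in more than one layout.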

NOTES: Things to include

- views of arrays
- vectorization
- basic performance evaluation
- linear algebra
- array broadcasting
- overview of scipy packages
