# NumPy basics

In [2]:
import numpy as np

## What is NumPy?

The [NumPy homepage](https://www.numpy.org) explains:

* NumPy is a fundamental package for scientific computing in Python
* it has a powerful N-dimensional array object: ``np.ndarray``, usually created with ``np.array()``

## creating a numpy array

Create a 3 x 3 array:

In [93]:
foo_a = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])

foo_a.shape  # return the dimensions of the array

(3, 3)

In [94]:
print(foo_a)

[[ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]]
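
As a minimal sketch of an alternative, the same array can be built from a range of numbers instead of typing out every value. ``np.arange`` and ``reshape`` are not used elsewhere in this section, and the names ``foo_range`` and ``foo_manual`` are made up for this illustration:

```python
import numpy as np

# Build the values 1.0 ... 9.0 and arrange them in 3 rows and 3 columns.
foo_range = np.arange(1., 10.).reshape(3, 3)

# The same content, typed out by hand.
foo_manual = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])

print(np.array_equal(foo_range, foo_manual))  # True
print(foo_range.shape)                        # (3, 3)
```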


Create a 3 x 3 array with random numbers, drawn from a uniform distribution over ``[0, 1)``:

In [58]:
foo_rand = np.random.rand(3, 3)
print(foo_rand)

[[ 0.42743052  0.39807618  0.60244859]
 [ 0.81798928  0.53784336  0.78194494]
 [ 0.03345745  0.00732675  0.07813795]]
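
The numbers above will differ on every run. If reproducible values are needed, one option is to seed the global random generator first. The sketch below assumes the legacy ``np.random.seed`` interface, matching the ``np.random.rand`` call above; the seed value 42 and the names ``first`` and ``second`` are arbitrary:

```python
import numpy as np

np.random.seed(42)             # fix the state of the global random generator
first = np.random.rand(3, 3)

np.random.seed(42)             # same seed again ...
second = np.random.rand(3, 3)

print(np.array_equal(first, second))             # ... same numbers: True
print((first >= 0).all() and (first < 1).all())  # all values lie in [0, 1)
```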


## indexing

Keep in mind that Python indexing is **zero-based**: the first row has index 0.

Show the first row of the array:

In [25]:
foo_a[0, :]

array([ 1.,  2.,  3.])

Show the last column:

In [27]:
foo_a[:, -1]

array([ 3.,  6.,  9.])
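
A few more indexing patterns, as a small sketch on a fresh array (the name ``a`` is only used for this illustration):

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])

print(a[1, 2])     # single element: second row, third column -> 6.0
print(a[-1, :])    # last row (negative indices count from the end)
print(a[:2, 1:])   # first two rows, last two columns: a 2 x 2 block
```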

## slicing, viewing, copying

### reference
Simple assignments do not copy the object, but create a reference to the original object: ``foo_a`` and ``foo_b`` are different names for the same object.

In [4]:
foo_b = foo_a
foo_b

array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.]])

In [5]:
foo_b is foo_a

True

### slicing

In [6]:
b_slice = foo_b[0, :]
b_slice

array([ 1.,  2.,  3.])

``b_slice`` is called a _view_. It is a new array object, but it shares its data with ``foo_b``.

In [7]:
b_slice[0] = 7.
b_slice

array([ 7.,  2.,  3.])

In [8]:
foo_b

array([[ 7.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.]])

Any changes made to the view are also reflected in the full object. This makes looping over arrays easy, as one doesn't have to worry about indices:

In [9]:
for row in foo_b:
    row *= 10.  # in-place: modifies the row of foo_b itself
    print(foo_b)

[[ 70.  20.  30.]
 [  4.   5.   6.]
 [  7.   8.   9.]]
[[ 70.  20.  30.]
 [ 40.  50.  60.]
 [  7.   8.   9.]]
[[ 70.  20.  30.]
 [ 40.  50.  60.]
 [ 70.  80.  90.]]


And since ``foo_b`` is a reference to ``foo_a``, ``foo_a`` has changed as well:

In [12]:
foo_a

array([[ 70.,  20.,  30.],
       [ 40.,  50.,  60.],
       [ 70.,  80.,  90.]])
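
Note that the loop above only changes the array because ``*=`` works in place. The following sketch contrasts it with a plain assignment inside the loop, which merely rebinds the loop variable; the array ``a`` and the name ``after_rebinding`` are made up for this illustration:

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])

for row in a:
    row = row * 10.   # creates a new array and rebinds the name ``row``
after_rebinding = a.copy()
print(after_rebinding)  # still the original values

for row in a:
    row *= 10.        # in-place: writes through the view into ``a``
print(a)              # every element multiplied by 10
```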

### copying

In [13]:
foo_c = foo_a.copy()
foo_c

array([[ 70.,  20.,  30.],
       [ 40.,  50.,  60.],
       [ 70.,  80.,  90.]])

In [14]:
foo_c is foo_a

False

``copy()`` creates a deep copy of ``foo_a``, i.e., a new object containing a copy of the content of ``foo_a``. Making changes to this copy does not change the original object:

In [15]:
foo_c[0, 1] = 1.
foo_c

array([[ 70.,   1.,  30.],
       [ 40.,  50.,  60.],
       [ 70.,  80.,  90.]])

In [16]:
foo_a

array([[ 70.,  20.,  30.],
       [ 40.,  50.,  60.],
       [ 70.,  80.,  90.]])
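
Whether two arrays share data can also be checked directly. This is a small sketch using ``np.shares_memory`` and the ``base`` attribute, neither of which appears elsewhere in this section; the names ``a``, ``view`` and ``copied`` are illustrative:

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.]])
view = a[0, :]       # slicing gives a view
copied = a.copy()    # copy() gives independent data

print(np.shares_memory(a, view))    # True
print(np.shares_memory(a, copied))  # False
print(view.base is a)               # True: the view is backed by ``a``
print(copied.base is None)          # True: the copy owns its data
```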

## stacking arrays

In [19]:
foo_d = foo_a.copy() / 10
foo_d

array([[ 7.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.]])

Stack the arrays ``foo_a`` and ``foo_d`` **horizontally**:

In [22]:
stacked = np.hstack((foo_a, foo_d))
stacked

array([[ 70.,  20.,  30.,   7.,   2.,   3.],
       [ 40.,  50.,  60.,   4.,   5.,   6.],
       [ 70.,  80.,  90.,   7.,   8.,   9.]])

Stack the arrays ``foo_a`` and ``foo_d`` **vertically**:

In [79]:
stacked = np.vstack((foo_a, foo_d))
stacked

array([[ 70.,  20.,  30.],
       [ 40.,  50.,  60.],
       [ 70.,  80.,  90.],
       [  7.,   2.,   3.],
       [  4.,   5.,   6.],
       [  7.,   8.,   9.]])
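
For 2-D arrays, both kinds of stacking can also be expressed with ``np.concatenate`` and an explicit axis. The sketch below uses smaller, made-up arrays ``a`` and ``b`` to keep the output short:

```python
import numpy as np

a = np.array([[70., 20.], [40., 50.]])
b = a / 10

side_by_side = np.concatenate((a, b), axis=1)  # like np.hstack for 2-D arrays
on_top = np.concatenate((a, b), axis=0)        # like np.vstack for 2-D arrays

print(side_by_side.shape)  # (2, 4)
print(on_top.shape)        # (4, 2)
```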

### doing the opposite: splitting
Split the array vertically into 2 equally sized arrays:

In [87]:
foo_x = np.vsplit(stacked, 2)

# what is the returned object?
type(foo_x)

list

In [88]:
len(foo_x)  # how long is the list?

2

In [90]:
print(foo_x[0])
print(foo_x[1])

[[ 70.  20.  30.]
 [ 40.  50.  60.]
 [ 70.  80.  90.]]
[[ 7.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]]
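
``np.vsplit`` with an integer requires that the rows divide evenly. As a sketch of what happens otherwise, and of ``np.array_split`` as one possible alternative that tolerates unequal parts (the array ``six_rows`` is made up for this illustration):

```python
import numpy as np

six_rows = np.arange(18.).reshape(6, 3)

try:
    np.vsplit(six_rows, 4)   # 6 rows cannot be divided into 4 equal parts
except ValueError as err:
    print("vsplit failed:", err)

# array_split works along the first axis by default and allows unequal parts.
parts = np.array_split(six_rows, 4)
print([p.shape[0] for p in parts])  # [2, 2, 1, 1]
```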


## simple operations on numpy arrays

### minimum and maximum

In [41]:
print(foo_a)  # to have a point of reference
print(foo_a.min())
print(foo_a.max())

[[ 70.  20.  30.]
 [ 40.  50.  60.]
 [ 70.  80.  90.]]
20.0
90.0


### sum

In [47]:
foo_a.sum()

510.0

Print the sum of each column (``axis=0``) and of each row (``axis=1``):

In [43]:
print(foo_a.sum(axis=0))  # sum over the rows: one value per column
print(foo_a.sum(axis=1))  # sum over the columns: one value per row

[ 180.  150.  180.]
[ 120.  150.  240.]
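
The ``axis`` argument names the axis that is collapsed. A short sketch that spells this out, recreating the values of ``foo_a`` in a fresh array ``a`` (``col_sums`` and ``row_sums`` are illustrative names):

```python
import numpy as np

a = np.array([[70., 20., 30.], [40., 50., 60.], [70., 80., 90.]])

col_sums = a.sum(axis=0)   # collapse the rows: one sum per column
row_sums = a.sum(axis=1)   # collapse the columns: one sum per row

print(col_sums)  # column sums: 180, 150, 180
print(row_sums)  # row sums: 120, 150, 240
print(np.array_equal(col_sums, a[0] + a[1] + a[2]))  # True
```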


### cumulative sum

In [46]:
foo_a.cumsum()

array([  70.,   90.,  120.,  160.,  210.,  270.,  340.,  420.,  510.])
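
Without an ``axis`` argument, ``cumsum()`` works on the flattened array, as the output above shows. A brief sketch of the per-axis variants, again on a fresh array ``a`` with the values of ``foo_a`` (``along_rows`` and ``down_columns`` are illustrative names):

```python
import numpy as np

a = np.array([[70., 20., 30.], [40., 50., 60.], [70., 80., 90.]])

along_rows = a.cumsum(axis=1)    # running sum within each row
down_columns = a.cumsum(axis=0)  # running sum down each column

print(along_rows)
print(down_columns)
```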

### mean

In [48]:
foo_a.mean()

56.666666666666664

In [51]:
np.mean(foo_a)

56.666666666666664

## Excursus: don't import *

In [52]:
print(sum(foo_a))     # Python's built-in sum()
print(np.sum(foo_a))  # NumPy's sum()

[ 180.  150.  180.]
510.0


In [55]:
from numpy import *

In [56]:
print(sum(foo_a))  # the name ``sum`` now refers to NumPy's sum()

510.0


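_Explanation_: after ``from numpy import *``, the name ``sum`` refers to NumPy's ``sum()`` and shadows Python's built-in ``sum()``. The built-in iterates over the rows and adds them up, which yields the column sums, whereas NumPy's version adds up all elements. To avoid such surprises, import the package under a name (``import numpy as np``).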
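
The difference between the two functions can be reproduced without the star import. This is a minimal sketch on a fresh array ``a`` with the values of ``foo_a``; ``builtin_total`` and ``numpy_total`` are illustrative names:

```python
import numpy as np

a = np.array([[70., 20., 30.], [40., 50., 60.], [70., 80., 90.]])

builtin_total = sum(a)    # built-in: adds the rows one by one -> column sums
numpy_total = np.sum(a)   # NumPy: adds all elements -> a single number

print(builtin_total)
print(numpy_total)        # 510.0
```

With the ``np.`` prefix it is always clear which of the two functions is called.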