# Introduction to Scientific Python 

## Numpy

By convention, Numpy is imported under the alias np:

In [1]:
import numpy as np

### Array creation

The basic data structure provided by Numpy is the ndarray (n-dimensional array). Each array can store items of the same datatype (e.g. integers, floats, etc.).
Arrays can be created in a number of ways:

- from a python iterable (usually a list)

In [2]:
a = np.array([0, 1, 2]) # a rank 1 array of integers

- using arange (similar to Python's builtin range but returns an array):

In [3]:
a = np.arange(start=0., stop=1., step=.1) # a rank 1 array of floats 
                                          # note that start is included but stop is not
b = np.arange(10) # a rank 1 array of ints from 0 to 9

- using linspace and logspace

In [4]:
a = np.linspace(start=0, stop=5, num=10) # a rank 1 array of 10 evenly-spaced floats between 0 and 5
                                         # this can be thought of as a linear axis of a graph with evenly-spaced ticks
b = np.logspace(-6, -1, 6) # 6 values evenly spaced on a log scale: 10^-6, 10^-5, ..., 10^-1

- using built-in helper functions

In [47]:
zeros = np.zeros(5) # rank 1 array of 5 zeros
ones = np.ones(3)   # it's in the name
eye = np.eye(8) # 8x8 identity matrix
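
Since every element of an array shares one datatype, it can help to check which type Numpy has chosen. The sketch below is only illustrative (the variable names are made up for this example). It shows how the ```dtype``` attribute reports the type and how a single float in the input upcasts the whole array.

```python
import numpy as np

ints = np.array([0, 1, 2])                 # all integers -> an integer dtype
mixed = np.array([0, 1.5, 2])              # one float upcasts every element to float
forced = np.array([0, 1, 2], dtype=float)  # the dtype can also be requested explicitly

print(ints.dtype)   # an integer type; the exact width depends on the platform
print(mixed.dtype)  # float64
print(forced)       # [0. 1. 2.]
```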

### Exercises

- create a set of axes for a graph as Numpy arrays. Let the x axis be **uniformly spaced** from 0 to 60 with scale 1 (i.e. one tick per value) and the y axis be on **log scale** and represent the powers of 10 from 0 to 6 *[1]*.

In [30]:
# answer:
x = np.linspace(0, 60, 61)
y = np.logspace(0, 6, 7)

### Array indexing and shapes

Basic array indexing is similar to Python lists. However, arrays can be indexed along multiple dimensions.

In [5]:
a = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8]])
print(a[0])    # prints [0, 1, 2]
print(a[0, 0]) # prints 0
print(a[:, 1]) # prints [1, 4, 7]

[0 1 2]
0
[1 4 7]


Here : denotes all elements along the axis. For a 2D array (i.e. a matrix) axis 0 is along rows and axis 1 along columns.
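
As a small illustration of the axis convention described above, the following sketch selects a full row, a full column and a single element. It assumes the same 3x3 array as in the previous cell; the names ```row```, ```col``` and ```last``` are illustrative only.

```python
import numpy as np

a = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8]])

row = a[1, :]     # index 1 along axis 0, everything along axis 1 (same as a[1])
col = a[:, 1]     # everything along axis 0, index 1 along axis 1
last = a[-1, -1]  # negative indices count from the end, as with Python lists

print(row)   # [3 4 5]
print(col)   # [1 4 7]
print(last)  # 8
```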

#### Slicing
Arrays support slicing along each axis, which works very similarly to Python lists.

In [22]:
a = np.linspace(0, 9, 10)
print(a[0:5]) # prints [0., 1., 2., 3., 4.]
b = np.array([[0,  1,  2],
              [3,  4,  5],
              [6,  7,  8],
              [9, 10, 11]])

print(b[:, 0:2]) # prints [[0,  1]
                 #         [3,  4]
                 #         [6,  7]
                 #         [9, 10]]

[ 0.  1.  2.  3.  4.]
[[ 0  1]
 [ 3  4]
 [ 6  7]
 [ 9 10]]
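
Slices also accept a step, just like Python lists, and each axis can be given its own slice. A minimal sketch with illustrative variable names:

```python
import numpy as np

a = np.arange(10)
every_other = a[2:8:2]   # start at 2, stop before 8, step 2
reversed_a = a[::-1]     # a negative step walks the axis backwards

b = np.array([[0,  1,  2],
              [3,  4,  5],
              [6,  7,  8],
              [9, 10, 11]])
block = b[1:3, ::2]      # rows 1 and 2, columns 0 and 2

print(every_other)  # [2 4 6]
print(reversed_a)   # [9 8 7 6 5 4 3 2 1 0]
print(block)        # [[3 5]
                    #  [6 8]]
```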


#### Boolean indexing
Arrays can also be indexed with a boolean array (a mask) of the same shape. Only the elements at positions where the mask is True are selected.

In [13]:
a = np.arange(10)
print(a[a < 5]) # prints [0, 1, 2, 3, 4]
print(a[a % 2 == 0]) # prints [0, 2, 4, 6, 8]

[0 1 2 3 4]
[0 2 4 6 8]
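
To make the mechanism more explicit, the sketch below prints the mask itself and combines two conditions. Note that conditions are combined with ```&``` (and) or ```|``` (or), each wrapped in parentheses; the names ```mask``` and ```both``` are illustrative.

```python
import numpy as np

a = np.arange(10)

mask = a < 5          # a comparison produces a boolean array of the same shape
print(mask.dtype)     # bool
print(a[mask])        # [0 1 2 3 4]

both = a[(a % 2 == 0) & (a > 3)]  # even AND greater than 3
print(both)           # [4 6 8]
```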


#### Shapes
The shape of the array is a tuple of size ```(array.ndim)```, where each element is the size of the array along that axis.

In [8]:
a = np.array([0, 1, 2])
print(a.shape) # prints (3,)
b = np.array([[[0], [1], [2]],
              [[3], [4], [5]],
              [[6], [7], [8]]])
print(b.shape) # prints (3, 3, 1)

(3,)
(3, 3, 1)
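
The relation between ```shape```, ```ndim``` and the total number of elements can be checked directly. This is a minimal sketch; the array of zeros is just a stand-in with the same shape as ```b``` above.

```python
import numpy as np

b = np.zeros((3, 3, 1))

print(b.ndim)                  # 3, the number of axes
print(len(b.shape) == b.ndim)  # True, the shape has one entry per axis
print(b.size)                  # 9, the product of the entries of the shape
```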


The shape of the array can be changed using the reshape method.

In [9]:
a = np.arange(9).reshape((3, 3))
print(a) # prints [[0 1 2]
         #         [3 4 5]
         #         [6 7 8]]

[[0 1 2]
 [3 4 5]
 [6 7 8]]


Note that the total number of elements in the new array has to be equal to the number of elements in the initial array. Otherwise reshape raises a ValueError, as the code below shows.

In [10]:
try:
    a = np.arange(5).reshape((3, 3))
except ValueError as e:
    print(e)

cannot reshape array of size 5 into shape (3,3)
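
When only one dimension is unknown, ```-1``` can be passed in its place and Numpy infers the size from the total number of elements. A minimal sketch with illustrative names:

```python
import numpy as np

a = np.arange(12)
b = a.reshape((3, -1))   # -1 is inferred as 12 / 3 = 4

print(b.shape)           # (3, 4)
print(a.size == b.size)  # True, reshaping never changes the number of elements
```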


### Exercises

- get the first column and the third row of the following array: *[1]*

In [37]:
a = np.array([[23, 324,  21, 116], 
              [ 0,  55, 232, 122],
              [42,  43,  44,  45],
              [178, 67, 567,  55]])

In [None]:
# answer
first_column = a[:, 0] 
third_row = a[2]

- find all numbers divisible by 7 between 0 and 50 *[1]*

In [39]:
# answer
a = np.arange(0, 51)
divisible_by_7 = a[a % 7 == 0]

In [None]:
- an image can be represented as a 3D array with shape ```[height, width, n_channels]```. Write a function to 

### Mathematical operations and broadcasting

#### Basic arithmetic
Numpy arrays support the standard mathematical operations, which are applied elementwise.

In [20]:
a = np.arange(9).reshape(3, 3)
b = np.array([10, 11, 12])
print(a + 1) # operation is performed on the whole array
print('-'*12)
print(a - 5)
print('-'*12)
print(b * 3)
print('-'*12)
print(b / 2)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
------------
[[-5 -4 -3]
 [-2 -1  0]
 [ 1  2  3]]
------------
[30 33 36]
------------
[ 5.   5.5  6. ]


#### Broadcasting
Operations involving two or more arrays are also applied elementwise. If the arrays have different shapes, the shapes must be compatible under the rules of broadcasting.

In [19]:
print(a * a) # multiply each element of a by itself
print('-'*12)
print(a + b) # add b elementwise to every row of a
print('-'*12)
print(a[:, 0] + b) # add b elementwise to the first column of a (both have shape (3,))

[[ 0  1  4]
 [ 9 16 25]
 [36 49 64]]
------------
[[10 12 14]
 [13 15 17]
 [16 18 20]]
------------
[10 14 18]


Notice that in the second example above (```a + b```) we could perform the operation even though the arrays did not have the same shape: ```b```, of shape (3,), is treated as if it were repeated for every row of ```a```. The same mechanism is what allows adding a scalar to a whole array. This is one of the most powerful features of Numpy and allows for very efficient computations. You can read more about broadcasting [in the official documentation](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) and [here](http://cs231n.github.io/python-numpy-tutorial/#numpy-broadcasting).
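
The following sketch contrasts broadcasting along rows with broadcasting along columns, and shows what happens when the shapes are incompatible. It reuses ```a``` and ```b``` from the cells above; the other names are illustrative.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)  # shape (3, 3)
b = np.array([10, 11, 12])      # shape (3,)

rows = a + b                # b is added to every row of a
cols = a + b.reshape(3, 1)  # a (3, 1) column is added to every column of a

print(rows.shape, cols.shape)  # (3, 3) (3, 3)
print(cols)  # [[10 11 12]
             #  [14 15 16]
             #  [18 19 20]]

error_raised = False
try:
    a + np.array([1, 2])    # shapes (3, 3) and (2,) are not compatible
except ValueError as e:
    error_raised = True
    print(e)
```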

#### Built-in functions
Numpy has efficient implementations of many standard mathematical and statistical functions.

In [71]:
a = np.random.normal(0, 1, (4, 3)) # np.random.normal creates an array of normally-distributed random numbers
                                   # with given mean and standard deviation (e.g. 0 and 1) and in given shape.
print(np.mean(a)) # roughly 0 (with only 12 samples the estimate is noisy)
print(np.mean(a) == a.mean()) # many functions are also implemented as methods in the ndarray class
print(np.std(a)) # roughly 1

print(np.sum(a))  # Compute sum of all elements
print(np.sum(a, axis=0))  # Compute sum of each column
print(np.sum(a, axis=1))  # Compute sum of each row

-0.299042695639
True
0.739101722217
-3.58851234766
[-0.0193929  -0.66852686 -2.90059259]
[-2.11980549  0.12013663 -0.17582538 -1.41301811]
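
Because the random numbers above change on every run, a small fixed array makes the effect of the ```axis``` argument easier to verify by hand. The sketch below uses a few other reductions that accept the same argument; the array is made up for illustration.

```python
import numpy as np

a = np.array([[1, 5, 2],
              [8, 3, 7]])

print(a.max())           # 8, the maximum over all elements
print(a.max(axis=0))     # [8 5 7], the maximum of each column
print(a.min(axis=1))     # [1 3], the minimum of each row
print(a.argmax(axis=1))  # [1 0], the position of the maximum within each row
```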


### Exercises

- create a 6x6 array of normally distributed random numbers with mean 5 and standard deviation 10. Print its computed mean and standard deviation *[1]*.

In [52]:
# answer
a = np.random.normal(5, 10, (6, 6))
print(np.mean(a))
print(a.std())

5.09409892265
8.87640405235


- add 5 to every element of the array from the previous exercise *[1]*.

In [53]:
# answer
a += 5

- multiply the third column of the array from the previous exercise by the given array *[1]*.

In [54]:
b = np.array([0, 1, 0, 1, 0, 1])

In [55]:
# answer
a[:, 2] *= b

- create a 7x7 matrix with 7 on the leading diagonal and 0 everywhere else (**note**: you might find the ```eye``` function useful) *[1]*.

In [56]:
# answer
a = np.eye(7) * 7

- the following array represents the spending, in pounds, of 4 people over 5 months (rows represent months and columns the individuals).
  Compute the total spending of person 2. Compute the average spending in each month (**note**: use the ```axis``` argument) *[2]*.

In [76]:
#            person     1        2       3        4      # month
spending = np.array([[450.55, 340.67, 1023.98, 765.30],  # 1
                     [430.46, 315.99,  998.48, 760.78],  # 2
                     [470.30, 320.34, 1013.67, 774.50],  # 3
                     [445.62, 400.60, 1020.20, 799.45],  # 4
                     [432.01, 330.13, 1011.76, 750.91]]) # 5

In [78]:
# answer
total_person_2 = np.sum(spending[:, 1])
mean_per_month = np.mean(spending, axis=1) # average over the 4 people, one value per month

- the *outer product* of two vectors ```v``` and ```w``` can be obtained by multiplying each element of ```v``` by each element of ```w```. Compute the outer product of the given vectors (**note**: use the ```reshape``` function) *[2]*.

In [57]:
v = np.array([1,2,3])
w = np.array([4,5])

In [51]:
# answer
outer = np.reshape(v, (3, 1)) * w # a (3, 1) column broadcast against a (2,) vector gives shape (3, 2)
print(outer)

[[ 4  5]
 [ 8 10]
 [12 15]]


### Linear algebra

Numpy has extensive support for linear algebra operations.

In [69]:
u = np.array([0, 2, 5]) # a vector in R^3, shape (3,)
v = np.array([.5, .3, .87])
a = np.arange(0, 9).reshape(3, 3) # a 3x3 matrix
b = np.array([[.25,  .2],
              [4.3,  .1],
              [ 1., .82]]) # a 3x2 matrix

print(u.dot(v)) # the dot product of vectors
print(a.dot(u)) # the matrix vector product
print(a.dot(b)) # matrix multiplication
print(np.linalg.norm(u)) # norm aka magnitude of a vector


29
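
As a small addition to the cell above, the sketch below uses the ```@``` operator, which performs the same matrix product as ```dot``` for 1D and 2D arrays, and checks the shapes involved. It assumes the same ```u```, ```a``` and ```b``` as above.

```python
import numpy as np

u = np.array([0, 2, 5])
a = np.arange(0, 9).reshape(3, 3)  # a 3x3 matrix
b = np.array([[.25,  .2],
              [4.3,  .1],
              [ 1., .82]])         # a 3x2 matrix

print(a @ u)          # [12 33 54], the matrix vector product
print((a @ b).shape)  # (3, 2), since (3x3) times (3x2) gives (3x2)
print(np.allclose(a @ b, a.dot(b)))                   # True
print(np.isclose(np.linalg.norm(u), np.sqrt(u @ u)))  # True, the norm is sqrt(u . u)
```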
