# Introduction to NumPy

NumPy is a package for storing and performing numerical computations in Python using multidimensional data arrays (vectors, matrices, and N-dimensional arrays in general).

For simplicity, since this notebook focuses on NumPy, we import all of its names directly with `from numpy import *`. In the future, we'll import it through the convention `import numpy as np`.

In [10]:
import numpy as np

In [11]:
from numpy import *

## Creating *numpy* arrays

We can create a numpy array directly from Python lists

In [4]:
l = [0,-2,3,10]
x = array(l)
x

array([ 0, -2,  3, 10])

In [5]:
type(x)

numpy.ndarray

In [6]:
x.shape

(4,)

In [7]:
L = [[0, -2], [3, 10],[-7, 1]]
L

[[0, -2], [3, 10], [-7, 1]]

In [14]:
X = array(L)
X

array([[ 0, -2],
       [ 3, 10],
       [-7,  1]])

In [15]:
X.shape

(3, 2)

NumPy arrays differ from Python lists mainly in supporting element-wise mathematical computations and in being much more memory efficient. To achieve that, all elements of an array share a single (homogeneous) data type, so mixing integers and floats converts every element to float:

In [16]:
x = array([0,-2,3,10])
print(x)
x.dtype

[ 0 -2  3 10]


dtype('int64')

In [17]:
x = array([0,-2.1,3,10])
print(x)
x.dtype

[  0.   -2.1   3.   10. ]


dtype('float64')
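
The upcasting shown above can also be checked and controlled explicitly. The following is a minimal sketch, assuming the conventional `np` alias instead of the star import; the variable names are illustrative only.

```python
import numpy as np

ints = np.array([0, -2, 3, 10])       # all integers -> an integer dtype
mixed = np.array([0, -2.1, 3, 10])    # one float -> every element is stored as a float

print(ints.dtype.kind)    # 'i' means signed integer; the exact width depends on the platform
print(mixed.dtype)

# A dtype can be requested at creation time, or converted afterwards.
as_float = np.array([0, -2, 3, 10], dtype=float)
truncated = mixed.astype(int)         # converting to int drops the fractional part
print(as_float)
print(truncated)
```

Note that `astype` returns a new array and leaves `mixed` unchanged.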

**Exercise**

For each of the matrices (arrays) below:
1. Create a corresponding NumPy array
2. Print the array you created
3. Print its shape
4. Print its data type

$$
X = \left[\begin{array}{cc} 
8 & 1.4
\end{array}\right]
$$ 

$$
Z = \left[\begin{array}{cc} 
8 & 1.4\\
-3.1 & 0.18\\
9.1 & 1
\end{array}\right]
$$ 

In [21]:
X = [8,1.4]
Z = [ [8,1.4],[-3.1,0.18],[9.1,1]]
nx = array(X)
print(nx)
print(nx.shape)
print(nx.dtype)
nz = array(Z)
print(nz)
print(nz.shape)
print(nz.dtype)

[ 8.   1.4]
(2,)
float64
[[ 8.    1.4 ]
 [-3.1   0.18]
 [ 9.1   1.  ]]
(3, 2)
float64


**End exercise**

## Manipulating arrays

### Accessing elements

In [22]:
X=array([[0, 1],
         [2, 3],
         [4,5]])
X

array([[0, 1],
       [2, 3],
       [4, 5]])

How can we access the element at the last row and first column?

In [23]:
row = 2
col = 0
X[row,col]

4

How can we access the last row?

In [24]:
X[row,:]

array([4, 5])

In [25]:
X[row]

array([4, 5])

How can we access the first column?

In [26]:
X[:,col]

array([0, 2, 4])
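
As with Python lists, negative indices count from the end, so the "last" row or column can be addressed without knowing the array's size. A short sketch, assuming the `np` alias and the same 3-by-2 array as above:

```python
import numpy as np

X = np.array([[0, 1],
              [2, 3],
              [4, 5]])

last_row_first_col = X[-1, 0]   # same element as X[2, 0]
last_row = X[-1]                # same as X[2, :]
last_col = X[:, -1]             # same as X[:, 1]

print(last_row_first_col)
print(last_row)
print(last_col)
```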

### Assigning values to elements

In [27]:
X[2,0] = 41
print(X)

[[ 0  1]
 [ 2  3]
 [41  5]]


**Exercise**: 

Create the following array, and print its value at the second column of the third row.

$$
Z = \left[\begin{array}{cc} 
0.8 & 1.4\\
-3.1 & 0.18\\
9.1 & -0.2
\end{array}\right]
$$ 

In [29]:
Z = [ [0.8,1.4],[-3.1,0.18],[9.1,-0.2]]
nz = array(Z)
print(nz[2,1])


-0.2


**End exercise**

### Slicing

NumPy slicing works similarly to list slicing: `X[start:end:step]`

In [30]:
x = array([-1,2,7,69,100])
print(x[1:3])

[2 7]


In [31]:
print(x[::2])

[ -1   7 100]


In [32]:
X = arange(30).reshape(5,6)
print(X)

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]]


In [33]:
print(X[:3, 3:])  # the first three rows, and all columns from index 3 (the fourth column) onward

[[ 3  4  5]
 [ 9 10 11]
 [15 16 17]]


Slicing using an array or list as indices 

In [34]:
inds = [1, 3]
print(X[2,inds])

[13 15]


Slicing using masks 

In [35]:
inds_mask = array([True, False, True, False, True, False])
print(X[1,inds_mask])

[ 6  8 10]
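
Masks are rarely written out by hand; more often they come from a comparison on the array itself. The sketch below, which assumes the `np` alias and illustrative variable names, rebuilds the mask above from a condition.

```python
import numpy as np

X = np.arange(30).reshape(5, 6)

row = X[1]                  # the second row: 6, 7, 8, 9, 10, 11
even_mask = row % 2 == 0    # boolean array, True where the value is even
print(even_mask)
print(row[even_mask])       # same selection as the hand-written mask above

# A mask with the same shape as X selects the matching elements as a flat 1-D array.
print(X[X > 25])
```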


**Exercise**: From the array Z which you have already created, access and print the two bottom values in its second column, meaning the sub-array

$$
\left[\begin{array}{c} 
0.18\\
-0.2
\end{array}\right]
$$ 

in two different ways:
1. Directly accessing them
2. Masking the array Z 

In [41]:
print(nz[1:3,1:3])
mask = array([False,True,True])
print(nz[mask,1])

[[ 0.18]
 [-0.2 ]]
[ 0.18 -0.2 ]


**End exercise**

## Some Linear Algebra

### Element-wise operations

In [42]:
x = arange(1,22,10)
print('x = '+str(x))
y = arange(7,12,2)
print('y  = '+str(y))

x = [ 1 11 21]
y  = [ 7  9 11]


In [43]:
print('x+y = ' +str(x+y))

x+y = [ 8 20 32]


In [44]:
print('x*y = ' +str(x*y))

x*y = [  7  99 231]


In [45]:
X = array([[1,2,3],[4,5,6]])
X*X

array([[ 1,  4,  9],
       [16, 25, 36]])

### Finding an element by conditioning

In [46]:
print(x)
r=where(x==11)
print(r)

[ 1 11 21]
(array([1]),)


In [47]:
print(x)
r=where(x>=10)
print(r)

[ 1 11 21]
(array([1, 2]),)


In [48]:
print(X)
i=where(X>=2)
print(i)

[[1 2 3]
 [4 5 6]]
(array([0, 0, 1, 1, 1]), array([1, 2, 0, 1, 2]))


In [49]:
print(X)
r,c=where(X>=2)
print('r = '+str(r))
print('c = '+str(c))

[[1 2 3]
 [4 5 6]]
r = [0 0 1 1 1]
c = [1 2 0 1 2]
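
The index arrays returned by `where` can be fed back into the array to retrieve the matching values. A minimal sketch, assuming the `np` alias; it also shows that a boolean mask gives the same values without explicit indices.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

r, c = np.where(X >= 2)
values_by_index = X[r, c]    # pairs up each row index with its column index
values_by_mask = X[X >= 2]   # boolean mask, no explicit indices needed

print(values_by_index)
print(values_by_mask)
print(values_by_mask.sum())
```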


**Exercise**

1. Copy the following command into a code cell and run it to generate an array with 10 random elements
```Python 
A = random.rand(10)
``` 
2. Find the indices of all elements bigger than 0.5
3. Find the sum of all elements bigger than 0.5
4. Repeat the above using the following code to generate the random array
```Python 
A = random.rand(3,4)
``` 


In [59]:
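A = random.rand(10)
print(A)
i = where(A>0.5)        # indices of the elements bigger than 0.5
print(i)
print(A[i].sum())       # sum of those elements
A = random.rand(3,4)
r,c = where(A>0.5)      # row and column indices in the 2-D case
A[r,c].sum()            # sum the elements themselves, not their indices

[ 0.72298599  0.89166255  0.23116415  0.19924921  0.43343815  0.13543342
  0.37043989  0.19808408  0.58876429  0.95858795]
(array([0, 1, 8, 9]),)
3.16200077274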

**End exercise**

## Broadcasting

In [60]:
2*X

array([[ 2,  4,  6],
       [ 8, 10, 12]])

In [61]:
print('X=')
print(X)
print('x=')
print(x)
X * x

X=
[[1 2 3]
 [4 5 6]]
x=
[ 1 11 21]


array([[  1,  22,  63],
       [  4,  55, 126]])

In [None]:
X+x
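
Broadcasting compares shapes starting from the last axis: two dimensions are compatible when they are equal or when one of them is 1. The sketch below assumes the `np` alias; the array `col` is a hypothetical addition used only to show a shape that does not fit.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])     # shape (2, 3)
x = np.array([1, 11, 21])     # shape (3,): matches the last axis of X
col = np.array([10, 20])      # shape (2,): does not match the last axis of X

print(X + x)                  # x is added to every row

try:
    X + col                   # (2, 3) and (2,) cannot be broadcast together
except ValueError as err:
    print("ValueError:", err)

# Turning col into a (2, 1) column makes the shapes compatible.
print(X + col[:, np.newaxis])
```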

**Exercise**

1. Create NumPy arrays that correspond to the matrices x, y, Z, W below
2. Compute and print the result of x+y (element-wise summation)
3. Compute and print the result of x*y (element-wise multiplication)
4. Compute and print the result of Z+W (element-wise summation)
5. Compute and print the result of Z*W (element-wise multiplication)
6. Compute and print the result of Z*x (element-wise multiplication)

$$
x = \left[\begin{array}{cc} 
8 & 1.4
\end{array}\right]
$$

$$
y = \left[\begin{array}{cc} 
2.7 & 4
\end{array}\right]
$$ 

$$
Z = \left[\begin{array}{cc} 
8 & 1.4\\
-3.1 & 0.18\\
9.1 & 1
\end{array}\right]
$$ 

$$
W = \left[\begin{array}{cc} 
0 & 0.1\\
-1 & 8\\
3 & 0
\end{array}\right]
$$ 

In [73]:
x = array([8,1.4])
y = array([2.7,4])
Z = array([ [8,1.4],[-3.1,0.18],[9.1,1]])
W = array([ [0,0.1],[-1,8],[3,0]])
print(x+y)
print(x*y)
print(W+Z)
print(W * Z)
print(Z*x)

[ 10.7   5.4]
[ 21.6   5.6]
[[  8.     1.5 ]
 [ -4.1    8.18]
 [ 12.1    1.  ]]
[[  0.     0.14]
 [  3.1    1.44]
 [ 27.3    0.  ]]
[[ 64.      1.96 ]
 [-24.8     0.252]
 [ 72.8     1.4  ]]


**End exercise**

### Matrix computations

In [75]:
data=array([[1,2,3],[6,5,4],[7,8,9]])
data

array([[1, 2, 3],
       [6, 5, 4],
       [7, 8, 9]])

#### min and max

In [76]:
data.min()

1

In [77]:
data.max()

9

In [78]:
data.min(axis=0)

array([1, 2, 3])

In [79]:
data.max(axis=1)

array([3, 6, 9])

In [80]:
data.argmin(axis=1)

array([0, 2, 0])

In [81]:
data.argmax(axis=0)

array([2, 2, 2])

#### sum

In [82]:
data.sum()

45

In [83]:
data.sum(axis=0)

array([14, 15, 16])

In [84]:
data.sum(axis=1)

array([ 6, 15, 24])

In [85]:
data.sum(axis=1,keepdims=True)

array([[ 6],
       [15],
       [24]])

#### mean

In [86]:
data.mean()

5.0

In [87]:
data.mean(axis=0)

array([ 4.66666667,  5.        ,  5.33333333])

In [88]:
data.mean(axis=1)

array([ 2.,  5.,  8.])

In [89]:
data.mean(axis=1,keepdims=True)

array([[ 2.],
       [ 5.],
       [ 8.]])
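
One reason to pass `keepdims=True` is that the result then broadcasts back against the original array along the reduced axis. A minimal sketch, assuming the `np` alias and illustrative variable names, that subtracts each row's mean from that row:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [6, 5, 4],
                 [7, 8, 9]])

row_means = data.mean(axis=1, keepdims=True)   # shape (3, 1)
centered = data - row_means                    # each row now has mean 0
print(centered)

# Without keepdims the result has shape (3,) and lines up with the columns
# instead, so flat_means[j] is subtracted from column j: a different operation.
flat_means = data.mean(axis=1)
print(data - flat_means)
```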

#### standard deviation

In [90]:
data.std()

2.5819888974716112

In [91]:
data.std(axis=0)

array([ 2.62466929,  2.44948974,  2.62466929])

In [92]:
data.std(axis=1)

array([ 0.81649658,  0.81649658,  0.81649658])

In [93]:
data.std(axis=1,keepdims=True)

array([[ 0.81649658],
       [ 0.81649658],
       [ 0.81649658]])

#### prod

In [94]:
data.prod(axis=0)

array([ 42,  80, 108])

**Exercise**: 

Consider the array Z from the previous exercise
1. Compute the sum of each column of the array Z
2. Compute the sum of each row of the array Z
3. Compute the mean of each column of the array Z
4. Compute the mean of each row of the array Z
5. Compute the max of each row of the array Z
6. For each row of the array Z, find the location of the maximal value within this row 

In [100]:
print(Z)
print(Z.sum(axis=0))
print(Z.sum(axis=1))
print(Z.mean(axis=0))
print(Z.mean(axis=1))
print(Z.max(axis=1))
print(Z.argmax(axis=1))

[[ 8.    1.4 ]
 [-3.1   0.18]
 [ 9.1   1.  ]]
[ 14.     2.58]
[  9.4   -2.92  10.1 ]
[ 4.66666667  0.86      ]
[ 4.7  -1.46  5.05]
[ 8.    0.18  9.1 ]
[0 1 0]


**End exercise**

## Reshaping, resizing and stacking arrays

The shape of a NumPy array can be modified without copying the underlying data, which makes reshaping fast even for large arrays.

In [101]:
X = arange(0,9)
X

array([0, 1, 2, 3, 4, 5, 6, 7, 8])

In [102]:
X = X.reshape((3,3))
X

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [103]:
X.reshape(9)

array([0, 1, 2, 3, 4, 5, 6, 7, 8])
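
Because no data is copied, the reshaped array and the original refer to the same memory. The sketch below assumes the `np` alias and uses `np.shares_memory` to check this for a simple contiguous array; it also shows the `-1` shorthand for an inferred dimension.

```python
import numpy as np

X = np.arange(9)
Y = X.reshape(3, 3)            # for a contiguous array like this one, a view on the same data

print(np.shares_memory(X, Y))
Y[0, 0] = 100                  # writing through the view...
print(X[0])                    # ...changes the original array as well

# One dimension may be given as -1 and is then inferred from the total size.
print(X.reshape(-1, 3).shape)
```

If an independent array is needed, calling `.copy()` on the result avoids this coupling.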

**Exercise**
1. Copy the following command into a code cell and run it to generate a random array with 5 rows and 16 columns. Print A and make sure you understand what you see
```Python 
A = random.rand(5,16)
``` 
2. Select the first row of A, reshape it to have a shape of four rows and four columns and print it
3. Write a for loop that does the same for every row of A 

In [118]:
A = random.rand(5,16)
print(A)
print(A[0].reshape(4,4))
print("reshape all")

for i in range(A.shape[0]):
    print(A[i].reshape(4,4))
    print(i)


[[ 0.50184138  0.89417531  0.77524769  0.18950454  0.6093141   0.17752043
   0.04246347  0.32768658  0.82396369  0.17797875  0.28670532  0.09751858
   0.72649203  0.97536148  0.61953827  0.76607297]
 [ 0.62810504  0.19823147  0.99545107  0.60410259  0.54535637  0.11519433
   0.0598548   0.65156754  0.30291723  0.04585297  0.49033988  0.36458764
   0.23141794  0.49645958  0.14780498  0.2583214 ]
 [ 0.86407829  0.35683621  0.83949082  0.15190864  0.33338663  0.39137071
   0.62404102  0.73448912  0.68476139  0.01187356  0.6693975   0.85545538
   0.89905999  0.52802401  0.76990569  0.60024134]
 [ 0.50332712  0.51043801  0.94944656  0.38549097  0.46716122  0.84890692
   0.94634809  0.91206124  0.84755156  0.86540415  0.07750472  0.49267483
   0.90724822  0.8611822   0.30395357  0.15127369]
 [ 0.18582452  0.54868653  0.2138231   0.36371195  0.3136559   0.88354385
   0.49690197  0.76232157  0.99384457  0.50214365  0.70527341  0.39521269
   0.30457095  0.22482418  0.27545366  0.97969827]]
[[ 0

**End exercise**

## Adding a new dimension: newaxis

With `newaxis`, we can insert new dimensions in an array, for example converting a vector to a column or row matrix:

In [119]:
v = array([1,2,3])

In [120]:
shape(v)

(3,)

In [128]:
# make a column matrix of the vector v
v[:, newaxis]

array([[1],
       [2],
       [3]])

In [129]:
# make a row matrix of the vector v
v[newaxis, :]

array([[1, 2, 3]])

In [130]:
# column matrix
v[:,newaxis].shape

(3, 1)

In [131]:
# row matrix
v[newaxis,:].shape

(1, 3)
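
A common use of `newaxis` is to combine a column and a row through broadcasting. The following sketch, assuming the `np` alias, builds a multiplication table from the vector `v`.

```python
import numpy as np

v = np.array([1, 2, 3])

# (3, 1) combined with (1, 3) broadcasts to (3, 3): every pairing of elements.
products = v[:, np.newaxis] * v[np.newaxis, :]
print(products.shape)
print(products)
```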

## Stacking arrays

### concatenate

In [132]:
a = array([[1, 2], [3, 4]])
a

array([[1, 2],
       [3, 4]])

In [133]:
b = array([[5, 6]])
b

array([[5, 6]])

In [134]:
concatenate((a, b), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6]])

In [125]:
concatenate((a, b), axis=1)# error

ValueError: all the input array dimensions except for the concatenation axis must match exactly

In [126]:
concatenate((a, b.reshape(2,1)), axis=1)

array([[1, 2, 5],
       [3, 4, 6]])

In [127]:
c = array([[5], [6]])
c

array([[5],
       [6]])

In [None]:
concatenate((a, c), axis=1)

### hstack and vstack

In [135]:
b = array([[5, 6]])
b

array([[5, 6]])

In [136]:
vstack((a,b))

array([[1, 2],
       [3, 4],
       [5, 6]])

In [137]:
hstack((a,c))

array([[1, 2, 5],
       [3, 4, 6]])
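
For 2-D arrays such as these, `vstack` and `hstack` give the same result as `concatenate` along axis 0 and axis 1 respectively. A minimal sketch, assuming the `np` alias, that checks this for the arrays above:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])      # shape (1, 2): one extra row
c = np.array([[5], [6]])    # shape (2, 1): one extra column

print(np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0)))
print(np.array_equal(np.hstack((a, c)), np.concatenate((a, c), axis=1)))

# b.T has shape (2, 1), so the row b can also be attached as a column.
print(np.hstack((a, b.T)))
```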

**Exercise**

Consider the arrays Z and W we've used above

$$
Z = \left[\begin{array}{cc} 
8 & 1.4\\
-3.1 & 0.18\\
9.1 & 1
\end{array}\right]
$$ 

$$
W = \left[\begin{array}{cc} 
0 & 0.1\\
-1 & 8\\
3 & 0
\end{array}\right]
$$ 

and print the result of
1. concatenating them horizontally $\left[Z, W\right]$
2. concatenating them vertically $\left[\begin{array}{cc} 
Z\\
W
\end{array}\right]$


In [141]:
print(hstack((Z,W)))
print(vstack((Z,W)))


[[ 8.    1.4   0.    0.1 ]
 [-3.1   0.18 -1.    8.  ]
 [ 9.1   1.    3.    0.  ]]
[[ 8.    1.4 ]
 [-3.1   0.18]
 [ 9.1   1.  ]
 [ 0.    0.1 ]
 [-1.    8.  ]
 [ 3.    0.  ]]


**End exercise**

## Using arrays in conditions

When using arrays in conditions, for example in `if` statements and other boolean expressions, one needs to use `any` or `all`, which require that at least one element or every element of the array, respectively, evaluates to `True`:

In [None]:
M = 10*random.rand(10)
M

In [None]:
M > 5

In [None]:
# number of elements in M which are larger than 5
(M > 5).sum()
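
The comparison `M > 5` produces one boolean per element, so it has to be reduced to a single value before it can drive an `if` statement. The sketch below assumes the `np` alias and uses fixed values in place of the random `M` so that the outcome is reproducible.

```python
import numpy as np

M = np.array([2.5, 7.1, 9.8, 0.3])   # fixed values standing in for the random array

if (M > 5).any():
    print("at least one element is larger than 5")

if not (M > 5).all():
    print("not every element is larger than 5")

# Using the array itself as a condition is ambiguous and raises an error.
try:
    if M > 5:
        pass
except ValueError as err:
    print("ValueError:", err)
```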

#### Multidimensional arrays

In [None]:
x = random.randint(50, size=(3, 4, 5))  # Three-dimensional array
x

In [None]:
x = random.randint(50, size=(3, 4, 5,6))  # four-dimensional array
x

### Some useful numpy built-in functions

#### zeros and ones

In [142]:
zeros((3,5),dtype=int)

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

In [143]:
ones((3,5))

array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])

#### arange

In [144]:
arange(-1, 2, 0.5)# start, end (excluded), step size

array([-1. , -0.5,  0. ,  0.5,  1. ,  1.5])

#### linspace

In [145]:
# NOTE: both start and end values are included
linspace(0, 10, 25)# start, end, number of values

array([  0.        ,   0.41666667,   0.83333333,   1.25      ,
         1.66666667,   2.08333333,   2.5       ,   2.91666667,
         3.33333333,   3.75      ,   4.16666667,   4.58333333,
         5.        ,   5.41666667,   5.83333333,   6.25      ,
         6.66666667,   7.08333333,   7.5       ,   7.91666667,
         8.33333333,   8.75      ,   9.16666667,   9.58333333,  10.        ])

In [146]:
linspace(0,10,11)

array([  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.])
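
The main difference between the two functions is how the end value is treated and whether a step or a count is given. A short sketch, assuming the `np` alias and illustrative variable names:

```python
import numpy as np

stepped = np.arange(0, 1, 0.25)                    # step given; the end value 1 is excluded
counted = np.linspace(0, 1, 5)                     # count given; the end value 1 is included
half_open = np.linspace(0, 1, 4, endpoint=False)   # endpoint=False excludes it again

print(stepped)
print(counted)
print(half_open)
```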

#### random data

In [147]:
# uniformly distributed random numbers in [0,1) (1 itself is never returned)
random.rand(3,4)

array([[ 0.10264795,  0.88400464,  0.87427144,  0.96287844],
       [ 0.1146749 ,  0.90872114,  0.14755648,  0.67141376],
       [ 0.81106304,  0.06183787,  0.67872097,  0.54712102]])

In [148]:
# normally distributed random numbers (mean=0, std=1)
random.randn(3,4)

array([[-1.17360311, -0.32483946, -1.01560843,  1.42114764],
       [-2.03093562, -0.25170466, -0.82731581, -1.12328775],
       [-2.55567442,  0.45265414, -0.3618605 , -0.9920707 ]])

In [149]:
# binomially distributed random numbers (n=10 trials, success probability p=0.7)
random.binomial(10,.7,(3,4))

array([[ 8,  6,  7,  4],
       [ 5,  7,  6,  7],
       [ 7,  7, 10,  8]])
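
The values above change on every run. When repeatable results are needed, for instance while checking an exercise, the global generator can be seeded first. A minimal sketch, assuming the `np` alias; the seed value 0 is arbitrary.

```python
import numpy as np

np.random.seed(0)              # fix the state of the global generator
first = np.random.rand(3)
np.random.seed(0)              # same seed again...
second = np.random.rand(3)     # ...so the same numbers are drawn

print(np.array_equal(first, second))
```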