In [1]:
import numpy as np
import numpy.linalg as la

# Matrices

### Simple matrix

In [2]:
m = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

* The shape of a matrix is a tuple storing the size of each dimension

* For 2D matrices, the order is always rows-columns

In [3]:
m.shape

(2, 3)
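As a small illustration, the shape tuple can be unpacked directly into row and column counts. The names `n_rows` and `n_cols` are illustrative and not part of the original notebook.

```python
import numpy as np

m = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

# Unpack the shape tuple: rows first, then columns
n_rows, n_cols = m.shape

# ndim is the number of dimensions, size is the total number of elements
print(n_rows, n_cols, m.ndim, m.size)
```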

### Special matrices

#### Identity

In [13]:
np.eye(3, 3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

#### Zeros

In [14]:
np.zeros((3, 3))

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

#### Ones

In [15]:
np.ones((3, 3))

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

## Indexing

For two-dimensional matrices, rows are dimension 0 and columns are dimension 1.

In [16]:
m = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [17]:
m[1, 1]

5

## Slicing

Use the `:` symbol to get all values across a dimension.

In [18]:
m[1, :]

array([4, 5, 6])

The `:` can also delimit a range `start:stop`; the stop index is excluded.

In [19]:
m[:, 0:2]

array([[1, 2],
       [4, 5],
       [7, 8]])
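A few more slicing variants, sketched on the same 3x3 matrix: negative indices, a step, and open-ended ranges. The variable names are illustrative.

```python
import numpy as np

m = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

last_row = m[-1, :]       # negative indices count from the end
every_other = m[::2, :]   # step of 2 selects rows 0 and 2
lower_right = m[1:, 1:]   # an omitted stop runs to the end of the dimension

print(last_row)
print(every_other)
print(lower_right)
```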

## Reshaping


Preserve the data, but change the shape of the matrix.

In [66]:
m = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
m.reshape((2, 3))

array([[1, 2, 3],
       [4, 5, 6]])

Reshape a matrix into a single row. Passing `-1` for a dimension tells NumPy to infer its size, here the number of columns.


In [67]:
m.reshape((1, -1))

array([[1, 2, 3, 4, 5, 6]])
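The same `-1` placeholder works for any one dimension. A short sketch, with illustrative variable names, producing a single column and a flat one-dimensional array:

```python
import numpy as np

m = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])

column = m.reshape((-1, 1))  # one column, number of rows inferred
flat = m.reshape(-1)         # one dimension holding all elements

print(column.shape, flat.shape)
```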

## Getting useful statistics

Specify dimension using the axis keyword.

Minimum and maximum values

In [22]:
m.max(axis=1)

array([2, 4, 6])
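For comparison, here is a sketch of `min` and `max` along both axes of the same 3x2 matrix. With `axis=0` the reduction runs down the rows and yields one value per column; with `axis=1` it yields one value per row. The variable names are illustrative.

```python
import numpy as np

m = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])

col_min = m.min(axis=0)  # one value per column
row_min = m.min(axis=1)  # one value per row
col_max = m.max(axis=0)
overall_max = m.max()    # no axis: reduce over the whole matrix

print(col_min, row_min, col_max, overall_max)
```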

<b>Mean, standard deviation, and variance</b>

In [23]:
m.mean(axis=0)

array([ 3.,  4.])

In [24]:
m.std(axis=0)

array([ 1.63299316,  1.63299316])

In [25]:
m.var(axis=0)

array([ 2.66666667,  2.66666667])
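By default `std` and `var` divide by the number of elements N (the population formulas). The `ddof` argument switches to the sample formulas, which divide by N - ddof. A minimal sketch on the same matrix:

```python
import numpy as np

m = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])

pop_var = m.var(axis=0)             # divides by N = 3
sample_var = m.var(axis=0, ddof=1)  # divides by N - 1 = 2
sample_std = m.std(axis=0, ddof=1)  # square root of the sample variance

print(pop_var, sample_var, sample_std)
```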

## Operations on matrices

#### Element-wise

In [26]:
a = np.array([
    [1, 2],
    [3, 4]
])

a + a

array([[2, 4],
       [6, 8]])

In [27]:
a * a

array([[ 1,  4],
       [ 9, 16]])

In [28]:
a ** a

array([[  1,   4],
       [ 27, 256]])

#### Matrix multiplication

In [29]:
a = np.array([
    [1, 2],
    [3, 4]
])

b = np.array([
    [1, 1, 1], 
    [1, 1, 1]
])
a.dot(b)

array([[3, 3, 3],
       [7, 7, 7]])
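The `@` operator is an equivalent spelling of matrix multiplication for 2D arrays. The sketch below also shows that the inner dimensions must agree: a (2, 3) matrix cannot be multiplied by a (2, 2) matrix. The flag `shapes_mismatch` is an illustrative name.

```python
import numpy as np

a = np.array([
    [1, 2],
    [3, 4]
])
b = np.array([
    [1, 1, 1],
    [1, 1, 1]
])

product = a @ b  # (2, 2) @ (2, 3) gives shape (2, 3)

shapes_mismatch = False
try:
    b @ a        # (2, 3) @ (2, 2): inner dimensions 3 and 2 differ
except ValueError:
    shapes_mismatch = True

print(product)
print(shapes_mismatch)
```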

## Common linear algebra operations



In [30]:
import numpy.linalg as la

### Transpose

In [31]:
m = np.array([
    [1, 2],
    [3, 4]
])
m.T

array([[1, 3],
       [2, 4]])

### Eigenvalues and eigenvectors

In [32]:
evals, evects = la.eig(m)
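A quick way to check the result: column `i` of `evects` is the eigenvector paired with `evals[i]`, so multiplying `m` by each column should scale that column by its eigenvalue. The names `lhs` and `rhs` are illustrative.

```python
import numpy as np
import numpy.linalg as la

m = np.array([
    [1, 2],
    [3, 4]
])

evals, evects = la.eig(m)

lhs = m @ evects       # m applied to every eigenvector (columns)
rhs = evects * evals   # broadcasting scales column i by evals[i]

print(np.allclose(lhs, rhs))
```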

### Singular value decomposition

In [33]:
# The third return value is the (conjugate) transpose of V, not V itself
U, s, Vh = la.svd(m)
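As a sanity check, the original matrix can be rebuilt from the three factors. Note that `svd` returns the singular values as a 1D array and the last factor already transposed; the name `Vh` used here is an illustrative choice.

```python
import numpy as np
import numpy.linalg as la

m = np.array([
    [1, 2],
    [3, 4]
])

U, s, Vh = la.svd(m)

# s holds the singular values in descending order; np.diag turns it into a matrix
reconstructed = U @ np.diag(s) @ Vh

print(np.allclose(reconstructed, m))
```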

### Other useful operations
* Determinant: `la.det(m)`
* Norm: `la.norm(m)`
* Inverse: `la.inv(m)`
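The three operations listed above, sketched on the same 2x2 matrix. With no extra arguments, `la.norm` computes the Frobenius norm of a matrix. Variable names are illustrative.

```python
import numpy as np
import numpy.linalg as la

m = np.array([
    [1, 2],
    [3, 4]
])

det = la.det(m)    # 1*4 - 2*3
norm = la.norm(m)  # Frobenius norm: sqrt(1 + 4 + 9 + 16)
inv = la.inv(m)    # raises LinAlgError if m is singular

# A matrix times its inverse gives the identity, up to rounding
print(np.allclose(m @ inv, np.eye(2)))
```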

# Vectors

A vector is a special case of a matrix, and has a single dimension.

In [34]:
np.array([0.1, 0.3, 0.1, 0.5])

array([ 0.1,  0.3,  0.1,  0.5])

The shape of a vector is a one-element tuple. The comma after `length` unpacks that single element into the variable.

In [35]:
p = np.array([0.1, 0.3, 0.1, 0.5])
length, = p.shape
length

4

## Filtering

In [64]:
p > 0.4

array([False, False, False,  True], dtype=bool)

In [36]:
p[p >= 0.2]

array([ 0.3,  0.5])
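Boolean masks can be combined with `&` (and) and `|` (or). Each condition needs its own parentheses because these operators bind tighter than comparisons. A minimal sketch with illustrative names:

```python
import numpy as np

p = np.array([0.1, 0.3, 0.1, 0.5])

mask = (p >= 0.2) & (p < 0.5)  # element-wise "and" of two conditions
selected = p[mask]

# Summing a boolean mask counts the True entries
count = int((p > 0.2).sum())

print(mask, selected, count)
```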

## Searching and sorting


In [37]:
p.min()

0.10000000000000001

In [38]:
p.max()

0.5

Index of the max value in the array

In [39]:
p.argmax()

3

Sorting the values

In [42]:
p.sort()
p

array([ 0.1,  0.1,  0.3,  0.5])

Getting the indices that correspond to a sorted order of the elements

In [41]:
p = np.array([0.1, 0.3, 0.1, 0.5])
p.argsort()

array([0, 2, 1, 3])

In [32]:
events = np.array(['A', 'B', 'C', 'D'])

Suppose we have an array of events, and `p` defines the probability mass function for these events.

To get the two most likely events, we need to figure out the indices of the top two values.

Note: the slice `[::-1]` reverses the elements in the array using Python slice syntax (start : stop : step) with a step of -1.

In [63]:
i = p.argsort()[::-1]
i

array([3, 1, 2, 0])

Getting the two most probable events is easy now:

In [34]:
events[i[:2]]

array(['D', 'B'], 
      dtype='<U1')
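The same steps can be wrapped in a small helper. This is a sketch only: `top_k` is a hypothetical function name, not something NumPy provides.

```python
import numpy as np

def top_k(labels, probs, k):
    """Return the k labels with the highest probabilities, most likely first."""
    order = np.argsort(probs)[::-1]  # indices from largest to smallest value
    return labels[order[:k]]

events = np.array(['A', 'B', 'C', 'D'])
p = np.array([0.1, 0.3, 0.1, 0.5])

top_two = top_k(events, p, 2)
print(top_two)
```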

# Working with data


### Loading data from a text file

Specify the delimiter: comma, tab, space, etc.

In [66]:
path = 'sample_text'
# Create a text file
with open(path, 'w') as f:
    f.write('1, 2, 3')
np.genfromtxt(path, delimiter=',')

array([ 1.,  2.,  3.])

`genfromtxt` has a lot of useful optional [arguments](http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html).

They let you specify the data type to load the data as, the comment marker, the number of header lines to skip, etc.
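A sketch of a few of those options, reading from an in-memory `io.StringIO` buffer instead of a file so nothing is written to disk. The sample text is made up for illustration.

```python
import io
import numpy as np

# Made-up sample: one header line, one comment line, two data rows
text = io.StringIO(
    "x,y,z\n"
    "# a comment line\n"
    "1,2,3\n"
    "4,5,6\n"
)

data = np.genfromtxt(
    text,
    delimiter=',',
    skip_header=1,  # skip the "x,y,z" line
    comments='#',   # ignore everything after a '#'
    dtype=int       # load integers instead of the default floats
)

print(data)
```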

### Loading data from a Matlab file with scipy


In [None]:
from scipy.io import loadmat
# `filename` is the path to the .mat file
data = loadmat(filename)

`data` is now a dictionary mapping variable names to data. If the Matlab dump contained a variable `D`, access it with `data['D']`.

### Dumping numpy data to file

In [62]:
data = np.array([1, 2, 3])
np.save('numpy_data', data)

### Loading numpy data from file

In [43]:
# np.save appends the .npy extension, so include it when loading
data = np.load('numpy_data.npy')
data

array([1, 2, 3])
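A round trip through a temporary directory shows the file name that `np.save` actually writes: when the name has no `.npy` extension, one is appended. The use of `tempfile` and the variable names are illustrative choices.

```python
import os
import tempfile
import numpy as np

data = np.array([1, 2, 3])

with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, 'numpy_data')
    np.save(base, data)              # writes numpy_data.npy
    saved_names = os.listdir(tmp)
    loaded = np.load(base + '.npy')  # the extension is required here

print(saved_names, loaded)
```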

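On a 2D array, `argsort` sorts along the last axis by default, so each row is sorted independently.

In [57]: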
m2 = np.arange(9).reshape((3, 3))
m2.argsort()

array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])
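Because every row above is already in ascending order, each one yields `[0, 1, 2]`. A sketch with unsorted rows makes the effect of the `axis` argument easier to see. The matrix values are made up for illustration.

```python
import numpy as np

m2 = np.array([
    [3, 1, 2],
    [9, 7, 8]
])

by_row = m2.argsort()        # default axis=-1: sort within each row
by_col = m2.argsort(axis=0)  # sort within each column

print(by_row)
print(by_col)
```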