# Assignment 1 Preparation

## How to start
***
- Set up a conda environment
- Download the assignment package from http://cs231n.github.io/assignments/2018/spring1718_assignment1.zip
- Unpack the assignment package
- Download the dataset by running `./get_datasets.sh` in `<assignment_home>/cs231n/datasets/`. On MS Windows, download it directly from http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz and extract it into that same folder
- Run `jupyter notebook` in `<assignment_home>`
- Start with `knn.ipynb`
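
If the shell script is not available (for example on MS Windows), the dataset step can also be done from Python. The following is only a sketch under assumptions: the helper names `download` and `extract` and the local file name `cifar-10-python.tar.gz` are made up for illustration and are not part of the assignment package.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the steps above.
CIFAR10_URL = "http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"


def download(url, dest):
    """Download `url` to `dest` unless the file already exists (hypothetical helper)."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, target_dir):
    """Unpack a .tar.gz archive into `target_dir` (hypothetical helper)."""
    with tarfile.open(str(archive), "r:gz") as tar:
        tar.extractall(str(target_dir))


# Usage, run from <assignment_home>/cs231n/datasets/:
# archive = download(CIFAR10_URL, "cifar-10-python.tar.gz")
# extract(archive, ".")
```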

## A little linear algebra
***

### Addition and subtraction
Two matrices can only be added or subtracted if they have <font color=red>__the same size__</font>. Matrix addition and subtraction are done entry-wise, which means that each entry in A+B is the sum of the corresponding entries in A and B.
$$ A=\begin{bmatrix} 7 & 5 & 3 \\ 4 & 0 & 5 \end{bmatrix} \qquad B=\begin{bmatrix} 1 & 1 & 1 \\ -1 & 3 & 2 \end{bmatrix}$$

$$ A+B=\begin{bmatrix} 7+1 & 5+1 & 3+1 \\ 4-1 & 0+3 & 5+2\end{bmatrix}=\begin{bmatrix}8 & 6 & 4 \\ 3 & 3 & 7 \end{bmatrix}$$

$$ A-B=\begin{bmatrix} 7-1 & 5-1 & 3-1 \\ 4-(-1) & 0-3 & 5-2\end{bmatrix}=\begin{bmatrix}6 & 4 & 2 \\ 5 & -3 & 3 \end{bmatrix}$$

The following rules apply to sums and scalar multiples of matrices.

Let $A,B,C$ be matrices of the same size, and let $r,s$ be scalars.
- $A+B=B+A$
- $(A+B)+C=A+(B+C)$
- $A+0=A$
- $r(A+B)=rA+rB$
- $(r+s)A=rA+sA$
- $r(sA)=(rs)A$
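
The worked example and the rules above can be checked numerically. This is a minimal sketch using NumPy (covered later in this document). `A` and `B` are the matrices from the example, while `C`, `r` and `s` are arbitrary values chosen here for illustration.

```python
import numpy as np

# A and B are the matrices from the worked example above.
A = np.array([[7, 5, 3], [4, 0, 5]])
B = np.array([[1, 1, 1], [-1, 3, 2]])

print(A + B)  # entry-wise sum:        [[8 6 4], [3 3 7]]
print(A - B)  # entry-wise difference: [[6 4 2], [5 -3 3]]

# C, r and s are illustrative assumptions, not part of the original example.
C = np.array([[2, 0, 1], [1, 1, 1]])
r, s = 2, 3
zero = np.zeros_like(A)  # the zero matrix of the same size as A

assert np.array_equal(A + B, B + A)                # commutativity
assert np.array_equal((A + B) + C, A + (B + C))    # associativity
assert np.array_equal(A + zero, A)                 # additive identity
assert np.array_equal(r * (A + B), r * A + r * B)  # distributivity over matrices
assert np.array_equal((r + s) * A, r * A + s * A)  # distributivity over scalars
assert np.array_equal(r * (s * A), (r * s) * A)    # scalar associativity
```

A few passing checks on one set of values do not prove the rules; they only confirm that the example behaves as stated.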

### Multiplication
What is matrix multiplication? You can multiply two matrices if, and only if, <font color=red>__the number of columns in the first matrix equals the number of rows in the second matrix__</font>.

Otherwise, the product of two matrices is undefined.

$$A=\begin{bmatrix}a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23}\end{bmatrix} \qquad B=\begin{bmatrix}b_{11} & b_{12}\\b_{21}&b_{22}\\b_{31}&b_{32}\end{bmatrix} $$

$$ A\cdot B=\begin{bmatrix}a_{11}\times b_{11}+a_{12}\times b_{21}+a_{13}\times b_{31} & a_{11}\times b_{12}+a_{12}\times b_{22}+a_{13}\times b_{32} \\ a_{21}\times b_{11}+a_{22}\times b_{21}+a_{23}\times b_{31} & a_{21}\times b_{12}+a_{22}\times b_{22}+a_{23}\times b_{32} \end{bmatrix}$$

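Here $B \cdot A$ is also defined, because $B$ has 2 columns and $A$ has 2 rows, but it is a $3 \times 3$ matrix and therefore not equal to $A \cdot B$. In general, $A \cdot B \neq B \cdot A$.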

The product matrix's dimensions are $(\text{rows in the first matrix}) \times (\text{columns in the second matrix})$.
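
The shape rule can be tried out with concrete numbers. This sketch uses NumPy's `np.dot`; the values in `A` and `B` are illustrative assumptions that only share the $2 \times 3$ and $3 \times 2$ shapes of the symbolic example.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])    # 2 x 3
B = np.array([[1, 2], [3, 4], [5, 6]])  # 3 x 2

P = np.dot(A, B)  # (2 x 3)(3 x 2) -> 2 x 2
print(P.shape)    # (2, 2)
print(P)          # [[22 28], [49 64]]

# Entry (0, 1) is row 0 of A combined with column 1 of B.
assert P[0, 1] == A[0, 0] * B[0, 1] + A[0, 1] * B[1, 1] + A[0, 2] * B[2, 1]

# A product is undefined when the inner dimensions differ:
# here the first matrix has 3 columns but the second has only 2 rows.
try:
    np.dot(A, A)
except ValueError as err:
    print("undefined:", err)
```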

### Transpose
Given the $m \times n$ matrix $A$, the __transpose__ of $A$ is the $n \times m$ matrix, denoted $A^T$, whose columns are formed from the corresponding rows of $A$.
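
As a small illustration, NumPy exposes the transpose as the `.T` attribute. The matrix below reuses the values of $A$ from the addition example, which is a choice made here for illustration only.

```python
import numpy as np

A = np.array([[7, 5, 3], [4, 0, 5]])  # 2 x 3
At = A.T                              # 3 x 2

print(At)
# [[7 4]
#  [5 0]
#  [3 5]]

assert At.shape == (3, 2)
# Column 0 of the transpose is row 0 of the original matrix.
assert np.array_equal(At[:, 0], A[0, :])
# Transposing twice gives back the original matrix.
assert np.array_equal(At.T, A)
```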


## NumPy
***

NumPy Reference: https://docs.scipy.org/doc/numpy/reference/

### Use NumPy for most array, vector, or matrix calculations

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.

### Data types: list (Python list), tuple, np.ndarray (NumPy array)

In [59]:
import numpy as np

A = [[1,2,3],[4,5,6]]
print(A)
print("type of A: %s" % type(A))
print()

B = np.array([[1,4,7],[2,5,8],[3,6,9]])
print(B)
print("type of B: %s" % type(B))
print()

print(B.shape)
print("type of B.shape: %s" % type(B.shape))

[[1, 2, 3], [4, 5, 6]]
type of A: <class 'list'>

[[1 4 7]
 [2 5 8]
 [3 6 9]]
type of B: <class 'numpy.ndarray'>

(3, 3)
type of B.shape: <class 'tuple'>


### Two ways to compute a matrix product with NumPy

In [60]:
A = np.array([[1,2,3], [4,5,6]])
B = np.array([[1,4,7],[2,5,8],[3,6,9]])

print("Method I:")
print(np.dot(A, B))
print()
print("Method II:")
print(A.dot(B))
print()
print("Just a test:")
print(np.dot(B, A))

Method I:
[[ 14  32  50]
 [ 32  77 122]]

Method II:
[[ 14  32  50]
 [ 32  77 122]]

Just a test:


ValueError: shapes (3,3) and (2,3) not aligned: 3 (dim 1) != 2 (dim 0)

In [9]:
print(np.dot(B, A.T))

[[30 66]
 [36 81]
 [42 96]]


### Shape
A tuple is an immutable, ordered sequence of values.

The shape of an array is a tuple giving the number of elements along each dimension of the array (for a matrix: rows, then columns).

In [61]:
print(A.shape)
print(B.shape)
print(np.dot(A, B).shape)

(2, 3)
(3, 3)
(2, 3)
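
The length of the shape tuple equals the number of dimensions, which matters for the broadcasting section below. This is a short sketch with made-up arrays: the names `v`, `row`, `col` and `T` are illustrative and do not appear elsewhere in this document.

```python
import numpy as np

v = np.array([1, 2, 3])      # 1-D array
row = np.array([[1, 2, 3]])  # 2-D array with a single row
col = row.T                  # 2-D array with a single column
T = np.zeros((2, 3, 4))      # 3-D array

for name, arr in [("v", v), ("row", row), ("col", col), ("T", T)]:
    # ndim is the number of dimensions; size is the total number of elements.
    print(name, arr.shape, arr.ndim, arr.size)

assert v.shape == (3,)       # note the trailing comma: a 1-element tuple
assert row.shape == (1, 3)
assert col.shape == (3, 1)
assert T.shape == (2, 3, 4) and T.ndim == 3 and T.size == 24
```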


### Broadcasting
The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes.

Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python. It does this without making needless copies of data and usually leads to efficient algorithm implementations.

An example:

$$A=\begin{bmatrix}a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31}&a_{32}\end{bmatrix} \qquad B=\begin{bmatrix}b_{11} & b_{12}\end{bmatrix} \qquad A+B=?$$

$$A+B=\begin{bmatrix}a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31}&a_{32}\end{bmatrix}+\begin{bmatrix}b_{11} & b_{12}\\b_{11} & b_{12}\\b_{11} & b_{12}\end{bmatrix}=\begin{bmatrix}a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{11} & a_{22}+b_{12} \\ a_{31}+b_{11}&a_{32}+b_{12}\end{bmatrix} $$
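
The same computation can be written in NumPy. In this sketch the numeric values are arbitrary stand-ins for the symbols above, and `np.tile` is used only to build the explicitly repeated matrix for comparison; broadcasting itself does not need that copy.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])  # 3 x 2
B = np.array([[10, 20]])                # 1 x 2

# Broadcasting: B's single row is reused for every row of A.
broadcast_sum = A + B
print(broadcast_sum)
# [[11 22]
#  [13 24]
#  [15 26]]

# Explicit equivalent: physically repeat B's row three times, then add.
tiled = np.tile(B, (3, 1))              # 3 x 2
explicit_sum = A + tiled

assert np.array_equal(broadcast_sum, explicit_sum)
```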

#### General Broadcasting Rules
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when:
1. they are equal, or
1. one of them is 1

More examples:

```
A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (4d array):  8 x 7 x 6 x 5

A      (2d array):  5 x 4
B      (1d array):      1
Result (2d array):  5 x 4

A      (2d array):  5 x 4
B      (1d array):      4
Result (2d array):  5 x 4

A      (3d array):  15 x 3 x 5
B      (3d array):  15 x 1 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 1
Result (3d array):  15 x 3 x 5
```
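
The two rules are simple enough to write out by hand. The function `broadcast_shape` below is a hypothetical helper written for this illustration, not a NumPy API; it is compared against the shape that NumPy itself reports through `np.broadcast` for the examples listed above.

```python
import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Apply the two broadcasting rules by hand (illustrative helper)."""
    ra, rb = shape_a[::-1], shape_b[::-1]  # start from the trailing dimensions
    result = []
    for i in range(max(len(ra), len(rb))):
        a = ra[i] if i < len(ra) else 1    # a missing dimension behaves like 1
        b = rb[i] if i < len(rb) else 1
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError("incompatible dimensions: %d and %d" % (a, b))
    return tuple(reversed(result))


examples = [
    ((8, 1, 6, 1), (7, 1, 5)),
    ((5, 4), (1,)),
    ((5, 4), (4,)),
    ((15, 3, 5), (15, 1, 5)),
    ((15, 3, 5), (3, 5)),
    ((15, 3, 5), (3, 1)),
]
for sa, sb in examples:
    expected = np.broadcast(np.empty(sa), np.empty(sb)).shape
    assert broadcast_shape(sa, sb) == expected
    print(sa, "with", sb, "->", expected)
```

Shapes such as `(2, 3)` and `(3, 3)` fail the check, because the dimensions 2 and 3 are neither equal nor 1.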

In [24]:
C = np.array([[1,2,3]])
print("A.shape: %s" % (A.shape,))
print("B.shape: %s" % (B.shape,))
print("C.shape: %s" % (C.shape,))
print("B+C= %s" %((B+C).shape, ))
print(B+C)
print()
print("A+B=")
print(A+B)


A.shape: (2, 3)
B.shape: (3, 3)
C.shape: (1, 3)
B+C= (3, 3)
[[ 2  6 10]
 [ 3  7 11]
 [ 4  8 12]]

A+B=


ValueError: operands could not be broadcast together with shapes (2,3) (3,3) 

### Indices

In [48]:
D = np.array(range(10))
D *= 2
print(D)
print()
print("D[8]=%s" % (D[8]))
print()
print("D[[1,3,5]]=%s" % (D[[1,3,5]]))
print()
B = np.array([[1,4,7],[2,5,8],[3,6,9]])
print("B=%s" % B)
print()
print("B[:, 1]=%s" % B[:, 1])
print()
E = np.copy(B)
E[:, 1] = 10
print("B after B[:, 1]=10: \n%s" % E)
print()
F = np.copy(B)
F[:, [1, 2]] = 10
print("B after B[:, [1,2]]=10: \n%s" % F)
print()
G = np.copy(B)
G[[0,1,2], [0,1,2]] = 10
print("B after B[[0,1,2], [0,1,2]]=10: \n%s" % G)

[ 0  2  4  6  8 10 12 14 16 18]

D[8]=16

D[[1,3,5]]=[ 2  6 10]

B=[[1 4 7]
 [2 5 8]
 [3 6 9]]

B[:, 1]=[4 5 6]

B after B[:, 1]=10: 
[[ 1 10  7]
 [ 2 10  8]
 [ 3 10  9]]

B after B[:, [1,2]]=10: 
[[ 1 10 10]
 [ 2 10 10]
 [ 3 10 10]]

B after B[[0,1,2], [0,1,2]]=10: 
[[10  4  7]
 [ 2 10  8]
 [ 3  6 10]]
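
The last two assignments in the cell above look similar but select different elements. The sketch below reads the same positions instead of writing to them, which makes the difference visible in the result shapes; the names `cols` and `diag` are illustrative.

```python
import numpy as np

B = np.array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])

# A slice combined with a list: every row, columns 1 and 2.
cols = B[:, [1, 2]]
print(cols)        # [[4 7], [5 8], [6 9]]

# Two lists of equal length are paired up element by element,
# selecting positions (0, 0), (1, 1) and (2, 2): the diagonal.
diag = B[[0, 1, 2], [0, 1, 2]]
print(diag)        # [1 5 9]

assert cols.shape == (3, 2)
assert diag.shape == (3,)
```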
