### CASA0002

# Urban Simulation
***
## Linear Algebra with Numpy

Mateo Neira
***

For both Spatial Interaction Models and Urban Networks, it is important to have a basic understanding of linear algebra concepts and to be able to work with and manipulate vectors and matrices in Python.

In this lab we will cover the basics of working with numpy, which will serve as a base for the rest of the practicals in this module.

**Objectives**:
* review basic numpy functions
* define a vector and calculate its length and dot product
* define a matrix and calculate a matrix multiplication, transpose, and inverse
* explain cosine similarity and compute the similarity between two vectors
* compute eigenvalues and eigenvectors

## Numpy

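Numpy is the fundamental package for scientific computing with Python. It provides numerical functions on the _ndarray_, a fixed-size n-dimensional array data structure. Numpy's core is implemented in C, and an array stores elements of a single data type in a compact block of memory, which makes it more memory-efficient and faster to operate on than a Python list.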

Numpy arrays form the core of nearly the entire ecosystem of data science tools in Python, so time spent learning to use numpy effectively will be useful not only in this module, but in other data science applications as well. Here we will be using _ndarray_ to represent vectors and matrices.

We can import numpy using:

```python
import numpy as np
```



In [4]:
### let's first import the numpy library
import numpy as np

There are different types of objects (or structures) in linear algebra:
* Scalar: Single number
* Vector: Array of numbers
* Matrix: 2-dimensional array of numbers
* Tensor: n-dimensional array of numbers where n > 2
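
As a rough illustration, each of these objects maps onto an ndarray with a different number of dimensions. The variable names below are hypothetical and chosen only for this sketch.

```python
import numpy as np

scalar = np.array(5)                        # 0 dimensions: a single number
vector = np.array([2, 3, 4])                # 1 dimension
matrix = np.array([[1, 2, 3], [4, 5, 6]])   # 2 dimensions
tensor = np.zeros((3, 2, 2))                # 3 dimensions

# print the number of dimensions and the shape of each object
for name, obj in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("tensor", tensor)]:
    print(name, obj.ndim, obj.shape)
```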

### Numpy Arrays - Vectors
> $x \in \mathbb{R}^n$
 
A one-dimensional ndarray represents a vector of elements.

For example we can create the following vector $\vec{x} = \begin{bmatrix}2 & 3 & 4 \end{bmatrix}$ in numpy:

```python
x = np.array([2,3,4])
```

In [5]:
### make a 1d array
x=np.array([1,1,1,1,1,1,1,2,2,3,4,5,6,7,8,10])

print (x.ndim)
print (x.size)
print (x.shape)

# similar to python data type (int,float,bool)
print (x.dtype)

1
16
(16,)
int32
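
The default integer dtype printed above can differ between platforms and numpy versions (for example `int32` or `int64`). A minimal sketch of how a dtype can be requested explicitly, or changed afterwards with `.astype()`; the variable names are only illustrative:

```python
import numpy as np

# let numpy infer the dtype (platform dependent for integers)
a = np.array([1, 2, 3])

# request a specific dtype when creating the array
b = np.array([1, 2, 3], dtype=np.int64)
c = np.array([1, 2, 3], dtype=np.float64)

# convert an existing array to another dtype (returns a new array)
d = a.astype(np.float64)

print(b.dtype)  # int64
print(c.dtype)  # float64
print(d)        # [1. 2. 3.]
```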


### Numpy Arrays - Matrix

> $X \in \mathbb{R}^{n \times m}$

A two-dimensional ndarray represents a matrix of elements.

For example we can create the following matrix $ X = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ in numpy:

> X = np.array([[1, 2, 3], [4, 5, 6]])

In [6]:
### make a 2d array
X=np.array([[1,2,3],
            [4,5,6]])

print (X.ndim)
print (X.size)
print (X.shape)
print (X.dtype)

2
6
(2, 3)
int32


### Numpy Arrays - Multidimensional arrays (Tensors)

We can also use ndarray to create multidimensional arrays. These are often useful to represent tensors.

Images, for example, can be represented as a three-dimensional array where the shape = (channel, height, width).

```python
X = np.array([ [[1, 2],[4, 5]],
               [[2, 5], [6, 4]],
               [[2, 5],[6, 4]]])
```

In [10]:
### make a 3d array
X=np.array( [[[1, 2], [4, 5]],
             [[2, 5], [6, 4]],
             [[2, 5], [6, 4]]])

print (X.ndim)
print (X.size)
print (X.shape)
print (X.dtype)

3
12
(3, 2, 2)
int32
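
To make the image example concrete, here is a small sketch using a hypothetical "image" with 3 channels, 4 pixels in height, and 5 pixels in width. The values are made up for illustration.

```python
import numpy as np

# hypothetical image: 3 channels, 4 pixels high, 5 pixels wide
image = np.zeros((3, 4, 5))

# set every pixel of the first channel to 1
image[0] = 1.0

first_channel = image[0]    # a 2d array of shape (4, 5)
pixel = image[:, 2, 3]      # values of all 3 channels at row 2, column 3

print(image.shape)          # (3, 4, 5)
print(first_channel.shape)  # (4, 5)
print(pixel)                # [1. 0. 0.]
```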


### indexing

Individual elements in arrays can be retrieved using the [] indexer. We can also use the [] indexer to set the values of individual array elements. Python is zero-indexed, meaning that the first element is accessed with index 0.

```python 
x[0] = 5
```

In [13]:
x=np.array([5,3,8,9])

### indexing
print(x[0])

### setting the value of individual element
### this will set the second element in the vector to 0
x[1]=0
print(x)

5
[5 0 8 9]


In multi-dimensional arrays, items can be accessed using a comma-separated tuple of indices. 

For example, the element $x_{1,2}$ (first row, second column) of the matrix $X$ can be accessed using zero-based indices:

```python
X[0,1]
```

In [14]:
X = np.array([[1, 2, 3], [4, 5, 6]])
print(X)
print(X[0,1])

[[1 2 3]
 [4 5 6]]
2


Values can also be modified in multi-dimensional arrays using the same index notation.

In [15]:
X[1,2]=10
X

array([[ 1,  2,  3],
       [ 4,  5, 10]])
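
A few further indexing patterns, sketched on a fresh copy of the 2 x 3 matrix: a single index selects a whole row, and negative indices count from the end.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])

print(X[0])       # first row: [1 2 3]
print(X[-1])      # last row: [4 5 6]
print(X[-1, -1])  # last element of the last row: 6
print(X[0][1])    # same element as X[0, 1]: 2
```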

### slicing

In the same way that we can use [] to access individual elements, we can also use it to access subarrays with the *slice* notation, which uses the colon (:) character.

```python 
x[start:stop:step]
```

By default, start = 0, stop = the size of the dimension, and step = 1.

In [16]:
### get first two elements
x[:2]

array([5, 0])

In [17]:
### get all elements after second
x[2:]

array([8, 9])

In [18]:
## elements from index 2 up to (but not including) index 4
x[2:4]

array([8, 9])

In [19]:
## similarly with multi-dimensional arrays
## get first 2 columns and first 2 rows
X[:2,:2]

array([[1, 2],
       [4, 5]])
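
The `step` part of the slice and one common pitfall can be sketched as follows. A basic slice returns a *view* of the original array rather than a copy, so writing to the slice also changes the original; `.copy()` gives an independent array.

```python
import numpy as np

x = np.array([5, 0, 8, 9])

print(x[::2])    # every second element: [5 8]
print(x[::-1])   # reversed: [9 8 0 5]

# a basic slice is a view: writing to it changes the original
view = x[:2]
view[0] = 100
print(x)         # the first element of x is now 100

# use .copy() when an independent array is needed
y = x[:2].copy()
y[0] = -1
print(x[0])      # still 100
```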

### reshaping

Reshaping is another useful operation, and can be called using the _.reshape_ method. 

As an example, we can create an array of the numbers 1 through 9, and reshape it into a 3 x 3 grid:

```python
x = np.arange(1,10)
X = x.reshape((3,3))
```

In [20]:
# .arange() creates evenly spaced values within a given interval
x = np.arange(1,10)
X = x.reshape((3,3))
print(x)
print(X)

[1 2 3 4 5 6 7 8 9]
[[1 2 3]
 [4 5 6]
 [7 8 9]]
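
Two details of `.reshape` are worth a short sketch: one dimension can be given as `-1` so that numpy infers it, and the new shape must contain exactly the same number of elements as the original array. The array below is a made-up example.

```python
import numpy as np

x = np.arange(1, 13)        # 12 values: 1 ... 12

A = x.reshape((3, 4))       # 3 rows, 4 columns
B = x.reshape((2, -1))      # -1 lets numpy infer the size: (2, 6)
flat = A.reshape(-1)        # back to a 1d array of 12 elements

print(A.shape, B.shape, flat.shape)

# the total number of elements must match, otherwise numpy raises ValueError
try:
    x.reshape((5, 3))
except ValueError as err:
    print("cannot reshape:", err)
```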


### mathematical functions

So far we have discussed some of the basics of numpy: creating, accessing, and modifying _ndarrays_. However, the power of **numpy** lies in its easy and flexible interface for fast computation over these _ndarrays_.

### array arithmetic

We can use Python's native arithmetic operators on _ndarrays_. For example, standard addition, subtraction, multiplication, and division can be used.

In [22]:
# arithmetic operations on arrays
x = np.array([1, 2, 3])

print (x + 1)
print (x * 2)
print (x ** 2)

[2 3 4]
[2 4 6]
[1 4 9]


Note that these operations are applied element-wise. When the operands have different shapes (here, an array and a scalar), numpy 'broadcasts' the smaller operand across the larger one. In a nutshell, 'broadcasting' describes how numpy treats arrays with different shapes during arithmetic operations.

see: https://numpy.org/doc/stable/user/basics.broadcasting.html
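
Broadcasting also applies between arrays, not only between an array and a scalar. A minimal sketch with made-up values, adding a row vector and a column vector to a 2 x 3 matrix:

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,)
col = np.array([[100], [200]])     # shape (2, 1)

print(X + row)   # row is added to every row of X
print(X + col)   # col is added to every column of X
```

Shapes are compared from the last dimension backwards; two dimensions are compatible when they are equal or when one of them is 1.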

We can also sum over a given axis in an array using the _.sum()_ method. 

Imagine we have 
$$ T = \begin{bmatrix} 5 & 2 & 10 \\ 6 & 8 & 4 \\ 10 & 4 & 6  \end{bmatrix}$$ 

representing an origin-destination matrix. We can quickly calculate totals at origins $O_i = \sum_j T_{ij}$ and destinations $D_j = \sum_i T_{ij}$:

```python
Origins = T.sum(axis=1)
Destinations = T.sum(axis=0)
```

In [24]:
# define array
T = np.array([[5,2,10],
              [6,8,4],
              [10,4,6]
             ])

# sum each row (over destinations j) to get the origin totals
origins = T.sum(axis=1)
print(origins)

# sum each column (over origins i) to get the destination totals
destinations = T.sum(axis=0)
print(destinations)

[17 18 20]
[21 14 20]
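
Combining `.sum()` with broadcasting, the same matrix can be turned into row proportions, i.e. the share of each origin's flow that goes to each destination. This is only a sketch; the name `P` and the use of `keepdims=True` are choices made for this example.

```python
import numpy as np

T = np.array([[5, 2, 10],
              [6, 8, 4],
              [10, 4, 6]])

total = T.sum()                           # sum over all entries: 55
origins = T.sum(axis=1, keepdims=True)    # shape (3, 1) instead of (3,)

# share of each origin's flow going to each destination (rows sum to 1)
P = T / origins

print(total)
print(P.sum(axis=1))
```

Keeping the summed axis with `keepdims=True` gives `origins` the shape (3, 1), so the division is broadcast along each row.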


### dot products

In [25]:
# dot product between two arrays (the sum of the element-wise products)
rv = np.array([1,2,3])
cv = np.array([4,5,6])
np.dot(rv,cv)

32
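
The dot product is also the building block for the vector length and the cosine similarity listed in the objectives. With $\|x\| = \sqrt{x \cdot x}$, the cosine similarity between two vectors is $\cos\theta = \frac{x \cdot y}{\|x\| \|y\|}$. A minimal sketch using the same two vectors and `np.linalg.norm`:

```python
import numpy as np

rv = np.array([1, 2, 3])
cv = np.array([4, 5, 6])

dot = np.dot(rv, cv)                 # 1*4 + 2*5 + 3*6 = 32
length_rv = np.linalg.norm(rv)       # sqrt(1 + 4 + 9) = sqrt(14)
length_cv = np.linalg.norm(cv)       # sqrt(16 + 25 + 36) = sqrt(77)

# cosine of the angle between the two vectors
cosine_similarity = dot / (length_rv * length_cv)
print(dot, length_rv, cosine_similarity)
```

A value close to 1 means the two vectors point in nearly the same direction, regardless of their lengths.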

### matrix multiplication
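
A minimal sketch of matrix multiplication and the transpose, using made-up matrices `A` and `B`. The `@` operator performs matrix multiplication, whereas `*` multiplies element-wise.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
B = np.array([[1, 0],
              [0, 1],
              [2, 2]])         # shape (3, 2)

C = A @ B                      # shape (2, 2); same as np.matmul(A, B)
print(C)

print(A.T)                     # transpose, shape (3, 2)
print(A * A)                   # element-wise product, NOT matrix multiplication
```

The number of columns of the first matrix must equal the number of rows of the second: here (2, 3) times (3, 2) gives a (2, 2) result.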


### inverting a matrix
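
A minimal sketch of inverting a square matrix with `np.linalg.inv`, using a made-up 2 x 2 matrix. Multiplying a matrix by its inverse should return the identity matrix, up to floating-point rounding.

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

A_inv = np.linalg.inv(A)       # raises LinAlgError if A is singular
print(A_inv)

# A multiplied by its inverse gives the identity (up to rounding error)
I = np.eye(2)
print(np.allclose(A @ A_inv, I))   # True
```

Only square matrices with a non-zero determinant have an inverse; here the determinant is 4 * 6 - 7 * 2 = 10.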

### References

* Wes McKinney, Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
* Numpy reference: https://numpy.org/doc/stable/reference/index.html

