# 1. Vector
- Linear algebra: the branch of mathematics that deals with vector spaces
- How are vectors useful for data science?
  1. For us, vectors are points in some finite-dimensional space
  2. They are a good way to represent numeric data, as lists of numbers

- Using lists as vectors is great for exposition but terrible for performance
- In production code, we use the NumPy library, which includes a high-performance array class with all sorts of arithmetic operations included

In [6]:
# List as vector
from typing import List  #for lists' type annotation

Vector = List[float] #type alias: a Vector is a list of float values


height_weight_age = [160,    #vector containing height, weight, age
                     65,
                     40]

grades = [95,         #vector with grades in four tests
          78,
          92,
          65]

## 1. add(a: Vector, b: Vector) -> Vector
- Because Python lists are not vectors, they have no built-in vector arithmetic; we implement the operations ourselves by zipping the lists together

In [18]:
from typing import List #for lists' type annotation

Vector= List[float]     #define 'Vector' annotation

def add_longcode(a: Vector, b: Vector) -> Vector:   #define add function   
    """Adds corresponding elements"""
    
    assert len(a)==len(b), "Input vectors must be of same length"
    
    sum_vector = [] 
    zip_vector = zip(a,b) #lazy iterator of (a_i, b_i) tuples, not a list
    
    for a_i, b_i in zip_vector:
        sum_i = a_i + b_i
        sum_vector.append(sum_i)
    
    return sum_vector

add_longcode([1,2],[2,1])

[3, 3]

- The above program can be shortened as:

In [19]:
from typing import List  #for lists' type annotation

Vector = List[float]  

def add(a: Vector, b: Vector) -> Vector:
    """Adds corresponding elements"""
    
    assert len(a)==len(b), "Input vectors must be of same length"
    
    return [(a_i + b_i) for a_i, b_i in zip(a, b)]

assert add([3,4],[2,1]) == [5,5]
# assert add([2,3],[3,5,3]) == [5,8] #Will generate assertion error, not same length
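
The length check matters because `zip` stops at the end of the shorter input without raising an error. A minimal sketch of what would happen without the assert (`add_unchecked` is a hypothetical name used only for this illustration):

```python
from typing import List

Vector = List[float]

def add_unchecked(a: Vector, b: Vector) -> Vector:
    """Hypothetical variant of add() without the length check."""
    return [a_i + b_i for a_i, b_i in zip(a, b)]

# zip() stops at the end of the shorter list, so the trailing 3 is silently dropped
truncated = add_unchecked([2, 3], [3, 5, 3])
print(truncated)  # [5, 8]
```

The result looks like a valid vector, which is why the mismatch is better caught up front with an assertion.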

## 2. subtract(a: Vector, b: Vector) -> Vector

In [20]:
#subtracting

from typing import List 

Vector = List[float]

def subtract(a: Vector, b: Vector) -> Vector:
    """Subtracts corresponding elements"""

    assert len(a) == len(b)

    return [(a_i - b_i) for a_i, b_i in zip(a,b)]

assert subtract([1,2],[3,4]) == [-2,-2]

## componentwise operation
- componentwise sum of vectors, i.e. the result is a vector whose i-th element is the sum of the i-th elements of all the vectors
- Example: `assert vector_sum([[1,2],[3,4],[5,6],[7,8]]) == [16, 20]`

## 3. vector_sum(list_of_vectors: List[Vector]) -> Vector

In [12]:
from typing import List 
Vector = List[float]

def vector_sum_longcode(list_of_vectors: List[Vector]) -> Vector:
   
    #check if list_of_vectors is empty
    assert list_of_vectors, "no vectors provided"

    #check if vectors are of same size
    l = len(list_of_vectors[0])  #length of first vector
    #all() returns True if all elements in given iterable are true
    assert all(len(v)==l for v in list_of_vectors), "vectors are of different sizes!" 
    
    s = [0]*l  # define a list of zeros of length l
    sum_vector = [] #define empty vector to collate sum values
    
    for i in range(l):
        for v in list_of_vectors:
            s[i] = sum([s[i],v[i]])
        sum_vector.append(s[i])
    return sum_vector

vector_sum_longcode([[1,2],[3,4],[5,6]])

[9, 12]

The above code can be written in one line, because sum() accepts any iterable, including a generator expression

In [13]:
from typing import List 
Vector = List[float]

def vector_sum(list_of_vectors: List[Vector]) -> Vector:
    """sum of all corresponding elements"""
   
    #check if list_of_vectors is empty
    assert list_of_vectors, "no vectors provided"

    #check if vectors are of same size
    l = len(list_of_vectors[0])  #length of first vector
    #all() returns True if all elements in given iterable are true
    assert all(len(v)==l for v in list_of_vectors), "vectors are of different sizes!" 

    return [sum(v[i] for v in list_of_vectors) for i in range(l)]

vector_sum([[1,2],[3,4],[5,6]])

[9, 12]

- **For loops are suitable for iterating over existing collections where the size of the collection is not prohibitively large.**
- **Generators are ideal for calculating large sets of results (especially in calculations involving loops themselves) or in situations where the full size of the collection is not known in advance and/or doesn't need to be stored in memory.**
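
A small sketch of the difference, using made-up values: a list comprehension stores every element before `sum()` runs, while a generator expression hands over one value at a time and is exhausted afterwards.

```python
# A list comprehension builds the whole list in memory before sum() sees it
squares_list = [i * i for i in range(5)]
total_from_list = sum(squares_list)

# A generator expression yields one value at a time; nothing is stored
squares_gen = (i * i for i in range(5))
total_from_gen = sum(squares_gen)

# A generator can only be consumed once: after sum() it is exhausted
leftover = list(squares_gen)

print(total_from_list, total_from_gen, leftover)  # 30 30 []
```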

## 4. scalar_multiply(v:Vector, c:float) -> Vector

In [14]:
from typing import List
Vector = List[float]

def scalar_multiply(v:Vector, c:float) -> Vector:
    """multiplies every element by c"""
    return [c * v_i for v_i in v]


scalar_multiply([1,2,3],2)

[2, 4, 6]

## 5. vector_mean(v: List[Vector]) -> Vector
- example list of vector = [[1,2],[3,4],[5,6]]
- we want output as mean of 1,3,5 and 2,4,6 in a vector of 2 elements
- first find the componentwise sum of the vectors, then divide it by the number of vectors

- mean using the functions vector_sum and scalar_multiply

In [21]:
def vector_mean(v: List[Vector]) -> Vector:
    """Computes the element-wise average"""
    a = vector_sum(v)
    return scalar_multiply(a, 1/len(v))
    

vector_mean([[1,2],[3,4],[5,6]])

[3.0, 4.0]

## 6. dot(a: Vector, b: Vector) -> float
- dot product: `a · b = sum of a[i] * b[i] over all i`

In [16]:
def dot(a: Vector, b: Vector) -> float:
    """Computes a_1 * b_1 + ... + a_n * b_n"""
    assert len(a)==len(b), "different sizes"
    l = len(a)
    return(sum(a[i]*b[i] for i in range(l)))

dot([1,2,3],[4,5,6])

32

- If b has magnitude 1, the dot product measures how far the vector a extends in the b direction. For example, if b = [1, 0], then dot(a, b) is just the first component of a.
- Another way of saying this is that it is the length of the vector you would get if you projected a onto b.
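
A minimal sketch of this projection idea, using example values chosen here: taking the dot product with each unit axis vector picks out the matching component of `a`.

```python
from typing import List

Vector = List[float]

def dot(a: Vector, b: Vector) -> float:
    """Computes a_1 * b_1 + ... + a_n * b_n"""
    assert len(a) == len(b), "different sizes"
    return sum(a_i * b_i for a_i, b_i in zip(a, b))

a = [3, 4]
x_axis = [1, 0]   # unit vector along the first axis
y_axis = [0, 1]   # unit vector along the second axis

along_x = dot(a, x_axis)  # how far a extends in the x direction
along_y = dot(a, y_axis)  # how far a extends in the y direction
print(along_x, along_y)  # 3 4
```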

## 7. sum_of_squares(a: Vector) -> float

In [23]:
#sum of squares
from typing import List
import math

Vector = List[float]

def sum_of_squares(a: Vector) -> float:
    """Returns a_1 * a_1 + ... + a_n * a_n"""
    l = len(a)
    sum_a = sum(math.pow(a[i],2) for i in range(l))
    return sum_a

assert sum_of_squares([1,2])==5

## 8. magnitude(a: Vector) -> float
- magnitude of a vector = `sqrt(sum_of_squares(a))`

In [24]:
def magnitude(a: Vector) -> float:
    """Returns the magnitude (or length) of a"""
    return math.sqrt(sum_of_squares(a))
    
assert magnitude([3,4]) == 5 

## 9. squared_distance(a: Vector, b: Vector) ->float

In [25]:
def squared_distance(a: Vector, b: Vector) -> float:
    """Computes (a_1 - b_1) ** 2 + ... + (a_n - b_n) ** 2"""
    return sum_of_squares(subtract(a,b))

assert squared_distance([1,2],[4,6]) == 25


## 10. distance(a: Vector, b: Vector) -> float
- magnitude of a vector x = `sqrt(x1^2 + x2^2 +...)`
- difference between two vectors x and y = `x - y`, computed componentwise
- distance between x and y = magnitude of that difference = `sqrt((x1-y1)^2 + (x2-y2)^2 + ...)`

In [30]:
import math
from typing import List

Vector = List[float]

def distance(a: Vector, b: Vector) -> float:
    """Computes the distance between a and b"""
    return math.sqrt(squared_distance(a,b))

assert distance([1,2],[4,6])==5


# Equivalent definition: the distance is the magnitude of the difference vector
def distance(v: Vector, w: Vector) -> float:
    return magnitude(subtract(v, w))
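
As a cross-check, and assuming Python 3.8 or later, the standard library's `math.dist` computes the same Euclidean distance. The sketch below compares it with the formula written out by hand; the points `p` and `q` are example values.

```python
import math

# math.dist (Python 3.8+) returns the Euclidean distance between two points
p = [1, 2]
q = [4, 6]

builtin_distance = math.dist(p, q)

# The same quantity written out: sqrt((p1-q1)^2 + (p2-q2)^2 + ...)
manual_distance = math.sqrt(sum((p_i - q_i) ** 2 for p_i, q_i in zip(p, q)))

print(builtin_distance, manual_distance)
```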

**Important** 


Using lists as vectors is great for exposition but terrible for performance.
In production code, you would want to use the **NumPy library**, which includes a high-performance array class with all sorts of arithmetic operations included.

# 2. Matrices
- A matrix is a list of lists in which every inner list (row) has the same size
- `A[i][j]` -> element in the i-th row and j-th column of matrix A

In [None]:
Matrix = List[List[float]]   #type alias/annotation 

A = [[1,2,3], [4,5,6]]  #rows=2, col=3
B = [[1,2], [3,4], [5,6]] #rows=3, col=2


- Indexing in a matrix starts at 0, because it is a list of lists: the first row is row 0 and the first column is column 0

## 1. shape(A: Matrix) -> Tuple[int, int]

In [31]:
#shape of matrix
from typing import Tuple

Matrix = List[List[float]]

def shape(A: Matrix) -> Tuple[int, int]:
    """Returns (# of rows of A, # of columns of A)"""
    n_rows = len(A)
    n_col = len(A[0]) if A else 0  
    return (n_rows, n_col)

assert shape ([[1,2,3], [4,5,6]]) == (2,3)

- In a matrix of size n x k, each of the n rows is a vector of length k, and each of the k columns is a vector of length n.

## 2. get_row(A: Matrix, i: int) -> List

In [32]:
#get_row[i]

from typing import List

Matrix = List[List[float]]

def get_row(A: Matrix, i: int) -> List:
    """Returns the i-th row of A (as a Vector)"""
    return A[i]

  
assert get_row([[1,2,3], [4,5,6]], 1) == [4,5,6]

## 3. get_column(A: Matrix, j: int) -> List

In [33]:
#get_column[j]

from typing import List

Matrix = List[List[float]]

def get_column(A: Matrix, j: int) -> List:
    """Returns the j-th column of A (as a Vector)"""
    return [r[j] for  r in A]

assert get_column([[1,2,3], [4,5,6]], 1) == [2,5]

## 4. make_matrix(num_rows: int, num_cols: int, entry_fn: Callable[[int, int], float]) -> Matrix

In [34]:
# make a matrix with values defined by entry_fn
from typing import Callable, List

def make_matrix(num_rows: int,
                num_cols: int, 
                entry_fn: Callable[[int, int],float]) -> Matrix:
    """
    Returns a num_rows x num_cols matrix
    whose (i,j)-th entry is entry_fn(i, j)
    """
    return [[entry_fn(i,j) for j in range(num_cols)] for i in range(num_rows)]
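
A short usage sketch for `make_matrix`; the function is repeated so the block runs on its own, and the entry function `i + j` is just an example choice.

```python
from typing import Callable, List

Matrix = List[List[float]]

def make_matrix(num_rows: int,
                num_cols: int,
                entry_fn: Callable[[int, int], float]) -> Matrix:
    """Returns a num_rows x num_cols matrix whose (i,j)-th entry is entry_fn(i, j)"""
    return [[entry_fn(i, j) for j in range(num_cols)] for i in range(num_rows)]

# Each entry is the sum of its row index and column index
addition_table = make_matrix(2, 3, lambda i, j: i + j)
print(addition_table)  # [[0, 1, 2], [1, 2, 3]]
```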


- create an identity matrix using make_matrix()

## 5. identity_matrix(size: int) -> Matrix

In [35]:
#5x5 identity matrix

def identity_matrix(size: int) -> Matrix:
    """Returns the n x n identity matrix"""
    return make_matrix(size, size, lambda i, j: 1 if i==j else 0)

identity_matrix(5)

[[1, 0, 0, 0, 0],
 [0, 1, 0, 0, 0],
 [0, 0, 1, 0, 0],
 [0, 0, 0, 1, 0],
 [0, 0, 0, 0, 1]]

## How to use matrices in Data Science?

1. To represent a dataset consisting of multiple vectors, e.g. the age, height, and weight of 1000 people as a 1000 x 3 matrix
2. To represent a linear function: an n × k matrix maps k-dimensional vectors to n-dimensional vectors
3. To represent binary relationships
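
To illustrate the second use, here is a minimal sketch of applying a matrix to a vector. `matrix_times_vector` is a hypothetical helper name introduced only for this example; each output entry is the dot product of one row with the input vector.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]

def matrix_times_vector(A: Matrix, v: Vector) -> Vector:
    """Hypothetical helper: applies the n x k matrix A to a k-dimensional vector v."""
    assert all(len(row) == len(v) for row in A), "each row of A must have len(v) entries"
    # Entry i of the result is the dot product of row i with v
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]   # 2 x 3: maps 3-dimensional vectors to 2-dimensional vectors

result = matrix_times_vector(A, [1, 0, 1])
print(result)  # [4, 10]
```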

- Example: users 0-9 are connected to each other in the following way  
   friendships = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4),(4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]

- This can be represented as a matrix, which makes it quick to check whether two users are connected
- Create 10 rows and 10 columns (one per user); if the tuple (i, j) is present in `friendships`, set entries [i][j] and [j][i] to 1

In [None]:
friend_matrix = [[0, 1, 1, 0, 0, 0, 0, 0, 0, 0], # user 0
                 [1, 0, 1, 1, 0, 0, 0, 0, 0, 0], # user 1
                 [1, 1, 0, 1, 0, 0, 0, 0, 0, 0], # user 2
                 [0, 1, 1, 0, 1, 0, 0, 0, 0, 0], # user 3
                 [0, 0, 0, 1, 0, 1, 0, 0, 0, 0], # user 4
                 [0, 0, 0, 0, 1, 0, 1, 1, 0, 0], # user 5
                 [0, 0, 0, 0, 0, 1, 0, 0, 1, 0], # user 6
                 [0, 0, 0, 0, 0, 1, 0, 0, 1, 0], # user 7
                 [0, 0, 0, 0, 0, 0, 1, 1, 0, 1], # user 8
                 [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]] # user 9

assert friend_matrix[0][2] == 1, "0 and 2 are not friends"
assert friend_matrix[1][1] == 0, "a user should not be listed as their own friend"
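
The matrix above can also be built from the `friendships` list instead of being typed by hand. A minimal sketch, assuming friendships are mutual so that every pair is marked in both directions (`built_matrix` and `num_users` are names chosen for this example):

```python
friendships = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4),
               (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]

num_users = 10  # users 0 through 9

# Start with all zeros, then mark each friendship in both directions
built_matrix = [[0 for _ in range(num_users)] for _ in range(num_users)]
for i, j in friendships:
    built_matrix[i][j] = 1
    built_matrix[j][i] = 1

print(built_matrix[2])  # [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
```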

In [None]:
#to find the connections of any node (e.g. user 2), read that user's row of the matrix
num_col = len(friend_matrix[0])
x = [friend_matrix[2][j] for j in range(num_col)]
print(x)
print([(index) for index, value in enumerate(x) if value])