### Math to Code

For each problem you'll be given a mathematical expression along with sample data, and you'll need to implement a function that computes the expression shown in LaTeX.  You can verify your implementation by running the unit-test cell that follows each function.

It is possible to implement these operations using external libraries, but we ask that you use only plain Python for this task.

While completing these tasks, it's helpful to make sure you understand what data type the input is, and what data type the expression evaluates to, as these are common sources of confusion when people are trying to interpret these statements.
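
As a small illustration of tracking input and output types, the sketch below implements the product $\prod_{i=1}^{n} x_i$, which is not one of the exercises. The name `calc_prod` is hypothetical; the point is only that the input is a list while the result is a single number.

```python
def calc_prod(x_vec: list) -> int:
    """Multiply all elements of x_vec together (hypothetical example)."""
    result = 1
    for x_i in x_vec:   # one loop pass per term x_i of the product
        result *= x_i
    return result

x = [4, 6, 8]
print(type(x))             # the input is a list
print(type(calc_prod(x)))  # the output is a single int
```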

One purpose of this exercise is to familiarize you with mathematical notation that you may not have seen before.

As a reference, here's a useful list to find out what something refers to: https://byjus.com/maths/math-symbols/

**1).** Please implement the following expression:

$$ \sum_{i=1}^{n} x_i $$

In [None]:
# sample variable for you to use
x = [4, 6, 8, 10, 12]

# function for you to complete
def calc_x(x_vec: list) -> int:
    assert type(x_vec) == list, "Input to function should be a list"
    
    # calculate this value
    ans = None
    
    # YOUR CODE HERE
    return ans

In [None]:
# test cases
x_test = [4, 6, 8, 10, 12]

assert calc_x(x_test) == 40, "Function did not return the correct value"
assert calc_x([0]*10) == 0, "Function did not return the correct value"

print("passed!")

**2).** Please implement the following expression:

$$ \sum_{i=1}^{n} (x_i - y_i) $$

In [None]:
# sample variable for you to use
x = [4, 6, 8, 10, 12]
y = [2, 4, 6, 8, 10]

# function for you to complete
def calc_x_y(x_vec: list, y_vec: list) -> int:
    assert type(x_vec) == list, "Input to function should be a list"
    assert type(y_vec) == list, "Input to function should be a list"
    
    # calculate this value
    ans = None
    
    # YOUR CODE HERE
    
    return ans

In [None]:
# test cases

x_test = [3, 5, 7, 9, 11]
y_test = [3, 4, 6, 8, 10]

assert calc_x_y(x_test, y_test) == 4, "Function did not return the correct value"
assert calc_x_y([0]*10, [1]*10) == -10, "Function did not return the correct value"

print("passed!")

**3).** Please implement the following expression:

$$ \sum_{i=1}^{n} (x_i + y_i) $$

In [None]:
# sample variable for you to use
x = [4, 6, 8, 10, 12]
y = [2, 4, 6, 8, 10]

# function for you to complete
def calc_x_y_add(x_vec: list, y_vec: list) -> int:
    assert type(x_vec) == list, "Input to function should be a list"
    assert type(y_vec) == list, "Input to function should be a list"
    
    # calculate this value
    ans = None
    
    # YOUR CODE HERE
    
    return ans

In [None]:
# test cases

x_test = [3, 5, 7, 9, 11]
y_test = [3, 4, 6, 8, 10]

assert calc_x_y_add(x_test, y_test) == 66, "Function did not return the correct value"
assert calc_x_y_add([0]*10, [1]*10) == 10, "Function did not return the correct value"

print("passed!")

**4).** Please implement the following expression:

$$ x^Tx $$

**Hint:** If you want to know the meaning of the $^T$, you can read about that here:  https://www.cuemath.com/algebra/transpose-of-a-matrix/

**Hint 2:** This operation would normally depend on the particular geometry of $x$, however regular python does not allow you to specify this, so just focus on the numeric value this operation should return, and not so much on possible geometric interpretations of the above operation.
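
To make the point about geometry concrete, here is a minimal sketch with made-up data. A flat Python list carries only a length, whereas nested lists can imitate a row or a column. The helper `shape` is hypothetical and is not needed for the exercise.

```python
x = [1, 2, 3]

row = [x]                 # 1 x 3: one row holding all the values
col = [[v] for v in x]    # 3 x 1: one value per row

def shape(mat: list) -> tuple:
    """Return (rows, columns) of a list of lists."""
    return (len(mat), len(mat[0]))

print(len(x))      # a flat list only has a length: 3
print(shape(row))  # (1, 3)
print(shape(col))  # (3, 1)
```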

In [None]:
# sample variable for you to use
x = [4, 6, 8, 10, 12]

def x_t_x(x_vec: list) -> int:
    assert type(x_vec) == list, "Input to function should be a list"
    
    # calculate this value
    xtx = None
    
    # YOUR CODE HERE
    
    return xtx

In [None]:
# test cases

x_test = [3, 5, 7, 9, 11]

ans = x_t_x(x_test)

assert type(ans) in [int, float], "Function did not return the correct data type"
assert x_t_x(x_test) == 285, "Function did not return the correct value"
print("passed!")

**5).** Please implement the following expression:

$$ x \in \mathbb{R}^{n}, \,\,\,j \in \mathbb{R}^{n-1}\,\,,\,\,\, j_i =  x_{i+1}  - x_{i}\,\,\, \forall\,\, i = 1, 2,...,n-1$$

**Hint:** The $\mathbb{R}^{n}$ symbol is often confusing for people to interpret.  Here's an explanation:  https://en.wikipedia.org/wiki/Real_coordinate_space
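
One way to read this in code: a vector in $\mathbb{R}^{n}$ is $n$ real numbers, which in plain Python is a list of length $n$. The sketch below uses made-up data and shows that pairing each element with its neighbour yields $n-1$ pairs, which is why a result built from neighbouring elements lives in $\mathbb{R}^{n-1}$.

```python
x = [10.0, 12.5, 11.0, 15.0]      # x in R^4, so n = 4

# zip stops at the shorter input, so this yields n - 1 neighbouring pairs
pairs = list(zip(x[:-1], x[1:]))

print(len(x))      # 4
print(pairs)       # [(10.0, 12.5), (12.5, 11.0), (11.0, 15.0)]
print(len(pairs))  # 3
```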

In [None]:
# sample variables for you to use
x = [1, 3, 5, 7, 9, 11]

def calc_j(x_vec: list) -> list:
    assert type(x_vec) == list, "Input to function needs to be a list"
    
    # return this value
    j = None
    
    # YOUR CODE HERE
    
    return j

In [None]:
# test cases
x_test = [4, 3, 2, 7, 9, 11]

ans = calc_j(x_test)
assert len(ans) == 5, "function did not return vector with correct length"
assert calc_j(x_test) == [-1, -1, 5, 2, 2], "function did not return the correct value"
print("passed!")

**6).** Please implement the following expression:

$$ x \in \mathbb{R}^{n}, \,\,\,j \in \mathbb{R}^{n-1},\,\,\,\,\, \alpha\in(0, 1)\,\,\, \\j_i =  \alpha x_{i+1}  + (1-\alpha)x_{i}\,\,\, \forall\,\, i = 1, 2,...,n-1$$

**Question:** How will changing the value of $\alpha$ affect the values that appear in $j$?
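
To build intuition for the question, the following sketch applies the same weighting to just two numbers. The function name `blend` and the values are made up for illustration; it is not the solution to the exercise.

```python
def blend(a: float, b: float, alpha: float) -> float:
    """Weighted average of two numbers: alpha * a + (1 - alpha) * b."""
    return alpha * a + (1 - alpha) * b

a, b = 10.0, 20.0
for alpha in [0.1, 0.5, 0.9]:
    # small alpha leans toward b, large alpha leans toward a
    print(alpha, blend(a, b, alpha))
```

The result always lies between `a` and `b`, and moves toward `a` as `alpha` grows.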

In [None]:
# sample variables for you to use
x = [2, 4, 6, 8, 11, 13, 21, 4]

def calc_j_alpha(x_vec: list, alpha: float) -> list:
    assert type(x_vec) == list, "input should be a list"
    
    # Calculate this value
    j = None
    
    # YOUR CODE HERE
    
    return j

In [None]:
# test cases
x_test = [4, 3, 11, 8, 5, 4, 5, 7]
alpha  = 0.5

ans    = calc_j_alpha(x_test, alpha)
assert type(ans) == list, "Function should return a list"
assert len(ans) == 7, "Function did not return value with the correct dimensions"
assert ans == [3.5, 7.0, 9.5, 6.5, 4.5, 4.5, 6.0], "Function did not return the correct value"

print("passed!")

**7).** Please implement the following expression:
$$  x \in \mathbb{R}^{n}, \,\,\,y \in \mathbb{R}^{n},\,\,\, d = \sqrt{\sum_{i=1}^n (x_i - y_i)^2} $$

**Question:** What does the value $d$ represent above?  (It is a very common metric that's used all the time).

**Question 2:** What if we modified the formula above to look like the following:

$$ d_p = \left(\sum_{i=1}^n |x_i - y_i|^p\right)^{1/p} $$

How does changing the value of $p$ affect the result?  What is this generalized formula commonly called?
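
A quick way to explore the second question is to evaluate the generalized formula for a few values of $p$. The sketch below assumes $p \geq 1$ and takes absolute differences so that odd $p$ behaves sensibly. The name `generalized_dist` and the data are made up for illustration.

```python
def generalized_dist(x_vec: list, y_vec: list, p: float) -> float:
    """(sum of |x_i - y_i|^p) ^ (1/p), assuming p >= 1."""
    total = sum(abs(x_i - y_i) ** p for x_i, y_i in zip(x_vec, y_vec))
    return total ** (1 / p)

x = [0, 0]
y = [3, 4]
for p in [1, 2, 10]:
    print(p, generalized_dist(x, y, p))
```

As $p$ grows, the largest single difference (here 4) increasingly dominates the result.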

In [None]:
# sample variables for you to use
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 3]

def calc_d(x_vec: list, y_vec: list) -> float:
    assert type(x_vec) == list and type(y_vec) == list, "Inputs need to be lists"
    assert len(x_vec) == len(y_vec), "inputs need to be the same length"
    
    # Calculate this value
    d = None
    
    # YOUR CODE HERE
    
    return d

In [None]:
# test cases
x_test = [4, 2, 1, 8, 7]
y_test = [2, -3, 5, 7, 9]

ans = calc_d(x_test, y_test)
assert type(ans) == float, "Function should return a number as output"
assert round(ans, 2) == 7.07, "Function did not return the correct value"

print("Passed!")

**8).** Please implement the following expression:

$$X, Y \in \mathbb{R}^{m \times n}\,\,\,\\ Y_{i, j} = X_{i, j} - 1 \\\forall \, i = 1, 2, ..., m,\,\,\,\forall \, j = 1, 2, ..., n$$

In [None]:
# sample variables for you to use
X = [[2, 6, 3, 9, 1], 
     [1, 5, 7, 4, 0], 
     [7, 3, 8, 2, 5], 
     [3, 5, 9, 2, 8]]

def calc_y_ij(x_mat: list) -> list:
    assert type(x_mat) == list, "Input needs to be a list"
    
    # Calculate this value
    Y = None
    
    # YOUR CODE HERE
    
    return Y

In [None]:
# test cases
X_test = [[8, 1, 9, 4, 7], 
          [8, 4, 6, 9, 0], 
          [8, 6, 7, 1, 9], 
          [6, 9, 3, 4, 1]]

ans    = calc_y_ij(X_test)
assert len(ans) == 4, "Answer did not have the correct dimensions"
assert sum(1 for row in ans for i in row) == 20, "Answer did not have the correct dimensions"
assert sum(sum(row) for row in ans) == 90, "Answer did not have the correct values"
print("passed!")

**9).** Please implement the following expression:

$$ A \in \mathbb{R}^{m \times n},\, x\in\mathbb{R}^n,\, b\in\mathbb{R}^m $$
$$ b_i = \sum_{j=1}^n A_{i, j}\,x_{j} \,\,\,\forall\, i = 1, 2, ..., m $$

In [1]:
import numpy as np

In [5]:
A = np.random.normal(size = (10, 5))
x = np.random.normal(size = 5)

In [6]:
A

array([[-0.48371158, -0.748006  , -0.39440267, -2.08619296,  1.23668749],
       [ 0.31133866,  1.00277322, -0.73397584, -1.05207761,  0.68917856],
       [ 1.21517792,  1.11563411,  0.72096404, -0.65595135, -1.56800354],
       [ 0.38338748,  0.94869908, -1.11779235,  1.59239138,  0.77818087],
       [-0.37033995,  0.2812268 ,  0.29529717,  0.4138392 , -0.38869025],
       [-0.58271034,  1.20595011, -0.81355555, -0.83822453, -1.19709544],
       [ 0.79995129,  1.61836255,  0.07264136, -0.32243754,  0.08979003],
       [-0.02254941,  0.15408156,  0.44317953,  0.3280503 ,  1.26491484],
       [-1.23325199,  0.25343771,  0.83153194, -0.5860247 ,  1.23115079],
       [ 0.65327381,  0.08330851,  1.03455267, -0.43773826,  1.44577787]])

In [12]:
sum(A[0]*x)

-1.3323063414202962

In [8]:
x

array([-0.26409361,  1.02944047, -0.91282802, -0.25824372, -1.28471572])

In [9]:
A @ x

array([-1.33230634,  1.0063604 ,  2.35327611,  0.48476496,  0.51024024,
        3.89237377,  1.35634962, -1.94974665, -1.60279421, -2.77550333])

In [None]:
# sample variables for you to use
A = [[1, 2], 
     [3, 4],
     [5, 6]]

x = [3, 4]

def calc_a_x(a_mat: list, x_vec: list) -> list:
    assert type(a_mat) == list, "A matrix needs to be a list"
    assert type(x_vec) == list, "x vector needs to be a list"
    
    # Calculate this value
    b = None
    
    # YOUR CODE HERE
    
    return b

In [None]:
# test cases
A_test = [[2, 3], 
          [1, 5],
          [4, 2]]

x_test =  [2, 6]

ans = calc_a_x(A_test, x_test)
assert len(ans) == 3, "Answer did not return the correct dimensions"
assert sum(ans) == 74, "Answer did not return the correct value"
print("passed!")

**10).** Please implement the following expression:

$$ A \in \mathbb{R}^{m \times n},\, B\in \mathbb{R}^{n \times p},\, C\in\mathbb{R}^{m \times p}$$
$$ C_{i, j} = \sum_{k=1}^n A_{i, k}B_{k, j}$$

**Hint:** This expression might look confusing to you.  Don't panic, that's normal.  If you want a slightly formal definition of what this expression means, you can find one here:  https://math.stackexchange.com/questions/2063241/matrix-multiplication-notation

If you want to visualize what this expression does, here's a useful resource: http://matrixmultiplication.xyz

You are basically implementing exactly this, but in python code.

If you want a video walk through of how these types of operations work, there is no one better than Gilbert Strang:  https://www.youtube.com/watch?v=FX4C-JpTFgY

**Hint 2:** This sort of operation is a little trickier in regular python than it will be with other tools like numpy.  If you have a list of lists like we have here, you can grab a row from it easily with an index, but there is no direct way to grab a column from it.  You might have to create a separate helper function to grab a column from your (pseudo) matrix to aid in this calculation.
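
A minimal sketch of such a helper is shown below. The name `get_column` and the sample matrix `M` are hypothetical, and the sketch assumes every row has the same length.

```python
def get_column(mat: list, j: int) -> list:
    """Return column j of a list-of-lists matrix as a flat list."""
    return [row[j] for row in mat]

M = [[1, 2, 3],
     [4, 5, 6]]

print(M[0])              # a row is a single index away: [1, 2, 3]
print(get_column(M, 0))  # a column needs a pass over every row: [1, 4]
```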

In [None]:
# sample variables for you to use
A = [[6, 7, 4], 
     [7, 8, 9], 
     [5, 9, 8], 
     [1, 3, 5], 
     [1, 4, 7]]

B = [[8, 9],
     [6, 8],
     [4, 2]]

def calc_A_B(a_mat: list, b_mat: list) -> list:
    assert type(a_mat) == list, "A input needs to be a list"
    assert type(b_mat) == list, "B input needs to be a list"
    
    # Calculate this value
    C = None
    
    # YOUR CODE HERE
    return C

In [None]:
# test cases

A_test = [[1, 2], 
          [3, 4],
          [4, 2],
          [5, 1]]

B_test = [[4, 3, 1],
          [3, 1, 5]]

ans = calc_A_B(A_test, B_test)
assert len(ans) == 4, "Answer did not have correct dimensions"
assert all(len(row) == 3 for row in ans), "Answer did not have the correct dimensions"
assert sum(sum(row) for row in ans) == 185, "Answer did not have the correct value"
print("passed!")

**11).** Implement the following expression:

$$|| x ||_{2} $$

**Hint:** This sort of notation might look a little strange, but it comes up a lot in ML so you ought to get used to it.  Here's a good article describing what this expression is referring to:  https://towardsdatascience.com/a-quick-guide-to-understanding-vectors-norms-84eb802f81f9

In [None]:
# sample variable for you to use
x = [2, 4, 6, 8, 11]

def calc_l2_norm(x_vec: list) -> float:
    assert type(x_vec) == list, "input needs to be a list"
    
    # Calculate this value
    norm_val = None
    
    # YOUR CODE HERE
    
    return norm_val

In [None]:
# test cases
x_test = [3, 5, 1, 2]

ans = calc_l2_norm(x_test)
assert type(ans) == float, "Function did not return the correct data type"
assert round(ans, 2) == 6.24, "Answer did not return the correct value"
print("passed!")

**12).** Implement the following expression:

$$ || x - y ||_1 $$

In [None]:
# sample variables for you to use
x = [2, 3, 4, 5]
y = [4, 1, 3, 0]

def calc_l1_norm(x_vec: list, y_vec: list) -> int:
    assert type(x_vec) == list, "x_vec needs to be a list"
    assert type(y_vec) == list, "y_vec needs to be a list"
    
    # Calculate this value
    norm_val = None
    
    # YOUR CODE HERE
    
    return norm_val

In [None]:
# test cases
x_test = [3, 1, 4, 0]
y_test = [2, 3, 1, 2]

ans = calc_l1_norm(x_test, y_test)
assert type(ans) == int, "Function should return a number"
assert ans == 8
print("passed!")

**13).** Implement the following expression:

$$ || x - y ||_2^2 $$

In [None]:
# sample variables for you to use
x = [2, 3, 4, 5]
y = [4, 1, 3, 0]

def calc_norm_val(x_vec: list, y_vec: list) -> int:
    assert type(x_vec) == list, "x_vec needs to be a list"
    assert type(y_vec) == list, "y_vec needs to be a list"
    
    # Calculate this value
    norm_val = None
    
    # YOUR CODE HERE
    
    return norm_val

In [None]:
# test cases
x_test = [4, 6, 3, 1]
y_test = [2, 5, 1, 8]

ans    = calc_norm_val(x_test, y_test)
assert ans == 58, "Function did not return the correct value"
print("passed!")

**14).** Implement the following expression:

$$ var = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2 $$

The variable name kind of gives it away, but what common term does this expression refer to?
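
Note the $\frac{1}{n-1}$ factor: this is the sample form, as opposed to the population form, which divides by $n$. Purely as a cross-check (the exercise itself still asks for plain Python loops), the standard library's `statistics` module provides both forms. The data below is made up.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data, not from the exercise

sample = statistics.variance(data)       # divides by n - 1
population = statistics.pvariance(data)  # divides by n

print(sample, population)
```

The sample value is always the larger of the two whenever the data is not constant.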

In [None]:
# sample variable for you to use
x = [4, 3, 1, 6, 8]

def calc_var(x_vec: list) -> float:
    assert type(x_vec) == list, "x_vec needs to be a list"

    # Calculate this value
    var = None
    
    # YOUR CODE HERE
    
    return var

In [None]:
# test cases
x_test = [11, 12, -2, 6]

ans = calc_var(x_test)
assert type(ans) == float, "Function did not return the correct data type"
assert round(ans, 2) == 40.92, "Function did not return the correct value"
print("passed!")

**15).** Implement the following expression:

$$ std = \sqrt{\frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2} $$

This is another metric you're probably familiar with, so make sure to find out for yourself what it refers to.

In [None]:
# sample variable for you to use
x = [4, 3, 1, 6, 8]

def calc_std(x_vec: list) -> float:
    assert type(x_vec) == list, "x_vec needs to be a list"

    # Calculate this value
    std = None
    
    # YOUR CODE HERE
    
    return std

In [None]:
# test cases
x_test = [11, 12, -2, 6]

ans = calc_std(x_test)
assert type(ans) == float, "Function did not return the correct data type"
assert round(ans, 2) == 6.4, "Function did not return the correct value"
print("passed!")

**16).** Implement the following expression:

$$ cov = \frac{1}{n-1}\sum_{i=1}^n(x_i - \bar{x})(y_i - \bar{y}) $$

In [None]:
# sample variables for you to use
x = [4, 3, 1, 6, 8]
y = [2, 6, 7, 3, 11]

def calc_cov(x_vec: list, y_vec: list) -> float:
    assert type(x_vec) == list, "x_vec needs to be a list"
    assert type(y_vec) == list, "y_vec needs to be a list"
    
    # calculate this value
    cov = None
    
    # YOUR CODE HERE
    
    return cov

In [None]:
# test cases
x_test = [4, 6, 3, 1]
y_test = [2, 5, 1, 8]

ans = calc_cov(x_test, y_test)
assert type(ans) == float, "Function did not return the correct data type"
assert round(ans, 2) == -2.33
print("passed!")

**17).** Implement the following expression:

$$ corr = \frac{\sum_{i=1}^n(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^n(x_i - \bar{x})^2}\sqrt{\sum_{i=1}^n(y_i - \bar{y})^2}} $$

This expression can also be re-written as:

$$ corr = \frac{cov(x, y)}{\sigma(x)*\sigma(y)} $$
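
To see why the two forms agree, assume $cov$ and $\sigma$ both use the $\frac{1}{n-1}$ convention from the previous questions. Substituting their definitions gives:

$$ \frac{cov(x, y)}{\sigma(x)*\sigma(y)} = \frac{\frac{1}{n-1}\sum_{i=1}^n(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\frac{1}{n-1}\sum_{i=1}^n(x_i - \bar{x})^2}\sqrt{\frac{1}{n-1}\sum_{i=1}^n(y_i - \bar{y})^2}} $$

Each square root in the denominator contributes a factor of $\frac{1}{\sqrt{n-1}}$, so together they contribute $\frac{1}{n-1}$, which cancels the same factor in the numerator and leaves the first expression.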

Hopefully you can guess what this expression is referring to, but again, try and look it up if you are not sure.

**Hint:** You should be able to re-use previous functions to calculate this value.

In [None]:
# sample variables for you to use
x = [4, 3, 1, 6, 8]
y = [2, 6, 7, 3, 11]

def calc_corr(x_vec: list, y_vec: list) -> float:
    assert type(x_vec) == list, "x_vec needs to be a list"
    assert type(y_vec) == list, "y_vec needs to be a list"
    
    # calculate this value
    corr = None
    
    # YOUR CODE HERE
    
    return corr

In [None]:
# test cases
x_test = [4, 6, 3, 1]
y_test = [2, 5, 1, 8]

ans = calc_corr(x_test, y_test)
assert type(ans) == float, "Function did not return the correct data type"
assert round(ans, 2) == -0.35, "Function did not return the correct value"
print("passed!")

**18).** Calculate the distance matrix!

The final challenge for this notebook is to calculate an $\ell_2$ distance matrix from your data.  This will reuse the metric that you calculated for question 7.

Quantifying how each data point is related to each other one is a common data transformation for ML models, so going through this type of exercise now is a nice prep for future material.  

Since this is the math-to-code notebook we'll begin by trying to express the distance matrix formally:

$$ A \in\mathbb{R}^{m \times n}, D\in\mathbb{R}^{m \times m} $$
$$ D_{i, j} = dist(A_{i*}, A_{j*}) $$

In plain English, you're going to create a matrix that holds the distance between each row of your data and every other row (including itself).  So if your original data was 50x10, your distance matrix will be 50x50.

The entry at position (1, 3) in $D$ is the distance between row 1 and row 3 of $A$.
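
The shape of this computation can be sketched with a stand-in metric. Below, the hypothetical helper `pairwise` fills an $m \times m$ matrix using whatever `dist` function it is given, and `l1_dist` (a sum of absolute differences) is deliberately not the $\ell_2$ metric the exercise asks for. The data is made up, and Python indices start at 0 rather than 1.

```python
def pairwise(a_mat: list, dist) -> list:
    """Build an m x m matrix whose (i, j) entry is dist(row i, row j)."""
    return [[dist(row_i, row_j) for row_j in a_mat] for row_i in a_mat]

def l1_dist(u: list, v: list) -> int:
    """Stand-in metric: sum of absolute differences (not the l2 metric)."""
    return sum(abs(a - b) for a, b in zip(u, v))

data = [[0, 0],
        [1, 2],
        [4, 6]]

D = pairwise(data, l1_dist)
for row in D:
    print(row)
```

The diagonal is zero because each row has zero distance to itself, and the matrix is symmetric because the stand-in metric does not depend on argument order.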

In [None]:
# sample variable for you to use
A = [[0, 9, 8, 5, 3],
     [8, 3, 1, 5, 0],
     [5, 3, 2, 6, 1],
     [6, 2, 4, 1, 3],
     [7, 1, 3, 9, 8],
     [7, 6, 3, 5, 8]]

def calc_dist_mat(a_mat: list) -> list:
    assert type(a_mat) == list, "a_mat needs to be a list"
    
    # Calculate this value
    dist_mat = None
    
    # YOUR CODE HERE
    
    return dist_mat

In [None]:
# test cases
A_test = [[13, 12], 
          [4, -2]]

ans = calc_dist_mat(A_test)
assert round(sum(sum(row) for row in ans), 2) == 33.29, "Function did not return the correct value"
assert len(ans) == 2, "Function did not return the correct dimensions"
print("passed!")