# Machine Learning Quiz 2 Practice
- toc: true
- badges: true
- comments: true
- author: Sachin Yadav
- categories: [MLCourse2022]

In [1]:
#collapse
import numpy as np
import pandas as pd
import torch

## Maths for ML

1. Given a vector $\epsilon$, we can calculate $\sum\epsilon_{i}^{2}$ using $\epsilon^{T} \epsilon$

In [2]:
#collapse
# As per the convention we take epsilon to be a 2D column vector
y = np.array([1.3, 2.5, 6.4, 8.1, 9.0]).reshape(-1, 1)
y_hat = np.array([1.5, 2.0, 5.9, 8.5, 9.0]).reshape(-1, 1)

epsilon = np.abs(y - y_hat)

epsilon_square_sum1 = np.sum(epsilon**2)
epsilon_square_sum2 = (epsilon.T @ epsilon).item()

assert np.allclose(epsilon_square_sum1, epsilon_square_sum2)

👍 works!

2. $(AB)^{T} = B^{T}A^{T}$

In [3]:
#collapse
A = np.random.randn(50, 10)
B = np.random.randn(10, 20)

ab_t = (A @ B).T
b_t_a_t = B.T @ A.T
assert np.allclose(ab_t, b_t_a_t)

🙄 Knew it already!

3. For a scalar $s$, $s = s^{T}$
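
A quick check in the same spirit as the cells above. This is only a sketch: it treats the scalar as a $1 \times 1$ NumPy array (the values in `epsilon` are made up for illustration).

```python
import numpy as np

# A scalar written as a 1x1 matrix, e.g. the result of epsilon.T @ epsilon
epsilon = np.array([0.2, 0.5, 0.5, 0.4, 0.0]).reshape(-1, 1)  # illustrative values
s = epsilon.T @ epsilon  # shape (1, 1)

assert s.shape == (1, 1)
# Transposing a 1x1 matrix leaves it unchanged
assert np.array_equal(s, s.T)
```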

4. Derivative of scalar $s$ with respect to (yes!, I wrote wrt as full here 😁) vector $\theta$
   $$\theta = \begin{bmatrix} \theta_{1} \\ \theta_{2} \\ \vdots \\ \theta_{n} \end{bmatrix}$$
    $$\frac{\partial s}{\partial \theta} = \begin{bmatrix}
     \frac{\partial s}{\partial \theta_{1}} \\
     \frac{\partial s}{\partial \theta_{2}} \\
     \frac{\partial s}{\partial \theta_{3}} \\
     \vdots \\
     \frac{\partial s}{\partial \theta_{n}} 
     \end{bmatrix} $$
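
To make the "stack the partial derivatives" idea concrete, here is a small sketch that estimates each partial derivative with central differences. The helper `numerical_gradient` and the example function `s` are hypothetical names chosen for this illustration, not part of any library.

```python
import numpy as np

def numerical_gradient(f, theta, h=1e-6):
    """Central-difference estimate of df/dtheta, one partial derivative per entry."""
    grad = np.zeros_like(theta)
    for i in range(theta.shape[0]):
        step = np.zeros_like(theta)
        step[i, 0] = h
        grad[i, 0] = (f(theta + step) - f(theta - step)) / (2 * h)
    return grad

# Hypothetical scalar function: s = theta_1^2 + 3 * theta_2 - theta_3
def s(theta):
    return theta[0, 0] ** 2 + 3 * theta[1, 0] - theta[2, 0]

theta = np.array([[2.0], [-1.0], [0.5]])
grad = numerical_gradient(s, theta)

# The gradient has the same shape as theta: one partial derivative per row
assert grad.shape == theta.shape
# Analytically: [2 * theta_1, 3, -1]
assert np.allclose(grad, np.array([[4.0], [3.0], [-1.0]]), atol=1e-4)
```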

5. If $A$ is a matrix and $\theta$ is a vector such that $A\theta$ is a scalar, then
   $$ \frac{\partial A \theta}{\partial \theta} = A^{T} $$

🤔 By analogy with $a\theta$, where both $a$ and $\theta$ are scalars, my first idea is that the derivative would be $A$. But the gradient must have shape $N \times 1$, so $A^{T}$ is my guess before starting any calculations.

In [4]:
#collapse
N = 20
# A @ theta must be a scalar, so A.shape[0] should be 1.
A = torch.randn((1, N))
theta = torch.randn((N, 1), requires_grad=True)
scalar = A @ theta
scalar.backward()
assert torch.allclose(theta.grad, A.T)

👍 all good

6. Assume $Z$ is a matrix of the form $X^{T}X$, then
   $$ \frac{\partial (\theta^{T}Z\theta)}{\partial \theta} = 2Z^{T}\theta$$

🤔 Let me again make a good guess before any calculation: if $\theta$ and $Z$ were both scalars, the derivative would be $2Z\theta$. So my guess is $2Z\theta$, which equals $2Z^{T}\theta$ because $Z = X^{T}X$ is symmetric.

In [5]:
#collapse
X = torch.randn((N, N))
Z = X.T @ X
theta = torch.randn((N, 1), requires_grad=True)

scalar = theta.T @ Z @ theta
scalar.backward()

assert torch.allclose(theta.grad, 2 * Z.T @ theta)

👍 good
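
The symmetry of $Z$ matters here. Expanding $\theta^{T}M\theta = \sum_{i}\sum_{j}\theta_{i}M_{ij}\theta_{j}$ and differentiating with respect to $\theta_{k}$ gives $(M\theta)_{k} + (M^{T}\theta)_{k}$, so for a general square $M$ the gradient is $(M + M^{T})\theta$, which only collapses to $2Z\theta$ when the matrix is symmetric. A rough NumPy sketch of both cases, using a hypothetical finite-difference helper `numerical_gradient` (my own name, not a library function):

```python
import numpy as np

def numerical_gradient(f, theta, h=1e-6):
    """Central-difference estimate of df/dtheta, one partial derivative per entry."""
    grad = np.zeros_like(theta)
    for i in range(theta.shape[0]):
        step = np.zeros_like(theta)
        step[i, 0] = h
        grad[i, 0] = (f(theta + step) - f(theta - step)) / (2 * h)
    return grad

def quadratic_form(M):
    """Return the scalar function theta -> theta^T M theta."""
    return lambda t: (t.T @ M @ t).item()

rng = np.random.default_rng(0)
n = 4
theta = rng.standard_normal((n, 1))

# General (non-symmetric) M: the gradient is (M + M^T) theta
M = rng.standard_normal((n, n))
grad_general = numerical_gradient(quadratic_form(M), theta)
assert np.allclose(grad_general, (M + M.T) @ theta, atol=1e-5)

# Symmetric Z = X^T X: the gradient simplifies to 2 Z theta = 2 Z^T theta
X = rng.standard_normal((n, n))
Z = X.T @ X
grad_symmetric = numerical_gradient(quadratic_form(Z), theta)
assert np.allclose(grad_symmetric, 2 * Z @ theta, atol=1e-5)
assert np.allclose(grad_symmetric, 2 * Z.T @ theta, atol=1e-5)
```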

Let's skip the content of the Rank topic for now.

The maximum possible rank of an $R \times C$ matrix is $\min(R, C)$.

But an interesting question 🤔: what is the minimum possible rank of a matrix? Is it 0, or is it 1?

Ans: The rank is zero in the case of the zero matrix.
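
A small sketch of the two extremes using `np.linalg.matrix_rank`. The shapes and matrices are arbitrary examples picked for illustration.

```python
import numpy as np

R, C = 5, 3

# Identity block stacked on zero rows: C independent columns, so rank = min(R, C)
full_rank = np.vstack([np.identity(C), np.zeros((R - C, C))])
assert full_rank.shape == (R, C)
assert np.linalg.matrix_rank(full_rank) == min(R, C)

# Every column identical: only one independent column, so rank = 1
rank_one = np.ones((R, C))
assert np.linalg.matrix_rank(rank_one) == 1

# The zero matrix is the only matrix with rank 0
zero = np.zeros((R, C))
assert np.linalg.matrix_rank(zero) == 0
```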

Just a parting thought: if I had been a developer of NumPy, I would not have allowed `np.eye` as the name of the identity matrix function. Better to use `np.identity` only. 😞
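
For context, the two functions agree on square identity matrices, but `np.eye` also accepts a column count and a diagonal offset `k`. A short sketch:

```python
import numpy as np

n = 3
# For a square identity matrix the two functions give the same result
assert np.array_equal(np.eye(n), np.identity(n))

# np.eye can also build rectangular matrices ...
rect = np.eye(2, 3)
assert rect.shape == (2, 3)

# ... and place the ones on an off-diagonal via k
shifted = np.eye(3, k=1)
assert np.array_equal(shifted, np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]]))
```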

## Linear Regression

Considering `weight` as a linear function of `height`:
- $weight_{1} \approx \theta_{0} + \theta_{1} * height_{1}$
- $weight_{2} \approx \theta_{0} + \theta_{1} * height_{2}$
- $weight_{N} \approx \theta_{0} + \theta_{1} * height_{N}$


$$ W_{N\times1} \approx X_{N\times2} \, \theta_{2\times1} $$
where the feature matrix is $X = \begin{bmatrix}
1 & height_{1} \\
1 & height_{2} \\
\vdots & \vdots \\
1 & height_{N}
\end{bmatrix}$

- $\theta_{0}$, Bias/Intercept term : (the predicted `weight` when `height` is set to zero)
- $\theta_{1}$, Slope term : (the increase in predicted `weight` when `height` is increased by 1 unit)
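
The matrix form above can be sketched in NumPy as follows. The heights and the values of `theta` are hypothetical numbers chosen only to show the shapes and the meaning of the two parameters; nothing here is fitted to the dataset.

```python
import numpy as np

heights = np.array([64.0, 66.0, 68.0, 74.0])  # hypothetical heights
N = len(heights)

# Feature matrix X: a column of ones (for theta_0) next to the heights
X = np.column_stack([np.ones(N), heights])  # shape (N, 2)
theta = np.array([[-350.0], [7.7]])  # hypothetical theta_0 and theta_1

W_hat = X @ theta  # predicted weights, shape (N, 1)

assert X.shape == (N, 2)
assert W_hat.shape == (N, 1)
# Row i of X @ theta is theta_0 + theta_1 * height_i
assert np.allclose(W_hat[:, 0], theta[0, 0] + theta[1, 0] * heights)

# Height 0 gives the intercept; one extra unit of height adds the slope
at_zero = (np.array([[1.0, 0.0]]) @ theta).item()
at_one = (np.array([[1.0, 1.0]]) @ theta).item()
assert np.isclose(at_zero, theta[0, 0])
assert np.isclose(at_one - at_zero, theta[1, 0])
```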

In [25]:
import gradio as gr
import matplotlib.pyplot as plt

weight_height_df = pd.read_csv("assets/2022-02-17-machine-learning-quiz2-practice/weight-height.csv")
# take 30 points at random, without replacement
sampled_idx = np.random.choice(np.arange(len(weight_height_df)), size=30, replace=False)
weight_height_df = weight_height_df.iloc[sampled_idx][["Height", "Weight"]]
display(weight_height_df)

def plot_line(theta0, theta1):
    # weight is modelled as a function of height, so height goes on the x-axis
    x = weight_height_df["Height"]
    y = weight_height_df["Weight"]
    fig, ax = plt.subplots()
    ax.scatter(x, y, label="data")
    ax.plot(x, theta0 + theta1 * x, color="red", label="theta0 + theta1 * height")
    ax.set_xlabel("Height")
    ax.set_ylabel("Weight")
    ax.legend()
    return fig

# the function has to return a figure; "plot" is assumed to be gradio's shortcut for a plot output
gr.Interface(fn=plot_line, inputs=["number", "number"], outputs="plot", live=True).launch()

| | Height | Weight |
|---|---|---|
| 601 | 67.115564 | 168.202167 |
| 4738 | 67.222467 | 176.538232 |
| 6375 | 66.15718 | 162.339164 |
| 9129 | 66.313844 | 139.141991 |
| 5347 | 67.098583 | 160.69333 |
| 8963 | 64.334788 | 161.746797 |
| 1393 | 68.342365 | 187.633463 |
| 2241 | 64.721622 | 150.619457 |
| 8061 | 65.11834 | 146.650334 |
| 2494 | 74.153065 | 212.276328 |

In [39]:
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

import matplotlib.pyplot as plt
%matplotlib inline

import numpy as np

def plot_func(freq, val):
    x = np.linspace(0, 2*np.pi)
    y = np.sin(x * freq)
    y2 = np.zeros(x.shape)
    y2[:] = val
    plt.plot(x, y2)
    plt.plot(x, y)

# the initial value must lie within [min, max]; 7.5 would be clamped to 5.0
interact(plot_func,
         freq=widgets.FloatSlider(value=5.0, min=1, max=5.0, step=0.5),
         val=widgets.FloatSlider(value=5.0, min=1, max=5.0, step=0.5))
