In [2]:
import numpy as np
import matplotlib.pyplot as plt
import csv
import pandas as pd
%matplotlib inline

# Refresher on Linear Algebra and Derivatives


- (a) Let $A$ be a $3 \times 4$ matrix and $B$ a $3 \times 2$ matrix. What is the size of $A^T B$?

> $A^T B$ is a $4 \times 2$ matrix.

 Answer:    
 Since $A$ is a $3 \times 4$ matrix, its transpose $A^T$ is $4 \times 3$. Multiplying $A^T$ by the $3 \times 2$ matrix $B$ therefore gives a $4 \times 2$ matrix.

In [8]:
A = np.ones(shape=(3,4))
B = np.ones(shape=(3,2))
C = A.T.dot(B)
print("A.T:{} x B:{}=C:{}".format(A.T.shape, B.shape, C.shape))

A.T:(4, 3) x B:(3, 2)=C:(4, 2)


- (b) Let $x \in R^n$ be a column vector (vectors are always columns for us) and $A$ an $m \times n$ matrix. What is the size of $Ax$?

> $Ax$ is an $m \times 1$ matrix.

Answer:   
Since $x$ is a column vector, it is an $n \times 1$ matrix, and $A$ is an $m \times n$ matrix. The product $Ax$ is therefore an $m \times 1$ matrix.

In [9]:
x = np.ones(shape=(4,1))
A = np.ones(shape=(5,4))
y = A.dot(x)
print("A:{} * x:{} = y:{}".format(A.shape, x.shape, y.shape))

A:(5, 4) * x:(4, 1) = y:(5, 1)


- (c) What is the derivative of $f(x) = (2x + y)^2$ w.r.t. $x$, i.e. $\frac{\partial}{\partial x}f(x)$?

$$\begin{aligned}
\frac{\partial}{\partial x} f(x) &= \frac{\partial}{\partial x} (2x+y)^2 \\
                                 &= 2(2x+y) \cdot 2 \\
                                 &= 8x + 4y
\end{aligned}$$
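
The closed form $8x + 4y$ can be sanity-checked numerically. The sketch below compares it with a central finite difference; the sample point `(x0, y0)` and the step size `h` are arbitrary choices for illustration, not part of the exercise.

```python
import numpy as np

def f(x, y):
    # f(x) = (2x + y)^2, with y treated as a constant
    return (2 * x + y) ** 2

def df_dx(x, y):
    # Closed-form partial derivative from the derivation above
    return 8 * x + 4 * y

x0, y0, h = 1.5, -0.7, 1e-6  # arbitrary sample point and step size

# Central difference approximation of the partial derivative w.r.t. x
numeric = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
analytic = df_dx(x0, y0)

print(np.isclose(numeric, analytic))
```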

- (d) Given $f(x) = g(x^2)$ where $g(x) = (x + y)^2$, what is $\frac{\partial}{\partial x}f(x)$?

$$\begin{aligned}
\frac{\partial}{\partial x}f(x) &= \frac{\partial}{\partial x} g(x^2) \\
                                &= g'(x^2) \cdot \frac{\partial}{\partial x} (x^2) \\
                                &= 2(x^2+y) \cdot 2x \\
                                &= 4x^3 + 4xy
\end{aligned}$$
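
The same kind of numerical check applies to the chain-rule result $4x^3 + 4xy$. This is a minimal sketch; the function names, the sample point, and the step size `h` are assumptions made for illustration.

```python
import numpy as np

def g(u, y):
    # g(u) = (u + y)^2, with y treated as a constant
    return (u + y) ** 2

def f(x, y):
    # f(x) = g(x^2)
    return g(x ** 2, y)

def df_dx(x, y):
    # Closed-form result from the chain rule above
    return 4 * x ** 3 + 4 * x * y

x0, y0, h = 1.2, 0.5, 1e-6  # arbitrary sample point and step size

# Central difference approximation of the derivative w.r.t. x
numeric = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
analytic = df_dx(x0, y0)

print(np.isclose(numeric, analytic))
```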

# Multivariable Calculus

Recall that a matrix $A \in R^{n\times n}$ is symmetric if $A^T = A$, that is, $A_{ij} = A_{ji}$ for
all $i, j$. Also recall that the gradient $\nabla f(x)$ of a function $f : R^n \to R$ is the $n$-vector
of partial derivatives

$$
\nabla f(x) = \left[
\begin{matrix}
    \frac{\partial}{\partial x_1}f(x) \\
    \vdots \\
    \frac{\partial}{\partial x_n}f(x)
    \end{matrix}
\right]
$$

where

$$
x = \left[
\begin{matrix}
    x_1 \\
    \vdots \\
    x_n
\end{matrix}
\right]
$$

The Hessian $\nabla^2 f(x)$ is the $n\times n$ symmetric matrix of second partial derivatives,

$$\begin{aligned}
\nabla^2f(x) = \left[
\begin{matrix}
    \frac{\partial^2}{\partial x_1^2}f(x) & \frac{\partial^2}{\partial x_1 \partial x_2}f(x) & \cdots & \frac{\partial^2}{\partial x_1 \partial x_n}f(x) \\
    \frac{\partial^2}{\partial x_2 \partial x_1}f(x) & \frac{\partial^2}{\partial x_2^2}f(x) & \cdots & \frac{\partial^2}{\partial x_2 \partial x_n}f(x) \\
    \vdots & \vdots & \ddots & \vdots \\
    \frac{\partial^2}{\partial x_n \partial x_1}f(x) & \frac{\partial^2}{\partial x_n \partial x_2}f(x) & \cdots & \frac{\partial^2}{\partial x_n^2}f(x) \\
\end{matrix}
\right]
\end{aligned}$$
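
To make the definition concrete, the sketch below approximates a Hessian entry by entry with central differences and checks that the result is symmetric. The helper `numerical_hessian` and the test function $f(x) = x_1^2 x_2 + x_2^3$ are hypothetical, chosen only for illustration.

```python
import numpy as np

def numerical_hessian(f, x, h=1e-4):
    """Approximate the Hessian of f at x with central differences."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n)
            e_i[i] = h
            e_j = np.zeros(n)
            e_j[j] = h
            # Second-order central difference for d^2 f / (dx_i dx_j)
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h ** 2)
    return H

def f(x):
    # Hypothetical test function: f(x) = x_1^2 * x_2 + x_2^3
    return x[0] ** 2 * x[1] + x[1] ** 3

x0 = np.array([1.0, 2.0])
H = numerical_hessian(f, x0)

# Hessian of the test function worked out by hand
H_exact = np.array([[2 * x0[1], 2 * x0[0]],
                    [2 * x0[0], 6 * x0[1]]])

print(np.allclose(H, H.T), np.allclose(H, H_exact, atol=1e-5))
```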


- a) Let $f(x) = \frac{1}{2}x^TAx+b^Tx$, where $A$ is a symmetric matrix and $b\in R^n$ is a vector. What is $\nabla f(x)$?

$\because$ $A\in R^{n\times n}$, $A_{ij} = A_{ji}$ and $f(x)= \frac{1}{2}x^TAx+b^Tx$

$\therefore$
$$\begin{aligned}
    \frac{\partial}{\partial x_i} f(x) &= \frac{\partial}{\partial x_i} \left(\frac{1}{2}x^TAx+b^Tx\right) \\
    &= \frac{\partial}{\partial x_i}\left( \frac{1}{2}\left[\begin{matrix}
            x_1 & \cdots & x_n
            \end{matrix}\right]
            \left[\begin{matrix}
            A_{1,1} & A_{1,2} & \cdots & A_{1,n} \\
            A_{2,1} & A_{2,2} & \cdots & A_{2,n} \\
            \vdots & \vdots & \ddots & \vdots \\
            A_{n,1} & A_{n,2} & \cdots & A_{n,n} \\
            \end{matrix}\right]
            \left[\begin{matrix}
            x_1 \\
            x_2 \\
            \vdots \\
            x_n
            \end{matrix}\right]
            +\left[\begin{matrix}
            b_1 & b_2 & \cdots & b_n
            \end{matrix}\right]\left[\begin{matrix}
            x_1 \\
            x_2 \\
            \vdots \\
            x_n
            \end{matrix}\right] \right)\\
    &= \frac{\partial}{\partial x_i} \left(\frac{1}{2} \sum_{j=1}^{n}\sum_{k=1}^{n}A_{jk}x_jx_k + \sum_{j=1}^n b_jx_j\right) \\
    &= \frac{1}{2} \left(\sum_{k=1}^{n}A_{ik}x_k + \sum_{j=1}^{n}A_{ji}x_j\right) + b_i \\
    &= \frac{1}{2} \left(\sum_{j=1}^{n}A_{ij}x_j + \sum_{j=1}^{n}A_{ij}x_j\right) + b_i \qquad \text{(since } A_{ji} = A_{ij}\text{)} \\
    &= \sum_{j=1}^{n}A_{ij}x_j + b_i
\end{aligned}$$


$\therefore$
$$\begin{aligned}
\nabla f(x) &= \left[
        \begin{matrix}
            \frac{\partial}{\partial x_1}f(x) \\
            \vdots \\
            \frac{\partial}{\partial x_n}f(x) \\
        \end{matrix}\right] \\
    &= \left[
        \begin{matrix}
            \sum_{j=1}^{n}A_{1j}x_j + b_1 \\
            \vdots \\
            \sum_{j=1}^{n}A_{nj}x_j + b_n
        \end{matrix}
        \right] \\
    &= Ax+b
\end{aligned}$$
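
The identity $\nabla f(x) = Ax + b$ can be checked numerically. The sketch below builds a random symmetric $A$ and compares a central-difference gradient with $Ax + b$; the dimension `n`, the random seed, and the step size `h` are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Build a random symmetric matrix A and random vectors b, x
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
b = rng.standard_normal(n)
x = rng.standard_normal(n)

def f(x):
    # f(x) = 1/2 x^T A x + b^T x
    return 0.5 * x @ A @ x + b @ x

h = 1e-6
# Central difference along each coordinate direction e_i
numeric_grad = np.array([
    (f(x + h * e) - f(x - h * e)) / (2 * h)
    for e in np.eye(n)
])
analytic_grad = A @ x + b

print(np.allclose(numeric_grad, analytic_grad, atol=1e-6))
```

Note that the check relies on $A$ being symmetric; for a general square $A$ the gradient would be $\frac{1}{2}(A + A^T)x + b$.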

- b) Let $f(x) = g(a^Tx)$

![WechatIMG169.jpeg](attachment:WechatIMG169.jpeg)

- c)

![WechatIMG170.jpeg](attachment:WechatIMG170.jpeg)

- d)

![WechatIMG171.jpeg](attachment:WechatIMG171.jpeg)

# Hands On

In [10]:
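# Read the space-delimited data file into a 2-D float array.
with open("diabetes.txt", "r") as file:
    reader = csv.reader(file, delimiter=' ')
    # np.float was removed in NumPy 1.24; the builtin float is equivalent.
    table = np.asarray([row for row in reader], dtype=float)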


In [11]:
# First 200 rows for training, the remaining rows for validation.
training_set = table[:200,:]
validation_set = table[200:,:]

In [12]:
table.shape

(442, 11)
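
As a possible alternative to the `csv` reader, `np.loadtxt` parses whitespace-delimited numeric text directly. The sketch below uses a small in-memory stand-in instead of `diabetes.txt`, so the values and the names `text`, `table_demo`, `train_demo`, and `valid_demo` are purely illustrative; it assumes the real file contains only numeric columns.

```python
import io
import numpy as np

# Hypothetical stand-in for a whitespace-delimited data file (4 rows, 3 columns)
text = "1 2 3\n4 5 6\n7 8 9\n10 11 12\n"

# np.loadtxt accepts a file-like object and returns a float array by default
table_demo = np.loadtxt(io.StringIO(text))

# Same row-slicing pattern as the split above, with a smaller cut-off
train_demo = table_demo[:2, :]
valid_demo = table_demo[2:, :]

print(table_demo.shape, train_demo.shape, valid_demo.shape)
```

With the real file, passing the path (`np.loadtxt("diabetes.txt")`) should yield the same kind of array, provided every row has the same number of numeric fields.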