# Derivative of a function with respect to a vector argument

$f(y)$ is a function that takes a vector input $y$ and returns a scalar value.

We want to find the gradient: $\frac{d}{dy}f(y) = \left(\frac{d}{dy_0}f(y), \dots, \frac{d}{dy_i}f(y), \dots, \frac{d}{dy_{N-1}}f(y)\right)$, where $0 \leq i \leq N-1$ and $N$ is the size of the vector $y$.

Now let $y$ be a vector function of some vector $x$: $y(x)$, where $x$ also has size $N$.

The $j$-th component of the derivative follows from the chain rule: $$\frac{d}{dx_j} f(y(x)) = \sum\limits_i\frac{df}{dy_i}\frac{dy_i}{dx_j}$$

Collecting all components gives a row vector multiplied by the Jacobian matrix: $$\frac{d}{dx}f(y(x)) = \left(\frac{d}{dy_0}f(y(x))...\frac{d}{dy_{N-1}}f(y(x))\right) \begin{pmatrix}\frac{d}{dx_0}y_0 & ... & \frac{d}{dx_{N-1}}y_0 \\ ... & ... & ... \\ \frac{d}{dx_{0}}y_{N-1} & ... & \frac{d}{dx_{N-1}}y_{N-1}\end{pmatrix} = \frac{df}{dy}\frac{dy}{dx}$$
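The chain rule above can be checked numerically. The sketch below is only an illustration: the choices $y(x) = \tanh(Mx)$ and $f(y) = \sum_i y_i^2$, the matrix `M`, and the size `N = 4` are assumptions made for this example, not part of the original text. It builds $\frac{df}{dy}$ and the Jacobian $\frac{dy}{dx}$ analytically and compares their product with central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
M = rng.normal(size=(N, N))  # hypothetical matrix defining y(x)

def y(x):
    # Vector function of a vector: y(x) = tanh(M x)
    return np.tanh(M @ x)

def f(y_val):
    # Scalar function of a vector: f(y) = sum_i y_i^2
    return np.sum(y_val ** 2)

x = rng.normal(size=N)

# Analytic pieces of the chain rule
df_dy = 2.0 * y(x)                                 # row vector, shape (N,)
dy_dx = (1.0 - np.tanh(M @ x) ** 2)[:, None] * M   # Jacobian, entry [i, j] = dy_i / dx_j
grad_chain = df_dy @ dy_dx                         # row vector times Jacobian

# Central finite differences of f(y(x)) with respect to each x_j
eps = 1e-6
grad_fd = np.array([
    (f(y(x + eps * e)) - f(y(x - eps * e))) / (2 * eps)
    for e in np.eye(N)
])

print(np.max(np.abs(grad_chain - grad_fd)))
```

The printed difference should be small, limited only by the finite-difference step.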

## Let's try to solve the following task:

Find the derivative $\frac{d}{dx}f(x)$ for each $f$ below.

### Subtask 1:

$f(x) = Ax - b$ (here $f$ is vector-valued, so the derivative is the Jacobian matrix)

### Subtask 2:

$f(x) = \|x\|_2^2$

### Subtask 3:

$f(x) = \|Ax-b\|^2_2$

### Subtask 4:

$f(x) = \|Ax-b\|^2_2 + \alpha \|x\|_2^2$
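One way to check a hand-derived answer for the subtasks is to compare it with finite differences. The sketch below assumes the standard closed forms ($A$, $2x$, $2A^T(Ax-b)$, and $2A^T(Ax-b) + 2\alpha x$) as candidates; the helper `numerical_jacobian`, the problem sizes, and the value of `alpha` are hypothetical choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 4))  # hypothetical problem sizes
b = rng.normal(size=6)
x = rng.normal(size=4)
alpha = 0.3

def numerical_jacobian(func, x, eps=1e-6):
    # Central differences; entry [i, j] approximates d func_i / d x_j.
    cols = []
    for e in np.eye(len(x)):
        diff = np.atleast_1d(func(x + eps * e)) - np.atleast_1d(func(x - eps * e))
        cols.append(diff / (2 * eps))
    return np.stack(cols, axis=1)

# Each entry: (function, candidate closed-form derivative evaluated at x)
subtasks = {
    "1: Ax - b": (lambda x: A @ x - b, A),
    "2: ||x||^2": (lambda x: x @ x, 2 * x),
    "3: ||Ax - b||^2": (lambda x: np.sum((A @ x - b) ** 2), 2 * A.T @ (A @ x - b)),
    "4: ||Ax - b||^2 + alpha ||x||^2": (
        lambda x: np.sum((A @ x - b) ** 2) + alpha * (x @ x),
        2 * A.T @ (A @ x - b) + 2 * alpha * x,
    ),
}

errors = {}
for name, (func, candidate) in subtasks.items():
    numeric = numerical_jacobian(func, x)
    # Scalar functions give a 1 x N row; compare against the candidate as a row
    errors[name] = np.max(np.abs(numeric - np.atleast_2d(candidate)))
    print(name, errors[name])
```

A large error for one of the entries would indicate that the corresponding candidate formula is wrong.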

# Solving an under-determined problem with L1 regularization

![](https://drive.google.com/uc?export=view&id=1LImALY7lZnF5hryguEKgALT10JWyTmlc)


![](https://drive.google.com/uc?export=view&id=1r-7Fj4ranAQxCkYBNVxNYoPxKBx3dfki)

![](https://drive.google.com/uc?export=view&id=130bZFiTJ43eO-65B8CP6lsiv3NffOVVG)

# Application to a standard problem

![](https://drive.google.com/uc?export=view&id=1BH0jrR-yuzby_AwPlqzQYF0YoC6MZh20)

![](https://drive.google.com/uc?export=view&id=1JWadh_BlwTcqxYfakZ8lknZYKjTTkk47)

In [None]:
from jax import numpy as jnp

![](https://drive.google.com/uc?export=view&id=1j7JFGPEquuMPBda1yrCsdJr3As0ezi-Q)

In [None]:
import numpy as np
from scipy.optimize import minimize

In [None]:
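# Square system: 5 equations, 5 unknowns
X = np.random.uniform(size=(5, 5))
y = np.random.uniform(size=5)
# Exact solution of X w = y (equivalent to np.linalg.inv(X) @ y)
w_hat = np.linalg.solve(X, y)
w_hat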

array([ 0.02118832,  0.34455939,  1.21204   , -0.98028037, -0.33646532])

In [None]:
def f_l1(w, alpha=0.0):
  # Squared residual plus an L1 penalty weighted by alpha
  return ((y - X @ w) ** 2).sum() + alpha * np.abs(w).sum()

In [None]:
# alpha=0: no penalty, so the minimizer should reproduce y almost exactly
res_l1 = minimize(f_l1, np.zeros_like(w_hat), tol=1e-10)
np.linalg.norm(y - X @ res_l1.x)

1.6673446800650432e-08

In [None]:
# Count weights that are numerically zero
(np.abs(res_l1.x) < 1e-6).sum()

0

In [None]:
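# The L1 term is non-smooth, so the default solver typically reaches zeros only approximately (hence the 1e-6 threshold)
res_l1 = minimize(lambda x: f_l1(x, alpha=0.1), np.zeros_like(w_hat), tol=1e-10)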

In [None]:
(np.abs(res_l1.x) < 1e-6).sum()

3

In [None]:
res_l1 = minimize(lambda x: f_l1(x,alpha=10.),np.zeros_like(w_hat),tol=1e-10)

In [None]:
(np.abs(res_l1.x) < 1e-6).sum()

5
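The experiments above count weights below a `1e-6` threshold because a general-purpose smooth optimizer does not usually land on exact zeros. As a hedged alternative sketch (a different technique from the `minimize` calls above, not the method used in this notebook), proximal gradient descent with soft-thresholding (often called ISTA) handles the L1 term exactly and can return weights that are exactly zero. The function names `soft_threshold` and `ista`, the iteration count, and the random seed are assumptions for this illustration.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1: shrink every entry toward zero by t.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(X, y, alpha, n_iter=5000):
    # Minimizes ||y - X w||^2 + alpha * ||w||_1 by proximal gradient descent.
    # Step size is 1 / L, where L = 2 * sigma_max(X)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ w - y)   # gradient of the smooth squared-residual term
        w = soft_threshold(w - step * grad, step * alpha)
    return w

rng = np.random.default_rng(0)
X = rng.uniform(size=(5, 5))
y = rng.uniform(size=5)

# For alpha at or above this value, w = 0 satisfies the optimality condition
alpha_max = 2.0 * np.max(np.abs(X.T @ y))

for alpha in [0.0, 0.1, 1.01 * alpha_max]:
    w = ista(X, y, alpha)
    print(alpha, int((w == 0).sum()))   # number of exactly-zero weights
```

The last value of `alpha` is chosen so that the all-zero vector is the minimizer, which mirrors the `alpha=10.` experiment above where every weight was driven to zero.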

# The Gram–Schmidt process

$V$ is the set of vectors $v_1, \dots, v_N$ we would like to orthogonalize.

Basic algorithm (here $(a,b)$ denotes the inner product):

$u_1 = \frac{v_1}{\|v_1\|}$

$u_2 = v_2 - (v_2,u_1)u_1$, $u_2 = \frac{u_2}{\|u_2\|}$

...

$u_N = v_N - (v_N,u_{N-1})u_{N-1} - ... - (v_N,u_1)u_1$, $u_N = \frac{u_N}{\|u_N\|}$
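The formulas above can be sketched directly in NumPy. This is a minimal illustration, assuming the vectors are stored as the columns of a matrix and are linearly independent; the function name `classical_gram_schmidt` and the random test matrix are hypothetical. Every coefficient $(v_k, u_j)$ is computed from the original $v_k$ and from $u_j$ that has already been normalized.

```python
import numpy as np

def classical_gram_schmidt(V):
    # Columns of V are the input vectors v_1, ..., v_N.
    V = np.asarray(V, dtype=float)
    U = np.zeros_like(V)
    for k in range(V.shape[1]):
        # Coefficients (v_k, u_j) for all previously computed, already normalized u_j
        coeffs = U[:, :k].T @ V[:, k]
        # Subtract all projections from the ORIGINAL v_k at once
        u = V[:, k] - U[:, :k] @ coeffs
        U[:, k] = u / np.linalg.norm(u)
    return U

rng = np.random.default_rng(2)
V = rng.normal(size=(5, 5))
U = classical_gram_schmidt(V)

# Deviation of U^T U from the identity measures the loss of orthogonality
print(np.max(np.abs(U.T @ U - np.eye(5))))
```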

In [None]:
import numpy as np

In [None]:
V = np.random.normal(size=(5,5))

In [None]:
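def simple_ortho(v):
  # Columns of v are the vectors; u is updated in place, column by column.
  u = np.copy(v)
  for i in range(v.shape[1]):
    # NOTE: the projection uses u[:, i] before it is normalized, so each
    # coefficient is too large by a factor of ||u_i||^2. This is the main
    # reason the orthogonality check on U is far from zero.
    u[:,i+1:] -= v[:,i:i+1].T @ u[:,i+1:] * u[:,i:i+1]
    u[:,i] /= np.linalg.norm(u[:,i])
  return u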

In [None]:
U = simple_ortho(V)

In [None]:
A = np.random.normal(size=(3,3))
A[:,1:2] = A @ np.random.normal(size=(3,1))

In [None]:
Q,R = np.linalg.qr(A)

In [None]:
R

array([[-1.03535947,  1.71497802,  1.55178398],
       [ 0.        ,  0.54752739,  1.73103542],
       [ 0.        ,  0.        ,  0.00512746]])

In [None]:
np.abs(U @ U.T - np.eye(5)).mean()

0.7246949150073461

Modified algorithm (each projection is subtracted from the already updated vector, not from the original $v_k$):

Set $u = copy(v)$

$u_1 = \frac{u_1}{\|u_1\|}$

$u_2 = v_2 - (v_2,u_1)u_1$, $u_2 = \frac{u_2}{\|u_2\|}$

...

$u_N = v_N - (v_N,u_1)u_1$, ...,  $u_N = u_N - (u_N,u_{N-1})u_{N-1}$, $u_N = \frac{u_N}{\|u_N\|}$

In [None]:
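def error_aware_ortho(v):
  # Modified Gram-Schmidt on the columns of v.
  u = np.copy(v)
  for i in range(v.shape[1]):
    # Remove the component along u[:, i] from all later columns; dividing by
    # ||u_i||^2 accounts for u[:, i] not being normalized yet.
    u[:,i+1:] -= u[:,i:i+1].T @ u[:,i+1:] / np.linalg.norm(u[:,i])**2 * u[:,i:i+1]
    u[:,i] /= np.linalg.norm(u[:,i])
  return u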

In [None]:
U_good = error_aware_ortho(V)

In [None]:
np.abs(U_good @ U_good.T - np.eye(5)).mean()

9.331082644195687e-17
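On a well-conditioned random matrix, correctly implemented classical and modified updates both give nearly orthogonal results, so the numerical benefit of the modified algorithm is easier to see on nearly dependent vectors. The sketch below uses a Läuchli-type matrix as a test case; this matrix, the value `eps = 1e-8`, and the helper names `cgs`, `mgs`, and `ortho_error` are assumptions chosen for this illustration and do not appear in the notebook.

```python
import numpy as np

def cgs(V):
    # Classical Gram-Schmidt: coefficients come from the original column V[:, k].
    U = np.zeros_like(V, dtype=float)
    for k in range(V.shape[1]):
        u = V[:, k] - U[:, :k] @ (U[:, :k].T @ V[:, k])
        U[:, k] = u / np.linalg.norm(u)
    return U

def mgs(V):
    # Modified Gram-Schmidt: each projection is removed from the updated columns.
    U = np.array(V, dtype=float)
    for i in range(U.shape[1]):
        U[:, i] /= np.linalg.norm(U[:, i])
        U[:, i+1:] -= U[:, i:i+1] @ (U[:, i:i+1].T @ U[:, i+1:])
    return U

def ortho_error(U):
    # Largest deviation of U^T U from the identity
    return np.max(np.abs(U.T @ U - np.eye(U.shape[1])))

eps = 1e-8  # small enough that 1 + eps**2 rounds to 1 in double precision
V = np.array([
    [1.0, 1.0, 1.0],
    [eps, 0.0, 0.0],
    [0.0, eps, 0.0],
    [0.0, 0.0, eps],
])

err_cgs = ortho_error(cgs(V))
err_mgs = ortho_error(mgs(V))
print("classical:", err_cgs)
print("modified: ", err_mgs)
```

In this test case the classical variant loses orthogonality between the second and third vectors, while the modified variant keeps the error on the order of `eps`.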

In [None]:
# Upper-triangular block of R = U_good.T @ V, the triangular factor in V = U_good R
(U_good.T @ V)[:3,:3]

array([[ 3.10316011e+00, -3.80696085e-01, -1.18223457e+00],
       [ 5.93826128e-17,  1.92460561e+00,  1.44292069e+00],
       [ 1.36393345e-16,  1.50014707e-17,  2.07764823e+00]])
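The upper-triangular matrix above is the triangular factor of a QR decomposition. As a hedged sketch of that connection, the block below compares a modified Gram-Schmidt result with `np.linalg.qr`; the helper name `mgs` and the seeded random matrix are assumptions for this example. The two factorizations should agree up to the sign of each column of Q (and the matching row of R), because Gram-Schmidt yields a positive diagonal in R while the library routine does not guarantee one.

```python
import numpy as np

def mgs(V):
    # Modified Gram-Schmidt on the columns of V (assumed linearly independent)
    U = np.array(V, dtype=float)
    for i in range(U.shape[1]):
        U[:, i] /= np.linalg.norm(U[:, i])
        U[:, i+1:] -= U[:, i:i+1] @ (U[:, i:i+1].T @ U[:, i+1:])
    return U

rng = np.random.default_rng(3)
V = rng.normal(size=(5, 5))

Q_mgs = mgs(V)
R_mgs = Q_mgs.T @ V            # upper triangular up to rounding error

Q_np, R_np = np.linalg.qr(V)
signs = np.sign(np.diag(R_np)) # flip signs so the diagonal of R becomes positive

print(np.max(np.abs(np.tril(R_mgs, -1))))           # size of the below-diagonal entries
print(np.max(np.abs(Q_np * signs - Q_mgs)))         # difference between the Q factors
print(np.max(np.abs(signs[:, None] * R_np - R_mgs)))  # difference between the R factors
```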