# Partials etc. involving matrices

## Preliminaries

In [1]:
#%matplotlib widget
%matplotlib inline
%load_ext autoreload
%autoreload 2

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import sympy

### A few ways to get test numpy arrays

In [3]:
np.arange(3), np.arange(4,8), np.arange(5,1,-2)

(array([0, 1, 2]), array([4, 5, 6, 7]), array([5, 3]))

For experiments with multiplication, arrays of primes may be helpful:

In [4]:
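def arangep(n, starting_index=0):
    """Return n consecutive primes as an array, skipping the first starting_index primes."""
    # Grow sympy's prime sieve until it holds at least starting_index + n primes.
    sympy.sieve.extend_to_no(starting_index + n)
    # Note: _list is a private attribute of the sieve and may change between sympy versions.
    return np.array(sympy.sieve._list[starting_index:starting_index + n])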

In [5]:
arangep(5), arangep(4,2)

(array([ 2,  3,  5,  7, 11]), array([ 5,  7, 11, 13]))

In [6]:
M = arangep(4).reshape(2,2)
x = arangep(2,4)
# x = np.arange(2)+1
M,x

(array([[2, 3],
        [5, 7]]),
 array([11, 13]))

## Einstein summation notation

NumPy provides [Einstein summation](https://mathworld.wolfram.com/EinsteinSummation.html) operations through [einsum](https://numpy.org/devdocs/reference/generated/numpy.einsum.html). The notation follows three rules:
1. Repeated indices are implicitly summed over.
1. Each index can appear at most twice in any term.
1. Each term must contain identical non-repeated indices.

In [7]:
es = np.einsum

 $$a_{ik}a_{ij} \equiv \sum_{i} a_{ik}a_{ij}$$

In [8]:
es('ij,j', M, x), es('ij,i', M, x)

(array([ 61, 146]), array([ 87, 124]))
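The two calls above rely on `einsum`'s implicit output mode. As a minimal sketch, the same contractions can be written with an explicit `->` output index and compared with the `@` operator. The names `M_demo` and `x_demo` are introduced here only so that the block runs on its own; they hold the same values as `M` and `x`.

```python
import numpy as np

# Same values as M and x above, repeated so the block is self-contained
M_demo = np.array([[2, 3], [5, 7]])
x_demo = np.array([11, 13])

# 'ij,j->i' sums over the repeated index j: y_i = m_ij x_j, i.e. M @ x
y_rows = np.einsum('ij,j->i', M_demo, x_demo)

# 'ij,i->j' sums over the repeated index i: z_j = m_ij x_i, i.e. M.T @ x
y_cols = np.einsum('ij,i->j', M_demo, x_demo)

print(y_rows, y_cols)
print(np.array_equal(y_rows, M_demo @ x_demo),
      np.array_equal(y_cols, M_demo.T @ x_demo))
```

Writing the output index explicitly makes it easier to see which index survives and which one is summed away.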

___

# Partials

## Preliminaries

A matrix __M__ multiplies a (column) vector __x__ to its right to produce a (column) vector __y__:
$$ \mathbf{M} \mathbf{x} = \mathbf{y} $$
where
$$\begin{align}
\mathbf{x} &= \sum_{j=1}^{n} x_j \mathbf{\hat{x}}_j \\
\mathbf{y} &= \sum_{i=1}^{m} y_i \mathbf{\hat{y}}_i
\end{align}$$
and the $m \times n$ matrix $\mathbf{M}$ can be written
$$
\mathbf{M} =
\begin{bmatrix}
    m_{1,1} & \dots & m_{1,n} \\
    \vdots & \ddots & \vdots \\
    m_{m,1} & \dots & m_{m,n}
\end{bmatrix}
$$

A Python example, reusing `M` and `x` from above:

In [9]:
y = M @ x
y

array([ 61, 146])

Using Einstein summation notation, $y_i = m_{ij}x_j$:

In [10]:
np.einsum('ij,j', M, x)

array([ 61, 146])
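To make the implied sum concrete, here is a sketch that spells out $y_i = \sum_j m_{ij}x_j$ with plain loops. The names `M_demo`, `x_demo`, and `y_loop` are illustrative and not part of the notebook's state.

```python
import numpy as np

# Same values as M and x above, repeated so the block is self-contained
M_demo = np.array([[2, 3], [5, 7]])
x_demo = np.array([11, 13])

m, n = M_demo.shape
y_loop = np.zeros(m, dtype=M_demo.dtype)
for i in range(m):          # free index i labels the output component
    for j in range(n):      # repeated index j is summed over
        y_loop[i] += M_demo[i, j] * x_demo[j]

print(y_loop)
```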

## Partial derivative of a matrix multiply of a vector

### $\partial\mathbf{y} / \partial\mathbf{x}$

How does vector $\mathbf{y}$ vary with vector $\mathbf{x}$, with $\mathbf{M}$ held constant? I.e. what is $\partial\mathbf{y}/\partial\mathbf{x}$?

"In general, the partial derivative of an [n-ary](http://en.wikipedia.org/wiki/Arity) function $f(x_1, \dots, x_n)$ in the direction $x_i$ at the point $(a_1, \dots, a_n)$ [is defined](https://en.wikipedia.org/w/index.php?title=Partial_derivative) to be:"

$$\frac{\partial f}{\partial x_i}(a_1, \ldots, a_n) = \lim_{h \to 0}\frac{f(a_1, \ldots, a_i+h,\ldots,a_n) - f(a_1,\ldots, a_i, \dots,a_n)}{h} \tag{2.1} \label{partial}$$

The matrix equation $\mathbf{M} \mathbf{x} = \mathbf{y}$ can be written as
$$\begin{align}
  \mathbf{y}
  &= \mathbf{M}\mathbf{x} \\
  &=\sum_i y_i \mathbf{\hat{y}}_i \tag{2.2} \label{mmul}
\end{align}
$$

where
$$\begin{align}
y_i &= f_i(x_1, x_2, \dots x_n) \\
  &= \sum_j m_{ij}x_j \tag{2.3}
\end{align}
$$

Substituting (2.3) into (2.1):
$$ \normalsize
\begin{align}
  \frac{\partial y_i}{\partial x_j} &= \frac{\partial f_i(x_1, x_2, \ldots, x_n)}{\partial x_j} \\
  &= \lim_{h \to 0}\frac{
        \sum_{k=1}^{n} m_{ik}(x_k + \delta_{kj}h)
      - \sum_{k=1}^{n} m_{ik}x_k}{h} \\
  &=\lim_{h \to 0}\frac{
        \sum_{k=1}^{n} m_{ik}x_k
      + \sum_{k=1}^{n} m_{ik}\delta_{kj}h
      - \sum_{k=1}^{n} m_{ik}x_k}{h} \\
  &=\lim_{h \to 0}\frac{\sum_{k=1}^{n} m_{ik}\delta_{kj}h
      }{h} \\
&= \lim_{h \to 0}\frac{m_{ij}h}{h} \\
&= m_{ij}
\tag{2.4}
\end{align}
$$

where $\delta_{ij}$ is the [Kronecker delta function](https://mathworld.wolfram.com/KroneckerDelta.html):
$$ \delta_{ij} =
    \begin{cases}
            1 &         \text{for } i=j,\\
            0 &         \text{for } i\neq j.
    \end{cases}
$$
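The step in (2.4) that collapses $\sum_k m_{ik}\delta_{kj}$ to $m_{ij}$ can be checked numerically, since the Kronecker delta over two indices is just the identity matrix. The following is a small sketch; `M_demo` and `delta` are names chosen for this illustration.

```python
import numpy as np

# Same values as M above, repeated so the block is self-contained
M_demo = np.array([[2, 3], [5, 7]])

# delta[k, j] is 1 when k == j and 0 otherwise
delta = np.eye(2, dtype=int)

# sum_k m_ik delta_kj: the delta picks out the k == j term, leaving m_ij
contracted = np.einsum('ik,kj->ij', M_demo, delta)

print(np.array_equal(contracted, M_demo))
```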

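Hence, since (2.4) holds for every $i$ and $j$, the matrix of partial derivatives $\partial y_i/\partial x_j$ (the Jacobian of $\mathbf{y}$ with respect to $\mathbf{x}$) is $\mathbf{M}$ itself. Equivalently, because $\mathbf{y}$ is linear in $\mathbf{x}$, a change $\partial\mathbf{x}$ produces the change $\partial\mathbf{y} = \mathbf{M}\partial\mathbf{x}$:
$$\begin{align}
\partial\mathbf{y} &= \mathbf{M}\partial\mathbf{x} \\
\frac{\partial\mathbf{y}}{\partial\mathbf{x}} &= \mathbf{M} \tag{2.5}
\end{align}$$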

Approximating ([2.1](#mjx-eqn-partial)) numerically with our example:

In [11]:
(M@(x + np.array([0.001, 0])) - M@x) / 0.001, (M@(x + np.array([0, 0.001])) - M@x) / 0.001

(array([2., 5.]), array([3., 7.]))
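Each of the two results above is one column of `M`: perturbing $x_j$ yields the column of partials $\partial y_i/\partial x_j$. As a sketch, the columns can be assembled into a full finite-difference Jacobian and compared with `M`. The helper `numerical_jacobian` is a hypothetical name written for this illustration, not a NumPy function.

```python
import numpy as np

# Same values as M and x above, as floats, so the block is self-contained
M_demo = np.array([[2.0, 3.0], [5.0, 7.0]])
x_demo = np.array([11.0, 13.0])
h = 1e-3

def numerical_jacobian(f, x, h):
    """Forward-difference estimate of J[i, j] = d f_i / d x_j (illustrative helper)."""
    fx = f(x)
    J = np.empty((fx.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h                      # perturb only component j
        J[:, j] = (f(x + step) - fx) / h
    return J

J = numerical_jacobian(lambda v: M_demo @ v, x_demo, h)
print(np.allclose(J, M_demo))
```

Because $\mathbf{y}$ is linear in $\mathbf{x}$, the forward difference is exact up to rounding for any step size `h`.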

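Test (2.5) numerically: since $\mathbf{y}$ is linear in $\mathbf{x}$, $\mathbf{M}(\mathbf{x}+\boldsymbol{\epsilon}) - \mathbf{M}\mathbf{x}$ should equal $\mathbf{M}\boldsymbol{\epsilon}$ up to floating-point rounding. The cell below reports the largest squared error over 1000 random trials: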

In [18]:
max(err.dot(err)
    for err in (((M@(x + veps) - M@x) - M@veps)
              for M,x,veps in ((np.random.randn(2,2), np.random.randn(2), np.random.randn(2) * 0.001)
                          for i in range(1000))))

1.6722287350616936e-30
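The same result can also be checked symbolically. The sketch below uses sympy's `Matrix.jacobian` on a generic 2×2 matrix; the symbol names (`m11`, `x1`, and so on) and the variables `M_sym`, `x_sym`, `y_sym` are assumptions made for this illustration.

```python
import sympy

# Generic symbolic entries for a 2x2 matrix and a 2-vector
m11, m12, m21, m22, x1, x2 = sympy.symbols('m11 m12 m21 m22 x1 x2')
M_sym = sympy.Matrix([[m11, m12], [m21, m22]])
x_sym = sympy.Matrix([x1, x2])

# y = M x, then differentiate each component of y with respect to each x_j
y_sym = M_sym * x_sym
J_sym = y_sym.jacobian(x_sym)

print(J_sym == M_sym)
```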

### $\partial\mathbf{y} / \partial\mathbf{M}$

How does vector $\mathbf{y}$ vary with matrix $\mathbf{M}$, with vector $\mathbf{x}$ held constant? I.e. what is $\partial\mathbf{y}/\partial\mathbf{M}$?

From (2.3):
$$\begin{align}
 y_i &= \sum_j m_{ij}x_j \\
 \partial y_i &= \sum_j (\partial m_{ij})\,x_j
\end{align}
$$

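Differentiating (2.3) with respect to a single entry $m_{kl}$ gives $\partial y_i/\partial m_{kl} = \delta_{ik}x_l$: changing entry $(k,l)$ of $\mathbf{M}$ affects only component $y_k$, at rate $x_l$. Collecting the components in vector form, a change $\partial\mathbf{M}$ produces the change $\partial\mathbf{M}\mathbf{x}$ in $\mathbf{y}$, which is abbreviated here as $\partial\mathbf{y}/\partial\mathbf{M} = \mathbf{x}$:
$$\begin{align}
 \partial\mathbf{y} &= \partial\mathbf{M}\mathbf{x} \\
 \frac{\partial\mathbf{y}}{\partial\mathbf{M}} &= \mathbf{x}
\end{align}$$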

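Numeric demonstration: perturb one entry of `M` at a time and compare the finite-difference quotient with the product of the corresponding single-entry matrix and `x`.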

In [13]:
M, x, M@x

(array([[2, 3],
        [5, 7]]),
 array([11, 13]),
 array([ 61, 146]))

In [14]:
k11 = np.array([[1, 0], [0, 0]])
k12 = np.fliplr(k11)
k21 = np.flipud(k11)
k22 = np.fliplr(k21)
singles = (k11, k12, k21, k22)
singles

(array([[1, 0],
        [0, 0]]),
 array([[0, 1],
        [0, 0]]),
 array([[0, 0],
        [1, 0]]),
 array([[0, 0],
        [0, 1]]))

In [15]:
[((M+(e*0.001))@x - M@x) / 0.001 for e in singles]

[array([11.,  0.]), array([13.,  0.]), array([ 0., 11.]), array([ 0., 13.])]

In [16]:
[e@x for e in singles]

[array([11,  0]), array([13,  0]), array([ 0, 11]), array([ 0, 13])]
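The four vectors above can be read as slices of a three-index array $T_{ikl} = \partial y_i/\partial m_{kl} = \delta_{ik}x_l$. As a sketch under that reading, the block below builds `T` with `einsum` and compares it with finite differences taken one matrix entry at a time. The names `T`, `T_fd`, `E`, `M_demo`, and `x_demo` are introduced only for this illustration.

```python
import numpy as np

# Same values as M and x above, as floats, so the block is self-contained
M_demo = np.array([[2.0, 3.0], [5.0, 7.0]])
x_demo = np.array([11.0, 13.0])
h = 1e-3

# T[i, k, l] = delta_ik * x_l (outer product of the identity with x)
T = np.einsum('ik,l->ikl', np.eye(2), x_demo)

# Finite-difference estimate: perturb entry (k, l) of M and watch y change
T_fd = np.empty_like(T)
for k in range(2):
    for l in range(2):
        E = np.zeros((2, 2))
        E[k, l] = 1.0                    # single-entry matrix, like k11..k22
        T_fd[:, k, l] = ((M_demo + h * E) @ x_demo - M_demo @ x_demo) / h

print(np.allclose(T, T_fd))
```

Here `T[:, 0, 0]` corresponds to the result for `k11` above, `T[:, 0, 1]` to `k12`, and so on.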

Test numerically: create a random vector `x` and random matrices `M` and `dM`. Use a finite-difference approximation of (2.1) to estimate the change in $\mathbf{y}$ per unit step along `dM`, and compare it with $\partial\mathbf{M}\mathbf{x}$ (`dM@x`). Report the maximum squared error over 1000 random trials.

In [23]:
max(v.dot(v)
    for v in (dM@x - (((M+(dM*0.001))@x - M@x) / 0.001)
              for M,dM,x in ((np.random.randn(2,2), np.random.randn(2,2), np.random.randn(2))
                          for i in range(1000))))

2.103932709864864e-24

# END
---