In [1]:
from IPython.display import display, Markdown
import pandas as pd
import numpy as np
import copy
def latexify(x):
    out = '$' + x + '$'
    return out

def lprint(x):
    # `latex` is a Sage built-in, so this notebook assumes a SageMath kernel
    display(Markdown(latexify(latex(x))))

# Implicit Function Theorem

The implicit function theorem is a core ingredient in bifurcation theory. In this notebook we will explore how to apply it algorithmically.

## Statement of theorem

We will consider functions of the form:

$$f : \mathbb{R}^{n+m} \rightarrow \mathbb{R}^m $$

usually k-times continuously differentiable for some $k \in \mathbb{N}$.

The implicit function theorem is used to characterise the zero set of the function around a known solution. Suppose (without loss of generality) that:

$$f(0) = 0$$

We then split $\mathbb{R}^{n+m}$ into the product of two spaces:

$$\mathbb{R}^{n+m} = \mathbb{R}^n \times \mathbb{R}^m$$ 

From here on we denote $\mathbb{R}^n = X$ and $\mathbb{R}^m = Y$

So that:

$$f : X \times Y \rightarrow \mathbb{R}^m$$

If we then suppose that the derivative with respect to $Y$ is an isomorphism at 0, i.e.

$$\partial_{Y}f(0,0) \text{ is invertible,}$$

then the Implicit Function Theorem states that there exist open subsets $U_{X} \subset X$ and $U_{Y} \subset Y$, each containing 0, and a k-times differentiable function

$$h : U_{X} \rightarrow U_{Y}$$

that parametrises the zero set of $f$ close to zero:

$$\{ (x,y) \in U_{X} \times U_{Y} \;| \;\; f(x,y) = 0 \} = \{ (x,h(x)) \;\; | \;\; x \in U_{X} \}$$
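As a quick sanity check of the statement, here is a minimal sketch in plain Python for one hand-picked scalar example ($n = m = 1$). The function $f(x,y) = x^2 + (y+1)^2 - 1$ and the explicit branch $h(x) = \sqrt{1 - x^2} - 1$ are assumptions chosen for illustration; they are not part of the notebook's own code.

```python
import math

def f(x, y):
    # Unit circle shifted down by one, so that f(0, 0) = 0.
    return x**2 + (y + 1)**2 - 1

def h(x):
    # Explicit branch of the zero set through (0, 0); valid for |x| < 1.
    return math.sqrt(1 - x**2) - 1

# The derivative of f with respect to y at the origin is 2*(0 + 1) = 2, which is invertible.
f_y_at_origin = 2 * (0 + 1)

# Residuals of f along the graph of h for a few points near 0.
residuals = [abs(f(x, h(x))) for x in (-0.3, -0.1, 0.0, 0.1, 0.3)]
print(max(residuals))
```

The residuals vanish up to floating-point rounding, which is what the parametrisation of the zero set by $(x, h(x))$ predicts for this example.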

## Computing $h$

We will now construct an algorithm that can produce a Taylor polynomial approximation of the function $h$, assuming we know that $f$ meets the conditions required.

Recall that:

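$$h(x) = h(a) + \frac{h'(a)}{1!}(x-a) + \frac{h''(a)}{2!}(x-a)^2 + \frac{h'''(a)}{3!}(x-a)^3+\dotsb = \sum_{k=0}^\infty \frac{h^{\left(k\right)}(a)}{k!} (x-a)^k$$

is the Taylor series of $h$ at $a$. Truncating the sum after the term of order $K$ gives the $K$-th order Taylor polynomial, and here we expand at $a = 0$.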
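To make the role of the derivatives concrete, the following sketch evaluates a truncated Taylor sum in the scalar case from a list of derivative values. The helper name `taylor_polynomial` is hypothetical, and the exponential function is used only because all of its derivatives at 0 are known to equal 1.

```python
from math import factorial, exp

def taylor_polynomial(derivatives, x, a=0.0):
    """Evaluate sum_k derivatives[k] / k! * (x - a)^k.

    derivatives[k] is assumed to hold the k-th derivative of h at a.
    """
    return sum(d / factorial(k) * (x - a) ** k for k, d in enumerate(derivatives))

# Every derivative of exp at 0 equals 1, so this is the order-5 polynomial of exp.
approx = taylor_polynomial([1.0] * 6, 0.1)
print(approx, exp(0.1))
```

Once the derivatives of $h$ at 0 are known, the same kind of sum gives the approximation of $h$ we are after.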

We require a way to find the derivatives of $h$, specifically:

$\partial_{X}h$,     $\partial_{XX}h$,     $\partial_{XXX}h$ 

and so on

To do this observe that since $f(x, h(x)) = 0 \;\;\forall x \in U_{X}$

$$ \partial_{X}^{k}f(x,h(x)) = 0 \;\; \forall k \in \mathbb{N}, \forall x \in U_{X}$$

The first couple of applications of this idea are:

$$ 0 = f(x, h(x))$$

$$ 0 = f_{X} + f_{Y}h_{X}$$ 

$$ 0 = f_{XX} + 2f_{XY}h_{X} + f_{YY}h_{X}^2 + f_{Y}h_{XX}$$

(now omitting inner variables for brevity) 

https://math.stackexchange.com/questions/2037753/implicit-function-theorem-second-derivative-calculation-help

By assumption $f_{Y}$ is invertible at $(0,0)$, so evaluating these equations there lets us solve for the values of $h_{X}$ and $h_{XX}$.

Note that:

$$ h(0) = 0$$ 

This is forced: $(0,0)$ lies in the zero set, and every point of the zero set in $U_{X} \times U_{Y}$ has the form $(x, h(x))$, so $(0,0) = (0, h(0))$.
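In the scalar case ($n = m = 1$) the first two equations can be rearranged by hand to $h_{X} = -f_{X}/f_{Y}$ and $h_{XX} = -(f_{XX} + 2f_{XY}h_{X} + f_{YY}h_{X}^2)/f_{Y}$. The sketch below implements just that rearrangement; the function name and the two test functions are assumptions made for illustration.

```python
def first_two_derivatives(f_x, f_y, f_xx, f_xy, f_yy):
    """Solve the first- and second-order equations for h_X and h_XX (scalar case).

    All arguments are partial derivatives of f evaluated at (0, 0).
    """
    if f_y == 0:
        raise ValueError("f_Y must be invertible (non-zero in the scalar case)")
    h_x = -f_x / f_y
    h_xx = -(f_xx + 2 * f_xy * h_x + f_yy * h_x**2) / f_y
    return h_x, h_xx

# f(x, y) = x^2 + (y + 1)^2 - 1: f_X = 0, f_Y = 2, f_XX = 2, f_XY = 0, f_YY = 2 at (0, 0).
circle = first_two_derivatives(0.0, 2.0, 2.0, 0.0, 2.0)

# f(x, y) = y - sin(x), so h(x) = sin(x): f_X = -1, f_Y = 1, second partials vanish at (0, 0).
sine = first_two_derivatives(-1.0, 1.0, 0.0, 0.0, 0.0)
print(circle, sine)
```

For the circle the explicit solution is $h(x) = \sqrt{1-x^2} - 1$, whose first two derivatives at 0 are $0$ and $-1$; for the second function $h = \sin$, with derivatives $1$ and $0$.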

### Iterating the derivative

First we consider how to generate equations of the form above. We can split each equation at its plus signs into generalised terms of the form:


$$ c \; f_{S_{1}S_{2} \cdots S_{k}}(x,h(x)) \cdot \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}) \right] $$

Where:

$$ S_{i} \in \{X,Y\} $$

And the $P_{i}$'s are either the identity map $id: X \rightarrow X$ or an $l_{i}$-th derivative of $h$.

The $X_{\cdots}$ arguments indicate where to put the input spaces' values. The $P_{i}$ maps can be viewed as "premultiples", preparing the input from $X \times \cdots \times X$ to be fed into the partial derivative of $f$.

Previously little attention was paid to the ordering, but now it is imperative to keep track of how the derivatives develop.

Translating the earlier work into this format (now omit $(x,h(x))$ for brevity):

$$ 0 = f$$

$$ 0 = f_{X} \cdot \left[ id(X_{1}) \right] + f_{Y} \cdot \left[ h_{X}(X_{1}) \right]$$ 

$$ 0 = f_{XX} \cdot \left[ id(X_{1}), id(X_{2}) \right] + f_{XY} \cdot \left[ id(X_{1}),h_{X}(X_{2}) \right] + f_{YX} \cdot \left[ h_{X}(X_{1}),id(X_{2}) \right]+ f_{YY} \cdot \left[ h_{X}(X_{1}), h_{X}(X_{2}) \right] + f_{Y} \cdot \left[ h_{XX}(X_{1}, X_{2}) \right] $$

Note: here taking a further derivative is indicated by appending $X$ or $Y$ to the end, contrary to common use. The reason for this is that it makes the book-keeping a lot easier.

If we can form an encoding scheme for terms like this, and a way to differentiate them with respect to $X$, then we have a method to generate arbitrarily high order equations to solve.

#### Differentiating

We consider:

$$ \partial_{X} \left( c \; f_{S_{1}S_{2} \cdots S_{k}} \cdot \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}) \right] \right)$$


$$ = c \left[ \left( \partial_{X}f_{S_{1}S_{2} \cdots S_{k}}  \right)\cdot \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}) \right] + f_{S_{1}S_{2} \cdots S_{k}} \cdot \left( \partial_{X} \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}) \right]  \right) \right] $$

Splitting this into the two expressions either side of the $+$

$$  \left( \partial_{X}f_{S_{1}S_{2} \cdots S_{k}}  \right)\cdot \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}) \right]  = f_{S_{1}S_{2} \cdots S_{k}X} \cdot \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}), id(X_{n+1}) \right] + f_{S_{1}S_{2} \cdots S_{k}Y} \cdot \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}), h_{X}(X_{n+1}) \right]
$$

This happens since the partial derivative is evaluated at $(x,h(x))$. Here $n$ is the order of the original term, i.e. the maximum index appearing among the $X_{\cdots}$.

And:

$$ f_{S_{1}S_{2} \cdots S_{k}} \cdot \left( \partial_{X} \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}) \right]  \right) =  f_{S_{1}S_{2} \cdots S_{k}} \cdot \sum_{i = 1}^{k} { \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, (\partial_{X}P_{i})(X_{i_{i,1}}, \cdots, X_{i_{i,l_{i}}}, X_{n+1}) , \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}) \right] } $$

Here we abuse notation by using $i$ twice; unfortunately we have run out of the usual indices.

Observe that the $X_{n+1}$ is now embedded inside the premultiples.

When $P_{i}$ is the identity, $\partial_{X}P_{i} = \partial_{X}id = 0$, and so as expected the sum only picks up non-trivial premultiples.


While very complicated, we see that all the terms we computed stay as sums of expressions of the form:

$$ c \; f_{S_{1}S_{2} \cdots S_{k}}(x,h(x)) \cdot \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}) \right] $$

So inductively we see how the code can proceed
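Before building any classes, the rule can be sketched on plain tuples to see how the number of terms grows. Everything here (the tuple layout, the names `diff_term` and `diff_all`, dropping the constant $c$) is an assumption made for illustration and is not the notebook's own encoding.

```python
def diff_term(term):
    """Differentiate one term with respect to X.

    A term is (partials, slots, n): partials is a string such as 'XY',
    each slot is ('id' or 'h', tuple of X indices) and n is the number of
    X inputs used so far. The constant c is left out of this sketch.
    """
    partials, slots, n = term
    new = n + 1
    out = [
        # Chain rule on f: append an X partial (fed by id) or a Y partial (fed by h_X).
        (partials + "X", slots + (("id", (new,)),), new),
        (partials + "Y", slots + (("h", (new,)),), new),
    ]
    # Product rule on the premultiples: id differentiates to zero, h gains an input.
    for i, (kind, xs) in enumerate(slots):
        if kind == "h":
            bumped = slots[:i] + (("h", xs + (new,)),) + slots[i + 1:]
            out.append((partials, bumped, new))
    return out

def diff_all(terms):
    # Differentiate a sum of terms term by term.
    return [d for term in terms for d in diff_term(term)]

terms = [("", (), 0)]  # the bare function f
counts = [len(terms)]
for _ in range(4):
    terms = diff_all(terms)
    counts.append(len(terms))
print(counts)  # [1, 2, 5, 15, 52]
```

The counts show how quickly the equations grow, which is the motivation for automating the book-keeping rather than differentiating by hand.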

#### Coding the expression block storage + representation

Now we build the framework to do this. To stay in the spirit of Sage, a LaTeX output will be available for expressions of the form:

$$ c \; f_{S_{1}S_{2} \cdots S_{k}}(x,h(x)) \cdot \left[P_{1}(X_{i_{1,1}}, \cdots, X_{i_{1,l_{1}}}), \cdots, P_{k}(X_{i_{k,1}}, \cdots, X_{i_{k,l_{k}}}) \right] $$

Also implemented is the differentiation rule derived earlier, returning a list of ExpressionBlock objects

In [2]:
class ExpressionBlock:
    def __init__(self, c = 1, s_string = '', n = 0, p_dict = None, xs_dict = None, func1 = 'f', func2 = 'h', X = 'X', Y = 'Y'):
        self.c = c # constant
        self.s_string = s_string # partials, i.e. 'XXY'
        self.n = n # number of X inputs the expression has
        self.k = len(s_string) # number of slots, one premultiple per partial
        # None defaults avoid sharing one mutable dict between instances
        self.p_dict = {} if p_dict is None else p_dict # P functions, 0 --> id, otherwise the order of the partial of h
        self.xs_dict = {} if xs_dict is None else xs_dict # which X's go into each P
        
        # the rest are just for the cosmetic output
        self.func1 = func1 
        self.func2 = func2
        self.X = X
        self.Y = Y
        
    def __str__(self):
        # override the string representation so the block can be printed as LaTeX
        if self.c == 0:
            return ""
        
        out = ""
        # add the constant
        if self.c != 1:
            out += str(self.c) + " "
            
        # add f
        out += self.func1
        
        # add partials if there are any
        if len(self.s_string) != 0:
            out += "_{"
            out += self.s_string
            out += "} "
            out += r"\cdot \left["
            

        for i in range(1,self.k + 1):
            if self.p_dict[i] == 0:
                out += "id("
            else:
                out += "h_{"
                out += "X"*len(self.xs_dict[i])
                out += "}("
                
            for xi in self.xs_dict[i]:
                out += "X_{" + str(xi) + "}, "
                
            # inner excess comma deletion
            out = out[:-2]
            
            out += "), "
        
        if len(self.s_string) != 0:
            out = out[:-2] # outer excess comma deletion
            out += r"\right]"       
            
                
        # and we are done, here just list out the h terms               
        return out

    def _latex_(self):
        # so works with lprint
        return str(self)
    
    def diff(self):
        # apply the rule we worked out before
        # returns a list of ExpressionBlock instances
        
        out = []
        # first half
        temp = copy.deepcopy(self)
        temp.n += 1
        temp.s_string += 'X'
        temp.k += 1
        temp.p_dict[temp.k] = 0 # identity
        temp.xs_dict[temp.k] = (temp.n,)
        out.append(temp)
        
        temp = copy.deepcopy(self)
        temp.n += 1
        temp.s_string += 'Y'
        temp.k += 1
        temp.p_dict[temp.k] = 1 # first derivative of h
        temp.xs_dict[temp.k] = (temp.n,)
        out.append(temp)
        
        # second half
        # loop over each term in the sum
        for i in range(1,self.k + 1):
            # slots are numbered from 1 to k
            if self.p_dict[i] == 0:
                # differentiating id --> zeros out
                continue 
            
            temp = copy.deepcopy(self)
            temp.n += 1
            temp.p_dict[i] = temp.p_dict[i] + 1 # take a further derivative of h
            new_xs = list(temp.xs_dict[i])
            new_xs.append(temp.n)
            
            temp.xs_dict[i] = tuple(new_xs)

            out.append(temp)
            
        # and we are done
        return out
            
            
            

        
        
k1 = ExpressionBlock(c = 3, s_string = 'XYX', n = 5,
                     p_dict = {1 : 0, 2: 3, 3 : 0},
                     xs_dict = {1 : (1,), 2 : (3,4,5), 3 : (2,)},)
lprint(k1)
k2 = ExpressionBlock()
lprint(k2)
lprint(k2.diff())

#[lprint(i) for i in k2.diff()]

$ 3 f_{XYX} \cdot \left[id(X_{1}), h_{XXX}(X_{3}, X_{4}, X_{5}), id(X_{2})\right] $

$ f $

$ \left[f_{X} \cdot \left[id(X_{1})\right], f_{Y} \cdot \left[h_{X}(X_{1})\right]\right] $

We see that, at least for $f$, the rule works. Let's wrap this up into a new class that deals with multiple blocks.

In [3]:
class Expression:
    def __init__(self, blocks = None):
        if blocks is None:
            # default to the bare function f; built per call because lists are mutable
            self.blocks = [ExpressionBlock()]
        else:
            self.blocks = blocks
            
    def __str__(self):
        if not self.blocks:
            # empty list
            return ""
        out = ""
        for block in self.blocks:
            # should be an instance of the ExpressionBlock class
            out += str(block) + " + "
        
        out = out[:-3] # remove last plus
        
        return out
    
    def _latex_(self):
        # so works with lprint
        return str(self)
    
    
    def diff(self):
        # returns a new Expression object that is the partial x derivative of the old one
        out = []
        for block in self.blocks:
            block_list = block.diff()
            
            # append to the list
            out += block_list
            
        return Expression(blocks = out)
            
        
            
    

e = Expression(blocks = [k1])
lprint(e)
lprint(e.diff())

$ 3 f_{XYX} \cdot \left[id(X_{1}), h_{XXX}(X_{3}, X_{4}, X_{5}), id(X_{2})\right] $

$ 3 f_{XYXX} \cdot \left[id(X_{1}), h_{XXX}(X_{3}, X_{4}, X_{5}), id(X_{2}), id(X_{6})\right] + 3 f_{XYXY} \cdot \left[id(X_{1}), h_{XXX}(X_{3}, X_{4}, X_{5}), id(X_{2}), h_{X}(X_{6})\right] + 3 f_{XYX} \cdot \left[id(X_{1}), h_{XXXX}(X_{3}, X_{4}, X_{5}, X_{6}), id(X_{2})\right] $

This all seems to be working fine

In [4]:
b = ExpressionBlock()
lprint(b)

$ f $

Now put it into an expression so we can differentiate:

In [5]:
e = Expression(blocks = [b])
lprint(e)

$ f $

In [6]:
lprint(e.diff())

$ f_{X} \cdot \left[id(X_{1})\right] + f_{Y} \cdot \left[h_{X}(X_{1})\right] $

In [7]:
lprint(e.diff().diff())

$ f_{XX} \cdot \left[id(X_{1}), id(X_{2})\right] + f_{XY} \cdot \left[id(X_{1}), h_{X}(X_{2})\right] + f_{YX} \cdot \left[h_{X}(X_{1}), id(X_{2})\right] + f_{YY} \cdot \left[h_{X}(X_{1}), h_{X}(X_{2})\right] + f_{Y} \cdot \left[h_{XX}(X_{1}, X_{2})\right] $

We see that this is working as we expect. Observe how the symmetry of the total expression is conserved.

In [8]:
lprint(e.diff().diff().diff())

$ f_{XXX} \cdot \left[id(X_{1}), id(X_{2}), id(X_{3})\right] + f_{XXY} \cdot \left[id(X_{1}), id(X_{2}), h_{X}(X_{3})\right] + f_{XYX} \cdot \left[id(X_{1}), h_{X}(X_{2}), id(X_{3})\right] + f_{XYY} \cdot \left[id(X_{1}), h_{X}(X_{2}), h_{X}(X_{3})\right] + f_{XY} \cdot \left[id(X_{1}), h_{XX}(X_{2}, X_{3})\right] + f_{YXX} \cdot \left[h_{X}(X_{1}), id(X_{2}), id(X_{3})\right] + f_{YXY} \cdot \left[h_{X}(X_{1}), id(X_{2}), h_{X}(X_{3})\right] + f_{YX} \cdot \left[h_{XX}(X_{1}, X_{3}), id(X_{2})\right] + f_{YYX} \cdot \left[h_{X}(X_{1}), h_{X}(X_{2}), id(X_{3})\right] + f_{YYY} \cdot \left[h_{X}(X_{1}), h_{X}(X_{2}), h_{X}(X_{3})\right] + f_{YY} \cdot \left[h_{XX}(X_{1}, X_{3}), h_{X}(X_{2})\right] + f_{YY} \cdot \left[h_{X}(X_{1}), h_{XX}(X_{2}, X_{3})\right] + f_{YX} \cdot \left[h_{XX}(X_{1}, X_{2}), id(X_{3})\right] + f_{YY} \cdot \left[h_{XX}(X_{1}, X_{2}), h_{X}(X_{3})\right] + f_{Y} \cdot \left[h_{XXX}(X_{1}, X_{2}, X_{3})\right] $

In [9]:
lprint(e.diff().diff().diff().diff())

$ f_{XXXX} \cdot \left[id(X_{1}), id(X_{2}), id(X_{3}), id(X_{4})\right] + f_{XXXY} \cdot \left[id(X_{1}), id(X_{2}), id(X_{3}), h_{X}(X_{4})\right] + f_{XXYX} \cdot \left[id(X_{1}), id(X_{2}), h_{X}(X_{3}), id(X_{4})\right] + f_{XXYY} \cdot \left[id(X_{1}), id(X_{2}), h_{X}(X_{3}), h_{X}(X_{4})\right] + f_{XXY} \cdot \left[id(X_{1}), id(X_{2}), h_{XX}(X_{3}, X_{4})\right] + f_{XYXX} \cdot \left[id(X_{1}), h_{X}(X_{2}), id(X_{3}), id(X_{4})\right] + f_{XYXY} \cdot \left[id(X_{1}), h_{X}(X_{2}), id(X_{3}), h_{X}(X_{4})\right] + f_{XYX} \cdot \left[id(X_{1}), h_{XX}(X_{2}, X_{4}), id(X_{3})\right] + f_{XYYX} \cdot \left[id(X_{1}), h_{X}(X_{2}), h_{X}(X_{3}), id(X_{4})\right] + f_{XYYY} \cdot \left[id(X_{1}), h_{X}(X_{2}), h_{X}(X_{3}), h_{X}(X_{4})\right] + f_{XYY} \cdot \left[id(X_{1}), h_{XX}(X_{2}, X_{4}), h_{X}(X_{3})\right] + f_{XYY} \cdot \left[id(X_{1}), h_{X}(X_{2}), h_{XX}(X_{3}, X_{4})\right] + f_{XYX} \cdot \left[id(X_{1}), h_{XX}(X_{2}, X_{3}), id(X_{4})\right] + f_{XYY} \cdot \left[id(X_{1}), h_{XX}(X_{2}, X_{3}), h_{X}(X_{4})\right] + f_{XY} \cdot \left[id(X_{1}), h_{XXX}(X_{2}, X_{3}, X_{4})\right] + f_{YXXX} \cdot \left[h_{X}(X_{1}), id(X_{2}), id(X_{3}), id(X_{4})\right] + f_{YXXY} \cdot \left[h_{X}(X_{1}), id(X_{2}), id(X_{3}), h_{X}(X_{4})\right] + f_{YXX} \cdot \left[h_{XX}(X_{1}, X_{4}), id(X_{2}), id(X_{3})\right] + f_{YXYX} \cdot \left[h_{X}(X_{1}), id(X_{2}), h_{X}(X_{3}), id(X_{4})\right] + f_{YXYY} \cdot \left[h_{X}(X_{1}), id(X_{2}), h_{X}(X_{3}), h_{X}(X_{4})\right] + f_{YXY} \cdot \left[h_{XX}(X_{1}, X_{4}), id(X_{2}), h_{X}(X_{3})\right] + f_{YXY} \cdot \left[h_{X}(X_{1}), id(X_{2}), h_{XX}(X_{3}, X_{4})\right] + f_{YXX} \cdot \left[h_{XX}(X_{1}, X_{3}), id(X_{2}), id(X_{4})\right] + f_{YXY} \cdot \left[h_{XX}(X_{1}, X_{3}), id(X_{2}), h_{X}(X_{4})\right] + f_{YX} \cdot \left[h_{XXX}(X_{1}, X_{3}, X_{4}), id(X_{2})\right] + f_{YYXX} \cdot \left[h_{X}(X_{1}), h_{X}(X_{2}), id(X_{3}), id(X_{4})\right] + f_{YYXY} \cdot \left[h_{X}(X_{1}), 
h_{X}(X_{2}), id(X_{3}), h_{X}(X_{4})\right] + f_{YYX} \cdot \left[h_{XX}(X_{1}, X_{4}), h_{X}(X_{2}), id(X_{3})\right] + f_{YYX} \cdot \left[h_{X}(X_{1}), h_{XX}(X_{2}, X_{4}), id(X_{3})\right] + f_{YYYX} \cdot \left[h_{X}(X_{1}), h_{X}(X_{2}), h_{X}(X_{3}), id(X_{4})\right] + f_{YYYY} \cdot \left[h_{X}(X_{1}), h_{X}(X_{2}), h_{X}(X_{3}), h_{X}(X_{4})\right] + f_{YYY} \cdot \left[h_{XX}(X_{1}, X_{4}), h_{X}(X_{2}), h_{X}(X_{3})\right] + f_{YYY} \cdot \left[h_{X}(X_{1}), h_{XX}(X_{2}, X_{4}), h_{X}(X_{3})\right] + f_{YYY} \cdot \left[h_{X}(X_{1}), h_{X}(X_{2}), h_{XX}(X_{3}, X_{4})\right] + f_{YYX} \cdot \left[h_{XX}(X_{1}, X_{3}), h_{X}(X_{2}), id(X_{4})\right] + f_{YYY} \cdot \left[h_{XX}(X_{1}, X_{3}), h_{X}(X_{2}), h_{X}(X_{4})\right] + f_{YY} \cdot \left[h_{XXX}(X_{1}, X_{3}, X_{4}), h_{X}(X_{2})\right] + f_{YY} \cdot \left[h_{XX}(X_{1}, X_{3}), h_{XX}(X_{2}, X_{4})\right] + f_{YYX} \cdot \left[h_{X}(X_{1}), h_{XX}(X_{2}, X_{3}), id(X_{4})\right] + f_{YYY} \cdot \left[h_{X}(X_{1}), h_{XX}(X_{2}, X_{3}), h_{X}(X_{4})\right] + f_{YY} \cdot \left[h_{XX}(X_{1}, X_{4}), h_{XX}(X_{2}, X_{3})\right] + f_{YY} \cdot \left[h_{X}(X_{1}), h_{XXX}(X_{2}, X_{3}, X_{4})\right] + f_{YXX} \cdot \left[h_{XX}(X_{1}, X_{2}), id(X_{3}), id(X_{4})\right] + f_{YXY} \cdot \left[h_{XX}(X_{1}, X_{2}), id(X_{3}), h_{X}(X_{4})\right] + f_{YX} \cdot \left[h_{XXX}(X_{1}, X_{2}, X_{4}), id(X_{3})\right] + f_{YYX} \cdot \left[h_{XX}(X_{1}, X_{2}), h_{X}(X_{3}), id(X_{4})\right] + f_{YYY} \cdot \left[h_{XX}(X_{1}, X_{2}), h_{X}(X_{3}), h_{X}(X_{4})\right] + f_{YY} \cdot \left[h_{XXX}(X_{1}, X_{2}, X_{4}), h_{X}(X_{3})\right] + f_{YY} \cdot \left[h_{XX}(X_{1}, X_{2}), h_{XX}(X_{3}, X_{4})\right] + f_{YX} \cdot \left[h_{XXX}(X_{1}, X_{2}, X_{3}), id(X_{4})\right] + f_{YY} \cdot \left[h_{XXX}(X_{1}, X_{2}, X_{3}), h_{X}(X_{4})\right] + f_{Y} \cdot \left[h_{XXXX}(X_{1}, X_{2}, X_{3}, X_{4})\right] $

### How to evaluate?

This is a good exercise, but how can we use it to get the Taylor polynomial of $h$? Observe that at the $n$-th differentiation level, each term is evaluated on exactly $n$ vectors $v \in X$,

recalling that we can view derivatives as multilinear maps.

### Extracting the Taylor polynomial

We want to get a polynomial from the k-th partial of $h$. To do this we evaluate it on combinations of the basis vectors to get the coefficients of the terms in the polynomial. For example, if we want to find the coefficient of:

$$x_{1}^2 x_{3}$$

then we would evaluate:


$$h_{XXX}(e_{1}, e_{1}, e_{3})$$

Well, almost. In fact we would need to combine the coefficients of:

$$x_{1} x_{1} x_{3}, \;\; x_{1} x_{3} x_{1}, \;\; x_{3} x_{1} x_{1}$$

by summing them.

See that the formula for the number of these evaluations is:

$$\frac{3!}{2! 1!}$$

This is because we are counting permutations of the three factors, where some are indistinguishable since the variables have the same name.

In general:

$$n_{\text{evals}}\left(\prod_{i = 1}^{n} {x_{i}^{k_{i}}}\right) = \frac{(k_{1} + \cdots + k_{n})!}{k_{1}! \cdots k_{n}!} $$

But by symmetry of the derivative multilinear map we only need to evaluate once, then multiply by this $n_{\text{evals}}$ value (the $1/(k_{1} + \cdots + k_{n})!$ prefactor from the Taylor series still has to be applied) and we are done.
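The counting formula is easy to check by brute force. The sketch below computes the count from the exponents, with the factorial of the total degree in the numerator, and compares it with an explicit enumeration of orderings; the function name `n_evals` simply mirrors the symbol used above.

```python
from itertools import permutations
from math import factorial

def n_evals(exponents):
    """Number of distinct orderings of the factors of prod x_i^{k_i}."""
    count = factorial(sum(exponents))
    for k in exponents:
        count //= factorial(k)
    return count

# x_1^2 x_3 in three variables has exponents (2, 0, 1).
print(n_evals((2, 0, 1)))  # 3

# Brute-force check: distinct orderings of the index list [1, 1, 3].
orderings = set(permutations([1, 1, 3]))
print(sorted(orderings))
```

The three orderings found by enumeration are exactly the three products listed in the example.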

In [10]:
# TODO will need to use tensors
# since h will lead to arbitrary elements of Y
# numpy implementation probably the best way to go
# but will need a way to extract partial information in a sensible way to build the tensor

### Tensor implementation

I will implement a symbolic tensor class. Interestingly, the pandas DataFrame class supports both multi-indexing and custom index-collapsing operations, so I will store the tensor information in one of these for now.
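To show what such a multi-index looks like, here is a plain-Python stand-in that uses a dictionary keyed by tuples instead of a DataFrame. The function name `empty_xy_tensor` and the dictionary layout are hypothetical; only the labelling scheme (`x1`, `x2`, ..., `y1`, `y2`, ...) follows the class implemented later.

```python
from itertools import product

def empty_xy_tensor(x_dim, y_dim, xy_order):
    """Dictionary keyed by multi-indices such as ('x1', 'y2', 'x3'), all entries zero."""
    labels = {
        "X": [f"x{i}" for i in range(1, x_dim + 1)],
        "Y": [f"y{i}" for i in range(1, y_dim + 1)],
    }
    # One list of labels per slot, in the order given by xy_order.
    axes = [labels[space.upper()] for space in xy_order]
    return {index: 0 for index in product(*axes)}

tensor = empty_xy_tensor(x_dim=3, y_dim=2, xy_order="XYX")
print(len(tensor))        # 3 * 2 * 3 entries
print(list(tensor)[:3])
```

Each key names one partial derivative, so an `XYX` tensor with $\dim X = 3$ and $\dim Y = 2$ has 18 entries.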

We will aim to mimic the functionality of numpy's n-dimensional arrays. Here we build a 3-dimensional one:

In [11]:
x = np.array([[[56, 183, 1],
               [65, 164, 0]],
              [[85, 176, 1],
               [44, 164, 0]]])

In [12]:
print(x)

[[[ 56 183   1]
  [ 65 164   0]]

 [[ 85 176   1]
  [ 44 164   0]]]


In [13]:
x[0][1][1]

164

In [14]:
a = []
a.append(list(range(0, 3)))
print(a)

[[0, 1, 2]]


In [140]:
class SymbolicXYTensor():
        def __init__(self, x_dim, y_dim, xy_order):
            self.x_dim = x_dim
            self.y_dim = y_dim
            self.xy_order = xy_order
            self.data = None # filled in at the end of __init__
            
            # the sets from which we draw the possible multi-indices
            x_dims = list(range(1, x_dim+1)) 
            y_dims = list(range(1, y_dim+1))
            x_dims = ['x' + str(dim) for dim in x_dims]
            y_dims = ['y' + str(dim) for dim in y_dims]

            iterables = [] 

            self.size = 1
            for space in self.xy_order:
                if (space == 'x') or (space == 'X'):
                    iterables.append(x_dims)
                    self.size = self.size*x_dim 
                elif (space == 'y') or (space == 'Y'):
                    iterables.append(y_dims)
                    self.size = self.size*y_dim
                else:
                    raise ValueError('invalid xy_order syntax')

            multindex = pd.MultiIndex.from_product(iterables, names = list(range(1, len(xy_order) + 1)))

            self.data = pd.DataFrame(pd.Series(np.zeros(self.size), index = multindex), columns = ['data'] )
            
        def fill_from_function(self, function, var_dict, position):
            def row_func(row):
                partials = row.name # a tuple
                temp = function # will be differentiating
                #print(row.name)
                for partial in partials:
                    temp = temp.diff(var_dict[partial])
                    #lprint(temp)
                return temp(*position) # unpack tuple as coordinates

                
            self.data['data'] = self.data.apply(row_func, axis = 1)
            
        def vec_mult(self, vec):
            if len(self.xy_order) == 1:
                # dual space case
                return sum(self.data['data']*vec)
            
            out = SymbolicXYTensor(x_dim = self.x_dim, y_dim = self.y_dim, xy_order = self.xy_order[:-1])
            
            out.data = self.data.copy(deep = True)
            out.data['vec'] = list(vec)*int(out.size)
            out.data['data'] = out.data['data']*out.data['vec']
            
            # sum over the last index level, keeping the others
            # for some reason only sage integers work as level labels here
            out.data = out.data.groupby(level=[Integer(i) for i in range(1,len(self.xy_order))]).sum().drop(columns = 'vec')
            return out
   


In [141]:
var('x1 x2 x3 y1 y2')
var('l', latex_name=r'\lambda') # raw string so the backslash is not read as an escape sequence
f1(x1, x2, x3, y1, y2) = x1*x2*x3*y1 + cos(y2)
f2(x1, x2, x3, y1, y2) = sin(x1 + x2 + y1) + (x3 - y2)^3
lprint(f1)
lprint(f2)

$ \left( x_{1}, x_{2}, x_{3}, y_{1}, y_{2} \right) \ {\mapsto} \ x_{1} x_{2} x_{3} y_{1} + \cos\left(y_{2}\right) $

$ \left( x_{1}, x_{2}, x_{3}, y_{1}, y_{2} \right) \ {\mapsto} \ {\left(x_{3} - y_{2}\right)}^{3} + \sin\left(x_{1} + x_{2} + y_{1}\right) $

In [148]:
a = SymbolicXYTensor(x_dim = 3, y_dim = 2,xy_order = 'XYX')

a.fill_from_function(f2, {'x1' : x1, 'x2' : x2, 'x3' : x3, 'y1' : y1, 'y2' : y2}, (1,1,1,1,1))

vec1 = (1,2,1)
vec2 = (-1,1)
vec3 = (1,2,3)

a.data.head()

| 1 | 2 | 3 | data |
|---|---|---|---|
| x1 | y1 | x1 | -cos(3) |
| x1 | y1 | x2 | -cos(3) |
| x1 | y1 | x3 | 0 |
| x1 | y2 | x1 | 0 |
| x1 | y2 | x2 | 0 |


In [143]:
a.vec_mult(vec1).data

| 1 | 2 | data |
|---|---|---|
| x1 | y1 | -3*cos(3) |
| x1 | y2 | 0 |
| x2 | y1 | -3*cos(3) |
| x2 | y2 | 0 |
| x3 | y1 | 0 |
| x3 | y2 | -6 |


In [144]:
a.vec_mult(vec1).vec_mult(vec2).data

| 1 | data |
|---|---|
| x1 | 3*cos(3) |
| x2 | 3*cos(3) |
| x3 | -6 |


In [149]:
lprint(a.vec_mult(vec1).vec_mult(vec2).vec_mult(vec3))

$ 9 \, \cos\left(3\right) - 18 $

It works! We evaluated the slots of $f_{XYX}$ from right to left.

This tensor gave us the value for one component of the output, the one coming from the f2 function. The same process with f1 would give the other.
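The same right-to-left contraction can be sketched on nested Python lists, which gives something to compare against without pandas or Sage. The helper `contract_last` is hypothetical, and the vectors are arbitrary; the array is the 2 x 2 x 3 example built earlier in the notebook.

```python
def contract_last(tensor, vec):
    """Contract the last slot of a nested-list tensor with vec."""
    if not isinstance(tensor[0], list):
        # Innermost level: an ordinary dot product.
        return sum(t * v for t, v in zip(tensor, vec))
    return [contract_last(sub, vec) for sub in tensor]

# The 2 x 2 x 3 array used earlier in the notebook, as nested lists.
x = [[[56, 183, 1], [65, 164, 0]],
     [[85, 176, 1], [44, 164, 0]]]

step1 = contract_last(x, (1, 2, 3))    # shape 2 x 2
step2 = contract_last(step1, (1, -1))  # shape 2
step3 = contract_last(step2, (1, 1))   # scalar
print(step1, step2, step3)
```

Each call removes the last index, so three calls reduce the 3-dimensional array to a single number, just as three `vec_mult` calls reduce the `XYX` tensor.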

In [150]:
# TODO - check this works like the numpy version