# Implementation of zero-divergence Inference Learning in a Predictive Coding Network

## Predictive Coding Network

A predictive coding network is a hierarchical probabilistic model in which the variables on each level are predicted from the variables on the level above.

Variables on adjacent levels are assumed to be related by

$$ P(x_i^l | \bar x^{l+1}) = \mathcal{N}( x_i^l; \mu_i^l, \Sigma_i^l) $$

where

$$ \mu_i^l = {\theta_i^l}^T f(\bar x^{l+1}) $$
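As a rough illustration of the two definitions above, the sketch below computes the predicted mean of one level from the level above it and evaluates the Gaussian log-density. The choice of `tanh` for $f$, the unit variance, the layer sizes, and the function names are all assumptions made for this example; the text does not fix any of them.

```python
import numpy as np


def f(x):
    # Assumed nonlinearity; the text leaves f unspecified.
    return np.tanh(x)


def predicted_mean(theta, x_above):
    # Row i of theta holds theta_i^l, so this is mu_i^l = theta_i^l^T f(x^{l+1}).
    return theta @ f(x_above)


def log_density(x, mu, variance=1.0):
    # Sum over i of ln N(x_i; mu_i, variance), with a shared scalar variance.
    return np.sum(-0.5 * np.log(2 * np.pi * variance)
                  - 0.5 * (x - mu) ** 2 / variance)


rng = np.random.default_rng(0)
x_above = rng.normal(size=3)      # activity of level l+1 (3 units, assumed)
theta = rng.normal(size=(4, 3))   # weights predicting level l (4 units, assumed)
x = rng.normal(size=4)            # activity of level l

mu = predicted_mean(theta, x_above)
log_p = log_density(x, mu)
```

The log-density is largest when `x` equals `mu`, which is what the objective below rewards.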

The objective is to maximize

$$ F = \ln P(\bar x^0,...,\bar x^{L-1} | \bar x^L) $$

Because each level depends only on the level directly above it, this simplifies to

$$ \begin{align*}
    F &= \sum_{l=0}^{L-1} \ln P(\bar x^l | \bar x^{l+1})    \\
    &= \sum_{l=0}^{L-1} \sum_{i=1}^n \ln \mathcal{N}( x_i^l; \mu_i^l, \Sigma_i^l)    \\
    &= \sum_{l=0}^{L-1} \sum_{i=1}^n \left( \ln \frac{1}{\sqrt{2\pi\Sigma_i^l}} - \frac{1}{2}\frac{(x_i^l - \mu_i^l)^2}{\Sigma_i^l} \right)
\end{align*} $$

The first term does not depend on $x_i^l$, so it vanishes when we differentiate with respect to $x_i^l$. Ignoring it gives

$$ F = -\frac{1}{2} \sum_{l=0}^{L-1} \sum_{i=1}^n \frac{(x_i^l - \mu_i^l)^2}{\Sigma_i^l} $$

In this model we assume all variances are 1. Letting $\epsilon_i^l = x_i^l - \mu_i^l$ denote the prediction error, this becomes

$$ F = -\frac{1}{2} \sum_{l=0}^{L-1} \sum_{i=1}^n (\epsilon_i^l)^2 $$
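The simplified objective is easy to evaluate numerically. The sketch below is one possible way to do it, assuming `tanh` for $f$, unit variances, and a list layout in which `xs[l]` holds level $l$ and `thetas[l]` predicts level $l$ from level $l+1$. The function names and the layout are illustrative choices, not something the text prescribes.

```python
import numpy as np


def f(x):
    # Assumed nonlinearity; the text leaves f unspecified.
    return np.tanh(x)


def prediction_errors(xs, thetas):
    # eps^l = x^l - theta^l f(x^{l+1}) for l = 0 .. L-1.
    return [xs[l] - thetas[l] @ f(xs[l + 1]) for l in range(len(thetas))]


def objective(xs, thetas):
    # F = -1/2 * sum over levels and units of the squared prediction errors.
    return -0.5 * sum(np.sum(eps ** 2) for eps in prediction_errors(xs, thetas))


# Tiny two-level example: zero weights, so every prediction is 0.
xs = [np.array([1.0, 0.0]), np.array([0.5])]
thetas = [np.zeros((2, 1))]
F = objective(xs, thetas)
```

Because $F$ is a negated sum of squares, it is never positive and reaches 0 only when every level matches its prediction.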

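To update each $x_i^l$ we use the partial derivative of $F$ with respect to $x_i^l$. The variable $x_i^l$ appears in its own error $\epsilon_i^l$ and, through $f(x_i^l)$, in every prediction $\mu_k^{l-1}$ of the level below, so

$$ \frac{\partial F}{\partial x_i^l} = -\epsilon_i^l + f'(x_i^l) \sum_{k=1}^n \theta_{k,i}^{l-1} \epsilon_k^{l-1}$$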
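One way to use this derivative is a gradient-ascent step on the hidden levels. The sketch below follows the definition $\mu^l = \theta^l f(x^{l+1})$, so `thetas[l]` predicts level $l$ from level $l+1$ and the error of level $l-1$ is fed back through `thetas[l - 1]`. It assumes `tanh` for $f$, unit variances, a step size `lr`, and that levels $0$ and $L$ stay clamped; these are assumptions of the example, not statements from the text.

```python
import numpy as np


def f(x):
    # Assumed nonlinearity; the text leaves f unspecified.
    return np.tanh(x)


def f_prime(x):
    # Derivative of tanh.
    return 1.0 - np.tanh(x) ** 2


def prediction_errors(xs, thetas):
    # eps^l = x^l - theta^l f(x^{l+1}) for l = 0 .. L-1.
    return [xs[l] - thetas[l] @ f(xs[l + 1]) for l in range(len(thetas))]


def objective(xs, thetas):
    return -0.5 * sum(np.sum(eps ** 2) for eps in prediction_errors(xs, thetas))


def inference_step(xs, thetas, lr=0.1):
    """Return new activities after one gradient-ascent step on F.

    Only hidden levels 1 .. L-1 move; levels 0 and L are treated as clamped.
    """
    eps = prediction_errors(xs, thetas)
    new_xs = [x.copy() for x in xs]
    for l in range(1, len(thetas)):
        # dF/dx^l = -eps^l + f'(x^l) * (theta^{l-1}^T eps^{l-1})
        grad = -eps[l] + f_prime(xs[l]) * (thetas[l - 1].T @ eps[l - 1])
        new_xs[l] = xs[l] + lr * grad
    return new_xs


rng = np.random.default_rng(0)
xs = [rng.normal(size=2), rng.normal(size=3), rng.normal(size=2)]
thetas = [rng.normal(size=(2, 3)), rng.normal(size=(3, 2))]
xs_next = inference_step(xs, thetas, lr=0.001)
```

Repeating `inference_step` until the activities stop changing is the usual way to relax the hidden levels before touching the weights.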

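The weights are updated using the partial derivative of $F$ with respect to $\theta_{i,j}^l$, the $j$-th component of $\theta_i^l$:

$$ \frac{\partial F}{\partial \theta_{i,j}^l} = \epsilon_i^l f(x_j^{l+1}) $$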
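In matrix form the weight gradient for one level is an outer product of the error and the transformed activity of the level above. The sketch below applies one gradient-ascent step under the definition $\mu^l = \theta^l f(x^{l+1})$, again assuming `tanh` for $f$, unit variances, and a hypothetical step size `lr`.

```python
import numpy as np


def f(x):
    # Assumed nonlinearity; the text leaves f unspecified.
    return np.tanh(x)


def weight_step(xs, thetas, lr=0.01):
    """Return new weights after one gradient-ascent step on F."""
    new_thetas = []
    for l, theta in enumerate(thetas):
        pre = f(xs[l + 1])                 # f(x^{l+1})
        eps = xs[l] - theta @ pre          # eps^l
        # dF/dtheta^l = outer(eps^l, f(x^{l+1}))
        new_thetas.append(theta + lr * np.outer(eps, pre))
    return new_thetas


rng = np.random.default_rng(0)
xs = [rng.normal(size=2), rng.normal(size=3)]
thetas = [rng.normal(size=(2, 3))]
new_thetas = weight_step(xs, thetas, lr=0.01)
```

With the activities held fixed, this step scales the error of each level by $1 - \text{lr}\,\lVert f(x^{l+1}) \rVert^2$, so a small enough step size shrinks it.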

In [3]:
from read_image import read_mnist_images, read_mnist_labels

# Images and labels are stored in separate IDX files.
train_images = read_mnist_images('data/train-images-idx3-ubyte.gz')
train_labels = read_mnist_labels('data/train-labels-idx1-ubyte.gz')
test_images = read_mnist_images('data/t10k-images-idx3-ubyte.gz')
test_labels = read_mnist_labels('data/t10k-labels-idx1-ubyte.gz')
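The `read_image` module is not shown here. For readers without it, the sketch below is one possible way to read gzipped MNIST IDX files with the standard library and NumPy. The function names `load_idx_images` and `load_idx_labels` are hypothetical stand-ins and may differ from what `read_image` actually does. Image files start with the magic number 2051 followed by the count, rows, and columns; label files start with 2049 followed by the count.

```python
import gzip
import struct

import numpy as np


def load_idx_images(path):
    # Hypothetical helper: returns an array of shape (count, rows, cols), dtype uint8.
    with gzip.open(path, "rb") as fh:
        magic, count, rows, cols = struct.unpack(">IIII", fh.read(16))
        if magic != 2051:
            raise ValueError(f"not an IDX image file (magic {magic})")
        data = np.frombuffer(fh.read(), dtype=np.uint8)
    return data.reshape(count, rows, cols)


def load_idx_labels(path):
    # Hypothetical helper: returns an array of shape (count,), dtype uint8.
    with gzip.open(path, "rb") as fh:
        magic, count = struct.unpack(">II", fh.read(8))
        if magic != 2049:
            raise ValueError(f"not an IDX label file (magic {magic})")
        data = np.frombuffer(fh.read(), dtype=np.uint8)
    return data[:count]
```

Checking the magic number means that passing an image file to the label reader fails loudly instead of returning pixel bytes as labels.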

In [9]:
import numpy as np

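# Toy data: the target is the XOR of the first two columns;
# the third column is constant.
X = np.array([[0,0,1],
              [0,1,1],
              [1,0,1],
              [1,1,1]])

y = np.array([[0],
              [1],
              [1],
              [0]])

# Weights drawn uniformly from [-1, 1).
w1 = 2*np.random.random((4,3)) - 1  # shape (4, 3)
w2 = 2*np.random.random((1,4)) - 1  # shape (1, 4)

# Activity of 4 units as a column vector, initialised to zero.
h = np.zeros((4,1))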

# prediction
# 1. 

[[0.]
 [0.]
 [0.]
 [0.]]
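To connect the toy arrays to the equations, the sketch below runs a prediction pass in which each level is set to its predicted mean, then clamps the lowest level to the targets to produce a nonzero error. It is only one reading of the setup: it assumes `tanh` for $f$, treats the inputs as the top level, and takes `w1` and `w2` to play the roles of $\theta^1$ and $\theta^0$. None of these choices is confirmed by the notebook.

```python
import numpy as np


def f(x):
    # Assumed nonlinearity; the text leaves f unspecified.
    return np.tanh(x)


rng = np.random.default_rng(0)

X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

w1 = 2 * rng.random((4, 3)) - 1   # assumed theta^1: inputs -> hidden
w2 = 2 * rng.random((1, 4)) - 1   # assumed theta^0: hidden -> output

x_top = X.T                        # shape (3, 4), one column per sample
h = w1 @ f(x_top)                  # hidden level set to its predicted mean
out = w2 @ f(h)                    # lowest level set to its predicted mean

# With every level at its prediction, the hidden error is zero.
eps_hidden = h - w1 @ f(x_top)

# Clamping the lowest level to the targets yields the error that drives learning.
eps_out_clamped = y.T - w2 @ f(h)
```

From here, the inference and weight updates derived earlier would be applied to reduce `eps_out_clamped`.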
