# Inhibition of odors with IBCM neurons: 2D toy model

A layer of IBCM neurons is used to inhibit the activity of ORNs in response to a fluctuating olfactory background. Synaptic weights from the input layer to the inhibition layer, $M = (\vec{m}_1, \vec{m}_2, \ldots)$, are learnt according to the IBCM rule. Synaptic weights for inhibition, from the inhibitory neurons to the projection layer, are learnt to minimize the squared norm of the projection neuron (PN) layer activity. In this way, the network of IBCM neurons is like an autoencoder applying feedforward inhibition. 

![test](figures/feedforward_inhibitory_network.png)

## Network of IBCM neurons

Consider a feedforward network of IBCM neurons. Each will be paired with an inhibitory neuron, but those inhibitory neurons will not interact with other neurons except their own IBCM neuron, so we can discuss them later. Each IBCM neuron has $n_R \geq N_I$ input synapses, represented by the connectivity vector $\vec{m}_i = (m^1_i, \ldots, m^{n_R}_i)$. Its activation upon stimulation, before coupling to other IBCM neurons is considered, is given by $c_i = \vec{m}_i \cdot \vec{x}$, where $\vec{x}$ is a two-dimensional input vector. Its activity once inhibited by the other IBCM neurons in the network is 

$$ \bar{c}_i = c_i - \eta \sum_{j \neq i} c_j \quad \mathrm{where} \, \, c_i(t) = \vec{m}_i(t) \cdot \vec{x}(t)  \, \, .$$

The update equation of each IBCM neuron's weights uses this inhibited activity:

$$ \dot{m}_i = \mu \left[ \bar{c}_i(\bar{c}_i - \bar{\Theta}_i) \vec{x} - \eta \sum_{j \neq i} \bar{c}_j(\bar{c}_j - \bar{\Theta}_j) \vec{x} \right]  \quad \mathrm{where} \, \, \bar{\Theta}_i = \mathbb{E}[\bar{c}_i^2] $$

The parameter $\eta$ is the coupling strength. The thresholds $\bar{\Theta}_i$ are time-dependent averages of the IBCM neuron's inhibited activity squared, $\bar{c}_i^2$; they evolve with $\vec{m}_i$ according to the differential equation

$$ \dot{\bar{\Theta}}_i = \frac{1}{\tau_\Theta} (\bar{c}_i^2 - \bar{\Theta}_i)  \,\, .$$

Hence, they effectively take the average of $\bar{c}_i^2$ over a sliding exponential window with a time scale $\tau_\Theta$. This time scale should be much longer than the fluctuation time scale of the input, $\tau_c$, but also much shorter than the slow time scale of $\vec{m}_i$ learning itself. In short, we should have $\tau_c \ll \tau_{\Theta} \ll \frac{1}{\mu}$. 
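
As an illustration of the equations above, here is a minimal forward-Euler sketch of one update step of the coupled network. The function name `ibcm_network_step` and its argument layout are hypothetical; this is not the implementation in `modelfcts.ibcm`, which may differ (for instance, it takes an extra scale parameter that is not modeled here).

```python
import numpy as np

def ibcm_network_step(m, theta, x, mu, eta, tau_theta, dt=1.0):
    """One forward-Euler step of the coupled IBCM equations (illustrative).

    m: (n_neurons, n_dims) weights, theta: (n_neurons,) thresholds,
    x: (n_dims,) current input vector.
    """
    c = m @ x                                # uninhibited activities c_i
    cbar = c - eta * (c.sum() - c)           # subtract eta * sum_{j != i} c_j
    phi = cbar * (cbar - theta)              # IBCM term of each neuron
    phi_net = phi - eta * (phi.sum() - phi)  # coupled term in the m update
    m_new = m + dt * mu * np.outer(phi_net, x)
    theta_new = theta + dt / tau_theta * (cbar**2 - theta)
    return m_new, theta_new, cbar

# Hypothetical toy values, chosen only to exercise the function
m0 = np.eye(2)
theta0 = np.zeros(2)
x0 = np.array([1.0, 2.0])
m1, theta1, cbar0 = ibcm_network_step(m0, theta0, x0, mu=0.1, eta=0.1, tau_theta=10.0)
print(cbar0, theta1)
```

In an actual simulation, `mu`, `tau_theta` and the input correlation time would be chosen to respect $\tau_c \ll \tau_{\Theta} \ll 1/\mu$.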

## Feedforward inhibitory neurons

During the learning phase (i.e. exposure to the background only), we want the inhibitory neurons to silence the activity of projection neurons in response to the varying input. Define $\vec{s}$ as the vector of PN activities, with 
$$\vec{s}= R(\vec{x}_{in} -  W\vec{\bar{c}})   \quad ,$$

where $R$ is an element-wise activation function (e.g., identity or ReLU), and $W\vec{\bar{c}}$ is the output sent by the inhibitory neurons to projection neurons (each column of $W$ is the vector $\vec{w}^j$ of synaptic weights leaving inhibitory neuron $j$ towards the different PNs). We want the inhibitory weight matrix $W$ to minimize the cost function

$$ C(W) = \frac12 \mathbb{E}[\vec{s}^T \vec{s}] + \frac12 \frac{\beta}{\alpha} \mathbb{E}[\vec{w}^T \vec{w}] $$

The first term ensures minimization of $\vec{s}$'s magnitude. The second term is a regularization ensuring $\vec{w}$ does not diverge when, for instance, $R$ is a ReLU function, such that large negative values of $\vec{w}$ would make $\vec{s}$ zero. In other words, the regularization ensures that the layer inhibits the background just enough and still lets through some of the new odor when it arrives. 

If we use gradient descent dynamics for the weights $\vec{w}^j$, 

$$ \frac{d \vec{w}^j}{dt} = - \alpha \nabla_{\vec{w}^j} C(W) $$

we find, after computing the gradient, 

$$  \frac{d \vec{w}^j}{dt} = \alpha \bar{c}^j \vec{s} - \beta \vec{w}^j $$

which is a Hebbian learning rule for each synaptic weight $W_i^j$, because it is proportional to the activities of the two neurons at the endpoints of the synapse ($\bar{c}^j$ and $s_i$). Here, $\alpha$ and $\beta$ are the learning rates of the $W$ synapses; usually, one takes $\beta < \alpha$ to ensure that $\vec{s}$ minimization is the dominant term of the cost function. 

In the toy model studied here, it is possible to calculate analytically the expectation value of $\vec{w}^j$, under a few approximations. It is therefore a good starting point for this model more generally. 
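
For concreteness, the learning rule for $W$ can be sketched as a forward-Euler step. The function name `inhibitory_weights_step` is hypothetical and this is only an illustration of the rule as written above, not the code used in `integrate_inhib_ibcm_network`. Column $j$ of `w_mat` plays the role of $\vec{w}^j$.

```python
import numpy as np

def inhibitory_weights_step(w_mat, x, cbar, alpha, beta, dt=1.0, relu=False):
    """One Euler step of dw^j/dt = alpha * cbar^j * s - beta * w^j (illustrative)."""
    s = x - w_mat @ cbar                 # PN activity before the activation function
    if relu:
        s = np.maximum(s, 0.0)           # optional ReLU for R
    # Column j of np.outer(s, cbar) is cbar[j] * s, the Hebbian term for w^j
    w_new = w_mat + dt * (alpha * np.outer(s, cbar) - beta * w_mat)
    return w_new, s

# Hypothetical toy values
w0 = np.zeros((2, 2))
x_in = np.array([1.0, 2.0])
cbar_in = np.array([1.5, 0.5])
w1, s0 = inhibitory_weights_step(w0, x_in, cbar_in, alpha=0.1, beta=0.01)
print(w1)
```

Starting from $W = 0$, the first step is purely Hebbian: each column grows along $\vec{s} = \vec{x}$, scaled by the corresponding $\bar{c}^j$.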

In [None]:
import numpy as np
import scipy as sp
from scipy import stats
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns

from utils.smoothing_function import moving_average
from utils.statistics import seed_from_gen
from scipy import signal

In [None]:
from modelfcts.ibcm import integrate_inhib_ibcm_network, integrate_ibcm, relu_inplace

## Fluctuating gaussian binary mixture
### Definition of the process
We simulate a simple two-component mixture in which the proportions of the two odors change:

$$ \vec{x} = \left(\frac12 + \nu\right) \vec{x}_a + \left(\frac12 - \nu\right) \vec{x}_b $$

where $\nu$ follows a univariate Ornstein-Uhlenbeck process with mean $0$, time scale $\tau_{\nu}$, and steady-state variance $\sigma^2$. We set $\tau_{\nu} = 2$ time steps. 
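
A minimal sketch of such a background, assuming the exact discrete-time update of the Ornstein-Uhlenbeck process, $\nu_{t + \Delta t} = e^{-\Delta t/\tau_\nu} \nu_t + \sqrt{\sigma^2 (1 - e^{-2\Delta t/\tau_\nu})} \, \xi_t$ with $\xi_t$ standard normal. The helper `simulate_ou` is hypothetical and stands in for `update_ou_2inputs`, whose internals are not shown here.

```python
import numpy as np

def simulate_ou(n_steps, tau_nu, sigma2, dt=1.0, seed=0):
    """Sample a zero-mean OU process with steady-state variance sigma2."""
    rng = np.random.default_rng(seed)
    decay = np.exp(-dt / tau_nu)                      # mean-reversion factor
    noise_amp = np.sqrt(sigma2 * (1.0 - decay**2))    # keeps the variance at sigma2
    nu = np.zeros(n_steps)
    for t in range(1, n_steps):
        nu[t] = decay * nu[t - 1] + noise_amp * rng.standard_normal()
    return nu

# Two normalized odor vectors, as an example
x_a = np.array([1.0, 0.25]) / np.sqrt(1.0 + 0.25**2)
x_b = np.array([0.25, 1.0]) / np.sqrt(1.0 + 0.25**2)

nu = simulate_ou(50_000, tau_nu=2.0, sigma2=0.09)
# x(t) = (1/2 + nu) x_a + (1/2 - nu) x_b, one row per time step
x_series = np.outer(0.5 + nu, x_a) + np.outer(0.5 - nu, x_b)
print(nu.var(), x_series.mean(axis=0))
```

The empirical variance of `nu` should be close to $\sigma^2$, and the mean of `x_series` close to $\frac12(\vec{x}_a + \vec{x}_b)$.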

In [None]:
from modelfcts.backgrounds import update_ou_2inputs

### Analytical prediction of the fixed points

#### Single IBCM  neuron
Let's first consider a single IBCM neuron. 
To calculate the fixed points, it is useful to rewrite the background vector as 
$$ \vec{x} = \vec{x}_d + \nu \vec{x}_s $$
where $\vec{x}_d = \frac12(\vec{x}_a + \vec{x}_b)$ is the "deterministic" part, and $\vec{x}_s = \vec{x}_a - \vec{x}_b$ is the "stochastic" part. The two stable fixed points, $\vec{m}_{\pm}$, are thus found to be

$$c_{d, \pm} = \vec{m}_{\pm} \cdot \vec{x}_d = 1$$
$$c_{s, \pm} = \vec{m}_{\pm} \cdot \vec{x}_s = \pm \frac{1}{\sigma} $$

or, equivalently,

$$ c_{a, \pm} = \vec{m}_{\pm} \cdot \vec{x}_a = 1 \pm \frac{1}{2 \sigma} $$
$$ c_{b, \pm} = \vec{m}_{\pm} \cdot \vec{x}_b = 1 \mp \frac{1}{2 \sigma} \,\, .$$

In general, $\vec{x}_a$ and $\vec{x}_b$ may have different norms and a non-zero dot product $\Omega = \vec{x}_a \cdot \vec{x}_b$. In this case, $\vec{m}_{\pm}$ is written as a linear combination of $\vec{x}_a$ and $\vec{x}_b$ as

$$ \vec{m}_{\pm} = \frac{1}{|\vec{x}_a|^2 |\vec{x}_b|^2 - \Omega^2} \Big[(c_{a, \pm}|\vec{x}_b|^2 - \Omega c_{b, \pm})\vec{x}_a + (c_{b, \pm}|\vec{x}_a|^2 - \Omega c_{a, \pm})\vec{x}_b  \Big]$$

Moreover, the response to any $\vec{x}(\nu)$ of the IBCM neuron at a fixed point is

$$ \vec{m}_{\pm} \cdot \vec{x} = 1 \pm \frac{\nu}{\sigma} \, \, .$$
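
These fixed points can be checked numerically. The sketch below (the helper name `single_neuron_fixed_points` is hypothetical, and it is not the `fixedpoints_m_2vectors` function imported later) solves for the coefficients of $\vec{m}_{\pm}$ on $\vec{x}_a$ and $\vec{x}_b$ through the Gram matrix, which is equivalent to the closed-form linear combination above.

```python
import numpy as np

def single_neuron_fixed_points(x_a, x_b, sigma):
    """Return (m_plus, m_minus) in span(x_a, x_b) with the required dot products."""
    basis = np.stack([x_a, x_b])          # rows are x_a and x_b
    gram = basis @ basis.T                # [[|x_a|^2, Omega], [Omega, |x_b|^2]]
    fixed_points = []
    for sign in (+1.0, -1.0):
        # Target dot products c_a and c_b at the fixed point
        targets = np.array([1.0 + sign / (2.0 * sigma), 1.0 - sign / (2.0 * sigma)])
        coefs = np.linalg.solve(gram, targets)   # coefficients on x_a and x_b
        fixed_points.append(coefs @ basis)
    return fixed_points

# Example vectors with different norms and a non-zero dot product
x_a = np.array([1.0, 0.25])
x_b = np.array([0.3, 0.9])
sigma = 0.3
m_plus, m_minus = single_neuron_fixed_points(x_a, x_b, sigma)

x_d, x_s = 0.5 * (x_a + x_b), x_a - x_b
nu_test = 0.2
print(m_plus @ (x_d + nu_test * x_s), 1.0 + nu_test / sigma)
```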


#### Network of IBCM  neurons
For a network of IBCM neurons, the equations decouple when written in terms of the variables $\vec{\bar{m}}^i = \vec{m}^i - \eta \sum_{j \neq i} \vec{m}^j$, which we could call the "inhibited synaptic weights", since $\bar{c}$, the inhibited neuron activity, can simply be written as $\vec{\bar{m}} \cdot \vec{x}$. The fixed points for the $\vec{\bar{m}}$ are then the same as for a single IBCM neuron:

$$ \vec{\bar{m}}^i_{\pm} \cdot \vec{x}_d = 1 \, , \quad \vec{\bar{m}}^i_{\pm} \cdot \vec{x}_s = \pm \frac{1}{\sigma} $$

If we have only two neurons, it is fairly easy to write the solution in terms of the "un-inhibited" synaptic weights $\vec{m}^i$, although the results depend on whether both neurons are at the same fixed point ($+$ or $-$ sign) or at opposite fixed points. 

For neurons at the same fixed point, we find that $\vec{m}^i$ is just rescaled by $\frac{1}{1 - \eta}$:

$$\vec{m}^i_{++, --} = \frac{1}{1 - \eta} \vec{m}_{\pm}$$

For neurons at opposite fixed points, we find that the dot products with $\vec{x}_a$ and $\vec{x}_b$ are 

$$ c^i_a = \vec{m}^i_{+-, -+} \cdot \vec{x}_a = \frac{1}{1 - \eta} \pm \frac{1}{1 + \eta} \frac{1}{2 \sigma}$$
$$ c^i_b = \vec{m}^i_{+-, -+} \cdot \vec{x}_b = \frac{1}{1 - \eta} \mp \frac{1}{1 + \eta} \frac{1}{2 \sigma}$$

where one neuron has the upper sign, and the other has the lower sign; they both are still written as the linear combination

$$ \vec{m}^i_{+-, -+} = \frac{1}{|\vec{x}_a|^2 |\vec{x}_b|^2 - \Omega^2} \Big[(c^i_a|\vec{x}_b|^2 - \Omega c^i_b)\vec{x}_a + (c^i_b|\vec{x}_a|^2 - \Omega c^i_a)\vec{x}_b  \Big]$$
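
The two-neuron results can be verified with a short numerical sketch, assuming 2D inputs so that the matrix with rows $\vec{x}_a$, $\vec{x}_b$ is square and invertible. The helper `weights_from_dots` and the parameter values are hypothetical; this is not the `fixedpoints_m_2vectors` implementation.

```python
import numpy as np

eta, sigma = 0.1, 0.3
x_a = np.array([1.0, 0.25]) / np.sqrt(1.0 + 0.25**2)
x_b = np.array([0.25, 1.0]) / np.sqrt(1.0 + 0.25**2)
basis = np.stack([x_a, x_b])  # rows: x_a, x_b (square because inputs are 2D)

def weights_from_dots(c_a, c_b):
    """Vector m with m.x_a = c_a and m.x_b = c_b (2D inputs only)."""
    return np.linalg.solve(basis, np.array([c_a, c_b]))

half = 1.0 / (2.0 * sigma)
mbar_plus = weights_from_dots(1.0 + half, 1.0 - half)
mbar_minus = weights_from_dots(1.0 - half, 1.0 + half)

# mbar^i = m^i - eta * m^j  <=>  mbar_stack = coupling @ m_stack
coupling = np.array([[1.0, -eta], [-eta, 1.0]])

# Opposite fixed points: neuron 0 at +, neuron 1 at -
m_opposite = np.linalg.solve(coupling, np.stack([mbar_plus, mbar_minus]))
# Same fixed point: both neurons at +
m_same = np.linalg.solve(coupling, np.stack([mbar_plus, mbar_plus]))

print("opposite, m.x_a:", m_opposite @ x_a)
print("opposite, m.x_b:", m_opposite @ x_b)
print("same, ratio to mbar_+:", m_same[0] / mbar_plus)
```

Inverting the $2 \times 2$ coupling matrix is what produces the two different prefactors, $\frac{1}{1-\eta}$ for the sum of the two weight vectors and $\frac{1}{1+\eta}$ for their difference.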


#### Fixed points of the inhibitory synaptic weights, $\vec{w}^i$

If there are two IBCM neurons converging each to a different fixed point, i.e. we are in the case where one $\vec{\bar{m}}^j$ is $\vec{\bar{m}}_+$ and the other is $\vec{\bar{m}}_-$, then we find that 

$$ \vec{w}^{\pm} = \frac{\alpha}{2\alpha + \beta} \vec{x}(\nu = \pm\sigma) $$

That means $\vec{w}^{\pm}$ is parallel to the input vector at one standard deviation away from the mean, on either side ($\pm \sigma$). 

#### Fixed points of the projection neurons, $\vec{s} = R(\vec{x} - W \vec{\bar{c}})$

Using the instantaneous response of the two IBCM neurons to $\vec{x}$, 

$$ \bar{c} = \vec{\bar{m}} \cdot \vec{x}(t) = 1 \pm \frac{\nu(t)}{\sigma} $$

we can calculate the average and instantaneous values of $\vec{s} = R(\vec{x} - W \vec{\bar{c}})$, considering that $W$ is at steady-state. We find:

$$ \vec{s} = R\left(\frac{\beta}{2 \alpha + \beta} \vec{x}\right) $$

which means that the mean and the mean squared norm of the inhibited background are

$$ \langle \vec{s} \rangle = R\left(\frac{\beta}{2 \alpha + \beta} \vec{x}_d \right) $$
$$ \langle \vec{s}^T \vec{s} \rangle = \left( \frac{\beta}{2 \alpha + \beta} \right)^2 \langle \vec{x}^T \vec{x} \rangle $$

In other words, there is a reduction of the background's average *and standard deviation* down to $\frac{\beta}{2\alpha + \beta}$ of its original amplitude. 
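
The steady state of $W$ and the resulting reduction factor can be checked with a small sketch, assuming $R$ is the identity and that the two IBCM neurons sit exactly at their fixed points, $\bar{c}_{\pm} = 1 \pm \nu/\sigma$. Since the average drift of $W$ is quadratic in $\nu$, only the mean and variance of $\nu$ matter, so the average is computed here with a two-point distribution $\nu = \pm\sigma$ (equal weights) that has the right mean and variance. All names and parameter values are illustrative.

```python
import numpy as np

alpha, beta, sigma = 0.00025, 0.00005, 0.3
x_a = np.array([1.0, 0.25]) / np.sqrt(1.0 + 0.25**2)
x_b = np.array([0.25, 1.0]) / np.sqrt(1.0 + 0.25**2)
x_d, x_s = 0.5 * (x_a + x_b), x_a - x_b

def x_of(nu):
    """Background vector x(nu) = x_d + nu * x_s."""
    return x_d + nu * x_s

def cbar_of(nu):
    """Responses of the neurons at the + and - fixed points."""
    return np.array([1.0 + nu / sigma, 1.0 - nu / sigma])

# Predicted steady state: columns are w^+ and w^-
w_mat = alpha / (2 * alpha + beta) * np.stack([x_of(+sigma), x_of(-sigma)], axis=1)
reduction = beta / (2 * alpha + beta)

# Instantaneous PN activity for an arbitrary nu (identity activation)
nu_test = 0.2
s_test = x_of(nu_test) - w_mat @ cbar_of(nu_test)

# Average drift of W over nu = +/- sigma with equal weights
drift = np.zeros_like(w_mat)
for nu in (+sigma, -sigma):
    s = x_of(nu) - w_mat @ cbar_of(nu)
    drift += 0.5 * (alpha * np.outer(s, cbar_of(nu)) - beta * w_mat)

print(s_test / x_of(nu_test), reduction)
print(np.abs(drift).max())
```

The ratio `s_test / x_of(nu_test)` should equal $\frac{\beta}{2\alpha + \beta}$ in every component, and the average drift should vanish up to floating-point error.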

In [None]:
from modelfcts.ibcm_analytics import (
    fixedpoints_m_2vectors, 
    fixedpoints_barm_2vectors, 
    fixedpoints_w_2vectors, 
    fixedpoint_s_2vectors_instant, 
    fixedpoint_s_2vectors_mean, 
    fixedpoint_s_2vectors_norm2
)

In [None]:
def correlate_stats(s1, s2, blocksize=1000):
    """Compute the (auto)correlation of random variables which have
    a long stationary realization in s1 and s2. s1 and s2 are split in small blocks, 
    the length of which should be the max correlation time to consider, 
    the correlation is computed in each block, and the resulting
    correlation function samples are averaged. 
    """
    if s1.size != s2.size:
        raise ValueError("Should use arrays of same length")
    # Count number of blocks available; do not add one for remainder because it wouldn't have 
    # the same duration as other full blocks
    n_blocks = max(1, s1.size // blocksize)  # At least one block
    correl_blocks = []
    for i in range(n_blocks):
        sli = slice(blocksize*i, blocksize*(i+1))
        cr = sp.signal.correlate(s1[sli], s2[sli], mode="same") / blocksize
        cr = cr[blocksize//2:]  # Keep only positive tau part
        correl_blocks.append(cr)
    # Sum arrays of same length and divide by number of blocks to estimate average. 
    return sum(correl_blocks) / n_blocks

#### First-order corrections to the $\vec{w}$ fixed points due to correlations

TODO: the correction calculations are utterly useless, unfortunately, because I changed the inhibition network. 

If I am really into it, redo the correction calculation for fun. Former explanation was:

    Initially, my simulations did not converge to the expected fixed point (even though the analytical calculation was straightforward and seemed correct), so I used van Kampen's discussion of stochastic differential equations, chap. XVI, to calculate the first-order correction to the mean steady-state $\vec{w}_{\pm}$ due to correlations of the stochastic processes $\nu$ and $\vec{m}$. I'm not reporting the details here, but the resulting corrections are computed by the function ``fixedpoints_w_empirical_corrections`` below. As could be expected from the analytical study, they are utterly negligible, being of order $\alpha dt$. But confirming this numerically helped me pinpoint the silly coding mistake that I had made. Once corrected, the numerical simulations confirmed the zeroth order solution for $\vec{w}$ given above. 

In [None]:
from modelfcts.backgrounds import decompose_nonorthogonal_basis
from utils.statistics import seed_from_gen

# Run a simulation
Hopefully, the inhibitory neurons can be combined to perfectly inhibit the input. 

In [None]:
### General simulation parameters
n_dimensions = 2
n_components = 2
n_neurons = 2

# Simulation times
duration = 80000.0
deltat = 1.0
learnrate = 0.0025
tau_avg = 300
coupling_eta = 0.1
inhib_rates = [.00025, .00005]  # alpha, beta
#learnrate = 0.01
#tau_avg = 50
#inhib_rates = [0.005, 0.001]
lambd_ibcm = 2.0
ibcm_rates = [learnrate, tau_avg, coupling_eta, lambd_ibcm]

# Initial synaptic weights: small positive noise near origin
rgen_meta = np.random.default_rng(seed=92347287)
init_synapses = 0.1*rgen_meta.random(size=[n_neurons, n_dimensions]) * lambd_ibcm

# Choose n_components (here two) linearly independent vectors in the positive quadrant
back_components = 0.25*np.ones([n_components, n_dimensions])
for i in range(n_components):
    back_components[i, i] = 1.0
    # Normalize
    back_components[i] = back_components[i] / np.sqrt(np.sum(back_components[i]**2))

# Initial background vector and initial nu values
average_nu = np.zeros(1)
init_nu = np.zeros(1)
init_bkvec = 0.5*back_components[0] + 0.5*back_components[1]
# nus are first in the list of initial background params
init_back_list = [init_nu, init_bkvec]

## Compute the coefficients in the Ornstein-Uhlenbeck update equation
sigma2_nu = 0.09
tau_nu = 2.0  # Fluctuation time scale of the background nu_alphas (same for all)
update_coefs_mean = np.exp(-deltat/tau_nu)
update_coefs_noise = np.sqrt(sigma2_nu*(1 - np.exp(-2*deltat/tau_nu)))

bk_update_params = [average_nu, update_coefs_mean, update_coefs_noise, back_components]

In [None]:
# m_init, update_bk, bk_init, ibcm_params,
#    inhib_params, bk_params, tmax, dt, seed=None, noisetype="normal"
sim_results = integrate_inhib_ibcm_network(init_synapses, update_ou_2inputs, init_back_list, ibcm_rates, 
                    inhib_rates, bk_update_params, duration, deltat, 
                    seed=seed_from_gen(rgen_meta), noisetype="normal")
tser, nuser, bkvecser, mser, cbarser, _, wser, yser = sim_results

### Check the synaptic weights against fixed points
The analytical prediction neglecting correlations between $\vec{m}$ and $\nu$ is verified, provided that the time scales $\tau_{\nu}$ and $\frac{1}{\mu}$ are different enough. Computing corrections to account for incompletely separated time scales would be very hard, since the equation for $\vec{m}$ is a multivariate, non-linear stochastic differential equation. 

In [None]:
# Check that neurons went to different fixed points (Pearson correlation of \bar{c} = -1)
transient = 30000
print("Pearson correlation:", np.corrcoef(cbarser[transient:].T)[0, 1])

In [None]:
analytical_barm, _ = fixedpoints_barm_2vectors(back_components, np.sqrt(sigma2_nu), coupling_eta, lambd=lambd_ibcm)
# Plot dot products with x_a and x_b, in terms of reduced m vectors. 
# Take any of the fixed points for one neuron, its two dot products are the only two possible dot product values. 
analytical_mbara = np.dot(analytical_barm[2, 0], back_components[0])
analytical_mbarb = np.dot(analytical_barm[2, 0], back_components[1])

# Compute mbar for the two neurons. 
mbarser = mser*(1 + coupling_eta) - coupling_eta * np.sum(mser, axis=1, keepdims=True)
mbarser_a = mbarser.dot(back_components[0])
mbarser_b = mbarser.dot(back_components[1])

fig, ax = plt.subplots()
# First neuron
skp = 10
ax.plot(tser[::skp], mbarser_a[::skp, 0], color="red", label=r"Neuron 1, $\overline{c}_a^-$")
ax.plot(tser[::skp], mbarser_b[::skp, 0], color="pink", label=r"Neuron 1, $\overline{c}_b^-$")
ax.plot(tser[::skp], mbarser_a[::skp, 1], color="blue", label=r"Neuron 2, $\overline{c}_a^+$")
ax.plot(tser[::skp], mbarser_b[::skp, 1], color="cyan", label=r"Neuron 2, $\overline{c}_b^+$")

ax.axhline(analytical_mbara, color="k", ls="--", label=r"$1 + 1/2 \sigma$")
ax.axhline(analytical_mbarb, color="k", ls="-.", label=r"$1 - 1/2 \sigma$")
ax.legend()

ax.set(xlabel="Time", ylabel="Dot products")
fig.set_size_inches(4.5, 3.5)
fig.tight_layout()
# fig.savefig("figures/two_odors/two_ibcm_neurons_mbar_dot_products_simulation.pdf", transparent=True)
plt.show()
plt.close()

### Check the neurons' responses compared to analytical fixed points
For a neuron at the $\pm$ fixed point, its response should be $\bar{c}_{\pm} = 1 \pm \nu/\sigma$ to any input vector $\vec{x}(\nu)$. The plot below shows the predicted $\bar{c}_{\pm}$ alongside each neuron's actual $\bar{c}$ to check that this is indeed the case.  

In [None]:
fig, ax = plt.subplots()

tslice = slice(int(duration/deltat)-2001, int(duration/deltat)-1901, 1)
ax.plot(tser[tslice], cbarser[tslice, 0]/lambd_ibcm, color="cyan", label="Response of neuron 0", alpha=0.95)
ax.plot(tser[tslice], cbarser[tslice, 1]/lambd_ibcm, color="pink", label="Response of neuron 1", alpha=0.95)
ax.plot(tser[tslice], (1 + nuser[tslice, 0]/np.sqrt(sigma2_nu)), color="red", label="Response at f.p. +", alpha=0.95)
ax.plot(tser[tslice], (1 - nuser[tslice, 0]/np.sqrt(sigma2_nu)), color="blue", label="Response at f.p. -", alpha=0.95)

ax.set(xlabel="Time", ylabel="Neuron response")
ax.legend()
plt.show()
plt.close()

## Evolution of the inhibitory neurons' weights $\vec{w}^i$
Compare to the analytical prediction that, on average, $\vec{w}^i$ converges to $\frac{\alpha}{2\alpha + \beta} \vec{x}(\nu = \pm \sigma)$, i.e. to a vector parallel to the input one standard deviation away from the mean input, on either side. 

In [None]:
# Analytical prediction for the w vectors, neglecting correlations between nu, m, and w
analytical_m, _ = fixedpoints_m_2vectors(back_components, np.sqrt(sigma2_nu), coupling_eta, lambd=lambd_ibcm)
analytical_mbar, _ = fixedpoints_barm_2vectors(back_components, np.sqrt(sigma2_nu), coupling_eta, lambd=lambd_ibcm)
fixed_wvecs = fixedpoints_w_2vectors(inhib_rates, analytical_mbar[3], back_components, sigma2_nu, lambd=lambd_ibcm)

# Analytical prediction above, plus first-order corrections from numerical estimates of autocorrelations
#empirical_corrected_w = fixedpoints_w_empirical_corrections(inhib_rates, sigma2_nu, back_components, 
#                        analytical_m[3], mser[transient:], use_bar=True, eta=coupling_eta, dt=deltat, t_nu=tau_nu)

In [None]:
fig, ax = plt.subplots()
skp = 10
#colors_choice = ["green", "xkcd:lime green", "purple", "violet"]
ax.plot(tser[::skp], wser[::skp, 0, 0], color="red", label=r"$\vec{w}_1$ elem. 1", alpha=0.95)
ax.plot(tser[::skp], wser[::skp, 0, 1], color="pink", label=r"$\vec{w}_1$ elem. 2", alpha=0.95)
ax.plot(tser[::skp], wser[::skp, 1, 0], color="blue", label=r"$\vec{w}_2$ elem. 1", alpha=0.8)
ax.plot(tser[::skp], wser[::skp, 1, 1], color="cyan", label=r"$\vec{w}_2$ elem. 2", alpha=0.8)

ax.axhline(fixed_wvecs[0, 0], color="k", label=r"Analytical $\vec{w}$", ls="--")
ax.axhline(fixed_wvecs[0, 1], color="k", ls="--")
#ax.axhline(empirical_fixed_w[0, 0], color="k", label="Prediction from empirical means", ls=":")
#ax.axhline(empirical_fixed_w[0, 1], color="k", ls=":")
#ax.axhline(empirical_corrected_w[0, 0], color="orange", label="With 1st order corrections", ls="-.")
#ax.axhline(empirical_corrected_w[0, 1], color="orange", ls="-.")

ax.set(xlabel="Time", ylabel="Elements of inhibitory neuron vectors")
ax.legend(ncol=2, labelspacing=0.5)
fig.set_size_inches(4.5, 3.5)
fig.tight_layout()
# fig.savefig("figures/two_odors/ibcm_inhibition_2d_analytical_numerical_w_fixed_points.pdf")
plt.show()
plt.close()

In [None]:
# Decomposition on the basis of background components
back_components_ds = np.stack([(back_components[0] + back_components[1])/2, back_components[0]-back_components[1]])
print(decompose_nonorthogonal_basis(np.mean(wser[transient:, 0], axis=0), back_components_ds.T))
print(decompose_nonorthogonal_basis(fixed_wvecs[0], back_components_ds.T))
print(decompose_nonorthogonal_basis(fixed_wvecs[1], back_components_ds.T))
#print(decompose_nonorthogonal_basis(empirical_corrected_w[0], back_components_ds.T))

In [None]:
# Plot the vectors themselves. 
# Plot time course of m vector
fig, ax = plt.subplots()
aprops = dict(arrowstyle="<-", color="k", lw=3)
input_vecs = back_components
ax.annotate(r"$\vec{x}_a$", xy=(0, 0), xytext=input_vecs[0], arrowprops=aprops, xycoords="data", ha="center")
ax.annotate(r"$\vec{x}_b$", xy=(0, 0), xytext=input_vecs[1], arrowprops=aprops, xycoords="data", ha="center")
#ax.annotate(r"$\vec{x}_{mean}$", xy=(0, 0), xytext=np.sum(input_vecs, axis=0)/2, arrowprops=aprops,
#            xycoords="data", ha="center")

# Compare to analytical fixed points
aprops = dict(arrowstyle="<-", color="blue", lw=3)
ax.annotate("", xy=(0, 0), xytext=analytical_barm[2, 0], arrowprops=aprops, xycoords="data", ha="center", va="center")
ax.annotate(r"$\vec{\overline{m}}^+$", xy=(analytical_barm[2, 0, 0], analytical_barm[2, 0, 1]*0.8), 
            xycoords="data", ha="center", color=aprops["color"])

aprops = dict(arrowstyle="<-", color="red", lw=3)
ax.annotate("", xy=(0, 0), xytext=analytical_barm[2, 1], arrowprops=aprops, xycoords="data", ha="center", va="center")
ax.annotate(r"$\vec{\overline{m}}^-$", xy=(analytical_barm[2, 1, 0]*0.8, analytical_barm[2, 1, 1]),
            xycoords="data", ha="center", color=aprops["color"])
minx = np.amin(analytical_barm[:, :, 0])*1.1
maxx = np.amax(analytical_barm[:, :, 0])*1.1
miny = np.amin(analytical_barm[:, :, 1])*1.1
maxy = np.amax(analytical_barm[:, :, 1])*1.1

# Also illustrate components that the model learns to inhibit? Very small
aprops = dict(arrowstyle="<-", color="pink", lw=1.25)
ax.annotate("", xy=(0, 0), xytext=fixed_wvecs[0], arrowprops=aprops, xycoords="data", ha="center", va="center")
ax.annotate(r"$\vec{w}^-$", xy=fixed_wvecs[0], 
            xycoords="data", ha="left", color=aprops["color"])

aprops = dict(arrowstyle="<-", color="cyan", lw=1.25)
ax.annotate("", xy=(0, 0), xytext=fixed_wvecs[1], arrowprops=aprops, xycoords="data", ha="center", va="center")
ax.annotate(r"$\vec{w}^+$", xy=fixed_wvecs[1],
            xycoords="data", ha="center", color=aprops["color"])


ax.set_aspect("equal")
ax.set_ylim([miny, maxy])
ax.set_xlim([minx, maxx])
ax.set(xlabel="Dimension 1", ylabel="Dimension 2")
fig.set_size_inches(4.5, 4.5)
# fig.savefig("figures/two_odors/two_ibcm_neurons_background_mbar_w_vectors.pdf", transparent=True)
plt.show()
plt.close()

## Background before and after inhibition
Check how well this 2D background is inhibited. 

In [None]:
from simulfcts.plotting import plot_background_neurons_inhibition, plot_background_norm_inhibition

In [None]:
# Squared norm: prediction vs numerical
def squared_norm(v, axis=-1):
    return np.sum(v**2, axis=axis)

In [None]:
# Compute the predicted and numerical reduction factors
# Predictions for s
x_determ = np.sum(back_components, axis=0) / 2.0
reduct_factor_predict = inhib_rates[1] / (2*inhib_rates[0] + inhib_rates[1])
s_avg_predict = relu_inplace(x_determ * reduct_factor_predict)

# Predictions for s^2
x_determ = np.sum(back_components, axis=0) / 2.0
x_stoch = (back_components[0] - back_components[1])
x2_expectation = squared_norm(x_determ) + sigma2_nu*squared_norm(x_stoch)
reduct_factor_predict = inhib_rates[1] / (2*inhib_rates[0] + inhib_rates[1])
s_norm2_predict = reduct_factor_predict**2 * x2_expectation

# Numerics for s and s^2
transient = 30000
reduct_factor_numeric = np.mean(yser[transient:], axis=0) / np.mean(bkvecser[transient:], axis=0)
s2_norm_numeric = np.mean(yser[transient:]*yser[transient:])
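# Use the same post-transient window for the background as for the inhibited output
reduct_factor_numeric2 = s2_norm_numeric / np.mean(bkvecser[transient:]*bkvecser[transient:])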

print("--- Reduction of s components ---")
print("  Predict:", reduct_factor_predict)
print("  Numeric:", reduct_factor_numeric)
print("--- Reduction of s^2 norm ---")
print("  Predict:", reduct_factor_predict**2)
print("  Numeric:", reduct_factor_numeric2)

In [None]:
# Use generic plotting function
fig, axes_mat, axes = plot_background_neurons_inhibition(tser, bkvecser, yser, skp=50)
fig.set_size_inches(6.5, 2.5)
# Add lines at predicted average
for i, ax in enumerate(axes):
    ax.axhline(s_avg_predict[i], color="y", ls="--", 
               label="Predicted avg. ({:.1f} %)".format(reduct_factor_predict*100), lw=3.)
    ax.axhline(np.mean(yser[transient:, i], axis=0), ls="--", color="xkcd:pink", label="Numerical avg.", lw=1.5)
    txt = ax.annotate("{:.1f} %".format(reduct_factor_numeric[i]*100), xy=(tser[-1]/1000, s_avg_predict[i]*3), 
               ha="right", va="bottom", xycoords="data")
    txt.set_bbox(dict(facecolor='w', alpha=0.7, edgecolor='w'))
axes[-1].legend(bbox_to_anchor=(1, 1), loc="upper left", fontsize=9)

fig.tight_layout()
#fig.savefig("figures/two_odors/inhibition_gaussian_background_neurons_2odors.pdf", 
#    transparent=True, bbox_inches="tight")
plt.show()
plt.close()

In [None]:
# Use generic plotting function
fig, ax, bk_norm2, ss_norm2  = plot_background_norm_inhibition(tser, bkvecser, 
                                            yser, norm_fct=squared_norm, skp=50)
fig.set_size_inches(5., 2.5)
# Add lines at predicted average
ax.axhline(s_norm2_predict, color="y", ls="--", 
           label=r"Predicted norm${}^2$" + " ({:.1f} %)".format(reduct_factor_predict**2*100), lw=3.)
ax.axhline(np.mean(ss_norm2[transient:]), ls="--", color="xkcd:pink", 
               label=r"Numerical norm${}^2$", lw=1.5)
ax.annotate("{:.1f} %".format(reduct_factor_numeric2*100), xy=(tser[-1]/1000, s_norm2_predict*10), 
           ha="right", va="bottom", xycoords="data")
ax.legend(bbox_to_anchor=(1, 1), loc="upper left", fontsize=9)
ax.set_ylabel("Activity vector squared norm")

fig.tight_layout()
#fig.savefig("figures/two_odors/inhibition_gaussian_background_squarednorm_2odors.pdf", 
#    transparent=True, bbox_inches="tight")
plt.show()
plt.close()

In [None]:
fig, ax = plt.subplots()
tslice = slice(transient, int(duration/deltat), 100)
ax.scatter(bkvecser[tslice, 0], bkvecser[tslice, 1], color="xkcd:light red", label="Un-inhibited")
ax.scatter(yser[tslice, 0], yser[tslice, 1], color="xkcd:burgundy", label="Inhibited")
ax.plot(0, 0, "ks", ms=8)
ax.legend()
ax.set(xlabel="Dim. 1", ylabel="Dim. 2")
fig.set_size_inches(4, 3)
fig.tight_layout()
# fig.savefig("figures/two_odors/ibcm_inhibition_2d_scatter.pdf", transparent=True, bbox_inches="tight")
plt.show()
plt.close()

# Supplementary analysis of the model
Relevant code from older notebooks is collected here. 

## Convergence time scales
See Gautam's notes for details. 

In [None]:
from modelfcts.ibcm import integrate_ibcm, integrate_ibcm_network

In [None]:
def analytical_convergence_times_2d(init_m_sd, norms2_x_sd, mu, sigm2, alph=0.9):
    """ Predict times for m_s and m_d to converge to fixed points. 
    In this function, m_s = m.x_s and m_d = m.x_d, where (as in the calling
    code below) x_s = (x_a + x_b)/2 and x_d = x_a - x_b. Note that these
    labels are swapped compared to the analytical section above. 
    Valid for small sigma^2, when we expect m_s to converge before m_d. 
    If sigma^2 or the initial value of m_d are too large, the prediction
    for ts is probably still good, but for td it will be bad, 
    because we assumed that m_s reaches its steady-state value of 1
    to compute the time td. 
    Args:
        init_m_sd (list of 2 floats): initial value of m.x_s and m.x_d
        norms2_x_sd (list of 2 floats): squared norm of x_s and x_d
        mu (float): learning rate
        sigm2 (float): variance of nu
        alph (float): fraction of the steady-state value at which 
            convergence is declared (default 0.9)
    Returns:
        ts (float): time for m_s to reach steady-state
        td (float): time for m_d to reach steady-state, 
            assuming m_s reached steady-state much faster. 
    """
    ts = (1.0/init_m_sd[0] - 1.0) + np.log(alph*(1.0 - init_m_sd[0]) / (1.0 - alph) / init_m_sd[0])
    ts /= mu * norms2_x_sd[0]
    # Time to converge to 90 %
    sig = np.sqrt(sigm2)
    #td = np.log(alph*np.sqrt(1.0 - sigm2*init_m_sd[1]**2)/(sig*init_m_sd[1]*np.sqrt(1-alph**2)))
    td = np.log(alph / np.sqrt(sigm2) / init_m_sd[1])
    td = td / (mu * norms2_x_sd[1]*sigm2) + ts
    return ts, td

In [None]:
def find_convergence_time(tpts, mds, mdd, sigm2, alph=0.9):
    r""" 
    tpts: time points
    mds: time series of m \cdot \vec{x}_s
    mdd: time series of m \cdot \vec{x}_d
    sigm2: variance of nu
    alph: fraction of the steady-state value counted as converged
    """
    # Check when mds reaches close to 1 (analytical ss value)
    # and when mdd reaches close to \pm 1 / sigma
    ts = tpts[np.argmax(mds > alph)]
    td = tpts[np.argmax(np.abs(mdd) > alph / np.sqrt(sigm2))]
    return ts, td

In [None]:
def example_convergence_time(seed=53235417):
    input_vecs = np.asarray([[1, 0.25], [0.25, 1]]) / np.sqrt(1**2+0.25**2)
    back_components = input_vecs
    tmax = 160000
    ## Compute the coefficients in the Ornstein-Uhlenbeck update equation
    sigma2_nu = 0.09
    tau_nu = 2.0  # Fluctuation time scale of the background nu_alphas (same for all)
    deltat = 1.0
    average_nu = 0.0
    learning_mu = 0.001
    tau_theta = 200
    update_coefs_mean = np.exp(-deltat/tau_nu)
    update_coefs_noise = np.sqrt(sigma2_nu*(1 - np.exp(-2*deltat/tau_nu)))
    init_back_vec = np.sum(input_vecs, axis=0) / 2

    bk_update_params = [average_nu, update_coefs_mean, update_coefs_noise, back_components]
    
    
    # Show an example
    m_init = np.asarray([0.05, 0.025])
    tser, mser, nuser, cser, _, bkvecser = integrate_ibcm(m_init, update_ou_2inputs, 
                        [np.zeros(1), init_back_vec], bk_update_params, tmax, deltat, 
                        learnrate=learning_mu, seed=seed, noisetype="normal", tavg=tau_theta)
    
    mdota_ser = np.dot(mser, input_vecs[0])
    mdotb_ser = np.dot(mser, input_vecs[1])
    mdots_ser = 0.5*(mdota_ser + mdotb_ser)
    mdotd_ser = mdota_ser - mdotb_ser

    initial_mdotsd = [mdots_ser[0], mdotd_ser[0]]
    norms_vecsd = [0.25*np.sum((input_vecs[0] + input_vecs[1])**2), np.sum((input_vecs[0] - input_vecs[1])**2)]
    predict_ts, predict_td = analytical_convergence_times_2d(initial_mdotsd, norms_vecsd, learning_mu, sigma2_nu)
    print(predict_ts, predict_td)
    print(find_convergence_time(tser, mdots_ser, mdotd_ser, sigma2_nu))
    
    fig, axes = plt.subplots(1, 2)
    axes = axes.flatten()
    ax = axes[0]
    # Plot dot products with vectors x_a and x_b
    skp = 100
    ax.plot(tser[::skp], mdota_ser[::skp], color="red", label=r"$\vec{m}(t) \cdot \vec{x}_a$")
    ax.plot(tser[::skp], mdotb_ser[::skp], color="pink", label=r"$\vec{m}(t) \cdot \vec{x}_b$")

    # Analytical predictions
    analytical_mdots = [1.0 + 1.0 / (2.0 * np.sqrt(sigma2_nu)), 1.0 - 1.0 / (2.0 * np.sqrt(sigma2_nu))]
    ax.axhline(analytical_mdots[0], label=r"$1 + 1/(2 \sigma)$", ls="--", color="k")
    ax.axhline(analytical_mdots[1], label=r"$1 - 1/(2 \sigma)$", ls="-.", color="k")
    ax.set(xlabel="Time", ylabel=r"Dot products $\vec{m} \cdot \vec{x}_{a, b}$")

    # Plot dot products with x_s and x_d
    ax = axes[1]
    ax.plot(tser[::skp], mdotd_ser[::skp], color="orange", label=r"$\vec{m}(t) \cdot \vec{x}_d$")
    ax.plot(tser[::skp], mdots_ser[::skp], color="blue", label=r"$\vec{m}(t) \cdot \vec{x}_s$")
    # Analytical predictions
    analytical_mdots = [1.0 / np.sqrt(sigma2_nu), -1.0 / np.sqrt(sigma2_nu)]

    ax.axhline(analytical_mdots[0], label=r"$\pm 1/\sigma$", ls="-.", color="xkcd:orangey brown")
    ax.axhline(1.0, label=r"$\vec{m} \cdot \vec{x}_s = 1$", color="xkcd:marine", ls="--")
    ax.set(xlabel="Time", ylabel=r"Dot products $\vec{m} \cdot \vec{x}_{s, d}$")
    # Convergence times predicted
    for ax in axes:
        ax.axvline(predict_ts, color="cyan", ls=":", label=r"Conv. time $m_s$")
        ax.axvline(predict_td, color="orange", ls="--", label=r"Conv. time $m_d$")
        ax.legend(fontsize=8)

    for ax in axes:
        ax.set_xticks([0, 40000, 80000, 120000, 160000])
    fig.set_size_inches(7.5, 3.)
    fig.tight_layout()
    return fig, axes

In [None]:
fig, axes = example_convergence_time(seed=53235417)
# fig.savefig("figures/two_odors/ibcm_neuron_2d_convergence_time_example.pdf", transparent=True)
plt.show()
plt.close()

In [None]:
def scaling_convergence_time(n_tries=2, seed_sequence=None):
    # Check how ts and td scale as a function of the initial values of
    # m.x_s (eps_s) and m.x_d (eps_d). 
    # n_tries x n_tries pairs of initial (eps_s, eps_d) values are tried. 
    # Plot t_d - t_s, which should depend on eps_d only once the leading
    # order behaviour of t_s is removed. 
    
    # Random number generation business
    if seed_sequence is None:
        seed_sequence = np.random.SeedSequence()
    all_seeds = list(seed_sequence.spawn(n_tries*n_tries))
    
    # Initialize parameters
    input_vecs = np.asarray([[1, 0.25], [0.25, 1]]) / np.sqrt(1**2+0.25**2)
    back_components = input_vecs
    tmax = 160000
    # Compute the coefficients in the Ornstein-Uhlenbeck update equation
    sigma2_nu = 0.09
    tau_nu = 2.0  # Fluctuation time scale of the background nu_alphas (same for all)
    deltat = 1.0
    average_nu = 0.0
    learning_mu = 0.001
    tau_theta = 200
    update_coefs_mean = np.exp(-deltat/tau_nu)
    update_coefs_noise = np.sqrt(sigma2_nu*(1 - np.exp(-2*deltat/tau_nu)))
    init_back_vec = np.sum(input_vecs, axis=0) / 2

    bk_update_params = [average_nu, update_coefs_mean, update_coefs_noise, back_components]

    # Loop over pairs of x_s, x_d values
    epss_axis = np.geomspace(0.02, 0.2, n_tries)
    epsd_axis = np.geomspace(0.01, 0.1, n_tries)
    # First axis of each grid: 0 = analytical prediction, 1 = simulation
    ts_grid = np.zeros([2, n_tries, n_tries])  # Should only depend on epss
    td_grid = np.zeros([2, n_tries, n_tries])
    x_s, x_d = np.sum(input_vecs, axis=0)/2, input_vecs[0] - input_vecs[1]
    norms_vecsd = [np.sum(x_s**2), np.sum(x_d**2)]  # Squared norms of x_s and x_d
    
    for i, epss in enumerate(epss_axis):
        for j, epsd in enumerate(epsd_axis):
            # Combine x_s and x_d to form initial m vector (x_s, x_d are orthogonal, so easy)
            m_init = epss * x_s / norms_vecsd[0] + epsd * x_d / norms_vecsd[1]
            tser, mser, nuser, cser, _, bkvecser = integrate_ibcm(m_init, update_ou_2inputs, 
                        [np.zeros(1), init_back_vec], bk_update_params, tmax, deltat, 
                        learnrate=learning_mu, seed=all_seeds.pop(), noisetype="normal", tavg=tau_theta)
    
            mdota_ser = np.dot(mser, input_vecs[0])
            mdotb_ser = np.dot(mser, input_vecs[1])
            mdots_ser = 0.5*(mdota_ser + mdotb_ser)
            mdotd_ser = mdota_ser - mdotb_ser

            initial_mdotsd = [mdots_ser[0], mdotd_ser[0]]  # Should equal epss, epsd
    
            ts_grid[0, i, j], td_grid[0, i, j] = analytical_convergence_times_2d(initial_mdotsd, norms_vecsd, learning_mu, sigma2_nu)
            ts_grid[1, i, j], td_grid[1, i, j] = find_convergence_time(tser, mdots_ser, mdotd_ser, sigma2_nu)
        print("Completed {} points at eps_s = {}".format(n_tries, epss))
    
    return [epss_axis, epsd_axis], [ts_grid, td_grid]

def plot_time_analysis_results(eps_axes, time_grids):
    n_tries = eps_axes[0].size
    epss_axis, epsd_axis = eps_axes
    ts_grid, td_grid = time_grids
    #print("t_s results:\n", ts_grid)
    #print("t_d results:\n", td_grid)
    
    # Plot results
    fig, axes = plt.subplots(1, 2)
    axes = axes.flatten()
    # ts as a function of epss; should not depend on eps_d
    ax = axes[0]
    colors = sns.color_palette("mako", n_colors=n_tries)
    for j in range(epsd_axis.size):
        ax.plot(epss_axis, ts_grid[1, :, j], color=colors[j], ls="--",
                label=r"$\epsilon_d = {:.3f}$".format(epsd_axis[j]), marker="o")
    # Plot theory, independent of eps_d
    ax.plot(epss_axis, ts_grid[0, :, 0], label="Analytical", color="k", ls="-", lw=2.0)
    ax.set(xlabel=r"$\epsilon_s$ (initial $m_s = \vec{m} \cdot \vec{x}_s$)", 
           ylabel=r"$t_s$ ($\vec{m} \cdot \vec{x}_s$ convergence time)", 
           title="First phase", xscale="log", yscale="log")
    ax.legend(title="Simulation", fontsize=9)
    
    # td - ts as a function of epsd
    ax = axes[1]
    colors = sns.color_palette("flare", n_colors=n_tries)
    # Plot simulations
    for i in range(epss_axis.size):
        ax.plot(epsd_axis, td_grid[1, i] - ts_grid[1, i], color=colors[i], ls="--",
                label=r"$\epsilon_s = {:.3f}$".format(epss_axis[i]), marker="o")
    # Plot theory, independent of eps_s
    ax.plot(epsd_axis, td_grid[0, 0] - ts_grid[0, 0], label="Analytical", color="k", ls="-", lw=2.0)
    ax.set(xlabel=r"$\epsilon_d$ (initial $m_d = \vec{m} \cdot \vec{x}_d$)", 
           ylabel=r"$t_d - t_s$ ($\vec{m} \cdot \vec{x}_d$ convergence time after $t_s$)", 
           title="Second phase", xscale="log", yscale="log")
    ax.legend(title="Simulation", fontsize=9)
    
    fig.set_size_inches(7.5, 3.5)
    fig.tight_layout()
    return fig, axes

In [None]:
eps_axes, time_grids = scaling_convergence_time(n_tries=4, 
                            seed_sequence=np.random.SeedSequence(seed_from_gen(rgen_meta)))

In [None]:
fig, axes = plot_time_analysis_results(eps_axes, time_grids)
# fig.savefig("figures/two_odors/ibcm_neuron_2d_convergence_time_scaling.pdf", transparent=True)
plt.show()
plt.close()
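
The coefficients `update_coefs_mean` and `update_coefs_noise` computed in `scaling_convergence_time` have the form of the exact one-step update of an Ornstein-Uhlenbeck process. The sketch below is a standalone check of that discretization, assuming the update is $\nu(t+\Delta t) = \bar{\nu} + e^{-\Delta t/\tau_\nu}(\nu(t) - \bar{\nu}) + \sqrt{\sigma^2 (1 - e^{-2\Delta t/\tau_\nu})} \, \xi$ with $\xi$ a standard normal sample. The function `simulate_ou` is a hypothetical helper written for this illustration; it is not the `update_ou_2inputs` function used in the notebook, whose internals are not shown here.

```python
import numpy as np

def simulate_ou(n_steps, sigma2, tau, dt=1.0, mean=0.0, seed=0):
    """Hypothetical helper: exact discretization of a scalar O-U process."""
    rng = np.random.default_rng(seed)
    # Same expressions as update_coefs_mean and update_coefs_noise
    decay = np.exp(-dt / tau)
    noise_amp = np.sqrt(sigma2 * (1.0 - np.exp(-2.0 * dt / tau)))
    nu = np.empty(n_steps)
    nu[0] = mean
    for t in range(1, n_steps):
        nu[t] = mean + decay * (nu[t - 1] - mean) + noise_amp * rng.normal()
    return nu

# Parameter values taken from scaling_convergence_time
nu_series = simulate_ou(100000, sigma2=0.09, tau=2.0, dt=1.0)
# Discard a short transient before estimating the stationary variance
empirical_var = nu_series[1000:].var()
print("Empirical variance:", empirical_var, "target:", 0.09)
```

With this choice of coefficients, the stationary variance equals $\sigma^2$ for any time step $\Delta t$, which is why `deltat = 1.0` can be used even though it is comparable to $\tau_\nu = 2$.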

## Distribution of IBCM neurons to opposite fixed points

In [None]:
from modelfcts.ibcm import integrate_ibcm_network

In [None]:
def study_origin_basins_2neurons(init_ampli, bk_sigma2, bk_vecs, learnrate=0.02, seed0=14345124, 
                          tavg=20, coupling=0.1, nsimuls=100):
    """ Sample random initial conditions for two coupled IBCM neurons and record 
    whether they converge to the same fixed point or to opposite ones. 
    
    For each of the nsimuls simulations, both initial weight vectors are drawn from 
    a normal distribution scaled by init_ampli. The Pearson correlation between the 
    two neurons' series in result[-3] is computed after a transient; it is negative 
    when the neurons end up at opposite fixed points. 
    
    seed0 seeds the first simulation; simulation i uses seed0 + i**2. 
    
    Uses the update function update_ou_2inputs with normal noise; nu has mean 0 
    and initial value 0.5, so we don't have to input those parameters. 
    tmax and dt are also chosen in the function, no need to input. 
    
    Returns a 1d array of Pearson correlation coefficients and a 3d array of 
    initial vectors, of shape (nsimuls, 2, 2). 
    """
    # Default simulation parameters
    tmax = 7000
    transient = 2500
    dt = 1
    # nu starts at 0.5 and relaxes to its zero mean within a few tau_nu
    nu_init = 0.5*np.ones(1)
    back_init = [nu_init, (0.5+nu_init)*bk_vecs[0] + (0.5-nu_init)*bk_vecs[1]]
    nu_mean = 0.0
    tau_nu = 2
        
    # Compute the coefficients in the O-U update rule
    update_coefs_mean_loc = np.exp(-dt/tau_nu)
    update_coefs_noise_loc = np.sqrt(bk_sigma2*(1 - np.exp(-2*dt/tau_nu)))
    # Mean of nu is zero
    bk_update_params_loc = [nu_mean, update_coefs_mean_loc, update_coefs_noise_loc, bk_vecs]

    # Fixed points of a single, uncoupled neuron: m_fixplus solves m.x_a = c_plus, 
    # m.x_b = c_minus, and m_fixminus solves the reverse assignment. 
    # The components of fixed points for coupled neurons change by 1/(1 \pm eta) at most.
    # These vectors are computed for reference only; the initial conditions below 
    # are drawn at random around the origin. 
    norma = np.sum(bk_vecs[0]**2)
    normb = np.sum(bk_vecs[1]**2)
    c_plus, c_minus = 1 + 1/(2*np.sqrt(bk_sigma2)), 1 - 1/(2*np.sqrt(bk_sigma2))
    overlap = bk_vecs[0].dot(bk_vecs[1])
    m_fixplus =  (c_plus*normb - overlap*c_minus)/(norma*normb-overlap**2)*bk_vecs[0]
    m_fixplus += (c_minus*norma - overlap*c_plus)/(norma*normb-overlap**2)*bk_vecs[1]
    m_fixminus =  (c_minus*normb - overlap*c_plus)/(norma*normb-overlap**2)*bk_vecs[0]
    m_fixminus += (c_plus*norma - overlap*c_minus)/(norma*normb-overlap**2)*bk_vecs[1]
    
    # Run nsimuls simulations with a different random noise every time.
    rgen_meta = np.random.default_rng(seed=((seed0*52441435) % 1239025234))
    pearson_container = np.zeros(nsimuls)
    init_vecs_container = np.zeros([nsimuls, 2, 2])
    for i in range(nsimuls):
        init_vecs_container[i] = init_ampli * rgen_meta.normal(size=4).reshape(2, 2)
        seedi = seed0 + i**2
        result = integrate_ibcm_network(init_vecs_container[i], update_ou_2inputs, 
                        back_init, bk_update_params_loc, tmax, dt, 
                        learnrate=learnrate, seed=seedi, noisetype="normal", 
                        tavg=tavg, coupling=coupling)

        pearson_container[i] = np.corrcoef(result[-3][transient:].T)[0, 1]  # off-diagonal element of the correl matrix
    
    return pearson_container, init_vecs_container
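
The closed-form expressions for `m_fixplus` in `study_origin_basins_2neurons` can be checked against a direct linear solve. The sketch below assumes the background vectors are the two normalized vectors defined as `input_vecs` earlier in the notebook and that $\sigma^2 = 0.09$; the names `m_closed` and `m_solved` are introduced only for this illustration. It solves $\vec{m} \cdot \vec{x}_a = c_+$, $\vec{m} \cdot \vec{x}_b = c_-$ with $c_\pm = 1 \pm 1/(2\sigma)$ and compares the two results.

```python
import numpy as np

# Assumed values, copied from the parameters used elsewhere in the notebook
sigma2 = 0.09
bk_vecs = np.asarray([[1, 0.25], [0.25, 1]]) / np.sqrt(1**2 + 0.25**2)
c_plus = 1 + 1 / (2 * np.sqrt(sigma2))
c_minus = 1 - 1 / (2 * np.sqrt(sigma2))

# Closed form: m = alpha * x_a + beta * x_b, with alpha and beta from Cramer's rule
norma = np.sum(bk_vecs[0]**2)
normb = np.sum(bk_vecs[1]**2)
overlap = bk_vecs[0].dot(bk_vecs[1])
det = norma * normb - overlap**2
alpha = (c_plus * normb - overlap * c_minus) / det
beta = (c_minus * norma - overlap * c_plus) / det
m_closed = alpha * bk_vecs[0] + beta * bk_vecs[1]

# Direct solve: rows of bk_vecs are x_a and x_b, so bk_vecs @ m = (c_plus, c_minus)
m_solved = np.linalg.solve(bk_vecs, np.array([c_plus, c_minus]))

print("Closed form:", m_closed)
print("Linear solve:", m_solved)
print("Dot products:", bk_vecs.dot(m_closed), "expected:", (c_plus, c_minus))
```

In two dimensions with linearly independent $\vec{x}_a$ and $\vec{x}_b$, the solution is unique, so both routes must agree; `m_fixminus` follows by swapping $c_+$ and $c_-$.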

In [None]:
def plot_run_statistics(correls, all_init_vecs):
    fig, axes = plt.subplots(1, 2)
    axes = axes.flatten()
    ax = axes[0]
    ax.hist(correls)
    ax.set(xlabel="Pearson correlation", ylabel="Number of simulations")

    print(np.count_nonzero(correls < 0) / correls.size)
    # Study those cases where both went to same fixed point
    args_where_same, = np.nonzero(correls > 0)
    ax = axes[1]
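    if len(args_where_same) == 0:
        # No run ended with both neurons at the same fixed point: leave the second
        # panel empty, but still return (fig, ax) so that callers can unpack the result
        ax.set_axis_off()
        fig.set_size_inches(6, 3.5)
        fig.tight_layout()
        return fig, ax
    max_comp = np.amax(np.abs(all_init_vecs[args_where_same])) * 1.05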
    ax.set_xlim([-max_comp, max_comp])
    ax.set_ylim([-max_comp, max_comp])
    colors = sns.color_palette(n_colors=len(args_where_same))
    for i in range(len(args_where_same)):
        vecs = all_init_vecs[args_where_same[i]]
        arrowprops = dict(color=colors[i], arrowstyle="->")
        ax.annotate("", xy=vecs[0], xytext=(0, 0), arrowprops=arrowprops)
        ax.annotate("", xy=vecs[1], xytext=(0, 0), arrowprops=arrowprops)
        ax.plot(vecs[0, 0:1], vecs[0, 1:2], ls="none", marker="o", alpha=0.0)
        ax.plot(vecs[1, 0:1], vecs[1, 1:2], ls="none", marker="o", alpha=0.0)
    ax.set_aspect("equal")
    ax.set_title("Initial vectors when both neurons\nconverge to same fixed point")
    fig.set_size_inches(6, 3.5)
    fig.tight_layout()
    return fig, ax

In [None]:
coupling_eta = 0.2
correl_samples, initvecs_samples = study_origin_basins_2neurons(0.01, sigma2_nu, back_components, 
            learnrate=0.02, seed0=14124779, tavg=20, coupling=coupling_eta, nsimuls=50)

In [None]:
fig, ax = plot_run_statistics(correl_samples, initvecs_samples)
# fig.savefig("figures/two_odors/two_ibcm_neurons_opposite_fixed_pts_statistics.pdf", transparent=True)
plt.show()
plt.close()
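
The sign of the Pearson correlation separates the two outcomes because of how the activity depends on $\nu$. As a minimal sketch, assume the background is $\vec{x} = (1/2 + \nu) \vec{x}_a + (1/2 - \nu) \vec{x}_b$, so that $c = \vec{m} \cdot \vec{x} = m_s + \nu \, m_d$, and take the single-neuron fixed points $m_s = 1$, $m_d = \pm 1/\sigma$. The coupling $\eta$ and the residual weight fluctuations of a real simulation are ignored here, and $\nu$ is sampled independently at each step for simplicity, so this only illustrates the sign argument; the helper `activity` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.3  # square root of sigma2_nu = 0.09
nu = sigma * rng.normal(size=5000)  # i.i.d. samples; time correlations are ignored

def activity(m_s, m_d, nu):
    """Hypothetical helper: c = m.x for x = (1/2 + nu) x_a + (1/2 - nu) x_b."""
    return m_s + nu * m_d

# Two neurons at opposite fixed points: m_d = +1/sigma and m_d = -1/sigma
c_up = activity(1.0, 1.0 / sigma, nu)
c_down = activity(1.0, -1.0 / sigma, nu)
corr_opposite = np.corrcoef(np.stack([c_up, c_down]))[0, 1]

# Two neurons at the same fixed point, with a small independent perturbation
c_up_noisy = c_up + 0.01 * rng.normal(size=nu.size)
corr_same = np.corrcoef(np.stack([c_up, c_up_noisy]))[0, 1]

print("Opposite fixed points:", corr_opposite)
print("Same fixed point:", corr_same)
```

In this idealized setting the two activities are exact mirror images around $m_s$ when the neurons occupy opposite fixed points, which is why `plot_run_statistics` counts runs with a negative correlation as cases where the neurons split between the two fixed points.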