# Habituation with PCA neurons in the inhibitory layer
The details of the model are described in other Jupyter notebooks (e.g., biopca_inhibition_three_components.ipynb). The ultimate goal here is to include this model in the full olfactory network down to Kenyon cells, apply it to increasingly realistic olfactory backgrounds and estimate its performance at 1) inhibiting the fluctuating background, and 2) still recognizing new odors. 

## Functions of general interest

In [None]:
import numpy as np
import scipy as sp
from scipy import stats
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns

from modelfcts.biopca import (
    integrate_inhib_ifpsp_network_skip, 
    relu_inplace, 
    biopca_respond_new_odors as response_new_odors
)

# Offline PCA for comparison
from utils.statistics import principal_component_analysis, seed_from_gen

from modelfcts.backgrounds import (update_logou_kinputs, update_thirdmoment_kinputs, 
                                   decompose_nonorthogonal_basis, logof10,  update_alternating_inputs)
from modelfcts.tagging import project_neural_tag, create_sparse_proj_mat
from simulfcts.plotting import (plot_3d_series, plot_w_matrix, plot_m_matrix, plot_pca_results,
                            plot_background_norm_inhibition, plot_background_neurons_inhibition)
from utils.metrics import frobnorm, subspace_align_error, jaccard
from modelfcts.checktools import compute_pca_meankept, compute_projector_series, analyze_pca_learning

## 1. Projection neuron layer
First, just check what happens with a new odor at the projection neuron layer. I will try the slightly non-gaussian noise case and 
 - a) symmetric components, 
 - b) non-symmetric (random, exponential elements) components. 

I will try both low and high dimensions in the non-symmetric case. A high dimension helps to find nearly-orthogonal backgrounds and new odors. 

### 1.a) Symmetric odor vectors

In [None]:
### General simulation parameters
n_dimensions = 4  
# The larger the dimension, the more likely the odors are orthogonal. 
n_components = 3  # no need to look at super complicated odors for now; keep effective space 3D
# Can actually look at this latent space by Gram-Schmidt to find orthogonal axes spanning the input odors. 
n_neurons = 3  # Start small

# Simulation time scales
duration = 160000.0
deltat = 1.0
tau_nu = 2.0  # Correlation time scale of the background nu_gammas (same for all)
learnrate = 0.0005  # Learning rate of M
# Choose Lambda diagonal matrix as advised in Minden et al., 2018
lambda_range = 0.2
lambda_max = 5.0
# Learning rate of L, relative to learnrate. Adjusted to Lambda in the integration function
rel_lrate = 2.0  #  / lambda_max**2 
lambda_mat_diag = np.asarray([1.0 - lambda_range*k / (n_neurons - 1) for k in range(n_neurons)])
lambda_mat_diag *= lambda_max
biopca_rates = [learnrate, rel_lrate, lambda_max, lambda_range, learnrate]
inhib_rates = [1e-4, 2e-5]  # alpha, beta

# Initial synaptic weights: as advised in Minden et al., 2018 
rgen_meta = np.random.default_rng(seed=0x718e19927b12b8df4daaa66a2f0e6b76)
init_mmat = rgen_meta.standard_normal(size=[n_neurons, n_dimensions]) / np.sqrt(n_dimensions)
init_lmat = np.eye(n_neurons, n_neurons)  # Supposed to be near-identity, start as identity
ml_inits = [init_mmat, init_lmat]

# Choose three LI vectors in (+, +, +) octant: [0.8, 0.2, 0.2], [0.2, 0.8, 0.2], etc.
back_components = 0.2 * np.ones([n_components, n_dimensions])
for i in range(n_components):
    if i < n_dimensions:
        back_components[i, i] = 0.8
    else:  # If there are more components than there are dimensions (ORNs)
        back_components[i, i % n_dimensions] = 0.8 - i
    # Normalize
    back_components[i] = back_components[i] / np.sqrt(np.sum(back_components[i]**2))

# Initial background vector and initial nu values
averages_nu = np.ones(n_components) / np.sqrt(n_components)
init_nu = np.zeros(n_components)
init_bkvec = averages_nu.dot(back_components)
# nus are first in the list of initial background params
init_back_list = [init_nu, init_bkvec]

## Compute the matrices in the Ornstein-Uhlenbeck update equation
# Update matrix for the mean term: 
# Exponential decay with time scale tau_nu over time deltat
tau_nu = 2.0  # Fluctuation time scale of the background nu_gammas (same for all)
update_mat_A = np.identity(n_components)*np.exp(-deltat/tau_nu)

# Steady-state covariance matrix
sigma2 = 0.09
correl_rho = 0.0
epsilon_nu = 0.2
steady_covmat = correl_rho * sigma2 * np.ones([n_components, n_components])  # Off-diagonals: rho * sigma2
steady_covmat[np.eye(n_components, dtype=bool)] = sigma2  # diagonal: sigma2

# Cholesky decomposition of steady_covmat gives sqrt(tau/2) B
# Update matrix for the noise term: \sqrt(tau/2(1 - exp(-2*deltat/tau))) B
# The sqrt(tau/2) is already included in the Cholesky of the covariance matrix; it is only if we
# wanted the B matrix of the O-U equation itself that we should divide the Cholesky by that factor. 
psi_mat = np.linalg.cholesky(steady_covmat)
update_mat_B = np.sqrt(1.0 - np.exp(-2.0*deltat/tau_nu)) * psi_mat

back_params_3 = [update_mat_A, update_mat_B, back_components, averages_nu, epsilon_nu]
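
The update matrices above can be sanity-checked independently of the simulation code. The following is a minimal sketch covering only the Gaussian Ornstein-Uhlenbeck part of the update; how `update_thirdmoment_kinputs` uses `epsilon_nu` is not shown in this notebook and is not modelled here. The names `mat_a`, `mat_b`, `cov` and `ou_step` are hypothetical and introduced for illustration only. It checks that the steady-state covariance is left unchanged by one update step.

```python
import numpy as np

# Illustrative values, copied from the parameters chosen above
n_comp = 3
dt, tau = 1.0, 2.0
sigma2, rho = 0.09, 0.0

# Steady-state covariance: sigma2 on the diagonal, rho*sigma2 elsewhere
cov = rho * sigma2 * np.ones([n_comp, n_comp])
cov[np.eye(n_comp, dtype=bool)] = sigma2

# One-step update matrices of the discretized O-U process
mat_a = np.identity(n_comp) * np.exp(-dt / tau)
mat_b = np.sqrt(1.0 - np.exp(-2.0 * dt / tau)) * np.linalg.cholesky(cov)

# Stationarity condition: cov = A cov A^T + B B^T
cov_next = mat_a @ cov @ mat_a.T + mat_b @ mat_b.T
stationary_ok = np.allclose(cov_next, cov)


def ou_step(nu, rng):
    """One update nu -> A nu + B xi, with xi standard normal (hypothetical helper)."""
    return mat_a @ nu + mat_b @ rng.standard_normal(n_comp)


nu_next = ou_step(np.zeros(n_comp), np.random.default_rng(0))
```

If `stationary_ok` is true, samples drawn from the steady-state distribution keep the same covariance after any number of updates, which is what justifies sampling backgrounds directly from that distribution later on.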

In [None]:
# Run simulation
sim_results = integrate_inhib_ifpsp_network_skip(ml_inits, update_thirdmoment_kinputs, 
                        init_back_list, biopca_rates, inhib_rates, back_params_3, duration, 
                        deltat, seed=seed_from_gen(rgen_meta), noisetype="normal")
# tseries, bk_series, bkvec_series, m_series, l_series, cbar_series, w_series, s_series
tser3, nuser3, bkvecser3, mser3, lser3, cbarser3, wser3, sser3 = sim_results

### Plotting the time course of the different neurons

In [None]:
skp = 100
tini = 0
tmx = int(duration)
tslice = slice(tini, tmx, skp)
fig, ax = plt.subplots()
w1_palette = sns.color_palette("Blues", n_colors=n_neurons)
w2_palette = sns.color_palette("Purples", n_colors=n_neurons)
w3_palette = sns.color_palette("Greens", n_colors=n_neurons)
#ax.plot(tser3[tslice], bkvecser3[tslice, 0], color="pink", alpha=0.5)
#ax.plot(tser3[tslice], bkvecser3[tslice, 1], color="blue", alpha=0.5)
#ax.plot(tser3[tslice], bkvecser3[tslice, 2], color="red", alpha=0.8)
#ax.plot(tser3[tslice], thetaser3[tslice, -1], color="orange", alpha=0.8)
for i in range(n_neurons-1):
    ax.plot(tser3[tslice], mser3[tslice, i, 0], color=w1_palette[i], alpha=0.8)
    ax.plot(tser3[tslice], mser3[tslice, i, 1], color=w2_palette[i], alpha=0.8)
    ax.plot(tser3[tslice], mser3[tslice, i, 2], color=w3_palette[i], alpha=0.8)
ax.plot(tser3[tslice], mser3[tslice, -1, 0], color=w1_palette[-1], label="Neuron Component 0", alpha=0.8)
ax.plot(tser3[tslice], mser3[tslice, -1, 1], color=w2_palette[-1], label="Neuron Component 1", alpha=0.8)
ax.plot(tser3[tslice], mser3[tslice, -1, 2], color=w3_palette[-1], label="Neuron Component 2", alpha=0.8)

ax.set(xlabel="Time", ylabel="Inhibition neurons components")
ax.legend()
plt.show()
plt.close()

## Compare to offline PCA
Check that the model is doing the best it can for non-gaussian inputs. 

Let $X$ be a matrix with each input sample in a column. According to Lemma 3 from Minden et al., 2018:
 - $L$ is diagonal with the $K$ first principal values, that is, the $K$ first eigenvalues of $XX^T$, on its diagonal
 - $\hat{U}_K = \Lambda^{-1} L^{-1} M$ is the learnt projector on the $K$ subspace: rows of $\hat{U}_K$ are the first $K$ principal components. 
 - Note that in the ifPSP algorithm, the first-order Taylor series approximation $L^{-1} \approx L_d^{-1} - L_d^{-1} L_o L_d^{-1}$ is used, where $L_d$ and $L_o$ are the diagonal and off-diagonal parts of $L$. 
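
To get a feeling for the first-order approximation of $L^{-1}$ mentioned in the last item, here is a small stand-alone sketch. The matrix `l_example` is made up for illustration (it is not a simulation output), and `approx_inverse_first_order` is a hypothetical helper, not a function from `modelfcts`.

```python
import numpy as np


def approx_inverse_first_order(l_mat):
    """First-order approximation L^-1 ~ L_d^-1 - L_d^-1 L_o L_d^-1."""
    l_diag_inv = np.diag(1.0 / np.diagonal(l_mat))
    l_off = l_mat - np.diag(np.diagonal(l_mat))
    return l_diag_inv - l_diag_inv @ l_off @ l_diag_inv


# Made-up, nearly diagonal L matrix (assumption for illustration)
l_example = np.array([[2.0, 0.05, 0.0],
                      [0.05, 1.5, 0.02],
                      [0.0, 0.02, 1.0]])

exact_inv = np.linalg.inv(l_example)
approx_inv = approx_inverse_first_order(l_example)
max_abs_error = np.max(np.abs(exact_inv - approx_inv))
print("Largest element-wise error:", max_abs_error)
```

The error is second order in the off-diagonal elements, so the approximation is only reasonable once $L$ has become nearly diagonal, which is the regime where the learnt $L$ is expected to end up according to the first item of the list.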

In [None]:
res = analyze_pca_learning(bkvecser3, mser3, lser3, lambda_mat_diag)
true_pca, learnt_pca, fser, off_diag_l_avg_abs, align_error_ser = res

In [None]:
fig, axes = plot_pca_results(tser3, true_pca, learnt_pca, align_error_ser, off_diag_l_avg_abs)
plt.show()
plt.close()

## Evolution of the inhibitory neurons' weights $\vec{w}_i$
Analytically, I find that, on average, $\vec{w}_i$ converges to $\vec{x}(\pm \sigma)$, i.e. to either input vector one standard deviation away from the mean input. So, here, I compare the numerical results for $\vec{w}$ to the possible fixed points. 

In [None]:
# Plotting the time course of the dot products -- not interesting with gaussian degeneracy
# Unclear what it shows. 
fig, axes = plot_w_matrix(tser3, wser3, skp=20, lw=1.5)
        
plt.show()
plt.close()

In [None]:
fig, ax, bknorm_ser, snorm_ser = plot_background_norm_inhibition(tser3, bkvecser3, sser3, skp=10)
skp_sym = 1
# Compute noise reduction factor, annotate
transient = 50000
avg_bknorm = np.mean(bknorm_ser[transient:])
avg_snorm = np.mean(snorm_ser[transient:])
avg_reduction_factor = avg_snorm / avg_bknorm
std_bknorm = np.std(bknorm_ser[transient:])
std_snorm = np.std(snorm_ser[transient:])
std_reduction_factor = std_snorm / std_bknorm

print("Mean activity norm reduced to "
      + "{:.1f} % of input".format(avg_reduction_factor * 100))
print("Standard deviation of activity norm reduced to "
      + "{:.1f} % of input".format(std_reduction_factor * 100))
ax.annotate("St. dev. reduced to {:.1f} %".format(std_reduction_factor * 100), 
           xy=(0.98, 0.98), xycoords="axes fraction", ha="right", va="top")

ax.legend(loc="center right", bbox_to_anchor=(1.0, 0.8))
fig.tight_layout()
plt.show()
plt.close()

In [None]:
fig, axes_mat, axes = plot_background_neurons_inhibition(tser3, bkvecser3, sser3, skp=10)
axes[-1].legend(loc="center right", bbox_to_anchor=(1.0, 0.6), fontsize=8, handlelength=1.5)
fig.tight_layout()

# Compute noise reduction factor, annotate
transient = 100000 // skp_sym
avg_bknorm = np.mean(bkvecser3[transient:])
avg_snorm = np.mean(sser3[transient:])
avg_reduction_factor = avg_snorm / avg_bknorm
std_bknorm = np.std(bkvecser3[transient:])
std_snorm = np.std(sser3[transient:])
std_reduction_factor = std_snorm / std_bknorm

print("Mean activity of a projection neuron reduced to "
      + "{:.1f} % of input".format(avg_reduction_factor * 100))
print("Standard deviation of a projection neuron's activity reduced to "
      + "{:.1f} % of input".format(std_reduction_factor * 100))

plt.show()
plt.close()

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Plot a few points for each neuron
trst = 5000
skp = 500
ax.plot(0, 0, 0, color="k", marker="o", ls="none", ms=12)
colors = sns.color_palette("magma", n_colors=n_neurons)
for i in range(n_neurons):
    ax.scatter(mser3[trst::skp, i, 0], mser3[trst::skp, i, 1], mser3[trst::skp, i, 2], 
               alpha=0.5, color=colors[i], label="Neuron {}".format(i))

# Annotate with vectors representing the odor components
orig = np.zeros([n_dimensions, n_components])
scale = 0.3
vecs = back_components.copy()
for i in range(n_components):
    vecs[i] = back_components[i] / np.sqrt(np.sum(back_components[i]**2)) * scale
ax.quiver(*orig[:3], *(vecs[:, :3].T), color="k", lw=2.0)
ax.view_init(azim=45, elev=30)
# ax.view_init(azim=45, elev=140)
# ax.legend()
plt.show()
plt.close()

In [None]:
projectorser3 = compute_projector_series(mser3, lser3)
projectorser3 = np.einsum("...ij,...jk", wser3, projectorser3)
projectionser3 = np.einsum("...ij,...j", projectorser3, bkvecser3)
subtracted_series = bkvecser3 - projectionser3

fig, axes = plt.subplots(2, n_dimensions // 2 + min(1, n_dimensions % 2), sharex=True, sharey=True)
axes = axes.flatten()
for i in range(n_dimensions):
    axes[i].plot(tser3/1000, subtracted_series[:, i], color="k")
    axes[i].set(xlabel="Time (x1000 steps)", 
                ylabel=r"$\vec{x} - LM\vec{x}$, " + "ORN {:d}".format(i))
fig.tight_layout()
plt.show()
plt.close()

### Response to a new odor
This part of the code only runs if the simulation above had ``n_dimensions > n_components``. 

The goal is to see whether a new odor, linearly independent of the ones in the background, also gets repressed close to zero, or whether it produces an inhibited output that is noticeably different from the inhibited background and still similar to the new odor vector, or at least to its component perpendicular to the background subspace. 

Need to test for many samples from the background odor distribution. Keep the new odor at a constant concentration, typical of the concentration at which we actually want the system to pick up the new odor. 

I realize that it's fine if the disentanglement of odors isn't perfect at the PN layer: besides the question of habituation, the sparse tag network proposed by Dasgupta does not address too well how multiple odors are disentangled from a complicated mixture. 
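
To make the "component perpendicular to the background subspace" concrete, here is a minimal sketch for the symmetric 4-dimensional case of section 1.a. It rebuilds the background vectors locally rather than reusing notebook variables, and the names `back`, `odor_par` and `odor_perp` are introduced for illustration only. The decomposition uses a least-squares fit of the new odor on the background vectors.

```python
import numpy as np

# Symmetric background components as in section 1.a (3 odors, 4 ORNs)
back = 0.2 * np.ones([3, 4])
back[np.arange(3), np.arange(3)] = 0.8
back /= np.linalg.norm(back, axis=1, keepdims=True)

# New odor: first background vector with its elements rolled by one position
new_odor = np.roll(back[0], shift=-1)

# Least-squares coefficients of the new odor on the background vectors
coefs, *_ = np.linalg.lstsq(back.T, new_odor, rcond=None)
odor_par = back.T @ coefs       # part inside the background subspace
odor_perp = new_odor - odor_par  # part the inhibition should ideally keep

perp_fraction = np.linalg.norm(odor_perp) / np.linalg.norm(new_odor)
print("Fraction of the new odor norm outside the background subspace:", perp_fraction)
```

Only `odor_perp` can be told apart from a background fluctuation by a linear network, so `perp_fraction` gives a rough upper bound on how much of the new odor can survive inhibition.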

In [None]:
def respond_new_odors(odors, typical_l, typical_m, typical_w):
    """ Compute the projection neuron output for odors, given fixed (typical)
    synaptic weights L, M, and W of the inhibitory network. 
    
    Args:
        odors (np.ndarray): indexed [..., n_orn] 
            so can take dot product properly with m and store many 
            odors along arbitrary other axes.
        typical_l (np.ndarray): indexed [n_neurons, n_neurons]
        typical_m (np.ndarray): indexed [n_neurons, n_orn]
        typical_w (np.ndarray): indexed [n_orn, n_neurons]
    Returns:
        new_output (np.ndarray): same shape as odors, s = relu(x - W cbar)
    """
    # Compute L_d inverse and L_o
    inv_l_diag = 1.0 / np.diagonal(typical_l)
    l_off = typical_l - np.diagflat(1.0 / inv_l_diag)
    
    # Compute activation of neurons
    c = inv_l_diag * (odors.dot(typical_m.T))
    # Lateral inhibition between neurons, first-order approximation of L^-1
    cbar = c - inv_l_diag*c.dot(l_off.T)
    # Compute output after inhibition
    new_output = relu_inplace(odors - cbar.dot(typical_w.T))  # s = x - W cbar
    return new_output

In [None]:
from utils.metrics import l2_norm, l1_norm, linf_norm, cosine_dist
def distance_panel_target(mixes, target):
    """ Compute a panel of distances between the pure (target) new odor and mixtures 
    (which can be without inhibition, with average inhibition, biopca inhibition, etc.). 
    
    Four distances included, in order: l2, l1, linf, cosine_dist
    
    Args:
        mixes (np.ndarray): mixtures of odors to compute distance from target, 
            the last axis should have the size of target, 
            while other axes are arbitrary.  
        target (np.1darray): target odor vector, same length as
            last axis of mixes. 
    Returns:
        dist_panel (np.ndarray): shape of mixes, except the last axis, 
            which has length 4 (for the number of distances computed). 
    """
    # Make axis 0 the axis indexing distance metrics, to begin with
    # And move it to the last axis before returning
    dist_array = np.zeros([4] + list(mixes.shape[:-1]))
    # No need to add axes to target vector; if it is 1d, it is broadcasted
    # along the last axis of mixes, which indexes elements of each vector. 
    dist_array[0] = l2_norm(target - mixes)
    dist_array[1] = l1_norm(target - mixes)
    dist_array[2] = linf_norm(target - mixes)
    dist_array[3] = cosine_dist(target, mixes)
    
    return np.moveaxis(dist_array, 0, -1)

In [None]:
from modelfcts.backgrounds import sample_background_thirdmoment

In [None]:
# Statistics of improvement of recognition
# New odor
new_odor = np.roll(back_components[0], shift=-1)  # Should be a new vector

# Background samples, then add new odor
mix_samples = sample_background_thirdmoment(averages_nu, steady_covmat, epsilon_nu, back_components, 
                                                  size=1000, rgen=rgen_meta)
mix_frac = 0.5
mix_samples += mix_frac*new_odor.reshape(1, -1)

# Compare to inhibition of the average background
avg_back = averages_nu.dot(back_components)
a_over_2ab = inhib_rates[0] / (sum(inhib_rates) + inhib_rates[0])
inhib_avg_samples = mix_samples - a_over_2ab * avg_back.reshape(1, -1)

# Average m and w with which we will inhibit
l3_mean = np.mean(lser3[transient:], axis=0)
m3_mean = np.mean(mser3[transient:], axis=0)
w3_mean = np.mean(wser3[transient:], axis=0)

# Inhibition of each generated sample and statistics on performance
inhib_biopca_samples = respond_new_odors(mix_samples, l3_mean, m3_mean, w3_mean)

dist_pure_inhib_none = distance_panel_target(mix_samples, mix_frac*new_odor)
dist_pure_inhib_avg = distance_panel_target(inhib_avg_samples, mix_frac*new_odor)
dist_pure_inhib_biopca = distance_panel_target(inhib_biopca_samples, mix_frac*new_odor)

median_distances_none = np.median(dist_pure_inhib_none, axis=0)
median_distances_avg = np.median(dist_pure_inhib_avg, axis=0)
median_distances_biopca = np.median(dist_pure_inhib_biopca, axis=0)

In [None]:
# Histogram of distance to pure odor, for each distance
# Overlay histogram for mix without and with inhibition
fig, axes = plt.subplots(2, 2)
axes = axes.flatten()
clr_none = "xkcd:navy blue"
clr_biopca = "xkcd:turquoise"
clr_avg = "xkcd:orangey brown"
dist_names = [r"$L^2$ distance", r"$L^1$ distance", r"$L^{\infty}$ distance", "Cosine distance"]
for i, ax in enumerate(axes):
    ax.hist(dist_pure_inhib_none[:, i], label="No inhibition", facecolor=clr_none, alpha=0.6, 
        edgecolor=clr_none, density=True)
    ax.axvline(median_distances_none[i], color=clr_none, ls="--", lw=1.0)
    ax.hist(dist_pure_inhib_avg[:, i], label="Average inhibition", facecolor=clr_avg, alpha=0.6, 
        edgecolor=clr_avg, density=True)
    ax.axvline(median_distances_avg[i], color=clr_avg, ls="--", lw=1.0)
    ax.hist(dist_pure_inhib_biopca[:, i], label="biopca inhibition", facecolor=clr_biopca, alpha=0.6, 
        edgecolor=clr_biopca, density=True) 
    ax.axvline(median_distances_biopca[i], color=clr_biopca, ls="--", lw=1.0)
    ax.set(xlabel="Distance to new odor", ylabel="Probability density", title=dist_names[i])
axes[0].legend()
fig.tight_layout()
plt.show()
plt.close()

#### Metric to measure the quality of the inhibition
Distance to new odor alone? Compare to un-inhibited mixture?

Ultimately, will compute sparse binary neural tag and compare with Jaccard metric, but for now, I want to avoid this complication, which requires using many more dimensions than 4. I might keep this for a separate notebook (or even C code if it seems to work well). 
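
As a reminder of what the Jaccard metric measures on binary tags, here is a minimal sketch. It represents each tag as a set of active Kenyon cell indices, which is an assumption: the actual `jaccard` function imported from `utils.metrics` may use a different representation. The helper `jaccard_similarity` and the example tags are hypothetical.

```python
def jaccard_similarity(tag_a, tag_b):
    """Size of the intersection over size of the union of two sets of indices."""
    set_a, set_b = set(tag_a), set(tag_b)
    union = set_a | set_b
    if not union:
        # Convention chosen here (assumption): two empty tags are identical
        return 1.0
    return len(set_a & set_b) / len(union)


# Made-up tags: indices of active cells for a target odor and for a mixture
tag_target = {1, 4, 7, 9}
tag_mix = {1, 4, 8, 9, 12}
similarity = jaccard_similarity(tag_target, tag_mix)  # 3 shared / 6 in union = 0.5
print(similarity)
```

A similarity of 1 means the mixture produces exactly the tag of the pure new odor; 0 means the two tags share no active cell.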

### 1.b) Random odor vectors

In [None]:
# Realistic model of olfactory receptor activation patterns:
# each component is i.i.d. exponential
def generate_odorant(n_rec, rgen, lambda_in=0.1):
    """ Generate a vector of receptor activities for an odorant, 
    with i.i.d. exponentially distributed elements. 
    
    Args:
        n_rec (int): number of receptor types, length of vectors
        rgen (np.random.Generator): random generator (numpy >= 1.17)
        lambda_in (float): lambda parameter of the exp distribution
            Equals the inverse of the average of each vector component
    Returns:
        kappa1_vec (np.ndarray): 1d vector of receptor activities
    """
    return rgen.exponential(scale=1.0/lambda_in, size=n_rec)

## 2. Complete model: sparse Kenyon cell tags for odors
We need to make new simulations with many more dimensions (ORN types). 

Consequently, to avoid running into memory issues, we only save a subset of time steps in the simulation: this is fine because we are only interested in the slowly-evolving $\vec{w}$ and $\vec{m}$, while we don't care too much for $\vec{x}$'s fast fluctuations. We just want the final average $\vec{w}$ to apply as inhibition to randomly sampled background odors, which we don't even take from simulations but just generate from the steady-state distribution. 
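
The idea of saving only a subset of time steps can be sketched as follows. This is a generic illustration, not the implementation of `integrate_inhib_ifpsp_network_skip`; the function `integrate_with_skip`, its arguments and the toy update rule are all hypothetical.

```python
import numpy as np


def integrate_with_skip(update_fn, state0, n_steps, skp=1):
    """Iterate update_fn for n_steps, storing the state every skp steps only."""
    state = np.asarray(state0, dtype=float)
    n_saved = n_steps // skp
    saved = np.zeros([n_saved] + list(state.shape))
    for step in range(n_steps):
        # Store the state before updating it, once every skp steps
        if step % skp == 0 and step // skp < n_saved:
            saved[step // skp] = state
        state = update_fn(state)
    return saved


def toy_decay(w):
    """Toy slow dynamics standing in for the weight updates."""
    return 0.99 * w


series = integrate_with_skip(toy_decay, np.ones(25), n_steps=1000, skp=20)
print(series.shape)
```

With this storage scheme, a time expressed in simulation steps has to be divided by `skp` before it is used as an index into the saved series, which is why the transient is later computed as `100000 // skp_tag`.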

### Run a new simulation with 25 dimensions

In [None]:
### General simulation parameters
n_dimensions_tag = 25  # Half the real number for faster simulations
n_neurons = n_components

# Simulation rates and coupling stay the same (try at least)
duration = 160000.0
deltat = 1.0
tau_nu = 2.0  # Correlation time scale of the background nu_gammas (same for all)
learnrate = 0.001  # Learning rate of M
rel_lrate = 2.0  # Learning rate of L, relative to learnrate
# Choose Lambda diagonal matrix as advised in Minden et al., 2018
# but scale it up to counteract W regularization
lambda_range = 0.2
lambda_max = 5.0
# Learning rate of L, relative to learnrate. Adjusted to Lambda in the integration function
rel_lrate = 2.0  #  / lambda_max**2 
lambda_mat_diag = np.asarray([1.0 - lambda_range*k / (n_neurons - 1) for k in range(n_neurons)])
lambda_mat_diag *= lambda_max
biopca_rates = [learnrate, rel_lrate, lambda_max, lambda_range, learnrate]

inhib_rates = [5e-5, 1e-5]  # alpha, beta

# Initial synaptic weights: as advised in Minden et al., 2018 
rgen_meta_tag = np.random.default_rng(seed=0xb65007421a888ebe4c0e83313ef84911)
init_mmat_tag = rgen_meta_tag.standard_normal(size=[n_neurons, n_dimensions_tag]) / np.sqrt(n_dimensions_tag)
init_lmat_tag = np.eye(n_neurons, n_neurons)  # Supposed to be near-identity, start as identity
ml_inits_tag = [init_mmat_tag, init_lmat_tag]

# Choose symmetric, normalized background odor components
#back_components_tag = np.ones([n_components, n_dimensions_tag]) * 0.2
#for i in range(n_components):
#    back_components_tag[i, i] = 0.8
#    back_components_tag[i] /= np.sqrt(np.sum(back_components_tag[i]**2))

# Choose randomly generated background vectors
back_components_tag = np.zeros([n_components, n_dimensions_tag])
for i in range(n_components):
    back_components_tag[i] = generate_odorant(n_dimensions_tag, rgen_meta_tag, lambda_in=0.1)
print(back_components_tag)
back_components_tag = back_components_tag / l2_norm(back_components_tag).reshape(-1, 1)

# Initial nu values stay the same
init_bkvec_tag = averages_nu.dot(back_components_tag)
# nus are first in the list of initial background params
init_back_list_tag = [init_nu, init_bkvec_tag]

# Update matrices for nu process stay the same
back_params_tag = [update_mat_A, update_mat_B, back_components_tag, averages_nu, epsilon_nu]

In [None]:
skp_tag = 20
sim_results = integrate_inhib_ifpsp_network_skip(ml_inits_tag, update_thirdmoment_kinputs, 
                        init_back_list_tag, biopca_rates, inhib_rates, back_params_tag, duration, 
                        deltat, seed=seed_from_gen(rgen_meta_tag), noisetype="normal", skp=skp_tag)
# tseries, bk_series, bkvec_series, m_series, l_series, cbar_series, w_series, s_series
tser_tag, nuser_tag, bkvecser_tag, mser_tag, lser_tag, cbarser_tag, wser_tag, sser_tag = sim_results

### Check the output a bit

In [None]:
skp = 50
fig, ax = plt.subplots()
w1_palette = sns.color_palette("Blues", n_colors=n_neurons)
w2_palette = sns.color_palette("Purples", n_colors=n_neurons)
w3_palette = sns.color_palette("Greens", n_colors=n_neurons)
for i in range(n_neurons-1):
    ax.plot(tser_tag[::skp], mser_tag[::skp, i, 0], color=w1_palette[i], alpha=0.8)
    ax.plot(tser_tag[::skp], mser_tag[::skp, i, 1], color=w2_palette[i], alpha=0.8)
    ax.plot(tser_tag[::skp], mser_tag[::skp, i, 2], color=w3_palette[i], alpha=0.8)
ax.plot(tser_tag[::skp], mser_tag[::skp, -1, 0], color=w1_palette[-1], label="Neuron Component 0", alpha=0.8)
ax.plot(tser_tag[::skp], mser_tag[::skp, -1, 1], color=w2_palette[-1], label="Neuron Component 1", alpha=0.8)
ax.plot(tser_tag[::skp], mser_tag[::skp, -1, 2], color=w3_palette[-1], label="Neuron Component 2", alpha=0.8)

ax.set(xlabel="Time", ylabel="Inhibition neurons components")
ax.legend()
plt.show()
plt.close()

In [None]:
res = analyze_pca_learning(bkvecser_tag, mser_tag, lser_tag, lambda_mat_diag)
true_pca, learnt_pca, fser, off_diag_l_avg_abs, align_error_ser = res

fig, axes = plot_pca_results(tser_tag, true_pca, learnt_pca, align_error_ser, off_diag_l_avg_abs)
plt.show()
plt.close()

### Check background statistics

In [None]:
print(np.mean(nuser_tag, axis=0))
print(np.mean(nuser_tag**2, axis=0))
print(np.mean(nuser_tag**3, axis=0))

In [None]:
projectorser_tag = compute_projector_series(mser_tag, lser_tag)
projectorser_tag = np.einsum("...ij,...jk", wser_tag, projectorser_tag)
projectionser_tag = np.einsum("...ij,...j", projectorser_tag, bkvecser_tag)
subtracted_series = bkvecser_tag - projectionser_tag

n_row = 5
n_col = n_dimensions_tag // n_row + min(1, n_dimensions_tag % n_row)
fig, axes = plt.subplots(n_row, n_col, sharex=True, sharey=True)
fig.set_size_inches(n_col*2.0, n_row*2.0)  # (width, height)
axes = axes.flatten()
for i in range(n_dimensions_tag):
    axes[i].plot(tser_tag/1000, subtracted_series[:, i], color="k")
    if i // n_col == (n_dimensions_tag - 1) // n_col:  # last row of panels
        axes[i].set_xlabel("Time (x1000 steps)")
    if i % n_col == 0:
        axes[i].set_ylabel(r"$\vec{x} - LM\vec{x}$")
    axes[i].set_title("ORN {:d}".format(i), y=0.85)
fig.tight_layout()
plt.show()
plt.close()

### Compute and compare projection tags after inhibition

In [None]:
### New odor, mix, and inhibit
### Repeat for many new odors (and ideally, should repeat for many backgrounds)
### But for now, assume all simulations would give similarly good inhibition. 
n_test_new_odors = 100
mix_frac = 0.2

# Average m and w with which we will inhibit
transient_tag = 100000 // skp_tag
ltag_mean = np.mean(lser_tag[transient_tag:], axis=0)
mtag_mean = np.mean(mser_tag[transient_tag:], axis=0)
wtag_mean = np.mean(wser_tag[transient_tag:], axis=0)

# Background samples, valid for all new test odors
back_samples_tag = sample_background_thirdmoment(averages_nu, steady_covmat, epsilon_nu, back_components_tag, 
                                                  size=100, rgen=rgen_meta_tag)
inhib_biopca_samples_tag = []
inhib_avg_samples_tag = []
mix_samples_tag = []
new_odor_targets = []
# Average background
avg_back_tag = averages_nu.dot(back_components_tag)
a_over_ab = inhib_rates[0] / sum(inhib_rates)
for i in range(n_test_new_odors):
    # New odor
    #new_odor_tag = np.roll(back_components_tag[0], shift=-1)  # Should be a new vector
    new_odor_tag = generate_odorant(n_dimensions_tag, rgen_meta_tag)
    new_odor_tag = new_odor_tag / l2_norm(new_odor_tag)
    new_odor_targets.append(new_odor_tag)

    mix_samples = back_samples_tag + mix_frac * new_odor_tag.reshape(1, -1)
    mix_samples_tag.append(mix_samples)
    
    # Compare to inhibition of the average background
    inhib_avg_samples = mix_samples - a_over_ab * avg_back_tag.reshape(1, -1)
    inhib_avg_samples_tag.append(inhib_avg_samples)

    # Inhibition of each generated sample and statistics on performance
    inhib_biopca_samples = respond_new_odors(mix_samples, ltag_mean, mtag_mean, wtag_mean)
    inhib_biopca_samples_tag.append(inhib_biopca_samples)

mix_samples_tag = np.asarray(mix_samples_tag)
inhib_avg_samples_tag = np.asarray(inhib_avg_samples_tag)
inhib_biopca_samples_tag = np.asarray(inhib_biopca_samples_tag)
new_odor_targets = np.asarray(new_odor_targets)

In [None]:
# Compute tags. This won't be great because too few dimensions to begin with, but try anyways. 
projtag_kwargs = dict(kc_sparsity=0.05, adapt_kc=True, n_pn_per_kc=6, fix_thresh=None)
proj_mat = create_sparse_proj_mat(n_kc=int(2000/50*n_dimensions_tag), n_rec=n_dimensions_tag, 
                        rgen=rgen_meta, fraction_filled=projtag_kwargs["n_pn_per_kc"]/n_dimensions_tag)
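
The internals of `create_sparse_proj_mat` and `project_neural_tag` are not shown in this notebook. As a rough mental model only, here is a stand-alone sketch of a sparse random projection followed by a winner-take-all threshold. Everything in it is an assumption: binary weights, exactly `n_pn_per_kc` inputs per Kenyon cell, a fixed fraction of active cells, and no equivalent of the `adapt_kc` option. The helper `sketch_sparse_tag` is hypothetical.

```python
import numpy as np


def sketch_sparse_tag(x, proj, kc_sparsity=0.05):
    """Indices of the most active Kenyon cells for input x (hypothetical helper)."""
    kc_activity = proj @ x
    n_active = max(1, int(kc_sparsity * proj.shape[0]))
    # Winner-take-all: keep the n_active largest activities
    top = np.argpartition(kc_activity, -n_active)[-n_active:]
    return set(top.tolist())


rng = np.random.default_rng(1234)
n_rec, n_kc, n_pn_per_kc = 25, 1000, 6

# Each Kenyon cell samples n_pn_per_kc receptor channels with unit weights
proj = np.zeros([n_kc, n_rec])
for k in range(n_kc):
    proj[k, rng.choice(n_rec, size=n_pn_per_kc, replace=False)] = 1.0

odor = rng.exponential(size=n_rec)
tag = sketch_sparse_tag(odor / np.linalg.norm(odor), proj)
print(len(tag), "active Kenyon cells out of", n_kc)
```

With 5 % sparsity and 1000 Kenyon cells, each tag contains 50 indices, and two tags can then be compared with a set-based similarity such as the Jaccard index.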

In [None]:
# Compute tags and Jaccard similarities between target odor and mixture without or with inhibition
jaccards_inhib_none = []
jaccards_inhib_avg = []
jaccards_inhib_biopca = []
for i in range(mix_samples_tag.shape[0]):
    target_tag = project_neural_tag(new_odor_targets[i], new_odor_targets[i], proj_mat, **projtag_kwargs)
    for j in range(mix_samples_tag.shape[1]):
        tag_none = project_neural_tag(mix_samples_tag[i, j], mix_samples_tag[i, j], proj_mat, **projtag_kwargs)
        tag_avg = project_neural_tag(inhib_avg_samples_tag[i, j], mix_samples_tag[i, j], proj_mat, **projtag_kwargs)
        tag_biopca = project_neural_tag(inhib_biopca_samples_tag[i, j], mix_samples_tag[i, j], proj_mat, **projtag_kwargs)
        jaccards_inhib_none.append(jaccard(target_tag, tag_none))
        jaccards_inhib_avg.append(jaccard(target_tag, tag_avg))
        jaccards_inhib_biopca.append(jaccard(target_tag, tag_biopca))


In [None]:
# Histograms of Jaccard similarities: larger similarity is better
fig, ax = plt.subplots()
clr_none = "xkcd:navy blue"
clr_biopca = "xkcd:turquoise"
clr_avg = "xkcd:orangey brown"

ax.hist(jaccards_inhib_none, label="No inhibition", facecolor=clr_none, alpha=0.6, 
        edgecolor=clr_none, density=True)
ax.axvline(np.median(jaccards_inhib_none), color=clr_none, ls="--", lw=1.0)
ax.hist(jaccards_inhib_avg, label="Average inhibition", facecolor=clr_avg, alpha=0.6, 
        edgecolor=clr_avg, density=True)
ax.axvline(np.median(jaccards_inhib_avg), color=clr_avg, ls="--", lw=1.0)
ax.hist(jaccards_inhib_biopca, label="biopca inhibition", facecolor=clr_biopca, alpha=0.6, 
        edgecolor=clr_biopca, density=True)
ax.axvline(np.median(jaccards_inhib_biopca), color=clr_biopca, ls="--", lw=1.0)

ax.set(xlabel="Jaccard similarity", ylabel="Probability density", title="Jaccard similarity (higher is better)")
ax.legend()
fig.tight_layout()
do_save = True
if mix_frac == 0.2 and do_save:
    fig.savefig("figures/detection/jaccard_similarity_biopca_average_none_f20percent.pdf", transparent=True)
elif mix_frac == 0.5 and do_save:
    fig.savefig("figures/detection/jaccard_similarity_biopca_average_none_f50percent.pdf", transparent=True)
plt.show()
plt.close()

# Comparison to ideal inhibitory network
From a linear algebra perspective, the best inhibition that could possibly be achieved for a mixture of a new odor and the background is to suppress the whole component of the new odor parallel to the vector subspace spanned by the background odors, while keeping the component perpendicular to it. Indeed, the appearance of the new odor's component in the background space cannot be distinguished from a fluctuation of the background (unless we had neurons tracking statistics of typical activations in that space, but it is not obvious how to get that). At any rate, this is the best we can hope our BioPCA inhibition network will achieve. 
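
The argument above can be written compactly. The symbols below are introduced for illustration and are not used elsewhere in this notebook: let $A$ be the matrix whose columns are the background odor vectors, $A^{+}$ its Moore-Penrose pseudo-inverse, $P = A A^{+}$ the projector on the background subspace, $\vec{x}_b$ a background sample, $\vec{x}_n$ the new odor, and $f$ its mixture fraction. Then

$$ \vec{x}_n^{\parallel} = P \vec{x}_n \, , \qquad \vec{x}_n^{\perp} = (I - P) \vec{x}_n $$

$$ \vec{s}_{\mathrm{ideal}} = \mathrm{ReLU}\left[ \gamma \left( \vec{x}_b + f \vec{x}_n^{\parallel} \right) + f \vec{x}_n^{\perp} \right] $$

where $\gamma < 1$ is the residual fraction of activity left in the background subspace after inhibition. The code in this notebook takes $\gamma = \beta / (2\alpha + \beta)$, with $\alpha$ and $\beta$ the learning and decay rates of the inhibitory weights.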

In [None]:
def find_projector(a):
    """ Calculate projector a a^+, which projects
    a column vector on the vector space spanned by columns of a. 
    """
    a_inv = np.linalg.pinv(a)
    return a.dot(a_inv)
    
def find_parallel_component(x, basis, projector=None):
    """
    Args:
        x (np.ndarray): 1d array of length D containing the vector to decompose. 
        basis (np.ndarray): 2d matrix of size DxK where each column is one
            of the linearly independent background vectors. 
        projector (np.ndarray): 2d matrix A A^+, the projector on the vector
            space spanned by columns of basis. 
    Return:
        x_par (np.ndarray): component of x found in the vector space of basis
            The perpendicular component can be obtained as x - x_par. 
    """
    # If the projector is not provided yet
    if projector is None:
        # Compute Moore-Penrose pseudo-inverse and AA^+ projector
        projector = find_projector(basis)
    x_par = projector.dot(x)
    return x_par

def ideal_linear_inhibitor(x_n_par, x_n_ort, x_back, f, alpha, beta):
    """ Calculate the ideal projection neuron layer, which assumes
    perfect inhibition (down to beta/(2*alpha+beta)) of the component of the mixture
    parallel to the background odors' vector space, while leaving the orthogonal
    component of the new odor untouched. 
    
    Args:
        x_n_par (np.1darray): new odor, component parallel to background vector space
        x_n_ort (np.1darray): new odor, component orthogonal to background vector space 
        x_back (np.2darray): background samples, one per row
        f (float): mixture fraction (hard case is f=0.2)
        alpha (float): inhibitory weights learning rate alpha
        beta (float): inhibitory weights decaying rate beta
    
    Returns:
        s (np.2darray): projection neurons after perfect linear inhibition, 
            one row per background sample
    """
    # Allow broadcasting for multiple x_back vectors
    factor = beta / (2*alpha + beta)
    s = factor * f*x_n_par + f*x_n_ort
    # I thought the following would have been even better, but turns out it is worse for small f
    #s = f*x_n_par + f*x_n_ort
    s = relu_inplace(s.reshape(1, -1) + factor * x_back)
    return s

In [None]:
# Reuse each new odor in new_odor_targets and each background in back_samples_tag
# Compute the projector on the background odor components only once
# Compute parallel component of each new odor
# Mix it with all background samples at once using broadcasting capability of ideal_linear_inhibitor function
background_projector = find_projector(back_components_tag.T)
inhib_ideal_samples_tag = []
for od in new_odor_targets:
    # Decompose
    od_par = find_parallel_component(od, basis=back_components_tag.T, projector=background_projector)
    od_ort = od - od_par
    # Compute the perfectly inhibited mixture with each background sample
    inhib_ideal = ideal_linear_inhibitor(od_par, od_ort, back_samples_tag, mix_frac, *inhib_rates)
    # Background reduced to b/(a+b), new odor intact? Perfect inhibition
    #inhib_ideal = inhib_rates[1] / sum(inhib_rates) * (1.0 - mix_frac) * back_samples_tag + mix_frac * od.reshape(1, -1)
    inhib_ideal_samples_tag.append(inhib_ideal)
inhib_ideal_samples_tag = np.asarray(inhib_ideal_samples_tag)

# Compute neural tags of the ideally inhibited mixtures, then the Jaccard similarity
# between each mixture tag and the tag of its target odor
jaccards_inhib_ideal = []
for i in range(new_odor_targets.shape[0]):
    target_tag = project_neural_tag(new_odor_targets[i], new_odor_targets[i], proj_mat, **projtag_kwargs)
    for j in range(back_samples_tag.shape[0]):
        # project_neural_tag also needs the uninhibited input mixture vector
        mixture_ideal_samples = (1.0-mix_frac)*back_samples_tag[j] + mix_frac*new_odor_targets[i]
        tag_ideal = project_neural_tag(inhib_ideal_samples_tag[i, j], mixture_ideal_samples, proj_mat, **projtag_kwargs)
        jaccards_inhib_ideal.append(jaccard(target_tag, tag_ideal))


In [None]:
# Histograms of Jaccard similarities: larger similarity is better
fig, ax = plt.subplots()
clr_map = {"none": "xkcd:navy blue", "average": "xkcd:orangey brown", 
           "ibcm":"xkcd:turquoise", "ideal": "xkcd:powder blue", "ideal2":"xkcd:pale rose"}

ax.hist(jaccards_inhib_ideal, label="Ideal inhibition", facecolor=clr_map["ideal"], alpha=0.6, 
        edgecolor=clr_map["ideal"], density=True)
ax.axvline(np.median(jaccards_inhib_ideal), color=clr_map["ideal"], ls="--", lw=1.0)
ax.hist(jaccards_inhib_avg, label="Average inhibition", facecolor=clr_map["average"], alpha=0.6, 
        edgecolor=clr_map["average"], density=True)
ax.axvline(np.median(jaccards_inhib_avg), color=clr_map["average"], ls="--", lw=1.0)
ax.hist(jaccards_inhib_ibcm, label="IBCM inhibition", facecolor=clr_map["ibcm"], alpha=0.6, 
        edgecolor=clr_map["ibcm"], density=True)
ax.axvline(np.median(jaccards_inhib_ibcm), color=clr_map["ibcm"], ls="--", lw=1.0)

ax.set(xlabel="Jaccard similarity", ylabel="Probability density", title="Jaccard similarity (higher is better)")
ax.legend()
fig.tight_layout()
plt.show()
plt.close()

## Performance as a function of f
I expect to see a relatively sharp drop of median performance for IBCM a bit above $f=\beta/(2\alpha + \beta)$, and at this value for the ideal inhibition. 
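
To locate that expected drop on the $f$ axis, the threshold can be evaluated directly. The sketch below uses hypothetical values of $\alpha$ and $\beta$ (not the ones stored in `inhib_rates`), and `threshold_fraction` is an illustrative helper, not a function from `modelfcts`.

```python
import numpy as np

def threshold_fraction(alpha, beta):
    """Residual factor beta / (2*alpha + beta), the value of f quoted above."""
    return beta / (2.0 * alpha + beta)

# Hypothetical rates, chosen only for illustration
alpha_demo, beta_demo = 1.0, 1.0
f_star = threshold_fraction(alpha_demo, beta_demo)

# Same kind of grid of mixture fractions as scanned in this notebook
f_grid = np.arange(0.1, 0.8, 0.1)
# Grid values at or above the threshold, where performance is expected to recover
f_above = f_grid[f_grid >= f_star]
print("Threshold f* = {:.3f}; first grid point above it: {:.1f}".format(f_star, f_above[0]))
```

With the actual rates, the same call would be `threshold_fraction(*inhib_rates)`, assuming `inhib_rates` is ordered as `(alpha, beta)` like in the calls to `ideal_linear_inhibitor`.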

In [None]:
def compute_median_performances(back_samples, new_odors, f, projmat, proj_kwargs, 
                                m_mean, w_mean, eta, inhib_ab, back_components):
    """ Compute median Jaccard similarity for the different inhibition methods we have, 
    for a given value of mixture parameter f. """
    all_jaccard_pairs_dict = {"none":[], "average":[], "ibcm":[], "ideal":[], "ideal2":[]}  # list of lists, one per method
    back_proj = find_projector(back_components.T)
    for i in range(new_odors.shape[0]):
        # Compute target tag
        target_tag = project_neural_tag(new_odors[i], new_odors[i], projmat, **proj_kwargs)
        # Prepare mixtures
        mix_samples = back_samples + new_odors[i:i+1]*f
        
        # Compute inhibited mixtures with the different methods
        # No inhibition: just use mix_samples
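        # Average inhibition
        # NOTE: averages_nu and a_over_ab are not arguments of this function;
        # they are read from the notebook's global scope
        avg_back_tag = averages_nu.dot(back_components)
        b_over_2ab = inhib_ab[1] / (sum(inhib_ab) + inhib_ab[0])
        inhib_avg_samples = mix_samples - a_over_ab * avg_back_tag.reshape(1, -1)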

        # Inhibition of each generated sample and statistics on performance
        inhib_ibcm_samples = respond_new_odors(mix_samples, m_mean, w_mean, eta)
        
        # Ideal inhibition
        od_par = find_parallel_component(new_odors[i], basis=back_components.T, projector=back_proj)
        od_ort = new_odors[i] - od_par
        # Compute the perfectly inhibited mixture with each background sample
        inhib_ideal_samples = ideal_linear_inhibitor(od_par, od_ort, back_samples, f, *inhib_ab)
        # Background reduced to b/(2a+b), new odor intact?
        inhib_ideal2_samples = relu_inplace(b_over_2ab * back_samples + f * new_odors[i:i+1])
    
        # For each inhibited mixture, compute jaccard similarity
        current_jaccard_dict = {"none":[], "average":[], "ibcm":[], "ideal":[], "ideal2":[]}
        for j in range(back_samples.shape[0]):
            mix_tag = project_neural_tag(mix_samples[j], mix_samples[j], projmat, **proj_kwargs)
            current_jaccard_dict["none"].append(jaccard(target_tag, mix_tag))
            
            # Average
            mix_tag = project_neural_tag(inhib_avg_samples[j], mix_samples[j], projmat, **proj_kwargs)
            current_jaccard_dict["average"].append(jaccard(target_tag, mix_tag))
        
            # IBCM
            mix_tag = project_neural_tag(inhib_ibcm_samples[j], mix_samples[j], projmat, **proj_kwargs)
            current_jaccard_dict["ibcm"].append(jaccard(target_tag, mix_tag))
            
            # Ideal
            mix_tag = project_neural_tag(inhib_ideal_samples[j], mix_samples[j], projmat, **proj_kwargs)
            current_jaccard_dict["ideal"].append(jaccard(target_tag, mix_tag))
            
            # Ideal 2
            mix_tag = project_neural_tag(inhib_ideal2_samples[j], mix_samples[j], projmat, **proj_kwargs)
            current_jaccard_dict["ideal2"].append(jaccard(target_tag, mix_tag))
            
        # Add those values to the total list
        for method in all_jaccard_pairs_dict.keys():
            all_jaccard_pairs_dict[method].append(current_jaccard_dict[method])
    
    # Convert to 2d array and compute median
    # Could choose to have one median per odor or per background sample
    all_jaccard_pairs_dict = {k:np.asarray(a) for k, a in all_jaccard_pairs_dict.items()}
    median_jaccard_pairs_dict = {k:np.median(a) for k, a in all_jaccard_pairs_dict.items()}
    return median_jaccard_pairs_dict
    

In [None]:
# Use previous functions for various f values
median_jaccards = {"none":[], "average":[], "ibcm":[], "ideal":[], "ideal2":[]}
f_range = np.arange(0.1, 0.8, 0.1)
for f in f_range:
    meds = compute_median_performances(back_samples_tag, new_odor_targets, f, proj_mat, projtag_kwargs, 
                                mtag_mean, wtag_mean, coupling_eta, inhib_rates, back_components_tag)
    for k in meds:
        median_jaccards[k].append(meds[k])
    print("Done f = {:.2f}".format(f))

In [None]:
fig, ax = plt.subplots()
labelmap = {"none":"None", "average":"Average", "ibcm":"IBCM", "ideal":r"Ideal $\perp$", "ideal2":"Ideal all"}
for k in median_jaccards:
    ax.plot(f_range, median_jaccards[k], color=clr_map[k], label=labelmap[k], lw=3)
ax.set(xlabel="Fraction $f$ of new odor", ylabel="Median Jaccard similarity")
ax.legend(title="Inhibition method")
fig.set_size_inches(4, 3)
fig.tight_layout()
# fig.savefig("figures/detection/inhibition_jaccard_comparison_methods.pdf", transparent=True)
plt.show()
plt.close()

## Analytical prediction for the "perfect" inhibition model

**WARNING: this never worked. Code kept for reference.**

Perfect inhibition: only the background is reduced to $\frac{\beta}{\alpha + \beta}$ of its original amplitude, while the new odor is untouched:

$$ \vec{s} = \frac{\beta}{\alpha + \beta} (1-f) \vec{x}^b(t) + f \vec{x}^n $$

One would think this is the best possible inhibition, but for low $f$ it fares worse than also deleting the parallel component of $\vec{x}^n$. Inhibiting the parallel component lowers even further the activity of Kenyon cells specific to the background: the new odor does not reinforce them, so only KCs specific to the new odor can cross the threshold. 
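
The difference between the two variants can be made concrete with a minimal NumPy sketch. It follows the equation above (factor $\frac{\beta}{\alpha + \beta}$ and weight $1-f$ on the background), uses random, hypothetical odor vectors and rates, and omits the ReLU and the Kenyon cell projection, so it only illustrates the linear algebra, not the tagging performance.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dim, n_back = 6, 2

# Hypothetical background components (one per row) and new odor
back_components = rng.exponential(size=(n_back, n_dim))
new_odor = rng.exponential(size=n_dim)
# One background sample: random positive combination of the components
x_back = rng.uniform(size=n_back) @ back_components

# Projector A A^+ on the background vector space (A has components as columns)
A = back_components.T
projector = A @ np.linalg.pinv(A)
x_par = projector @ new_odor   # component of the new odor inside the background space
x_ort = new_odor - x_par       # component orthogonal to the background space

# Hypothetical rates and mixture fraction
alpha, beta, f = 1.0, 0.25, 0.2
factor = beta / (alpha + beta)

# "Perfect" inhibition: background reduced, new odor untouched
s_perfect = factor * (1 - f) * x_back + f * new_odor
# Variant: the parallel component of the new odor is reduced too
s_parallel = factor * ((1 - f) * x_back + f * x_par) + f * x_ort

# The two responses differ only inside the background space
difference = s_perfect - s_parallel
print(np.allclose(difference, (1 - factor) * f * x_par))
```

The two responses differ by $(1 - \frac{\beta}{\alpha+\beta}) f \vec{x}^n_{\parallel}$, which lies entirely in the background subspace; this is the extra activity that can push background-specific KCs above threshold in the "perfect" case.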

My analytical calculation is for this "perfect" inhibition, because it is easier to treat analytically than the variant that also inhibits the parallel component of $\vec{x}^n$. Also, I computed the mean rather than the median Jaccard similarity. Anyway, let's see what it looks like and whether it makes sense. 

UPDATE: It just does not work. 