# Correlating turbulent background processes

We first simulate independent concentration variables $\nu(t)$ and consider their zero-mean version, $\tilde{\nu}(t) = \nu(t) - \langle \nu \rangle$. These variables have mean zero, a diagonal covariance matrix $\sigma^2 \mathbb{1}$, and third moment $m_3$ (while mixed third-order moments of different $\tilde{\nu}_\mu$ are zero because the variables are independent and have zero average). 

Then, we can transform these variables to have some desired covariance matrix $\Sigma$, written in terms of a Cholesky decomposition as $\Sigma = \sigma^2 RR^T$ (i.e. $R$ is the Cholesky factor of $\Sigma / \sigma^2$, which is the Cholesky factor of $\Sigma$ rescaled by $1/\sigma$). To do so, we take concentrations $c = R \tilde{\nu} + \langle \nu \rangle$. These have the same mean as the original variables, $\langle c \rangle = \langle \nu \rangle$, and the desired covariance matrix, 

$$ C = \langle (c - \langle c \rangle) (c - \langle c \rangle)^T \rangle = \langle R \tilde{\nu} \tilde{\nu}^T R^T \rangle = \sigma^2 R R^T = \Sigma $$

The third moments are altered by this transformation, but they are not generally zero, so we can still numerically get the IBCM model to converge. 
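
The transformation above can be checked numerically on a toy example. The sketch below is only an illustration: it uses i.i.d. exponential variables as a stand-in for the turbulent concentrations $\nu$ (an assumption, not the background process used in this notebook), mixes them with a Cholesky factor, and compares the sample mean, covariance and third moments to their expected values. The variable names (`chol_r`, `target_scaled`, ...) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_comp, n_samples = 3, 200_000

# Stand-in for the independent concentrations nu (assumption: Exp(1) variables,
# which have mean 1, variance 1 and third central moment 2)
nu = rng.exponential(scale=1.0, size=(n_samples, n_comp))
mean_nu, sigma2, m3 = 1.0, 1.0, 2.0

# Scaled target covariance Sigma / sigma^2: correlation 0.5 between the last two
target_scaled = np.eye(n_comp)
target_scaled[-1, -2] = target_scaled[-2, -1] = 0.5
chol_r = np.linalg.cholesky(target_scaled)  # lower-triangular R, R R^T = Sigma / sigma^2

# c = R (nu - <nu>) + <nu>, applied to each sample (rows are samples)
c = (nu - mean_nu) @ chol_r.T + mean_nu

cov_c = np.cov(c, rowvar=False)
third_c = np.mean((c - c.mean(axis=0)) ** 3, axis=0)
# For independent zero-mean inputs, <c_i^3> = m3 * sum_mu R_{i mu}^3
third_expected = m3 * np.sum(chol_r ** 3, axis=1)

print("Sample covariance:\n", np.round(cov_c, 3))
print("Sample third moments:", np.round(third_c, 3))
print("Expected third moments:", np.round(third_expected, 3))
```

In this toy case the third moment of the last mixed variable is reduced relative to $m_3$ but remains nonzero, in line with the statement above.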

We consider what happens as a pair of odors progressively gets more correlated. We take the last two odors for this. This corresponds to a covariance matrix

$$ \Sigma = \sigma^2 \begin{pmatrix}
    1 & 0 & \ldots & 0 & 0 \\
    0 & 1 & \ldots & 0 & 0 \\
    \vdots & \vdots & \ddots & \vdots & \vdots \\
    0 & 0 & \ldots & 1 & \rho  \\
    0 & 0 & \ldots & \rho & 1
\end{pmatrix} $$

where $-1 < \rho < 1$ is the Pearson correlation coefficient. Then $R$ is the Cholesky factor of the matrix multiplying $\sigma^2$ above, which has a simple expression because only a $2 \times 2$ block is not diagonal:

$$ R = \begin{pmatrix}
    1 & 0 & \ldots & 0 & 0 \\
    0 & 1 & \ldots & 0 & 0 \\
    \vdots & \vdots & \ddots & \vdots & \vdots \\
    0 & 0 & \ldots & 1 & 0 \\
    0 & 0 & \ldots & \rho & \sqrt{1 - \rho^2}
\end{pmatrix} $$
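
As a quick sanity check, the closed-form factor can be compared with `np.linalg.cholesky`, which returns the lower-triangular factor. This is a minimal sketch; the helper name `two_by_two_block_cholesky` is hypothetical and not part of the notebook's modules.

```python
import numpy as np

def two_by_two_block_cholesky(rho, n_comp):
    """Closed-form lower-triangular factor of a unit-variance covariance
    matrix with correlation rho between the last two components."""
    r = np.eye(n_comp)
    r[-1, -2] = rho
    r[-1, -1] = np.sqrt(1.0 - rho**2)
    return r

rho, n_comp = 0.7, 4
cov_scaled = np.eye(n_comp)
cov_scaled[-1, -2] = cov_scaled[-2, -1] = rho

r_closed = two_by_two_block_cholesky(rho, n_comp)
r_numpy = np.linalg.cholesky(cov_scaled)

print(np.round(r_closed, 4))
print("Matches np.linalg.cholesky:", np.allclose(r_closed, r_numpy))
```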

We also test the limiting case $\rho = 1$, in which case the Cholesky decomposition does not formally exist (the covariance matrix is singular), but we can still mix odors with a matrix

$$ R = \begin{pmatrix}
    1 &  \ldots & 0 & 0 \\
    \vdots  & \ddots & \vdots & \vdots \\
    0 & \ldots & 1 & 0  \\
    0 & \ldots & 1 & 0
\end{pmatrix} $$
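
For the perfectly correlated limit ($\rho = 1$), a mixing matrix whose last row copies the second-to-last variable can be checked directly. The sketch below (with hypothetical names) builds such a matrix, confirms that it produces a singular covariance with a block of ones, and tries the standard Cholesky routine, which requires positive definiteness and is therefore expected to reject this matrix.

```python
import numpy as np

n_comp = 4
# Mixing matrix for rho = 1: identity, except the last row copies
# the second-to-last variable
mix_r = np.eye(n_comp)
mix_r[-1, -1] = 0.0
mix_r[-1, -2] = 1.0

cov_scaled = mix_r @ mix_r.T
rank = np.linalg.matrix_rank(cov_scaled)
print(cov_scaled)
print("Rank:", rank)

# The standard Cholesky routine requires a positive-definite matrix, so it is
# expected to reject this singular covariance
try:
    np.linalg.cholesky(cov_scaled)
    cholesky_ok = True
except np.linalg.LinAlgError:
    cholesky_ok = False
print("np.linalg.cholesky succeeded:", cholesky_ok)
```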

## In this notebook
We run simulations for a few different values of $\rho$ with a given background seed, to illustrate how each model (IBCM, BioPCA) behaves as $\rho$ increases. 
A systematic check of how background correlations affect new odor recognition is done in ``secondary_scripts/run_performance_correlation.py`` and ``secondary_scripts/analyze_correlation_results.py``. The present notebook is a companion to these. 

### Remark

If the background has very strong correlation between odors, it may not be effectively $N_\mathrm{B}$-dimensional anymore, so the representation in terms of the dot products with original odors, $h_{\gamma}$, may not be appropriate. There could be a new basis picked up by neurons; some decomposition of the mixture $\sum_\gamma \xi_\gamma \mathbf{y}_\gamma$ where the $\xi_{\gamma}$ are not odor concentrations but other independent components. 
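
One way to see this loss of dimensionality is through the eigenvalues of the scaled concentration covariance: the correlated $2 \times 2$ block has eigenvalues $1 + \rho$ and $1 - \rho$, so one direction carries vanishing variance as $\rho \to 1$. A minimal sketch, with a hypothetical helper `scaled_cov`:

```python
import numpy as np

def scaled_cov(rho, n_comp=4):
    """Unit-variance covariance with correlation rho between the last two odors."""
    cov = np.eye(n_comp)
    cov[-1, -2] = cov[-2, -1] = rho
    return cov

for rho in [0.0, 0.4, 0.7, 0.99]:
    eigvals = np.linalg.eigvalsh(scaled_cov(rho))  # returned in ascending order
    print(f"rho = {rho:4.2f}: eigenvalues =", np.round(eigvals, 3))
```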

## Imports

In [None]:
import numpy as np
from scipy import sparse
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
from time import perf_counter
import os, json, sys
if ".." not in sys.path:
    sys.path.insert(1, "..")
from os.path import join as pj
    
from sklearn.decomposition import FastICA

from modelfcts.ibcm import (
    integrate_inhib_ibcm_network_options,
    ibcm_respond_new_odors,
    compute_mbars_hgammas_hbargammas
)
from modelfcts.ibcm_analytics import (
    fixedpoint_thirdmoment_exact, 
    ibcm_fixedpoint_w_thirdmoment, 
    ibcm_all_largest_eigenvalues
)
from modelfcts.biopca import (
    integrate_inhib_biopca_network_skip,
    build_lambda_matrix,
    biopca_respond_new_odors
)
from modelfcts.average_sub import (
    integrate_inhib_average_sub_skip, 
    average_sub_respond_new_odors
)
from modelfcts.ideal import (
    find_projector, 
    find_parallel_component, 
    ideal_linear_inhibitor, 
    compute_ideal_factor
)
from modelfcts.checktools import (
    analyze_pca_learning, 
    check_conc_samples_powerlaw_exp1
)
from modelfcts.backgrounds import (
    update_powerlaw_mixed_concs,
    logof10, 
    sample_ss_conc_powerlaw,
    sample_ss_mixed_concs_powerlaw,
    generate_odorant
)
from modelfcts.tagging import (
    project_neural_tag, 
    create_sparse_proj_mat, 
    SparseNDArray, 
)
from utils.statistics import seed_from_gen
from modelfcts.distribs import (
    truncexp1_average,
    powerlaw_cutoff_inverse_transform
)
from utils.smoothing_function import (
    moving_average, 
    moving_var
)
from simulfcts.plotting import (
    plot_hbars_gamma_series, 
    plot_w_matrix, 
    plot_background_norm_inhibition, 
    plot_background_neurons_inhibition, 
    plot_pca_results, 
    hist_outline
)
from simulfcts.analysis import compute_back_reduction_stats
from utils.metrics import jaccard, l2_norm

In [None]:
def get_target_cholesky(correl, n_comp):
    target_covmat_scaled = np.zeros((n_comp, n_comp))
    target_covmat_scaled[[-1, -2], [-2, -1]] = correl
    target_covmat_scaled[np.diag_indices(n_comp)] = 1.0
    if abs(correl) < 1.0:
        target_cholesky = np.linalg.cholesky(target_covmat_scaled)
    else:
        target_cholesky = np.zeros((n_comp, n_comp))
        target_cholesky[np.diag_indices(n_comp)] = 1.0
        target_cholesky[-1, -1] = 0.0
        target_cholesky[-1, -2] = correl  # Last odor becomes +-(second-to-last odor) for rho = +-1
    return target_cholesky

def mix_concs(nuser, means, chol):
    mixed_concs_ser = np.einsum("ij,kj->ki", chol, nuser - means) + means
    return mixed_concs_ser

### Aesthetic parameters

In [None]:
#plt.style.use(['dark_background'])
plt.rcParams["figure.figsize"] = (4.5, 3.0)

In [None]:
do_save_outputs = False
do_save_plots = False

models = ["ibcm", "biopca", "avgsub", "ideal", "orthogonal", "none"]
model_nice_names = {
    "ibcm": "IBCM",
    "biopca": "BioPCA",
    "avgsub": "Average",
    "ideal": "Ideal",
    "optimal": "Optimal",
    "orthogonal": "Orthogonal",
    "none": "None"
}
model_colors = {
    "ibcm": "xkcd:turquoise",
    "biopca": "xkcd:orangey brown",
    "avgsub": "xkcd:navy blue",
    "optimal": "xkcd:powder blue",
    "ideal": "xkcd:light green",
    "orthogonal": "xkcd:pale rose",
    "none": "grey"
}

### Initialization

In [None]:
# Initialize common simulation parameters
n_dimensions = 50  # Half the real number for faster simulations
n_components = 4  # Number of background odors

inhib_rates = [5e-5, 1e-5]  # alpha, beta  [0.00025, 0.00005]

# Simulation duration
duration = 360000.0
deltat = 1.0
n_chunks = 1
skp = 50 * int(1.0 / deltat)

# Common model options
activ_function = "identity"  # "ReLU"

# Background process
update_fct = update_powerlaw_mixed_concs

# Choose randomly generated background vectors
# This seed gave nicely spread out odors easier to learn 0xe329714605b83365e67b44ed7e001ec
# Another random seed: 0xb7bf767bbad297aeeee19d0ccdc3647e
rgen_meta = np.random.default_rng(seed=0xb7bf767bbad297aeeee19d0ccdc3647e)
back_components = np.zeros([n_components, n_dimensions])
for i in range(n_components):
    back_components[i] = generate_odorant(n_dimensions, rgen_meta, lambda_in=0.1)
back_components = back_components / l2_norm(back_components).reshape(-1, 1)

# Seed for background simulation, to make sure all models are the same
simul_seed = seed_from_gen(rgen_meta)

# Turbulent background parameters: same rates and constants for all odors
back_params = [
    np.asarray([1.0] * n_components),        # whiff_tmins
    np.asarray([500.] * n_components),       # whiff_tmaxs
    np.asarray([1.0] * n_components),        # blank_tmins
    np.asarray([800.0] * n_components),      # blank_tmaxs
    np.asarray([0.6] * n_components),        # c0s
    np.asarray([0.5] * n_components),        # alphas
]

# Compute mean of independent underlying variables, 
# to determine the mean and target covariance of mixed variables
tblo, tbhi, twlo, twhi = back_params[2], back_params[3], back_params[0], back_params[1]
whiffprob = np.mean(1.0 / (1.0 + np.sqrt(tblo*tbhi/twlo/twhi)))
avg_whiff_conc = np.mean(truncexp1_average(*back_params[4:6]))
mean_conc = whiffprob * avg_whiff_conc  # average time in whiffs vs blanks * average whiff conc
print("Analytical mean conc:", mean_conc)
#print("Numerical mean conc:", mean_conc_empirical)

# Choose desired Pearson correlation between odors
# We will vary this below
# Up to rho = 0.5 there is some convergence; beyond that, convergence issues appear. 
# Note that for rho = 1, there would effectively be only one odor. 
correl_rho = 0.7

# Target covariance matrix (scaled by variance of underlying independent variables)
target_cholesky = get_target_cholesky(correl_rho, n_components)
target_covmat_scaled = target_cholesky.dot(target_cholesky.T)
print(target_covmat_scaled)

# Add mean conc and Cholesky mixing matrix to parameters
back_params.append(mean_conc)
back_params.append(target_cholesky)
# Then add background odor vectors last to that list
back_params.append(back_components)

# Initial values of background process variables (underlying independent (t, c))
init_concs_ind = sample_ss_conc_powerlaw(*back_params[:-3], size=1, rgen=rgen_meta)
init_times = powerlaw_cutoff_inverse_transform(
                rgen_meta.random(size=n_components), *back_params[2:4])
tc_init = np.stack([init_times, init_concs_ind.squeeze()], axis=1)

# Initial background vector 
init_concs_mix = target_cholesky.dot(tc_init[:, 1] - mean_conc) + mean_conc
init_bkvec = init_concs_mix.dot(back_components)
# nus are first in the list of initial background params
init_back_list = [tc_init, init_bkvec]

## Background process example

In [None]:
# Run a dense simulation to extract mixed concentrations for
# global correl_rho chosen above (0.7)
# Dummy initialization
simul_seed2 = seed_from_gen(rgen_meta)
avg_options = {"activ_fct": activ_function}
init_synapses_avg = np.zeros([1, n_dimensions])

sim_avg_res = integrate_inhib_average_sub_skip(
                init_synapses_avg, update_fct, init_back_list, 
                [], inhib_rates, back_params, duration, deltat,
                seed=simul_seed2, noisetype="uniform", skp=1, **avg_options
)

_, bkser_avg, bkvecser_mixed, _, _ = sim_avg_res
mixed_concs_sample = mix_concs(bkser_avg[:, :, 1], mean_conc, target_cholesky)
del sim_avg_res, bkser_avg

In [None]:
# Mixed concentrations time series
fig, ax = plt.subplots()
tslice = slice(0, 1000)
tser_dense = np.arange(0.0, duration, deltat)
for i in range(n_components):
    ax.plot(tser_dense[tslice]/1000, mixed_concs_sample[tslice, i], lw=0.8, label="Odor {}".format(i))
ax.set(xlabel="Time (x1000 steps)", ylabel=r"Mixed odor concentrations")
ax.legend(loc="upper left", bbox_to_anchor=(1.0, 1.0))
plt.show()
plt.close()

In [None]:
# Background vectors time series with mixed concentrations
tslice = slice(0, 50000, 100)
n_cols = 6
n_plots = n_dimensions // 4  # Only show first 24 OSNs
n_rows = n_plots // n_cols + min(1, n_plots % n_cols)
fig, axes = plt.subplots(n_rows, n_cols, sharex=True, sharey=True)
fig.set_size_inches(n_cols*1.75, n_rows*1.75)
for i in range(n_plots):
    ax = axes.flat[i]
    ax.scatter(bkvecser_mixed[tslice, 2*i+1], bkvecser_mixed[tslice, 2*i], 
               s=9, alpha=0.5, color="k")
    for j in range(n_components):
        ax.plot(*zip([0.0, 0.0], 3.0*back_components[j, 2*i:2*i+2:][::-1]), lw=2.0)
    ax.set(xlabel="OSN {}".format(2*i+2), ylabel="OSN {}".format(2*i+1))
for i in range(n_plots, n_rows*n_cols):
    axes.flat[i].set_axis_off()
fig.tight_layout()
plt.show()
plt.close()

In [None]:
# Matrix of third moments, to see how asymmetric components become
# thus explaining the more difficult convergence to specificity
mean_concs_num = np.mean(mixed_concs_sample, axis=0)  # mean of mixed concentrations
conc_0mean = (mixed_concs_sample - mean_concs_num)
thirdmoments = np.mean(conc_0mean[:, :, None, None] 
                       * conc_0mean[:, None, :, None] 
                       * conc_0mean[:, None, None, :], axis=0)

# Check that the covmat is approx. what we wanted
covnum = np.mean(conc_0mean[:, :, None] 
                       * conc_0mean[:, None, :], axis=0)
fig, ax = plt.subplots()
absrange = np.abs(covnum).max()
linthresh = np.abs(covnum).min()
#cmap_norm = SymLogNorm(linthresh=linthresh, vmin=-absrange, vmax=absrange)
#cmap_norm = Normalize(vmin=-absrange, vmax=absrange)
cmap_choice = "viridis"
cmap_norm = mpl.colors.Normalize(vmin=min(0, np.amin(covnum)), vmax=np.amax(covnum))
im = ax.imshow(covnum, norm=cmap_norm, cmap=cmap_choice)
ax.set(xlabel="j", ylabel="i", title=r"$C_{ij}$")
ax.set(xticks=range(0, n_components), yticks=range(0, n_components))
cb = fig.colorbar(im, location="right", orientation="vertical", 
             ax=ax, label="Covariance matrix")
plt.show()
plt.close()

fig, axes = plt.subplots(1, n_components, constrained_layout=True)
absrange = np.abs(thirdmoments).max()
linthresh = np.abs(thirdmoments).min()
#cmap_norm = SymLogNorm(linthresh=linthresh, vmin=-absrange, vmax=absrange)
#cmap_norm = Normalize(vmin=-absrange, vmax=absrange)
#cmap_choice = "RdBu"
cmap_choice = "viridis"
cmap_norm = mpl.colors.Normalize(vmin=min(0, np.amin(thirdmoments)), vmax=np.amax(thirdmoments))
for i in range(n_components):
    ax = axes.flat[i]
    im = ax.imshow(thirdmoments[i], norm=cmap_norm, cmap=cmap_choice)
    ax.set_title(f"$C_{{{i}jk}}$")
    ax.set(xticks=range(0, n_components), yticks=range(0, n_components))
    ax.set(xlabel="j", ylabel="k")
cb = fig.colorbar(im, location="bottom", orientation="horizontal", 
             ax=axes.flatten(), label="Third-order correlation")
#cb.set_ticks([-1e-2, -1e-3, -1e-4, 1e-5, 1e-4, 1e-3, 1e-2])

plt.show()
plt.close()

# IBCM habituation
## IBCM parameters and initial values, same for all simulations

In [None]:
# IBCM model parameters
n_i_ibcm = 24  # Number of inhibitory neurons for IBCM case

# Model rates
learnrate_ibcm = 0.001 #5e-5
tau_avg_ibcm = 1200  # 2000
coupling_eta_ibcm = 0.6/n_i_ibcm
ssat_ibcm = 50.0
k_c2bar_avg = 0.1
decay_relative_ibcm = 0.005
lambd_ibcm = 1.0
ibcm_rates = [
    learnrate_ibcm, 
    tau_avg_ibcm, 
    coupling_eta_ibcm, 
    lambd_ibcm,
    ssat_ibcm, 
    k_c2bar_avg,
    decay_relative_ibcm 
]
ibcm_options = {
    "activ_fct": activ_function, 
    "saturation": "tanh", 
    "variant": "law", 
    "decay": True
}

# Initial synaptic weights: small Gaussian noise (zero mean)
init_synapses_ibcm = 0.5*rgen_meta.standard_normal(size=[n_i_ibcm, n_dimensions])*lambd_ibcm
#init_synapses_ibcm = (0.3 * back_components[rgen_meta.choice(n_components, size=n_i_ibcm), :]
#                      + 0.1*rgen_meta.standard_normal(size=[n_i_ibcm, n_dimensions]))* lambd_ibcm

## IBCM simulation functions
They rely on global parameters that won't change as we vary $\rho$. 

In [None]:
def run_ibcm_simulation_correl(rho, simseed, duration_local=duration, skp_local=skp):
    # Make a copy of global parameters, change correlation rho
    back_params_local = list(back_params)

    # Target covariance matrix (scaled by variance of underlying independent variables)
    target_cholesky = get_target_cholesky(rho, n_components)

    # Add current Cholesky matrix to list of background params
    back_params_local[-2] = target_cholesky
    
    # Run the IBCM simulation
    tstart = perf_counter()
    sim_results = integrate_inhib_ibcm_network_options(
                init_synapses_ibcm, update_fct, init_back_list, 
                ibcm_rates, inhib_rates, back_params_local, duration_local, 
                deltat, seed=simseed, noisetype="uniform",  
                skp=skp_local, **ibcm_options
    )
    tend = perf_counter()
    print("Finished simulation for rho =", rho, "in {:.2f} s".format(tend - tstart))
    
    # Mixed concentrations time series
    nuser_ibcm = sim_results[1][:, :, 1]
    mixed_concs_ser = mix_concs(nuser_ibcm, mean_conc, target_cholesky)

    return [*sim_results, mixed_concs_ser]

In [None]:
def analyze_clean_ibcm_simul(results_raw, correl_rho_loc):
    """
    Args:
        results_raw = (tser_ibcm, nuser_ibcm, bkvecser_ibcm, mser_ibcm, 
            cbarser_ibcm, thetaser_ibcm, wser_ibcm, yser_ibcm, mixed_concs_ser)
    Returns:
        mixed_concs_ser, cbars_gamma, wser_ibcm, bkvecser_ibcm, 
            yser_ibcm, moments_conc, cgammas_bar_counts, specif_gammas, correl_c_conc
    """
    (tser_ibcm, nuser_ibcm, bkvecser_ibcm, mser_ibcm, 
        cbarser_ibcm, thetaser_ibcm, wser_ibcm, yser_ibcm, mixed_concs_ser) = results_raw
    # Calculate cgammas_bar and mbars
    transient = int(5/6*duration / deltat) // skp
    basis = back_components
    if abs(correl_rho_loc - 1.0) < 1e-6:  # effectively N-1 vectors only
        print("Found rho = 1")
        # Combine the last two vectors for rho = 1
        basis = np.concatenate([back_components[:-2], 
            np.sum(back_components[-2:], axis=0, keepdims=True)], axis=0)
    # Dot products \bar{c}_{\gamma} = \bar{\vec{m}} \cdot \vec{x}_{\gamma}
    mbarser, c_gammas, cbars_gamma = compute_mbars_hgammas_hbargammas(
                                results_raw[3], coupling_eta_ibcm, basis)
    
    # Moments of concentrations
    conc_ser = mixed_concs_ser
    mean_conc = np.mean(conc_ser)
    sigma2_conc = np.var(conc_ser)
    thirdmom_conc = np.mean((conc_ser - mean_conc)**3)
    moments_conc = [float(mean_conc), float(sigma2_conc), float(thirdmom_conc)]

    # Count how many dot products are at each possible value. Use cbar = split_val as a split. 
    split_val = 2.0
    cbars_gamma_mean = np.mean(cbars_gamma[transient:], axis=0)
    cgammas_bar_counts = {"above": int(np.sum(cbars_gamma_mean.flatten() > split_val)), 
                          "below": int(np.sum(cbars_gamma_mean.flatten() <= split_val))}

    specif_gammas = np.argmax(np.mean(cbars_gamma[transient:], axis=0), axis=1)
    
    cbarser_norm_centered = cbarser_ibcm - np.mean(cbarser_ibcm[transient:], axis=0)
    conc_ser_centered = conc_ser - np.mean(conc_ser[transient:], axis=0)
    correl_c_conc = np.mean(cbarser_norm_centered[transient:, :, None] 
                      * conc_ser_centered[transient:, None, :], axis=0)
    
    results_clean = (conc_ser, cbars_gamma, wser_ibcm, bkvecser_ibcm, 
                     yser_ibcm, moments_conc, cgammas_bar_counts, specif_gammas, correl_c_conc)
    return results_clean


def save_ibcm_simuls_to_disk(fname, **all_results_clean):
    # Save cbar gamma series, that's all we really need for the figures
    # Will run a separate short, non-skipped simulation to plot mixed concentrations
    cbars_gamma_series = {}
    for simname in all_results_clean.keys():
        try: 
            float(simname)
        except ValueError:
            fullname = simname  # mixed conc series
            cbars_gamma = all_results_clean[simname]
        else:
            (conc_ser, cbars_gamma, wser_ibcm, bkvecser_ibcm, yser_ibcm, moments_conc, 
                cgammas_bar_counts, specif_gammas, correl_c_conc) = all_results_clean[simname]
            fullname = "cbars_gamma_ser_" + simname
        cbars_gamma_series[fullname] = cbars_gamma
    np.savez_compressed(fname, **cbars_gamma_series)
    return 0

## IBCM simulations
Run simulations for $\rho = -0.6, \rho = 0.2, \rho = 0.4, \rho=0.7, \rho = 1$


In [None]:
rho_range = np.asarray([-0.6, 0.2, 0.4, 0.7, 1.0])
tser_ibcm = np.arange(0.0, duration, deltat * skp)
all_ibcm_results_clean = {}

for rho in rho_range:
    # Run and keep all in RAM for choice of plotting below
    print("Running simulation for rho = {}".format(rho))
    raw_res = run_ibcm_simulation_correl(rho, simul_seed)
    all_ibcm_results_clean[str(rho)] = analyze_clean_ibcm_simul(raw_res, rho)

## IBCM analysis
Plot cbar_gammas series for each rho, and plot the sample correlated concentrations time series. 

In [None]:
for rho in rho_range:
    print("rho = {}".format(rho))
    cbars_gamma = all_ibcm_results_clean[str(rho)][1]
    fig , ax, _ = plot_hbars_gamma_series(tser_ibcm, cbars_gamma, 
                        skp=5, transient=320000 // skp)
    
    fig.tight_layout()
    leg = ax.legend(loc="upper left", bbox_to_anchor=(1., 1.))
    if do_save_plots:
        fig.savefig(pj("figures", "correlation", 
            "cbargammas_series_turbulent_correlation_{}.pdf".format(str(rho).replace(".", "-"))), 
            transparent=True, bbox_inches="tight", bbox_extra_artists=(leg,))
    plt.show()
    plt.close()

In [None]:
# Plot y series norm for each of these cases
# Not very interesting. Rho = 0.7 is only slightly worse. 
for rho in rho_range:
    print("rho = {}".format(rho))
    yser_ibcm = all_ibcm_results_clean[str(rho)][4]
    bkvecser_ibcm = all_ibcm_results_clean[str(rho)][3]
    fig, ax, bknorm_ser, ynorm_ser = plot_background_norm_inhibition(
                                    tser_ibcm, bkvecser_ibcm, yser_ibcm, skp=1)

    # Compute noise reduction factor, annotate
    transient = 100000 // skp
    norm_stats = compute_back_reduction_stats(bknorm_ser, ynorm_ser, trans=transient)

    print("Mean activity norm reduced to "
          + "{:.1f} % of input".format(norm_stats['avg_reduction'] * 100))
    print("Standard deviation of activity norm reduced to "
          + "{:.1f} % of input".format(norm_stats['std_reduction'] * 100))
    ax.annotate("St. dev. reduced to {:.1f} %".format(norm_stats['std_reduction'] * 100), 
               xy=(0.98, 0.98), xycoords="axes fraction", ha="right", va="top")

    ax.legend(loc="center right", bbox_to_anchor=(1.0, 0.8))
    fig.tight_layout()
    if do_save_plots:
        fig.savefig(pj("figures", "correlation", 
            "pn_activity_norm_turbulent_correlation_ibcm_rho_{}.pdf".format(str(rho).replace(".", "-"))),  
            transparent=True, bbox_inches="tight")
    plt.show()
    plt.close()

In [None]:
# Correlation between nu and cbarser, to see if some neurons are specific to odors
for rho in rho_range:
    print("rho = {}".format(rho))
    cbar_nu_correl = all_ibcm_results_clean[str(rho)][8]
    cbars_gamma_series = all_ibcm_results_clean[str(rho)][1]
    specif_gammas = all_ibcm_results_clean[str(rho)][7]

    fig, ax = plt.subplots()
    img = ax.imshow(cbar_nu_correl.T)
    ax.set(ylabel=r"Component $\gamma$", xlabel=r"Neuron $i$")
    fig.colorbar(img, label=r"$\langle (\bar{c}^i - \langle \bar{c}^i \rangle)"
                 r"(\nu_{\gamma} - \langle \nu_{\gamma} \rangle) \rangle$", 
                location="top")
    fig.tight_layout()
    if do_save_plots:
        fig.savefig(pj("figures", "correlation", 
            "specificities_turbulent_correlation_{}.pdf".format(str(rho).replace(".", "-"))), 
            transparent=True, bbox_inches="tight")
    plt.show()
    plt.close()

    # Check if each component has at least one neuron
    split_val = 3.5
    n_comp_expected = n_components-1 if abs(rho - 1.0) < 1e-6 else n_components
    for comp in range(n_comp_expected):
        print("Number of neurons specific to component {}: {}".format(
                comp, np.sum(np.mean(cbars_gamma_series[-2000:, :, comp], axis=0) > split_val)))

In [None]:
for rho in rho_range:
    print("rho = {}".format(rho))
    wser_ibcm = all_ibcm_results_clean[str(rho)][2]
    fig, axes = plot_w_matrix(tser_ibcm, wser_ibcm, skp=100)
    fig.tight_layout()
    plt.show()
    plt.close()

# BioPCA simulation

## BioPCA parameters that don't change

In [None]:
# BioPCA model parameters
n_i_pca = n_components  # Number of inhibitory neurons for BioPCA case

# Model rates
learnrate_pca = 1e-4  # Learning rate of M
# Choose Lambda diagonal matrix as advised in Minden et al., 2018
# but scale it up to counteract W regularization
lambda_range_pca = 0.5
lambda_max_pca = 8.0
# Learning rate of L, relative to learnrate. Adjusted to Lambda in the integration function
rel_lrate_pca = 2.0  #  / lambda_max_pca**2 
lambda_mat_diag = build_lambda_matrix(lambda_max_pca, lambda_range_pca, n_i_pca)

xavg_rate_pca = learnrate_pca
pca_options = {
    "activ_fct": activ_function, 
    "remove_lambda": False, 
    "remove_mean": True
}
biopca_rates = [learnrate_pca, rel_lrate_pca, lambda_max_pca, lambda_range_pca, xavg_rate_pca]


# Initial synaptic weights: small Gaussian noise (zero mean)
rgen_pca = np.random.default_rng(seed=0x838b5119fbcfea9685dd64bd1d12d6cf)
init_synapses_pca = rgen_pca.standard_normal(size=[n_i_pca, n_dimensions]) / np.sqrt(n_i_pca)
init_mmat_pca = rgen_pca.standard_normal(size=[n_i_pca, n_dimensions]) / np.sqrt(n_dimensions)
init_lmat_pca = np.eye(n_i_pca, n_i_pca)  # Supposed to be near-identity, start as identity
ml_inits_pca = [init_mmat_pca, init_lmat_pca]


## BioPCA simulation functions

In [None]:
def run_biopca_simulation_correl(rho, simseed, duration_local=duration, skp_local=skp):
    # Make a copy of global parameters, change correlation rho
    back_params_local = list(back_params)

    # Target covariance matrix (scaled by variance of underlying independent variables)
    target_cholesky = get_target_cholesky(rho, n_components)

    # Add current Cholesky matrix to list of background params
    back_params_local[-2] = target_cholesky
    
    # Run the BioPCA simulation
    tstart = perf_counter()
    sim_results = integrate_inhib_biopca_network_skip(
                ml_inits_pca, update_fct, init_back_list, biopca_rates, 
                inhib_rates, back_params_local, duration_local, deltat, 
                seed=simseed, noisetype="uniform", skp=skp_local, **pca_options
    )
    tend = perf_counter()
    print("Finished simulation for rho =", rho, "in {:.2f} s".format(tend - tstart))
        
    # Mixed concentrations time series
    nuser_pca = sim_results[1]
    mixed_concs_ser = (np.einsum("ij,kj->ki", target_cholesky, 
                                nuser_pca[:, :, 1] - mean_conc) + mean_conc)

    return [*sim_results, mixed_concs_ser]

In [None]:
def analyze_clean_biopca_simul(results_raw, correl_rho_loc):
    """
    Args:
        results_raw = (tser_pca, nuser_pca, bkvecser_pca, mser_pca, 
            lser_pca, xser_pca, cbarser_pca, wser_pca, yser_pca, mixed_concs_ser)
    Returns:
        mixed_concs_ser, bkvecser_pca, yser_pca, wser_pca,
            true_pca, learnt_pca, off_diag_l_avg_abs, align_error_ser
    """
    (tser_pca, nuser_pca, bkvecser_pca, mser_pca, lser_pca, xser_pca, 
         cbarser_pca, wser_pca, yser_pca, mixed_concs_ser) = results_raw
    
    # Analyze versus true offline PCA of the background samples
    print("Starting analysis of BioPCA vs true PCA")
    tstart = perf_counter()
    res = analyze_pca_learning(bkvecser_pca, mser_pca, lser_pca, 
                           lambda_mat_diag, demean=pca_options["remove_mean"])
    true_pca, learnt_pca, _, off_diag_l_avg_abs, align_error_ser = res
    tend = perf_counter()
    print("Completed analysis in {:.1f} s".format(tend - tstart))
    
    # Also save info about background vs yser_pca
    results_clean = (mixed_concs_ser, bkvecser_pca, yser_pca, wser_pca,
                     true_pca, learnt_pca, off_diag_l_avg_abs, align_error_ser)
    return results_clean


def save_biopca_simuls_to_disk(fname, **all_results_clean):
    # Save true and learnt PCA, that's all we really need
    true_learnt_pcas = {}
    for simname in all_results_clean.keys():
        (mixed_concs_ser, bkvecser_pca, yser_pca, wser_pca, true_pca, 
         learnt_pca, off_diag_l_avg_abs, align_error_ser) = all_results_clean[simname]
        fullname = "true_pca_vals_" + simname
        true_learnt_pcas[fullname] = true_pca[0]
        fullname = "learnt_pca_vals_" + simname
        true_learnt_pcas[fullname] = learnt_pca[0]
        fullname = "pca_align_error_" + simname
        true_learnt_pcas[fullname] = align_error_ser
        print(learnt_pca[1].shape)
    np.savez_compressed(fname, **true_learnt_pcas)
    return 0


## BioPCA simulations

In [None]:
tser_biopca = np.arange(0.0, duration, deltat * skp)
all_biopca_results_clean = {}

for rho in rho_range:
    # Run and keep all in RAM for choice of plotting below
    print("Running simulation for rho = {}".format(rho))
    raw_res = run_biopca_simulation_correl(rho, simul_seed)
    all_biopca_results_clean[str(rho)] = analyze_clean_biopca_simul(raw_res, rho)

## BioPCA analysis

In [None]:
for rho in rho_range:
    print("rho = {}".format(rho))
    # (mixed_concs_ser, bkvecser_pca, yser_pca, wser_pca
    #   true_pca, learnt_pca, off_diag_l_avg_abs, align_error_ser)
    true_pca, learnt_pca = all_biopca_results_clean[str(rho)][4:6]
    align_error_ser = all_biopca_results_clean[str(rho)][7]
    off_diag_l = all_biopca_results_clean[str(rho)][6]
    fig, axes = plot_pca_results(tser_biopca/1000, true_pca, learnt_pca, align_error_ser, off_diag_l)
    axes[-1].set_xlabel("Time (x1000 steps)")
    fig.set_size_inches(fig.get_size_inches()[0], 3*2.5)
    plt.show()
    plt.close()

In [None]:
# Plot y series norm for each of these cases
# Not very interesting, not consistent with level of convergence to true PCA. 
for rho in rho_range:
    print("rho = {}".format(rho))
    yser_biopca = all_biopca_results_clean[str(rho)][2]
    bkvecser_biopca = all_biopca_results_clean[str(rho)][1]
    fig, ax, bknorm_ser, ynorm_ser = plot_background_norm_inhibition(
                                    tser_biopca, bkvecser_biopca, yser_biopca, skp=1)

    # Compute noise reduction factor, annotate
    transient = 100000 // skp
    norm_stats = compute_back_reduction_stats(bknorm_ser, ynorm_ser, trans=transient)

    print("Mean activity norm reduced to "
          + "{:.1f} % of input".format(norm_stats['avg_reduction'] * 100))
    print("Standard deviation of activity norm reduced to "
          + "{:.1f} % of input".format(norm_stats['std_reduction'] * 100))
    ax.annotate("St. dev. reduced to {:.1f} %".format(norm_stats['std_reduction'] * 100), 
               xy=(0.98, 0.98), xycoords="axes fraction", ha="right", va="top")

    ax.legend(loc="center right", bbox_to_anchor=(1.0, 0.8))
    fig.tight_layout()
    if do_save_plots:
        fig.savefig(pj("figures", "correlation", 
            "pn_activity_norm_turbulent_correlation_biopca_rho_{}.pdf".format(str(rho).replace(".", "-"))),  
            transparent=True, bbox_inches="tight")
    plt.show()
    plt.close()

In [None]:
for rho in rho_range:
    print("rho = {}".format(rho))
    wser_biopca = all_biopca_results_clean[str(rho)][3]
    fig, axes = plot_w_matrix(tser_biopca, wser_biopca, skp=100)
    fig.tight_layout()
    plt.show()
    plt.close()

# Saving results to plot

In [None]:
do_save_outputs = True

In [None]:
if do_save_outputs:
    save_folder = pj("..", "results", "for_plots", "correlation")
    save_ibcm_simuls_to_disk(pj(save_folder, "ibcm_examples_turbulent_correl.npz"), 
                             mixed_concs=mixed_concs_sample, **all_ibcm_results_clean)
    save_biopca_simuls_to_disk(pj(save_folder, "biopca_examples_turbulent_correl.npz"), 
                               **all_biopca_results_clean)