# BioPCA habituation to turbulent backgrounds as a function of model rates

For BioPCA, measure L2 distance between learned and true log-eigenvalues? Or alignment subspace error as usual? 
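
Both candidate metrics can be tried on synthetic data first. The snippet below is a minimal sketch and not the implementation in `analyze_pca_learning`: the names `log_eigenvalue_distance` and `subspace_alignment_error` are hypothetical, and the subspace error is assumed here to be the normalized Frobenius distance between orthogonal projectors, which is only one of several common definitions.

```python
import numpy as np


def log_eigenvalue_distance(true_vals, learnt_vals):
    """L2 distance between log10 eigenvalues (hypothetical helper)."""
    true_vals = np.asarray(true_vals, dtype=float)
    learnt_vals = np.asarray(learnt_vals, dtype=float)
    return np.sqrt(np.sum((np.log10(true_vals) - np.log10(learnt_vals))**2))


def subspace_alignment_error(true_vecs, learnt_vecs):
    """Normalized Frobenius distance between the projectors onto the
    column spaces of true_vecs and learnt_vecs, both shaped [n_dim, k].
    0 for identical subspaces, 1 for orthogonal ones (assumed definition).
    """
    # Orthonormalize the columns so that Q Q^T is an orthogonal projector
    q_true, _ = np.linalg.qr(true_vecs)
    q_learnt, _ = np.linalg.qr(learnt_vecs)
    proj_diff = q_true @ q_true.T - q_learnt @ q_learnt.T
    k = true_vecs.shape[1]
    return np.linalg.norm(proj_diff, "fro") / np.sqrt(2 * k)


rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.standard_normal((25, 3)))
rotation, _ = np.linalg.qr(rng.standard_normal((3, 3)))
# Same subspace expressed in a rotated basis: error is zero up to round-off
err_same = subspace_alignment_error(basis, basis @ rotation)
# Disjoint coordinate subspaces: error is one
err_orth = subspace_alignment_error(np.eye(25)[:, :3], np.eye(25)[:, 3:6])
# Every eigenvalue off by a factor 10: distance is sqrt(3) in log10 units
dist = log_eigenvalue_distance([1.0, 0.1, 0.01], [10.0, 1.0, 0.1])
```

The log-eigenvalue distance is blind to rotations within the learned subspace, while the projector distance is blind to eigenvalue scale, so the two metrics carry complementary information.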

Things to check:
 - Convergence of BioPCA as a function of $\mu$ and $\mu_L$. Leave the $\Lambda$ matrix as it is: it is not the major driver of convergence (although it can help a little). 
 - Convergence as a function of the number of odors and the strength of turbulence. 


## Imports

In [None]:
import numpy as np
from scipy import sparse, special
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
from time import perf_counter
import os, json
from os.path import join as pj
import sys
if ".." not in sys.path:
    sys.path.insert(1, "..")

from modelfcts.biopca import (
    integrate_inhib_biopca_network_skip,
    build_lambda_matrix,
    biopca_respond_new_odors
)
from modelfcts.checktools import (
    analyze_pca_learning, 
    check_conc_samples_powerlaw_exp1
)
from modelfcts.backgrounds import (
    update_powerlaw_times_concs, 
    logof10, 
    sample_ss_conc_powerlaw, 
    generate_odorant
)
from utils.statistics import seed_from_gen
from modelfcts.distribs import (
    truncexp1_average,
    powerlaw_cutoff_inverse_transform
)
from utils.smoothing_function import (
    moving_average, 
    moving_var
)
from simulfcts.plotting import (
    plot_hbars_gamma_series, 
    plot_w_matrix, 
    plot_background_norm_inhibition, 
    plot_background_neurons_inhibition, 
    plot_pca_results, 
    hist_outline
)
from simulfcts.analysis import compute_back_reduction_stats
from utils.metrics import l2_norm

# Initialization

### Aesthetic parameters

In [None]:
do_save_plots = False
do_save_outputs = False

root_dir = pj("..")
outputs_folder = pj(root_dir, "results", "for_plots", "convergence")
panels_folder = pj(root_dir, "figures", "convergence")
params_folder = pj(root_dir, "results", "common_params")

# rcParams
with open(pj(params_folder, "olfaction_rcparams.json"), "r") as f:
    new_rcParams = json.load(f)
plt.rcParams.update(new_rcParams)

# color maps
with open(pj(params_folder, "back_colors.json"), "r") as f:
    all_back_colors = json.load(f)
back_color = all_back_colors["back_color"]
back_color_samples = all_back_colors["back_color_samples"]
back_palette = all_back_colors["back_palette"]

with open(pj(params_folder, "orn_colors.json"), "r") as f:
    orn_colors = json.load(f)
    
with open(pj(params_folder, "inhibitory_neuron_two_colors.json"), "r") as f:
    neuron_colors = np.asarray(json.load(f))
with open(pj(params_folder, "inhibitory_neuron_full_colors.json"), "r") as f:
    neuron_colors_full24 = np.asarray(json.load(f))
# Here, 32 neurons, need to make a new palette with same parameters
neuron_colors_full = np.asarray(sns.husl_palette(n_colors=32, h=0.01, s=0.9, l=0.4, as_cmap=False))

with open(pj(params_folder, "model_colors.json"), "r") as f:
    model_colors = json.load(f)
with open(pj(params_folder, "model_nice_names.json"), "r") as f:
    model_nice_names = json.load(f)

models = list(model_colors.keys())
print(models)

# Background generation and initialization functions

In [None]:
def linear_combi(concs, backs):
    """ concs: shaped [..., n_odors]
        backs: 2D array, shaped [n_odors, n_osn]
    """
    return concs.dot(backs)
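
As a quick shape check of the docstring above, the following sketch applies `linear_combi` to a single concentration vector and to a time series of concentrations. The sizes (3 odors, 25 OSNs) mirror the defaults used later in this notebook; the array contents are arbitrary.

```python
import numpy as np


def linear_combi(concs, backs):
    """ concs: shaped [..., n_odors]
        backs: 2D array, shaped [n_odors, n_osn]
    """
    return concs.dot(backs)


n_odors, n_osn = 3, 25
backs = np.arange(n_odors * n_osn, dtype=float).reshape(n_odors, n_osn)

# One concentration vector gives one background vector, shaped [n_osn]
single = linear_combi(np.asarray([1.0, 0.0, 2.0]), backs)
# A series of 5 concentration vectors gives an array shaped [5, n_osn]
series = linear_combi(np.ones((5, n_odors)), backs)
```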

In [None]:
# Global choice of background and odor mixing functions
update_fct = update_powerlaw_times_concs
combine_fct = linear_combi

In [None]:
# We will later explore the effect of varying these parameters on the convergence, 
# but put the default ones in a function
def default_background_params(n_comp):
    """ Default time and concentration parameters for the turbulent process"""
    # Turbulent background parameters: same rates and constants for all odors
    back_pms_turbulent = [
        np.asarray([1.0] * n_comp),        # whiff_tmins
        np.asarray([500.] * n_comp),       # whiff_tmaxs
        np.asarray([1.0] * n_comp),        # blank_tmins
        np.asarray([800.0] * n_comp),      # blank_tmaxs
        np.asarray([0.6] * n_comp),        # c0s
        np.asarray([0.5] * n_comp),        # alphas
    ]
    return back_pms_turbulent

In [None]:
# Background initialization, given parameters and a seeded random generator
def initialize_given_background(back_pms, rgen, n_comp, n_dim):
    # Initial values of background process variables (t, c for each variable)
    init_concs = sample_ss_conc_powerlaw(*back_pms[:-1], size=1, rgen=rgen)
    init_times = powerlaw_cutoff_inverse_transform(
                    rgen.random(size=n_comp), *back_pms[2:4])
    tc_init = np.stack([init_times, init_concs.squeeze()], axis=1)

    # Initial background vector: combine odors with the tc_init concentrations
    back_comps = back_pms[-1]
    init_bkvec = combine_fct(tc_init[:, 1], back_comps)
    # background random variables are first in the list of initial values
    init_back = [tc_init, init_bkvec]
    
    return init_back

# BioPCA simulation functions

In [None]:
# Example of analyses to do on BioPCA results
def analyze_clean_biopca_simul(results_raw):
    """
    We do not need to save odor vectors (back_components), 
    since the IBCM simulation will provide them for both models. 
    
    Args:
        results_raw = (tser_pca, nuser_pca, bkvecser_pca, mser_pca, 
            lser_pca, xser_pca, hbarser_pca, wser_pca, yser_pca)
    Returns:
        results_clean = (true_pca, learnt_pca, off_diag_l_avg_abs, 
            align_error_ser, eigenvals_diff_log, eigenvals_vari)
    Note:
        Relies on the globals lambda_mat_diag and pca_options. 
    """
    (tser_pca, nuser_pca, bkvecser_pca, mser_pca, lser_pca, xser_pca, 
         hbarser_pca, wser_pca, yser_pca) = results_raw
    
    # Analyze versus true offline PCA of the background samples
    print("Starting analysis of BioPCA vs true PCA")
    tstart = perf_counter()
    res = analyze_pca_learning(bkvecser_pca, mser_pca, lser_pca, 
                           lambda_mat_diag, demean=pca_options["remove_mean"])
    true_pca, learnt_pca, _, off_diag_l_avg_abs, align_error_ser = res
    tend = perf_counter()
    print("Completed analysis in {:.1f} s".format(tend - tstart))
    
    ysernorm_pca = l2_norm(yser_pca, axis=1)
    bkvecsernorm_pca = l2_norm(bkvecser_pca, axis=1)
    
    # Also save info about background vs yser_pca
    
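    # Squared difference between learnt and true eigenvalues in log scale, 
    # one value per component (not summed); learnt values are averaged 
    # over the last quarter of the simulation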
    transient = int(3 * tser_pca.shape[0] // 4)
    avg_learnt = learnt_pca[0][transient:].mean(axis=0)
    n_comp = nuser_pca.shape[1]
    eigenvals_diff_log = (np.log10(true_pca[0][:n_comp]) - np.log10(avg_learnt[:n_comp]))**2
    
    eigenvals_vari = np.var(learnt_pca[0][transient:, :n_comp], axis=0, ddof=1)
    
    results_clean = (true_pca, learnt_pca, off_diag_l_avg_abs, 
                     align_error_ser, eigenvals_diff_log, eigenvals_vari)
    return results_clean

In [None]:
def run_analyze_biopca_one_back_seed(
        biopca_rates_loc, back_rates, inhib_rates_loc, 
        options_loc, dimensions, seedseq, 
        duration_loc=360000.0, dt_loc=1.0, skp_loc=20, full_returns=False
    ):
    """ Given BioPCA model rates and background parameters except
    background odors (but incl. number odors and c0), and a main seed sequence, 
    run and analyze convergence of BioPCA on the background generated from that seed. 
    The seedseq should itself have been spawned from a root seed to have a distinct
    one per run; this still makes seeds reproducible yet distinct for different runs. 
    The seedseq here is spawned again for a background gen. seed and a simul. seed. 
    
    Args:
        dimensions: gives [n_components, n_dimensions, n_i_pca]
    
    Returns:
        analysis_results_ret: (true_pca, learnt_pca, off_diag_l_avg_abs, 
            align_error_ser, pc_diff, pc_vari), where
            pc_diff: squared log10 difference between true and learnt 
                eigenvalues, indexed [component]
            pc_vari: variance of the learnt eigenvalues after the transient, 
                indexed [component]
        sim_results_ret: raw simulation results if full_returns, else None
    """
    #print("Initializing BioPCA simulation...")
    # Get dimensions
    n_comp, n_dim, n_i_pca = dimensions
    
    # Spawn back. generation seed and simul seed
    initseed, simseed = seedseq.spawn(2)
    
    # Duplicate back params before appending locally-generated odor vectors to them
    back_pms_loc = list(back_rates)
    
    # Create background
    rgen_init = np.random.default_rng(initseed)
    back_comps_loc = generate_odorant((n_comp, n_dim), rgen_init)
    back_comps_loc = back_comps_loc / l2_norm(back_comps_loc, axis=1)[:, None]

    # Add odors to the list of background parameters
    back_pms_loc.append(back_comps_loc)

    # Initialize background with a generator re-created from initseed
    # (same seed as for the odor vectors, so the random stream restarts)
    rgen_init = np.random.default_rng(initseed)
    init_back = initialize_given_background(back_pms_loc, rgen_init, n_comp, n_dim)

    # Initial synaptic weights: small zero-mean Gaussian noise
    # (init_synapses_pca is not used below, but it consumes random draws)
    init_synapses_pca = rgen_init.standard_normal(size=[n_i_pca, n_dim]) / np.sqrt(n_i_pca)
    init_mmat_pca = rgen_init.standard_normal(size=[n_i_pca, n_dim]) / np.sqrt(n_dim)
    init_lmat_pca = np.eye(n_i_pca, n_i_pca)  # Supposed to be near-identity, start as identity
    ml_inits_pca = [init_mmat_pca, init_lmat_pca]
    
    # Run the BioPCA simulation
    print("Running BioPCA simulation...")
    tstart = perf_counter()
    sim_results = integrate_inhib_biopca_network_skip(
                ml_inits_pca, update_fct, init_back, 
                biopca_rates_loc, inhib_rates_loc, back_pms_loc,
                duration_loc, dt_loc, seed=simseed, 
                noisetype="uniform",  skp=skp_loc, **options_loc
    )
    tend = perf_counter()
    print("Finished BioPCA simulation in {:.2f} s".format(tend - tstart))
    
    # Now analyze BioPCA simul for convergence
    print("Starting to analyze BioPCA simulation...")
    tstart = perf_counter()
    results_clean = analyze_clean_biopca_simul(sim_results)
    # TODO: compute additional convergence metrics
    true_pca, learnt_pca, off_diag_l_avg_abs, align_error_ser, pc_diff, pc_vari = results_clean
    tend = perf_counter()
    print("Finished analyzing BioPCA simulation in {:.2f} s".format(tend - tstart))
    
    # Doesn't return full series, only the summary statistics of convergence
    analysis_results_ret = (true_pca, learnt_pca, off_diag_l_avg_abs, align_error_ser, pc_diff, pc_vari)
    if full_returns:
        sim_results_ret = sim_results
    else:
        sim_results_ret = None
    
    return analysis_results_ret, sim_results_ret
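
The seeding scheme described in the docstring above can be sketched in isolation with `numpy.random.SeedSequence`. The root entropy `2024` and the helper name `draws_for_runs` are arbitrary choices for this illustration. One point worth keeping in mind: `spawn` advances an internal counter on the parent, so a reproducible run has to rebuild the parent `SeedSequence` from its entropy rather than call `spawn` again on the same object.

```python
import numpy as np


def draws_for_runs(entropy, n_runs=3):
    """Spawn one child seed per run, then a background seed and a
    simulation seed per child, and draw a few numbers from each."""
    root = np.random.SeedSequence(entropy)
    out = []
    for seedseq in root.spawn(n_runs):
        initseed, simseed = seedseq.spawn(2)
        init_draw = np.random.default_rng(initseed).random(2)
        sim_draw = np.random.default_rng(simseed).random(2)
        out.append((init_draw, sim_draw))
    return out


# Rebuilding the root from the same entropy reproduces every run
first = draws_for_runs(2024)
second = draws_for_runs(2024)

# Spawning twice from the same object gives different children
root = np.random.SeedSequence(2024)
child_a = root.spawn(1)[0]
child_b = root.spawn(1)[0]
```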

Enjoy a simplification for once: we do not need to consider other models such as average subtraction, optimal $P$, or orthogonal projection, since all we care about in this notebook is the convergence of the two biologically plausible models. 

# Plotting functions

In [None]:
def plot_biopca_results(res_biopca_clean, res_biopca_raw, skp_loc=10):
    # Extract individual arrays from the lists of results
    (tser_pca, nuser_pca, bkvecser_pca, mser_pca, lser_pca, xser_pca, 
         hbarser_pca, wser_pca, yser_pca) = res_biopca_raw
    (true_pca, learnt_pca, off_diag_l_avg_abs, align_error_ser, pca_diff, pca_vari) = res_biopca_clean
    
    print("log-scale squared distance in principal values learned vs true:", pca_diff)

    # Plot learnt vs true PCA
    tser_scaled = tser_common * dtscale
    fig, axes = plot_pca_results(tser_scaled, true_pca, learnt_pca, align_error_ser, off_diag_l_avg_abs)
    axes[-1].set_xlabel("Time (min)")
    axes[0].get_legend().remove()
    fig.tight_layout()
    fig.set_size_inches(fig.get_size_inches()[0], 2.5*plt.rcParams["figure.figsize"][1])
    plt.show()
    plt.close()

    # Plot level of background inhibition
    fig, ax, bknorm_ser, ynorm_ser = plot_background_norm_inhibition(
                                    tser_scaled*1000, bkvecser_pca, yser_pca, skp=2)
    ax.set_xlabel("Time (min)")
    
    # Compute noise reduction factor, annotate
    transient = 250000 // skp_loc
    norm_stats = compute_back_reduction_stats(bknorm_ser, ynorm_ser, trans=transient)

    print("Mean activity norm reduced to "
          + "{:.1f} % of input".format(norm_stats['avg_reduction'] * 100))
    print("Standard deviation of activity norm reduced to "
          + "{:.1f} % of input".format(norm_stats['std_reduction'] * 100))
    ax.annotate("St. dev. reduced to {:.1f} %".format(norm_stats['std_reduction'] * 100), 
               xy=(0.98, 0.98), xycoords="axes fraction", ha="right", va="top")

    ax.legend(loc="center right", bbox_to_anchor=(1.0, 0.8))
    fig.tight_layout()
    plt.show()
    plt.close()
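
The percentages printed by `plot_biopca_results` read as ratios of response statistics to input statistics. The sketch below shows one way such ratios could be computed. It is an assumption based only on the key names `avg_reduction` and `std_reduction`: `back_reduction_sketch` is a hypothetical stand-in, not the code of `compute_back_reduction_stats`, which may differ in its details.

```python
import numpy as np


def back_reduction_sketch(bknorm_ser, ynorm_ser, trans=0):
    """Ratios of the mean and standard deviation of the response norm
    to those of the background norm, after discarding `trans` samples.
    Hypothetical stand-in for compute_back_reduction_stats.
    """
    bk = np.asarray(bknorm_ser)[trans:]
    yn = np.asarray(ynorm_ser)[trans:]
    return {
        "avg_reduction": yn.mean() / bk.mean(),
        "std_reduction": yn.std() / bk.std(),
    }


rng = np.random.default_rng(3)
bknorm = rng.uniform(0.5, 1.5, size=1000)
# A response that is a uniformly scaled-down copy of the input
stats = back_reduction_sketch(bknorm, 0.2 * bknorm, trans=100)
```

With a response equal to 20 % of the input at every time point, both ratios come out at 0.2, which is the sense in which the printed values are "% of input".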

# BioPCA simulation with proper rates 
To illustrate convergence

In [None]:
# Common parameters for all simulations
# Dimensions: 25 is enough?
n_dimensions = 25
n_components = 3  # try with 3 for simplicity by default

# Inhibition W learning and decay rates
inhib_rates_default = [0.00005, 0.00001]  # alpha, beta  [0.00025, 0.00005]

# Simulation duration and integration time step
duration = 360000.0
deltat = 1.0

# Saving every skp simulation point, 50 is enough for plots, 
# here use 20 to get convergence time accurately
skp_default = 20 * int(1.0 / deltat)
tser_common = np.arange(0.0, duration, deltat*skp_default)
dtscale = 10.0 / 1000.0 / 60.0  # to convert time steps units to minutes

# Common model options
activ_function = "identity"  #"ReLU"
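
The time bookkeeping in the cell above can be checked with a few lines of arithmetic. This sketch reads `dtscale = 10.0 / 1000.0 / 60.0` as meaning that one time step stands for 10 ms, which is an inference from the comment on that line, and it reuses the transient of 250000 steps from `plot_biopca_results`.

```python
import numpy as np

duration = 360000.0               # simulation length, in time steps
deltat = 1.0                      # integration time step
skp = 20                          # one saved point every skp steps
dtscale = 10.0 / 1000.0 / 60.0    # time steps -> minutes (10 ms per step)

tser = np.arange(0.0, duration, deltat * skp)
n_saved = tser.shape[0]                    # number of saved time points
total_minutes = duration * dtscale         # about 60 minutes
transient_index = 250000 // skp            # index into the saved series
transient_minutes = tser[transient_index] * dtscale   # about 41.7 minutes
```

So the default run covers roughly one hour of simulated time, and the statistics computed after the transient use about the last 18 minutes.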

## BioPCA default parameters

In [None]:
# BioPCA model parameters, same for all epsilons
n_i_pca = n_components  # Number of inhibitory neurons for BioPCA case

# Model rates
learnrate_pca = 2.5e-5  # Learning rate of M
# Choose Lambda diagonal matrix as advised in Minden et al., 2018
# but scale it up to counteract W regularization
lambda_range_pca = 0.3
lambda_max_pca = 9.0
# Learning rate of L, relative to learnrate. Adjusted to Lambda in the integration function
rel_lrate_pca = 3.0
lambda_mat_diag = build_lambda_matrix(lambda_max_pca, lambda_range_pca, n_i_pca)

xavg_rate_pca = 1e-4  #learnrate_pca
pca_options = {
    "activ_fct": activ_function, 
    "remove_lambda": False, 
    "remove_mean": True
}
biopca_rates_default = [learnrate_pca, rel_lrate_pca, lambda_max_pca, lambda_range_pca, xavg_rate_pca]

In [None]:
# Test run for now
# Create a default background for testing purposes
meta_seedseq = np.random.SeedSequence(0x6225b86e0b4a826a8845d573f634c7d9)

# Package dimensions and back. parameters
back_rates_default = default_background_params(n_components)
# Try changing background rates here if desired
# Concentration scale c0: multiply up-down to change convergence dynamics
# just as well as the learning rate, although with a different scaling. 
back_rates_default[4][:] = 0.6
back_rates_default[1][:] = 500.0  # whiff duration
back_rates_default[3][:] = 800.0  # blank duration
dimensions_biopca = [n_components, n_dimensions, n_i_pca]

# Run and analyze simulation derived from the meta seedsequence
lysis_res, sim_res = run_analyze_biopca_one_back_seed(biopca_rates_default, back_rates_default, 
                        inhib_rates_default, pca_options, dimensions_biopca, meta_seedseq,
                        duration_loc=duration, dt_loc=deltat, skp_loc=skp_default, full_returns=True)

In [None]:
# Visualize convergence dynamics first
print("True eigenvalues:", lysis_res[0][0][:n_components])
print("L_ii standard dev., scaled by true eigenvalues (so, CV):", 
      np.sqrt(lysis_res[5]) / lysis_res[0][0][:n_components])
plot_biopca_results(lysis_res, sim_res, skp_loc=skp_default)

In [None]:
fig, axes = plt.subplots(n_i_pca)
fig.set_size_inches(plt.rcParams["figure.figsize"][0], plt.rcParams["figure.figsize"][1]*3)
tser_scaled = tser_common * dtscale
mser_biopca = sim_res[3]
lser_biopca = sim_res[4]

tsl = slice(tser_scaled.shape[0]//3, None, 4)
for i in range(n_i_pca):
    for j in range(n_dimensions):
        axes[i].plot(tser_scaled[tsl], mser_biopca[tsl, i, j])
plt.show()
plt.close()

# Observations

Recall the BioPCA equations, written in my notation:

$$ \frac{\mathrm{d} M}{\mathrm{d} t} = \mu (\mathbf{h} \mathbf{s}^T - M) $$

$$ \frac{\mathrm{d} L'}{\mathrm{d} t} = \mu_L \mu (\mathbf{h} \mathbf{h}^T - \underline{\underline{\Lambda}} L' \underline{\underline{\Lambda}} ) $$

$$ \mathbf{h} = LM \mathbf{s} \quad, \quad L' = L^{-1} \quad, \quad \mu_L \approx 2$$
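
For reference, a forward-Euler step of these equations can be sketched as follows. This is a bare-bones illustration under stated assumptions: the name `biopca_euler_step` is hypothetical, $\Lambda$ is taken to be diagonal, and the input mean removal, the noise and the inhibitory weights $W$ handled by `integrate_inhib_biopca_network_skip` are all left out. The $\Lambda$ values below are illustrative, not those returned by `build_lambda_matrix`.

```python
import numpy as np


def biopca_euler_step(m_mat, lp_mat, s_vec, lambda_diag, mu, mu_l, dt=1.0):
    """One forward-Euler step of the M and L' equations (sketch).
    m_mat: M, shaped [n_i, n_dim]; lp_mat: L', shaped [n_i, n_i];
    s_vec: input, shaped [n_dim]; lambda_diag: diagonal of Lambda, [n_i].
    """
    # h = L M s with L = (L')^{-1}: solve the linear system, no explicit inverse
    h_vec = np.linalg.solve(lp_mat, m_mat @ s_vec)
    dm = mu * (np.outer(h_vec, s_vec) - m_mat)
    # For diagonal Lambda, Lambda L' Lambda is an elementwise rescaling of L'
    lam_lp_lam = lambda_diag[:, None] * lp_mat * lambda_diag[None, :]
    dlp = mu_l * mu * (np.outer(h_vec, h_vec) - lam_lp_lam)
    return m_mat + dt * dm, lp_mat + dt * dlp, h_vec


rng = np.random.default_rng(1)
m0 = rng.standard_normal((3, 25)) / 5.0
lp0 = np.eye(3)                        # L' starts as the identity
lam = np.asarray([1.0, 0.8, 0.6])      # illustrative Lambda diagonal
s = rng.standard_normal(25)
m1, lp1, h = biopca_euler_step(m0, lp0, s, lam, mu=2.5e-5, mu_l=2.0)
```

Solving the linear system for $\mathbf{h}$ avoids forming $L$ explicitly, which is convenient here because only $L'$ is updated.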

My usual initial conditions were $L'_{ii} = 1$: much too large compared to typical principal values in the backgrounds I consider. Then, if $L'$ is too large, $L$ is too small: the amplitude of $\mathbf{h}$ is suppressed, so we essentially have exponential decay, $ \frac{\mathrm{d} L'}{\mathrm{d} t} \sim - \mu_L \mu \underline{\underline{\Lambda}} L' \underline{\underline{\Lambda}}$, of the eigenvalues of $L'$ (increase of $L$). So that's where the linear behavior in log scale comes from. The decay rate of $L'_{ii}$ is $\mu_L \mu \Lambda_{ii}^2$: each eigenvalue gets a different decay rate, so they reach different eigenvalues at different times, which breaks the fixed point degeneracy. The fastest-decaying entry (largest $\Lambda_{ii}$, the first entry of the matrix) stops decaying at the largest $L'$, and so on.  
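
The early decay phase can be checked numerically by setting $\mathbf{h} = 0$ in the $L'$ equation, which leaves $\mathrm{d} L'_{ii} / \mathrm{d} t = -\mu_L \mu \Lambda_{ii}^2 L'_{ii}$. The sketch below integrates this with forward Euler and compares it to the exponential solution. The $\Lambda$ values are illustrative assumptions, and the approximation $\mathbf{h} \approx 0$ only holds before $M$ starts growing again.

```python
import numpy as np

mu, mu_l = 2.5e-5, 2.0
lam = np.asarray([1.0, 0.8, 0.6])    # illustrative Lambda diagonal
rates = mu_l * mu * lam**2           # decay rate of each L'_ii when h = 0

n_steps, dt = 20000, 1.0
lp_diag = np.ones(3)                 # L'_ii(0) = 1
for _ in range(n_steps):
    lp_diag = lp_diag - dt * rates * lp_diag   # pure decay, no h h^T term

# Closed-form solution of the same linear equation
analytic = np.exp(-rates * n_steps * dt)
# Slope of log10(L'_ii) versus time: a straight line on a log-scale plot
log10_slopes = -rates / np.log(10.0)
```

The entry with the largest $\Lambda_{ii}$ decays fastest, so the diagonal entries of $L'$ separate from each other even though they all start from the same value.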

For $M$, due to small $h$, we initially get exponential decay. Then, once $L'$ is small enough ($L$ large enough, $h$ larger), $M$ increases until it stabilizes at the fixed point. 

Note that by increasing the scale of $\Lambda$, we increase the $L'$ decay rate: we should probably compensate by decreasing $\mu_L$ if $L'$ entries are found to shoot past the true eigenvalues. We want to keep $\Lambda^2 \mu_L > 2$ to ensure convergence.

Note that if $M$ has too much time to decay, $h$ stays too small and $L'$ keeps decaying and overshoots the eigenvalue. So this is also why we want $\mu_L$ relatively large. It's a good idea, in fact, to keep $\mu_L = 2$ even if the $\Lambda$ scale is increased. 


For $L'_{ii}$ much too small initially ($L$ large), we would have $h$ very large, so $L'$ increases. 