In [None]:
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
from scipy import special

# Analytical derivation

Let $Y_j$ be the activity of Kenyon cell $j \in \{1, 2, \ldots, 2000\}$, $S_i$ be the activity of ORN $i \in \{1, 2, \ldots, 50\}$, and $X_i = \max({S_i - W_i(t), 0})$ be the corresponding projection neuron, where $W_i$ is the habituation weight at time $t$. Each KC is a sum of $k=6$ randomly selected projection neurons, denoted by the set of $k$ indices called $I_j$:

$$ Y_j = \sum_{i \in I_j} X_{i} $$

For a pure odor, the $S_i$ are i.i.d. exponential with parameter $\lambda=0.1$ (mean $1/\lambda = 10$). For a mixture of odors, they are a linear combination of such i.i.d. exponential variables. 

Let's focus on the first KC and choose indices so that $I_j = \{1, 2, \ldots, k\}$ for simplicity of notation:

$$ Y_1 = \sum_{i=1}^k X_i $$


We are interested in the number $M$ of KCs that will be above some threshold $\tau_0$, 

$$M = \sum_{j=1}^{N_{KC}} \mathbb{1}_{Y_j > \tau_0} \, ,$$

where $\mathbb{1}_{Y_j > \tau_0}$ is $1$ if $Y_j$ is above threshold and $0$ otherwise (note that this is a Bernoulli variable with probability of success $\mathbb{P}[Y_j > \tau_0]$). Ideally, we would want the full distribution $\mathbb{P}[M=m]$, but that is hard to obtain because the $Y_j$ are correlated: they are sums of subsets of the same fifty $X_i$. The Bernoulli variables $\mathbb{1}_{Y_j > \tau_0}$ are therefore correlated too. Moreover, their correlation is hard to guess because the subsets are chosen randomly. 

Nonetheless, using the linearity of expectation values, we can get 

$$ \mathbb{E}[M] = \sum_{j=1}^{N_{KC}} \mathbb{E}[\mathbb{1}_{Y_j > \tau_0}] = \sum_{j=1}^{N_{KC}} \mathbb{P}[Y_j > \tau_0]  $$

The $Y_j$ are identically distributed, although not independent, because each of them is a sum of $k$ of the i.i.d. variables $X_i$ (the $X_i$ are i.i.d. because the $S_i$ are i.i.d. exponential, or sums of exponentials if combining odors, and each $X_i$ depends only on the corresponding $S_i$). Therefore, we can replace $Y_j$ by $Y_1$ above, and get a manageable expression for $\mathbb{E}[M]$:

$$ \mathbb{E}[M] = N_{KC} \mathbb{P}[Y_1 > \tau_0]  $$
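
The identity above can be checked numerically even though the $Y_j$ are correlated. The following is a minimal Monte Carlo sketch, not part of the original analysis. It assumes no habituation ($X_i = S_i$) and a fixed random wiring in which each KC reads $k$ distinct PNs; the names `wiring` and `n_trials` and the seed are illustrative choices.

```python
import numpy as np
from scipy import special

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n_orn, n_kc, k = 50, 2000, 6
lam, tau0 = 0.1, 20.0
n_trials = 200  # number of independent pure odors drawn

# Assumed wiring: each KC sums k distinct PNs, drawn once and then kept fixed
wiring = np.array([rng.choice(n_orn, size=k, replace=False) for _ in range(n_kc)])

counts = np.empty(n_trials)
for trial in range(n_trials):
    s = rng.exponential(scale=1.0 / lam, size=n_orn)  # pure odor, X_i = S_i
    y = s[wiring].sum(axis=1)                         # KC activities Y_j
    counts[trial] = np.count_nonzero(y > tau0)        # M for this odor

expected = n_kc * special.gammaincc(k, lam * tau0)
print("Monte Carlo mean of M:", counts.mean())
print("N_KC * P[Y_1 > tau_0]:", expected)
```

The two printed numbers should agree up to sampling noise, since linearity of expectation does not require independence of the $Y_j$.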

### Note
If the $Y_j$ were independent, i.e. the PNs were randomly drawn to form each $Y_j$ separately, the $Y_j$ would also be identically distributed (since the $X_i$ are), and we would have

$$ \mathbb{P}[M=m] = \binom{N_{KC}}{m} (\mathbb{P}[Y_1 > \tau_0])^m (1 - \mathbb{P}[Y_1 > \tau_0])^{N_{KC}-m}  $$

because each KC would independently have a probability $\mathbb{P}[Y_1 > \tau_0]$ of being above threshold. $M$ would then be a sum of i.i.d. Bernoulli variables indicating whether $Y_j$ is above $\tau_0$ (with probability $\mathbb{P}[Y_j > \tau_0] = \mathbb{P}[Y_1 > \tau_0]$) or below, and the binomial distribution follows. 
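
As an illustration of this note, the binomial law can be evaluated with `scipy.stats.binom`. This is only a sketch under the independence assumption, which does not hold in the actual model. The parameter values are those used later in this notebook, and the cutoff of 1950 is an arbitrary example.

```python
import numpy as np
from scipy import special, stats

n_kc, k, lam, tau0 = 2000, 6, 0.1, 20.0
p_above = special.gammaincc(k, lam * tau0)  # P[Y_1 > tau_0] for a pure odor

# Binomial law of M, valid only under the independence assumption of this note
m_dist = stats.binom(n_kc, p_above)
print("mean of M:", m_dist.mean())
print("standard deviation of M:", m_dist.std())
print("P[M < 1950]:", m_dist.cdf(1949))
```

The mean is the same $N_{KC} \mathbb{P}[Y_1 > \tau_0]$ as in the correlated case; only the spread around it depends on the independence assumption.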

In [None]:
def average_nkc_above_thresh(tau, npn_per_kc, inv_avg_pn, nkc):
    # E[M] = N_KC * Q(k, lambda * tau), with Q the regularized upper incomplete gamma function
    return nkc * sp.special.gammaincc(npn_per_kc, tau*inv_avg_pn)

## Pure odorant, no habituation
In this case, $Y_1 = \sum_{i=1}^k X_i = \sum_{i=1}^k S_i$, where the $S_i$ are i.i.d. exponential with rate $\lambda=0.1$ (mean $1/\lambda = 10$). Therefore, $Y_1 \sim \mathrm{Gamma}(k, \lambda)$. As a reminder, the probability density of a gamma distribution with shape $k$ and rate $\lambda$ is

$$ f_Y(y) = \frac{\lambda^k}{\Gamma(k)} y^{k-1} e^{-\lambda y} $$

The complementary cumulative distribution follows by integrating the density above $\tau_0$ and substituting $u = \lambda y$:

$$ \mathbb{P}[Y_1 > \tau_0] = \frac{1}{\Gamma(k)} \int_{\lambda \tau_0}^{\infty} u^{k-1} e^{-u} du = Q(k, \lambda \tau_0) $$

where $Q$ is the regularized upper incomplete gamma function. This special function is conveniently available as ``scipy.special.gammaincc``. 

We compute $\mathbb{E}[M] = N_{KC} Q(k, \lambda \tau_0)$ below for $k=6$, $N_{KC}=2000$, $\tau_0 = 20$, $\lambda=0.1$ (mean of the exponential: 10). We see that tags keeping the top 100 KCs are well-filled for a pure odor without habituation with this choice of threshold. 
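
As a sanity check on the choice of special function, the sketch below compares `scipy.special.gammaincc` with two equivalent expressions: the survival function of `scipy.stats.gamma`, and the finite sum $Q(k, x) = e^{-x} \sum_{j=0}^{k-1} x^j / j!$, which holds for integer $k$. The variable names are illustrative.

```python
import math
import numpy as np
from scipy import special, stats

k, lam, tau0 = 6, 0.1, 20.0
x = lam * tau0

q_special = special.gammaincc(k, x)
# Survival function of a Gamma variable with shape k and rate lam (scale = 1/lam)
q_survival = stats.gamma.sf(tau0, a=k, scale=1.0 / lam)
# For integer k, Q(k, x) = exp(-x) * sum_{j<k} x^j / j!
q_poisson_sum = math.exp(-x) * sum(x**j / math.factorial(j) for j in range(k))

print(q_special, q_survival, q_poisson_sum)
```

All three values should coincide up to floating-point error.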

In [None]:
# Define parameters
lambd = 1 / 10
nKC = 2000
k = 6
tau0 = 20
average_no_habit = average_nkc_above_thresh(tau0, k, lambd, nKC)
print(average_no_habit)

## Pure odorant, full habituation

After full habituation, we have $W_i = \frac{\alpha}{\alpha + \beta} S_i$, so that

$$ X_i = S_i - \frac{\alpha}{\alpha + \beta} S_i = \frac{\beta}{\alpha + \beta} S_i $$ 

which follows an exponential distribution with mean $\frac{1}{\lambda} \frac{\beta}{\alpha + \beta}$, so an exponential distribution with $\lambda' = \lambda (1 + \frac{\alpha}{\beta})$. Therefore, the above reasoning still holds, and we obtain

$$ \mathbb{E}[M] = N_{KC} Q\left(k, \lambda \tau_0(1 + \alpha / \beta) \right) $$

We show below that this results in very sparse KC tags (far fewer than 100 cells above threshold) after complete habituation to a pure odorant, with habituation rates $\alpha=0.05$ and $\beta=0.01$. This corresponds to figure 2B in the paper. 
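
The rescaling argument can also be checked by direct sampling. The sketch below is illustrative rather than part of the original analysis: it draws the $k$ inputs of a single KC, applies the full-habituation factor $\beta / (\alpha + \beta)$, and compares the empirical fraction above threshold with $Q(k, \lambda \tau_0 (1 + \alpha/\beta))$. The sample size and seed are arbitrary.

```python
import numpy as np
from scipy import special

rng = np.random.default_rng(1)  # arbitrary seed
k, lam, tau0 = 6, 0.1, 20.0
alpha, beta = 0.05, 0.01
n_samples = 200_000

s = rng.exponential(scale=1.0 / lam, size=(n_samples, k))
x = beta / (alpha + beta) * s  # fully habituated PN activities
frac_above = np.mean(x.sum(axis=1) > tau0)

lam_habit = lam * (1 + alpha / beta)
q_theory = special.gammaincc(k, lam_habit * tau0)
print("empirical:", frac_above, "theory:", q_theory)
```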

In [None]:
# Pure odorant, perfect habituation
alpha = 0.05
beta = 0.01
lambd_full_habit = (alpha + beta) / beta * lambd
average_full_habit = average_nkc_above_thresh(tau0, k, lambd_full_habit, nKC)
print(average_full_habit)

## Pure odorant, partial habituation

Now, the subtracted weights are $W_i(t) = \frac{\alpha}{\alpha + \beta} S_i ( 1 - e^{-(\alpha + \beta)t})$. Therefore, $X_i$ becomes $S_i - W_i(t) = (\beta + \alpha e^{-(\alpha + \beta)t}) S_i / (\alpha + \beta)$, which follows an exponential distribution with parameter $\lambda'(t) = \frac{(\alpha + \beta) \lambda}{\beta + \alpha e^{-(\alpha + \beta)t}}$. Following the same reasoning, we again get a Gamma distribution for $Y_j$, and thus

$$ \mathbb{E}[M] = N_{KC} Q \left(k, \frac{(\alpha + \beta) \lambda \tau_0}{\beta + \alpha e^{-(\alpha + \beta)t}} \right) $$

We compute this below for the typical choice of parameters and see that we should still get, on average, a dense KC tag after 50 time steps of habituation. 
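
To make "long enough" habituation concrete, one can solve for the time at which $\mathbb{E}[M]$ falls to 100. The sketch below uses `scipy.optimize.brentq` with the rate $\lambda'(t)$ given above. The helper name `expected_active_kc` and the bracket $[0, 1000]$ are assumptions chosen for these parameter values.

```python
import numpy as np
from scipy import optimize, special

n_kc, k, lam, tau0 = 2000, 6, 0.1, 20.0
alpha, beta = 0.05, 0.01

def expected_active_kc(t):
    # lambda'(t) for partial habituation to a pure odor
    lam_t = (alpha + beta) * lam / (beta + alpha * np.exp(-(alpha + beta) * t))
    return n_kc * special.gammaincc(k, lam_t * tau0)

# Time at which E[M] crosses 100. The bracket [0, 1000] is an assumption that
# holds here because E[M] is above 100 at t=0 and below 100 at t=1000.
t_cross = optimize.brentq(lambda t: expected_active_kc(t) - 100.0, 0.0, 1000.0)
print("E[M] at t=50:", expected_active_kc(50.0))
print("E[M] drops below 100 at t =", t_cross)
```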

In [None]:
def habituate_time_lambda_factor(t, alph, bet):
    time_factor = (bet + alph * np.exp(-(alph+bet)*t))
    return (alph + bet) / time_factor

In [None]:
habit_factor = habituate_time_lambda_factor(50, alpha, beta)
average_t50_habit = average_nkc_above_thresh(tau0, k, lambd*habit_factor, nKC)
print(average_t50_habit)

In [None]:
## Plotting the average as a function of tau for different habituation times
tau0_range = np.arange(10, 30, 0.1)
all_nkc_curves = {}
for t in [0, 50, 1000]:
    habit_factor = habituate_time_lambda_factor(t, alpha, beta)
    average_t_habit = average_nkc_above_thresh(tau0_range, k, lambd*habit_factor, nKC)
    all_nkc_curves[t] = average_t_habit

In [None]:
fig, ax = plt.subplots()
for t in all_nkc_curves.keys():
    ax.plot(tau0_range, all_nkc_curves[t], label=r"time = ${}$".format(t))
ax.set(xlabel=r"Threshold $\tau_0$", ylabel="Average number of KC above threshold")
ax.axhline(0.05*nKC, ls="--", color="k")
ax.set_yscale("log")
ax.legend(title="Habituation time")
fig.savefig("figures/kc_threshold_scaling_pure_habituation.png", transparent=True)
plt.show()
plt.close()

We see that we get, on average, a sparse tag (that is, with fewer than 100 active KCs) if we habituate long enough and take $\tau_0=20$. 

# Mixture with habituation to one constant pure background before

After habituation to background odor $S^B$, we present the mixture $f S^A + (1-f) S^B$, with $f = 0.2$. The habituation weights are, at that time, $W_i(t) = \frac{\alpha}{\alpha + \beta} S_i^B ( 1 - e^{-(\alpha + \beta)t})$. Thus, 

$$ Y_j = \sum_{i=1}^k \left[ (1-f) S_i^B - W_i(t) + f S_i^A \right] $$

We are now combining two gamma distributions: one from the background, one from the new odor. 

$$ Y_j = \sum_{i=1}^k S_i^B \left(1 - f - \frac{\alpha}{\alpha + \beta} (1 - e^{-(\alpha + \beta)t}) \right) + \sum_{i=1}^k f S_i^A $$

We suppose that, due to habituation, almost all of the contribution of odor B is removed and only $f S^A$ is left in the PN layer. This is a good approximation for long enough times, because the residual coefficient of $S_i^B$ tends to $1 - f - \frac{\alpha}{\alpha + \beta} = 1 - 1/5 - 5/6 = -1/30 \approx -0.033$, which is 6 times smaller in magnitude than the coefficient $f = 0.2$ of $S_i^A$, especially in the ORNs dominated by A. 

Therefore, the distribution of each KC, $Y_j = \sum_{i=1}^k f S_i^A $, is still a Gamma distribution with shape $k$, now with rate $\lambda / f$ (each term $f S_i^A$ has mean $f/\lambda$). At time $t=t^*$, defined by $f + \alpha (1 - e^{-(\alpha + \beta)t^*})/ (\alpha + \beta) = 1$, the background contribution cancels and we have exactly

$$\mathbb{E}[M(t = t^*)] = N_{KC} Q(k, \lambda \tau_0 / f) $$

This remains approximately true at times slightly before or after $t^*$ (because $f \approx \beta / (\alpha + \beta)$). 
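
The cancellation time $t^*$ has a closed form, obtained by solving its defining equation for the exponential: $t^* = -\ln\left(1 - (1-f)(\alpha+\beta)/\alpha\right) / (\alpha + \beta)$. The short sketch below evaluates it for the parameters used in this notebook; the helper name `residual_background_coeff` is illustrative.

```python
import numpy as np

alpha, beta, f = 0.05, 0.01, 0.2

def residual_background_coeff(t):
    # Coefficient multiplying S_i^B in Y_j at habituation time t
    return 1 - f - alpha / (alpha + beta) * (1 - np.exp(-(alpha + beta) * t))

# Solving residual_background_coeff(t*) = 0 for t*. A finite solution
# exists only if f > beta / (alpha + beta), which holds for these values.
t_star = -np.log(1 - (1 - f) * (alpha + beta) / alpha) / (alpha + beta)

print("t* =", t_star)
print("residual coefficient at t*:", residual_background_coeff(t_star))
print("residual coefficient at long times:", 1 - f - alpha / (alpha + beta))
```

For times well beyond $t^*$, the residual coefficient settles at the small negative value $1 - f - \alpha/(\alpha+\beta)$ instead of returning to zero.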

An exact expression could in principle be found (as a power series) for a sum or difference of two Gamma variables with different scales, corresponding to the contributions of odorants A and B, using the results of <https://link.springer.com/article/10.1007/BF02481123> (or <https://link.springer.com/article/10.1007%2FBF02481056>), but the extra sophistication is not worth it yet. 
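
Short of an exact expression, the effect of the residual background term can be estimated by sampling. The sketch below is an illustration under stated assumptions, not the analysis used here. It takes the long-time limit of the background coefficient, applies the rectification $X_i = \max(\cdot, 0)$ to each PN, and compares the resulting fraction of samples above threshold with the A-only approximation. The sample size and seed are arbitrary.

```python
import numpy as np
from scipy import special

rng = np.random.default_rng(2)  # arbitrary seed
k, lam, tau0 = 6, 0.1, 20.0
alpha, beta, f = 0.05, 0.01, 0.2
n_samples = 200_000

s_a = rng.exponential(scale=1.0 / lam, size=(n_samples, k))
s_b = rng.exponential(scale=1.0 / lam, size=(n_samples, k))

# Long-time limit of the background coefficient: 1 - f - alpha / (alpha + beta)
coeff_b = 1 - f - alpha / (alpha + beta)
# Rectified PN activities, as in X_i = max(S_i - W_i, 0)
x = np.maximum(f * s_a + coeff_b * s_b, 0.0)

frac_with_residual = np.mean(x.sum(axis=1) > tau0)
frac_a_only = np.mean((f * s_a).sum(axis=1) > tau0)
q_approx = special.gammaincc(k, lam * tau0 / f)
print(frac_with_residual, frac_a_only, q_approx)
```

Because the residual coefficient is negative, the first fraction cannot exceed the second, so the A-only formula acts as an upper bound at long times.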

In [None]:
f = 0.2
## Computing the average as a function of tau, assuming B is perfectly compensated
tau0_range = np.arange(10, 30, 0.1)
habit_mix_nkc_curve = average_nkc_above_thresh(tau0_range, k, lambd/f, nKC)

In [None]:
fig, ax = plt.subplots()
ax.plot(tau0_range, habit_mix_nkc_curve, label="Assuming B is perfectly compensated")
ax.set(xlabel=r"Threshold $\tau_0$", ylabel="Average number of KC above threshold")
ax.set_title("After habituation to B, A presented at 20 %")
ax.legend()
ax.axhline(0.05*nKC, ls="--", color="k")
ax.set_yscale("log")

fig.savefig("figures/kc_threshold_scaling_mixture.png", transparent=True)
plt.show()
plt.close()

# Remark
We see that around $\tau_0 = 20$, we reach a regime where, on average, fewer than 100 KCs will be firing. The KC tag computed for the odor will start to empty itself dramatically. This lowers the accuracy of the odorant discrimination. It explains why, as we decrease $\tau_0$ below 20, we see a sharp increase in accuracy after habituation to a constant string. The increase reaches 100 % if the habituation time is set to $t^*$ so that the contribution of the background is exactly canceled, and the new odor, even at 20 %, can give at least 100 KCs above threshold. 
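
The threshold at which the expected tag size falls to 100 can be located numerically. The sketch below assumes the $t = t^*$ expression $\mathbb{E}[M] = N_{KC} Q(k, \lambda \tau_0 / f)$ and uses `scipy.optimize.brentq`. The helper name and the bracket, taken from the plotted range of $\tau_0$, are illustrative choices.

```python
from scipy import optimize, special

n_kc, k, lam, f = 2000, 6, 0.1, 0.2

def expected_active_kc_mixture(tau):
    # E[M] at t = t*, when only f * S^A remains in the PN layer
    return n_kc * special.gammaincc(k, lam * tau / f)

# The bracket [10, 30] matches the tau_0 range plotted above and is assumed
# to contain the crossing for these parameter values
tau_cross = optimize.brentq(
    lambda tau: expected_active_kc_mixture(tau) - 100.0, 10.0, 30.0
)
print("E[M] = 100 at tau_0 =", tau_cross)
```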