# Dependence of Cross-Feeding on Niche Overlap

## Niche Overlap

Let's imagine a community with $N$ consumers and $M$ resources. The $N\times M$ matrix $u_{i\alpha}$ describes the preferences consumers have for resources. The vector $\vec{u}_{i}$ corresponds to the $i^{th}$ row of $u_{i\alpha}$ and represents the distribution of resource preferences for the $i^{th}$ consumer.

We defined niche overlap in the community as the average cosine similarity between each pair of distinct consumer preference vectors $\vec{u}_{i}$.

\begin{equation}
   \large N_{o} = \dfrac{1}{N(N-1)} \sum \limits_{i=1}^{N} \sum \limits_{j\neq i}^{N} \dfrac{\vec{u}_{i} \cdot \vec{u}_{j}}{\lvert \lvert \vec{u}_{i} \rvert \rvert \lvert \lvert \vec{u}_{j} \rvert \rvert}
\end{equation}

This gives us a measure of how similar consumer preferences are on average. Since the consumer preference vectors are real-valued and, being uptake preferences, non-negative, $N_{o}$ lies between zero and one and is a good scalar measure of preference overlap.
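As a minimal sketch of how $N_{o}$ could be computed, assume the preferences are stored as a NumPy array `u` of shape $(N, M)$ with no all-zero rows. The function name `niche_overlap` and the example matrices are illustrative and not part of the original analysis.

```python
import numpy as np

def niche_overlap(u):
    """Mean cosine similarity over all ordered pairs i != j of rows of u (shape N x M)."""
    u = np.asarray(u, dtype=float)
    n = u.shape[0]
    # Normalise each preference vector to unit length (assumes no zero rows)
    unit = u / np.linalg.norm(u, axis=1, keepdims=True)
    # cos[i, j] is the cosine similarity between consumers i and j
    cos = unit @ unit.T
    # Drop the i == j terms, then average over the N(N-1) ordered pairs
    return (cos.sum() - np.trace(cos)) / (n * (n - 1))

# Hypothetical communities with three consumers and three resources
u_specialists = np.eye(3)        # each consumer prefers a different resource
u_generalists = np.ones((3, 3))  # every consumer prefers all resources equally

print(niche_overlap(u_specialists))  # no overlap: 0.0
print(niche_overlap(u_generalists))  # complete overlap: 1.0 up to floating point
```

Averaging over ordered pairs $i \neq j$ counts each unordered pair twice, which leaves the mean unchanged because cosine similarity is symmetric.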

On the other hand, we may ask in which direction (in resource space) consumer preferences concentrate. This is easily accomplished by averaging the preference vectors.

\begin{equation}
    \large \vec{u}_{avg} = \dfrac{1}{N} \sum \limits_{i=1}^{N} \vec{u}_{i}
\end{equation}

Notice that $\vec{u}_{avg}$ tells us very little about how preferences vary and, indeed, nothing about how they overlap. Coupled with $N_{o}$, however, it gives a general notion of the shape defined by consumer preferences. While $\vec{u}_{avg}$ tells us the general direction of consumer preferences, $N_{o}$ measures how tightly the individual preference vectors cluster around that direction. When $N_{o} \approx 1$ we can surmise that consumer preferences are closely centered around $\vec{u}_{avg}$, whereas if $N_{o} \approx 0$ they are likely spread out.
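The point that $\vec{u}_{avg}$ alone says nothing about overlap can be illustrated with a small sketch. The two hypothetical communities below share the same average preference vector but sit at opposite extremes of $N_{o}$. The helper names and example matrices are assumptions made for illustration.

```python
import numpy as np

def mean_preference(u):
    """Average preference vector u_avg (length M) of an N x M preference matrix."""
    return np.asarray(u, dtype=float).mean(axis=0)

def niche_overlap(u):
    """Mean cosine similarity over all ordered pairs i != j of rows of u."""
    u = np.asarray(u, dtype=float)
    n = u.shape[0]
    unit = u / np.linalg.norm(u, axis=1, keepdims=True)  # assumes no zero rows
    cos = unit @ unit.T
    return (cos.sum() - np.trace(cos)) / (n * (n - 1))

# Hypothetical communities with two consumers and two resources
specialists = np.array([[1.0, 0.0], [0.0, 1.0]])  # each consumer uses one resource
generalists = np.array([[0.5, 0.5], [0.5, 0.5]])  # both consumers use both resources

for name, u in [("specialists", specialists), ("generalists", generalists)]:
    # Same u_avg in both cases, very different niche overlap
    print(name, mean_preference(u), niche_overlap(u))
```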

## Effective Leakage and Quantifying Cross-feeding

To get an idea of how a given consumer is expected to leak resources we defined an effective leakage measure, $\vec{L}_{i}^{eff}$, which is simply the sum of the resource-specific leakage vectors, weighted by the consumer's uptake capacity for the corresponding resources.

\begin{equation*}
    \large \vec{L}_{i}^{eff} = \sum \limits_{\alpha = 1}^{M} u_{i \alpha} \vec{l}_{i \alpha}
\end{equation*}

Here $\vec{l}_{i \alpha} = (l_{i\alpha 1}, l_{i \alpha 2}, \dots, l_{i \alpha M})$ is the $i^{th}$ consumer's leakage vector for resource $R_{\alpha}$. $\vec{L}_{i}^{eff}$ essentially measures the total distribution of metabolites leaked by a consumer in a scenario where every resource is available at equal concentrations.
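A minimal sketch of this weighted sum follows, assuming the leakage vectors are stored as a NumPy array `l` of shape $(N, M, M)$ so that `l[i, alpha]` is $\vec{l}_{i\alpha}$. The storage layout, the function name `effective_leakage` and the example values are assumptions.

```python
import numpy as np

def effective_leakage(u, l):
    """Effective leakage L_eff[i, beta] = sum_alpha u[i, alpha] * l[i, alpha, beta].

    u: array of shape (N, M), uptake preferences
    l: array of shape (N, M, M); l[i, alpha] is consumer i's leakage vector for resource alpha
    """
    u = np.asarray(u, dtype=float)
    l = np.asarray(l, dtype=float)
    # Contract over the consumed resource alpha, separately for each consumer i
    return np.einsum("ia,iab->ib", u, l)

# Hypothetical example: one consumer, two resources
u = np.array([[2.0, 1.0]])
l = np.array([[[0.0, 1.0],    # consuming resource 0 leaks resource 1
               [1.0, 0.0]]])  # consuming resource 1 leaks resource 0
L_eff = effective_leakage(u, l)
print(L_eff)  # [[1. 2.]]
```

In this example the consumer takes up resource 0 twice as strongly as resource 1, so its effective leakage is weighted towards resource 1, the by-product of resource 0.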

Having defined effective leakage, we can get a measure of the extent to which resources leaked by consumers will contribute to the growth of the community. To do this we simply calculate the average pairwise cosine similarity between effective leakage and consumer uptake vectors. 

\begin{equation*}
    \large C_{feed} = \dfrac{1}{N(N-1)} \sum \limits_{i=1}^{N} \sum \limits_{j\neq i}^{N} \dfrac{\vec{L}_{i}^{eff} \cdot \vec{u}_{j}}{\lvert \lvert \vec{L}_{i}^{eff} \rvert \rvert \lvert \lvert \vec{u}_{j} \rvert \rvert}
\end{equation*}

Therefore $C_{feed}$ is an overall measure of similarity between consumer uptake preferences and the distribution of resources which are likely to be leaked by consumers. Notice that we can reframe this cross-feeding measure in terms of average effective leakage, $\vec{L}_{avg}^{eff}$.
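The cross-feeding measure can be sketched in the same way, as the mean cosine similarity between $\vec{L}_{i}^{eff}$ and $\vec{u}_{j}$ over all pairs $i \neq j$. The array layout (`u` of shape $(N, M)$, `l` of shape $(N, M, M)$), the function name and the two-consumer example are assumptions for illustration. The sketch also assumes that no uptake or effective leakage vector is identically zero.

```python
import numpy as np

def cross_feeding(u, l):
    """Mean cosine similarity between L_eff[i] and u[j] over all ordered pairs i != j."""
    u = np.asarray(u, dtype=float)
    l = np.asarray(l, dtype=float)
    n = u.shape[0]
    # Effective leakage of each consumer, weighted by its uptake preferences
    L_eff = np.einsum("ia,iab->ib", u, l)
    L_unit = L_eff / np.linalg.norm(L_eff, axis=1, keepdims=True)
    u_unit = u / np.linalg.norm(u, axis=1, keepdims=True)
    # cos[i, j] compares what consumer i leaks with what consumer j takes up
    cos = L_unit @ u_unit.T
    # Exclude self-pairs (i == j) and average over the remaining N(N-1) pairs
    return (cos.sum() - np.trace(cos)) / (n * (n - 1))

# Hypothetical example: two specialists that each leak the other's preferred resource
u = np.eye(2)
l = np.zeros((2, 2, 2))
l[0, 0] = [0.0, 1.0]  # consumer 0 eats resource 0 and leaks resource 1
l[1, 1] = [1.0, 0.0]  # consumer 1 eats resource 1 and leaks resource 0
print(cross_feeding(u, l))  # 1.0
```

Unlike niche overlap, the summand is not symmetric in $i$ and $j$, so both orderings of each pair contribute distinct terms.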

\begin{equation*}
    \large \vec{L}_{avg}^{eff} = \dfrac{1}{N} \sum \limits_{i=1}^{N} \vec{L}_{i}^{eff}
\end{equation*}

Hence we can consider $C_{feed}$ to be somewhat dependent on the similarity between $\vec{u}_{avg}$ and $\vec{L}_{avg}^{eff}$. More importantly, $\vec{L}_{avg}^{eff}$ covaries with $\vec{u}_{avg}$, since both depend on the consumer uptake vectors $\vec{u}_{i}$.

## Linking Niche Space to Cross-feeding

Parsing the exact covariance pattern between $\vec{u}_{avg}$ and $\vec{L}_{avg}^{eff}$ can be somewhat difficult, given the dimensionality of our system. However, we can use our definition of niche overlap $N_{o}$, together with some geometric arguments, to infer a few coarse relationships. First, we define a measure that helps build geometric intuition: $\overline{L}_{sim}^{eff}$, the average pairwise cosine similarity between effective leakage vectors.

\begin{equation*}
    \large \overline{L}_{sim}^{eff} = \dfrac{1}{N(N-1)} \sum \limits_{i=1}^{N} \sum \limits_{j\neq i}^{N} \dfrac{\vec{L}_{i}^{eff} \cdot \vec{L}_{j}^{eff}}{\lvert \lvert \vec{L}_{i}^{eff} \rvert \rvert \lvert \lvert \vec{L}_{j}^{eff} \rvert \rvert}
\end{equation*}

In practice $\overline{L}_{sim}^{eff}$ has very little biological significance. It simply measures how similar effective leakage is between consumers or, more geometrically, how tightly the effective leakage vectors cluster around $\vec{L}_{avg}^{eff}$. Intuitively, if $\overline{L}_{sim}^{eff} \ll 1$ the $\vec{L}_{i}^{eff}$ vary a great deal around $\vec{L}_{avg}^{eff}$; conversely, if $\overline{L}_{sim}^{eff}$ is $\mathcal{O}(1)$, the $\vec{L}_{i}^{eff}$ lie in the vicinity of $\vec{L}_{avg}^{eff}$.
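The geometric reading of $\overline{L}_{sim}^{eff}$ can be checked numerically with a small sketch that compares it against the mean cosine similarity between each $\vec{L}_{i}^{eff}$ and $\vec{L}_{avg}^{eff}$. The helper names and the two sets of effective leakage vectors are hypothetical; the rows of `tight` and `spread` stand in for precomputed $\vec{L}_{i}^{eff}$.

```python
import numpy as np

def mean_pairwise_cosine(v):
    """Mean cosine similarity over all ordered pairs i != j of rows of v."""
    v = np.asarray(v, dtype=float)
    n = v.shape[0]
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)  # assumes no zero rows
    cos = unit @ unit.T
    return (cos.sum() - np.trace(cos)) / (n * (n - 1))

def mean_alignment_with_average(v):
    """Mean cosine similarity between each row of v and the average row."""
    v = np.asarray(v, dtype=float)
    avg = v.mean(axis=0)
    cos = (v @ avg) / (np.linalg.norm(v, axis=1) * np.linalg.norm(avg))
    return cos.mean()

# Hypothetical effective leakage vectors (one row per consumer)
tight = np.array([[1.0, 0.10], [1.0, 0.05], [1.0, 0.00]])  # nearly parallel
spread = np.eye(3)                                         # mutually orthogonal

for name, L_eff in [("tight", tight), ("spread", spread)]:
    print(name, mean_pairwise_cosine(L_eff), mean_alignment_with_average(L_eff))
```

For the nearly parallel set both quantities are close to one, while for the orthogonal set the pairwise similarity vanishes and the individual vectors sit well away from their average.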

We can focus, for the time being, on uptake and leakage distributions which wrap neatly around their average and highlight two illustrative scenarios.


### Single Resource Bias

Let's assume, however unrealistically, that consumer preferences in a community are very strongly biased towards the consumption of one resource. Ignoring the feasibility of such a community, we can write

\begin{equation*}
    \large \vec{u}_{avg} = U \hat{e}_{k}
\end{equation*}

where $\hat{e}_{k}$ is the basis vector for an arbitrary resource $R_{k}$ and $U$ is a positive constant. 

Since we're focused on uptake and leakage distributions with negligible variance, $C_{feed}$ reduces to the cosine similarity between the two averages, $C_{feed} \approx \dfrac{\vec{u}_{avg} \cdot \vec{L}_{avg}^{eff}}{\lvert \lvert \vec{u}_{avg} \rvert \rvert \lvert \lvert \vec{L}_{avg}^{eff} \rvert \rvert}$. Thus $C_{feed} = 1$ if and only if $\vec{L}_{avg}^{eff}$ is parallel to $\vec{u}_{avg}$, and $C_{feed} = 0$ if $\vec{L}_{avg}^{eff}$ lies along any direction orthogonal to $\vec{u}_{avg}$. If $\vec{L}_{avg}^{eff}$ is a linear combination of several basis vectors with a non-zero $\hat{e}_{k}$ component, $C_{feed}$ takes intermediate values. This illustrates the scenario where $C_{feed}$ can span the full range between zero and one.
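To make the intermediate case explicit, write $L_{\alpha}$ for the components of $\vec{L}_{avg}^{eff}$ (notation introduced here only for illustration). Taking $C_{feed}$ as the cosine similarity between the two averages, which is what the negligible-variance assumption suggests, and substituting $\vec{u}_{avg} = U \hat{e}_{k}$ gives

\begin{equation*}
    \large C_{feed} \approx \dfrac{U \hat{e}_{k} \cdot \vec{L}_{avg}^{eff}}{U \lvert \lvert \vec{L}_{avg}^{eff} \rvert \rvert} = \dfrac{L_{k}}{\sqrt{L_{k}^{2} + \sum \limits_{\alpha \neq k}^{M} L_{\alpha}^{2}}}
\end{equation*}

For non-negative components this equals one when $L_{k}$ is the only non-zero component, zero when $L_{k} = 0$, and lies strictly in between otherwise.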

### Perfectly Balanced
Let us now turn to the wholly different case where consumer uptake preferences are evenly split between every resource. Hence:

\begin{equation*}
    \large \vec{u}_{avg} = U \sum \limits_{\alpha = 1}^{M} \hat{e}_{\alpha}
\end{equation*}

Again, $U$ is some arbitrary positive constant and $\hat{e}_{\alpha}$ are the resource basis vectors. Notice that there is now a minimum value $C_{feed}$ can take, which occurs when $\vec{L}_{avg}^{eff}$ is perfectly biased towards any single resource, $\vec{L}_{avg}^{eff} = L \hat{e}_{k}$. Doing the corresponding algebra leads us to an exact expression for this value:

\begin{equation*}
    \large C_{feed}^{min} = \dfrac{1}{\sqrt{M}}
\end{equation*}

This, of course, has no bearing on the maximum amount of cross-feeding which can be achieved in the community, which will always be unity. It is only the lower bound of $C_{feed}$ which relies on the distribution of uptake preferences.
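For reference, the algebra behind this bound can be sketched as follows, assuming as before that $C_{feed}$ is approximated by the cosine similarity between $\vec{u}_{avg}$ and $\vec{L}_{avg}^{eff}$, and that the components $L_{\alpha}$ of $\vec{L}_{avg}^{eff}$ (notation assumed here) are non-negative. For $\vec{L}_{avg}^{eff} = L \hat{e}_{k}$:

\begin{equation*}
    \large C_{feed}^{min} = \dfrac{\left( U \sum \limits_{\alpha = 1}^{M} \hat{e}_{\alpha} \right) \cdot L \hat{e}_{k}}{U \sqrt{M} \, L} = \dfrac{U L}{U L \sqrt{M}} = \dfrac{1}{\sqrt{M}}
\end{equation*}

That this is indeed the minimum follows from the general expression for a balanced $\vec{u}_{avg}$:

\begin{equation*}
    \large C_{feed} \approx \dfrac{\sum \limits_{\alpha = 1}^{M} L_{\alpha}}{\sqrt{M} \sqrt{\sum \limits_{\alpha = 1}^{M} L_{\alpha}^{2}}} \geq \dfrac{1}{\sqrt{M}}
\end{equation*}

since $\left( \sum_{\alpha} L_{\alpha} \right)^{2} \geq \sum_{\alpha} L_{\alpha}^{2}$ for non-negative components, with equality only when a single component is non-zero.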

### The General Case
If we generalize and consider an arbitrary distribution of uptake preferences which is neither perfectly biased towards one resource nor perfectly balanced, but somewhere in between these two extremes, we can notice that $C_{feed}^{min}$ will depend entirely on the smallest component of $\vec{u}_{avg}$. So let us choose $\vec{u}_{avg}$ such that one of its components, $u_{k}$, is smaller than all of the rest.

\begin{equation*}
    \large \vec{u}_{avg} = u_{k} \hat{e}_{k} + \sum \limits_{\alpha \neq k}^{M} u_{\alpha} \hat{e}_{\alpha}
\end{equation*}

If $C_{feed}^{min}$ corresponds to an average effective leakage perfectly aligned with $\hat{e}_{k}$, we have that:

\begin{equation*}
    \large C_{feed}^{min} = \dfrac{u_{k}}{\lvert \lvert \vec{u}_{avg} \rvert \rvert} = \dfrac{u_{k}}{\sqrt{u_{k}^{2} + \sum \limits_{\alpha \neq k}^{M} u_{\alpha}^{2}}}
\end{equation*}
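The expression above can be checked numerically with a short sketch. It evaluates the bound for a hypothetical $\vec{u}_{avg}$ and compares it against the cosine similarity with randomly drawn non-negative leakage directions. The function names, the example vector and the random sampling are all assumptions for illustration, and $C_{feed}$ is again approximated by the cosine similarity between the two averages.

```python
import numpy as np

def min_cross_feeding(u_avg):
    """Lower bound u_k / ||u_avg||, where u_k is the smallest component of u_avg."""
    u_avg = np.asarray(u_avg, dtype=float)
    return u_avg.min() / np.linalg.norm(u_avg)

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical average preference vector; resource 0 is the least preferred
u_avg = np.array([0.1, 0.5, 0.4])
bound = min_cross_feeding(u_avg)

# Random non-negative average leakage directions (illustrative sampling only)
rng = np.random.default_rng(0)
samples = rng.random((1000, u_avg.size))
cosines = np.array([cosine(u_avg, L) for L in samples])

print("bound:", bound)
print("smallest sampled value:", cosines.min())
# Leakage aligned with the least-preferred resource attains the bound
print("aligned with e_k:", cosine(u_avg, np.array([1.0, 0.0, 0.0])))
```

In the perfectly balanced case the same function recovers $1/\sqrt{M}$, for example `min_cross_feeding(np.ones(4))` gives $1/2$ up to floating point.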