### Master equation

Let the states of cells be indexed by $i$ (and $j$ as needed) and the states of the environment be indexed by $k$ (and $l$ as needed). Let there be $n_i$ cells in state $i$. There is no need to subscript environmental states, so $k$ (or $l$) alone denotes the state of the environment. Let $\beta_{ik}$ be the growth rate of cells in state $i$ in environment $k$, and let $\gamma_{ik}$ be the death rate of cells in state $i$ in environment $k$. Let $P(n_1, n_2, \ldots, k;t)$ be the probability of having $n_1$ cells in state 1, $n_2$ cells in state 2, etc., with the environment in state $k$ at time $t$. We adopt the convention that if any of the $n_i$ values is negative, the probability of that state is zero. For ease of notation, let

\begin{align}
&P = P(n_1, n_2, \ldots, k;t),\\[1em]
&P_i = P(\ldots, n_i-1, \ldots, k;t),\\[1em]
&P^i = P(\ldots, n_i+1, \ldots, k;t),\\[1em]
&P_i^j = P(\ldots, n_i-1, \ldots, n_j+1,\ldots, k;t).
\end{align}

Let $s^k_{ji}$ be the rate at which a cell switches from state $i$ to state $j$ in environment $k$. Let $\sigma_{lk}$ be the rate at which the environment switches from state $k$ to state $l$. We understand that $s^k_{ii}$ and $\sigma_{kk}$ are both zero; that is, self-transitions are not allowed. Then, the master equation is

\begin{align}
\frac{\mathrm{d}P}{\mathrm{d}t} &= \sum_i \beta_{ik} ((n_i - 1)P_i - n_i P) \\[1em]
&+\sum_i \gamma_{ik}\left((n_i+1)P^i - n_i P\right) \\[1em]
&+\sum_i\sum_{j>i} s^k_{ji}\left((n_i+1)P^i_j - n_i P\right) \\[1em]
&+\sum_i\sum_{j>i} s^k_{ij}\left((n_j+1)P^j_i - n_j P\right) \\[1em]
&+\sum_l (\sigma_{kl}P(n_1,\ldots, l;t) - \sigma_{lk}P).
\end{align}

Consider the following special case.

- There are two cell states, 1 and 2.
- There are two environmental states, a and b.
- There are two growth rates; $\beta_{1a} = \beta_{2b} \equiv \beta_\mathrm{fast}$ and $\beta_{1b} = \beta_{2a} \equiv \beta_\mathrm{slow}$.
- The switching rate of the environment is independent of the direction of the switch, such that $\sigma_{ab} = \sigma_{ba} \equiv \sigma$.
- The switching rate of the cells is independent of the direction of the switch and of the state of the environment, such that $s_{12}^a = s_{12}^b = s_{21}^a = s_{21}^b \equiv s$.
- The cell death rate is independent of cell and environmental state such that $\gamma_{1a} = \gamma_{1b} = \gamma_{2a} = \gamma_{2b} \equiv \gamma$.

For this special case, the master equation is

\begin{align}
\frac{\mathrm{d}P(n_1, n_2, k;t)}{\mathrm{d}t} &= \beta_{1k}\left[(n_1-1)P(n_1-1, n_2, k;t) - n_1 P(n_1, n_2, k;t)\right] \\[1em]
&+ \beta_{2k}\left[(n_2-1)P(n_1, n_2-1, k;t) - n_2 P(n_1, n_2, k;t)\right] \\[1em]
&+ \gamma\left[(n_1+1)P(n_1+1, n_2, k;t) + (n_2+1) P(n_1, n_2+1, k;t) - (n_1+n_2)P(n_1, n_2, k;t)\right] \\[1em]
&+ s\left[(n_1 + 1)P(n_1+1, n_2-1, k;t) + (n_2 + 1)P(n_1-1, n_2+1, k;t) - (n_1+n_2)P(n_1, n_2, k;t)\right]\\[1em]
&+ \sigma\left[P(n_1, n_2, l;t) - P(n_1, n_2, k;t) \right],
\end{align}

where $k$ and $l$ are either a or b with $k \ne l$. Note that for now it is convenient to continue to use $\beta$'s subscripted with cell state and environment; we will switch to $\beta_\mathrm{fast}$ and $\beta_\mathrm{slow}$ momentarily.
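The special-case master equation can be read as a list of seven reactions with propensities $\beta_{1k} n_1$, $\beta_{2k} n_2$, $\gamma n_1$, $\gamma n_2$, $s n_1$, $s n_2$, and $\sigma$. As a rough illustration, a minimal Gillespie-style sampler for it might look like the following sketch. The function name `gillespie`, the parameter values, and the `max_events` safety cap are assumptions made for this example, not part of the derivation.

```python
import random


def gillespie(n1, n2, env, beta_fast, beta_slow, gamma, s, sigma,
              t_max, rng, max_events=100_000):
    """Sample one trajectory of (t, n1, n2, env) up to time t_max.

    Cell state 1 grows fast in environment "a" and cell state 2 grows
    fast in environment "b", matching the special case in the text.
    """
    # (beta_1k, beta_2k) for each environment k.
    beta = {"a": (beta_fast, beta_slow), "b": (beta_slow, beta_fast)}
    t = 0.0
    trajectory = [(t, n1, n2, env)]
    for _ in range(max_events):
        b1, b2 = beta[env]
        # Propensities in the order: birth 1, birth 2, death 1, death 2,
        # switch 1 -> 2, switch 2 -> 1, environment flip.
        propensities = [b1 * n1, b2 * n2, gamma * n1, gamma * n2,
                        s * n1, s * n2, sigma]
        total = sum(propensities)
        if total <= 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # exponential waiting time
        if t > t_max:
            break
        # Pick the reaction with probability proportional to its propensity.
        event = rng.choices(range(7), weights=propensities)[0]
        if event == 0:
            n1 += 1
        elif event == 1:
            n2 += 1
        elif event == 2:
            n1 -= 1
        elif event == 3:
            n2 -= 1
        elif event == 4:
            n1, n2 = n1 - 1, n2 + 1
        elif event == 5:
            n1, n2 = n1 + 1, n2 - 1
        else:
            env = "b" if env == "a" else "a"
        trajectory.append((t, n1, n2, env))
    return trajectory


# Illustrative parameter values (assumed, not taken from the text).
rng = random.Random(0)
traj = gillespie(n1=10, n2=10, env="a", beta_fast=1.0, beta_slow=0.5,
                 gamma=0.6, s=0.1, sigma=0.5, t_max=5.0, rng=rng)
print(len(traj), traj[-1])
```

Averaging over many independent trajectories gives Monte Carlo estimates of the moments of $P(n_1, n_2, k;t)$.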

Besides sampling from this master equation with a Gillespie simulation, we can derive differential equations for various moments of the distribution. Toward that end, we define the probability of having $n_1$ cells in state 1 and $n_2$ cells in state 2, conditioned on the environment being in state $k$, as

\begin{align}
P(n_1, n_2 \mid k;t) = \frac{P(n_1, n_2, k;t)}{P(k;t)}.
\end{align}

Then, the left-hand side of the master equation is

\begin{align}
\frac{\mathrm{d}P(n_1, n_2, k;t)}{\mathrm{d}t} = P(k;t)\, \frac{\mathrm{d}P(n_1, n_2\mid k ;t)}{\mathrm{d}t} + P(n_1, n_2\mid k ;t)\,\frac{\mathrm{d}P(k;t)}{\mathrm{d}t}.
\end{align}

We define

\begin{align}
\langle \xi \rangle_k = \sum_{n_1 = 0}^\infty\sum_{n_2 = 0}^\infty \xi P(n_1, n_2 \mid k;t)
\end{align}

to be the time-dependent expectation of $\xi$ in environment $k$. Then, we can write

\begin{align}
\sum_{n_1=0}^\infty\sum_{n_2=0}^\infty\,n_1\,\frac{\mathrm{d}P(n_1, n_2, k;t)}{\mathrm{d}t} &= 
P(k;t)\, \frac{\mathrm{d}\langle n_1 \rangle_k}{\mathrm{d}t} + \langle n_1 \rangle_k\,\frac{\mathrm{d}P(k;t)}{\mathrm{d}t}\\[1em]
&= \left[\beta_{1k}\langle n_1\rangle_k - \gamma \langle n_1 \rangle_k
+ s\left(\langle n_2 \rangle_k - \langle n_1 \rangle_k\right)
- \sigma \langle n_1 \rangle_k\right]P(k;t)
+ \sigma \langle n_1 \rangle_l P(l;t).
\end{align}


We can compute a differential equation for $P(k;t)$, the probability that the environment is in state $k$.

\begin{align}
\frac{\mathrm{d}P(k;t)}{\mathrm{d}t} = \sum_{n_1=0}^\infty\sum_{n_2=0}^\infty\frac{\mathrm{d}P(n_1, n_2, k;t)}{\mathrm{d}t} = \sigma\left[P(l;t) - P(k;t)\right]
= \sigma\left(1 - 2P(k;t)\right).
\end{align}

This can be solved to give

\begin{align}
P(k;t) = \frac{1}{2} + \left(P(k;0) - \frac{1}{2}\right)\mathrm{e}^{-2\sigma t}.
\end{align}
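As a quick sanity check, the closed-form solution can be compared with a direct numerical integration of $\mathrm{d}P(k;t)/\mathrm{d}t = \sigma\left(1 - 2P(k;t)\right)$. The sketch below uses a hand-written fourth-order Runge-Kutta stepper; the function names, step count, and parameter values are illustrative assumptions.

```python
import math


def p_env_exact(p0, sigma, t):
    """Closed-form P(k;t) for initial value p0 = P(k;0)."""
    return 0.5 + (p0 - 0.5) * math.exp(-2.0 * sigma * t)


def p_env_rk4(p0, sigma, t_end, n_steps=1000):
    """Integrate dP/dt = sigma * (1 - 2 P) with classical RK4."""
    def rhs(p):
        return sigma * (1.0 - 2.0 * p)

    h = t_end / n_steps
    p = p0
    for _ in range(n_steps):
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * h * k1)
        k3 = rhs(p + 0.5 * h * k2)
        k4 = rhs(p + h * k3)
        p += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return p


# Illustrative values (assumed): start far from the steady state of 1/2.
p0, sigma, t_end = 0.9, 0.7, 3.0
difference = abs(p_env_rk4(p0, sigma, t_end) - p_env_exact(p0, sigma, t_end))
print(difference)
```

Both routes relax toward $1/2$ on the time scale $1/(2\sigma)$.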

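We assume that the environmental switching has already reached its steady state, such that $P(k;0) = 1/2$. Then $P(k;t) = P(l;t) = 1/2$ for all time and $\mathrm{d}P(k;t)/\mathrm{d}t = 0$, so the factors of $P(k;t)$ and $P(l;t)$ in the moment equation cancel. In this case, we have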

\begin{align}
\frac{\mathrm{d}\langle n_1 \rangle_k}{\mathrm{d}t} = \beta_{1k}\langle n_1\rangle_k - \gamma \langle n_1 \rangle_k
+ s\left(\langle n_2 \rangle_k - \langle n_1 \rangle_k\right)
- \sigma \langle n_1 \rangle_k
+ \sigma \langle n_1 \rangle_l.
\end{align}

Now using $\beta_\mathrm{fast} = \beta_{1a} = \beta_{2b}$ and $\beta_\mathrm{slow} = \beta_{1b} = \beta_{2a}$, we can explicitly write four ODEs.

\begin{align}
\frac{\mathrm{d}\langle n_1 \rangle_a}{\mathrm{d}t} &= \beta_\mathrm{fast}\langle n_1\rangle_a - \gamma \langle n_1 \rangle_a
+ s\left(\langle n_2 \rangle_a - \langle n_1 \rangle_a\right)
+ \sigma\left(\langle n_1 \rangle_b - \langle n_1 \rangle_a\right),\\[1em]
\frac{\mathrm{d}\langle n_1 \rangle_b}{\mathrm{d}t} &= \beta_\mathrm{slow}\langle n_1\rangle_b - \gamma \langle n_1 \rangle_b
+ s\left(\langle n_2 \rangle_b - \langle n_1 \rangle_b\right)
+ \sigma\left(\langle n_1 \rangle_a - \langle n_1 \rangle_b\right),\\[1em]
\frac{\mathrm{d}\langle n_2 \rangle_a}{\mathrm{d}t} &= \beta_\mathrm{slow}\langle n_2\rangle_a - \gamma \langle n_2 \rangle_a
+ s\left(\langle n_1 \rangle_a - \langle n_2 \rangle_a\right)
+ \sigma\left(\langle n_2 \rangle_b - \langle n_2 \rangle_a\right),\\[1em]
\frac{\mathrm{d}\langle n_2 \rangle_b}{\mathrm{d}t} &= \beta_\mathrm{fast}\langle n_2\rangle_b - \gamma \langle n_2 \rangle_b
+ s\left(\langle n_1 \rangle_b - \langle n_2 \rangle_b\right)
+ \sigma\left(\langle n_2 \rangle_a - \langle n_2 \rangle_b\right),
\end{align}

which can be written as

\begin{align}
\frac{\mathrm{d}}{\mathrm{d}t}\begin{pmatrix}
\langle n_1 \rangle_a\\
\langle n_1 \rangle_b\\
\langle n_2 \rangle_a\\
\langle n_2 \rangle_b
\end{pmatrix} = \mathsf{A}\cdot \begin{pmatrix}
\langle n_1 \rangle_a\\
\langle n_1 \rangle_b\\
\langle n_2 \rangle_a\\
\langle n_2 \rangle_b
\end{pmatrix}
\end{align}

with 

\begin{align}
\mathsf{A} = \begin{pmatrix}
\beta_\mathrm{fast} - \gamma - s - \sigma & \sigma & s & 0 \\
\sigma & \beta_\mathrm{slow} - \gamma - s - \sigma & 0 & s \\
s & 0 & \beta_\mathrm{slow} - \gamma - s - \sigma & \sigma \\
0 & s & \sigma & \beta_\mathrm{fast} - \gamma - s - \sigma
\end{pmatrix}
\end{align}
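To see how the conditional means evolve, the linear system can be integrated numerically. The following is a minimal sketch using only the Python standard library; the helper names (`build_A`, `rk4_step`, `integrate`), the initial condition, and the parameter values are assumptions chosen for illustration.

```python
import math


def build_A(beta_fast, beta_slow, gamma, s, sigma):
    """Matrix A acting on (<n1>_a, <n1>_b, <n2>_a, <n2>_b)."""
    d_fast = beta_fast - gamma - s - sigma
    d_slow = beta_slow - gamma - s - sigma
    return [
        [d_fast, sigma, s, 0.0],
        [sigma, d_slow, 0.0, s],
        [s, 0.0, d_slow, sigma],
        [0.0, s, sigma, d_fast],
    ]


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rk4_step(A, x, h):
    """One classical Runge-Kutta step for dx/dt = A x."""
    k1 = matvec(A, x)
    k2 = matvec(A, [xi + 0.5 * h * k for xi, k in zip(x, k1)])
    k3 = matvec(A, [xi + 0.5 * h * k for xi, k in zip(x, k2)])
    k4 = matvec(A, [xi + h * k for xi, k in zip(x, k3)])
    return [xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def integrate(A, x0, t_end, n_steps):
    h = t_end / n_steps
    x = list(x0)
    for _ in range(n_steps):
        x = rk4_step(A, x, h)
    return x


# Illustrative parameter values and initial means (assumed).
A = build_A(beta_fast=1.0, beta_slow=0.5, gamma=0.6, s=0.1, sigma=0.5)
x0 = [10.0, 10.0, 10.0, 10.0]
x_10 = integrate(A, x0, t_end=10.0, n_steps=2000)
x_11 = integrate(A, x0, t_end=11.0, n_steps=2200)
# Per-component exponential rate over the unit time interval [10, 11].
rates = [math.log(b / a) for a, b in zip(x_10, x_11)]
print(rates)
```

Because the system is linear, all four components grow or decay at a common exponential rate at long times, which is what `rates` estimates.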

The four eigenvalues of this linear system (real, because $\mathsf{A}$ is symmetric) are

\begin{align}
\lambda = \frac{1}{2}\left[\beta_\mathrm{fast} + \beta_\mathrm{slow} - 2(\gamma + s + \sigma) \pm \sqrt{(\beta_\mathrm{fast} - \beta_\mathrm{slow})^2 + 4(s \pm \sigma)^2}\right],
\end{align}

where the two $\pm$ signs are chosen independently. Therefore, the largest eigenvalue, which sets the long-time growth rate (the population grows provided it is positive), is

\begin{align}
\lambda = \frac{1}{2}\left[\beta_\mathrm{fast} + \beta_\mathrm{slow} - 2(\gamma + s + \sigma) + \sqrt{(\beta_\mathrm{fast} - \beta_\mathrm{slow})^2 + 4(s+\sigma)^2}\right].
\end{align}

What switching rate gives optimal growth? We can investigate by differentiating.

\begin{align}
\frac{\mathrm{d}\lambda}{\mathrm{d}s} = -\left(1-\frac{2(s+\sigma)}{\sqrt{(\beta_\mathrm{fast}-\beta_\mathrm{slow})^2+4(s + \sigma)^2}}\right).
\end{align}

This derivative is negative for all nonnegative values of $s$. Therefore, the optimal growth rate occurs when the switching is very slow. (There must be some switching because otherwise the cells could all be in a slow state and never escape.)
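This conclusion can be cross-checked numerically without using the closed-form eigenvalue. The sketch below assembles $\mathsf{A}$ for a range of switching rates $s$ and estimates its largest eigenvalue by power iteration on a shifted matrix. The helper names, the parameter values, and the choice of power iteration (rather than a library eigenvalue solver) are assumptions made to keep the example self-contained.

```python
def build_A(beta_fast, beta_slow, gamma, s, sigma):
    """Matrix A acting on (<n1>_a, <n1>_b, <n2>_a, <n2>_b)."""
    d_fast = beta_fast - gamma - s - sigma
    d_slow = beta_slow - gamma - s - sigma
    return [
        [d_fast, sigma, s, 0.0],
        [sigma, d_slow, 0.0, s],
        [s, 0.0, d_slow, sigma],
        [0.0, s, sigma, d_fast],
    ]


def largest_eigenvalue(A, n_iter=2000):
    """Largest eigenvalue of a symmetric matrix via shifted power iteration."""
    # Gershgorin bound: every eigenvalue of A lies in [-c, c], so A + c*I
    # has only nonnegative eigenvalues and its dominant one is max(eig(A)) + c.
    c = max(sum(abs(a) for a in row) for row in A)
    v = [1.0] * len(A)
    for _ in range(n_iter):
        w = [sum(a * vj for a, vj in zip(row, v)) + c * vi
             for row, vi in zip(A, v)]
        scale = max(abs(wi) for wi in w)
        v = [wi / scale for wi in w]
    # The Rayleigh quotient of A itself removes the shift again.
    Av = [sum(a * vj for a, vj in zip(row, v)) for row in A]
    return sum(vi * avi for vi, avi in zip(v, Av)) / sum(vi * vi for vi in v)


# Illustrative parameter values (assumed); only s is varied.
s_values = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
growth_rates = [
    largest_eigenvalue(build_A(beta_fast=1.0, beta_slow=0.5,
                               gamma=0.6, s=s, sigma=0.5))
    for s in s_values
]
for s, lam in zip(s_values, growth_rates):
    print(f"s = {s:4.2f}  largest eigenvalue = {lam:.4f}")
```

For these illustrative values the estimated growth rate decreases as $s$ increases, in line with the sign of the derivative.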