# Lab 12: A stochastic model of evolution

In today's lab we'll see how we can simulate a stochastic model of evolution. We'll start with our deterministic model of haploid selection, where the frequency of the A allele in the next generation is

$\displaystyle{p(t+1) = \frac{p(t)W_A}{p(t)W_A + (1-p(t))W_a}}$
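As a quick refresher, the recursion above can be written as a small Python function. This is only a sketch: the function name `next_frequency`, the starting frequency, and the fitness values are illustrative assumptions, not values used elsewhere in this lab.

```python
def next_frequency(p, W_A, W_a):
    """Deterministic haploid selection: frequency of A after one generation."""
    mean_fitness = p * W_A + (1 - p) * W_a  # denominator of the recursion
    return p * W_A / mean_fitness

# Illustrative values only (assumed for this example)
p = 0.2
for t in range(5):
    p = next_frequency(p, W_A=2.0, W_a=1.0)
    print(t + 1, p)  # generation and frequency of A
```

With $W_A > W_a$ the printed frequency rises every generation, and it does so identically each time the code is run, since nothing here is random.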

Now imagine that $p(t+1)$ does not give the frequency in the next generation. Instead, imagine all $N$ individuals in the population producing a very large number of gametes, in proportion to their fitness, from which $N$ are chosen at random to make the next generation. Then $p(t+1)$ is the fraction of that large number of gametes with allele A. Sampling $N$ of these gametes, the probability we sample $X(t+1)=j$ with allele A given there were $X(t)=i$ with allele A in the previous generation (i.e., $p(t)=i/N$) is then given by a **binomial distribution**

$\displaystyle{p_{ji} = P(X(t+1)=j | X(t)=i) = {N \choose j}p(t+1)^j(1-p(t+1))^{N-j}}$

This is the haploid **Wright-Fisher model** (section 13.4 in the text). What this says is that the number of A alleles in the next generation is binomially distributed with $N$ trials and probability of success $p(t+1)$. This tells us how to simulate this process.
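To make the transition probability concrete, here is a minimal sketch that evaluates $p_{ji}$ directly from the binomial formula. The function name `transition_probability` and the small population size are assumptions chosen for illustration; the default fitnesses correspond to no selection.

```python
from math import comb

def transition_probability(j, i, N, W_A=1.0, W_a=1.0):
    """P(X(t+1)=j | X(t)=i) under the haploid Wright-Fisher model."""
    p = i / N  # current frequency of A
    p_next = p * W_A / (p * W_A + (1 - p) * W_a)  # frequency of A among gametes
    return comb(N, j) * p_next**j * (1 - p_next)**(N - j)

N = 10  # small population so the whole distribution is easy to inspect
i = 5   # current number of A alleles
probs = [transition_probability(j, i, N) for j in range(N + 1)]
print(probs)       # probability of each possible count j = 0, ..., N
print(sum(probs))  # should be very close to 1
```

Each entry of `probs` is the chance of ending up with exactly $j$ copies of A in the next generation; because these are the only possible outcomes, the probabilities sum to one.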

For example, if the population size is $N=100$ and the fraction of A gametes is $p(t+1)=0.1$, we can randomly draw the number of A alleles in the next generation:

In [None]:
from numpy.random import binomial #import a binomial sampler
binomial(100,0.1) #make a draw from the binomial distribution with N=100 trials and p=0.1 probability of success

One interesting case is when there is no selection, $W_A=W_a$. Our deterministic recursion then gives $p(t+1)=p(t)$: the allele frequency does not change, so there is no evolution. But what happens in a stochastic version of this model?

In [None]:
N = 100 #population size
tmax = 2*N #number of generations to simulate

t = 0 #starting time
p = 0.5 #starting frequency
tps = [[t,p]] #list of times and frequencies
while t < tmax:
    pnext = p #frequency of A among gametes
    p = binomial(N,pnext)/N #the number of A alleles in the next generation is binomially distributed with N trials and probability of success pnext; dividing by N gives the frequency
    t = t + 1 #update time
    tps.append([t,p])
    
list_plot(tps, plotjoined=True, ymin=0, ymax=1, axes_labels=['generation','p'])

This is radically different from the deterministic prediction: the frequency now changes substantially over time, and in many runs either the A or the a allele fixes. This change in allele frequency -- evolution -- caused by random sampling is called **genetic drift**.
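One way to see why sampling alone moves the frequency: after a single binomial draw the frequency has mean $p$ and variance $p(1-p)/N$, so the expectation is unchanged but each realization scatters around it. The sketch below checks this numerically. It assumes NumPy's `default_rng` generator (a different interface from the `binomial` import used above), and the seed and number of draws are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)  # seed chosen arbitrarily, for reproducibility
N = 100          # population size
p = 0.5          # current frequency of A (no selection, so this is also pnext)
draws = 100_000  # number of independent single-generation samples (illustrative)

# frequency of A after one generation of sampling, repeated many times
freqs = rng.binomial(N, p, size=draws) / N

print(freqs.mean())     # close to p
print(freqs.var())      # close to p*(1-p)/N
print(p * (1 - p) / N)  # theoretical variance of the frequency
```

Because the variance scales as $1/N$, smaller populations should show larger generation-to-generation jumps in frequency.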

**Q1** Simulate this process 10 times, with the same parameter values as above, and plot all the allele frequency trajectories together in one plot.

**Q2** In how many instances was the A allele lost? In how many instances did A fix (i.e., the a allele was lost)? In how many were A and a still segregating?

**Q3** Now imagine $W_A=1.1$ and $W_a=1$, so the A allele is beneficial. Start with the A allele at frequency $p(0)=0.5$. What is the expected frequency in the next generation from our deterministic recursion?

**Q4** Sample from a binomial distribution with $N=100$ trials and the probability of success given by your answer in Q3. This gives the number of A alleles in the next generation.

**Q5** Now what is the frequency of the A allele in this next generation?

**Q6** Write a while loop that updates the frequency over successive generations, simulating the Wright-Fisher model with selection. Hint: you should be able to do this by changing just `pnext` in the neutral example above.

**Q7** Run 10 replicates with the same parameters as the neutral simulations, and $W_A=1.1$, $W_a=1$. Plot the allele frequency trajectories of the 10 replicates on the same plot.

**Q8** Now how many times did the A allele fix?

If you have time, try varying the value of $W_A$ (or $N$) to modulate the relative strengths of natural selection and genetic drift.