<a href="https://colab.research.google.com/github/emgoss/PLP6621C/blob/main/Module1_R.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 1: Population simulations

The goal of this exercise is to build intuition for population genetic processes. The R package “learnPopGen” simulates a Wright-Fisher population over time. The main parameters that we will vary are the initial allele frequency, the population size, and the number of generations for which the simulation is run. We can also run multiple replicate populations at the same time. After running simulations with no selection, we will compare these outcomes to what we see when there is natural selection, by changing the relative fitness of the genotypes.

Run the commands (press the "play" icons in the code boxes) to obtain plots. Later in the exercise, I ask you to edit the commands yourself.

The first command installs the package and will take a while to run.


In [None]:
install.packages("learnPopGen")

In [None]:
library(learnPopGen) # This tells R to load the package now that it is installed.

Now we will run a simulation of a population with 100 individuals (Ne=100). These are diploids, so there are 200 copies of the gene or chromosome we are simulating. In this simulation we will track our allele of interest, allele A. Here, allele A starts at a frequency of 0.5 (p0=0.5), meaning it is present on half of the chromosomes in this population, and the other half carry the alternative allele (a). Both alleles (A, a) and all three genotypes (AA, Aa, aa) have equal fitness (w), meaning they contribute equally to the next generation on average. We will run the simulation for 400 generations (ngen) to see what happens to the frequency of allele A [f(A)].

In [None]:
drift.selection(p0=0.5, Ne=100, w=c(1,1,1), ngen=400, nrep=1)

The resulting plot will vary each time you run it. The population size stays the same though. Imagine that each generation there are many randomly combining gametes but only 100 resulting offspring.
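If you would like to see the sampling idea written out, here is a minimal base-R sketch of drift without selection. It is not the learnPopGen code: it assumes that each generation the 2 × Ne gene copies are drawn binomially from the current allele frequency, and `simulate_drift` is a hypothetical helper name used only for illustration.

```r
# Minimal Wright-Fisher drift sketch (base R only; NOT the learnPopGen implementation).
# simulate_drift is a hypothetical helper name.
simulate_drift <- function(p0 = 0.5, Ne = 100, ngen = 400) {
  p <- numeric(ngen + 1)
  p[1] <- p0
  for (t in seq_len(ngen)) {
    # Draw 2 * Ne gene copies for the next generation;
    # each copy is allele A with probability p[t].
    copies_A <- rbinom(1, size = 2 * Ne, prob = p[t])
    p[t + 1] <- copies_A / (2 * Ne)
  }
  p
}

set.seed(1)  # arbitrary seed so that this run is repeatable
freq <- simulate_drift(p0 = 0.5, Ne = 100, ngen = 400)
plot(0:400, freq, type = "l", ylim = c(0, 1),
     xlab = "Generation", ylab = "f(A)")
```

Remove the `set.seed()` line to get a different trajectory on each run, as with `drift.selection`.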

You can compare multiple realizations of the simulation by increasing the number of replicates to display at one time (nrep). Now we will compare the outcome in 5 different populations.

In [None]:
drift.selection(p0=0.5, Ne=100, w=c(1,1,1), ngen=400, nrep=5)

When f(A)=1, allele A has reached fixation in the population, meaning all individuals carry only the A allele. When f(A)=0, the A allele is lost from the population and all individuals carry only the a allele. You may see that sometimes the A allele fixes and sometimes it is lost.



---



Now we will vary the population size (Ne) and see what happens. Run at least 20 simulations, using different population sizes. You might try Ne of 10, 25, 100, and larger. If you are having trouble seeing the dynamics, run only one or two populations at a time (nrep); otherwise run five at a time, or more if you’d like. You may need to increase or decrease the number of generations simulated (ngen) to capture the dynamics.

In [None]:
drift.selection(p0=0.5, Ne=10, w=c(1,1,1), ngen=100, nrep=5)

In [None]:
drift.selection(p0=0.5, Ne=25, w=c(1,1,1), ngen=200, nrep=5)

In [None]:
drift.selection(p0=0.5, Ne=500, w=c(1,1,1), ngen=1000, nrep=5)
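If you prefer numbers to go with the plots, the following base-R sketch records how many generations pass before allele A is either fixed or lost. It assumes the same simple binomial sampling of 2 × Ne gene copies with no selection; `time_to_absorption` and the `max_gen` cutoff are hypothetical names chosen for this illustration and are not part of learnPopGen.

```r
# Generations until fixation or loss under neutral drift (base R sketch).
# time_to_absorption and max_gen are hypothetical, chosen for illustration.
time_to_absorption <- function(p0, Ne, max_gen = 10000) {
  p <- p0
  for (t in seq_len(max_gen)) {
    p <- rbinom(1, size = 2 * Ne, prob = p) / (2 * Ne)
    if (p == 0 || p == 1) return(t)  # allele A was lost or fixed
  }
  NA_integer_  # still segregating after max_gen generations
}

set.seed(7)  # arbitrary seed so that this run is repeatable
sizes <- c(10, 25, 100)
times <- sapply(sizes, function(Ne) {
  # Average over 50 replicate populations for each population size
  mean(replicate(50, time_to_absorption(p0 = 0.5, Ne = Ne)), na.rm = TRUE)
})
data.frame(Ne = sizes, mean_generations = times)
```

Edit `sizes` to match the population sizes you tried with `drift.selection` and compare the table with what you saw in the plots.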

**Question 1:** What are your observations about the allele frequencies over time at different population sizes? Comment on how long it takes for fixation of A, f(A)=1, or loss of A, f(A)=0.




***Your answer here (double click to type):***






---



I changed the population size and set the initial allele frequency to 1/(2N) (refer to the table below), which is the frequency of a new mutation in a diploid population (i.e., 1 copy of the mutation in 2N chromosomes).

| N   | Initial frequency |
|-----|-------------------|
| 4   | 0.125             |
| 20  | 0.025             |
| 50  | 0.01              |
| 100 | 0.005             |
| 500 | 0.001             |

Observe the dynamics for each population size. (Note that I changed the number of reps to 20 below.)
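The initial frequencies in the table come straight from 1/(2N). A short sketch of that arithmetic in R, in case you want to add other population sizes (the variable names are my own, for illustration only):

```r
# Frequency of a single new mutation in a diploid population: 1 copy in 2N chromosomes.
N <- c(4, 20, 50, 100, 500)
p0 <- 1 / (2 * N)
data.frame(N = N, initial_frequency = p0)
```

Use the matching pair of values for `Ne` and `p0` when you edit the `drift.selection` commands.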


In [None]:
drift.selection(p0=0.025, Ne=20, w=c(1,1,1), ngen=200, nrep=20) # p0 = 1/(2*20)

In [None]:
drift.selection(p0=0.01, Ne=50, w=c(1,1,1), ngen=500, nrep=20)

In [None]:
drift.selection(p0=0.005, Ne=100, w=c(1,1,1), ngen=1000, nrep=20)

In [None]:
drift.selection(p0=0.001, Ne=500, w=c(1,1,1), ngen=1000, nrep=20)

**Question 2:** What do you notice about the effect of population size on the dynamics of the new mutation?

***Your answer here:***



---



Now that you have some idea of the effect of drift on a new mutation, how does higher or lower fitness of this mutation affect its frequency in the population? Here, the new mutation is allele A, so it comes into the population as Aa (heterozygous), and mating between Aa and Aa creates homozygous AA genotypes. If genotype AA has slightly higher fitness than genotype aa and the alleles are codominant (the fitness of the heterozygote is in between those of AA and aa), then we can change the fitnesses, given in the order AA, Aa, aa, to w=c(1.2, 1.1, 1.0). Play with the fitness values and see how they affect the fixation or loss of the new mutation. I gave one example below, but please try other options by editing the command.

In [None]:
drift.selection(p0=0.025, Ne=20, w=c(1.2,1.1,1), ngen=100, nrep=50) # p0 = 1/(2*20), one new copy
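For comparison, it can help to see what selection alone would do, with no drift at all. The sketch below iterates the standard single-locus recursion p' = (p² w_AA + p q w_Aa) / w̄, where w̄ is the mean fitness. It assumes an infinitely large, randomly mating population, so it is a reference curve and not what `drift.selection` computes; `next_p` is a hypothetical helper name.

```r
# Expected allele frequency under selection only (no drift); base R sketch.
# next_p is a hypothetical helper; w is ordered AA, Aa, aa.
next_p <- function(p, w) {
  q <- 1 - p
  w_bar <- p^2 * w[1] + 2 * p * q * w[2] + q^2 * w[3]  # mean fitness
  (p^2 * w[1] + p * q * w[2]) / w_bar                 # frequency of A after selection
}

w <- c(1.2, 1.1, 1.0)  # codominant example: AA > Aa > aa
p <- numeric(101)
p[1] <- 0.025          # one new copy among 2N = 40 chromosomes (N = 20)
for (t in 1:100) p[t + 1] <- next_p(p[t], w)

plot(0:100, p, type = "l", ylim = c(0, 1),
     xlab = "Generation", ylab = "Expected f(A) without drift")
```

Comparing this smooth curve with the replicate trajectories from `drift.selection` shows how much of what you see in a small population is due to chance.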

In the next example, the heterozygote has the same fitness as AA, meaning that A is dominant.

In [None]:
drift.selection(p0=0.005, Ne=100, w=c(1.2,1.2,1), ngen=100, nrep=20)

What if there is overdominance, meaning that the heterozygote has higher fitness than either homozygote?

In [None]:
drift.selection(p0=0.005, Ne=100, w=c(1,1.2,1), ngen=100, nrep=20)
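With heterozygote advantage, the selection-only recursion has an interior equilibrium at p̂ = (w_Aa − w_aa) / (2 w_Aa − w_AA − w_aa). The sketch below computes that value for the fitnesses used above and checks it by iterating the recursion. As before, this assumes an infinite, randomly mating population with no drift, and `next_p` and `overdominant_equilibrium` are hypothetical helper names, not learnPopGen functions.

```r
# Selection-only equilibrium under heterozygote advantage (base R sketch, no drift).
# next_p and overdominant_equilibrium are hypothetical helpers; w is ordered AA, Aa, aa.
next_p <- function(p, w) {
  q <- 1 - p
  w_bar <- p^2 * w[1] + 2 * p * q * w[2] + q^2 * w[3]  # mean fitness
  (p^2 * w[1] + p * q * w[2]) / w_bar
}

overdominant_equilibrium <- function(w) {
  (w[2] - w[3]) / (2 * w[2] - w[1] - w[3])
}

w <- c(1, 1.2, 1)                      # Aa fitter than both homozygotes
p_hat <- overdominant_equilibrium(w)   # predicted equilibrium frequency of A

p <- 0.005                             # new mutation, 1/(2N) with N = 100
for (t in 1:100) p <- next_p(p, w)     # iterate 100 generations of selection

c(predicted = p_hat, after_100_generations = p)
```

In a finite population the replicate trajectories wander around this value, and a new mutation can still be lost by chance in the first few generations before selection takes hold.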

**Question 3:** Discuss how selection on a new mutation interacts with population size. What are the consequences of selection for A when:

1.   A is a codominant allele (fitness of AA>Aa>aa)?
2.   A is the dominant allele (AA=Aa>aa)?
3.   there is heterozygote advantage (Aa > AA and aa)?



***Your answer:***



---

