# Selection on Multiple Alleles

This notebook uses a deterministic (infinite-population) model to explore how viability selection shapes allele frequencies when there are **three or more alleles** at a single locus.

Key ideas:
- Random mating → Hardy-Weinberg genotype frequencies each generation
- Viability selection acts on diploid genotypes (each genotype has a fitness)
- No drift (infinite pop), no mutation, no migration — **pure selection dynamics**
- Depending on the fitness landscape, alleles may be lost (**transient polymorphism**) or maintained at equilibrium (**balancing selection**)
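
The ideas above amount to a one-line recursion. With allele frequencies `p` and a symmetric fitness matrix `W` (entry `W[i, j]` is the fitness of genotype AiAj), the marginal fitness of allele i is `w_i = sum_j W[i, j] * p_j`, the mean fitness is `w_bar = sum_i p_i * w_i`, and the next generation's frequencies are `p_i' = p_i * w_i / w_bar`. The sketch below is a minimal NumPy version of that step. `next_generation` is an illustrative helper written for this note, and `popgen_sim` may implement the update differently.

```python
import numpy as np

def next_generation(p, W):
    """One generation of random mating followed by viability selection."""
    p = np.asarray(p, dtype=float)
    W = np.asarray(W, dtype=float)
    marginal = W @ p              # w_i: average fitness of allele i over its partners
    mean_fitness = p @ marginal   # w_bar: population mean fitness
    return p * marginal / mean_fitness, mean_fitness

# Fitness matrix using the Quick Reference values (rows/cols: A1, A2, A3)
W = np.array([
    [1.00, 0.90, 0.85],
    [0.90, 0.70, 0.60],
    [0.85, 0.60, 0.50],
])
p0 = np.array([0.33, 0.34, 0.33])

p1, w_bar = next_generation(p0, W)
print("after one generation:", p1.round(4), " mean fitness:", round(w_bar, 4))
```

Because the update divides by the mean fitness, the new frequencies always sum to 1, and only the ratios between fitness values matter.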

In [None]:
from popgen_sim import *
import matplotlib.pyplot as plt  # explicit import for the plt.show() calls below

print('Module loaded!')

---
## Quick Reference

```python
SelectionParams(
    n_alleles=3,                       # number of alleles
    freqs=[0.33, 0.34, 0.33],          # initial allele frequencies (must sum to 1)
    allele_labels=['A1', 'A2', 'A3'],  # optional custom names
    fitness={                           # diploid genotype fitnesses
        'A1A1': 1.0, 'A1A2': 0.9, 'A1A3': 0.85,
        'A2A2': 0.7, 'A2A3': 0.6,
        'A3A3': 0.5,
    },
    n_generations=200,
)
```
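
With n alleles there are n(n+1)/2 unordered genotypes, so the `fitness` dict grows quickly as alleles are added. The helper below lists the keys a table needs and reports any that are missing or unexpected. It assumes the key convention seen in this notebook's examples (labels concatenated, earlier allele first). `genotype_keys` and `check_fitness_table` are hypothetical helpers, not part of `popgen_sim`, and whether the library also accepts reversed keys such as `'A2A1'` is not stated here.

```python
from itertools import combinations_with_replacement

def genotype_keys(labels):
    """Unordered genotype names: every homozygote plus every heterozygote once."""
    return [a + b for a, b in combinations_with_replacement(labels, 2)]

def check_fitness_table(fitness, labels):
    """Return (missing, unexpected) genotype keys for a fitness dict."""
    expected = genotype_keys(labels)
    missing = [k for k in expected if k not in fitness]
    unexpected = [k for k in fitness if k not in expected]
    return missing, unexpected

labels = ['A1', 'A2', 'A3']
fitness = {
    'A1A1': 1.0, 'A1A2': 0.9, 'A1A3': 0.85,
    'A2A2': 0.7, 'A2A3': 0.6,
    'A3A3': 0.5,
}

print(genotype_keys(labels))
print(check_fitness_table(fitness, labels))  # two empty lists mean the table is complete
```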

**Functions:**
- `simulate_selection(params)` → `SelectionResult`
- `make_selection_player(result)` → interactive widget
- `plot_selection(result)` → 3-panel static figure (allele freqs, genotypes, w̄)
- `plot_selection_trajectory(result)` → ternary simplex plot (3 alleles only)

---
## 1. Directional Selection (Transient Polymorphism)

A1 is the most fit allele in all genotype contexts. A2 and A3 will be eliminated — but at different rates depending on their fitness in heterozygotes. Watch the allele frequencies converge to fixation of A1.
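
One way to see why A1 rises is to compare marginal fitnesses: an allele increases in frequency whenever its marginal fitness exceeds the population mean. The sketch below is a standalone NumPy check using the fitness values from the next cell. It does not call `popgen_sim`, and the variable names are illustrative. It prints the marginal fitnesses at the starting frequencies, iterates the selection recursion for 200 generations, and records mean fitness along the way.

```python
import numpy as np

# Rows/cols: A1, A2, A3 (same values as params_dir)
W_dir = np.array([
    [1.00, 0.95, 0.90],
    [0.95, 0.80, 0.70],
    [0.90, 0.70, 0.60],
])
p = np.array([0.33, 0.34, 0.33])

print("marginal fitnesses at the start:", (W_dir @ p).round(4))

mean_fitness = []
for _ in range(200):
    marginal = W_dir @ p          # marginal fitness of each allele
    w_bar = p @ marginal          # mean fitness this generation
    mean_fitness.append(w_bar)
    p = p * marginal / w_bar      # selection update

print("final frequencies:", p.round(4))
# Small tolerance guards against floating-point noise near fixation
print("mean fitness never decreases:", bool(np.all(np.diff(mean_fitness) >= -1e-12)))
```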

In [None]:
params_dir = SelectionParams(
    n_alleles=3,
    freqs=[0.33, 0.34, 0.33],
    fitness={
        'A1A1': 1.0,  'A1A2': 0.95, 'A1A3': 0.90,
                       'A2A2': 0.80, 'A2A3': 0.70,
                                      'A3A3': 0.60,
    },
    n_generations=200,
)

result_dir = simulate_selection(params_dir)
make_selection_player(result_dir)

In [None]:
# Static figure
plot_selection(result_dir)
plt.show()

In [None]:
# Ternary trajectory
plot_selection_trajectory(result_dir)
plt.show()

### Questions to explore
- What happens to mean fitness over time? (Hint: Fisher's Fundamental Theorem)
- Which allele is lost first, and why?
- Try changing initial frequencies — does the endpoint change?

---
## 2. Heterozygote Advantage (Balancing Selection)

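When **every heterozygote is fitter than every homozygote**, selection can maintain all three alleles at a stable equilibrium, as it does for the fitness values below. This generalizes overdominance to multiple alleles, although with three or more alleles heterozygote advantage alone does not guarantee a stable polymorphism.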
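
At an interior equilibrium every allele has the same marginal fitness, so the equilibrium frequencies satisfy `W @ p = w_bar * 1`. That is a linear system, and a candidate equilibrium can be computed directly without simulating. The sketch below does this with NumPy for the fitness values in the next cell. It is a standalone check with illustrative variable names, and it only finds a candidate: the result is biologically meaningful only if all frequencies are positive, and stability still has to be confirmed, for example by running the simulation from several starting points.

```python
import numpy as np

# Rows/cols: A1, A2, A3 (same values as params_bal)
W_bal = np.array([
    [0.7, 1.0, 1.0],
    [1.0, 0.6, 1.0],
    [1.0, 1.0, 0.5],
])

# Solve W @ x = 1, then rescale x so the frequencies sum to 1.
x = np.linalg.solve(W_bal, np.ones(3))
p_hat = x / x.sum()
w_bar_hat = p_hat @ W_bal @ p_hat

print("candidate equilibrium:", p_hat.round(4))
print("marginal fitnesses:   ", (W_bal @ p_hat).round(4))
print("mean fitness:         ", round(w_bar_hat, 4))
print("all frequencies > 0:  ", bool(np.all(p_hat > 0)))
```

Comparing `p_hat` with the final frequencies in the simulation is a quick sanity check on both.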

In [None]:
params_bal = SelectionParams(
    n_alleles=3,
    freqs=[0.8, 0.1, 0.1],  # start far from equilibrium
    fitness={
        'A1A1': 0.7,  'A1A2': 1.0,  'A1A3': 1.0,
                       'A2A2': 0.6,  'A2A3': 1.0,
                                      'A3A3': 0.5,
    },
    n_generations=500,
)

result_bal = simulate_selection(params_bal)
make_selection_player(result_bal)

In [None]:
plot_selection(result_bal)
plt.show()

In [None]:
plot_selection_trajectory(result_bal)
plt.show()

### Questions to explore
- Do the equilibrium frequencies depend on the starting frequencies? Try different initial values.
- What determines the equilibrium? (Hint: it depends on the *homozygote* fitnesses)
- What happens if you make the heterozygote fitnesses unequal (e.g., w(A1A2) = 1.0 but w(A2A3) = 0.9)?

---
## 3. Heterozygote Disadvantage (Unstable Equilibrium)

When homozygotes are more fit than heterozygotes (underdominance), polymorphism is **unstable**. The outcome depends on initial frequencies — one allele will fix, but which one?
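
The dependence on starting frequencies can be checked with a few lines of NumPy before running the full simulation. The sketch below iterates the selection recursion from the two starting points used in the cells that follow and reports which allele ends up most common. `iterate` is an illustrative helper, not a `popgen_sim` function.

```python
import numpy as np

# Rows/cols: A1, A2, A3 (same values as params_under)
W_under = np.array([
    [1.0, 0.6, 0.6],
    [0.6, 0.9, 0.5],
    [0.6, 0.5, 0.8],
])

def iterate(p0, W, n_generations=200):
    """Apply the selection recursion repeatedly and return the final frequencies."""
    p = np.asarray(p0, dtype=float)
    for _ in range(n_generations):
        marginal = W @ p
        p = p * marginal / (p @ marginal)
    return p

end_a = iterate([0.6, 0.2, 0.2], W_under)
end_b = iterate([0.2, 0.6, 0.2], W_under)

labels = ['A1', 'A2', 'A3']
print("start near A1 ->", labels[int(np.argmax(end_a))], end_a.round(4))
print("start near A2 ->", labels[int(np.argmax(end_b))], end_b.round(4))
```

Trying a grid of starting points with the same helper gives a rough picture of each allele's basin of attraction.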

In [None]:
# Starting near A1 — what happens?
params_under = SelectionParams(
    n_alleles=3,
    freqs=[0.6, 0.2, 0.2],
    fitness={
        'A1A1': 1.0,  'A1A2': 0.6,  'A1A3': 0.6,
                       'A2A2': 0.9,  'A2A3': 0.5,
                                      'A3A3': 0.8,
    },
    n_generations=200,
)

result_under = simulate_selection(params_under)
plot_selection(result_under)
plt.show()

In [None]:
# Now start near A2 — different outcome?
params_under2 = SelectionParams(
    n_alleles=3,
    freqs=[0.2, 0.6, 0.2],
    fitness={
        'A1A1': 1.0,  'A1A2': 0.6,  'A1A3': 0.6,
                       'A2A2': 0.9,  'A2A3': 0.5,
                                      'A3A3': 0.8,
    },
    n_generations=200,
)

result_under2 = simulate_selection(params_under2)
plot_selection(result_under2)
plt.show()

In [None]:
# Compare trajectories on ternary plot
fig, axes = plt.subplots(1, 2, figsize=(13, 6))

plt.sca(axes[0])
plot_selection_trajectory(result_under)
axes[0].set_title('Start: p(A1)=0.6')

plt.sca(axes[1])
plot_selection_trajectory(result_under2)
axes[1].set_title('Start: p(A2)=0.6')

plt.tight_layout()
plt.show()

---
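## 4. Sickle-Cell Style: Heterozygote Advantage with a Third Allele

A scenario inspired by sickle-cell hemoglobin, with alleles A (normal), S (sickle), and C (hemoglobin C). The AS heterozygote has the highest fitness (malaria resistance), the SS homozygote is strongly deleterious (sickle-cell disease), and C is a third variant whose genotypes (AC, SC, CC) have intermediate fitnesses. The fitness values below are illustrative, not empirical estimates.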
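
A useful way to reason about a third allele is an invasion check: first find the two-allele equilibrium with only A and S present, then ask whether a rare C allele would have a marginal fitness above the mean fitness there. The sketch below does that arithmetic in plain Python with the fitness values from the next cell. It is a back-of-the-envelope calculation, not a `popgen_sim` feature, and it says only whether C can increase when rare, not where the frequencies finally settle.

```python
# Genotype fitnesses (same values as params_sickle)
w = {'AA': 0.9, 'AS': 1.0, 'AC': 0.95, 'SS': 0.2, 'SC': 0.7, 'CC': 0.85}

# Two-allele overdominance equilibrium for A and S, with C absent
p_S = (w['AS'] - w['AA']) / (2 * w['AS'] - w['AA'] - w['SS'])
p_A = 1.0 - p_S

# Mean fitness of the resident A/S population at that equilibrium
w_bar = p_A**2 * w['AA'] + 2 * p_A * p_S * w['AS'] + p_S**2 * w['SS']

# Marginal fitness of a rare C allele, which pairs almost only with A or S
w_C = p_A * w['AC'] + p_S * w['SC']

c_can_invade = w_C > w_bar
print(f"A/S equilibrium: p_A = {p_A:.4f}, p_S = {p_S:.4f}")
print(f"resident mean fitness = {w_bar:.4f}, marginal fitness of rare C = {w_C:.4f}")
print("C can invade when rare:", c_can_invade)
```

Rerunning the check after editing the `AC`, `SC`, or `CC` entries shows how sensitive the invasion condition is to each value.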

In [None]:
params_sickle = SelectionParams(
    n_alleles=3,
    allele_labels=['A', 'S', 'C'],  # A = normal, S = sickle, C = hemoglobin C
    freqs=[0.7, 0.2, 0.1],
    fitness={
        'AA': 0.9,  'AS': 1.0,  'AC': 0.95,
                     'SS': 0.2,  'SC': 0.7,
                                  'CC': 0.85,
    },
    n_generations=500,
)

result_sickle = simulate_selection(params_sickle)
make_selection_player(result_sickle)

In [None]:
plot_selection(result_sickle)
plt.show()

In [None]:
plot_selection_trajectory(result_sickle)
plt.show()

### Questions to explore
- Is this a stable equilibrium? Which alleles are maintained?
- What happens if you increase the fitness of CC homozygotes?
- What if the SC heterozygote also has high fitness?

---
## 5. Your Turn: Design a Fitness Landscape

Try designing your own fitness values. Can you create:
1. A scenario where A3 fixes despite starting rare?
2. A stable 3-allele polymorphism where all alleles are at roughly equal frequency?
3. A case where the outcome (which allele fixes) depends on starting frequencies?

In [None]:
# Your parameters here!
params_custom = SelectionParams(
    n_alleles=3,
    freqs=[0.33, 0.34, 0.33],
    fitness={
        'A1A1': 1.0,  'A1A2': 1.0, 'A1A3': 1.0,
                       'A2A2': 1.0, 'A2A3': 1.0,
                                     'A3A3': 1.0,
    },
    n_generations=300,
)

result_custom = simulate_selection(params_custom)
plot_selection(result_custom)
plt.show()