(c) 2024 Manuel Razo. This work is licensed under a [Creative Commons
Attribution License CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/).
All code contained herein is licensed under an [MIT
license](https://opensource.org/licenses/MIT).

In [1]:
# Import project package
import Antibiotic
import Antibiotic.mh as mh

# Load CairoMakie for plotting
using CairoMakie
import PairPlots
import ColorSchemes

# Import DimensionalData for handling trajectories
import DimensionalData as DD

# Import basic math libraries
import StatsBase
import LinearAlgebra
import Random
import Distributions
import Distances

# Activate backend
CairoMakie.activate!()

# Set PBoC Plotting style
Antibiotic.viz.theme_makie!()

# Simulating Fitnotype profiles

In this notebook, we will use the previously developed Metropolis-Hastings
evolutionary dynamics algorithm to simulate the evolution of strains on a
fitness-mutational landscape and generate fitnotypic profiles.

The idea is the following: We will define a **single** mutational landscape and
an evolution-condition fitness landscape. Then, we will use the Metropolis-
Hastings algorithm to simulate the evolution of strains on this landscape.
Finally, to simulate the fitnotype profiles, we will determine the fitness of
the different evolving populations in a set of random fitness landscapes
different from the evolution condition landscape.

## 2D Example


Let's start with a simple 2D example. We will define a single central fitness
peak for the evolution condition landscape and four mutational depressions
surrounding it.

In [None]:
# Evolution condition amplitude
fit_evo_amplitude = 5.0
# Evolution condition mean
fit_evo_mean = [0.0, 0.0]
# Evolution condition covariance
fit_evo_covariance = 3.0
# Create peak
fit_evo_peak = mh.GaussianPeak(
    fit_evo_amplitude,
    fit_evo_mean,
    fit_evo_covariance
)

# Mutational peak amplitude
mut_evo_amplitude = 1.0
# Mutational peak means
mut_means = [
    [-1.5, -1.5],
    [1.5, -1.5],
    [1.5, 1.5],
    [-1.5, 1.5],
]
# Mutational peak covariance
mut_evo_covariance = 0.45
# Create mutational peaks
mut_evo_peaks = mh.GaussianPeaks(
    mut_evo_amplitude,
    mut_means,
    mut_evo_covariance
)
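
As a rough illustration of what the peaks defined above might compute, here is a minimal sketch that assumes an isotropic Gaussian form and treats the scalar covariance as a variance. The helpers `sketch_peak` and `sketch_peaks` are hypothetical and are not part of `Antibiotic.mh`, whose actual definitions may differ (for example, in how the mutational landscape is derived from its peaks).

```julia
# Hypothetical isotropic Gaussian peak (an assumption, not the package's code):
# amplitude * exp(-‖x - center‖² / (2 * variance))
sketch_peak(x, amplitude, center, variance) =
    amplitude * exp(-sum(abs2, x .- center) / (2 * variance))

# Hypothetical multi-peak landscape: sum of peaks sharing amplitude and variance
sketch_peaks(x, amplitude, centers, variance) =
    sum(sketch_peak(x, amplitude, c, variance) for c in centers)

# Parameter values mirroring the cell above
sketch_fit_center = sketch_peak([0.0, 0.0], 5.0, [0.0, 0.0], 3.0)
sketch_fit_start = sketch_peak([-2.5, -2.5], 5.0, [0.0, 0.0], 3.0)

sketch_centers = [[-1.5, -1.5], [1.5, -1.5], [1.5, 1.5], [-1.5, 1.5]]
sketch_mut_center = sketch_peaks([0.0, 0.0], 1.0, sketch_centers, 0.45)
sketch_mut_corner = sketch_peaks([1.5, 1.5], 1.0, sketch_centers, 0.45)
```

Under these assumptions, the fitness term is largest at the origin, while the four narrow mutational peaks contribute almost nothing there and are concentrated around their own centers.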

Let's plot the landscapes.

In [None]:
# Define range of phenotypes to evaluate
x = range(-4, 4, length=100)
y = range(-4, 4, length=100)

# Evaluate both landscapes on the x-y grid
F = mh.fitness(x, y, fit_evo_peak)
M = mh.mutational_landscape(x, y, mut_evo_peaks)

# Initialize figure
fig = Figure(size=(600, 300))

# Add axis for fitness landscape
ax1 = Axis(
    fig[1, 1],
    xlabel="phenotype 1",
    ylabel="phenotype 2",
    aspect=AxisAspect(1),
    title="Fitness landscape",
)
# Add axis for mutational landscape
ax2 = Axis(
    fig[1, 2],
    xlabel="phenotype 1",
    ylabel="phenotype 2",
    aspect=AxisAspect(1),
    title="Mutational landscape",
)

# Plot a heatmap of the fitness landscape
heatmap!(ax1, x, y, F, colormap=:viridis)
# Plot heatmap of mutational landscape
heatmap!(ax2, x, y, M, colormap=:magma)

# Plot contour plot
contour!(ax1, x, y, F, color=:white)
contour!(ax2, x, y, M, color=:white)

fig

### Simulate evolution

Now, we will use the Metropolis-Hastings algorithm to simulate the evolution of
strains on this landscape.

In [None]:
Random.seed!(42)

# Define dimensionality
n_dim = 2
# Define number of simulations
n_sim = 10
# Define number of evolution steps
n_steps = 300

# Set evolution parameters
β = 10.0
µ = 0.1

# Select initial conditions relatively close to each other
# (isotropic normal with standard deviation 0.1)
x0 = rand(
    Distributions.MvNormal([-2.5, -2.5], 0.1^2 * LinearAlgebra.I), n_sim
)

# Define dimensions to be used with DimensionalData
phenotype = DD.Dim{:phenotype}([:x1, :x2])
fitness = DD.Dim{:fitness}([:fitness])
time = DD.Dim{:time}(0:n_steps)
lineage = DD.Dim{:lineage}(1:n_sim)


# Initialize DimensionalData array to hold trajectories and fitness
phenotype_traj = DD.zeros(Float32, phenotype, time, lineage)
fitness_traj = DD.zeros(Float32, fitness, time, lineage)

# Stack arrays to store trajectories in phenotype and fitness dimensions
x_traj = DD.DimStack(
    (phenotype=phenotype_traj, fitness=fitness_traj),
)

# Store initial conditions
x_traj.phenotype[time=1] = x0
x_traj.fitness[time=1] = mh.fitness(x0, fit_evo_peak)

# Loop over simulations
for i in 1:n_sim
    # Run Metropolis-Hastings algorithm
    trajectory = mh.evo_metropolis_hastings(
        x_traj.phenotype[time=1, lineage=i],
        fit_evo_peak,
        mut_evo_peaks,
        β,
        µ,
        n_steps
    )

    # Store trajectory
    x_traj.phenotype[lineage=i] = trajectory

    # Calculate and store fitness for each point in the trajectory
    x_traj.fitness[lineage=i] = mh.fitness(trajectory, fit_evo_peak)
end

x_traj
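
To make the role of `β` and `µ` more concrete, here is a minimal, self-contained sketch of a Metropolis-style random walk. It is not the implementation of `mh.evo_metropolis_hastings`: the function `metropolis_sketch` and the stand-in `log_target` are hypothetical, and how the package combines the fitness and mutational landscapes in its acceptance rule is not shown in this notebook. The sketch only assumes Gaussian proposals with step size `µ` and an acceptance ratio sharpened by `β`.

```julia
import Random

# Hypothetical stand-in for the (log) quantity the walker tends to climb:
# here, the log of an isotropic Gaussian peak centered at the origin
log_target(x) = -sum(abs2, x) / (2 * 3.0)

function metropolis_sketch(rng, x0, log_target, β, µ, n_steps)
    n_dim = length(x0)
    # Columns hold the state at each time point, starting with x0
    traj = zeros(n_dim, n_steps + 1)
    traj[:, 1] = x0
    x = copy(x0)
    for t in 1:n_steps
        # Propose a symmetric Gaussian step with standard deviation µ
        x_new = x .+ µ .* randn(rng, n_dim)
        # Log acceptance ratio, scaled by the inverse temperature β
        log_ratio = β * (log_target(x_new) - log_target(x))
        # Accept with probability min(1, exp(log_ratio))
        if log(rand(rng)) < log_ratio
            x = x_new
        end
        traj[:, t+1] = x
    end
    return traj
end

toy_rng = Random.MersenneTwister(42)
toy_traj = metropolis_sketch(toy_rng, [-2.5, -2.5], log_target, 10.0, 0.1, 300)
```

In this sketch, a large `β` makes steps that lower `log_target` unlikely to be accepted, while `µ` sets the typical size of a proposed step.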

Let's plot the trajectories.

In [None]:
Random.seed!(42)

# Initialize figure
fig = Figure(size=(600, 300))

# Add axis for fitness landscape
ax1 = Axis(
    fig[1, 1],
    xlabel="phenotype 1",
    ylabel="phenotype 2",
    aspect=AxisAspect(1),
    title="Fitness landscape",
)
# Add axis for mutational landscape
ax2 = Axis(
    fig[1, 2],
    xlabel="phenotype 1",
    ylabel="phenotype 2",
    aspect=AxisAspect(1),
    title="Mutational landscape",
)

# Plot fitness landscape
heatmap!(ax1, x, y, F)
# Plot heatmap of mutational landscape
heatmap!(ax2, x, y, M, colormap=:magma)

# Plot contour plot
contour!(ax1, x, y, F, color=:white)
contour!(ax2, x, y, M, color=:white)

# Loop over simulations
for i in DD.dims(x_traj, :lineage)
    # Plot trajectory
    scatterlines!.(
        [ax1, ax2],
        Ref(x_traj.phenotype[phenotype=DD.At(:x1), lineage=i].data),
        Ref(x_traj.phenotype[phenotype=DD.At(:x2), lineage=i].data),
        color=ColorSchemes.seaborn_colorblind[i],
        markersize=3
    )
end

# Set limits
xlims!(ax1, -4, 4)
ylims!(ax1, -4, 4)
xlims!(ax2, -4, 4)
ylims!(ax2, -4, 4)

fig

### Simulate fitnotype profiles

Having evolved the populations on the fitness-mutational landscape, we can
now simulate the fitnotype profiles. For this, we will use a set of random
fitness landscapes and determine the fitness of the evolved populations in each
one of them.

Let's define a set of random fitness landscapes.

In [6]:
Random.seed!(42)

# Define landscape dimensionality
n_dim = 2

# Define number of fitness landscapes
n_fit_lans = 50

# Define range of peak means
peak_mean_min = -4.0
peak_mean_max = 4.0

# Define range of fitness amplitudes
fit_amp_min = 1.0
fit_amp_max = 5.0

# Define covariance range
fit_cov_min = 0.5
fit_cov_max = 3.0

# Define possible number of fitness peaks
n_fit_peaks_min = 1
n_fit_peaks_max = 3

# Initialize array to hold fitness landscapes
fit_lans = Array{mh.AbstractPeak}(undef, n_fit_lans + 1)

# Store evolution condition in first landscape
fit_lans[1] = fit_evo_peak

# Loop over number of fitness landscapes
for i in 1:n_fit_lans
    # Sample number of fitness peaks
    n_fit_peaks = rand(n_fit_peaks_min:n_fit_peaks_max)

    # Sample fitness means as n_dim-dimensional vectors from a uniform
    # distribution
    fit_means = [
        rand(Distributions.Uniform(peak_mean_min, peak_mean_max), n_dim)
        for _ in 1:n_fit_peaks
    ]

    # Sample fitness amplitudes from uniform distribution
    fit_amplitudes = rand(
        Distributions.Uniform(fit_amp_min, fit_amp_max), n_fit_peaks
    )

    # Sample fitness covariances from uniform distribution
    fit_covariances = rand(
        Distributions.Uniform(fit_cov_min, fit_cov_max), n_fit_peaks
    )

    # Check whether a single peak or multiple peaks were sampled
    if n_fit_peaks == 1
        # Create single fitness peak
        fit_lans[i+1] = mh.GaussianPeak(
            first(fit_amplitudes), first(fit_means), first(fit_covariances)
        )
    else
        # Create fitness peaks
        fit_lans[i+1] = mh.GaussianPeaks(
            fit_amplitudes, fit_means, fit_covariances
        )
    end # if
end # for

Let's now plot some of the fitness landscapes in a grid.

In [None]:
# Define number of rows and columns
n_rows = 5
n_cols = 5

# Define ranges of phenotypes to evaluate
x = range(-6, 6, length=100)
y = range(-6, 6, length=100)

# Initialize figure
fig = Figure(size=(200 * n_cols, 200 * n_rows))

# Add grid layout
gl = fig[1, 1] = GridLayout()

# Loop over fitness landscapes
for i in 1:(n_rows*n_cols)
    # Extract fitness landscape
    fit_lan = fit_lans[i]
    # Define row and column
    row = (i - 1) ÷ n_cols + 1
    col = (i - 1) % n_cols + 1
    # Add axis
    ax = Axis(gl[row, col], aspect=AxisAspect(1))
    # Evaluate fitness landscape
    F = mh.fitness(x, y, fit_lan)
    # Plot fitness landscape
    heatmap!(ax, x, y, F, colormap=:viridis)
    # Plot contour plot
    contour!(ax, x, y, F, color=:white)
end

# Add global x and y labels
Label(gl[end+1, :], "phenotype 1")
Label(gl[:, 0], "phenotype 2", rotation=π / 2)

fig


With these fitness landscapes, we can now determine the fitness of the evolved
populations in each one of them.

In [None]:
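# Define landscape dimension. Note: fit_lans holds n_fit_lans + 1 landscapes
# (the evolution condition plus the random ones), so this dimension covers only
# the first n_fit_lans of them and the last random landscape is not evaluated.
landscape = DD.Dim{:landscape}(1:n_fit_lans)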

# Initialize fitness and phenotype profiles
fitness_profiles = DD.zeros(Float32, landscape, time, lineage)
phenotype_profiles = DD.zeros(Float32, phenotype, time, lineage)

# Initialize DimensionalData array to hold fitnotype profiles
fitnotype_profiles = DD.DimStack(
    (phenotype=phenotype_profiles, fitness=fitness_profiles),
)

# Copy phenotype trajectories
fitnotype_profiles.phenotype .= x_traj.phenotype
# Store fitness in the evolution condition as the first landscape
fitnotype_profiles.fitness[landscape=1] = x_traj.fitness

# Loop over fitness landscapes
for lan in DD.dims(fitnotype_profiles, :landscape)[2:end]
    # Loop through lineages
    for lin in DD.dims(fitnotype_profiles, :lineage)
        # Store fitness trajectories
        fitnotype_profiles.fitness[landscape=lan, lineage=lin] = mh.fitness(
            fitnotype_profiles.phenotype[lineage=lin].data,
            fit_lans[lan]
        )
    end # for
end # for

fitnotype_profiles
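
The loop above amounts to evaluating every point of a trajectory in every landscape. A minimal sketch with plain arrays, assuming each landscape can be treated as a function from a phenotype vector to a positive fitness value (the closures and the toy trajectory below are hypothetical):

```julia
# Hypothetical landscapes: each maps a 2D phenotype to a positive fitness
toy_landscapes = [
    x -> 5.0 * exp(-sum(abs2, x .- [0.0, 0.0]) / (2 * 3.0)),
    x -> 2.0 * exp(-sum(abs2, x .- [2.0, -1.0]) / (2 * 1.0)),
    x -> 3.0 * exp(-sum(abs2, x .- [-3.0, 3.0]) / (2 * 2.0)),
]

# Toy trajectory: columns are time points moving from (-2.5, -2.5) to (0, 0)
toy_path = hcat([[-2.5, -2.5] .* (1 - t / 4) for t in 0:4]...)

# toy_profile[l, t] is the fitness at time point t measured in landscape l
toy_profile = [
    f(toy_path[:, t]) for f in toy_landscapes, t in axes(toy_path, 2)
]
```

Each column of `toy_profile` is then the fitness profile of the population at one time point, which is the object analyzed in the rest of the notebook.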

### Exploratory analysis of fitnotype profiles

Now that we have the fitnotype profiles, we can perform an exploratory analysis
of them.

Let's begin by plotting the fitness profiles across all environments for a few
sample time points.

In [None]:
# Define number of time points to plot
n_tps_plot = 4

# Define time point indices to plot as evenly spaced as possible
tps_plot = Int.(range(
    DD.dims(fitnotype_profiles, :time)[[1, end]]..., length=n_tps_plot
))

# Initialize figure
fig = Figure(size=(400, 150 * n_tps_plot))

# Add grid layout for entire figure
gl = fig[1, 1] = GridLayout()

# Add grid layout for plots
gl_plots = gl[1:5, 1:5] = GridLayout()

# Loop over time points
for (i, tp) in enumerate(tps_plot)
    # Add axis
    ax = Axis(
        gl_plots[i, 1],
        title="t = $tp",
        yscale=log10,
    )
    # Check if final plot
    if i ≠ n_tps_plot
        # Turn off x-axis
        hidexdecorations!(ax, grid=false)
    end
    # Loop over lineages
    for lin in DD.dims(fitnotype_profiles, :lineage)
        # Plot fitness profile
        scatterlines!(
            ax,
            collect(DD.dims(fitnotype_profiles, :landscape)),
            fitnotype_profiles.fitness[time=DD.At(tp), lineage=lin].data,
            color=ColorSchemes.glasbey_hv_n256[lin],
            markersize=6
        )
    end # for 
end # for i

# Add global x and y labels
Label(gl[end+1, 3], "environment index")
Label(gl[3, 0], "fitness", rotation=π / 2)
fig

From this type of plot, it is hard to see the dynamics of the fitness across
environments. Let's try performing a PCA on the fitness profiles via SVD.

In [11]:
# Flatten time and lineage into columns (rows are environments) and take the
# log of the fitness values
fit_mat = log.(
    reshape(
        fitnotype_profiles.fitness.data,
        size(fitnotype_profiles.fitness, 1),
        :
    )
)

# Fit model to standardize data to mean zero and standard deviation 1 on each
# environment 
dt = StatsBase.fit(StatsBase.ZScoreTransform, fit_mat, dims=2)

# Standardize the data to have mean 0 and standard deviation 1
fit_std = StatsBase.transform(dt, fit_mat)

# Compute SVD
U, S, V = LinearAlgebra.svd(fit_std);
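
As a self-contained illustration of the same recipe on toy data, the sketch below standardizes each row and computes the SVD using only the standard library. The manual z-scoring plays the role of `StatsBase.ZScoreTransform` here, and all `toy_`-prefixed names are hypothetical.

```julia
import LinearAlgebra
import Statistics
import Random

toy_rng = Random.MersenneTwister(1)
# Toy data: 5 "environments" (rows) by 40 observations (columns)
toy_X = randn(toy_rng, 5, 40)

# Standardize each row to mean zero and unit standard deviation
toy_std = (toy_X .- Statistics.mean(toy_X, dims=2)) ./
          Statistics.std(toy_X, dims=2)

# Thin SVD: toy_std = toy_U * Diagonal(toy_S) * toy_V'
toy_U, toy_S, toy_V = LinearAlgebra.svd(toy_std)

# Fraction of the total variance captured by each component
toy_frac = toy_S .^ 2 ./ sum(toy_S .^ 2)
```

The columns of `toy_U` are the principal directions in environment space, and the squared singular values split the total variance among them.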

Let's now plot the singular values profile as well as the percentage of variance
explained by each principal component.

In [None]:
# Initialize figure
fig = Figure(size=(650, 300))

# Add axis for singular values
ax1 = Axis(
    fig[1, 1],
    title="Singular values",
    xlabel="singular value index",
    ylabel="singular value",
)

# Plot singular values
scatterlines!(ax1, S)

# Add axis for percentage of variance explained
ax2 = Axis(
    fig[1, 2],
    title="% variance explained",
    xlabel="principal component index",
    ylabel="% variance explained",
)
# Compute percentage of variance explained
pve = 100 .* S .^ 2 ./ sum(S .^ 2)
# Plot percentage of variance explained
scatterlines!(ax2, pve)

fig

From these plots we can see that the first two principal components together
explain about 70% of the variance in the data: $\approx$ 45% comes from the
first principal component and $\approx$ 25% from the second.
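
For reference, writing $s_i$ for the $i$-th singular value of the standardized data matrix, the fraction of variance attributed to principal component $i$ is

$$
\mathrm{PVE}_i = \frac{s_i^2}{\sum_j s_j^2},
$$

which can be multiplied by 100 to express it as a percentage.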

Let's project the data onto the first two principal components and plot the
results.

In [None]:
# Project data onto first two principal components
fit_pca = U[:, 1:2]' * fit_std

# Initialize figure
fig = Figure(size=(300, 300))

# Add axis
ax = Axis(
    fig[1, 1],
    xlabel="principal component 1",
    ylabel="principal component 2",
    aspect=AxisAspect(1),
)

# Plot fitness profiles
scatter!(ax, fit_pca[1, :], fit_pca[2, :], markersize=5)

fig
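
The projection used above can be checked on toy data. The sketch below, with hypothetical `toy_`-prefixed names, verifies that multiplying by the first `k` left singular vectors gives the same coordinates as scaling the right singular vectors by the singular values, and that projecting a subset of columns matches the corresponding columns of the full projection.

```julia
import LinearAlgebra
import Statistics
import Random

toy_rng = Random.MersenneTwister(2)
# Toy data with centered rows: 6 variables by 30 observations
toy_X = randn(toy_rng, 6, 30)
toy_X = toy_X .- Statistics.mean(toy_X, dims=2)

toy_U, toy_S, toy_V = LinearAlgebra.svd(toy_X)
k = 2

# Project all observations onto the first k principal directions
toy_scores = toy_U[:, 1:k]' * toy_X
# Equivalent expression in terms of singular values and right singular vectors
toy_scores_alt = LinearAlgebra.Diagonal(toy_S[1:k]) * toy_V[:, 1:k]'

# Projecting a block of columns on its own gives the same coordinates
toy_cols = 11:20
toy_scores_block = toy_U[:, 1:k]' * toy_X[:, toy_cols]
```

Because the projection is linear and acts column by column, groups of observations can be projected separately without changing their coordinates, provided they were preprocessed the same way as the data used for the SVD.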

This structure closely resembles the trajectories in the fitness/mutational
landscape space. To check whether that is the case, let's color each point by
its lineage and compare the dynamics in both spaces.

In [None]:
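# Log-transform and standardize each lineage slice of the fitnotype profiles,
# matching the preprocessing used to fit the transform and compute the SVD
fit_pca_std = StatsBase.transform.(
    Ref(dt), eachslice(log.(fitnotype_profiles.fitness.data), dims=3)
)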

# Initialize figure
fig = Figure(size=(600, 300))

# Add axis for original space
ax1 = Axis(
    fig[1, 1],
    title="Phenotype space",
    aspect=AxisAspect(1),
    xlabel="phenotype 1",
    ylabel="phenotype 2",
)

# Add axis for PCA space
ax2 = Axis(
    fig[1, 2],
    title="PCA space",
    aspect=AxisAspect(1),
    xlabel="principal component 1",
    ylabel="principal component 2",
)


# Loop over lineages
for lin in DD.dims(fitnotype_profiles, :lineage)
    # Plot trajectory
    scatterlines!(
        ax1,
        fitnotype_profiles.phenotype[phenotype=DD.At(:x1), lineage=lin].data,
        fitnotype_profiles.phenotype[phenotype=DD.At(:x2), lineage=lin].data,
        color=ColorSchemes.seaborn_colorblind[lin],
        markersize=4
    )
end

# Loop through each lineage (3rd dimension)
for (j, slice) in enumerate(fit_pca_std)
    # Project slice onto PCA space
    pca_slice = U[:, 1:2]' * slice
    # Plot slice
    scatterlines!(
        ax2,
        pca_slice[1, :],
        pca_slice[2, :],
        color=ColorSchemes.seaborn_colorblind[j],
        markersize=4
    )
end

fig