In [2]:
from dms_stan.datasets.trpb import (
    TrpBExponentialGrowth,
    TrpBSigmoidGrowthInitParam,
    TrpBSigmoidGrowth,
)

SOURCE_FILE = "~/GitRepos/DMSStan/raw_data/trpb/3-site_merged_replicates/LibI/20230926/LibI_merged_AAs.csv"

Prior predictive check for the TrpB exponential growth model:

In [3]:
EXP_MODEL = TrpBExponentialGrowth.from_data_file(SOURCE_FILE)
EXP_MODEL.prior_predictive()

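As a rough illustration of what a prior predictive check does, the sketch below draws growth rates from a prior, pushes them through an exponential growth curve, and summarizes the simulated abundances. It does not use `dms_stan`. The Normal prior on the rate, the timepoints, and the names `simulate_prior_predictive` and `quantile` are assumptions made for this example, not the priors or API of `TrpBExponentialGrowth`.

```python
import math
import random


def simulate_prior_predictive(timepoints, n_draws=1000, rate_sd=0.5, seed=1025):
    """Draw growth rates from a hypothetical Normal(0, rate_sd) prior and
    simulate relative abundance x(t) = exp(r * t), starting from x(0) = 1."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        rate = rng.gauss(0.0, rate_sd)  # assumed prior, for illustration only
        draws.append([math.exp(rate * t) for t in timepoints])
    return draws


def quantile(sorted_values, q):
    """Nearest-rank quantile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


timepoints = [0.0, 1.0, 2.0, 3.0]  # hypothetical sampling times
draws = simulate_prior_predictive(timepoints)

# Summarize the simulated abundances at each timepoint with a 90% interval.
for i, t in enumerate(timepoints):
    column = sorted(d[i] for d in draws)
    print(f"t={t}: 5%={quantile(column, 0.05):.3f}, 95%={quantile(column, 0.95):.3f}")
```

If the simulated intervals cover abundances that are implausible for the experiment, the prior is a candidate for tightening before any fitting is done.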

Next, a slightly more expressive model: sigmoid growth parameterized by initial abundances.

In [3]:
SIG_INIT_MODEL = TrpBSigmoidGrowthInitParam.from_data_file(SOURCE_FILE)
SIG_INIT_MODEL.prior_predictive()


Slightly more expressive again: sigmoid growth with variable growth rates and inflection points, assuming the same maximum abundance for all variants.

In [4]:
SIG_MODEL = TrpBSigmoidGrowth.from_data_file(SOURCE_FILE)
SIG_MODEL.prior_predictive()

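The two sigmoid models above differ in which quantity is treated as the free parameter: the abundance at the first timepoint, or the inflection point. A minimal sketch of how the two views relate, assuming a standard logistic growth curve (the exact functional forms inside `TrpBSigmoidGrowthInitParam` and `TrpBSigmoidGrowth` are not shown in this notebook, and all function names below are hypothetical):

```python
import math


def sigmoid_from_inflection(t, rate, inflection, max_abundance):
    """Logistic growth written in terms of the inflection time."""
    return max_abundance / (1.0 + math.exp(-rate * (t - inflection)))


def sigmoid_from_initial(t, rate, initial, max_abundance):
    """The same curve written in terms of the abundance at t = 0."""
    ratio = (max_abundance - initial) / initial
    return max_abundance / (1.0 + ratio * math.exp(-rate * t))


def initial_from_inflection(rate, inflection, max_abundance):
    """Abundance at t = 0 implied by a given inflection time."""
    return max_abundance / (1.0 + math.exp(rate * inflection))


rate, inflection, max_abundance = 0.8, 2.5, 1.0  # illustrative values
initial = initial_from_inflection(rate, inflection, max_abundance)

# Both forms trace out the same curve once the parameters are converted.
for t in (0.0, 1.0, 2.5, 5.0):
    a = sigmoid_from_inflection(t, rate, inflection, max_abundance)
    b = sigmoid_from_initial(t, rate, initial, max_abundance)
    print(f"t={t}: inflection form={a:.4f}, initial form={b:.4f}")
```

Under this assumed form the curves are identical, so the choice of parameterization matters mainly through which quantity receives a prior.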

# MAP

Now that we've selected our priors, we're ready to find the maximum a posteriori (MAP) estimate for each model.

In [5]:
EXP_MAP = EXP_MODEL.approximate_map(early_stop=10, device=0, seed=1025)
EXP_MAP.plot_loss_curve()

Epochs:  12%|█▏        | 12006/100000 [02:43<20:01, 73.26it/s, -log pdf/pmf=1738634.13]
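The progress bar above stops well short of the 100000-epoch budget, which is what early stopping looks like in practice. The sketch below shows the general idea on a toy model: gradient descent on the negative log posterior, halted once the loss stops improving. It is not the `dms_stan` implementation. Treating `early_stop=10` as a patience of 10 epochs is an assumption, as are the toy model, the learning rate, and the local function names.

```python
def neg_log_posterior(mu, data, prior_sd):
    """Negative log posterior (up to a constant) for y ~ Normal(mu, 1),
    mu ~ Normal(0, prior_sd)."""
    likelihood = 0.5 * sum((y - mu) ** 2 for y in data)
    prior = 0.5 * (mu / prior_sd) ** 2
    return likelihood + prior


def gradient(mu, data, prior_sd):
    """Derivative of neg_log_posterior with respect to mu."""
    return -sum(y - mu for y in data) + mu / prior_sd**2


def approximate_map(data, prior_sd=10.0, lr=0.05, max_epochs=100_000,
                    patience=10, min_delta=1e-12):
    """Gradient descent that stops once the loss has failed to improve by
    more than min_delta for `patience` consecutive epochs."""
    mu = 0.0
    best_loss = neg_log_posterior(mu, data, prior_sd)
    stale_epochs = 0
    losses = [best_loss]
    for _ in range(max_epochs):
        mu -= lr * gradient(mu, data, prior_sd)
        loss = neg_log_posterior(mu, data, prior_sd)
        losses.append(loss)
        if best_loss - loss > min_delta:
            best_loss = loss
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break
    return mu, losses


data = [1.2, 0.7, 1.9, 1.1, 0.6]  # toy observations
mu_map, losses = approximate_map(data)
closed_form = sum(data) / (len(data) + 1 / 10.0**2)  # analytic MAP for this model
print(f"gradient descent: {mu_map:.6f}, closed form: {closed_form:.6f}, "
      f"epochs run: {len(losses) - 1}")
```

The toy model has a closed-form MAP, which makes it easy to confirm that the optimizer stopped near the right answer and not merely early.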


We can plot the posterior predictive checks for the MAP:

In [6]:
EXP_MAP.get_inference_obj(batch_size=50).run_ppc(logy_ppc_samples=True)

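One way to turn a posterior predictive plot into a number is to ask how often the observations fall inside the simulated predictive interval. The sketch below does this for a toy exponential growth model evaluated at a point estimate. Working on log-abundance with Normal noise is a choice made for the example, and `posterior_predictive_coverage` is a hypothetical helper, not part of `dms_stan` or what `run_ppc` computes.

```python
import math
import random


def posterior_predictive_coverage(observed, timepoints, rate, noise_sd,
                                  n_samples=500, level=0.9, seed=1025):
    """Simulate replicate log-abundances at a point estimate and return the
    fraction of observations inside the central predictive interval."""
    rng = random.Random(seed)
    lower_q = (1.0 - level) / 2.0
    covered = 0
    for y, t in zip(observed, timepoints):
        mean = rate * t  # log-abundance under exponential growth from x(0) = 1
        sims = sorted(rng.gauss(mean, noise_sd) for _ in range(n_samples))
        low = sims[int(lower_q * n_samples)]
        high = sims[int((1.0 - lower_q) * n_samples) - 1]
        covered += low <= y <= high
    return covered / len(observed)


# Toy "observed" data generated from the same model, so coverage should be
# close to the nominal level.
rng = random.Random(7)
timepoints = [i * 0.1 for i in range(200)]
true_rate, noise_sd = 0.4, 0.3
observed = [rng.gauss(true_rate * t, noise_sd) for t in timepoints]

coverage = posterior_predictive_coverage(observed, timepoints, true_rate, noise_sd)
print(f"fraction of observations inside the 90% interval: {coverage:.2f}")
```

Coverage far below the nominal level points to a model that cannot reproduce the data; coverage near 100% for a 90% interval suggests the noise is overestimated.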

Same for the abundance-initialized sigmoid model:

In [7]:
SIG_INIT_MAP = SIG_INIT_MODEL.approximate_map(early_stop=10, device=0, seed=1025)
SIG_INIT_MAP.plot_loss_curve()

Epochs:  15%|█▍        | 14561/100000 [03:54<22:54, 62.17it/s, -log pdf/pmf=709488.29] 


In [8]:
SIG_INIT_INF_OBJ = SIG_INIT_MAP.get_inference_obj(batch_size=50)
SIG_INIT_INF_OBJ.run_ppc(logy_ppc_samples=True)


And for the full sigmoid model:

In [9]:
SIG_MAP = SIG_MODEL.approximate_map(early_stop=10, device=0, seed=1025)
SIG_MAP.plot_loss_curve()

Epochs:  16%|█▌        | 16113/100000 [04:15<22:10, 63.03it/s, -log pdf/pmf=1721944.23]


In [10]:
SIG_MAP.get_inference_obj(batch_size=50).run_ppc(logy_ppc_samples=True)


# MCMC

Finally, we will use Stan to sample from the posterior. Sampling is likely to take some time with these models, so instead of sampling here we pass `delay_run=True` and only compile an object that lets us run sampling outside of the notebook:

In [None]:
EXP_MODEL.mcmc(
    output_dir="./exponential",
    cpp_options={"STAN_THREADS": True},
    seed=1025,
    delay_run=True,
    iter_warmup=2000,
)
# SIG_INIT_MODEL.mcmc(
#     output_dir="./sigmoid_init",
#     cpp_options={"STAN_THREADS": True},
#     seed=1025,
#     delay_run=True,
#     iter_warmup=2000,
# )
# SIG_MODEL.mcmc(
#     output_dir="./sigmoid",
#     cpp_options={"STAN_THREADS": True},
#     seed=1025,
#     delay_run=True,
#     iter_warmup=2000,
# )

15:46:22 - cmdstanpy - INFO - compiling stan file /home/bwittmann/GitRepos/DMSStan/flip3/trpB/exponential/model-20250415154622.stan to exe file /home/bwittmann/GitRepos/DMSStan/flip3/trpB/exponential/model-20250415154622
15:50:06 - cmdstanpy - INFO - compiled model executable: /home/bwittmann/GitRepos/DMSStan/flip3/trpB/exponential/model-20250415154622


Once sampling has finished, load the results from disk, run the diagnostics, and report:

In [12]:
# samples = SampleResults.from_disk("/home/bwittmann/GitRepos/DMSStan/flip3/trpB/sigmoid/model-20250410192656-20250410192733_arviz.nc", skip_fit=True)

In [13]:
# _ = samples.diagnose()

In [14]:
# samples.run_ppc(logy_ppc_samples=True)

In [15]:
# samples.inference_obj
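A common convergence diagnostic for MCMC output is the potential scale reduction factor, R-hat, which compares the variance between chains to the variance within chains. The sketch below implements the classic Gelman-Rubin version for a single parameter. It is an illustration only: what `samples.diagnose()` reports is not shown in this notebook, and ArviZ's default R-hat is a rank-normalized split variant, not this formula. The function name `r_hat` and the simulated chains are assumptions.

```python
import random
import statistics


def r_hat(chains):
    """Classic Gelman-Rubin potential scale reduction factor for one
    parameter. `chains` is a list of equal-length lists of draws."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(1025)

# Four chains that all sample the same distribution: R-hat should be near 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]

# Four chains stuck around different means: R-hat should be well above 1.
stuck = [[rng.gauss(float(k), 1.0) for _ in range(1000)] for k in range(4)]

print(f"well-mixed chains: R-hat = {r_hat(mixed):.3f}")
print(f"stuck chains:      R-hat = {r_hat(stuck):.3f}")
```

Values close to 1 are consistent with the chains having mixed; values noticeably above 1 indicate the chains disagree and the run needs a closer look.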