Improve documentation for params object? #13

Open
akeyel opened this issue Mar 27, 2024 · 1 comment

akeyel commented Mar 27, 2024

I am trying to run the model with a data set from New York State. I can get everything to run, but I don't think I'm actually running a model that makes sense (I just used the defaults from the vignette example!). Fundamentally, I don't know what the entries in the params object are. In particular, I'm not sure which are fixed, and which are priors that the model will update during the simulation. I'm also unsure how to determine the appropriate inputs based on my input data set.

Would it be possible to add documentation for these parameters (e.g. by creating a params man page that includes a description of each parameter, what role it plays in the model, and how to derive it for a given data set)?

I see that there is a param_df page, but as far as I can tell, the param_df object is not actually used for anything in running the model.

I considered running a sensitivity analysis, but it takes ~4 hours per model run right now, so it would be difficult to sample the parameter space that way.

Thank you!

@kaitejohnson
Collaborator

Hi @akeyel, thanks for bringing this up!

We have added documentation to the example_df, param_df, and example_params.toml. The example_params.toml mostly contains priors, though it also includes fixed parameter specifications for the delays that we set (the generation interval and the delay from infection to hospital admission). If you are fitting the model to COVID data, these are probably a reasonable place to start. None of these values depend on the dataset; any dependence of the priors or fixed parameters on the data is handled in the get_stan_data_site_level() function.

As for the formatting of the input dataset, hopefully the documentation for example_df now makes it clearer what should go where. We also revised the toy_data_vignette.Rmd to clarify how to construct a dataset like this from your own data. The data needed is a long-form tidy dataframe with one row per wastewater site observation, with the hospital admissions data joined to it. See the generate_simulated_data.R function to see how this was constructed from the model-generated data.
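To illustrate the long-form shape described above, here is a minimal sketch in plain Python rather than R. The column names (`date`, `site`, `ww_conc`, `hosp_admits`) and the values are hypothetical and are not the package's actual schema; the example_df documentation is the authority for the real columns. The sketch also assumes admissions are joined by date only, and it keeps only dates that have a wastewater observation, which may not match how the package expects admission-only dates to be represented.

```python
from datetime import date

# Hypothetical wastewater observations: one record per site per sample date.
ww_obs = [
    {"date": date(2024, 1, 1), "site": "A", "ww_conc": 120.0},
    {"date": date(2024, 1, 1), "site": "B", "ww_conc": 95.5},
    {"date": date(2024, 1, 3), "site": "A", "ww_conc": 140.2},
]

# Hypothetical daily hospital admissions, keyed by date.
hosp = {
    date(2024, 1, 1): 12,
    date(2024, 1, 2): 15,
    date(2024, 1, 3): 11,
}


def join_long(ww_obs, hosp):
    """Left-join admissions onto each wastewater observation by date."""
    rows = []
    for obs in ww_obs:
        row = dict(obs)  # copy so the input records are not modified
        row["hosp_admits"] = hosp.get(obs["date"])  # None if no admissions record
        rows.append(row)
    # One row per (date, site) observation, ordered for readability.
    return sorted(rows, key=lambda r: (r["date"], r["site"]))


long_rows = join_long(ww_obs, hosp)
for r in long_rows:
    print(r)
```

Each output row is a single site observation carrying that day's admissions count, so the admissions value repeats across sites sampled on the same date.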

The param_df in the toy_data_vignette.Rmd is actually just for comparing the known values of some key parameters that we specified in the defaults in generate_simulated_data.R, to the posterior estimates of those values from fitting to the simulated data in the toy_data_vignette.Rmd. Apologies for the confusion!
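The kind of comparison described here, known simulation values against posterior estimates, can be sketched as below. This is an illustrative stand-in in plain Python, not the vignette's code: the parameter names `param_a` and `param_b`, the fake evenly spaced "draws", and the equal-tailed 95% interval are all assumptions made for the example.

```python
import math


def credible_interval(draws, level=0.95):
    """Equal-tailed interval from the sorted draws (conservative nearest ranks)."""
    s = sorted(draws)
    tail = (1.0 - level) / 2.0
    lo_idx = math.floor(tail * (len(s) - 1))
    hi_idx = math.ceil((1.0 - tail) * (len(s) - 1))
    return s[lo_idx], s[hi_idx]


def check_recovery(known, posterior_draws, level=0.95):
    """For each parameter, report whether the known value falls inside the interval."""
    report = {}
    for name, true_value in known.items():
        lo, hi = credible_interval(posterior_draws[name], level)
        report[name] = {
            "known": true_value,
            "lower": lo,
            "upper": hi,
            "covered": lo <= true_value <= hi,
        }
    return report


# Hypothetical known values used to simulate the data.
known = {"param_a": 0.5, "param_b": 1.5}

# Hypothetical posterior draws: an evenly spaced grid on [0, 1] stands in
# for real MCMC output so the example is deterministic.
grid = [i / 100 for i in range(101)]
posterior_draws = {"param_a": grid, "param_b": grid}

report = check_recovery(known, posterior_draws)
for name, row in report.items():
    print(name, row)
```

A known value outside its interval, as with `param_b` here, would suggest the fit is not recovering that parameter from the simulated data.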

Let us know if this makes sense and properly addresses your question.

Thanks for your patience. We know this isn't very user-friendly at the moment, but hearing from others who are trying to adapt it for their own use cases really helps us identify critical areas for improvement as we work towards making it easier to use!
