Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

distribution interface to dist_spec #504

Merged
merged 136 commits into from Mar 12, 2024
Merged

distribution interface to dist_spec #504

merged 136 commits into from Mar 12, 2024

Conversation

sbfnk
Copy link
Contributor

@sbfnk sbfnk commented Nov 10, 2023

The aim of this PR is to introduce a new distribution interface to replace the current dist_spec interface. This should make user interaction more intuitive and be more easily extensible if developers want to add more distributions (Weibull seems an obvious candidate). Eventually (following a deprecation cycle as dist_spec became part of a release necessitated by updates to rstan) dist_spec would become an internal function and the user would only interact with it via the distribution functions.

Examples of the new interface:

library("EpiNow2")
#> 
#> Attaching package: 'EpiNow2'
#> The following object is masked from 'package:stats':
#> 
#>     Gamma
Gamma(mean = 3, sd = 1)
#> - gamma distribution:
#>   shape:
#>     9
#>   rate:
#>     3
Gamma(mean = 3, sd = 1, max = 14)
#> - gamma distribution (max: 14):
#>   shape:
#>     9
#>   rate:
#>     3
Gamma(mean = Normal(mean = 3, sd = 1), sd = 4, max = 14)
#> Warning in new_dist_spec(params, "gamma"): Uncertain gamma distribution
#> specified in terms of parameters that are not the "natural" parameters of the
#> distribution (shape, rate). Converting using a crude and very approximate
#> method that is likely to produce biased results. If possible, it is preferable
#> to specify the distribution directly in terms of the natural parameters.
#> - gamma distribution (max: 14):
#>   shape:
#>     - normal distribution:
#>       mean:
#>         0.56
#>       sd:
#>         0.31
#>   rate:
#>     - normal distribution:
#>       mean:
#>         0.19
#>       sd:
#>         0.18
Gamma(shape = Normal(mean = 3, sd = 1), rate = Normal(mean = 2, sd = 0.5), max = 14)
#> - gamma distribution (max: 14):
#>   shape:
#>     - normal distribution:
#>       mean:
#>         3
#>       sd:
#>         1
#>   rate:
#>     - normal distribution:
#>       mean:
#>         2
#>       sd:
#>         0.5
Gamma(shape = 2, rate = 1, max = 14)
#> - gamma distribution (max: 14):
#>   shape:
#>     2
#>   rate:
#>     1
LogNormal(meanlog = 1.5, sdlog = 1)
#> - lognormal distribution:
#>   meanlog:
#>     1.5
#>   sdlog:
#>     1
LogNormal(mean = Normal(1.5, 0.1), sd = Normal(1, 0.1))
#> Warning in new_dist_spec(params, "lognormal"): Uncertain lognormal distribution
#> specified in terms of parameters that are not the "natural" parameters of the
#> distribution (meanlog, sdlog). Converting using a crude and very approximate
#> method that is likely to produce biased results. If possible, it is preferable
#> to specify the distribution directly in terms of the natural parameters.
#> - lognormal distribution:
#>   meanlog:
#>     - normal distribution:
#>       mean:
#>         0.22
#>       sd:
#>         0.043
#>   sdlog:
#>     - normal distribution:
#>       mean:
#>         0.61
#>       sd:
#>         0.071
pmf(c(0.2, 0.4, 0.4))
#> - nonparametric distribution
#>   PMF: [0.2 0.4 0.4]
pmf(c(0.2, 0.4, 0.2, 0.05))
#> - nonparametric distribution
#>   PMF: [0.24 0.47 0.24 0.059]

Created on 2024-01-11 with reprex v2.0.2

Somewhat sad to be masking base::gamma and stats::sd but perhaps unavoidable - or of course could name them differently.

To do following initial review:

  • finalise changes in the stan models to support delay distributions with arbitrary numbers of parameters
  • document all new functions
  • add appropriate input and other checks (e.g. everything passed to the stan model needs to have a max)
  • add warnings when using non-natural uncertain parameterisations
  • update tests
  • update all uses of dist_spec in examples / vignettes
  • update adjust_infection_to_report to use this and deprecate gamma_dist_def and lnorm_dist_def
  • write a brief guide for developers on how to add other distributions
  • add news item

Tests etc are expected to fail at initial state.

Closes #382
Closes #498

@sbfnk

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

@sbfnk sbfnk marked this pull request as ready for review November 27, 2023 15:22

This comment was marked as outdated.

seabbs

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

@seabbs

This comment was marked as outdated.

@sbfnk sbfnk force-pushed the dist-interface branch 2 times, most recently from 053f3e4 to 2135b4b Compare January 11, 2024 16:51
@sbfnk

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

@sbfnk sbfnk requested a review from seabbs January 12, 2024 12:46
@sbfnk

This comment was marked as outdated.

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

@seabbs

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

@sbfnk

This comment was marked as outdated.

This comment was marked as outdated.

inst/dev/design_dist.md Outdated Show resolved Hide resolved
R/checks.R Outdated Show resolved Hide resolved
R/create.R Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
@seabbs seabbs enabled auto-merge (squash) March 12, 2024 10:52
@seabbs seabbs self-requested a review March 12, 2024 10:56
seabbs
seabbs previously approved these changes Mar 12, 2024
Copy link
Contributor

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to be merged in after both another site read and a functionality review. A few minor comments outstanding.

I do think that any porting of this for use elsewhere needs to come with a refactor/redesign as the current implementation seems quite brittle (all the ifelse and magic string based detection of distributions).

I also expect there to be a few edge cases this review has missed that will need fixing once this is on main but we will see (and 🤞🏻) .

Nice work on this - a lot of work but worth it as much nicer to use!

Copy link

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 3a3d8d8 is merged into main:

  • ✔️default: 32.9s -> 32.4s [-21.11%, +18.35%]
  • ✔️no_delays: 33.5s -> 1.92m [-280.87%, +768.08%]
  • ✔️random_walk: 9.05s -> 10.2s [-0.02%, +24.76%]
  • ✔️stationary: 19.4s -> 18.7s [-19.75%, +11.86%]
  • ✔️uncertain: 57.5s -> 59.6s [-31.31%, +38.81%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

Co-authored-by: Sam Abbott <s.e.abbott12@gmail.com>
@seabbs seabbs merged commit 19b5707 into main Mar 12, 2024
11 checks passed
@seabbs seabbs deleted the dist-interface branch March 12, 2024 13:54
Copy link

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 3a3d8d8 is merged into main:

  • ✔️default: 32.8s -> 32.2s [-16.71%, +13.3%]
  • ✔️no_delays: 34.9s -> 41.2s [-8.22%, +44.68%]
  • ✔️random_walk: 9.41s -> 9.35s [-15.52%, +14.39%]
  • ✔️stationary: 19.9s -> 20.3s [-24.85%, +28.5%]
  • ✔️uncertain: 55.5s -> 53.6s [-20.51%, +13.72%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

@sbfnk sbfnk mentioned this pull request Mar 26, 2024
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

gamma_dist_def: scale is inverse scale? Reconsider distribution interface.
3 participants