Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor distributions #112

Merged
merged 7 commits into from
Sep 7, 2023
Merged

Refactor distributions #112

merged 7 commits into from
Sep 7, 2023

Conversation

RomeshA
Copy link
Contributor

@RomeshA RomeshA commented Sep 3, 2023

Refactor sampling relating to #110

@RomeshA RomeshA mentioned this pull request Sep 3, 2023
stisim/people.py Outdated
self.states = ss.ndict(base_states, states)
states = [
ss.State('age', float),
ss.State('female', bool, False),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace with distribution -- if i'm following, this would be ss.State('female', bool, ss.choice([True,False])) right? Optionally with an argument for sex_ratio I guess.

I wonder if we'll need to be careful about dtypes here. Actully maybe this is an argument against #108. Or on the other hand, maybe not - maybe the distribution class could handle the dtypes without too much trouble.

Copy link
Contributor Author

@RomeshA RomeshA Sep 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep just pushed that which will be a useful example to have in the codebase. I didn't add in the optional argument to set the sex ratio (although that's just an extra argument to choice) because I'm not sure how exactly we'd want to incorporate it.

Something I also realized in working on this refactor/PR is that this opens up the possibility of having the distribution function be based on data, e.g., it would be possible to define a Distribution that takes in binned age counts for a setting and then returns sampled ages accordingly. So that could give some new options for incorporating data too

@robynstuart robynstuart mentioned this pull request Sep 4, 2023
Copy link
Contributor

@cliffckerr cliffckerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I like how this PR gets around some of the arbitrariness of the implementation of sample(). A few comments/questions:

  • I think it would be nice to support plain python objects where possible, so e.g. parameters can be exported to JSON and then imported. Could we have something like InterventionDict but for distributions? e.g. dist = dict(dist='poisson', rate=0.01); sti.dist(dist) would be equivalent to sti.poisson(rate=0.01)?
  • I can imagine users might be confused about when to use these distributions and when to use equivalent NumPy or SciPy distributions. Should we support SciPy distributions as well (which to me look more or less equivalent except with sample() instead of rvs).
  • Some of these distributions were introduced to support specific use cases for Covasim (and/or because this was the only way of providing a distribution), I wonder if we need all of them? That said, I agree we wouldn't want users to be forced to specify parameters as e.g. partners = lambda n: np.random.poisson(0.01, size=n) instead of partners = sti.poisson(0.01).

@robynstuart
Copy link
Contributor

Issues here: #120

@robynstuart robynstuart merged commit 48360a1 into main Sep 7, 2023
2 checks passed
@robynstuart robynstuart deleted the refactor-distributions branch September 7, 2023 22:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants