Calculating probability of premature death #1418
…ned as before 70 years old) and a test for that code. Based off of life_expectancy.py, using the cumulative probabilities of death before the age of 70 years. Calculated separately for M and F.
Very nice!
from tlo.analysis.utils import get_scenario_info, summarize


def _calculate_probability_of_dying_before_70(
I would rename this to _calculate_probability_of_premature_death
and 'soft-code' the "magic number" of 70 throughout.
I would declare at the top of the script
AGE_BEFORE_WHICH_DEATH_IS_DEFINED_AS_PREMATURE = 70
and provide the reference to the paper that is cited in the issue.
That way we're super-clear about the origin of the magic number. And if we change our mind about the definition, it's a simple change.
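A minimal sketch of what that could look like, assuming a small helper is acceptable. Only the constant name comes from the comment above; `is_premature_death` and its parameters are hypothetical names used for illustration, not the PR's code:

```python
# Hypothetical sketch; only the constant name is taken from the review comment.
# Age before which a death is counted as premature (reference: the paper cited in the issue).
AGE_BEFORE_WHICH_DEATH_IS_DEFINED_AS_PREMATURE = 70


def is_premature_death(
    age_at_death: float,
    threshold: float = AGE_BEFORE_WHICH_DEATH_IS_DEFINED_AS_PREMATURE,
) -> bool:
    """Return True if the death occurred strictly before the threshold age."""
    return age_at_death < threshold
```

Every use of the threshold then reads from one place, and a different definition can be tried by passing `threshold=...` without editing the function body.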
Done!
from tlo.analysis.probability_premature_death import get_probability_of_dying_before_70


def test_get_probability_premature_death():
Given that we're running this test on a known data set, would it make sense to check that the numerical answer obtained is exactly equal to what we expect?
(I realise the test you modelled this on didn't do this, but arguably it should have!)
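As an illustration of the kind of check meant here, a self-contained sketch that pins the result to a hand-calculated value. The function `toy_probability_of_dying_before` is a hypothetical stand-in for the function under test, and it assumes a constant death rate within each age group; the real code reads simulation outputs instead:

```python
import math


def toy_probability_of_dying_before(deaths, person_years, interval_width=5):
    """Hypothetical stand-in: probability of dying across consecutive age groups.

    Assumes a constant death rate (deaths / person-years) within each age group.
    """
    probability_of_surviving = 1.0
    for number_of_deaths, years_at_risk in zip(deaths, person_years):
        death_rate = number_of_deaths / years_at_risk
        probability_of_surviving *= math.exp(-death_rate * interval_width)
    return 1.0 - probability_of_surviving


def test_probability_matches_hand_calculation():
    # Two 5-year age groups, each with 10 deaths over 1,000 person-years (rate 0.01).
    # By hand: 1 - exp(-0.01 * 5 * 2) = 1 - exp(-0.1).
    result = toy_probability_of_dying_before(deaths=[10, 10], person_years=[1000, 1000])
    assert math.isclose(result, 1.0 - math.exp(-0.1), rel_tol=1e-12)
```

A tolerance-based comparison is used rather than `==` because the two routes to the answer differ by floating-point rounding.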
"""
probability_of_dying_before_70 = dict()

age_group_labels = _person_years_at_risk.index.get_level_values('age_group').unique()
As this logic has so much in common with the life-expectancy calculation, I think it would make sense to factor it out, so that the same calculation can serve both purposes (life expectancy and probability of premature death).
I would also recommend putting these functions into the same file as the life-expectancy calculations.
So do you think this file should go entirely, and just include it all in the life expectancy file?
Yes, I think that would be fine.
The main point is the refactoring, to avoid repeating the logic that computes death rates in age groups and builds the cumulative probability of death.
(Which file the function lives in is secondary, but I think having it all in one file would be fine.)
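One possible shape for the shared logic, sketched under stated assumptions: death rates arrive as a pandas Series indexed by consecutive 5-year age-group labels in ascending order, and the rate is constant within each group. The function names, the label format and the default `"65-69"` are hypothetical, not taken from life_expectancy.py:

```python
import numpy as np
import pandas as pd


def cumulative_probability_of_death(death_rates: pd.Series, interval_width: float = 5.0) -> pd.Series:
    """Probability of having died by the end of each age group.

    Assumes consecutive age groups in ascending order and a constant
    death rate within each group (assumptions of this sketch).
    """
    probability_of_dying_in_group = 1.0 - np.exp(-death_rates * interval_width)
    probability_of_surviving_to_end = (1.0 - probability_of_dying_in_group).cumprod()
    return 1.0 - probability_of_surviving_to_end


def probability_of_premature_death(death_rates: pd.Series, last_group_before_threshold: str = "65-69") -> float:
    """Read the premature-death probability off the shared cumulative series."""
    cumulative = cumulative_probability_of_death(death_rates)
    return float(cumulative.loc[last_group_before_threshold])
```

A life-expectancy routine could consume the same cumulative series, so the death-rate-to-probability step would live in only one place.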
Done! Refactored all logic around probability of dying
…ound between the life_expectancy calculation and the probability_premature_death calculation. Also included the probability_premature_death calculations in this file. In test_probability_premature_death.py, created a new test based on an artificial cohort in which everyone starts the simulation at age 0 and which then runs for 70 years. The number that die before 70 can then be compared with the estimates produced by the function.
…ry. Instead have uploaded data from a dummy cohort, where everyone starts at age 0 and it runs for 70 years.
… was a proportion
Removed defunct calculate_mortality metrics
Moved test for premature death into life expectancy file
Added docstring clarifying probability premature death test
Clarified test by adding explicit conditions for what is considered premature (which can be modified in the future).
Renamed _calculate_probability_of_premature_death to show that it is one run
* inequality should be < not <=
* read of the pickle file can be streamlined
* check should be equality of the estimates of the two methods
* linting
* update docstring
No idea why. If it's only that, we can merge this now.
It's failing on
Thanks Matt. Ok, will merge in now.
Fixes #1374
Initial code for calculating the probability of premature death (defined as before 70 years old) and a test for that code. Based off of life_expectancy.py, using the cumulative probabilities of death before the age of 70 years. Calculated separately for M and F. Seems to roughly match figures based on https://apps.who.int/gho/data/view.searo.60980?lang=en.