Calculating probability of premature death #1418

Merged
merged 49 commits into from
Oct 3, 2024

Conversation

RachelMurray-Watson
Collaborator

@RachelMurray-Watson RachelMurray-Watson commented Jul 9, 2024

Fixes #1374

Initial code for calculating the probability of premature death (defined as death before 70 years of age), plus a test for that code. Based on life_expectancy.py, using the cumulative probability of death before the age of 70 years. Calculated separately for M and F. The results seem to roughly match the figures at https://apps.who.int/gho/data/view.searo.60980?lang=en.

…ned as before 70 years old) and a test for that code. Based off of life_expectancy.py, using the cumulative probabilities of death before the age of 70 years. Calculated separately for M and F.
Collaborator

@tbhallett tbhallett left a comment

Very nice!

from tlo.analysis.utils import get_scenario_info, summarize


def _calculate_probability_of_dying_before_70(
Collaborator

I would rename this to _calculate_probability_of_premature_death and 'soft-code' the "magic number" of 70 throughout.

I would declare at the top of the script

AGE_BEFORE_WHICH_DEATH_IS_DEFINED_AS_PREMATURE = 70

and provide the reference to the paper that is cited in the issue.

That way we're super-clear about the origin of the magic number. And if we change our mind about the definition, it's a simple change.
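
A minimal sketch of what this suggestion could look like. The constant name is the one proposed above; the helper `is_premature_death` and its signature are hypothetical and are not taken from the PR.

```python
# Age (in years) before which a death is counted as premature.
# The reference for this value is the paper cited in issue #1374.
AGE_BEFORE_WHICH_DEATH_IS_DEFINED_AS_PREMATURE = 70


def is_premature_death(
    age_at_death_years,
    threshold=AGE_BEFORE_WHICH_DEATH_IS_DEFINED_AS_PREMATURE,
):
    """Hypothetical helper: True if death occurs strictly before the threshold age."""
    return age_at_death_years < threshold
```

With the threshold declared once, changing the definition only means editing the constant (or passing a different `threshold`), rather than hunting for every literal 70.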

Collaborator Author

Done!

from tlo.analysis.probability_premature_death import get_probability_of_dying_before_70


def test_get_probability_premature_death():
Collaborator

Given that we're running this test on a known data set, would it make sense to check that the numerical answer obtained is exactly equal to what we expect?
(I realise the test you modelled this on didn't do this, but arguably it should have!)
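
As an illustration of this kind of check, here is a sketch on a toy data set whose answer can be counted by hand. The function `probability_of_death_before` and the cohort below are invented for the example; the PR's own test runs against stored simulation output instead.

```python
import math


def probability_of_death_before(ages_at_death, cohort_size, threshold=70):
    """Fraction of a closed cohort that dies strictly before `threshold` years."""
    return sum(age < threshold for age in ages_at_death) / cohort_size


def test_known_cohort_gives_known_answer():
    # Toy cohort of 10 people: 3 die before 70 (ages 5, 42, 69),
    # 2 die at or after 70, and 5 have no recorded death.
    ages_at_death = [5, 42, 69, 70, 88]
    result = probability_of_death_before(ages_at_death, cohort_size=10)
    # Pin the hand-computed value rather than only checking the output's shape.
    assert math.isclose(result, 0.3)
```

Pinning the expected number means that a silent change in the calculation makes the test fail, which a check on types or column names alone would not catch.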

"""
probability_of_dying_before_70 = dict()

age_group_labels = _person_years_at_risk.index.get_level_values('age_group').unique()
Collaborator

As this logic has so much in common with the life-expectancy calculation, I think it would make sense to factor it out, so that the same calculation can serve both purposes (life expectancy and probability of premature death).

I would also recommend putting these functions into the same file as the life-expectancy calculations.

Collaborator Author

So do you think this file should go entirely, and just include it all in the life expectancy file?

Collaborator

yes, I think that would be fine.

Main point is about the refactoring to avoid repeating the logic about computing death rates in age-groups and building the cumulative probability of death.

(Which file the function lives in is secondary, but I think having it all in one file would be fine.)
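
A rough sketch of the kind of shared helper being described, in plain Python. The function names, the list-based inputs, and the constant-hazard formula `1 - exp(-width * rate)` are assumptions made for illustration; they are not confirmed to match what life_expectancy.py does.

```python
import math


def prob_of_dying_in_each_age_group(deaths, person_years, width_years):
    """Per-group probability of dying, assuming a constant death rate within each group."""
    return [
        1.0 - math.exp(-width_years * d / py)
        for d, py in zip(deaths, person_years)
    ]


def cumulative_probability_of_death(probs):
    """Probability of having died by the end of each successive age group."""
    surviving = 1.0
    cumulative = []
    for q in probs:
        surviving *= 1.0 - q
        cumulative.append(1.0 - surviving)
    return cumulative


# Illustrative input: 14 five-year age groups (0-4, ..., 65-69),
# each with 10 deaths per 1000 person-years.
probs = prob_of_dying_in_each_age_group([10] * 14, [1000] * 14, width_years=5)
probability_of_premature_death = cumulative_probability_of_death(probs)[-1]
```

The same per-group probabilities could then feed a life-expectancy routine, so that the death-rate and cumulative-probability steps are written only once.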

Collaborator Author

Done! Refactored all logic around probability of dying.

RachelMurray-Watson and others added 9 commits July 29, 2024 09:55
…ound between the life_expectancy calculation and the probability_premature_death calculation. Also included the probability_premature death calculations in this file.

In test_probability_premature_death.py, created a new test based on an artificial cohort in which everyone starts the simulation at age 0 and the simulation then runs for 70 years. The number that die before 70 can be compared with the estimates produced by the function.
…ry. Instead have uploaded data from a dummy cohort, where everyone starts at age 0 and it runs for 70 years.
@RachelMurray-Watson RachelMurray-Watson changed the title Calculating probability of death before 70 Calculating probability of premature death Jul 31, 2024
… was a proportion

Removed defunct calculate_mortality metrics

Moved test for premature death into life expectancy file

Added docstring clarifying probability premature death test

Clarified test by adding explicit conditions for what is considered premature (which can be modified in the future).
Renamed _calculate_probability_of_premature_death to show that it is one run
* inequality should be < not <=
* read of the picklefile can be streamlined
* check should be equality of the estimates of the two methods
* linting
* update docstring
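
The idea behind the cohort test and the review points above (strict `<`, and equality between the two methods) can be sketched as follows. Everything here is illustrative: the function names, the five-year age groups, and the toy list of ages at death are assumptions, and the life-table step uses simple cohort proportions rather than the PR's own calculation.

```python
import math

THRESHOLD = 70  # age before which a death counts as premature
WIDTH = 5       # assumed age-group width in years


def direct_estimate(ages_at_death, cohort_size):
    """Count deaths strictly before the threshold (note `<`, not `<=`)."""
    return sum(age < THRESHOLD for age in ages_at_death) / cohort_size


def life_table_estimate(ages_at_death, cohort_size):
    """Chain per-group survival proportions for a cohort that all start at age 0."""
    alive = cohort_size
    surviving_fraction = 1.0
    for start in range(0, THRESHOLD, WIDTH):
        deaths = sum(start <= age < start + WIDTH for age in ages_at_death)
        if alive > 0:
            surviving_fraction *= 1.0 - deaths / alive
        alive -= deaths
    return 1.0 - surviving_fraction


# Toy cohort of 20 people; 12 have no recorded death.
ages_at_death = [0.5, 3, 17, 44, 61, 69.9, 70, 82]
assert math.isclose(
    direct_estimate(ages_at_death, 20),
    life_table_estimate(ages_at_death, 20),
)
```

The death at exactly 70 is excluded by both methods, which is the behaviour the strict inequality is meant to guarantee.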
@tbhallett
Collaborator

tbhallett commented Oct 3, 2024

No idea why test_determinism.py is failing on this branch now. Could this be something to do with the runners @matt-graham ???

If it's only that, we can merge this now.

@matt-graham
Collaborator

> No idea why test_determinism.py is failing on this branch now. Could this be something to do with the runners @matt-graham ???
>
> If it's only that, we can merge this now.

It's failing on master after merging #1470 (more details in #1472). I was hoping #1473 would fix it, but it doesn't appear to, and I'm currently investigating alternatives (annoyingly, this only happens on the Actions runner, so it's a bit difficult to figure out what is going on!). As it's unrelated to the changes here, I think we can safely merge this in if that's the only test failing.

@tbhallett
Collaborator

> > No idea why test_determinism.py is failing on this branch now. Could this be something to do with the runners @matt-graham ???
> > If it's only that, we can merge this now.
>
> It's failing on master post merging #1470 (more details in #1472). I was hoping #1473 would fix but it doesn't appear to and currently trying to investigate alternatives (annoyingly this is only happening on Actions runner so bit difficult to figure out what is going on!). As unrelated to changes here I think we can safely merge this in if that's the only test failing.

Thanks Matt. Ok, will merge in now.

@tbhallett tbhallett merged commit 72cd17e into master Oct 3, 2024
59 of 60 checks passed
@tbhallett tbhallett deleted the rmw/probability_premature_death branch October 3, 2024 15:01
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Summarise 'Probability of Premature Death' (PPD) as additional useful health metric
4 participants