-
Notifications
You must be signed in to change notification settings - Fork 13
Renal reg #781
Renal reg #781
Conversation
This adds with_record_in_UKRR() to patients.py for the purpose of passing through arguments.
These should fail.
rebkwok
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far - just a few minor comments
cohortextractor/patients.py
Outdated
| def with_record_in_UKRR( | ||
| # picks dataset held by UKRR | ||
| from_dataset, | ||
| returning, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pretty much everywhere else in patients.py we use keyword args only - I'm assuming that's to prevent users putting positional args in the wrong order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I completely understand this comment. Do you mean change the order of them or remove? We need to be able to identify the dataset in the UKRR somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant to make them keywords instead of positional args. It means users can't mix up the order and specify returning where we think we're getting from_dataset.
Like:
def with_record_in_UKRR(
# picks dataset held by UKRR
from_dataset=None,
returning="binary_flag",
...
I think elsewhere we generally default to binary_flag for returning. from_dataset could have a default if it makes sense to default to one of them, or None and we raise an exception if they don't provide it
tests/test_tpp_backend.py
Outdated
|
|
||
| assert_results( | ||
| study.to_dicts(convert_to_strings=False), | ||
| renal_registry_2019_prev=[1, 0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think binary_flag returns strings (but the test output will tell us that, I guess :) )
| ICHD, HHD, HD, PD, Tx | ||
| * "treatment_modality_prevalence" - the treatment modality from the prevalence data | ||
| * "latest_creatinine" - most recent creatinine held by UKRR | ||
| * "latest_egfr" - most recent eGFR held by UKRR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it makes more sense, we can map these things to more user-friendly labels (same for the dataset names). I'm not sure if it's helpful here, but some of the other methods do (e.g. patients_admitted_to_hospital).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's talk about this
cohortextractor/patients.py
Outdated
| * "latest_creatinine" - most recent creatinine held by UKRR | ||
| * "latest_egfr" - most recent eGFR held by UKRR | ||
| on_or_before: date of interest as a string with the format `YYYY-MM-DD`. Filters results to measurements | ||
| on or before the given date (as defined by `date_filter_column`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy/paste error (and in the next 2 args)? There's no date_filter_column
cohortextractor/tpp_backend.py
Outdated
| on_or_before=None, | ||
| on_or_after=None, | ||
| between=None, | ||
| include_date_of_match=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this arg?
| # Date filtering: date limits | ||
| on_or_before=None, | ||
| on_or_after=None, | ||
| between=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only need between here - process_covariate_definitions.handle_time_period_options converts any on_or_before or on_or_after args into a between one before it gets to the backend.
cohortextractor/tpp_backend.py
Outdated
| elif returning == "latest_egfr": | ||
| column_definition = "eGFR_ckdepi" | ||
| else: | ||
| column_definition = returning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be nicer as another mapping:
column_mapping = {
"binary_flag": 1,
"treatment_modality_start": "mod_start",
...
}
column_definition = column_mapping.get(returning)
And then we can also check and raise an exception if the returning value isn't one we expected
tests/test_tpp_backend.py
Outdated
| renal_centre="AD100", | ||
| rrt_start="2019-01-01", | ||
| mod_start="RT", | ||
| mod_prev="NULL", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely confident about this, but I think you want None here, and sqlalchemy will make it a null in the db. (Unless nulls in the real data are represented by the string "NULL".)
rebkwok
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few minor suggestions/questions, but this looks good 👍
THere's something formatting-related that black is complaining about atm
On my part
Start of adding the renal registry to the backend.