Null values currently fail validation for EmergencyCareEpisodeSchema due to int64 type #44
Comments
Having had a better look at feature_maps.py, I think this might be better managed by doing a fillna(0) on the relevant columns! I was wondering if I could clarify a few other things, however, @vvcb
Missing SNOMED codes should be replaced with 0. This avoids pandas NaN issues (I have included a link in the documentation). Is PR #46 still necessary if this is already done? If there is a specific code for missing values, then this should be included in feature_maps. It would be great if you are happy to do this.
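For illustration, a minimal sketch of the fillna(0) approach discussed above. The helper name `fill_missing_codes`, the regex, and the column names are assumptions made for this example and are not taken from feature_maps.py.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical pattern for code columns; adjust to the real schema.
CODE_COLUMN_PATTERN = re.compile(r"^(eddiag|edtreat)_[0-9]{2}$")


def fill_missing_codes(df: pd.DataFrame) -> pd.DataFrame:
    """Replace missing codes with 0 and cast matching columns to int64."""
    out = df.copy()
    for col in out.columns:
        if CODE_COLUMN_PATTERN.match(col):
            # After fillna(0) no NaN remains, so the int64 cast is safe.
            out[col] = out[col].fillna(0).astype(np.int64)
    return out


# Illustrative data only: one missing diagnosis code.
example = pd.DataFrame(
    {"eddiag_01": [195967001.0, np.nan], "other": [1.5, np.nan]}
)
cleaned = fill_missing_codes(example)
```

Columns that do not match the pattern are left untouched, so their nulls are still visible to validation.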
a) Treat absence of diagnosis as 'Non-ACSC'. We have about 10% of patients where no diagnosis is assigned within the emergency care dataset.
Ah...I see it now 😊. Option b may be the correct one, but it is worth checking with the lead team how they want this handled. 10% is a sizable proportion to be discarding.
The following columns are currently failing validation because they contain null values but are set to dtype=np.int64 in the EmergencyCareEpisodeSchema:
edcomorb_[0-9]{2}$
eddiag_[0-9]{2}$
eddiagqual_[0-9]{2}$
edentryseq_[0-9]{2}$
edinvest_[0-9]{2}$
edtreat_[0-9]{2}$
Suggest changing to pd.Int64Dtype() to allow null values. The columns could be changed to a float type instead, but then you get the awkward situation where pandas adds a decimal point onto the end of the SNOMED code.
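A small sketch of the three options compared above, using plain pandas rather than the project's schema. The code values are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative SNOMED-style codes with one missing entry.
codes = [195967001, None, 22298006]

# Float column: the null is kept as NaN, but codes gain a trailing ".0".
as_float = pd.Series(codes, dtype="float64")
float_text = str(as_float.iloc[0])

# Plain NumPy int64: the cast fails because NaN has no integer form.
try:
    as_float.astype(np.int64)
    cast_failed = False
except ValueError:
    cast_failed = True

# Nullable integer: the null is kept as pd.NA and codes stay integers.
nullable = pd.Series(codes, dtype=pd.Int64Dtype())
nullable_text = str(nullable.iloc[0])
```

Building the nullable series directly from the raw values, rather than converting a float column, also avoids passing very long codes through float64 on the way.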