Make checkbox fields explicit #199

Merged
merged 1 commit into from
Nov 20, 2023

Conversation

pipliggins
Collaborator

No description provided.

github-actions bot commented Nov 8, 2023

Summary of the missing optional fields for the ccp-ghana parser:

table        missing  total_fields  percentage_coverage
subject           14            56           75.000000%
visit             16            53           69.811321%
observation       23            70           67.142857%
SUBJECT                                         
dataset_id                                      False
sex                                             False
preterm_infant                                  False
has_asplenia                                    False
has_tuberculosis                                False
has_chronic_respiratory_disease                 False
diabetes_type                                   False
has_apnoea                                      False
has_inflammatory_bowel_disease                  False
has_rare_disease_inborn_metabolism_error        False
has_tuberculosis_past                           False
has_comorbidity_other                           False
vaccinated_covid19                              False
vaccinated_covid19_dates                        False
VISIT                                           
dataset_id                                      False
phase                                           False
treatment_oxygen_mask_prongs                    False
treatment_antifungal_agent_type                 False
treatment_anticoagulation                       False
treatment_steroids                              False
treatment_immunosuppressant                     False
treatment_cpr                                   False
treatment_offlabel                              False
treatment_respiratory_support                   False
treatment_colchicine                            False
treatment_immunoglobulins                       False
treatment_delirium                              False
treatment_delirium_type                         False
treatment_monoclonal_antibody                   False
treatment_pacing                                False
OBSERVATION                                     
acvpu                                           False
clinical_classification_critical_illness_scale  False
total_fluid_output_ml                           False
oxygen_o2hb                                     False
clinical_frailty_score                          False
inability_to_walk_scale                         False
blantyre_coma_score                             False
mid_upper_arm_circumference_cm                  False
fio2_percent                                    False
richmond_agitation-sedation_scale               False
riker_sedation-agitation_scale                  False
mean_arterial_blood_pressure_mmHg               False
anorexia                                        False
bleeding                                        False
confusion                                       False
cyanosis                                        False
feeding_intolerance_pediatrics                  False
hepatomegaly                                    False
irritability_pediatrics                         False
lung_sounds                                     False
lymphadenopathy                                 False
severe_dehydration                              False
heart_sounds                                    False
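
The percentages in the summary are consistent with coverage being the share of optional fields that are not missing. The sketch below reproduces that arithmetic; the function name `percentage_coverage` and the formula are assumptions inferred from the reported numbers, not taken from the bot's source.

```python
def percentage_coverage(missing: int, total_fields: int) -> float:
    """Share of fields that are mapped, as a percentage (assumed formula)."""
    return 100 * (total_fields - missing) / total_fields


# (missing, total_fields) pairs copied from the summary above.
summary = {"subject": (14, 56), "visit": (16, 53), "observation": (23, 70)}

for table, (missing, total) in summary.items():
    print(f"{table}: {percentage_coverage(missing, total):.6f}%")
# subject: 75.000000%
# visit: 69.811321%
# observation: 67.142857%
```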

Collaborator

@ekamau ekamau left a comment

looks good!

@pipliggins
Collaborator Author

@sadiekelly sorry to bug you again... I thought this covered all the bases for checkbox fields by looking for fields with ___x in the name, but having looked again at some of the data files, I'm not sure that a lot of the symptom data wasn't also recorded in a checkbox-like way. For example, the Guinea, Ghana and Uganda-v2 files (just the first ones I checked) have a data structure that suggests they use 0 as a default for admission symptoms/comorbidities, even though 'unknown' is available as an option.

Any insights?

@sadiekelly
Collaborator

Hi @pipliggins, for the databases you've mentioned it is fine not to treat these as checkboxes. They were either radio buttons or dropdown lists with Yes/No/Unknown options, meaning that one option should be selected; if none was selected, the cell in the CSV for that data item would just be blank, and there would be no default entry.
I think your approach of identifying checkbox fields by the ___x pattern sounds sensible; it will definitely cover all REDCap databases that used checkboxes. I think other database formats didn't tend to use checkboxes, it was quite a REDCap thing. I can have a quick look through the data dictionaries to find non-REDCap setups and look for any rogue checkboxes, if that would be helpful?
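
For illustration, the ___x naming check discussed here could look roughly like the sketch below. This is not the code from this PR: the regular expression, the function name `checkbox_columns` and the example column names are all hypothetical, and it assumes checkbox options are exported as one column per option, named with the parent field, three underscores and an option code.

```python
import re

# Assumed naming convention: <field>___<option code>, e.g. "comorbid___1".
CHECKBOX_COLUMN = re.compile(r"^(?P<field>.+?)___(?P<code>[A-Za-z0-9_]+)$")


def checkbox_columns(columns):
    """Group checkbox-style columns by their parent field name."""
    groups = {}
    for name in columns:
        match = CHECKBOX_COLUMN.match(name)
        if match:
            groups.setdefault(match.group("field"), []).append(name)
    return groups


# Made-up header row: one radio/dropdown field and one checkbox field.
header = ["subjid", "fever", "comorbid___1", "comorbid___2", "comorbid___99"]
print(checkbox_columns(header))
# {'comorbid': ['comorbid___1', 'comorbid___2', 'comorbid___99']}
```

Columns that do not match, such as the radio-style `fever` above, are left out of the grouping, so a blank cell there can still be read as "no option selected" rather than as a default.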

@pipliggins
Collaborator Author

Hi @sadiekelly, if you have the spare capacity that would be great! The well filled-in radio fields make these a bit harder to spot, so that's good to know.

@sadiekelly
Collaborator

Hi @pipliggins, I've checked through the data dictionaries; most are REDCap format so will be covered by the ___x pattern. Of the couple that are not, checkboxes were generally not used. I think predicovid may have been one, and the checkbox fields there are not mapped, so no changes are needed. Otherwise it looks like the checkboxes have been accounted for, so all good!

@pipliggins
Collaborator Author

Fab, thanks for checking this @sadiekelly!

@pipliggins pipliggins merged commit 0d4768b into main Nov 20, 2023
2 checks passed
3 participants