
Commit

Fecal_Coliform/E_coli replace '#' outside of replace_unit_by_dict()
replace_unit_by_dict() matches on the entire unit string. Because '#' is hard to deal with, it is now replaced the same way regardless of where it appears in the unit string. There is a current TODO to determine why doing string replacements before the dict replacement (generally preferable, as it standardizes units first) was problematic.
jbousquin committed Aug 1, 2023
1 parent c5e5f43 commit a18a70d
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion harmonize_wq/harmonize.py
@@ -1040,7 +1040,9 @@ def harmonize_generic(df_in, char_val, units_out=None, errors='raise',
elif out_col in ['Fecal_Coliform', 'E_coli']:
# NOTE: Ecoli ['cfu/100ml', 'MPN/100ml', '#/100ml']
# NOTE: feca ['CFU', 'MPN/100ml', 'cfu/100ml', 'MPN/100 ml', '#/100ml']
- # Replace known unit problems ('#' count; assume CFU/MPN is /100ml)
+ # Replace known special character in unit ('#' count assumed as CFU)
+ wqp.replace_unit_by_str('#', 'CFU')
+ # Replace known unit problems (e.g., assume CFU/MPN is /100ml)
wqp.replace_unit_by_dict(domains.UNITS_REPLACE[out_col])
#TODO: figure out why the above must be done before replace_unit_by_str
# Replace all instances in results column
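
The difference between the two replacement styles in this hunk can be sketched in plain Python. This is only an illustration under stated assumptions: `replace_by_dict`, `replace_by_str`, and `whole_string_map` are hypothetical stand-ins, not the `wqp` methods or the contents of `domains.UNITS_REPLACE`, whose actual behavior is not shown in this commit.

```python
# Hypothetical stand-ins for illustration; NOT the harmonize_wq implementation.


def replace_by_dict(units, mapping):
    """Replace a unit only when the entire unit string matches a key."""
    return [mapping.get(unit, unit) for unit in units]


def replace_by_str(units, old, new):
    """Replace a substring wherever it occurs inside each unit string."""
    return [unit.replace(old, new) for unit in units]


# Example unit strings, loosely based on the NOTE comments in the diff.
units = ['#/100ml', '#/100 ml', 'CFU', 'MPN/100ml']

# Assumed mapping, made up for this sketch (whole-string keys only).
whole_string_map = {'#/100ml': 'CFU/100ml', 'CFU': 'CFU/100ml'}

# Dict replacement alone misses '#/100 ml' because it is not an exact key.
dict_only = replace_by_dict(units, whole_string_map)

# Replacing '#' as a substring first handles it wherever it appears,
# so the dict no longer needs a key for every '#' variant.
str_then_dict = replace_by_dict(
    replace_by_str(units, '#', 'CFU'), whole_string_map
)

print(dict_only)
print(str_then_dict)
```

In this sketch the substring pass removes every '#' up front, which is the motivation the commit message gives for handling '#' outside of the dict-based replacement.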