These QC rules are still very ad hoc -- more work needed.
@divine7022 Sorry for the duplicates -- #17 is really intended to just be the AGU notebooks and I'd like to get this one merged first so that the diff is comprehensible there. I'll leave breadcrumbs to my changes in this branch that address your comments from #17.
dlebauer left a comment:
Looks good. A few comments / questions, but ready to merge.
| Statewide runs continue to use the 198 sites evaluated in phase 2.
| We also introduce focused validation runs using the subset of sites where
| direct observations of soil carbon and/or biomass are available during the
Is biomass validated here?
| probably need a "drop into this directory with this format"
| step. do NOT include validation_site_info.csv
|
| #### Validation data
It would be useful to describe what the validation data must contain, and whether there is a common format we should use that will work across datasets. I'd suggest following the existing naming from PEcAn/BETYdb where applicable.
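To make the request concrete, here is a minimal sketch of what a "required contents" check could look like. Everything in it is an assumption for illustration: the column names (`site_id`, `date`, `variable`, `value`, `units`), the function name `check_validation_data`, and the example rows are made up and are not taken from this PR. The variable names in the example are meant to resemble PEcAn-style standard names, but the actual vocabulary would need to be confirmed against PEcAn/BETYdb.

```r
# Hypothetical set of required columns; NOT taken from the PR.
required_cols <- c("site_id", "date", "variable", "value", "units")

# Stop if a required column is absent; warn if any observed value is missing.
check_validation_data <- function(df, required = required_cols) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("validation data is missing columns: ",
         paste(missing_cols, collapse = ", "))
  }
  # Rows with no observed value cannot be compared against model output.
  n_na <- sum(is.na(df$value))
  if (n_na > 0) {
    warning(n_na, " row(s) have a missing value")
  }
  invisible(df)
}

# Made-up example rows, only to show the expected shape.
example <- data.frame(
  site_id = c("site_a", "site_b"),
  date = as.Date(c("2018-06-01", "2019-06-01")),
  variable = c("TotSoilCarb", "AbvGrndWood"),
  value = c(4.2, 1.1),
  units = c("kg C m-2", "kg C m-2"),
  stringsAsFactors = FALSE
)
check_validation_data(example)
```

A long format like this (one row per site, date, and variable) is one option that would let soil carbon and biomass datasets share a single reader; a wide format would need a column per variable instead.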
| # TODO is this desirable?
| # In production, may be better to complain if no PFT match
| drop_na(pft) |>
| # Temporary hack:
Just to make sure I understand: given that this was already filtered to 'control' and 'none' above (https://github.com/ccmmf/workflows/pull/16/changes#diff-cf89db02c85dc7dfd85dcf4d1a504296836708ed83a2e40b89ab1950f210217cR27),
is the only case of multiple treatments one where a site has both 'control' and 'none'? Does this indicate that there is data cleaning to do?
This deserves a more careful check, but (1) some sites report several measurements separately for the same treatment (these could probably be averaged together if we want), and (2) mostly the motivation here was just "During initial development, only run the model once for a given location until we start passing it treatments that can be expected to differ." Stay tuned.
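For reference, the two cases above could be handled roughly as follows. This is a sketch under stated assumptions, not the code in this branch: the data frame `obs`, its columns (`site_id`, `treatment`, `value`), and the example numbers are hypothetical, and base R is used here only to keep the example dependency-free.

```r
# Made-up observations: site_a reports two 'control' measurements and one 'none'.
obs <- data.frame(
  site_id = c("site_a", "site_a", "site_a", "site_b"),
  treatment = c("control", "control", "none", "control"),
  value = c(10, 12, 9, 5),
  stringsAsFactors = FALSE
)

# Case (1): average repeated measurements within each site and treatment.
avg <- aggregate(value ~ site_id + treatment, data = obs, FUN = mean)

# Case (2): keep a single row per site so the model runs once per location.
# Sorting first makes the choice deterministic: the alphabetically first
# treatment wins, so 'control' is kept over 'none' when a site has both.
avg <- avg[order(avg$site_id, avg$treatment), ]
one_per_site <- avg[!duplicated(avg$site_id), ]
```

Preferring 'control' over 'none' is an arbitrary tie-break chosen for the sketch; once treatments that are expected to differ are passed to the model, the second step would be dropped.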
Pushing for visibility, details TK