What's the issue?
The definition of Nused within the obs_diag documentation is: the number of observations that were assimilated. However, this is only true if no observation types are flagged as trusted. If some observation types are trusted, Nused is actually the number of observations that are included in the obs_diag metrics calculation.
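For concreteness, below is a minimal Python sketch of the two possible counts. This is not DART source code: the record fields (qc, obs_type), the QC value treated as "assimilated", and the observation type names are illustrative assumptions.

```python
# Illustrative sketch only -- not DART source. The record layout, the QC
# value treated as "assimilated", and the type names are assumptions.

ASSIMILATED_QC = {0}               # assumed QC value meaning "assimilated"
TRUSTED_TYPES = {"SOIL_MOISTURE"}  # hypothetical set of trusted obs types

def n_assimilated(obs_list):
    """Documented meaning of Nused: obs that were actually assimilated."""
    return sum(1 for ob in obs_list if ob["qc"] in ASSIMILATED_QC)

def n_used(obs_list):
    """Apparent actual meaning: obs entering the obs_diag metrics, which
    for trusted types includes obs that were rejected by QC."""
    return sum(1 for ob in obs_list
               if ob["qc"] in ASSIMILATED_QC or ob["obs_type"] in TRUSTED_TYPES)

obs = [
    {"obs_type": "SOIL_MOISTURE",   "qc": 0},  # assimilated
    {"obs_type": "SOIL_MOISTURE",   "qc": 7},  # rejected, but type is trusted
    {"obs_type": "LEAF_AREA_INDEX", "qc": 7},  # rejected, not trusted
]

print(n_assimilated(obs))  # 1
print(n_used(obs))         # 2 -> the rejected-but-trusted ob is still counted
```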
Where did you find the issue?
CLM-DART users have reported it, and this is consistent with my understanding.
What needs to be fixed?
At a minimum, the definition of Nused should be fixed to state: the number of observations included in the metrics calculation, or something to that effect. This issue might also extend to plotting scripts such as plot_rmse_evolution_xxx.m, if Nused is the variable used to calculate *=assimilated; I am not sure yet.
Suggestions for improvement
The bigger question is what the intent of the Nused metric was, and what users find most useful. I don't find the number of observations considered in the metric calculation particularly useful, and perhaps the original intent of the variable was to report Nassim instead of Nused. It is not clear to me whether this is a coding error or a documentation error.
In general, I find the use of trusted (N_trusted) observations very helpful, because observations are often rejected not because they are untrustworthy, but because of outlier-threshold rejection criteria driven by systematic errors between the obs and the model state. As a result, as the model state is adjusted closer to the observations, more observations are included in the metrics calculation, which can cause unexpected temporal behavior in the rmse and bias statistics. Setting the observations to trusted is sometimes more intuitive because the temporal rmse and bias statistics make more sense -- they are based on the same number (and spatial region) of observations at all times. In my opinion, the fact that the default behavior is for the metrics to consider a 'moving window' of observations is sub-optimal.
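To illustrate the 'moving window' effect with synthetic numbers (not DART output; the simple rejection rule here is a stand-in for the actual outlier-threshold check):

```python
# Synthetic illustration -- not DART output. A fixed network of obs is
# compared to a model prior whose bias shrinks over time. The simplified
# rejection rule (|obs - prior| < threshold) stands in for the real
# outlier-threshold check.
import numpy as np

rng = np.random.default_rng(0)
truth = 10.0
obs = truth + rng.normal(0.0, 1.0, size=200)   # same obs network every step
threshold = 3.0

for step, bias in enumerate([2.5, 1.5, 0.5, 0.0]):
    prior = truth + bias                        # model drifts toward the obs
    accepted = np.abs(obs - prior) < threshold  # 'moving window' of accepted obs

    rmse_accepted = np.sqrt(np.mean((obs[accepted] - prior) ** 2))  # default behavior
    rmse_trusted  = np.sqrt(np.mean((obs - prior) ** 2))            # trusted: all obs, every step

    print(f"step {step}: accepted={accepted.sum():3d}/200  "
          f"rmse(accepted)={rmse_accepted:4.2f}  rmse(trusted)={rmse_trusted:4.2f}")
```

The point of the sketch is that the rmse over accepted obs improves partly because the sample of obs changes from step to step, not only because the model improves, whereas the trusted statistics are computed over the same observations at every step.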
Anything else we should know?
Maybe use this thread as an opportunity to suggest new or different DART statistics, especially now that we hope to make a transition to the pyDART code.