rounding-based stale value detection #37
Do you have a real world data set that benefits from this? In practice we find the existing stale filter to be too aggressive to be useful. Context: SolarArbiter/solarforecastarbiter-core#124
I'll see if Matt Muller (who devised this filter) has some specific examples.
Matt Muller indicated to me that he hasn't seen much of a problem with the sort of false positives that are noted in the SFA issue, at least not for the data sets he has used. Obviously this approach has the same issues with rounded (or otherwise altered) data (in particular, Matt called out rounded temperature data as a problem case). I don't see much of a way around that other than making sure the caller uses a window that is large enough to span more than the expected time for the variable to change. "More aggressive" was a poor choice of words on my part. All I really meant by it was that
The original implementation was based on loops; this implementation uses pandas functions for all manipulations, which should improve performance and maintainability.
We now have two stale-value detection methods. Also adds a note about caveats for the use of these functions.
Also removes unused pandas import.
Using .count() doesn't work on Python 3.5; len() should be supported on pretty much every version.
Makes the description somewhat more clear. Co-Authored-By: Cliff Hansen <cwhanse@sandia.gov>
Co-Authored-By: Cliff Hansen <cwhanse@sandia.gov>
0330226 to 002d450
Make this match the default for the other gaps functions
Remove note about the differences with the stale_values_diff function as it is no longer applicable.
Co-authored-by: Cliff Hansen <cwhanse@sandia.gov>
window comes first in stale_values_diff
Implementation of the stale value detection method from the pvfleets_qa_analysis code.
This is a very similar function to `stale_values_diff`, but it is based on the difference between consecutive data points after rounding rather than putting a lower bound on the differences. It is also more aggressive in what it considers stale, marking the full sequence of repeated values, rather than just a suffix of the sequence as in `stale_values_diff`.
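For illustration only, here is a minimal sketch of the idea described above. It is not the code in this PR: the function name `stale_values_round_sketch`, its parameters `window` and `decimals`, and their defaults are all assumptions made for the example. It rounds the data, takes the difference between consecutive points, and flags every point in any run of repeated rounded values that is at least `window` long.

```python
import pandas as pd


def stale_values_round_sketch(x, window=6, decimals=3):
    """Hypothetical sketch: flag runs of repeated values after rounding.

    Every point in a run of at least `window` identical rounded values
    is flagged, including the first point of the run.
    """
    rounded = x.round(decimals)
    # A new run starts wherever the rounded value differs from the
    # previous one. The first diff is NaN, which also starts a run.
    run_id = (rounded.diff() != 0).cumsum()
    # Length of the run each point belongs to.
    run_length = run_id.map(run_id.value_counts())
    return run_length >= window


# Example: the three values near 2.0 collapse to 2.0 after rounding.
x = pd.Series([1.0, 2.0001, 2.0002, 2.0003, 3.0, 3.0, 4.0])
flags = stale_values_round_sketch(x, window=3, decimals=3)
print(flags.tolist())
```

Because whole runs are flagged, slowly varying or coarsely rounded data (rounded temperature, for example) needs a `window` longer than the time the variable can plausibly stay constant.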