Functions to detect when the sun is up #40
Not working yet, but includes a couple of initial tests and rough documentation of the function. 'features.time' may not be the right place for it, but it makes sense to me at the moment. I think this kind of function can be useful in a quality check, but in and of itself it is not a QA function; it is a feature extraction function. Another organization option could be 'features.irradiance' (since the function operates on irradiance data); however, because it can also operate on power data, it doesn't make sense to me to put it there.
Fixes a bug where the maximum number of days with data was not calculated correctly. Since the data was transformed from a series to a frame, the return value of 'data.max()' was a series. The conditional, however, was expecting a scalar. To avoid this, we compute the minutes of the day as their own series and use that series to do the grouping, rather than making the minutes of the day another column of the data.
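A minimal sketch of the fix described in this commit message, assuming pandas and a made-up three-day data set. The variable names and the `minimum_days` value are hypothetical and are not taken from the PR's code; the point is only that grouping a series by a separate minute-of-day series keeps `.max()` a scalar.

```python
import pandas as pd

# Hypothetical data: three days of 1-minute values, positive from 08:00 to 16:00.
index = pd.date_range("2020-01-01", periods=3 * 1440, freq="min")
minutes = pd.Series(index.hour * 60 + index.minute, index=index)
data = pd.Series(0.0, index=index)
data[(minutes >= 480) & (minutes < 960)] = 100.0

# The problem: on a frame, max() reduces each column and returns a Series,
# which cannot be used directly in a scalar comparison.
frame = data.to_frame("value")
frame_max = frame.max()

# The fix: keep the minutes of the day as their own Series and group by it.
# value_frequency[m] is the number of days with a positive value at minute m.
value_frequency = (data > 0).groupby(minutes).sum()
max_days = value_frequency.max()  # a scalar, safe to compare

minimum_days = 2  # hypothetical setting for this sketch
if max_days < minimum_days:
    raise ValueError("Too few days with data (got {}, minimum_days={})"
                     .format(max_days, minimum_days))
```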
Looks for periods of sufficiently high power or irradiance rather than just looking at the frequency of non-zero values. Should be more robust to data with non-zero (but low) night-time values than daytime_frequency, but may suffer from other (as yet unknown) limitations.
@matt14muller let me know if you have any thoughts on the |
Adds summary of functions to API documentation as well.
Our original plan (see README.md) was to put these types of functions in the |
I'm OK with putting these functions in |
I have been doing some testing of what Kirsten wrote using derivatives and integrals to identify day, night, and clipping, and so far the results look good. I am working on some edge-case testing to see if they cause problems.
Also Kirsten said she is happy to participate.
Matt
More specific name and a place to hold functions for features related to when the sun is up.
Some questions to discuss with @matt14muller before we merge
    raise ValueError("Too few days with data (got {}, minimum_days={})"
                     .format(value_frequency.max(), minimum_days))
daylight_minutes = value_frequency[
    value_frequency > threshold * value_frequency.mean()
@matt14muller I don't understand how this works. `value_frequency[minute_of_day]` is the count of days when the data value is positive. `value_frequency.mean()` is the average of these counts. What happens if the data span a summer at high latitude? Since the daylight period is long, `value_frequency.mean()` will be large compared to data from a winter at the same latitude. So it is more likely to label a minute as daytime during winter than summer?
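The concern can be made concrete with a small numeric sketch. Everything here is an assumption for illustration: idealized count profiles (every daylight minute is positive on all 30 days), 18 hours of daylight for "summer" and 6 for "winter", and a hypothetical `threshold` of 0.8. Only the expression `threshold * value_frequency.mean()` comes from the diff.

```python
import pandas as pd


def cutoff(daylight_minutes_per_day, days=30, threshold=0.8):
    """Return threshold * value_frequency.mean() for an idealized profile.

    value_frequency holds, for each minute of the day, the number of days
    with a positive value: `days` for daylight minutes and 0 otherwise.
    """
    value_frequency = pd.Series(0, index=range(1440))
    value_frequency.iloc[:daylight_minutes_per_day] = days
    return threshold * value_frequency.mean()


summer_cutoff = cutoff(18 * 60)  # long days: mean count is 22.5, cutoff about 18
winter_cutoff = cutoff(6 * 60)   # short days: mean count is 7.5, cutoff about 6

# A twilight minute that is positive on only 10 of the 30 days:
count = 10
labeled_in_summer = count > summer_cutoff
labeled_in_winter = count > winter_cutoff
```

Under these assumptions the same minute is labeled as daytime in the winter data set but not in the summer one, which is the asymmetry the question points at.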
    up.
    """
    max_power = power_or_irradiance.quantile(quantile)
@matt14muller `max_power` is the 95th percentile of the time series, i.e., a value representative of solar noon. So it is labeling daylight as minutes when the time series is greater than or equal to the 95th percentile. The labels from this algorithm are going to depend on the season, e.g., early morning minutes are less likely to be labeled in the summer than in the winter since the solar noon peak is higher in summer.
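A rough sketch of the seasonal dependence, assuming an idealized half-sine day and the defaults quoted in the PR description (20% of the 0.95 quantile). The function name, the peak values, and the fixed 06:00 to 18:00 daylight window are hypothetical simplifications; in real data the day length changes with season as well.

```python
import numpy as np
import pandas as pd


def level_cutoff(peak, fraction=0.2, quantile=0.95):
    """Cutoff used to label daylight for one idealized day.

    The day is a half-sine between 06:00 and 18:00 with the given peak,
    and zero at night (a simplification made for this sketch).
    """
    minute = np.arange(1440)
    daylight = (minute >= 360) & (minute < 1080)
    shape = np.sin(np.pi * (minute - 360) / 720)
    day = pd.Series(np.where(daylight, peak * shape, 0.0))
    max_power = day.quantile(quantile)
    return fraction * max_power


summer_cutoff = level_cutoff(peak=1000.0)  # higher solar-noon peak
winter_cutoff = level_cutoff(peak=500.0)   # lower solar-noon peak

# The same early-morning reading is above the winter cutoff but below
# the summer cutoff.
early_morning_value = 150.0
labeled_in_summer = early_morning_value > summer_cutoff
labeled_in_winter = early_morning_value > winter_cutoff
```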
I wrote this one based on our discussion a few weeks ago (not based on the pvfleets code, even though it is structurally similar). I had not thought of the issue you pointed out with the higher values in the summer; that might be a good reason not to include this. If you only really want mid-day periods and you don't mind losing the morning/evening data, then I do think that this algorithm works well (even on the wacky irradiance data set that Matt shared with me).
Co-authored-by: Cliff Hansen <cwhanse@sandia.gov>
Closing this since #67 implements a better solution.
Two methods to identify when the sun is up from power or irradiance data. These are useful when you do not trust/know the timezone of timestamps in your data.
Both functions work by aggregating data by minute of the day:
`features.daylight.frequency()` looks for the minutes of the day where the vast majority of days (defaults to 80%) in the data have positive values.
`features.daylight.level()` looks for the minutes of the day where the mean of all data for that minute is greater than some percentage (defaults to 20%) of some quantile (defaults to 95%) of the data.

To Do
`features.daylight` submodule
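The two methods summarized in the description can be sketched roughly as follows. This is an illustration written from the prose alone, not the code in this PR: the names `frequency_sketch`, `level_sketch`, and `minute_of_day`, the parameter names, and the synthetic data are all assumptions.

```python
import numpy as np
import pandas as pd


def minute_of_day(index):
    """Minutes since midnight for each timestamp, as a Series on `index`."""
    return pd.Series(index.hour * 60 + index.minute, index=index)


def frequency_sketch(data, threshold=0.8):
    """True where at least `threshold` of days are positive at that minute."""
    minutes = minute_of_day(data.index)
    fraction_positive = (data > 0).groupby(minutes).mean()
    daytime = fraction_positive >= threshold
    # Broadcast the per-minute labels back onto the original timestamps.
    return pd.Series(daytime.loc[minutes.to_numpy()].to_numpy(),
                     index=data.index)


def level_sketch(data, threshold=0.2, quantile=0.95):
    """True where the per-minute mean exceeds threshold * quantile of data."""
    minutes = minute_of_day(data.index)
    mean_by_minute = data.groupby(minutes).mean()
    daytime = mean_by_minute > threshold * data.quantile(quantile)
    return pd.Series(daytime.loc[minutes.to_numpy()].to_numpy(),
                     index=data.index)


# Synthetic data: ten days, half-sine between 08:00 and 16:00, zero at night,
# plus a single low non-zero night-time reading on the first day.
index = pd.date_range("2020-06-01", periods=10 * 1440, freq="min")
minutes = minute_of_day(index)
values = np.where((minutes >= 480) & (minutes < 960),
                  1000 * np.sin(np.pi * (minutes - 480) / 480), 0.0)
data = pd.Series(values, index=index)
data.iloc[60] = 5.0  # 01:00 on the first day

by_frequency = frequency_sketch(data)
by_level = level_sketch(data)
```

On this data both sketches label noon as daytime and ignore the stray night-time reading, while the level-based sketch also drops the first minutes after sunrise, consistent with the trade-off discussed in the review comments.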