nighttime flag should account for time within an interval #567

Closed
wholmgren opened this issue Sep 14, 2020 · 7 comments · Fixed by #579
Labels
bug Something isn't working

@wholmgren
Member

The NIGHTTIME flag is set based on the solar zenith at the interval label timestamp. For example, for an observation with interval label = beginning, for an interval [start, end) the flag is computed based on only solar_zenith(start) < 87. And for an observation with interval label = ending, for an interval (start, end] the flag is computed based on only solar_zenith(end) < 87. This is fine for 1 minute observations, but becomes problematic for longer intervals. We instead should label intervals by the fraction of time within an interval that is nighttime.

Here are the relevant sections of code:

@mask_flags('NIGHTTIME')
def check_irradiance_day_night(solar_zenith, max_zenith=87):
    """ Checks for day/night periods based on solar zenith.

    Parameters
    ----------
    solar_zenith : Series
        Solar zenith angle in degrees
    max_zenith : maximum zenith angle for a daylight time

    Returns
    -------
    flags : Series
        True when solar zenith is less than max_zenith.
    """
    flags = _check_limits(solar_zenith, ub=max_zenith)
    return flags

def _solpos_night(observation, values):
    solar_position = pvmodel.calculate_solar_position(
        observation.site.latitude, observation.site.longitude,
        observation.site.elevation, values.index)
    night_flag = validator.check_irradiance_day_night(solar_position['zenith'],
                                                      _return_mask=True)
    return solar_position, night_flag

def calculate_solar_position(latitude, longitude, elevation, times):
    """
    Calculates solar position using pvlib's implementation of NREL SPA.

    Parameters
    ----------
    latitude : float
    longitude : float
    elevation : float
    times : pd.DatetimeIndex

    Returns
    -------
    solar_position : pd.DataFrame
        The DataFrame will have the following columns: apparent_zenith
        (degrees), zenith (degrees), apparent_elevation (degrees),
        elevation (degrees), azimuth (degrees),
        equation_of_time (minutes).
    """
    solpos = pvlib.solarposition.get_solarposition(times, latitude,
                                                   longitude,
                                                   altitude=elevation,
                                                   method='nrel_numpy')
    return solpos

My proposal for the arbiter is the following:

  1. tasks._solpos_night computes solar position for sub-intervals, similar to persistence_scalar_index. We'd need to choose an appropriate sub-interval length. 1 minute? 5 minutes like in persistence_scalar_index?
  2. validator.check_irradiance_day_night computes the flag on the higher resolution solar position data.
  3. tasks._solpos_night applies boolean resampling logic, similar to accounting for marginal nighttime and clearsky conditions when resampling observations #556, to create a flag for the times in the original index.
  4. One of the following:
    A. Resample the solar position data to match the observation rules, and return.
    B. Subset and continue to return the instantaneous values at the labels.
    C. Return the full high resolution solar position dataframe and modify the downstream functions to account for averaging/subsetting. Perhaps my favorite approach, but also the most work.
  5. Return the solar_position, night_flag tuple.
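Steps 1 to 3 of the proposal can be sketched in plain Python. This is a minimal illustration, not the arbiter's implementation: `nighttime_fraction`, `toy_zenith`, the 1 minute step, and the 0.5 threshold are all assumptions made for the example, and `toy_zenith` stands in for a real solar position calculation. The samples cover [start, end), which matches interval label = beginning; an ending label would shift the samples by one step.

```python
from datetime import datetime, timedelta


def nighttime_fraction(start, end, zenith_at, step=timedelta(minutes=1),
                       max_zenith=87.0):
    """Fraction of sub-interval samples in [start, end) that are night.

    zenith_at is a hypothetical callable mapping a datetime to a solar
    zenith angle in degrees; in practice it would wrap a solar position
    calculation.
    """
    n_total = 0
    n_night = 0
    t = start
    while t < end:
        n_total += 1
        # Night is the complement of the daytime test zenith < max_zenith.
        if zenith_at(t) >= max_zenith:
            n_night += 1
        t += step
    if n_total == 0:
        raise ValueError("interval must contain at least one sample")
    return n_night / n_total


def toy_zenith(t):
    """Stand-in for a real solar position model: the zenith starts at
    95 degrees at 06:00 and falls by 0.25 degrees per minute."""
    minutes = (t - datetime(2020, 9, 14, 6, 0)).total_seconds() / 60
    return 95.0 - 0.25 * minutes


start = datetime(2020, 9, 14, 6, 0)
end = datetime(2020, 9, 14, 7, 0)

# Label-only flags depend on which end of the interval carries the label.
night_at_start = toy_zenith(start) >= 87.0  # interval label = beginning
night_at_end = toy_zenith(end) >= 87.0      # interval label = ending

# Sub-interval sampling gives one answer for the interval as a whole.
fraction = nighttime_fraction(start, end, toy_zenith)
is_night = fraction > 0.5  # the 0.5 threshold is an assumption
```

With this toy model the hourly interval is flagged as night when labeled at the beginning and as day when labeled at the end, while the sampled fraction gives a single value that a threshold can turn into one flag.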

@cwhanse have you looked into this for pvanalytics? I didn't see any code or issues.

@wholmgren wholmgren modified the milestones: 1.0, 1.0 rc4 Sep 14, 2020
@cwhanse
Contributor

cwhanse commented Sep 14, 2020

The day/night labeler in pvanalytics assumes that you don't yet trust the location or timestamps. We haven't really discussed beyond that inference algorithm.

Here, the challenge is to label day/night for time averages where the location and timestamps are trusted.

What about an approach that identifies the periods when day/night change (sunrise/sunset) and computes the daylight fraction of those periods? Maybe more complicated but perhaps faster than computing solar position for all of the downscaled time series.

@wholmgren
Member Author

> What about an approach that identifies the periods when day/night change (sunrise/sunset) and computes the daylight fraction of those periods? Maybe more complicated but perhaps faster than computing solar position for all of the downscaled time series.

That's an interesting alternative. sun_rise_set_transit_spa (or ephem/geometric variants) give us the zenith = 90 points, so we'd need to let go of consistency with the results of 1 minute observations resampled to longer intervals. calc_time would let us find when zenith = 87, but I suspect it's much slower.

Or were you suggesting that we find sunrise/sunset and then compute the solar position of the downscaled time series for just the hours around those times?

Perhaps an opportunity to add some more performance tests to pvlib.

@cwhanse
Contributor

cwhanse commented Sep 14, 2020

> > What about an approach that identifies the periods when day/night change (sunrise/sunset) and computes the daylight fraction of those periods? Maybe more complicated but perhaps faster than computing solar position for all of the downscaled time series.

> That's an interesting alternative. sun_rise_set_transit_spa (or ephem/geometric variants) give us the zenith = 90 points, so we'd need to let go of consistency with the results of 1 minute observations resampled to longer intervals. calc_time would let us find when zenith = 87, but I suspect it's much slower.

sun_rise_set_transit_ephem accepts a 'horizon' kwarg that defines the angle for sunrise/sunset. I don't know if SPA could be extended with a similar kwarg.

> Or were you suggesting that we find sunrise/sunset and then compute the solar position of the downscaled time series for just the hours around those times?

I was thinking the above, but, if we know exactly when sunrise occurs, then it should be easy to compute the minutes between sunrise and interval start, and end, without having to downscale the interval.
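The idea of computing the minutes between sunrise and the interval bounds can be sketched as an overlap calculation. This is a minimal sketch under stated assumptions: `daylight_fraction` is a hypothetical helper, the sunrise and sunset times are made-up values (they would come from a sunrise/sunset routine evaluated at the chosen zenith threshold), and the interval is assumed to contain at most one daylight period.

```python
from datetime import datetime, timedelta


def daylight_fraction(start, end, sunrise, sunset):
    """Fraction of the interval [start, end) between sunrise and sunset."""
    if end <= start:
        raise ValueError("end must be after start")
    # Overlap of [start, end) with [sunrise, sunset), clipped at zero
    # for intervals that lie entirely in the night.
    overlap = min(end, sunset) - max(start, sunrise)
    overlap = max(overlap, timedelta(0))
    return overlap / (end - start)


# Illustrative values only, not computed from a solar position model.
sunrise = datetime(2020, 9, 14, 6, 33)
sunset = datetime(2020, 9, 14, 18, 41)

morning = daylight_fraction(datetime(2020, 9, 14, 6, 0),
                            datetime(2020, 9, 14, 7, 0), sunrise, sunset)
midday = daylight_fraction(datetime(2020, 9, 14, 12, 0),
                           datetime(2020, 9, 14, 13, 0), sunrise, sunset)
night = daylight_fraction(datetime(2020, 9, 14, 2, 0),
                          datetime(2020, 9, 14, 3, 0), sunrise, sunset)
```

Only the intervals that contain a sunrise or sunset get a fractional value; every other interval is exactly 0 or 1, which is why this approach avoids downscaling the full series.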

@wholmgren
Member Author

> sun_rise_set_transit_ephem accepts a 'horizon' kwarg that defines the angle for sunrise/sunset.

Thanks for reminding me of that feature.

pvlib/pvlib-python#1059 adds asv performance benchmarks for the relevant solar position functions. The results are here: https://pvlib-benchmarker.github.io/pvlib-benchmarks/#/summarylist

sun_rise_set_transit_ephem: 50 ms to calculate sunrise and sunset for 100 days. Linear performance from 10 --> 100 days. I confirmed that setting a horizon doesn't have much effect on the performance. Would still need to time stitching the flags together into a continuous series. I expect that would be relatively slow but it's worth investigating.

sun_rise_set_transit_geometric: constant 13 ms for 1, 10, 100 days. Would need to do more to compute horizon = 3 degrees and stitching.

sun_rise_set_transit_spa: 28 ms for 100 days, also would need to do more to compute horizon = 3 degrees and stitching. Only slightly faster (22, 23 ms) for (1, 10 days).

sun_rise_set_transit_spa with numba: 7.6 ms for 100 days, also would need to do more to compute horizon = 3 degrees and stitching. Somewhat faster (2.3, 4.3 ms) for (1, 10 days).

calc_time: 170 us to find a single time. 170 us * 100 days * 2 = 34 ms. Altitude at initial guesses needs to be within +/- 90 degrees of sunrise/set - this could be a little finicky. Still needs stitching. Seems like ephem is a better choice than this one.

spa_python: 1272 ms to calculate 100 days of 1 minute resolution solar positions. I suspect masking and resampling is on the order of milliseconds. Linear performance from 1 --> 10 --> 100 days.

spa_python with numba: 250 ms to calculate 100 days of 1 minute resolution solar positions. Approximately linear performance from 1 --> 10 --> 100 days.

It takes several seconds to compile the numba code, so it's only useful for repeated calculations. My understanding of the workflow is that we would not benefit unless using precompiled numba code, which I don't have experience with. @alorenzo175 can you comment on the applicability of numba in the workers?

Also worth pointing out that any database IO that involves 100 days of 1 minute resolution data is going to take a bit of time - seemed like a couple of seconds going through csv on the dashboard. I'd rather not add another second of latency but that's not the end of the world and hundreds of ms is not a big deal.
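For a rough sense of scale, the benchmark figures quoted above can be converted into per-timestamp costs. This is back-of-the-envelope arithmetic on the reported numbers only; the variable names are illustrative, and the sunrise/sunset timings exclude the stitching step that has not been measured.

```python
MINUTES_PER_DAY = 24 * 60
days = 100

# Number of 1 minute timestamps in 100 days.
n_timestamps = days * MINUTES_PER_DAY

# Reported totals for 100 days of 1 minute solar positions.
spa_python_ms = 1272.0
spa_numba_ms = 250.0
per_timestamp_us = spa_python_ms * 1000 / n_timestamps
per_timestamp_numba_us = spa_numba_ms * 1000 / n_timestamps

# calc_time: one call per sunrise and one per sunset, 170 us each.
calc_time_us = 170.0
calc_time_total_ms = calc_time_us * days * 2 / 1000

# Full-resolution SPA relative to the ephem sunrise/sunset call
# (before any stitching cost is added to the latter).
ephem_ms = 50.0
spa_vs_ephem = spa_python_ms / ephem_ms

print(f"{n_timestamps} timestamps")
print(f"spa_python: {per_timestamp_us:.2f} us per timestamp")
print(f"spa_python with numba: {per_timestamp_numba_us:.2f} us per timestamp")
print(f"calc_time: {calc_time_total_ms:.0f} ms for {days} days")
print(f"spa_python / ephem: {spa_vs_ephem:.1f}x")
```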

@alorenzo175
Member

> It takes several seconds to compile the numba code, so it's only useful for repeated calculations. My understanding of the workflow is that we would not benefit unless using precompiled numba code, which I don't have experience with. @alorenzo175 can you comment on the applicability of numba in the workers?

I think the workers spawn a process to perform the validation. So if the numba code is precompiled at worker startup and fork is used to spawn the process, it might not need to be recompiled for each validation. Hard to say whether the time to test and implement that is worth it over going without numba.

@wholmgren
Member Author

Similar issue in

def _check_power_limits(
    power, solar_zenith, capacity, capacity_limit_low,
    capacity_limit_high_day, capacity_limit_high_night
):
    # convert fractions to absolute values
    capacity_low = capacity * capacity_limit_low
    capacity_high_day = capacity * capacity_limit_high_day
    capacity_high_night = capacity * capacity_limit_high_night
    # True for daytime values
    day_night = check_irradiance_day_night(solar_zenith, max_zenith=93)
    flag_low = _check_limits(power, lb=capacity_low)
    flag_high_day = _check_limits(power, ub=capacity_high_day) & day_night
    flag_high_night = _check_limits(power, ub=capacity_high_night) & ~day_night
    # composite constructed such that True values within limits for day
    # or night. False values exceed any limit.
    flags = flag_low & (flag_high_day | flag_high_night)
    return flags

Perhaps we should change check_ac_power_limits and check_dc_power_limits to take a day/night flag instead of solar zenith.
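A sketch of what taking a day/night flag instead of solar zenith might look like, written with plain lists so it runs without pandas. The function name `check_power_limits_with_flag`, the strict inequalities, and the limit values in the usage lines are assumptions for illustration, not the validator's actual API or defaults.

```python
def check_power_limits_with_flag(
    power, is_day, capacity, capacity_limit_low,
    capacity_limit_high_day, capacity_limit_high_night
):
    """Return True where power is within limits for day or night.

    is_day is a precomputed sequence of booleans, so the caller decides
    how day/night is determined (for example from a nighttime fraction
    over the interval rather than zenith at the label).
    """
    # convert fractions to absolute values
    capacity_low = capacity * capacity_limit_low
    capacity_high_day = capacity * capacity_limit_high_day
    capacity_high_night = capacity * capacity_limit_high_night
    flags = []
    for p, day in zip(power, is_day):
        # pick the upper bound that applies to this value
        upper = capacity_high_day if day else capacity_high_night
        # strict bounds are an assumption about _check_limits defaults
        flags.append(capacity_low < p < upper)
    return flags


# Hypothetical 100 kW plant with made-up limit fractions.
power = [50, 50, 2, -10, 130]
is_day = [True, False, False, True, True]
flags = check_power_limits_with_flag(power, is_day, 100, -0.05, 1.2, 0.05)
```

Selecting one upper bound per value is equivalent to the composite `flag_low & (flag_high_day | flag_high_night)` in the original, since each value is either day or night.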

@cwhanse
Contributor

cwhanse commented Sep 17, 2020

> Perhaps we should change check_ac_power_limits and check_dc_power_limits to take a day/night flag instead of solar zenith.

Strikes me as an improvement. I haven't looked at what it would change in tasks.

@wholmgren wholmgren added the bug Something isn't working label Sep 17, 2020