[JOSS] Solar position calculation introduces time shift error for ERA5 #158

Closed
kandersolar opened this issue Jun 13, 2021 · 8 comments · Fixed by #199

Comments

@kandersolar

I think, when simulating PV production, the solar position calculation is performed for the edge of each weather interval, introducing an effective time shift error when simulating solar output using interval-averaged weather like ERA5.

Note that I am new to ERA5 (and atlite), so it's possible that my understanding of its conventions is incorrect.

Description

I don't have code to reproduce, but this is the reasoning (please check that my understanding is actually correct!):

There appears to be a small timeshift in the plot of Cell 16 in the historic-comparison-germany notebook -- it may be from this issue, although I did not think through which direction the time shift error is and whether it matches the shift in that notebook.

Expected Behavior

A better alternative is to calculate solar position in the middle of the interval instead of one edge. It may also be desirable to handle the sunrise and sunset hours separately (see Section 4.3 here: https://www.nrel.gov/docs/fy15osti/64102.pdf) though that level of detail might be out of scope here.

Actual Behavior

Sun position is calculated for the end of the interval, which is not as representative of the average position across the interval.
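To get a feel for the size of the effect, here is a minimal sketch that compares solar elevation at the end of an hourly interval with the elevation at its midpoint. It is not atlite code: the function names are made up for illustration, declination uses Cooper's approximation, and the equation of time and refraction are ignored.

```python
import math

def declination_deg(day_of_year):
    # Cooper's approximation of the solar declination (degrees).
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))

def solar_elevation_deg(lat_deg, day_of_year, solar_time_hours):
    # Simplified geometry: no equation of time, no atmospheric refraction.
    lat = math.radians(lat_deg)
    dec = math.radians(declination_deg(day_of_year))
    hour_angle = math.radians(15.0 * (solar_time_hours - 12.0))
    sin_elev = (math.sin(lat) * math.sin(dec)
                + math.cos(lat) * math.cos(dec) * math.cos(hour_angle))
    return math.degrees(math.asin(sin_elev))

# Interval 07:00-08:00 local solar time, labelled "08:00" (hypothetical site at 50°N).
edge = solar_elevation_deg(50.0, 172, 8.0)    # position at the interval end
middle = solar_elevation_deg(50.0, 172, 7.5)  # position at the interval midpoint
print(f"edge: {edge:.1f} deg, middle: {middle:.1f} deg")
```

In the morning the edge value sits above the midpoint value, and in the afternoon below it, which is the kind of asymmetry a time shift would produce.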

Error Message

This is not a programming error, just a conceptual one.

Your Environment

  • The atlite version used: 0.2.5.dev0+g97ff81c.d20210613
  • How you installed atlite (conda, pip or github): github
  • Operating System: Windows
  • My environment: not relevant, but happy to provide if necessary

xref openjournals/joss-reviews#3294

@FabianHofmann
Contributor

@kanderso-nrel thanks again for this helpful comment.

Regarding the implementation:
In contrast to ERA5, the SARAH data provides instantaneous data points. So, when merging SARAH with ERA5, we should align to the ERA5 convention. The same goes for the solar positions.

@EdgarUbaldo

Hello @kanderso-nrel ,

We have also noticed this issue: https://confluence.ecmwf.int/display/OIFSUF/Solar+position+and+DNI+calculations+using+the+hourly+time+index+of+the+reanalysis-era5-single-levels+dataset

I posted a question in the ERA5 support community portal.

Is there any explanation on your side?

Best regards,
Edgar

@kandersolar
Author

Hi @EdgarUbaldo, I'm not sure I understand your question. The problem is that calculating solar position for the interval edge does not produce a solar position that is representative of the interval average. It looks like you already have a solution: apply a 30-minute shift when calculating solar position so that it is calculated for the center of the interval rather than the edge. My understanding of ERA5's interval labeling and such is at the top of this thread, but please note that I am not an ERA5 expert by any means.

@EdgarUbaldo

EdgarUbaldo commented Oct 13, 2021

@kanderso-nrel

Yeah, kind of.

Together with a colleague, we noticed it and found this way to work around it. I was looking for arguments to support our -30min time shift because, to me, if anything, a +30min time shift seemed more intuitive. This is why I asked whether you had any explanation, since your suggestion was related.

However, after I posted the issue in the ERA5 forum, I realized that a -30min actually makes sense because the values in this particular dataset are "... accumulated over a particular time period which depends on the data extracted" and not an average. Thus, and considering that the dataset is right-labeled (values from 13:00-14:00 are labeled as 14:00 as per the documentation: "accumulations are over the hour (the accumulation/processing period) ending at the validity date/time"), a -30min time shift would be correct.

I am more confident now of this -30min time shift. However, it would need to be cross-checked with a "ground truth" value from another dataset or in situ data to be 100% certain. I'm not sure that we have time for that. Hopefully, others intervene in this or the other forum.
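The labelling argument above can be sketched as a small helper that maps an interval label to the interval midpoint. This is an illustration only: `interval_midpoint` and its parameters are hypothetical names, and it assumes the right-labelled convention quoted from the documentation.

```python
from datetime import datetime, timedelta

def interval_midpoint(label_time, interval=timedelta(hours=1), label="right"):
    """Return the midpoint of the interval that `label_time` refers to."""
    half = interval / 2
    if label == "right":   # label marks the END of the interval (ERA5 accumulations)
        return label_time - half
    if label == "left":    # label marks the START of the interval
        return label_time + half
    raise ValueError(f"unknown label convention: {label!r}")

stamp = datetime(2021, 6, 13, 14, 0)  # covers 13:00-14:00 when right-labelled
print(interval_midpoint(stamp))       # 2021-06-13 13:30:00
```

With a 30-minute interval the same helper gives a -15 minute shift instead of -30 minutes.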

@lukas-rokka

I made this gif using atlite, where you can also see the shift error. Applying either a half-hour shift or a 7.5 degree shift in longitude works for this issue. (There exist more complex methods for integrating over a time interval that are slightly more correct and can be found in the reference literature of ERA5, but I think they would be overkill with hourly data, considering the high uncertainty that already exists in the solar radiation data.)

You can see how I solved it here at lines 156 to 166.

@lukas-rokka

It's the equation at line 62 in solar_position.py that needs to be modified. For hourly data, 7.5° needs to be subtracted from the lstm variable (or, more generally, -7.5*time_interval_in_hours needs to be added if the code should also handle data intervals other than hourly).

A workaround for those who just want to do PV simulations with the current version is to subtract 7.5° from the longitude coord before the PV calculation with the following code: cutout.data.assign_coords(lon = cutout.data.lon - 7.5).
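Where the 7.5° comes from: the Earth turns 15° of longitude per hour, so half an hourly interval corresponds to 7.5°. A minimal sketch of that conversion for arbitrary interval lengths follows. The function name is hypothetical and the plain list of longitudes only stands in for a real coordinate array.

```python
EARTH_ROTATION_DEG_PER_HOUR = 15.0  # 360 degrees / 24 hours

def longitude_offset_deg(time_interval_in_hours=1.0):
    """Longitude offset equivalent to moving time back by half an interval."""
    return -0.5 * EARTH_ROTATION_DEG_PER_HOUR * time_interval_in_hours

lons = [5.0, 10.0, 15.0]  # stand-in for the longitude coordinate
shifted = [lon + longitude_offset_deg(1.0) for lon in lons]
print(shifted)
```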

@euronion
Collaborator

euronion commented Nov 1, 2021

As @FabianHofmann mentioned, cutouts based on SARAH data use a different convention and should therefore not be affected.

@lukas-rokka Could you post a link to the more complex methods from the ERA5 reference literature?

I wonder if it is better to

  • (Option 1) shift the solar position from X:00 to (X-1):30, or
  • (Option 2) to disaggregate the accumulated hourly data into half-hour data and reaggregate them, such that e.g. the originally labelled "14:00" which spans (13:00, 14:00] is reaggregated to contain values for (13:30, 14:30], "15:00" containing (14:30, 15:30] and so on. The solar movement is pretty deterministic and we might be able to calculate weights from it, so this might also be an option.
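Option 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: each hourly accumulation is split into a first and a second half using a weight (0.5 by default, where sun-geometry-based weights could be plugged in), and neighbouring halves are then recombined so that each value is centred on its label. The function name and the weighting scheme are hypothetical.

```python
def recentre_accumulations(acc, first_half_weights=None):
    """Re-aggregate right-labelled accumulations so they are centred on their labels.

    acc[i] covers (t_i - 1h, t_i]; result[i] covers (t_i - 30min, t_i + 30min].
    The last label is dropped because its following half hour is unknown.
    """
    n = len(acc)
    if first_half_weights is None:
        first_half_weights = [0.5] * n  # naive assumption: even split within the hour
    if len(first_half_weights) != n:
        raise ValueError("need one weight per accumulation value")
    first = [w * a for w, a in zip(first_half_weights, acc)]  # first half of each hour
    second = [a - f for a, f in zip(acc, first)]              # second half of each hour
    return [second[i] + first[i + 1] for i in range(n - 1)]

hourly = [2.0, 4.0, 6.0]  # e.g. labelled 13:00, 14:00, 15:00
print(recentre_accumulations(hourly))
```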

@FabianHofmann We might be able to implement the shift in solar position in a dataset-agnostic way by calculating the solar position already during the cutout.prepare(...) step and saving the solar position with the cutout:

  • for SARAH solar irradiation data no adjustment would be necessary
  • for ERA5 the solar position would be calculated using the time/longitude shift suggested by @lukas-rokka

(Side effect: this should significantly accelerate .pv(..) time-series generation, as the solar position is only calculated once per cutout.)

@lukas-rokka

See Chapter 2 in the IFS Cy41r2 documentation for details about how the radiation is calculated in ERA5. I didn't find the temporal integration part just now, but to my understanding the more advanced integration methods made sense when the radiation scheme was calculated at 3-hour intervals.
(Another thing that can be of interest from the documentation is that the radiation fluxes are calculated on a coarser spatial resolution than other variables in ERA5/IFS Cy41r2.)

I think SARAH data comes as 30-minute instantaneous values, so it needs the solar position to be calculated at both the half and the full hour (13:00, 13:30, 14:00 etc.). So it should be possible to use the same solar position data for both ERA5 and SARAH if it is in 30-minute intervals.

Not sure I understood option 2. But even if you first interpolate ERA5 to 30 minutes, you still need to do a time shift for the solar position calculation; it should then be a -15 minutes shift instead of -30 minutes.

Also, if you decide to calculate the solar position data once as a preparation step, as a time * lon * lat sized array, it can be worth considering that the data is only unique in the time * lat dimensions. E.g. array[time=yyyy-mm-dd 13:00, lon=15, lat=0] will hold the same solar position data as array[time=yyyy-mm-dd 12:00, lon=0, lat=0], and the -30 minutes shifted solar position data should be at array[time=yyyy-mm-dd 13:00, lon=7.5, lat=0] (I might have messed up the direction there, but it should be easy to test).
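A quick way to test the direction is a simplified hour-angle model, sketched below. It assumes UTC time stamps and longitudes counted positive eastwards, and it ignores the equation of time and the slow change of declination within the hour. `hour_angle_deg` is a hypothetical helper, not a function from atlite.

```python
import math

def hour_angle_deg(utc_hours, lon_deg):
    # Local solar time = UTC + lon/15 (east positive); hour angle = 15 deg per hour from noon.
    return 15.0 * (utc_hours - 12.0) + lon_deg

# Same hour angle: one hour later in UTC is equivalent to 15 degrees further east.
same_a = hour_angle_deg(13.0, 0.0)
same_b = hour_angle_deg(12.0, 15.0)

# A -30 minute shift at lon=0 matches the unshifted time at lon=-7.5.
shifted_time = hour_angle_deg(12.5, 0.0)
shifted_lon = hour_angle_deg(13.0, -7.5)

print(math.isclose(same_a, same_b), math.isclose(shifted_time, shifted_lon))
```

Under these assumptions the pairing comes out the other way round from the example above (13:00 at lon=0 matches 12:00 at lon=15, and the shifted position sits at lon=-7.5), which fits the caveat about the direction and agrees with subtracting 7.5° from the longitude.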

5 participants