Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add get_srml iotools function; deprecate read_srml_month_from_solardat #1779

Merged
merged 13 commits into from Jun 29, 2023

Conversation

AdamRJensen
Copy link
Member

@AdamRJensen AdamRJensen commented Jun 19, 2023

  • I am familiar with the contributing guidelines
  • Tests added
  • Updates entries in docs/sphinx/source/reference for API changes.
  • Adds description and name entries in the appropriate "what's new" file in docs/sphinx/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
  • New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
  • Pull request is nearly complete and ready for detailed review.
  • Maintainer: Appropriate GitHub Labels (including remote-data) and Milestone are assigned to the Pull Request and linked Issue.

This PR adds a function pvlib.iotools.get_srml which eventually will replace the current pvlib.iotools.read_srml_month_from_solardat which does not follow the iotools convention.

The new function features a start/end parameter which allows for retrieving multiple months of data. The function also returns a tuple of (data, metadata) and exposes the url parameter.

Ideally, the pvlib.iotools.read_srml_month_from_solardat will begin its deprecation cycle in this PR.

@AdamRJensen AdamRJensen added io remote-data triggers --remote-data pytests labels Jun 19, 2023
@AdamRJensen AdamRJensen added this to the 0.10.0 milestone Jun 19, 2023
@wholmgren
Copy link
Member

I haven't read this PR in detail, but FYI... there was some reason that we went through fetch gymnastics with SRML when we added it to pvlib and solar forecast arbiter. I'd make sure that the equivalent of this test passes.

@AdamRJensen AdamRJensen added remote-data triggers --remote-data pytests and removed remote-data triggers --remote-data pytests labels Jun 19, 2023
Copy link
Member

@kandersolar kandersolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd make sure that the equivalent of this test passes.

This PR delegates the year/month business to pd.date_range instead of DIYing like the Arbiter does, so I think we can probably get away without that level of testing, although considering what happens when end < start may be worthwhile... :)

except urllib.error.HTTPError:
warnings.warn(f"The following file was not found: {f}")

data = pd.concat(dfs, axis='rows')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about following this with a data = data.loc[start:end]? Right now I get back a full month even if I request just a single day, which isn't really ideal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not opposed to this suggestion, although we may run into some issues related to timezone. I think this is why the start/end parameters in Solar Forecast Arbiter are required to be timezone localized - that seems like a hassle though.

There already exist several functions in pvlib that return a full month when requesting a single day btw, e.g. get_bsrn, and probably others too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that's a good point. Up to you on what is best here. If we keep the current behavior of returning complete months, might be worth a note in the docstring.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not opposed to this suggestion, although we may run into some issues related to timezone. I think this is why the start/end parameters in Solar Forecast Arbiter are required to be timezone localized - that seems like a hassle though.

This sounds right. And a complicating factor for SRML was nighttime 0s in the future if we requested a day from the current month.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a raw data file for the current month that shows -999 and similar for future dates.

image

Here's a raw data file that includes -999 and similar as well as 0s

image

This caused a problem in SFA SolarArbiter/solarforecastarbiter-core#572 and it's reasonable to expect that it would cause a problem with other user code. I don't know if that's pvlib's problem to solve, but I think it's somewhat more likely to come up with this new function that accepts datetimes instead of entire months.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how the new function makes this more of an issue, currently, I would still request the same months with the old function, but it would just be more manual work.

I do think that it deserves a Warning entry and perhaps we can also implement a line that cuts off future data? For example:

data = data.loc[:pd.Timestamp.today(), :]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think many users would have different expectations of the new function that accepts start, end datetimes than the old function that accepts a year and a month.

Thanks for adding the warning. I'd rather see the previously rejected start:end slicing than .today() slicing. I'm also fine with just adding the warning and seeing if users complain.

pvlib/iotools/srml.py Show resolved Hide resolved
docs/sphinx/source/whatsnew/v0.10.0.rst Outdated Show resolved Hide resolved
pvlib/iotools/srml.py Outdated Show resolved Hide resolved
pvlib/tests/iotools/test_srml.py Show resolved Hide resolved
pvlib/tests/iotools/test_srml.py Outdated Show resolved Hide resolved
pvlib/iotools/srml.py Outdated Show resolved Hide resolved
docs/sphinx/source/whatsnew/v0.10.0.rst Outdated Show resolved Hide resolved
AdamRJensen and others added 2 commits June 22, 2023 00:38
Co-authored-by: Kevin Anderson <kevin.anderso@gmail.com>
@kandersolar kandersolar added remote-data triggers --remote-data pytests and removed remote-data triggers --remote-data pytests labels Jun 23, 2023
pvlib/tests/iotools/test_srml.py Show resolved Hide resolved
pvlib/iotools/srml.py Show resolved Hide resolved
except urllib.error.HTTPError:
warnings.warn(f"The following file was not found: {f}")

data = pd.concat(dfs, axis='rows')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that's a good point. Up to you on what is best here. If we keep the current behavior of returning complete months, might be worth a note in the docstring.

pvlib/iotools/srml.py Outdated Show resolved Hide resolved
@AdamRJensen AdamRJensen added remote-data triggers --remote-data pytests and removed remote-data triggers --remote-data pytests labels Jun 25, 2023
Copy link
Member

@kandersolar kandersolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple very minor things below, but otherwise LGTM!



def get_srml(station, start, end, filetype='PO', map_variables=True,
url="http://solardat.uoregon.edu/download/Archive/"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about minor consistency details, I guess we could standardize the order of map_variables and url. Looking through other functions, here's what I see:

  • map_variables first: get_cams
  • url first: get_psm3, get_pvgis_hourly, get_pvgis_tmy

#1767 currently does map_variables first but of course can be changed easily.

I don't see one as much better than the other. I guess I'd favor map_variables first since I think it is very uncommon that people will want to mess with url, but maybe map_variables=False has some somewhat common uses. @AdamRJensen what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually thought about discussing this for your PR, but I guess I don't have a strong opinion except for consistency. But the logic of having URL last makes sense to me - it's just more of a hassle changing this as there are more existing functions with URL first.

I opened #1791 based on this discussion fyi.

pvlib/tests/iotools/test_srml.py Outdated Show resolved Hide resolved
@kandersolar kandersolar added remote-data triggers --remote-data pytests and removed remote-data triggers --remote-data pytests labels Jun 28, 2023
Co-authored-by: Kevin Anderson <kevin.anderso@gmail.com>
@kandersolar kandersolar added remote-data triggers --remote-data pytests and removed remote-data triggers --remote-data pytests labels Jun 29, 2023
@kandersolar
Copy link
Member

Seems like the stated concerns are adequately addressed at this point, so I will go ahead and merge. Thanks @AdamRJensen and @wholmgren :)

@kandersolar kandersolar changed the title Add get_srml iotools function Add get_srml iotools function; deprecate read_srml_month_from_solardat Jun 29, 2023
@kandersolar kandersolar merged commit 81e5593 into pvlib:main Jun 29, 2023
35 checks passed
@AdamRJensen AdamRJensen deleted the get_srml_v2 branch June 29, 2023 19:41
@wholmgren
Copy link
Member

Thanks @AdamRJensen and @kandersolar!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
io remote-data triggers --remote-data pytests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants