-
Notifications
You must be signed in to change notification settings - Fork 945
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add function to retrieve example dataset paths #1763
base: main
Are you sure you want to change the base?
Conversation
This issue is related to #1056. However, I haven't addressed any of it since in first place I don't know which purpose each file has. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @echedey-ls :)
there isn't a tools.rst file, so I believe it doesn't need referencing.
It's true that we don't usually document the functions in pvlib.tools
, but in this case I think we probably should so that curious users can see what the function does and doesn't do. How about a new entry list at the bottom of the iotools page, something like this:
Functions for locating the example data files included in pvlib.
.. autosummary::
:toctree: generated/
tools.locate_example_dataset
If we do that, the function will need a complete docstring (parameter descriptions, etc).
Co-Authored-By: Kevin Anderson <57452607+kandersolar@users.noreply.github.com>
Co-Authored-By: Kevin Anderson <57452607+kandersolar@users.noreply.github.com>
Co-Authored-By: Kevin Anderson <57452607+kandersolar@users.noreply.github.com>
Co-Authored-By: Kevin Anderson <57452607+kandersolar@users.noreply.github.com>
Co-Authored-By: Kevin Anderson <57452607+kandersolar@users.noreply.github.com>
Here is the documentation link [UPDATED]. |
Co-Authored-By: Kevin Anderson <57452607+kandersolar@users.noreply.github.com>
This should be ok now. Edit: oh, and I ended up adding the parameter, since it won't really impact the use, someone more experienced may be able to organize the data folder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! @echedey-ls can you resolve the merge conflict? After that I think this will be ready to go.
Maybe @wholmgren would like to review as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about using 1 public function to do three things:
- load resources for pvlib code
- load resources for test code
- load resources for documentation code
The function name get_test_dataset_path
implies that it's only for tests, yet it's used for all 3 of those items. Seems ripe for problems down the road.
I also don't think we need this, at least not until we clean up our data directories and make it clear what is public and what is not. I think the "preferably one way to do it" is now to use importlib.resources.files
. Example below
In [3]: import importlib.resources
In [6]: importlib.resources.files('pvlib') / 'data'
Out[6]: WindowsPath('C:/Users/WILHOL/Miniconda3/envs/pvdeg/lib/site-packages/pvlib/data')
In [7]: tmy3_filepath = importlib.resources.files('pvlib') / 'data' / '723170TYA.CSV'
In [8]: tmy3_filepath.exists()
Out[8]: True
pvlib/clearsky.py
Outdated
@@ -189,8 +188,7 @@ def lookup_linke_turbidity(time, latitude, longitude, filepath=None, | |||
# 1st column: 179.9583 W, 2nd column: 179.875 W | |||
|
|||
if filepath is None: | |||
pvlib_path = os.path.dirname(os.path.abspath(__file__)) | |||
filepath = os.path.join(pvlib_path, 'data', 'LinkeTurbidities.h5') | |||
filepath = get_test_dataset_path('LinkeTurbidities.h5') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is definitely not test data
Good to know about
I suggested including But I don't see why one function for three types of data files might cause problems later on. If all the function does is |
There's an importlib-resources backport for older python. We already use this approach for I think your original list of pros to a function are still valid. There the focus is on a function for example code and maybe user onboarding that separates public resources from package structure. If the user-facing function only does that then it might be a local win. On the other hand, it can also be good to propagate standard python patterns in example code so it's less clear that it's a global win. I think the internal code should keep it simple and stick to importlib-resources. |
I understand the issue. We can limit the scope of this function until #1056 is addressed.
@kandersolar what is your opinion on renaming it to I'd like to bring attention to two points:
|
This should be fine to go. Renamed to |
It seems to use the same
Yes, see #473. There is a non-pvlib discussion here on whether including tests in sdists and wheels is a good idea in general (tl;dr opinions vary). |
That's good to know @kandersolar ! |
Done:
docs/sphinx/source/reference
for API changes.About this one, there isn't atools.rst
file, so I believe it doesn't need referencing.I'll wait for milestone assignment:
docs/sphinx/source/whatsnew
for all changes. Includes link to the GitHub Issue with:issue:`num`
or this Pull Request with:pull:`num`
. Includes contributor name and/or GitHub username (link with:ghuser:`user`
).remote-data
) and Milestone are assigned to the Pull Request and linked Issue.Adds a
get_example_dataset_path()
function topvlib.tools
that returns a filepath to a dataset bundled with pvlib. See linked issue above.Here is the documentation link [UPDATED 2023-06-18].