Tw/reeds cols #187
…CRS warning in outlier finding step.
Great little change! Always love additional flexibility for end-users.
Just a few comments to help us keep our documentation clean and up-to-date.
Thanks for this, Travis!
reVX/utilities/reeds_cols.py
Outdated
By default, ``None``.
The first key is "data_fp", and it points either a dictionary of
new field/new value pairs or to the path where the extra data is being
extracted from. This must be a dictionary, an HDF5, or JSON
Can we update this text to something like:
"The first key is "data_fp", and it points either to a dictionary of new field/new value pairs or to the path where the extra data is being extracted from. The latter must be an HDF5 or JSON file (i.e., if not a dictionary, it must end in ".h5" or ".json")."
Basically I'm just trying to avoid saying "dictionary" so many times.
reVX/utilities/reeds_cols.py
Outdated
of the input ``data_frame``. For HDF5 data, the datasets must be 1D
datasets, and they will be merged with the input ``data_frame``
on ``merge_col`` (column must be in the HDF5 file meta). By default,
``None``.
Can you add documentation for the regions input parameter at the end of this docstring? It doesn't have to be anything long or fancy, but this is the text that gets put into the documentation for this function, so it's important that we have all of our parameters documented.
reVX/utilities/reeds_cols.py
Outdated
if data_fp.endswith(".json"):
if isinstance(data_fp, dict):
extra_data = data_fp
elif str(data_fp).endswith(".json"):
Is this cast to str (and the one below) really necessary? This value should either be a string or a dictionary, and we already check for a dictionary above. Any other kind of input should error out here.
It could be a PosixPath; I can add that to the docs.
Ah gotcha. I don't think you need to update the docs then, unless you really want to
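For readers following the thread, here is a minimal sketch of why the `str` cast matters. `pathlib.Path` objects have no `endswith` method, so casting first lets one check cover both strings and paths. The helper name `load_extra_data` is hypothetical and is not the function in `reeds_cols.py`; the HDF5 branch is left out to keep the sketch standard-library only.

```python
import json
import tempfile
from pathlib import Path


def load_extra_data(data_fp):
    """Hypothetical helper: return extra data from a dict or a JSON path."""
    if isinstance(data_fp, dict):
        # Dictionaries are used as-is; no file access is needed.
        return data_fp
    # str() accepts both plain strings and pathlib.Path objects.
    # Path has no .endswith method, so the cast avoids an AttributeError.
    if str(data_fp).endswith(".json"):
        with open(data_fp, "r") as fh:
            return json.load(fh)
    # Anything else is rejected (an ".h5" branch is omitted in this sketch).
    raise ValueError(f"Unsupported 'data_fp' input: {data_fp!r}")


# Usage: the same file is readable via a Path or via a plain string.
with tempfile.TemporaryDirectory() as tmp:
    json_fp = Path(tmp) / "extra.json"
    json_fp.write_text(json.dumps({"new_field": 1}))
    from_path = load_extra_data(json_fp)
    from_str = load_extra_data(str(json_fp))
```

An alternative with the same effect would be to compare `Path(data_fp).suffix` against the expected extension.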
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #187 +/- ##
=======================================
Coverage 81.36% 81.36%
=======================================
Files 113 113
Lines 13369 13381 +12
=======================================
+ Hits 10878 10888 +10
- Misses 2491 2493 +2
@WilliamsTravis I went ahead and made some of the requested updates and then merged code to fix the tests. Once they all pass, this branch should be good to merge. One thing I ended up doing is changing the dictionary key name from |
There were potential problems in the remote county file read and the reprojection step (both potentially network-related, since pyproj uses a network connection too), so this tweak allows the user to pass their own local region file through add_reeds_cols. This also has the potential to significantly speed up the routine, depending on how the user treats this file.
I also turned off a CRS warning when finding the centroid for the outliers later. This might be a bit contentious, but since these are used to handle what is likely just a handful of outliers (sc points that fall outside the boundaries of regions), it seems like a trivial warning. The best way would be to reproject the files into a local area projection, but that could take significantly more runtime and it would be difficult to find the most appropriate system for each region.
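As a rough illustration of silencing one specific warning without muting everything else, the sketch below uses the standard-library `warnings` module. It is not the code from this PR: `centroid_with_crs_warning` is a hypothetical stand-in for the real centroid call, and the sketch assumes the warning is a `UserWarning` whose message starts with the text shown.

```python
import warnings


def centroid_with_crs_warning(point):
    """Hypothetical stand-in for a centroid call that warns about the CRS."""
    warnings.warn("Geometry is in a geographic CRS.", UserWarning)
    return point


def quiet_centroid(point):
    """Run the centroid call with only the CRS warning suppressed."""
    with warnings.catch_warnings():
        # The filter applies only inside this block and matches only
        # UserWarnings whose message starts with the given text.
        warnings.filterwarnings(
            "ignore",
            message="Geometry is in a geographic CRS",
            category=UserWarning,
        )
        return centroid_with_crs_warning(point)
```

Scoping the filter to a `catch_warnings` block keeps the suppression local, so the same warning raised elsewhere in the routine would still be shown.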
I also tweaked the function to add a dictionary option for the 'data_fp' argument in 'extra_data' so the user can do everything in one Python script without having to write JSON files.