New zenodo api bandaid #2942

Merged: 3 commits, Oct 16, 2023

Conversation

jdangerx
Member

@jdangerx jdangerx commented Oct 16, 2023

Closes #2939.

  • Use the new record files API endpoint to get datapackage.json
  • Add a shim to reconstruct the legacy URLs that are in datapackage.json (i.e. pretend that we were using the "best, most stable URLs" the whole time)
  • Remove access keys, yay
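
A minimal sketch of the first two bullets, not the code from this PR. It assumes a files-listing payload shaped like `{"entries": [{"key": ..., "links": {"content": ...}}]}` (the shape the reviewed diff iterates over) and a descriptor whose resources carry `name` and `path` fields. The function names and the URL template in `stable_url` are hypothetical and used only for illustration.

```python
from typing import Any


def find_content_url(listing: dict[str, Any], filename: str) -> str:
    """Return the download link for `filename` from a files-listing payload."""
    for entry in listing["entries"]:
        if entry["key"] == filename:
            return entry["links"]["content"]
    raise KeyError(f"{filename} not found in record file listing")


def stable_url(record_id: int, filename: str) -> str:
    # ASSUMPTION: this URL template is illustrative; the PR does not state it.
    return f"https://zenodo.org/records/{record_id}/files/{filename}"


def rewrite_resource_paths(
    descriptor: dict[str, Any], record_id: int
) -> dict[str, Any]:
    """Return a copy of `descriptor` whose resource paths use stable URLs.

    The input descriptor is left unmodified, so the legacy URLs it was
    published with remain available for comparison.
    """
    resources = [
        {**res, "path": stable_url(record_id, res["name"])}
        for res in descriptor["resources"]
    ]
    return {**descriptor, "resources": resources}
```

Keeping the rewrite as a pure function over the parsed descriptor means the shim can be dropped later without touching the code that fetches the file.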

Member

@zaneselvans zaneselvans left a comment

Is this PR really supposed to be going into main rather than dev?

Another simplifying change I think we can make with the new API is getting rid of the API keys / tokens entirely, since it seems like read-only operations work fine without them (at least, curl works without them, which I don't think was the case before!)
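
To make the suggestion concrete, here is a small sketch of a request builder in which the token is optional. The helper name `build_request` is hypothetical, and the `Authorization: Bearer <token>` scheme is an assumption about how a token would be sent if one were still needed; whether anonymous reads succeed is something to confirm against the live API.

```python
import urllib.request
from typing import Optional


def build_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, attaching credentials only when a token is given."""
    req = urllib.request.Request(url)
    if token is not None:
        # ASSUMPTION: bearer-token header; not needed for anonymous reads.
        req.add_header("Authorization", f"Bearer {token}")
    return req
```

With no token supplied, the request carries no credentials at all, which matches the unauthenticated curl call described above.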

test/unit/settings_test.py (outdated, resolved)
src/pudl/workspace/datastore.py (outdated, resolved)
src/pudl/workspace/datastore.py (resolved)
src/pudl/workspace/datastore.py (outdated, resolved)
Comment on lines +288 to +274
for f in dpkg.json()["entries"]:
    if f["key"] == "datapackage.json":
        resp = self._fetch_from_url(f["links"]["content"])
Member

If our whole setup is contingent on being able to construct a path to a file given the filename and record ID, is there a reason to use the API to obtain the datapackage.json?

I guess if we want to verify the checksum on datapackage.json we need to get the checksum from Zenodo, since the file can't contain its own checksum.
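
The checksum point can be sketched as follows. This assumes each entry in the files listing carries a `checksum` field of the form `md5:<hex digest>`; that field name and format are assumptions to check against a real API response, and both helper names are hypothetical.

```python
import hashlib
from typing import Any


def checksum_for(listing: dict[str, Any], filename: str) -> str:
    """Look up the published checksum string for `filename`."""
    for entry in listing["entries"]:
        if entry["key"] == filename:
            # ASSUMPTION: entries expose a "checksum" field like "md5:<hex>".
            return entry["checksum"]
    raise KeyError(f"{filename} not found in record file listing")


def matches_checksum(content: bytes, expected: str) -> bool:
    """Compare downloaded bytes against an "md5:<hex>" checksum string."""
    algorithm, _, digest = expected.partition(":")
    if algorithm != "md5":
        raise ValueError(f"unsupported checksum algorithm: {algorithm!r}")
    return hashlib.md5(content).hexdigest() == digest
```

Because datapackage.json cannot contain its own digest, the expected value has to come from the listing even if the download URL itself is constructed locally.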

Member Author

My reason for doing this is that "keeping the datapackage.json around lets me not think too hard about how to construct a DatapackageDescriptor."

Member

Sorry if I was unclear. I meant: why not just pull down the datapackage.json by constructing its URL, rather than querying the API to get its URL? Having it around is clearly useful!

@jdangerx jdangerx changed the base branch from main to dev October 16, 2023 17:56
@jdangerx jdangerx marked this pull request as ready for review October 16, 2023 19:20
codecov bot commented Oct 16, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (8ba81e9) 88.5% compared to head (7db9f07) 88.5%.
Report is 5 commits behind head on dev.

Additional details and impacted files
@@          Coverage Diff          @@
##             dev   #2942   +/-   ##
=====================================
  Coverage   88.5%   88.5%           
=====================================
  Files         90      90           
  Lines      10797   10800    +3     
=====================================
+ Hits        9564    9568    +4     
+ Misses      1233    1232    -1     
Files                              Coverage Δ
src/pudl/workspace/datastore.py    77.6% <100.0%> (+0.5%) ⬆️


@jdangerx jdangerx merged commit b54dfb8 into dev Oct 16, 2023
19 checks passed
@jdangerx jdangerx deleted the new-zenodo-api-bandaid branch October 16, 2023 20:39
Successfully merging this pull request may close these issues.

Update datastore to read from Zenodo's new InvenioRDM API
2 participants