New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fill out query params for Stocksnap DAG, allow restarts #4102
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requesting clarification on the changed test; approving in case page 0 is meaningless and we are intentionally changing this to skip that page now.
expect_result = "https://stocksnap.io/api/load-photos/date/desc/0" | ||
expect_result = "https://stocksnap.io/api/load-photos/date/desc/1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is surpising... does this difference matter? Is page 0 a meaningful page that we need to retrieve, different from page 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is actually a more semantically appropriate tests - the way the class was set up previously is that it would produce 0
if that attribute was accessed directly after construction, but when it was actually called it would be after the first get_next_query_parameters
, which would always increment the value and set it to 1
. We want page 1
and need to start there, this change just more appropriately reflects that as the starting page (rather than relying on the expected ingestion flow which would increment it before actually calling the endpoint).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
Fixes
Related to #4101
Description
This PR adds some machinery around the
query_params
handling for the Stocksnap DAG so that we can leverage the options we already have available for query parameters for manipulating the page(s) that the Stocksnap DAG operates on. This doesn't address the error described in the above issue, but it should make it much easier to reproduce (since we should now receive a specific page for where the run failed rather than having to wait 9 hours!).Testing Instructions
just catalog/pgcli
andSELECT count(*) from images;
to verify that the ingestion limit records were loaded.initial_query_params
set to{"page": 100}
. Use the same commands as above to verify that there are now twice as many records (e.g. new data and not the same page was loaded).raise ValueError("Whoops")
in theget_record_data
function of the Stocksnap ingester, and check that the error logs show which page failed.[{"page": 5}, {"page": 10}]
) to trigger a new DAG to only run over a few different sets of parameters.Checklist
Update index.md
).main
) or a parent feature branch.Developer Certificate of Origin
Developer Certificate of Origin