Add end_value support to incremental #467
PR looks good but the whole concept raises some questions:
- when loading a concrete chunk we should IMO not modify the state at all. We may read the state to get e.g. last_value or initial_value, but I think it's better to require that the initial value is always present when the end value is present, and to always create a mock state
- this lets you run several chunks in parallel. The state is never written to (and possibly also not read)
- the incremental loading will start when the user calls the resource without passing an end date. It is the user's responsibility to pass the correct initial value (which is the highest end date used by back loading). Interestingly, the incremental load can also happen in parallel
What do you think?
Yeah, I agree. And 👍 on requiring initial value. It should also error when the initial value is higher, i.e.
@rudolfix updated like this in the last commit. Using a dummy state dict for this, and added a test to make sure the state isn't updated.
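The behaviour agreed on in this thread can be sketched in plain Python. This is only an illustration under stated assumptions, not dlt's implementation: the function name `load_chunk`, the `updated_at` cursor field, the half-open range `[initial_value, end_value)`, and the requirement that rows arrive sorted by cursor are all made up for the example.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def load_chunk(
    rows: Iterable[Dict[str, Any]],
    state: Dict[str, Any],
    initial_value: Optional[int],
    end_value: int,
    cursor: str = "updated_at",  # hypothetical cursor field name
) -> Iterator[Dict[str, Any]]:
    """Yield rows with initial_value <= row[cursor] < end_value.

    The real ``state`` is deliberately never read or written;
    a throwaway dict stands in for it while the chunk is loaded.
    """
    if initial_value is None:
        raise ValueError("initial_value is required when end_value is given")
    if initial_value > end_value:
        raise ValueError("initial_value must not be higher than end_value")

    # Dummy state: tracks progress for this chunk only, then is discarded.
    mock_state = {"initial_value": initial_value, "last_value": initial_value}

    for row in rows:
        value = row[cursor]
        if value >= end_value:
            # Assumes rows are sorted by cursor, so nothing later can match.
            return
        if value >= mock_state["initial_value"]:
            mock_state["last_value"] = max(mock_state["last_value"], value)
            yield row


# Two chunks loaded independently; the stored state is left untouched.
rows = [{"updated_at": i} for i in range(10)]
state = {"last_value": 100}
first = list(load_chunk(rows, state, initial_value=0, end_value=5))
second = list(load_chunk(rows, state, initial_value=5, end_value=10))
```

Because neither call touches `state`, the two chunks could in principle run in any order or in parallel, which is the point raised in the review.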
* Stop the generator when end_value is reached
* Override last_value with initial_value while loading end_value
* Add `Incremental.merge` method for simpler overrides
9fb9971 to 5f31a42
Excellent! I've added a short paragraph on end_value to our docs.
Implements chunked loading support described in dlt-hub/verified-sources#197 (comment)
Usage e.g.
When `end_value` is reached the generator is stopped and the incremental state gets updated with either `end_value` or the previous `last_value` (whichever is higher, or lower, etc., depending on `last_value_func`).

So it's possible to load chunks in any order and then stop sending the `end_value` arg to continue the incremental load, and vice versa.

Decided on using `initial_value` instead of adding another "start" argument. When `end_value` is supplied, the initial value acts as an override.
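To illustrate why the state update described in this PR description makes chunk order irrelevant, here is a small sketch. It is not the PR's code: `merge_chunk_into_state` is a hypothetical helper, the state is modelled as a plain dict with a `last_value` key, and `last_value_func` is assumed to accept a sequence of values, like the built-in `max` and `min`.

```python
from typing import Any, Callable, Dict, Sequence


def merge_chunk_into_state(
    state: Dict[str, Any],
    end_value: Any,
    last_value_func: Callable[[Sequence[Any]], Any] = max,
) -> Dict[str, Any]:
    """Record a finished chunk by combining its end_value with the stored last_value."""
    previous = state.get("last_value")
    # On the first chunk there is nothing to compare against yet.
    candidates = [end_value] if previous is None else [previous, end_value]
    state["last_value"] = last_value_func(candidates)
    return state


# Chunks finished out of order still leave the highest end_value in state.
state: Dict[str, Any] = {}
merge_chunk_into_state(state, end_value=20)
merge_chunk_into_state(state, end_value=10)

# With min as last_value_func, the lowest value is kept instead.
descending: Dict[str, Any] = {}
merge_chunk_into_state(descending, end_value=20, last_value_func=min)
merge_chunk_into_state(descending, end_value=10, last_value_func=min)
```

After the chunks are done, a later call without `end_value` would pick up from whatever `last_value` the function has left in the state.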