feat(pipeline): add an ability to auto truncate staging destination after load #1292
@rudolfix since we have two different concepts of "staging" (a staging destination and a staging dataset), maybe we should use those terms explicitly in config vars rather than just "staging", which could mean either.
@IlyaFaer in the ticket it says we want to truncate the staging dataset, not the staging destination. I also think that the setting should probably be part of the LoaderConfiguration and not be passed down from the pipeline. @rudolfix should we always truncate, or only if all load jobs were successful? If a merge job fails because of a connection error or something like that, I don't think we should truncate the staging tables, because you can retry the load.
@sh-rp we should truncate only when we complete the load id successfully, i.e. after the complete method is executed.
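To make the rule agreed on above concrete, here is a minimal sketch of "truncate only after the load completed successfully". It is not the code from this PR: `StagingDataset`, `complete_load`, the job state strings and the `truncate_staging_dataset` flag are all hypothetical names chosen for illustration, and the staging dataset is modelled as an in-memory dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StagingDataset:
    """Hypothetical in-memory stand-in for a staging dataset (not a dlt class)."""

    tables: Dict[str, List[dict]] = field(default_factory=dict)

    def truncate_all(self) -> None:
        # Remove all rows but keep the tables themselves.
        for rows in self.tables.values():
            rows.clear()


def complete_load(
    job_states: List[str],
    staging: StagingDataset,
    truncate_staging_dataset: bool,
) -> bool:
    """Return True when every job completed; truncate staging only in that case.

    The flag name and the "completed" state string are assumptions.
    """
    all_ok = all(state == "completed" for state in job_states)
    if all_ok and truncate_staging_dataset:
        staging.truncate_all()
    # On failure the staging tables are left untouched so the load can be retried.
    return all_ok
```

With this shape, a failed merge job leaves the staging rows in place, which matches the retry concern raised above.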
One thing that I didn't manage to solve is how to patch the …
OK almost there, see review and
- add info on the new config, i.e. to the "Data left behind" section of running.md
- we need to enable this behavior for one of the merge tests, i.e. test_pipeline_upfront_tables_two_loads, and make sure that all tables in the staging dataset are truncated. This is to test it on all destinations.
@rudolfix, okay, what do we do if the client doesn't have …
yeah, because qdrant does not use the staging dataset for merge! look at my comments: client must implement …
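One way to handle clients that have no staging dataset is a capability check before truncating. The sketch below is only an illustration: the interface name that the comment above refers to is cut off in this thread, so `SupportsStagingDataset`, `truncate_staging_tables`, `SqlLikeClient` and `VectorLikeClient` are hypothetical names, not the ones used in the PR.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class SupportsStagingDataset(Protocol):
    """Hypothetical capability: the client can truncate its staging tables."""

    def truncate_staging_tables(self) -> List[str]: ...


class SqlLikeClient:
    """Hypothetical client that merges through a staging dataset."""

    def __init__(self, staging_tables: List[str]) -> None:
        self.staging_tables = list(staging_tables)

    def truncate_staging_tables(self) -> List[str]:
        # A real client would issue truncate statements here;
        # this stand-in only reports which tables it would truncate.
        return list(self.staging_tables)


class VectorLikeClient:
    """Hypothetical client with no staging dataset (as described for qdrant)."""


def maybe_truncate(client: object) -> List[str]:
    """Truncate staging tables only for clients that expose the capability."""
    if isinstance(client, SupportsStagingDataset):
        return client.truncate_staging_tables()
    # Clients without a staging dataset are skipped instead of failing.
    return []
```

The check is structural, so a client opts in simply by implementing the method, and destinations without a staging dataset need no changes.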
@rudolfix, hm, some tests fail for Dremio. It seems to me that staging is always used there. Does it mean that the staging dataset is not used? I don't see credentials for Dremio in the Secret Manager, so I can't test locally what's going on there.
we only have a local container and there are instructions. take a look in the dremio folder in destinations. but I think you won't need it if you fix the PR. see review
LGTM!
LGTM!
Closes #1104