Add truncate table (before copy) option to S3ToRedshiftOperator #9246
feluelle
left a comment
I am not sure all of that functionality should be in a transfer operator. It gets really complex. Where do we draw the line?
|
I think the functionality you added is useful, but it can also be done in separate tasks, which makes it easier to monitor, i.e. you can check every step in the UI. |
|
Got it. So I will remove from the PR the functionality for deleting the S3 file, checking for duplicates, and executing previous queries, and leave the other things. Is that okay? |
By different tasks I mean different Airflow tasks, not different PRs. |
|
I think also the "truncate" functionality can be done by a different PostgresOperator task. WDYT? |
|
While I agree that the rest of the functionality can be done in separate operators (for instance, you may want to delete several S3 files at the end of your DAG, or you may not want to check for duplicates, or you may execute previous queries in a single operator at the beginning of the DAG), I think the "truncate" is a better fit for this operator, because it is far simpler than the others (just one line) and it is a preparation step for the copy rather than an extra step. Also, I believe the logs and errors that the truncate can produce are simple enough to be monitored in the same task as the COPY. Anyway, if you think I should also remove that from the PR, I'll do it. Thanks! |
|
Good explanation. I can see the "truncate" functionality staying in there. 👍 |
In addition to what @JavierLopezT mentioned, if you separate the truncate and the copy into different tasks you might end up with an empty table. While keeping them together doesn't guarantee atomicity, it's the closest thing you can have. |
|
Okay, but then we should at least use SQL transactions. |
|
@JavierLopezT WDYT? Can you use transactions when doing truncate + copy? :) |
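To make the concern above concrete, here is a minimal sketch of why clearing and reloading a table inside one transaction avoids the "empty table" outcome. It uses Python's built-in `sqlite3` as a stand-in database (an assumption for illustration only, not Redshift), `DELETE` in place of `TRUNCATE`, and a hypothetical `reload_table` helper and `target` table that are not part of the PR.

```python
import sqlite3


def reload_table(conn, rows):
    """Replace the contents of `target` with `rows` inside one transaction.

    Hypothetical helper for illustration; returns True on commit,
    False if the load failed and the transaction was rolled back.
    """
    try:
        conn.execute("BEGIN")
        conn.execute("DELETE FROM target")  # stand-in for the truncate step
        conn.executemany("INSERT INTO target (id) VALUES (?)", rows)  # stand-in for COPY
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # The load failed: undo the delete so the old rows are kept.
        conn.execute("ROLLBACK")
        return False


# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT/ROLLBACK above fully control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO target (id) VALUES (1)")

# The duplicate key makes the second insert fail, simulating a broken load.
ok = reload_table(conn, [(2,), (2,)])
remaining = conn.execute("SELECT id FROM target").fetchall()
```

Here the failed load leaves the original row in place. Had the delete and the load been two independent tasks, the delete would already have been committed when the load failed.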
|
@feluelle Sure, I'll come back to this PR as soon as I can |
|
I would add it to a new PR. Note that the PR you linked is proposing a change to |
(branch force-pushed from f0d2642 to 805781b)
|
Hello @feluelle, I have made the changes, and now comes the drama: testing. I think I have to add a test to check that the truncate works. Thank you very much |
@JavierLopezT can we rename the transaction variable to something more appropriate, like sql or so?
|
Javier I also removed the following from the description:
..as this isn't in your changes / PR ?! |
Oh yes, I have just seen that this was already solved in August in c635804#diff-85c770282ec360127508910595a47fd1 |
|
The template_field is also added in #10890. I'll edit my description again |
Done |
|
LGTM @JavierLopezT. Let's wait until the issue on master is fixed and then I will merge this one. |
|
@JavierLopezT There are many failing checks. Can you rebase once more? If everything is fine after that, I will merge it. :) |
(branch force-pushed from a97c580 to 0a90b24)
Done. Not sure if I have done it correctly though. (I followed https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#id9) |
|
@JavierLopezT A test seems to be failing. Can you look at it, please? :) |
turbaszek
left a comment
LGTM 👌 @feluelle ?
feluelle
left a comment
Yes, merging it.
…he#9246) - add table arg to jinja template fields - change ui_color Co-authored-by: javier.lopez <javier.lopez@promocionesfarma.com>
This PR adds the truncate argument (bool), which runs a TRUNCATE right before the COPY statement, within the same transaction. Also, the operator's UI color has been changed to a more vivid one.
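As a rough illustration of what the description means, the sketch below builds the SQL text that such an option might produce. It is not the operator's actual code: `build_copy_sql`, its parameters, and the example bucket, key, and IAM role are all hypothetical names chosen for this sketch.

```python
def build_copy_sql(schema, table, s3_bucket, s3_key,
                   credentials_clause, copy_options="", truncate=False):
    """Return the SQL to load `schema.table` from S3 (hypothetical helper).

    With truncate=True, the TRUNCATE and COPY statements are wrapped
    in a single BEGIN ... COMMIT block.
    """
    parts = [
        f"COPY {schema}.{table}",
        f"FROM 's3://{s3_bucket}/{s3_key}'",
        credentials_clause,
        copy_options,
    ]
    # Skip empty pieces so that no stray spaces end up in the statement.
    copy_statement = " ".join(p for p in parts if p) + ";"
    if not truncate:
        return copy_statement
    return "\n".join([
        "BEGIN;",
        f"TRUNCATE TABLE {schema}.{table};",
        copy_statement,
        "COMMIT;",
    ])


# Example values below are placeholders, not real resources.
sql = build_copy_sql(
    schema="public",
    table="orders",
    s3_bucket="example-bucket",
    s3_key="exports/orders.csv",
    credentials_clause="IAM_ROLE 'arn:aws:iam::123456789012:role/example-role'",
    copy_options="CSV",
    truncate=True,
)
print(sql)
```

One caveat worth checking against the Amazon Redshift documentation: TRUNCATE is documented there as committing the transaction in which it runs, so a failed COPY may not bring the truncated rows back. If rollback matters, a `DELETE FROM` inside the transaction is the usual alternative, at the cost of speed.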