Make move_chunk use AN txns on DN #3372
Previously, the chunk copy/move activity ran its commands on the data nodes (DN) in autocommit mode. This meant that failures on the access node (AN) were decoupled from the activity on the DN, which can make later cleanup messy because we would not know what had failed or succeeded on the data nodes. We now drive the entire activity via transactions started on the access node.
/* Stop data transfer on the destination node */
cmd = psprintf("ALTER SUBSCRIPTION %s DISABLE", NameStr(cc->fd.operation_id));
It might make sense to split these steps into two or three separate stages, since we already have a separate stage for creating the replication slot.
We created these additional substeps just to allow things to work in a transaction block. Semantically it's all about dropping the subscription.
Yes, but from the point of view of the healing function, which way is more convenient to operate?
It's not an issue. The cleanup function will be exactly the same as this one, with these 3 steps as part of a transaction block.
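For readers following this thread, here is a minimal sketch of what the three substeps could look like. Only the `ALTER SUBSCRIPTION ... DISABLE` command appears in the diff above. The other two commands are an assumption based on the PostgreSQL documentation, which states that `DROP SUBSCRIPTION` cannot run inside a transaction block while the subscription is still associated with a replication slot. The helper name `build_drop_subscription_cmds` and the buffer sizes are hypothetical and not part of this PR.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NUM_DROP_SUBSTEPS 3
#define CMD_MAXLEN 256

/*
 * Hypothetical helper: fill cmds[] with the commands that detach and then
 * drop a subscription so the sequence can run inside a transaction block.
 *
 * Only the DISABLE command is taken from the diff; the SET (slot_name = NONE)
 * and DROP commands are assumed from the PostgreSQL documentation.
 *
 * Returns the number of commands written, or -1 if any command did not fit.
 */
int
build_drop_subscription_cmds(const char *subname,
							 char cmds[NUM_DROP_SUBSTEPS][CMD_MAXLEN])
{
	int n;

	assert(subname != NULL);

	/* Substep 1: stop data transfer on the destination node */
	n = snprintf(cmds[0], CMD_MAXLEN, "ALTER SUBSCRIPTION %s DISABLE", subname);
	if (n < 0 || n >= CMD_MAXLEN)
		return -1;

	/* Substep 2: disassociate the subscription from its replication slot */
	n = snprintf(cmds[1], CMD_MAXLEN, "ALTER SUBSCRIPTION %s SET (slot_name = NONE)", subname);
	if (n < 0 || n >= CMD_MAXLEN)
		return -1;

	/* Substep 3: with no slot attached, the drop is allowed in a transaction */
	n = snprintf(cmds[2], CMD_MAXLEN, "DROP SUBSCRIPTION %s", subname);
	if (n < 0 || n >= CMD_MAXLEN)
		return -1;

	return NUM_DROP_SUBSTEPS;
}
```

Keeping the three commands in one stage matches the point made above: semantically they are a single "drop the subscription" action, and a cleanup function can replay the same sequence inside one transaction.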
Codecov Report
@@ Coverage Diff @@
## multinode-feature-copy-chunk #3372 +/- ##
===============================================================
Coverage ? 90.67%
===============================================================
Files ? 213
Lines ? 36054
Branches ? 0
===============================================================
Hits ? 32693
Misses ? 3361
Partials ? 0
@@ -400,7 +400,7 @@ chunk_copy_stage_create_publication(ChunkCopy *cc)
 					 NameStr(cc->chunk->fd.table_name)));

 	/* Create the publication in autocommit mode */
-	ts_dist_cmd_run_on_data_nodes(cmd, list_make1(NameStr(cc->fd.source_node_name)), false);
+	ts_dist_cmd_run_on_data_nodes(cmd, list_make1(NameStr(cc->fd.source_node_name)), true);
I never remember what booleans like this mean, and I had to look it up in the code. Just a mental note that it is often better to use descriptive enums, e.g., CMD_TRANSACTIONAL. Obviously not for this PR.
Agreed.
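As a rough illustration of the enum suggestion, the sketch below shows how a descriptive mode could replace the bare boolean at call sites. Only the name `CMD_TRANSACTIONAL` comes from the comment above. `CmdMode`, `CMD_AUTOCOMMIT`, and both helper functions are hypothetical, and the mapping assumes that `true` selects the transactional path, as the diff suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical enum: only CMD_TRANSACTIONAL is named in the review comment;
 * CMD_AUTOCOMMIT and the type name are assumptions for illustration.
 */
typedef enum CmdMode
{
	CMD_AUTOCOMMIT = 0,
	CMD_TRANSACTIONAL
} CmdMode;

/*
 * Translate the descriptive mode into the boolean an existing function
 * expects. Assumes true means "run inside the access node's transaction".
 */
bool
cmd_mode_is_transactional(CmdMode mode)
{
	return mode == CMD_TRANSACTIONAL;
}

/* Return a readable name for the mode, e.g. for log messages */
const char *
cmd_mode_name(CmdMode mode)
{
	switch (mode)
	{
		case CMD_AUTOCOMMIT:
			return "CMD_AUTOCOMMIT";
		case CMD_TRANSACTIONAL:
			return "CMD_TRANSACTIONAL";
	}

	/* Not reached for valid enum values */
	assert(false);
	return "unknown";
}
```

A call site would then pass `cmd_mode_is_transactional(CMD_TRANSACTIONAL)` instead of a bare `true`, so the intent is readable without looking up the function signature.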