-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-19340 : Disable timeout of transactions opened by replication ta… #337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| enum TxnStatus {OPEN, ABORTED, COMMITTED, UNKNOWN} | ||
|
|
||
| public enum TxnType { | ||
| REPL_CREATED(0), READ_ONLY(1), UNKNOWN(2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall use DEFAULT instead of UNKNOWN as it seems like invalid type. Also, value "0" can be default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unknown is just a place holder ..default is null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For TxnState, UNKNOWN is used to specify an invalid state of txn. So, thought it would make sense to use something else to mean Read-Write non-REPL created txns instead of UNKNOWN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need change
| " TXN_STARTED bigint NOT NULL," + | ||
| " TXN_LAST_HEARTBEAT bigint NOT NULL," + | ||
| " TXN_USER varchar(128) NOT NULL," + | ||
| " TXN_TYPE integer," + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use single char instead of integer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure how different databases behaves for char type ..int looks quite standard so used it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TXN_STATE column uses character only... :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
integer is fine
| this.value = value; | ||
| } | ||
|
|
||
| public int getValue() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't by default integer values assigned to each enum member? Shall we use it instead of explicit assignment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is some conversion error to int from enum ..other places also it is used like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| " TXN_STARTED bigint NOT NULL," + | ||
| " TXN_LAST_HEARTBEAT bigint NOT NULL," + | ||
| " TXN_USER varchar(128) NOT NULL," + | ||
| " TXN_TYPE integer," + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add it as last column to keep same as upgrade flow with ADD COLUMN?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can not predict the internal behavior of dbms related to column order ..so it should not be an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
| TXN_META_INFO varchar(128), | ||
| TXN_HEARTBEAT_COUNT integer | ||
| TXN_HEARTBEAT_COUNT integer, | ||
| TXN_TYPE integer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be NOT NULL column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need not be not null ..as user can skip it if its not required. We don't need this value for all kind of transactions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How user will skip it? It is internal metadata set by Hive itself to identify the type of txn. As I see in code, it is set as REPL_CREATED for repl txns and UNKNOWN if not which means it is set in all flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keeping it null wont impact anything
|
|
||
| CREATE INDEX IDX_RUNTIME_STATS_CREATE_TIME ON RUNTIME_STATS(CREATE_TIME); | ||
|
|
||
| ALTER TABLE TXNS ADD COLUMN TXN_TYPE integer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall initialize with 0 for existing entries in TXNS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think txns does not span across upgrades ..anyways default is null
| rqst.getReplPolicy() + " and Source transaction id : " + rqst.getReplSrcTxnIds().toString()); | ||
| return new OpenTxnsResponse(targetTxnIdList); | ||
| } | ||
| txnType = TxnType.REPL_CREATED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need rebasing with master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| numOpen++; | ||
| } | ||
| } | ||
| assertEquals(3, numAborted); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this be flaky if run the whole suit as the number of aborted/open txns cannot be guaranteed to be 0 when start test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
previous test cases also have same issues ..looks like it creates a new txn handler ..so should not be an issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
|
||
| CREATE INDEX IDX_RUNTIME_STATS_CREATE_TIME ON RUNTIME_STATS(CREATE_TIME); | ||
|
|
||
| ALTER TABLE TXNS ADD TXN_TYPE int; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to have -- HIVE-19340 comment for changes done part of this ticket. Shall have it in all upgrade scripts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
cc3af33 to
0f35298
Compare
…k at target cluster
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
The transactions opened by applying EVENT_OPEN_TXN should never be aborted automatically due to time-out. Aborting of transaction started by replication task may leads to inconsistent state at target which needs additional overhead to clean-up. So, it is proposed to mark the transactions opened by replication task as special ones and shouldn't be aborted if heart beat is lost. This helps to ensure all ABORT and COMMIT events will always find the corresponding txn at target to operate.