-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make transaction metadata nullable to support existing databases #841
Conversation
// For preparing an update of a deemed-commit state record, we need to use | ||
// version 0 rather than NULL since we want to distinguish the preparation from | ||
// an initial record insertion. | ||
columns.add(IntColumn.of(Attribute.BEFORE_VERSION, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As commented in the code, we don't use the original NULL value for before_version
in prepare records
to distinguish them from prepare records for newly-inserted ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't fully get this.
If we don't do this, i.e., if we just use NULL for before_version, what issue would occur?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for seeking clarification. Simply put, the record cannot be rollbacked correctly since the recovery process falls in the deletion path for the initial record case here.
scalardb/core/src/main/java/com/scalar/db/transaction/consensuscommit/RollbackMutationComposer.java
Lines 67 to 69 in b3cf9c9
if (beforeId.hasNullValue() && beforeVersion.hasNullValue()) { | |
// no record to rollback, so it should be deleted | |
mutations.add(composeDelete(base, latest)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Thank you!
I think it would be great if you could make the comment more specific about the case.
(i.e., adding the reason why we want to use 0 instead of null for before_version.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will try to revise it with a more specific reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@feeblefakie Fixed in d4566f0. PTAL when you get a chance.
if (key.equals(Attribute.VERSION) && v.getIntValue() == 0) { | ||
columns.add(IntColumn.ofNull(Attribute.VERSION)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For rollback, we need to change 0 -> null as the opposite side of prepare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adding a comment for this would be helpful.
return !value.getName().startsWith(Attribute.BEFORE_PREFIX) | ||
&& !isValueInKeys(value, primary, clustering); | ||
private boolean isBeforeRequired(Column<?> column, Key primary, Optional<Key> clustering) { | ||
return !column.getName().startsWith(Attribute.BEFORE_PREFIX) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a silly question, but what if an original column name starts with before_
prefix like before_update_balance
(non-primary or clustering key)? This logic seems to return false, but I think before_before_update_balance
column should be taken care of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good point. Yeah, it should be taken care of if we allow users to use a column name starting with before_
. For new projects, at least, prohibiting such column names would not cause serious problems. For existing databases, some users might be using such column names, but I guess it would be rare, and we could ask users to rename them. So, IMHO, we can prohibit user columns with before_
in both cases for now.
@brfrn169 By the way, I think we don't have any guard for those names in the schema-loader, and I was able to create a before_xx
column and faced an issue that its before column was not updated in the prepare phase. Is my understanding correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I can reproduce it with the current code.
ScalarDB SQL CLI
0: scalardb> select * from example_app.account2;
+----+----------+-----+
| id | before_x | y |
+----+----------+-----+
| 1 | 10 | 100 |
+----+----------+-----+
1 row selected (0.005 seconds)
0: scalardb> update example_app.account2 set before_x = 20, y = 200 where id = 1;
1 row affected (0.04 seconds)
psql
postgres=# select id, before_x, y, tx_state, before_before_x, before_y from example_app.account2;
id | before_x | y | tx_state | before_before_x | before_y
----+----------+-----+----------+-----------------+----------
1 | 20 | 200 | 3 | | 100
(1 row)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@komamitsu Thanks! Yeah, I faced the same issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we accepted to create the before_xx
column as a normal column, and in this case, its before column should be before_before_xx
. But, we might miss something to handle that.
The logic to detect the before column is here:
scalardb/core/src/main/java/com/scalar/db/transaction/consensuscommit/ConsensusCommitUtils.java
Lines 179 to 204 in 44a7baf
/** | |
* Returns whether the specified column is a part of the before image columns or not. | |
* | |
* @param columnName a column name | |
* @param tableMetadata a transaction table metadata | |
* @return whether the specified column is a part of the before image columns or not | |
*/ | |
public static boolean isBeforeImageColumn(String columnName, TableMetadata tableMetadata) { | |
if (!tableMetadata.getColumnNames().contains(columnName) | |
|| tableMetadata.getPartitionKeyNames().contains(columnName) | |
|| tableMetadata.getClusteringKeyNames().contains(columnName)) { | |
return false; | |
} | |
if (BEFORE_IMAGE_META_COLUMNS.containsKey(columnName)) { | |
return true; | |
} | |
if (columnName.startsWith(Attribute.BEFORE_PREFIX)) { | |
// if the column name without the "before_" prefix exists, it's a part of the before image | |
// columns | |
return tableMetadata | |
.getColumnNames() | |
.contains(columnName.substring(Attribute.BEFORE_PREFIX.length())); | |
} | |
return false; | |
} |
The basic idea is, if there are both xxx
and before_xxx
(before prefixed) columns in the table, xxx
is an after image column, and before_xxx
is a before image column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay. It should be a bug. Let me check. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay we should use ConsensusCommitUtils.isAfterImageColumn()
here:
scalardb/core/src/main/java/com/scalar/db/transaction/consensuscommit/PrepareMutationComposer.java
Line 155 in 44a7baf
return !value.getName().startsWith(Attribute.BEFORE_PREFIX) |
I thought I had fixed it 😞
Let me fix it in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, we already have the validation to check if a table has transaction meta columns when creating the table 😅 :
scalardb/core/src/main/java/com/scalar/db/transaction/consensuscommit/ConsensusCommitUtils.java
Lines 46 to 48 in 44a7baf
checkIsNotTransactionMetaColumn(tableMetadata.getColumnNames()); | |
Set<String> nonPrimaryKeyColumns = getNonPrimaryKeyColumns(tableMetadata); | |
checkBeforeColumnsDoNotAlreadyExist(nonPrimaryKeyColumns, tableMetadata); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brfrn169 Thanks for the additional info. But I guess it only checks to prevent from being created both xxx and before_xxx as normal columns. We can still create before_xxx column itself...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you.
I only left minor refactoring suggestions.
core/src/main/java/com/scalar/db/transaction/consensuscommit/PrepareMutationComposer.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/scalar/db/transaction/consensuscommit/PrepareMutationComposer.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/scalar/db/transaction/consensuscommit/RollbackMutationComposer.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/scalar/db/transaction/consensuscommit/RollbackMutationComposer.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looking good. Thank you!
Left a few questions, so PTAL when you get a chance.
// For preparing an update of a deemed-commit state record, we need to use | ||
// version 0 rather than NULL since we want to distinguish the preparation from | ||
// an initial record insertion. | ||
columns.add(IntColumn.of(Attribute.BEFORE_VERSION, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't fully get this.
If we don't do this, i.e., if we just use NULL for before_version, what issue would occur?
core/src/main/java/com/scalar/db/transaction/consensuscommit/RollbackMutationComposer.java
Outdated
Show resolved
Hide resolved
@Torch3333 Thank you for the feedback! Fixed in b3cf9c9. PTAL if you have time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
@jnmt (cc: @brfrn169) Just noticed this PR doesn't have integration test cases for NULL status record. Do you think we should add some integration tests? I think it would require non-straightforward way to add NULL status records (directly using the native JDBC?) at arrangement phase. So I'm not sure those integration tests should be added in this PR or in another PR, though. |
@komamitsu Thanks for the question. Yes, we should add some integration tests since this is a kind of major change in the protocol behavior. We are also thinking about applying the Jepsen tests for it and going to discuss it at some point in time (next or next next week). But, in any case, the integration tests will be added by another PR. |
@jnmt I think jepsen tests can be done in another PR, but it would be great if you could add integration tests in this PR. 🙇 |
@feeblefakie Sure! I will do it in the same PR if it's better. |
// deleted as an initial record in a rollback situation. To avoid this and roll | ||
// back to a record with a NULL version (i.e., regarded as committed) correctly, | ||
// we need to use version 0 rather than NULL for before_version. | ||
columns.add(IntColumn.of(Attribute.BEFORE_VERSION, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@feeblefakie I'm wondering if we could remove the tx_version
column in the first place. As discussed before, if the tx_id
column is unique, we don't really need the tx_version
column. If we can remove it, we don't need this special handling, and we can make the code simpler.
From a backward compatibility perspective, even if we remove the tx_version
column, we simply don't use the column on existing tables, so I don't think we will run into any problems. For new tables, we will no longer need to create the tx_version
column.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added integration tests for null-tx-metadata handling (incl. rollback/roll-forward) referencing ConsensusCommitSpecificIntegrationTestBase
, but skipped the following cases.
- Tests that do not use existing data (e.g., insert a new record and do something)
- Tests for conflict / write skew situations
// deleted as an initial record in a rollback situation. To avoid this and roll | ||
// back to a record with a NULL version (i.e., regarded as committed) correctly, | ||
// we need to use version 0 rather than NULL for before_version. | ||
columns.add(IntColumn.of(Attribute.BEFORE_VERSION, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for re-reviewing, but I updated a little for a better understanding of the logic and added integration tests based on the feedback in db51113. So, it would be great if you could take a look again when you get a chance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left several minor comments but LGTM other than that. Thank you!
public boolean isDeemedAsCommitted() { | ||
return getId() == null; | ||
} | ||
|
||
public boolean hasBeforeImage() { | ||
// We need to check not only before_id but also before_version to determine if the record has | ||
// the before image or not since we set before_version to 0 for the prepared record when | ||
// updating the record deemed as the committed state (cf. PrepareMutationComposer). | ||
return !getBeforeIdColumn().hasNullValue() || !getBeforeVersionColumn().hasNullValue(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
public TextColumn getBeforeIdColumn() { | ||
return (TextColumn) result.getColumns().get(Attribute.BEFORE_ID); | ||
} | ||
|
||
public IntColumn getBeforeVersionColumn() { | ||
return (IntColumn) result.getColumns().get(Attribute.BEFORE_VERSION); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It look like we can make these method private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Fixed in d5ff7af.
if (key.equals(Attribute.VERSION) && v.getIntValue() == 0) { | ||
columns.add(IntColumn.ofNull(Attribute.VERSION)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adding a comment for this would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looking good to me!
Left one minor comment.
public String getId() { | ||
return getText(Attribute.ID); | ||
} | ||
|
||
public TransactionState getState() { | ||
return TransactionState.getInstance(getInt(Attribute.STATE)); | ||
int state = getInt(Attribute.STATE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although getInt
returns 0 for a null value, should it be slightly more appropriate/readable to use isNull
since it actually checks if the value is null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. Thanks. Fixed in d5ff7af.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 👍
private DistributedStorage storage; | ||
private Coordinator coordinator; | ||
private RecoveryHandler recovery; | ||
private CommitHandler commit; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[trivial] This can be a local variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Thank you! Fixed on 3dd3072.
.clusteringKey(Key.ofInt(ACCOUNT_TYPE, j)) | ||
.value(IntColumn.of(BALANCE, INITIAL_BALANCE)) | ||
.build(); | ||
storage.put(put); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. We can manipulate underlying database by directly using DistributedStorage...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
This PR adds a feature to handle null transaction metadata. To support existing databases, we assume that we ask users to add transaction metadata columns (with null values in records) in existing tables. So, as this PR does, we would like to handle those records with null transaction metadata as a sort of 'deemed' committed state.