Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IGNITE-20457 Verify commitTimestamp against enlisted partitions expiration timestamps #2658

Merged
merged 28 commits into from Oct 13, 2023

Conversation

sanpwc
Copy link
Contributor

@sanpwc sanpwc commented Oct 3, 2023

https://issues.apache.org/jira/browse/IGNITE-20457

  • Added TxManagerImpl#verifyCommitTimestamp method that checks whether previously enlisted primary replicas aren't expired and that commit timestamp is less or equal than primary replicas expiration timestamp. Given method will either complete result future with void or PrimaryReplicaExpiredException
  • Added aforementioned verifyCommitTimestamp into TxManagerImpl#finish flow
  • Added PlacementDriver as an additional TxManagerImpl constructor param in order to retrieve current primary replicas for the futher checks in verifyCommitTimestamp.

@sanpwc sanpwc marked this pull request as ready for review October 6, 2023 21:44
@sanpwc sanpwc changed the title IGNITE-20427 IGNITE-20457 Verify commitTimestamp against enlisted partitions expiration timestamps Oct 9, 2023
…-20457

# Conflicts:
#	modules/table/src/test/java/org/apache/ignite/internal/table/RepeatedFinishReadWriteTransactionTest.java
#	modules/table/src/test/java/org/apache/ignite/internal/table/TxLocalTest.java
#	modules/table/src/testFixtures/java/org/apache/ignite/distributed/ItTxTestCluster.java
#	modules/transactions/src/main/java/org/apache/ignite/internal/tx/impl/ReadWriteTransactionImpl.java
#	modules/transactions/src/main/java/org/apache/ignite/internal/tx/impl/TxManagerImpl.java
.thenAccept(currentPrimaryReplica -> {
if (currentPrimaryReplica == null
|| !expectedEnlistmentConsistencyToken.equals(currentPrimaryReplica.getStartTime().longValue())
|| commitTimestamp.after(currentPrimaryReplica.getExpirationTime())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really possible when you pass a commit timstamp in getPrimaryReplica method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the current implementation - no. However, it seems more reliable and clear to explicitly check it. So, I'd rather left this as is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think it is not possible? This is a crucial part: can I trust that any (non-local) replica is primary or not in a particular moment just by using the placement driver API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because getPrimaryReplica(..., commitTimestamp) won't return expired lease
if (lease.isAccepted() && lease.getExpirationTime().after(timestampWithClockSkew)) {
thus it's not possible. So yep, it's possible to change given commitTimestamp check with an assert. However I believe that current approach is better cleaner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does we consider a clock skew in placement driver API (and when I looked in the implementation we doing it twice)? I think we should describe this case in detail.
I always think, that the lease expiration date grater than lease start date, it is a contract, but here we are checking it....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the "twice" CLOCK_SKEW addition. And we use CLOCK_SKEW related logic because it's a contract of how await and get works.

Collection<ReplicationGroupId> replicationGroupIds = new HashSet<>();
replicationGroupIds.addAll((Collection<ReplicationGroupId>) (Collection<?>) enlistedGroups.keySet());

boolean verifiedCommit = throwable == null && commit;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand where we will be able to see this exception, which changed the transaction state (to rollback) at the last moment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check

                    // verification future is added in order to share proper exception with the client
                    .thenCompose(r -> verificationFuture);

at the end of the future chaining.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tricky way, I cannot insist, but I would make this code more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's a common pattern of "re-throwing" an exception from a future after handling it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would create another future that is used to return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, proposed solution is a common pattern.

(unused, throwable) -> {
// TODO: https://issues.apache.org/jira/browse/IGNITE-19170 Use ZonePartitionIdMessage and remove cast
Collection<ReplicationGroupId> replicationGroupIds = new HashSet<>();
replicationGroupIds.addAll((Collection<ReplicationGroupId>) (Collection<?>) enlistedGroups.keySet());
Copy link
Contributor

@Cyrill Cyrill Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collection<ReplicationGroupId> replicationGroupIds = new HashSet<>(enlistedGroups.keySet());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


return finishFuture.exceptionally(
e -> {
if (e instanceof PrimaryReplicaExpiredException) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check doesn't work, because PrimaryReplicaExpiredException is always wrapped in CompletionException (thrown byallOf that is called in TxManagerImpl#verifyCommitTimestamp).

Also, I would recommend to check the error code of TransactionException in tests that you added - this would have indicated this problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I've removed exceptionally block and adjusted the tests.

@@ -96,20 +96,22 @@ public IgniteBiTuple<ClusterNode, Long> enlist(TablePartitionId tablePartitionId
/** {@inheritDoc} */
@Override
protected CompletableFuture<Void> finish(boolean commit) {
Map<ClusterNode, List<IgniteBiTuple<TablePartitionId, Long>>> groups = new LinkedHashMap<>();
Map<TablePartitionId, Long> enlistedGroups = new LinkedHashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This map is almost always recreated later at line 102. IMO we need to put the declaration to the assignment statement at line 102 and create a map explicitly for the else clause

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

}
});
})
.thenCompose(Function.identity())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It this invokation required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think so. Removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My fault. It's required. Otherwise exceptions from original future won't be thrown. Lot's of tests including for example org.apache.ignite.internal.table.TxAbstractTest#testTransactionAlreadyCommitted will fail. Restored.

@@ -222,7 +222,7 @@ public CompletableFuture<ReplicaMeta> awaitPrimaryReplica(
long timeout,
TimeUnit unit
) {
return inBusyLockAsync(busyLock, () -> getOrCreatePrimaryReplicaWaiter(groupId).waitFor(timestamp)
return inBusyLockAsync(busyLock, () -> getOrCreatePrimaryReplicaWaiter(groupId).waitFor(timestamp.addPhysicalTime(CLOCK_SKEW))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is not logick which avoi eaxtra waitng.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. TODO added.

@sanpwc sanpwc merged commit 7b60e68 into apache:main Oct 13, 2023
1 check passed
@sanpwc sanpwc deleted the ignite-20457 branch October 13, 2023 12:07
Lozmanov pushed a commit to Lozmanov/ignite-3 that referenced this pull request Nov 6, 2023
* IGNITE-19425 Use data nodes from DistributionZoneManager instead of BaselineManager#nodes and remove BaselineManager (apache#2605)

* IGNITE-20127 Implement delayed replication acks - Fixes apache#2520.

Signed-off-by: Alexey Scherbakov <alexey.scherbakoff@gmail.com>

* IGNITE-20330 Create an abstraction for building indexes (apache#2631)

* IGNITE-20513 Use NativeType as the type of the system view column (apache#2668)

* IGNITE-20590 Remove TxManager.beginImplicit() (apache#2669)

* IGNITE-20588 Use consistent schema when converting data in KV/record views (apache#2667)

* IGNITE-20342 Rollback transaction for SQL execution issues (apache#2611)

* IGNITE-20445 Clean up write intents for RW transaction on replication… (apache#2657)

* IGNITE-20607 Move NativeType to core (apache#2670)

* IGNITE-20484 NPE when some operation occurs when the primary replica … (apache#2672)

* IGNITE-20442 Sql. Extend grammar with transaction related statements. (apache#2663)

* IGNITE-20620 Add index availability command to catalog (apache#2680)

* IGNITE-20521 Split Security API from ignite-security module (apache#2662)

* IGNITE-20621 Add a replication group index build completion listener to IndexBuilder (apache#2682)

* IGNITE-19226 Fetch table schema by timestamp (apache#2648)

* IGNITE-19325 Unmute test MultiActorPlacementDriverTest#prolongAfterActiveActorChanger (apache#2677)

* IGNITE-20625 Fix ODBC metadata reading error (apache#2687)

* IGNITE-20457 Verify commitTimestamp against enlisted partitions expiration timestamps (apache#2658)

* IGNITE-20635 Cleanup code wrt IGNITE-18733 mentions (apache#2686)

* IGNITE-20317 Return metastorage invokes for zones changes in handlers, immediately recalculate data nodes when scale up/down is immediate. (apache#2685)

* IGNITE-19276 Implement a mechanism to build indices distributively (apache#2676)

* IGNITE-20567 Move the 'enabled' flag from the authentication configuration to security (apache#2665)

* IGNITE-20578 Implement scan over system view (apache#2678)

* IGNITE-20387: Remap most exceptions to SqlExceptions for SQL API (apache#2613)

* IGNITE-19217 Implement foreign keys query for ODBC  (apache#2692)

* IGNITE-20512 Remove port range from HTTP server (apache#2673)

* Remove port range from HTTP server configuration
* Update junit5 version to avoid jar hell.

* IGNITE-20657 At the checkpoint ArrayIndexOutOfBoundsException (apache#2696)

* IGNITE-20444 Sql. Add restrictions for execution tx related statements with single statement mode (apache#2683)

* IGNITE-20116 Linearize storage updates with safeTime adjustment rules (apache#2689)

* IGNITE-20636 Add to the MakeIndexAvailableCommand the ability to use only the indexId (apache#2691)

* IGNITE-19219  Implement primary keys query in ODBC (apache#2699)

* IGNITE-20366 testBatchReadPutConcurrently failed (apache#2694)

* IGNITE-20435 Preserve key order in batch opperations (deleteAll, deleteAllExact, insertAll) (apache#2664)

* IGNITE-20385 Sql. Fixed handling of NodeLeftException (apache#2622)

* IGNITE-20672 Broke compilation (apache#2705)

* IGNITE-20530 Start building indexes for write-only indexes (apache#2704)

* IGNITE-20659 Placement driver do not create a lease (apache#2700)

* IGNITE-20668 Increase wait after a DDL to account for idle safe-time propagation period (apache#2703)

* IGNITE-20522 Create the default user 'Ignite' (apache#2684)

* Ignite/ignite default user is added to the configuration if the user did not specify one
* security.authentication.enabled -> security.enabled
* We didn't have the ability to set dynamic defaults in the current implementation of configuration before this patch. Moreover, we don't have the ability to set a default value for NamedConfigValue at all. Now you can add the code to the configuration module if you want to add some defaults based on user-provided values or other external conditions.

* IGNITE-20616 ensureReplicaIsPrimary should use getPrimaryReplica instead of awaitPrimaryReplica (apache#2701)

* IGNITE-20671 Sql. Fixed ItSqlApiTest#ddl test (apache#2708)

* IGNITE-20677 Sql. Improve error reporting in assertThrowsSqlException (apache#2707)

* IGNITE-20600 Sql. Fix a message of an error, which occurs while updating primary key column (apache#2695)

* IGNITE-20478 Sql. Get rid of UNSPECIFIED_VALUE_PLACEHOLDER (apache#2671)

* IGNITE-20430 Got rid of unused set and fixed replica waiters removal (apache#2604)

* IGNITE-20454 Sql. Added a callback that is notified when data prefetching is complete (apache#2674)

* IGNITE-20002 Implement durable unlock on primary partition re-election (apache#2697)

* IGNITE-20630 Select only available nodes for deployment unit download (apache#2713)

* IGNITE-20693 Fixed NPE in placement driver actor on deactivation (apache#2718)

* IGNITE-20695 Cleanup resource (apache#2723)

* IGNITE-20545 Improve logging in AbstractRpcTest (apache#2717)

* IGNITE-20395 Clean up write intents for RW transaction on primary (apache#2679)

* IGNITE-20699 Decrease idle safe time propagation period in tests (apache#2730)

* IGNTIE-20629 Exclude ODBC build from assemble pipeline by default (apache#2706)

* IGNITE-20706 Rename CatalogSchemaManager to SchemaManager (apache#2733)

* IGNITE-20702 Fix NPE in ReplicaManager.onReplicaMessageReceived (apache#2731)

* IGNITE-20704 Add methods to fsync files and directories (apache#2735)

* IGNITE-20703 LeaseUpdaterTest causes a NullPointerException (apache#2732)

* IGNITE-20599 Implement 'NOT_TOMBSTONE' operation in the meta storage dsl. (apache#2722)

* IGNITE-20042 Check table existence before executing each operation in an RW transaction (apache#2721)

* IGNITE-20576 Fix testGetReturningTupleWithUnknownSchemaRequestsNewSchema (apache#2738)

* IGNITE-20359 Expose storage profiles as a node attribute (apache#2711)

* IGNITE-20439 Sql. Support multiple schemas in CatalogSqlSchemaManager (apache#2719)

* IGNITE-20311 Sql. Fix behaviour of ROUND function (apache#2690)

* IGNITE-20684 Use proper sync method (apache#2715)

* IGNITE-20670 Extract deployment code integration test to related module (apache#2714)

* IGNITE-20304 Sql. Documentation for system views module (apache#2726)

* IGNITE-20644 Java thin: Fix error detection in doSchemaOutInOpAsync (apache#2739)

Fix well-known error detection in `ClientTable` when `sendServerExceptionStackTraceToClient` is enabled: `unwrapRootCause` did not work, the important exception can be on any level.

* IGNITE-20498 Fix potential catalog version order violations (apache#2734)

* IGNITE-20560 Remove the field out of sync with the map (apache#2660)

* IGNITE-20099 Update okhttp version (apache#2746)

* IGNITE-20720 Move ClusterPerTestIntegrationTest and ClusterPerClassIntegrationTest to Test Fixtures (apache#2744)

* IGNITE-20618 Sql. Degradation of SELECT operations performance over time (apache#2728)

* IGNITE-20726: Fix incorrect link that mentions a closed ticket. (apache#2745)

* IGNITE-20727 Pass schema version in each read/write ReplicaRequest (apache#2747)

* IGNITE-20561 Change condition for DistributionZonesUtil#triggerKeyConditionForZonesChanges to solve inconsistency issues (apache#2743)

* IGNITE-20734 Fixed a link to cli doc (apache#2750)

* IGNITE-20739 fix compilation after merge of IGNITE-20561 (apache#2753)

* IGNITE-20624 Fix race between getting logical topology and mapping fragments (apache#2710)

---------

Signed-off-by: Alexey Scherbakov <alexey.scherbakoff@gmail.com>
Co-authored-by: Mirza Aliev <alievmirza@gmail.com>
Co-authored-by: Alexey Scherbakov <alexey.scherbakoff@gmail.com>
Co-authored-by: Kirill Tkalenko <tkalkirill@yandex.ru>
Co-authored-by: korlov42 <korlov@gridgain.com>
Co-authored-by: Roman Puchkovskiy <roman.puchkovskiy@gmail.com>
Co-authored-by: Max Zhuravkov <shhwwa@gmail.com>
Co-authored-by: Cyrill <cyrill.sizov@gmail.com>
Co-authored-by: Ivan Gagarkin <gagarkin.iiu@gmail.com>
Co-authored-by: Vladislav Pyatkov <vldpyatkov@gmail.com>
Co-authored-by: Igor Sapego <isapego@apache.org>
Co-authored-by: Alexander Lapin <lapin1702@gmail.com>
Co-authored-by: ygerzhedovich <41903880+ygerzhedovich@users.noreply.github.com>
Co-authored-by: Aleksandr Pakhomov <apkhmv@gmail.com>
Co-authored-by: Pavel Pereslegin <xxtern@gmail.com>
Co-authored-by: Evgeniy Stanilovskiy <stanilovsky@gmail.com>
Co-authored-by: Denis Chudov <moonglloom@gmail.com>
Co-authored-by: Mikhail <Pochatkin@users.noreply.github.com>
Co-authored-by: Vadim Pakhnushev <8614891+valepakh@users.noreply.github.com>
Co-authored-by: Alexander Polovtcev <alex.polovtcev@gmail.com>
Co-authored-by: Pavel Tupitsyn <ptupitsyn@apache.org>
Co-authored-by: Andrew V. Mashenkov <AMashenkov@users.noreply.github.com>
Co-authored-by: IgGusev <igusev@gridgain.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants