-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nessie: Use latest hash for catalog APIs #6789
Conversation
TableIdentifier.parse("foo.tbl1"), TableIdentifier.parse("foo.tbl2")); | ||
|
||
Assertions.assertThat(catalog.listTables(Namespace.of("foo"))) | ||
.containsExactlyInAnyOrder( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This particular check fails without this PR. As, catalog
refers to the old state for listTables()
.
dc01d51
to
f4069a6
Compare
@@ -223,13 +224,13 @@ protected String defaultWarehouseLocation(TableIdentifier table) { | |||
|
|||
@Override | |||
public List<TableIdentifier> listTables(Namespace namespace) { | |||
return client.listTables(namespace); | |||
return refreshedClient().listTables(namespace); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refreshing the client is enough for a few APIs that doesn't use TableOperations
. But the code looks odd. So, I am refreshing all the APIs as it is a lightweight operation.
Restarting the build.
|
Sure this does not break the expected-hash calculations for commits? |
I just added a testcase for commit from multiple catalogs/clients on same branch + same table. It passes. @snazy: Do you see any problem with the current changes? Do you feel we should only modify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @snazy 's concern about "expected hash". I think the question here is not about multiple clients, but about correctness within the same JVM. If a transaction starts with Nessie reference being in a certain state, the commit of that Tx should use "expected" reference as it was at the start of the Tx... but it looks like this PR increases the risk of inadvertently updating the reference while transactions are in progress.
Isn't now also (without this PR) if two concurrent commits happen for a same table, the client can get refreshed between the
to
because of existing refresh in this code by concurrent operations?
|
I think so. This issue is not specific to this PR. I just think this PR makes it more likely to manifest. Would it be possible to keep the "expected" hash at the Tx level? |
@dimas-b : In that case, txn can only commit one successful commit?
@snazy: can you please elaborate on this? Without this PR also, concurrent commits from the same client can refresh the catalog and cause breakage of expected-hash. a) Are you suggesting that this PR will increase the probability of that event happening and we don't want to do that? |
I'm suggesting to provide a fix for it. |
+1 to fixing how Nessie Catalog (and related classes) track expected hash and refresh references. |
The Nessie server is able to resolve non-conflicting changes to the catalog, even when the expected hash is behind the current branch HEAD. |
Based on my analysis, the Nessie commit operation flow is as above. There are chances that we use older commit hash as the expected hash with the latest client. For this, Nessie backend server will fail the commit and Iceberg will retry the transaction/commit using this code[1] by rebasing the metadata and client. [1] code for rebase with refreshed metadata iceberg/core/src/main/java/org/apache/iceberg/BaseTransaction.java Lines 354 to 359 in 3efaee1
So, Even if we add some code to update/refresh the expected hash to be the latest, It is not enough as So, I don't think we need to change anything. Please clarify or guide me if you think it is incorrect. |
@snazy, @dimas-b:
And the client inside the
TLDR; Please recheck the PR again and let me know if a change required on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for you work on making this use case more robust @ajantha-bhat ! Some more comments below.
nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java
Outdated
Show resolved
Hide resolved
@AfterEach | ||
public void afterEach() throws Exception { | ||
dropTables(catalog); | ||
dropTables(anotherCatalog); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why bother? The super class will delete/reset all Nessie data. Table files are located in a temp. dir.
If the intention is to remove table's files, wouldn't it be preferable to do that in the base class and the FS level for all tests after resetting Nessie?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed TestNessieTable
which was cleaning the table files located in a temp dir.
Other test cases doesn't have it.
Agree that this should be at base test class level. I will handle it in a follow up PR.
nessie/src/test/java/org/apache/iceberg/nessie/TestMultipleClients.java
Outdated
Show resolved
Hide resolved
nessie/src/test/java/org/apache/iceberg/nessie/TestMultipleClients.java
Outdated
Show resolved
Hide resolved
nessie/src/test/java/org/apache/iceberg/nessie/TestMultipleClients.java
Outdated
Show resolved
Hide resolved
nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java
Outdated
Show resolved
Hide resolved
@dimas-b: Thanks for the review. I have replied to the comments. Please check. |
This flaky error again !
opened this to get more info: |
678f6bd
to
dccfd6b
Compare
@snazy, @dimas-b, @nastra: I have avoided the unnecessary extra round-trip by fixing it differently than before. Please take a look. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change adds too many unnecessary code changes. Most of the changes aren't justified, the required fix can likely be achieved with way less changes. The current PR also doesn't clean up renameTable
wrt the scope of this PR.
@snazy: I gave it another try. Please take a look. Also the |
one of the builds failed due to #7023 https://github.com/apache/iceberg/actions/runs/4405434484 re-triggering the build. |
nessie/src/main/java/org/apache/iceberg/nessie/UpdateableReference.java
Outdated
Show resolved
Hide resolved
nessie/src/main/java/org/apache/iceberg/nessie/UpdateableReference.java
Outdated
Show resolved
Hide resolved
nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java
Outdated
Show resolved
Hide resolved
nessie/src/main/java/org/apache/iceberg/nessie/UpdateableReference.java
Outdated
Show resolved
Hide resolved
one of the builds failed due to #7023 https://github.com/apache/iceberg/actions/runs/4434173539/jobs/7779904892 re-triggering the build. |
@@ -60,13 +60,6 @@ public String getHash() { | |||
return reference.getHash(); | |||
} | |||
|
|||
public Branch getAsBranch() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deprecated it instead of removing it because it is a public method.
This method a safe cast - you replaced it (not clear for which reason) with an unconditional cast (and BTW left another then unused method).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method a safe cast
because all the places where this was called, already there is a check for getRef().checkMutable()
. So, no need for one more method call that checks if it is a branch. Hence, directly casting to branch.
I will remove isBranch
too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: how about returning a Branch
from checkMutable
to avoid type casts in the calling code? (just a thought, no need to delay this PR because of this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think checkMutable
is an early check to ensure we are working on a branch which will later be modified by a commit. Whereas getRef
or getAsBranch
is to get the current state of the reference. I am not sure about combining them at the moment. I will explore this in a follow-up a suggested.
nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java
Outdated
Show resolved
Hide resolved
nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java
Outdated
Show resolved
Hide resolved
nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java
Outdated
Show resolved
Hide resolved
nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java
Outdated
Show resolved
Hide resolved
3bff65d
to
f04c264
Compare
I think the PR was ready. Please recheck. |
@@ -60,13 +60,6 @@ public String getHash() { | |||
return reference.getHash(); | |||
} | |||
|
|||
public Branch getAsBranch() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: how about returning a Branch
from checkMutable
to avoid type casts in the calling code? (just a thought, no need to delay this PR because of this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ajantha-bhat for the PR, and @snazy & @dimas-b for the review. LGTM!
* Nessie: Handle refresh for catalog APIs that doesn't use table operations * Add commit testcase * Another test case * Address comments * Avoid hash roundtrip * Address new comments * refactor
* Nessie: Handle refresh for catalog APIs that doesn't use table operations * Add commit testcase * Another test case * Address comments * Avoid hash roundtrip * Address new comments * refactor
Consider the scenario,
a) client1 - java Nessie catalog client creates a table1
b) client2 - spark + iceberg+ Nessie creates a table2
c) client3 - another java Nessie catalog client creates a table3
where all the 3 clients are connected to the same Nessie server.
Now for the first client (client1), list tables show only one table even though all 3 clients are connected to the same Nessie server.
The reason is only the catalog APIs that involves table operations (like
loadTable()
,commit()
, etc) refresh theNessieIcebergClient
. The APIs likelistTables()
,listNamespaces()
use the old state as it doesn't use table operations.Changes:
Other scenarios can be two concurrent spark sessions.