
Nessie: Use latest hash for catalog APIs #6789

Merged (7 commits) on Apr 11, 2023

Conversation

@ajantha-bhat (Member) commented Feb 9, 2023:

Consider this scenario:
a) client1 - a Java Nessie catalog client creates table1
b) client2 - Spark + Iceberg + Nessie creates table2
c) client3 - another Java Nessie catalog client creates table3

where all three clients are connected to the same Nessie server.

Now, for the first client (client1), listTables() shows only one table even though all three clients are connected to the same Nessie server.
The reason is that only the catalog APIs that involve table operations (like loadTable(), commit(), etc.) refresh the NessieIcebergClient. APIs like listTables() and listNamespaces() use the old state because they don't go through table operations.

Changes:

  • Pass null for the reference hash in the API request for immutable references, so that the server will use the latest hash.

Another scenario is two concurrent Spark sessions.
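The staleness described above can be sketched with a minimal toy model (illustrative names only; this is not the actual Nessie client or server API). A client that keeps using the reference hash it obtained at creation time keeps seeing the old table list, while a request with a null hash lets the server resolve the latest HEAD:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not the real Nessie API): why a client that pins the
// reference hash misses tables committed later, while resolving the latest
// hash on each call sees them.
public class StaleReferenceDemo {

  // Fake "server": each commit hash maps to the tables visible at that commit.
  static final Map<String, List<String>> commits = new HashMap<>();
  static String branchHead; // the server-side HEAD of the branch

  static void commitTable(String table) {
    List<String> tables = new ArrayList<>(commits.getOrDefault(branchHead, List.of()));
    tables.add(table);
    String newHash = "h" + commits.size();
    commits.put(newHash, tables);
    branchHead = newHash;
  }

  // requestedHash == null means "let the server resolve the latest HEAD".
  static List<String> listTables(String requestedHash) {
    String hash = (requestedHash == null) ? branchHead : requestedHash;
    return commits.getOrDefault(hash, List.of());
  }

  public static void main(String[] args) {
    branchHead = "h0";
    commits.put(branchHead, List.of());

    commitTable("table1");      // client1 creates table1
    String pinned = branchHead; // client1 caches the hash here
    commitTable("table2");      // client2 (e.g. Spark) creates table2
    commitTable("table3");      // client3 creates table3

    // With the pinned hash, client1 sees only its own table.
    System.out.println(listTables(pinned)); // [table1]
    // Passing null lets the server use the latest hash: all three tables.
    System.out.println(listTables(null));   // [table1, table2, table3]
  }
}
```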

@ajantha-bhat ajantha-bhat marked this pull request as draft February 9, 2023 16:43
@github-actions github-actions bot added the NESSIE label Feb 9, 2023
Assertions.assertThat(catalog.listTables(Namespace.of("foo")))
    .containsExactlyInAnyOrder(
        TableIdentifier.parse("foo.tbl1"), TableIdentifier.parse("foo.tbl2"));

@ajantha-bhat (Member Author):
This particular check fails without this PR, because the catalog refers to the old state for listTables().

@ajantha-bhat ajantha-bhat changed the title Nessie: Handle refresh for catalog APIs that doesn't use table operations Nessie: Add refresh for catalog APIs that doesn't use table operations Feb 9, 2023
@@ -223,13 +224,13 @@ protected String defaultWarehouseLocation(TableIdentifier table) {

  @Override
  public List<TableIdentifier> listTables(Namespace namespace) {
-   return client.listTables(namespace);
+   return refreshedClient().listTables(namespace);

@ajantha-bhat (Member Author):
Refreshing the client is enough for the few APIs that don't use TableOperations, but the code looks odd that way. So I refresh the client in all the APIs, since it is a lightweight operation.

@ajantha-bhat (Member Author):

Restarting the build.

Execution failed for task ':iceberg-spark:iceberg-spark-3.3_2.12:checkstyleMain'.
> A failure occurred while executing org.gradle.api.plugins.quality.internal.CheckstyleAction
   > An unexpected error occurred configuring and executing Checkstyle.
      > java.lang.Error: Error was thrown while processing /home/runner/work/iceberg/iceberg/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java
See https://docs.gradle.org/7.6/userguide/command_line_interface.html#sec:command_line_warnings
499 actionable tasks: 499 executed

@ajantha-bhat ajantha-bhat reopened this Feb 17, 2023
@ajantha-bhat ajantha-bhat marked this pull request as ready for review February 17, 2023 10:56
@ajantha-bhat (Member Author):

cc: @snazy, @dimas-b, @nastra

@ajantha-bhat ajantha-bhat changed the title Nessie: Add refresh for catalog APIs that doesn't use table operations Nessie: Add refresh for catalog APIs Feb 17, 2023
@snazy (Member) commented Feb 17, 2023:

Sure this does not break the expected-hash calculations for commits?

@ajantha-bhat (Member Author) commented Feb 17, 2023:

> Sure this does not break the expected-hash calculations for commits?

I just added a test case for commits from multiple catalogs/clients on the same branch and the same table. It passes.
So, I think there is no impact.
Also, the expected hash is passed after the refresh operation from NessieCatalog, so the client won't be modified after passing the expected hash in the same flow.

@snazy: Do you see any problem with the current changes? Do you feel we should only modify listTables, listNamespaces, loadNamespaceMetadata, removeProperties, setProperties instead of refreshing in all the APIs?

@dimas-b (Contributor) left a comment:

I agree with @snazy's concern about the "expected hash". I think the question here is not about multiple clients, but about correctness within the same JVM. If a transaction starts with the Nessie reference in a certain state, the commit of that Tx should use the "expected" reference as it was at the start of the Tx... but it looks like this PR increases the risk of inadvertently updating the reference while transactions are in progress.

@ajantha-bhat (Member Author):

Isn't it the case even now (without this PR) that, if two concurrent commits happen for the same table, the client can get refreshed between the caller of this code and

UpdateableReference updateableReference = getRef();

because of the existing refresh in this code being triggered by concurrent operations?

@dimas-b (Contributor) commented Feb 17, 2023:

> Isn't it the case even now (without this PR) that, if two concurrent commits happen for the same table, the client can get refreshed [...]

I think so. This issue is not specific to this PR. I just think this PR makes it more likely to manifest.

Would it be possible to keep the "expected" hash at the Tx level?

@ajantha-bhat (Member Author):

> I think so. This issue is not specific to this PR. I just think this PR makes it more likely to manifest.
> Would it be possible to keep the "expected" hash at the Tx level?

@dimas-b: In that case, a txn can only make one successful commit?

> Sure this does not break the expected-hash calculations for commits?

@snazy: Can you please elaborate on this? Even without this PR, concurrent commits from the same client can refresh the catalog and break the expected hash.

a) Are you suggesting that this PR will increase the probability of that event happening, and we don't want to do that?
b) Even sequential updates have the issues mentioned in the PR description, and I want to fix them. Should I keep the changes only for listTables(), listNamespaces(), loadNamespaceMetadata(), setProperties(), removeProperties()?

@snazy (Member) commented Feb 20, 2023:

I'm suggesting that we provide a fix for it.

@dimas-b (Contributor) commented Feb 21, 2023:

+1 to fixing how the Nessie Catalog (and related classes) tracks the expected hash and refreshes references.

@dimas-b (Contributor) commented Feb 21, 2023:

> [...] In that case, a txn can only make one successful commit?

The Nessie server is able to resolve non-conflicting changes to the catalog, even when the expected hash is behind the current branch HEAD.

@ajantha-bhat (Member Author) commented Feb 24, 2023:

@snazy & @dimas-b:

BaseTransaction.commitTransaction()
-> BaseMetastoreTableOperations.commit() with refreshed client
-> NessieTableOperations.doCommit()
-> NessieIcebergClient.commitTable() with the expected hash

Based on my analysis, the Nessie commit operation flow is as above. There is a chance that we use an older commit hash as the expected hash with the latest client. In that case, the Nessie backend server will fail the commit, and Iceberg will retry the transaction/commit using this code [1] by rebasing the metadata and client.

[1] code for rebase with refreshed metadata

    .onlyRetryOn(CommitFailedException.class)
    .run(
        underlyingOps -> {
          try {
            underlyingOps.refresh();
          } catch (NoSuchTableException e) {
            // ...
          }
          // ...
        });

So, even if we add some code to update/refresh the expected hash to the latest, it is not enough, because TableMetadata will hold the old state and the Nessie client cannot modify the table metadata. The existing retry mechanism will handle this refresh.

So, I don't think we need to change anything. Please clarify or guide me if you think this is incorrect.
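The refresh-and-retry behavior described in the comment above can be sketched as a compare-and-swap loop (illustrative names only; this is not the actual Iceberg Tasks utility or the Nessie API). A commit with a stale expected hash is rejected by the "server", the client refreshes to the latest state, and the retry succeeds:

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch of the retry-on-conflict flow; class and method names
// are hypothetical, not the real Iceberg/Nessie API.
public class CommitRetryDemo {

  static class CommitFailedException extends RuntimeException {}

  // Fake branch HEAD on the "server".
  static final AtomicReference<String> head = new AtomicReference<>("h0");

  // Server-side commit: succeeds only if expectedHash still matches HEAD.
  static void serverCommit(String expectedHash, String newHash) {
    if (!head.compareAndSet(expectedHash, newHash)) {
      throw new CommitFailedException();
    }
  }

  // Client-side commit that refreshes and retries on CommitFailedException,
  // mirroring the onlyRetryOn(CommitFailedException.class) loop quoted above.
  static int commitWithRetry(String newHash, int maxAttempts) {
    String expected = "stale"; // start with a deliberately stale expected hash
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        serverCommit(expected, newHash);
        return attempt; // committed on this attempt
      } catch (CommitFailedException e) {
        expected = head.get(); // "refresh()" re-reads the latest state, then retry
      }
    }
    throw new CommitFailedException();
  }

  public static void main(String[] args) {
    int attempts = commitWithRetry("h1", 3);
    System.out.println("committed after " + attempts + " attempts, HEAD=" + head.get());
  }
}
```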

@ajantha-bhat (Member Author) commented Feb 27, 2023:

@snazy, @dimas-b:
I checked some more code today.
I found that NessieTableOperations works on a copy of the client, not the actual client from NessieCatalog, as withReference creates a new client object:

client.withReference(tr.getReference(), tr.getHash()),

So the client inside NessieTableOperations will never be refreshed out from under the operation. There is also this check:

throw new CommitFailedException("Cannot commit: stale table metadata");

TL;DR:
The PR changes are for the client of NessieCatalog, not for the client of NessieTableOperations. Hence, I don't think there is any impact on the expected-hash calculation for the commits of NessieTableOperations.

Please recheck the PR and let me know if a change is required.
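A minimal sketch of the copy semantics described in this comment (hypothetical names, not the real NessieIcebergClient): withReference returns a new client object pinned to a hash, so refreshing the catalog-level client cannot change the hash an in-flight table operation is using.

```java
// Hypothetical sketch of withReference copy semantics: the per-operation copy
// keeps its pinned hash even after the catalog-level client is refreshed.
public class PinnedClientDemo {

  static class Client {
    final String refName;
    final String hash; // null means "resolve the latest HEAD on the server"

    Client(String refName, String hash) {
      this.refName = refName;
      this.hash = hash;
    }

    // Returns a NEW client pinned to the given hash; 'this' is not mutated.
    Client withReference(String refName, String hash) {
      return new Client(refName, hash);
    }
  }

  public static void main(String[] args) {
    Client catalogClient = new Client("main", null);                  // catalog level
    Client opsClient = catalogClient.withReference("main", "abc123"); // pinned copy

    // A catalog-level "refresh" swaps in a new client object...
    catalogClient = new Client("main", null);

    // ...but the per-operation copy keeps the hash it was created with.
    System.out.println(opsClient.hash);     // abc123
    System.out.println(catalogClient.hash); // null
  }
}
```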

@dimas-b (Contributor) left a comment:

Thanks for your work on making this use case more robust, @ajantha-bhat! Some more comments below.

@AfterEach
public void afterEach() throws Exception {
  dropTables(catalog);
  dropTables(anotherCatalog);

@dimas-b (Contributor):
Why bother? The super class will delete/reset all Nessie data, and table files are located in a temp dir.

If the intention is to remove the table's files, wouldn't it be preferable to do that in the base class, at the FS level, for all tests after resetting Nessie?

@ajantha-bhat (Member Author):
I followed TestNessieTable, which was cleaning the table files located in a temp dir.
The other test cases don't have it.

Agreed that this should be at the base test class level. I will handle it in a follow-up PR.

@ajantha-bhat (Member Author):

@dimas-b: Thanks for the review. I have replied to the comments. Please check.

@ajantha-bhat (Member Author) commented Mar 2, 2023:

This flaky error again!

* What went wrong:
Execution failed for task ':iceberg-spark:iceberg-spark-3.3_2.12:checkstyleMain'.
> A failure occurred while executing org.gradle.api.plugins.quality.internal.CheckstyleAction
   > An unexpected error occurred configuring and executing Checkstyle.
      > java.lang.Error: Error was thrown while processing /home/runner/work/iceberg/iceberg/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java

Opened #6982 to get more info.

@ajantha-bhat ajantha-bhat force-pushed the refresh branch 2 times, most recently from 678f6bd to dccfd6b Compare March 13, 2023 09:57
@ajantha-bhat (Member Author):

> While the code looks technically correct, this change introduces an unnecessary extra round-trip. Usually, people work on the HEAD of a Nessie branch, which, for most Nessie operations, does not require the client to know the HEAD's commit ID and pass it back to the server.
> IMO the NessieIcebergClient needs to be able to handle the case of referencing the HEAD of a branch for NessieCatalog (and only pass the branch/tag name to the Nessie server), but be able to provide an instance of NessieIcebergClient with the expected hash / commit ID for newTableOps, renameTable, dropTable, etc.

@snazy, @dimas-b, @nastra: I have avoided the unnecessary extra round-trip by fixing it differently than before. Please take a look. Thanks.
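The design quoted above can be sketched as follows (hypothetical names, not the actual NessieIcebergClient API): catalog-level calls send only the branch name, so the server resolves HEAD itself with no extra round-trip, while table operations receive a copy pinned to the expected hash.

```java
// Hypothetical sketch of the dual-mode client design: address the branch HEAD
// by name for catalog-level calls, hand out hash-pinned copies for table ops.
public class DualModeClientDemo {

  static class NessieClient {
    final String refName;
    final String pinnedHash; // null: server resolves HEAD by name

    NessieClient(String refName, String pinnedHash) {
      this.refName = refName;
      this.pinnedHash = pinnedHash;
    }

    // Catalog-level request: only the branch/tag name goes to the server.
    String requestSpec() {
      return pinnedHash == null ? refName : refName + "@" + pinnedHash;
    }

    // Table operations get a copy pinned to the expected hash / commit ID.
    NessieClient pinnedTo(String hash) {
      return new NessieClient(refName, hash);
    }
  }

  public static void main(String[] args) {
    NessieClient catalog = new NessieClient("main", null);
    System.out.println(catalog.requestSpec());                    // main
    System.out.println(catalog.pinnedTo("abc123").requestSpec()); // main@abc123
  }
}
```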

@ajantha-bhat ajantha-bhat changed the title Nessie: Add refresh for catalog APIs Nessie: Use latest hash for catalog APIs Mar 13, 2023
@snazy (Member) left a comment:

This change adds too many unnecessary code changes. Most of the changes aren't justified; the required fix can likely be achieved with far fewer changes. The current PR also doesn't clean up renameTable within the scope of this PR.

@ajantha-bhat (Member Author):

> This change adds too many unnecessary code changes. Most of the changes aren't justified; the required fix can likely be achieved with far fewer changes. The current PR also doesn't clean up renameTable within the scope of this PR.

@snazy: I gave it another try. Please take a look. Also, renameTable and dropTable were not going via TableOperations, so I have handled them similarly to the existing commitTable logic.

@ajantha-bhat (Member Author) commented Mar 13, 2023:

one of the builds failed due to #7023

https://github.com/apache/iceberg/actions/runs/4405434484

re-triggering the build.

@ajantha-bhat (Member Author):

one of the builds failed due to #7023

https://github.com/apache/iceberg/actions/runs/4434173539/jobs/7779904892

re-triggering the build.

@@ -60,13 +60,6 @@ public String getHash() {
    return reference.getHash();
  }

- public Branch getAsBranch() {

@snazy (Member):

> I deprecated it instead of removing it because it is a public method.

This method was a safe cast; you replaced it (it is not clear why) with an unconditional cast (and, BTW, left another now-unused method behind).

@ajantha-bhat (Member Author):

> This method was a safe cast

Because at all the places where this was called, there is already a check via getRef().checkMutable(). So there is no need for one more method call that checks whether it is a branch; hence the direct cast to Branch.

I will remove isBranch too.

@dimas-b (Contributor) commented Apr 10, 2023:

nit: how about returning a Branch from checkMutable to avoid type casts in the calling code? (Just a thought; no need to delay this PR because of this.)

@ajantha-bhat (Member Author):

I think checkMutable is an early check to ensure we are working on a branch that will later be modified by a commit, whereas getRef or getAsBranch gets the current state of the reference. I am not sure about combining them at the moment. I will explore this in a follow-up as suggested.

@ajantha-bhat ajantha-bhat force-pushed the refresh branch 2 times, most recently from 3bff65d to f04c264 Compare March 16, 2023 10:16
@ajantha-bhat (Member Author):

I think the PR is ready. Please recheck.

@ajantha-bhat (Member Author):

Thanks, @snazy and @dimas-b for the review.
@Fokko can you please help in merging this?


@Fokko (Contributor) left a comment:

Thanks @ajantha-bhat for the PR, and @snazy & @dimas-b for the review. LGTM!

@Fokko Fokko merged commit af4ddd2 into apache:master Apr 11, 2023
ericlgoodman pushed a commit to ericlgoodman/iceberg that referenced this pull request Apr 12, 2023
* Nessie: Handle refresh for catalog APIs that doesn't use table operations

* Add commit testcase

* Another test case

* Address comments

* Avoid hash roundtrip

* Address new comments

* refactor
manisin pushed a commit to Snowflake-Labs/iceberg that referenced this pull request May 9, 2023