Core: Remove deprecated method from BaseMetadataTable #9298

ajantha-bhat · 2023-12-14T12:56:59Z

No description provided.

ajantha-bhat · 2023-12-14T13:48:00Z

core/src/main/java/org/apache/iceberg/SerializableTable.java

@@ -105,6 +105,8 @@ private String metadataFileLocation(Table table) {
    if (table instanceof HasTableOperations) {
      TableOperations ops = ((HasTableOperations) table).operations();
      return ops.current().metadataFileLocation();
+    } else if (table instanceof BaseMetadataTable) {
+      return ((BaseMetadataTable) table).table().operations().current().metadataFileLocation();


This is needed now because the metadata table no longer passes the HasTableOperations check above.
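A minimal sketch of why the extra branch is required, using hypothetical stand-in types (`TableStub`, `OpsStub`, `HasOpsStub`, `BaseTableStub`, `MetadataTableStub`) rather than the real Iceberg classes. It assumes the metadata table stops implementing the operations interface once the deprecated method is removed, and it flattens `current().metadataFileLocation()` into one call:

```java
// Hypothetical stand-ins for the types named in the diff; not the real Iceberg classes.
interface TableStub {}

interface OpsStub {
  String currentMetadataLocation();
}

interface HasOpsStub {
  OpsStub operations();
}

// A regular table exposes its operations directly.
class BaseTableStub implements TableStub, HasOpsStub {
  private final String location;

  BaseTableStub(String location) {
    this.location = location;
  }

  @Override
  public OpsStub operations() {
    return () -> location;
  }
}

// Assumption: the metadata table only exposes the base table it wraps.
class MetadataTableStub implements TableStub {
  private final BaseTableStub base;

  MetadataTableStub(BaseTableStub base) {
    this.base = base;
  }

  BaseTableStub table() {
    return base;
  }
}

class MetadataLocationSketch {
  static String metadataFileLocation(TableStub table) {
    if (table instanceof HasOpsStub) {
      return ((HasOpsStub) table).operations().currentMetadataLocation();
    } else if (table instanceof MetadataTableStub) {
      // Explicit branch: reach the operations through the wrapped base table.
      return ((MetadataTableStub) table).table().operations().currentMetadataLocation();
    }
    // Sketch-only choice: other table kinds have no known metadata file.
    return null;
  }
}
```

Without the second branch, a `MetadataTableStub` would fall through to the final `return null` in this sketch.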

ajantha-bhat · 2023-12-14T14:26:37Z

Looks like some tests cast metadata tables directly to HasTableOperations.
So this PR needs some more work. Let me work on it and ping when it is ready.

ajantha-bhat · 2023-12-15T13:06:06Z

@nastra, @Fokko: PR is ready for review.

nastra · 2023-12-15T14:06:21Z

spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java

@@ -948,6 +950,17 @@ public static org.apache.spark.sql.catalyst.TableIdentifier toV1TableIdentifier(
    return org.apache.spark.sql.catalyst.TableIdentifier.apply(table, database);
  }

+  static String tableUUID(org.apache.iceberg.Table table) {
+    if (table instanceof HasTableOperations) {


why not just call table.uuid() in all of those places?

Because BaseMetadataTable returns a new UUID instead of the base table's UUID.
Shall I fix that method to return the base table's UUID?

iceberg/core/src/main/java/org/apache/iceberg/BaseMetadataTable.java

Lines 204 to 206 in d56dd63

public UUID uuid() {
  return UUID.randomUUID();
}

I think that when we added the UUID interface, we concluded that we should not use the base table's UUID:
#8800 (comment)

Yeah, the argument there is that the metadata table can be considered a separate table and should therefore have its own unique identifier, distinct from the base table's.

But I think @nastra's point still stands: even if it's different from the base table UUID, why does that matter here? I think we just want table.uuid(), right? Or do we need the UUID of the metadata table's underlying table?

I tried using table.uuid(), but many test cases failed because the scan task of a metadata table expects the UUID of the base table, not of the metadata table.

java.lang.IllegalArgumentException: No scan tasks found for 2c44000a-aa24-479a-8666-292cee70b95f

Does it make sense to return the base table's UUID for the metadata table? (That is, to change the behaviour from #8800?)

done. Rebased.

Hmm, looks like SparkStagedScan expects the base table's UUID as the cache key for metadata tables.

Either we need to change that logic or return the base table's UUID. I will dig deeper next week.
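A toy illustration of the failure mode described above, assuming staged scan tasks are cached under the base table's UUID. `StagedTaskCache` and `CacheKeySketch` are hypothetical names, not Iceberg's actual caching classes; only the error message text is taken from the stack trace quoted in this thread:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical cache; stands in for whatever holds staged scan tasks per table.
class StagedTaskCache {
  private final Map<UUID, List<String>> tasksByTable = new HashMap<>();

  void stage(UUID tableUuid, List<String> tasks) {
    tasksByTable.put(tableUuid, tasks);
  }

  List<String> fetch(UUID tableUuid) {
    List<String> tasks = tasksByTable.get(tableUuid);
    if (tasks == null) {
      throw new IllegalArgumentException("No scan tasks found for " + tableUuid);
    }
    return tasks;
  }
}

class CacheKeySketch {
  // Returns true when a lookup with the given key fails.
  static boolean lookupFails(StagedTaskCache cache, UUID key) {
    try {
      cache.fetch(key);
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }
}
```

If tasks are staged under the base table's UUID and fetched with the metadata table's own UUID, the lookup misses, which would match the exception seen in the failing tests.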

Sure, I'd check out that logic further and we can see what the right behavior is here. I still think the change made in #9310 is the right fix from an API perspective (even if we decide not to use that API here). The main issue solved there was that, semantically, the metadata table UUID should be the same for the same reference.

In other words, imo I would not change the UUID API semantics to fit whatever the caching logic relies on.

If we need the base table UUID for the caching logic, then maybe MetadataTable specifically can expose another API for exposing the underlying base table's UUID. Or alternatively keep it as is, and just expose the underlying Table (but that seems to expose too much imo).
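To make "the same for the same reference" concrete, here is a small sketch contrasting the two behaviours. `FreshUuidTable` mirrors the `uuid()` implementation quoted earlier in this thread; `StableUuidTable` is an assumed shape for the fix and may not match what #9310 actually does:

```java
import java.util.UUID;

// Regenerates the identifier on every call, so two calls on one instance disagree.
class FreshUuidTable {
  UUID uuid() {
    return UUID.randomUUID();
  }
}

// Assumed shape of the fix: generate once per instance, then always return that value.
class StableUuidTable {
  private final UUID uuid = UUID.randomUUID();

  UUID uuid() {
    return uuid;
  }
}
```

Under this assumption, each metadata table instance still has an identifier distinct from any other table, but repeated calls on the same reference agree.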

Looks like multiple classes tightly couple metadata table scan tasks to the main table's UUID.
If we need to change that, it can be handled in a separate PR (issue), as it has nothing to do with this deprecated method removal.

Hence, I have reverted the switch to table.uuid().

So, this PR can go ahead.
cc: @nastra, @amogh-jahagirdar

@nastra: Thoughts?

core/src/main/java/org/apache/iceberg/BaseMetadataTable.java

ajantha-bhat · 2023-12-27T11:06:03Z

Just rebased to resolve conflict.

amogh-jahagirdar · 2024-01-04T02:07:47Z

Sorry for the delay in reviewing this, @ajantha-bhat. I'll take a look at this tomorrow.

amogh-jahagirdar

I think it looks close @ajantha-bhat, just a comment on the tableUUID implementation returning null when we don't know what kind of table it is.

I think it's good to preserve the metadataTable UUID behavior and not return the base table. The rationale is that a UUID for a table should be unique per table and metadata tables are no different in this regard.

I think the way that it's implemented in this PR is fine.

amogh-jahagirdar · 2024-01-04T20:46:26Z

spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java

+    } else if (table instanceof BaseMetadataTable) {
+      return ((BaseMetadataTable) table).table().operations().current().uuid();
+    } else {
+      return null;


I think this should probably throw an exception instead of returning null.

Also, instead of tableUUID, maybe baseTableUUID, since that's what we're really getting.
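One way to see the risk of returning null: a filter that compares keys with `equals` silently matches nothing when handed a null UUID, so the caller gets an empty result instead of an error. The sketch below is hypothetical (`NullVersusThrowSketch`, its entries, and the string-typed UUIDs are all made up for illustration); the exception type and message follow what is proposed elsewhere in this review:

```java
import java.util.List;
import java.util.stream.Collectors;

class NullVersusThrowSketch {
  // Hypothetical staged file sets, each stored as {owning table UUID, file set id}.
  static final List<String[]> ENTRIES =
      List.of(new String[] {"uuid-a", "set-1"}, new String[] {"uuid-b", "set-2"});

  // With a null UUID the filter matches nothing, so the problem goes unnoticed.
  static List<String> fetchSetIds(String tableUuid) {
    return ENTRIES.stream()
        .filter(e -> e[0].equals(tableUuid))
        .map(e -> e[1])
        .collect(Collectors.toList());
  }

  // Fail-fast alternative: refuse to resolve a UUID for an unknown table kind.
  static String baseTableUuid(String resolvedUuidOrNull, String tableName) {
    if (resolvedUuidOrNull == null) {
      throw new UnsupportedOperationException("Cannot retrieve UUID for table " + tableName);
    }
    return resolvedUuidOrNull;
  }
}
```

Throwing surfaces the unsupported table kind at the point of resolution rather than as a puzzling empty result later.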

amogh-jahagirdar · 2024-01-04T20:51:16Z

spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/BaseFileRewriteCoordinator.java

@@ -72,18 +70,12 @@ public void clearRewrite(Table table, String fileSetId) {

  public Set<String> fetchSetIds(Table table) {
    return resultMap.keySet().stream()
-        .filter(e -> e.first().equals(tableUUID(table)))
+        .filter(e -> e.first().equals(Spark3Util.tableUUID(table)))


Nit: I think it would be a bit cleaner just to import Spark3Util.tableUUID and then just use tableUUID here (the diff on line 73 and other places would essentially go away in favor of just a new import statement). But that's nbd, if we do the method rename like I suggested we lose this benefit anyways.

Yeah, I renamed it to baseTableUUID. I am not sure about the guidelines on static imports: in some places we use them and in others we don't. So I left it as it is.

nastra · 2024-01-06T16:52:51Z

spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java

+    } else if (table instanceof BaseMetadataTable) {
+      return ((BaseMetadataTable) table).table().operations().current().uuid();
+    } else {
+      throw new UnsupportedOperationException("Cannot fetch table operations for " + table.name());


Should this be replicated across all Spark versions? Also, I would probably update the error msg to "Cannot retrieve UUID for table ..."

ajantha-bhat · 2024-01-07T12:57:19Z

Retriggering the build due to flaky test in Flink.

manuzhang · 2024-01-08T01:43:18Z

spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java

+      TableOperations ops = ((HasTableOperations) table).operations();
+      return ops.current().uuid();
+    } else if (table instanceof BaseMetadataTable) {
+      return ((BaseMetadataTable) table).table().operations().current().uuid();


Calling table() on table looks strange. It would be better to have a method baseTable().

This function can be called for a main table or a metadata table, so the variable name is table.

Agree that BaseMetadataTable could expose a public baseTable() method. But the current table() method is public, so we can't rename it directly. It has to be deprecated first, with the new method added alongside it.

So, I think we can leave it as it is as of now (out of scope for this PR).
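For reference, the deprecate-then-rename path mentioned above could look roughly like this. It is a hypothetical sketch: `WrappedTable` and `MetadataTableSketch` are stand-ins, and this PR does not add a `baseTable()` method to BaseMetadataTable:

```java
// Hypothetical stand-in for the wrapped base table.
class WrappedTable {
  private final String name;

  WrappedTable(String name) {
    this.name = name;
  }

  String name() {
    return name;
  }
}

class MetadataTableSketch {
  private final WrappedTable base;

  MetadataTableSketch(WrappedTable base) {
    this.base = base;
  }

  // New, clearer accessor.
  public WrappedTable baseTable() {
    return base;
  }

  /**
   * @deprecated use {@link #baseTable()} instead; kept so existing callers still compile.
   */
  @Deprecated
  public WrappedTable table() {
    return baseTable();
  }
}
```

The old accessor delegates to the new one, so both return the same object until the deprecated method is removed in a later release.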

ajantha-bhat · 2024-01-09T01:56:09Z

PR is ready.

ajantha-bhat · 2024-01-11T12:28:03Z

ping.
Anything else needed for this PR?

amogh-jahagirdar

This looks good to me now @ajantha-bhat, thanks for the follow-up. I'll wait for a bit in case others have any comments before merging.

github-actions bot added the core label Dec 14, 2023

ajantha-bhat force-pushed the dep_ops branch from fa7c7de to 5d14bf1 Compare December 14, 2023 13:47

ajantha-bhat requested review from rdblue, nastra and Fokko December 14, 2023 13:48

ajantha-bhat marked this pull request as draft December 14, 2023 14:25

ajantha-bhat force-pushed the dep_ops branch from 5d14bf1 to aa1870d Compare December 15, 2023 09:22

github-actions bot added the spark label Dec 15, 2023

ajantha-bhat force-pushed the dep_ops branch 2 times, most recently from 5d3825c to 005ecf8 Compare December 15, 2023 12:22

ajantha-bhat marked this pull request as ready for review December 15, 2023 13:05

nastra reviewed Dec 15, 2023

ajantha-bhat mentioned this pull request Dec 16, 2023

Core: Fix Metadata table's UUID #9310

Merged

ajantha-bhat marked this pull request as draft December 16, 2023 12:25

amogh-jahagirdar reviewed Dec 16, 2023

ajantha-bhat force-pushed the dep_ops branch from 0275034 to a1d74dd Compare December 17, 2023 01:51

ajantha-bhat marked this pull request as ready for review December 17, 2023 01:51

ajantha-bhat force-pushed the dep_ops branch from a1d74dd to e1ab527 Compare December 18, 2023 12:36

ajantha-bhat force-pushed the dep_ops branch from e1ab527 to 60d3527 Compare December 27, 2023 11:05

ajantha-bhat added this to the Iceberg 1.5.0 milestone Jan 3, 2024

amogh-jahagirdar reviewed Jan 4, 2024

ajantha-bhat force-pushed the dep_ops branch from 60d3527 to 8707255 Compare January 5, 2024 09:01

nastra reviewed Jan 6, 2024

ajantha-bhat force-pushed the dep_ops branch from 8707255 to dba6a64 Compare January 7, 2024 12:32

ajantha-bhat closed this Jan 7, 2024

ajantha-bhat reopened this Jan 7, 2024

manuzhang reviewed Jan 8, 2024

ajantha-bhat added 2 commits January 8, 2024 07:35

Core: Remove deprecated method from BaseMetadataTable

50b23c9

Address comments

e13e6d2

ajantha-bhat force-pushed the dep_ops branch from dba6a64 to e13e6d2 Compare January 8, 2024 02:05

amogh-jahagirdar approved these changes Jan 17, 2024

nastra approved these changes Jan 18, 2024

ajantha-bhat mentioned this pull request Jan 18, 2024

Arrow, AWS, Core: Remove deprecated code for 1.5.0 release #9505

Merged

amogh-jahagirdar merged commit 6e7702d into apache:main Jan 18, 2024
42 checks passed

geruh pushed a commit to geruh/iceberg that referenced this pull request Jan 26, 2024

Core: Remove deprecated operations method from BaseMetadataTable (apa…

6cb2d7b

…che#9298)

adnanhemani pushed a commit to adnanhemani/iceberg that referenced this pull request Jan 30, 2024

Core: Remove deprecated operations method from BaseMetadataTable (apa…

5340bfc

…che#9298)

devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024

Core: Remove deprecated operations method from BaseMetadataTable (apa…

2a0d9f0

…che#9298)
