API: Support removeUnusedSpecs in ExpireSnapshots #10755

Open · wants to merge 6 commits into base: main

Conversation

@advancedxy (Contributor) commented Jul 23, 2024:

This is a continuation of #3462; all credit goes to @RussellSpitzer.

Previously there was no way to remove partition specs from a table once they were
added. To fix this, we add an API which searches through all reachable manifest
files and records their spec IDs. Any spec IDs which are not found are marked for
removal, which is done through a serializable commit.
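
Roughly, the reachability check described above amounts to the following (an illustrative sketch, not the code in this PR):

import java.util.HashSet;
import java.util.Set;

import org.apache.iceberg.ManifestFile;
import org.apache.iceberg.Snapshot;
import org.apache.iceberg.Table;

class UnusedSpecSketch {
  // Collect every spec id still referenced by a reachable manifest; anything else is unused.
  static Set<Integer> unusedSpecIds(Table table) {
    Set<Integer> reachable = new HashSet<>();
    for (Snapshot snapshot : table.snapshots()) {
      for (ManifestFile manifest : snapshot.allManifests(table.io())) {
        reachable.add(manifest.partitionSpecId());
      }
    }

    Set<Integer> unused = new HashSet<>(table.specs().keySet());
    unused.removeAll(reachable);
    unused.remove(table.spec().specId()); // never remove the current default spec
    return unused;
  }
}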

@advancedxy (Contributor Author) commented:

@amogh-jahagirdar @RussellSpitzer @aokolnychyi @szehon-ho it would be great if you guys could take a look at this.

This PR is based on the previous discussion in #10352 (comment).

@amogh-jahagirdar (Contributor) left a comment:

Thank you for carrying this forward @advancedxy! I don't think I'd go with a general SetPartitionSpecs update; I think I'd have a RemovePartitionSpec, and the TableMetadata builder APIs to remove a given spec (which will have validation that we're not removing the current spec, that the spec to remove is a valid partition spec, etc.).

A few more things to consider:

1.) Should we include removing unused schemas? I know there are users whose tables undergo numerous evolutions to add fields, and they want to remove old schemas since their metadata ends up so bloated that it causes performance concerns when reading! My conclusion here is no; I believe that should be a separate operation, since I think we want APIs to do one thing and do it well. A caller can combine the two if desired.

2.) In this approach I think we'd have to do a REST spec change to introduce the new update type. If we think this API is worth it for general-purpose metadata cleaning, beyond preventing the drop-column issue, then we'd probably have to go through with the spec change. However, if this is only being introduced for the drop-column issue, maybe we want to think through lighter-weight options to solve that particular issue?

Another possible way: are we able to retain the spec and essentially not care about the dropped field's existence? This may be something worth exploring. Sorry for failing to consider this earlier; I wasn't considering it because I assumed we wouldn't need any heavyweight spec changes and the API would be generalizable beyond this drop-column case. I'll also do some exploration here.

core/src/main/java/org/apache/iceberg/TableMetadata.java (outdated; resolved)
@advancedxy (Contributor Author) commented:

I don't think I'd go with a general SetPartitionSpecs update, I think I'd have a RemovePartitionSpec, and the TableMetadata builder APIs to remove a given spec (which will have validation that we're not removing the current spec, the spec to remove is a valid partition spec etc).

This is a nice suggestion and a better approach. I will replace withSpecs/setSpecs with removePartitionSpec.

Replying to other comments inline.

  1. Should we included removing unused schemas?

No, and I think we are on the same page. PruneUnusedSchemas should be a separate, dedicated action. However, it might not be possible to do that in the current code, as there's no schema ID bound to a DataFile/DeleteFile, so it's impossible to decide which schemas are unused.

In this approach I think we'd have to do a REST spec change to introduce the new update type. If we think this API is worth it for general purpose metadata cleaning, beyond preventing the drop column issue then we'd probably have to go through with the spec change. However, if this is only being introduced for the drop column issue, maybe we want to think through lighter weight options to solve that particular issue?

I don't think we should expose removePartitionSpec to the REST catalog, at least for now. I think RemoveUnusedSpecs is a general metadata-cleaning API and worth introducing; however, the API is self-contained and doesn't have to be coupled with the REST catalog. As the comment in TableMetadata says, it's not safe for an external client to simply call removePartitionSpec without checking that the spec is unused, which might not be an easy task in the REST catalog. Unless we think it's worth adding a check that reads manifest files in the REST catalog, in which case we may expose removePartitionSpec to the REST catalog. That's why the org.apache.iceberg.MetadataUpdate.SetPartitionSpecs#applyTo method throws an exception instead of actually implementing it.

If introducing a new MetadataUpdate implies a REST spec change, I think we can change TableMetadata.Builder#build to build without adding a RemovePartitionSpec change. How does that sound to you?

@advancedxy closed this Jul 26, 2024
@advancedxy reopened this Jul 26, 2024
@advancedxy (Contributor Author) commented:

Close and re-open to trigger the CI.

Also gently ping @amogh-jahagirdar @RussellSpitzer to take another look.

/**
* Prune the unused partition specs from the table metadata.
*
* <p>Note: it's not safe for external client to call this directly, it's usually called by the
Member:

This method is package private, does it need this note?

Contributor Author:

I'd like to highlight this note to reduce potential misuse, as it's possible for users to put their own code in the org.apache.iceberg package to bypass Java's access control.

Contributor:

I agree with @RussellSpitzer. This class is internal and many of its methods can break tables if called incorrectly.

Also, we are no longer adding new operation methods to this. These days we make modifications to the Builder instead.

Contributor Author:

Let me remove the note then.

Also, we are no longer adding new operation methods to this. These days we make modifications to the Builder instead.

For this part, the modifications still go through the Builder. The main purpose of this method is to provide a single access point for pruning unused specs, so that pruning unused specs is not mixed with other metadata updates.
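
For reference, the single access point and the Builder path from the diff hunks in this PR would be used roughly like this (a sketch, not the exact code):

// single access point on TableMetadata (package-private in the PR)
TableMetadata pruned = current.pruneUnusedSpecs(specsToRemove);

// which internally delegates to the Builder, roughly:
TableMetadata alsoPruned =
    TableMetadata.buildFrom(current).removeUnusedSpecs(specsToRemove).build();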

@RussellSpitzer (Member) left a comment:

I think this is really close, I just want to have those remaining nits of mine addressed.

@amogh-jahagirdar (Contributor) left a comment:

@advancedxy This is getting closer. Thank you for moving from a "set" to a "remove" semantic; that fits better with the general principles of the project. But I think the main question I had was around why we need a special flag for hasRemovedSpecs when building. It seems like that's just working around the fact that there's no metadata update, which I know I mentioned would require a spec change for REST. If that's the case, I think the right solution is to add a TODO with a follow-on issue for supporting it for REST.

Also, since there's no update type for REST, the operation should fail on the client side for REST so that it's clear. Again, I think that's fine in the interim until we add support for it, but we should ideally validate that behavior in a test, since a bad case would be if the commit for the RemoveUnusedSpecs appears to succeed on REST but nothing actually happens. cc @RussellSpitzer @rdblue

core/src/main/java/org/apache/iceberg/TableMetadata.java (outdated; resolved)
* @param toRemoveSpecs the partition specs to be removed
* @return the new table metadata with the unused partition specs removed
*/
TableMetadata pruneUnusedSpecs(List<PartitionSpec> toRemoveSpecs) {
Contributor:

Can we call the parameter specsToRemove?

Comment on lines 1485 to 1487
if (hasRemovedSpecs) {
Preconditions.checkArgument(
changes.isEmpty(), "Cannot remove partition specs with other metadata update");
Contributor:

@advancedxy I'm not sure I follow, what's the intention of this check?

Contributor Author:

See https://github.com/apache/iceberg/pull/10755/files#r1690420193 and https://github.com/apache/iceberg/pull/10755/files#r1696099907

I think this check is to make sure that RemoveUnusedSpecs and other metadata updates do not happen together.

@@ -1425,6 +1460,7 @@ private boolean hasChanges() {
|| (discardChanges && !changes.isEmpty())
|| metadataLocation != null
|| suppressHistoricalSnapshots
|| hasRemovedSpecs
Contributor:

Hm, I'm trying to understand why we need this special flag. Is this a way to avoid having to add the metadata update type for REST (since then it would ultimately just be in the changes list)?

Contributor:

This is not the right way to update hasChanges. Instead, this needs to add a change to changes. That way it is sent to REST services to modify the table in the REST commit path.

Contributor Author:

is this a way so that we avoid having to add the metadata update type for REST (since then that would ultimately just be in the changes list)?

Yes, as discussed earlier, I don't think we should expose removePartitionSpec directly to the REST API as there's no easy way to ensure that the spec to remove is indeed not used.

This is not the right way to update hasChanges. Instead, this needs to add a change to changes.

It was adding a RemovePartitionSpec to the changes. However, that would require a REST spec change and it's a bit heavy. See discussions as well: #10755 (comment)

That way it is sent to REST services to modify the table in the REST commit path.

I might be missing something: do all the metadata changes have to be sent to the REST service? I haven't worked with a REST catalog before and don't see how changes are sent to the REST service. It would be great if some reference or code could be pointed to.

Contributor Author:

I might be missing something, so all the metadata changes have to be sent to the REST service? I didn't work with a REST catalog before and don't see how changes are sent to the REST service. It would be great that some reference or code could be pointed to.

I took a look at the REST catalog related code today; it seems the implementation in this PR doesn't work with Iceberg tables backed by a REST catalog, as there's no update added for RemovePartitionSpec. RemoveUnusedSpecs will succeed without actually removing unused specs for a REST catalog.

If supporting the REST catalog is a hard requirement, I think we have to go through a REST spec change to add a new update type to reflect that. The only concern is how to enforce that the removed spec is indeed no longer used. Do we do the same calculation in org.apache.iceberg.rest.CatalogHandlers#commit as we did in BaseRemoveUnusedSpecs? Or is that even necessary?

WDYT? @RussellSpitzer @rdblue @amogh-jahagirdar

Contributor:

Yes, as discussed earlier, I don't think we should expose removePartitionSpec directly to the REST API as there's no easy way to ensure that the spec to remove is indeed not used.

We don't currently have operations that can't be supported by the REST protocol, so this is an area where we should be careful. I agree that we want to ensure that there are no new references to specs that are being removed, but I'm skeptical that there is no way to do that. There are also problems with not sending this because it would be a silent no-op when sending changes to REST catalogs.

It's unlikely that a new snapshot would be written with a spec that is being removed because this already validates that the default spec is not being removed. A conflict here would require that a concurrent writer is using a spec other than the default or has changed the default spec. There's already a validation for the second case, assert-default-spec-id. For the first case, this could require that no branch states have changed using assert-ref-snapshot-id.
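
For illustration only, the commit requirements described here might be serialized roughly like this (requirement type names as given above; other field names and values are illustrative):

[
  { "type": "assert-default-spec-id", "default-spec-id": 2 },
  { "type": "assert-ref-snapshot-id", "ref": "main", "snapshot-id": 123456789 }
]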

@amogh-jahagirdar (Contributor) commented Jul 30, 2024:

The RemoveUnusedSpecs will succeed without actually removing unused specs for REST catalog.

@advancedxy This indicates there's a problem with the current implementation then imo. If there was a way to avoid the REST spec change I think it would've been OK as long as we can guarantee failure on the client side until the support was added for the metadata update type. But I think that's unavoidable, and I agree with @rdblue that we should probably just add the metadata update type.

I also am reasonably confident that a REST catalog can safely handle this RemovePartitionSpec update. The default spec ID needs to be the same and there must have been no writes to any branches. If any of those are not true, the server should fail the update.

Contributor Author:

We don't currently have operations that can't be supported by the REST protocol, so this is an area where we should be careful.

Yes, after taking a look at the related code, I think we should strive to make all operations supported by the REST protocol.

It's unlikely that a new snapshot would be written with a spec that is being removed because this already validates that the default spec is not being removed. A conflict here would require that a concurrent writer is using a spec other than the default or has changed the default spec. There's already a validation for the second case, assert-default-spec-id. For the first case, this could require that no branch states have changed using assert-ref-snapshot-id.

Thanks for the inspiring explanation. I wasn't worried about the normal path, which I am also confident a REST catalog can safely handle. I'm more concerned about misuse, such as users issuing a RemovePartitionSpec request without going through the RemoveUnusedSpecs API, or accidentally calling TableMetadata.Builder.removePartitionSpecs directly. That's why the method is defined in the TableMetadata class in the first place: to hide the Builder's method and ensure a single point of access.

The REST catalog, as it's open by default to all kinds of clients, is more likely to be affected by such misuse. That's why I'm proposing to do the check on the catalog handler side as well:

Do all the similar calculation in org.apache.iceberg.rest.CatalogHandlers#commit like how we did in BaseRemoveUnusedSpecs?

However, it's a heavy operation; is it worth it?

core/src/main/java/org/apache/iceberg/TableMetadata.java (outdated; resolved)
*
* <p>{@link #apply()} returns the specs that will remain if committed on the current metadata
*/
public interface RemoveUnusedSpecs extends PendingUpdate<List<PartitionSpec>> {}
Contributor:

Does this need to be a separate operation? It seems very specific. I wonder if it is worth adding a maintenance API that could cover more things, like removing old schemas as well.

Contributor Author:

Missed this comment. I think it would be wonderful to be able to remove unused schemas as well. However, it might depend on #4898 to reliably determine which schema IDs are still in use.

Does this need to be a separate operation?

I think so. Even if we are going to group other maintenance APIs, like removing unused schemas, together, they are two different operations and each should be a dedicated operation.

I wonder if it is worth adding a maintenance API that could cover more things, like removing old schemas as well.

Do you by chance have an API name in mind for the metadata maintenance class? I am thinking about something like MetadataCleaner.

Contributor:

dedicated operation

By "dedicated operation", I mean a method in the Table API. Adding this kind of thing is a lot of work, so I'd prefer a reusable option that can handle multiple tasks, like this:

table.maintenance()
    .removeUnusedSpecs()
    .removeUnusedSchemas()
    .commit()

Another option is to do this regularly as part of snapshot expiration. Have you considered that? Since expiration already reads manifests that could be a good place to do this.

Contributor:

[Removing schemas] might depends on #4898 to reliably determine which schemaId is still in use.

We don't need to check whether there are data files that were written with a particular schema ID, only whether there are snapshots that reference the schema ID. Data files are readable by any future schema by design.
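
A minimal sketch of that check, assuming only the current schema and snapshot-level schema references need to be kept (illustrative, not code from this PR):

import java.util.HashSet;
import java.util.Set;

import org.apache.iceberg.Snapshot;
import org.apache.iceberg.Table;

class UsedSchemaSketch {
  // Schema ids referenced by the current schema or by any retained snapshot.
  static Set<Integer> referencedSchemaIds(Table table) {
    Set<Integer> referenced = new HashSet<>();
    referenced.add(table.schema().schemaId());
    for (Snapshot snapshot : table.snapshots()) {
      if (snapshot.schemaId() != null) { // older snapshots may not record a schema id
        referenced.add(snapshot.schemaId());
      }
    }
    return referenced;
  }
}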

Contributor Author:

table.maintenance()
    .removeUnusedSpecs()
    .removeUnusedSchemas()
    .commit()

This looks promising. But what if we need to further configure the maintenance operations, such as

table.maintenance().removeUnusedSpecs().retainLast(num).commit();
// or something
table.maintenance().removeUnusedSchemas().setMinSchemasToKeep(num).commit();

Different maintenance operations may have different configuration options. Would it be better to use a dedicated operation for each purpose? Of course, it would be great if these maintenance APIs were grouped together.

Have you considered that? Since expiration already reads manifests that could be a good place to do this.

This is tempting and, to be honest, I haven't thought about it.

Update: I just did a quick look at the RemoveSnapshots implementation; it might add too much complexity to put the remove-unused-specs logic in there.

@amogh-jahagirdar (Contributor) commented Jul 30, 2024:

@advancedxy I think a decent example to look at would be the ManageSnapshots API, which handles cherry-picking/rollback and branching/tagging operations. That is the public interface (analogous to "maintenance" in this case), but the individual operation implementations are still in separate classes which are package-private and focused on a single operation, which also enables different configuration options, as you mentioned.

The ManageSnapshots implementation is tracking all of these operations as part of a transaction (which for the combined schema + partition spec pruning operation case sounds reasonable to me).

I think the pending question is whether this should be done as part of ExpireSnapshots or as a separate operation. If I think about when it makes sense for this cleanup to happen, ExpireSnapshots does make sense. I also don't know what we'd call a separate "maintenance" API since there are quite a few maintenance operations in Iceberg.

Contributor Author:

I think a decent example to look at would be the ManageSnapshots API which handles cherry picking/rollback and branching/tagging operations.

Yes, I am thinking about something similar to that.

If i think about when it makes sense for this cleanup to happen ExpireSnapshots does make sense. I also don't know what we'd call a separate "maintenance" API since there's quite a few maintenance operations in Iceberg.

I agree it's tempting. But I would prefer a maintenance API, for the following reasons:

  1. Currently ExpireSnapshots extends PendingUpdate<List<Snapshot>>; if we are going to remove unused specs (and maybe unused schemas as well), we have to change the interface signature, which is a breaking change.
  2. ExpireSnapshots doesn't read manifest lists yet; it only leverages SnapshotRef and Snapshot to calculate expired snapshots. This adds complexity and breaks the do-one-thing-and-do-it-well philosophy.

Contributor:

I agree it's attempting. But I would prefer to use a maintenance API, for the following reasons: Currently ExpireSnapshots extends PendingUpdate<List>, if we are going to remove unused spec(and maybe unused schemas as well), we have to change the interface signature, which is breaking change.
ExpireSnapshots doesn't read manifest list yet, it only leverage SnapshotRef and Snapshot to calculate expired snapshots. It's adding complexity and breaks the do one thing and do it well philosophy.

  1. I don't think it's necessarily true that we need to change the interface signature. We are still producing the set of snapshots that are removed as part of the procedure, but in addition to that we are (in an opt-in manner) pruning out the unused specs/schemas, which is orthogonal to snapshots for this purpose.

  2. As part of file cleanup, the procedure does determine which files are still referenced (going through manifest lists/manifests) and which should be removed (this is in the FileCleanupStrategy implementations like ReachableFileCleanup). I think it's true that it adds a bit of complexity, but the complexity is tracking which partition specs are referenced as we traverse the manifests. The same users who frequently run expire snapshots probably also want to get rid of unused specs/schemas to keep metadata sizes smaller. After some more thought, I feel like it aligns with "do one thing and do it well", since the "one thing" we're talking about (the logic to traverse manifests) is already common, relative to expanding the API surface.

});
}

private TableMetadata removeUnusedSpecs(TableMetadata current) {
Contributor:

I think other operations typically call this method internalApply when it is the core functionality of apply but the class needs to call it from both apply and commit.

Contributor Author:

Let me rename it to internalApply then.

MetadataTableUtils.createMetadataTableInstance(table, MetadataTableType.ALL_ENTRIES)
.newScan()
.planFiles(),
task -> ((BaseEntriesTable.ManifestReadTask) task).partitionSpecId()));
Contributor:

We generally avoid unchecked casts because this creates a brittle dependency on a particular type being produced. That in turn limits our ability to trust the type system and make reasonable changes quickly.

If I understand correctly, the purpose of using a metadata table here is not to use the Table interface, but instead to reuse some of the code in the all_entries table. I think a more direct path would be to use ManifestGroup and possibly refactor some of the logic from the all_entries table to be more easily reused.

Contributor Author:

Good point, let me refactor this to avoid unchecked cast.

If I understand correctly, the purpose of using a metadata table here is not to use the Table interface, but instead to reuse some of the code in the all_entries table

I think this code is used to avoid actually accessing the underlying rows, see #3462 (comment). We can/should use the Table interface to access Manifest files directly.

How does that sound to you?

@@ -1102,6 +1121,22 @@ public Builder setDefaultPartitionSpec(int specId) {
return this;
}

private Builder removePartitionSpec(PartitionSpec spec) {
Preconditions.checkArgument(
changes.isEmpty(), "Cannot remove partition spec with other metadata update");
Contributor:

Why is this necessary?

Contributor:

I think this is needed to avoid conflicts with changes to the default spec ID. I'd probably change this to allow other changes and instead update this and the methods that set the default spec ID to check for one another.

Contributor Author:

I think this is needed to avoid conflicts with changes to the default spec ID.

Yeah, besides SetDefaultSpec, AddPartitionSpec might also interfere with this. For safety, the previous implementation rejects all other metadata updates, since removePartitionSpec is rarely called and always called alone.

@advancedxy (Contributor Author) left a comment:

Thanks @rdblue and @amogh-jahagirdar for reviewing; I will address your comments in a new commit.

api/src/main/java/org/apache/iceberg/Table.java (outdated; resolved)

@advancedxy (Contributor Author) commented:

Gently ping @amogh-jahagirdar @rdblue @RussellSpitzer

@amogh-jahagirdar (Contributor) commented:

Sorry for missing the follow-up on this @advancedxy. Yes, I think we should send the new update type as part of this PR; that would require some parser changes as well for the update type. Let me know if any help is needed. I can try taking a look at this tomorrow if we want to get the parser changes in first and then rebase this PR on top of them, so this PR is more focused on the operation parts.

@advancedxy (Contributor Author) commented:

yes I think we should send the new update type as part of this PR.that would require some parser changes as well for the update type. Let me know if any help is needed, I can try taking a look at this tomorrow if we wanted to get in the parser changes first

OK, no problem. Thanks for offering to help; I think I can add the relevant code in this PR as well. We can split the parser changes into a separate PR if needed.

@advancedxy (Contributor Author) commented:

@amogh-jahagirdar Updated. It would be great if you could take another look at this.

@advancedxy (Contributor Author) commented:

Gently ping @amogh-jahagirdar @RussellSpitzer @rdblue

@amogh-jahagirdar (Contributor) commented:

Sorry for the delay @advancedxy , I'll take a look at this first thing tomorrow morning!

@amogh-jahagirdar (Contributor) left a comment:

OK @advancedxy, sorry for the delay. The main points from my side:

I'm still leaning a bit towards embedding the unused specs/schemas logic in the snapshot expiration logic. I think we can expose the referenced specs from file cleanup (those are all package-private, not public APIs). Then we wouldn't need the additional maintenance APIs; it would just be an additional option on the snapshot expiration API. The tricky part may be how the commit is performed, since file cleanup happens after a successful commit of the snapshot expiration. Let me see if I can raise a PR to your branch soon, since I know this PR has been open for a while and there's interest, to show what I mean, and we can discuss further there.

If that doesn't work or we prefer the additional API, I think we'd probably want to rename MetadataMaintenance to something a bit more narrow like SchemaMaintenance since there are other metadata maintenance operations.

package org.apache.iceberg;

/** APIs for table metadata maintenance, such as removing unused partition specs. */
public interface MetadataMaintenance {
Contributor:

Hm, I still feel like having the separate maintenance API has the issue of what goes in here, because a user of the API may feel like expire snapshots should also go here, since snapshot expiration can naturally be considered "metadata maintenance" too. Maybe it's just a naming issue, so I'll think more about alternatives.

One alternative that comes to mind is SchemaMaintenance. Even though we are removing unused specs, the partition spec in the end is how the partition values are derived from the fields in the schema. And it also fits with the future schema pruning we want to do.

In my head I was really thinking that adding this as part of expire snapshots wouldn't add too much complexity to the procedure, and users who are running that would also generally want to prune the unused specs/schemas. I'll look into that a bit more.

Contributor:

cc @RussellSpitzer @rdblue if they have any thoughts on this

Contributor:

@advancedxy I published a PR to your branch advancedxy#1. Let me know what you think!

Contributor Author:

I did a quick look at your PR: it doesn't add a breaking change to ExpireSnapshots, so I prefer your idea now. I think we can discuss it over on your PR and merge your code first.

I will take a detailed look at your PR tomorrow morning.

@@ -1108,6 +1108,46 @@ public Builder setDefaultPartitionSpec(int specId) {
return this;
}

Builder removeUnusedSpecs(Iterable<PartitionSpec> specsToRemove) {
Contributor:

Sorry, not completely following: do we need this? Why couldn't we always just use removeUnusedSpecsById?

Contributor Author:

This method has an additional check so that we don't accidentally delete other, unknown partition specs. I think this should be preferred over removeUnusedSpecsById, which should be used in REST server implementations.
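
Roughly, the extra check resolves each spec against the metadata's known specs before delegating to the id-based path (hedged sketch; specsById stands in for the Builder's internal map):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.iceberg.PartitionSpec;

class SpecRemovalCheckSketch {
  // Only hand ids to the id-based removal when each requested spec matches a known spec.
  static List<Integer> validatedSpecIds(
      Map<Integer, PartitionSpec> specsById, Iterable<PartitionSpec> specsToRemove) {
    List<Integer> ids = new ArrayList<>();
    for (PartitionSpec spec : specsToRemove) {
      PartitionSpec known = specsById.get(spec.specId());
      if (known == null || !known.equals(spec)) {
        throw new IllegalArgumentException("Cannot remove unknown partition spec: " + spec);
      }
      ids.add(spec.specId());
    }
    return ids;
  }
}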

RussellSpitzer and others added 2 commits October 28, 2024 15:48
Previously there was no way to remove partition specs from a table once they were
added. To fix this we add an API which searches through all reachable manifest
files and records their spec IDs. Any spec IDs which are not found are marked for
removal, which is done through a serializable commit.
@amogh-jahagirdar (Contributor) commented Oct 31, 2024:

@advancedxy I updated the PR to your branch advancedxy#1 in case there was still agreement on adding all of this metadata cleanup as part of snapshot expiration. (Sorry, I accidentally hit the close button; I meant to hit the comment button. Just re-opened.)

Remove specs as part of expiration
@github-actions bot added the spark label Nov 2, 2024
@advancedxy changed the title from "API: Add RemoveUnusedSpecs in Table" to "API: Support removeUnusedSpecs in ExpireSnapshots" Nov 4, 2024
* reachable by any snapshot
* @return this for method chaining
*/
default ExpireSnapshots removeUnusedSpecs(boolean removeUnusedSpecs) {
Contributor:

Should we make this API a more generic removeUnusedTableMetadata? This goes back to the previous discussion: removing schemas and removing partition specs require the same level of work from the implementation, so imo there's not much value in separating them and forcing a user to chain multiple methods. Generally, if they want to remove unused specs, they probably also want to remove unused schemas.

Contributor:

Also, while I think we generally try and avoid boolean arguments in APIs, this may be one case where it makes sense. Down the line, if we want to make this behavior the default and have a path for users to disable cleanup of specs/schemas, they can.

Member:

I'm not sure we want to even allow the option to not do this. Is there a benefit to leaving a spec or schema in place if it is no longer in use?

That said, I would be fine with just having a "cleanMetadata(boolean cleanMetadata: True)"

@amogh-jahagirdar (Contributor) commented Nov 5, 2024:

@RussellSpitzer I was thinking about not exposing the API at all and doing the cleanup by default in the implementation since it's true that there's no real benefit to keeping the spec/schema in place.

The rationale for having the API is more REST + compatibility related.

A bit ago, we added the ability to send remove-spec updates to the server. If we change the snapshot expiration implementation to just always remove metadata, servers may not be able to handle the new message yet, since it was recently added as a possible update type, and services would perhaps unnecessarily fail the entire commit as part of expiration because the service would say spec removal is unsupported, even though the service could have removed the snapshots.

In the current model a client would opt in knowing that the service supports the spec removal.
There may be a different way to handle this, though, so we can keep it all implicit in the procedure.

Contributor Author:

That said, I would be fine with just having a "cleanMetadata(boolean cleanMetadata: True)"

I think this is a good candidate; or, to be more specific like cleanExpiredFiles, we could call it cleanExpiredMeta(boolean clean). WDYT? @RussellSpitzer @amogh-jahagirdar

The rationale for having the API is more REST + compatibility related.

This is well thought out. I'm in favor of exposing this as an API. As for the boolean parameter, I think it would be consistent with cleanExpiredFiles, and it would be easier to call in a fluent way when expiring files and metadata is decided by an external caller.
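
For example, the fluent call could end up looking like this (the method name is still under discussion in this thread, so removeUnusedSpecs is just the current proposal; olderThanMillis is a placeholder):

table
    .expireSnapshots()
    .expireOlderThan(olderThanMillis)
    .cleanExpiredFiles(true)
    .removeUnusedSpecs(true) // or cleanExpiredMeta(true), per the naming discussion above
    .commit();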

Contributor Author:

Updated. PTAL again @RussellSpitzer @amogh-jahagirdar

@@ -126,6 +127,9 @@ private MetadataUpdateParser() {}
// SetCurrentViewVersion
private static final String VIEW_VERSION_ID = "view-version-id";

// RemovePartitionSpecs
private static final String PARTITION_SPEC_IDS = "partition-spec-ids";
Contributor:

please add tests for this to TestMetadataUpdateParser

Contributor:

Also, this is called spec-ids in the OpenAPI definition.
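
For illustration, the serialized update would then look roughly like this (the spec-ids key matches the OpenAPI definition mentioned above; the action name and values are illustrative):

{
  "action": "remove-partition-specs",
  "spec-ids": [1, 2]
}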

@@ -173,6 +175,26 @@ private void update(MetadataUpdate.SetDefaultSortOrder unused) {
}
}

private void update(MetadataUpdate.RemovePartitionSpecs unused) {
Contributor:

please add a test to TestUpdateRequirements

file.location(),
append.manifestListLocation(),
delete.manifestListLocation());
assertThat(Iterables.getOnlyElement(table.specs().keySet()))
Contributor:

Suggested change
assertThat(Iterables.getOnlyElement(table.specs().keySet()))
assertThat(table.specs().keySet()).containsExactly(idAndDataBucketSpec.specId())

no need to use Iterables.getOnlyElement

.commit();

assertThat(deletedFiles).containsExactlyInAnyOrder(append.manifestListLocation());
assertThat(Iterables.getOnlyElement(table.specs().keySet()))
Contributor:

same as above


Table loaded = catalog.loadTable(TABLE);
assertThat(loaded.specs().values())
.hasSameElementsAs(Lists.asList(spec, current, new PartitionSpec[0]));
Contributor:

containsExactly(..)

table.updateSpec().addField(Expressions.bucket("data", 16)).commit();
table.updateSpec().removeField(Expressions.bucket("data", 16)).commit();
table.updateSpec().addField("data").commit();
assertThat(table.specs().size()).as("Should have 3 total specs").isEqualTo(3);
Contributor:

Suggested change
assertThat(table.specs().size()).as("Should have 3 total specs").isEqualTo(3);
assertThat(table.specs()).as("Should have 3 total specs").hasSize(3);
