Updating Table log retention configuration with write_deltalake silently changes nothing #2108

Closed
liamphmurphy opened this issue Jan 22, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@liamphmurphy
Contributor

liamphmurphy commented Jan 22, 2024

Environment

Delta-rs version:
python-v0.15.1

Binding:
Same as above

Environment:

  • Other: Local, tested on s3 with localstack and local filesystem

Bug

What happened: We have a use case of needing to update our log retention from the default 30 days to 7 days. We already have a set of delta tables made. When testing locally, I wanted to see if an existing delta table could have its log retention updated when the delta table is already in place.

From my testing, changing the configuration mapping passed into the write_deltalake between runs silently does nothing.

What you expected to happen:

If this is a supported use-case, then I'd expect to see the table's configuration updated when changing the configuration mapping.

If it is not a supported use-case, then I'd expect some kind of Exception.

How to reproduce it:

First, I write a Delta table like so, and then print its metadata object:

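import deltalake
from deltalake import write_deltalake

# s3_path, table, and pyarrow_schema are defined elsewhere
write_deltalake(
    s3_path,
    table,
    schema=pyarrow_schema,
    mode="append",
    partition_by=["date", "hour"],
)
delta_table = deltalake.DeltaTable(s3_path)
print(delta_table.metadata())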

As expected, it prints out a Table metadata object with an empty configuration:

Metadata(id: cf08d466-4126-403b-ae22-12649565275d, name: None, description: None, partition_columns: ['date', 'hour'], created_time: 1705964444931, configuration: {})

Then, on a second run against the same table, I pass a changed configuration and print the metadata object again:

write_deltalake(
  s3_path, 
  table,
  schema=pyarrow_schema,
  mode="append",
  partition_by=["date","hour"],
  configuration={"logRetentionDuration": "7"}
)
delta_table = deltalake.DeltaTable(s3_path) 
print(delta_table.metadata())

The write gives no indication of a failure, but the printed metadata still shows an empty configuration. It is the same object as above.
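Until the write itself reports this, the mismatch can be caught by comparing the requested configuration with what the table metadata reports after the write. The following is a minimal sketch, not part of delta-rs: `unapplied_config` and `assert_config_applied` are hypothetical helper names, and they operate on plain mappings so they make no assumption about the library beyond the `configuration` mapping shown in the metadata output above.

```python
from typing import Dict, Mapping


def unapplied_config(
    requested: Mapping[str, str], actual: Mapping[str, str]
) -> Dict[str, str]:
    """Return requested entries that are missing from, or differ in, `actual`."""
    return {key: value for key, value in requested.items() if actual.get(key) != value}


def assert_config_applied(
    requested: Mapping[str, str], actual: Mapping[str, str]
) -> None:
    """Raise instead of silently continuing when the configuration was not applied."""
    missing = unapplied_config(requested, actual)
    if missing:
        raise RuntimeError(f"table configuration was not updated: {missing}")
```

In the reproduction above, `requested` would be the mapping passed as `configuration=` and `actual` would be read from `delta_table.metadata().configuration` after the second write, so the check would raise rather than pass silently.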

@liamphmurphy added the bug label Jan 22, 2024
@ion-elgreco
Collaborator

@liamphmurphy this is expected: only the first write operation can add a configuration, while subsequent changes to the config should be made by altering the table properties. I have started working on unsetting some properties and will work on creating a set_table_properties soon.

@liamphmurphy
Contributor Author

@ion-elgreco Thanks for the quick response! So if I understand correctly, the use case will be supported but the functionality is not in place yet?

@ion-elgreco
Collaborator

Yes, but it will be a separate operation you need to trigger in the future. DeltaTable().alter.set_table_properties()
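As a rough sketch of how that separate operation might be used once it is available: the helper below only assumes that the table object exposes `alter.set_table_properties(...)` accepting a dict, as named in the comment above. The helper name `set_log_retention` is hypothetical, and the `delta.logRetentionDuration` key with an `interval N days` value follows the Delta Lake table property convention rather than anything stated in this thread.

```python
from typing import Any, Dict


def set_log_retention(delta_table: Any, days: int) -> Dict[str, str]:
    """Ask the table to set its log retention to `days` days.

    `delta_table` is expected to behave like deltalake.DeltaTable in a
    release that ships alter.set_table_properties (an assumption here).
    """
    if days <= 0:
        raise ValueError("days must be a positive integer")
    # Assumed property key and value format (Delta Lake convention).
    properties = {"delta.logRetentionDuration": f"interval {days} days"}
    delta_table.alter.set_table_properties(properties)
    return properties
```

With a real table this would be called as `set_log_retention(deltalake.DeltaTable(s3_path), 7)`, after which `metadata().configuration` should reflect the change if the installed version supports the operation.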

@ion-elgreco added the enhancement label and removed the bug label Jan 25, 2024
@ion-elgreco
Collaborator

You can track that one here: #1663

@ion-elgreco closed this as not planned Jan 25, 2024