Better Iceberg support #1448

Open
2 of 11 tasks
scsmithr opened this issue Aug 1, 2023 · 1 comment
Labels
feat 🎇 New feature or request

Comments

@scsmithr
Member

scsmithr commented Aug 1, 2023

Initial Iceberg support was added in #1382. This issue tracks further improvements to our Iceberg support.

Tasks are grouped roughly by priority. The list is non-exhaustive, and priorities are subject to change as we get feedback.

In addition to these, we'll want to look at upstreaming our changes to iceberg-rust.

High

  • Understand where we fall apart when reading the V1 format.
  • Add tests for GCS/S3
  • Add tests with an Iceberg table containing all types supported by Iceberg

Medium

  • Support handling row-level deletes
  • Support schema evolution
  • Default value handling
  • Properly provide partitions to ParquetExec (perf)

Low

  • Surface statistics to ParquetExec (perf)

Unknown

  • CREATE EXTERNAL TABLE support (catalog?)
  • Write support
  • ORC, Avro support
@scsmithr scsmithr added the feat 🎇 New feature or request label Aug 1, 2023
@vrongmeal
Contributor

I've got a v1 dataset working, but we need some older datasets to find and fix more issues. Also, I currently rely only on serde(default) to get it working; in the longer term we'd want separate structs for v1 and v2.
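
To make the "separate structs" idea concrete, here is a rough std-only sketch of what that split could look like. Everything in it is hypothetical and heavily simplified: the struct names, the three fields, and the accessor methods are illustration only, not our actual metadata types or the iceberg-rust API, and deserialization is left out entirely. It assumes, per my reading of the Iceberg spec, that table-uuid is optional in v1 and that v1 has no last-sequence-number (treated as 0 here).

```rust
/// Hypothetical, simplified v1 metadata: fields a v1 writer may omit are `Option`.
#[derive(Debug, Clone, PartialEq)]
struct TableMetadataV1 {
    location: String,
    table_uuid: Option<String>, // assumed optional in v1
}

/// Hypothetical, simplified v2 metadata: the same fields are required.
#[derive(Debug, Clone, PartialEq)]
struct TableMetadataV2 {
    location: String,
    table_uuid: String,
    last_sequence_number: i64, // assumed to exist only in v2
}

/// One enum over both versions, so callers never see per-field defaults.
#[derive(Debug, Clone, PartialEq)]
enum TableMetadata {
    V1(TableMetadataV1),
    V2(TableMetadataV2),
}

impl TableMetadata {
    fn format_version(&self) -> u8 {
        match self {
            TableMetadata::V1(_) => 1,
            TableMetadata::V2(_) => 2,
        }
    }

    fn location(&self) -> &str {
        match self {
            TableMetadata::V1(m) => &m.location,
            TableMetadata::V2(m) => &m.location,
        }
    }

    /// `None` only for v1 tables written without a UUID.
    fn table_uuid(&self) -> Option<&str> {
        match self {
            TableMetadata::V1(m) => m.table_uuid.as_deref(),
            TableMetadata::V2(m) => Some(m.table_uuid.as_str()),
        }
    }

    /// The v1 fallback lives in one place instead of in a field-level default.
    fn last_sequence_number(&self) -> i64 {
        match self {
            TableMetadata::V1(_) => 0,
            TableMetadata::V2(m) => m.last_sequence_number,
        }
    }
}

fn main() {
    // Made-up example values, not taken from a real table.
    let tables = vec![
        TableMetadata::V1(TableMetadataV1 {
            location: "s3://bucket/old_table".to_string(),
            table_uuid: None,
        }),
        TableMetadata::V2(TableMetadataV2 {
            location: "s3://bucket/new_table".to_string(),
            table_uuid: "example-uuid".to_string(),
            last_sequence_number: 7,
        }),
    ];
    for t in &tables {
        println!(
            "v{} at {}: uuid={:?}, last_seq={}",
            t.format_version(),
            t.location(),
            t.table_uuid(),
            t.last_sequence_number()
        );
    }
}
```

The point of the split is that a v2 file missing a required field would fail to parse instead of silently picking up a default, while the v1 leniency stays confined to the v1 struct.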

2 participants