Docs: Union type support spec #4664

Closed
funcheetah wants to merge 3 commits into apache:master from
funcheetah:non-optional-union-type-spec

@funcheetah

Summary

Apache Iceberg does not support non-optional union types (e.g. ["int", "string"]). This PR adds a spec for supporting non-optional union types.

Representation

A non-optional union type is converted to its original type when it is a single-type union, or to a struct when it is a complex union.

The struct representations converted from non-optional union types are consistent with non-optional union support added in Trino in trinodb/trino#3483.

Deeply nested non-optional union types are supported.
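
As a rough illustration of the representation rules, here is a minimal Python sketch. It is not the implementation from this PR. The function name `union_to_type` is hypothetical, schemas are written as simplified Avro-style JSON values (a list is a union, a string is a primitive), and array nesting is included only to show the recursion. Optional unions (those with a "null" branch) are outside the scope of this PR and are rejected here.

```python
def union_to_type(schema):
    """Map a simplified Avro-style schema to a type string (illustrative only)."""
    if isinstance(schema, str):
        if schema == "null":
            # Assumption: optional unions are handled elsewhere, not by this sketch.
            raise ValueError("optional unions are out of scope for this sketch")
        return schema  # primitive type name, returned unchanged
    if isinstance(schema, dict) and schema.get("type") == "array":
        # Recurse into the element type so nested unions are converted too.
        return f"list<{union_to_type(schema['items'])}>"
    if isinstance(schema, list):
        if not schema:
            raise ValueError("empty union")
        if len(schema) == 1:
            # Single-type union: use the original type.
            return union_to_type(schema[0])
        # Complex union: a struct with a tag plus one field per branch.
        fields = ["tag int"]
        fields += [f"field{i} {union_to_type(b)}" for i, b in enumerate(schema)]
        return "struct<" + ", ".join(fields) + ">"
    raise ValueError(f"unsupported schema: {schema!r}")


print(union_to_type(["int", "string"]))
# struct<tag int, field0 int, field1 string>
print(union_to_type(["int", {"type": "array", "items": ["long", "string"]}]))
# struct<tag int, field0 int, field1 list<struct<tag int, field0 long, field1 string>>>
```

The recursion means a union nested inside another union's branch is converted by the same rule, which is one way to read "deeply nested" support.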

Examples

Basic

["int", "string"] -> struct<tag int, field0 int, field1 string>
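
To make the struct form more concrete, the sketch below shows how a single union value might populate such a struct. The PR description does not spell out the semantics of `tag`; this example assumes it holds the zero-based index of the active branch and that all other fields are null. The helper `union_value_to_struct` and the `PYTHON_TYPES` lookup are hypothetical and cover only the two types in the example.

```python
# Hypothetical mapping from branch type names to Python types (example types only).
PYTHON_TYPES = {"int": int, "string": str}


def union_value_to_struct(branch_types, value):
    """Return a dict shaped like struct<tag, field0, field1, ...> for one value."""
    # Start with every branch field unset.
    row = {"tag": None}
    for i in range(len(branch_types)):
        row[f"field{i}"] = None
    # Assumption: tag is the index of the first branch whose type matches the value.
    for i, name in enumerate(branch_types):
        if type(value) is PYTHON_TYPES.get(name):
            row["tag"] = i
            row[f"field{i}"] = value
            return row
    raise ValueError(f"value {value!r} matches no branch in {branch_types}")


print(union_value_to_struct(["int", "string"], 7))
# {'tag': 0, 'field0': 7, 'field1': None}
print(union_value_to_struct(["int", "string"], "abc"))
# {'tag': 1, 'field0': None, 'field1': 'abc'}
```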

Single type

["int"] -> int

Implementation PRs

TODO

  • Handle a single-type union (e.g. ["int"]) as its original type
  • Support schema pruning within a complex union
  • Support non-Spark environments (e.g. iceberg-data, Flink, Hive)

@github-actions github-actions bot added the docs label Apr 28, 2022
@rdblue
Contributor

rdblue commented Apr 29, 2022

@funcheetah, these changes cannot go into the spec. Iceberg does not allow union data.

It is fine for Iceberg implementations to support reading unions as structs for backward compatibility, but that is optional behavior and should not be required.

@rdblue rdblue closed this Apr 29, 2022