add option types #6881

Merged: mccanne merged 1 commit into main from option-type on May 1, 2026

@mccanne (Collaborator) commented May 1, 2026

This commit replaces the previous optional field design with legit option types. This model follows the Rust pattern where optional values must be unwrapped and dealt with when encountered. Option types are implemented as a union type that contains type none. A "some" value is a non-none value coded in an option union, and a "none" value is a value of type none coded in an option value. None values should never be serialized by themselves, but the formats support such serialization for debugging etc.
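As a rough illustration of the some/none model described above, here is a minimal Go sketch. The `Option`, `Some`, `None`, `Unwrap`, and `UnwrapOr` names are hypothetical and are not taken from the super codebase, which codes option values as unions rather than as a generic struct; the point is only to show how a caller is forced to deal with the none case before using the value.

```go
package main

import (
	"errors"
	"fmt"
)

// Option is a hypothetical stand-in for a value of type "T|none".
// It is NOT the representation used in the actual codebase.
type Option[T any] struct {
	val  T
	some bool // false means the option holds the none member
}

// Some wraps a concrete value in the option.
func Some[T any](v T) Option[T] { return Option[T]{val: v, some: true} }

// None returns the option's none member.
func None[T any]() Option[T] { return Option[T]{} }

// ErrNone is returned when a none value is unwrapped.
var ErrNone = errors.New("option is none")

// Unwrap makes the caller handle the none case explicitly.
func (o Option[T]) Unwrap() (T, error) {
	if !o.some {
		var zero T
		return zero, ErrNone
	}
	return o.val, nil
}

// UnwrapOr returns the wrapped value, or def when the option is none.
func (o Option[T]) UnwrapOr(def T) T {
	if !o.some {
		return def
	}
	return o.val
}

func main() {
	for _, o := range []Option[int64]{Some[int64](42), None[int64]()} {
		if v, err := o.Unwrap(); err != nil {
			fmt.Println("none:", err)
		} else {
			fmt.Println("some:", v)
		}
	}
}
```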

When we add strict type checking, option values that aren't unwrapped will cause query compilation to fail with a type check error.

The BSUP and CSUP formats have changed to support this and their version numbers have been bumped.

The SUP formatter formats record fields with optional types with a trailing "?", and for types of the form "T|none" the union syntax is dropped (even though a union is implied). If we so decide, we can implement this for larger option unions and also for fusions that embed an option supertype. For now, we will deal with the noisier syntax and see how it feels.
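The formatting rule can be sketched roughly as follows. This is not the SUP formatter: `Field` and `formatField` are hypothetical, types are modeled as plain strings, and placing the "?" directly after the field name is an assumption, since the description only says it is trailing. Only the exact "T|none" shape gets the short form; larger unions keep the full syntax, matching the "for now" behavior described above.

```go
package main

import (
	"fmt"
	"strings"
)

// Field is a hypothetical record field whose type is a list of union
// members. A single-element list is a plain (non-union) type.
type Field struct {
	Name    string
	Members []string
}

// formatField renders one field. For exactly "T|none" it drops the union
// syntax and marks the field with "?"; anything else is printed in full.
// Putting "?" after the field name is an assumption of this sketch.
func formatField(f Field) string {
	if len(f.Members) == 2 {
		for i, m := range f.Members {
			if m == "none" && f.Members[1-i] != "none" {
				return fmt.Sprintf("%s?:%s", f.Name, f.Members[1-i])
			}
		}
	}
	return fmt.Sprintf("%s:%s", f.Name, strings.Join(f.Members, "|"))
}

func main() {
	fmt.Println(formatField(Field{"a", []string{"int64", "none"}}))
	fmt.Println(formatField(Field{"b", []string{"int64", "string", "none"}}))
	fmt.Println(formatField(Field{"c", []string{"string"}}))
}
```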

The vector.Option and CSUP/vcache counterparts are not finished and instead this commit codes option values as full-blown unions. In a subsequent PR, we will add support for run-length encoding of option values, which is needed for decent performance by fjson etc when fusion creates wide records. We left behind some comments and stubs that will be utilized in the vector.Option PR.
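To make the motivation for run-length encoding concrete, here is a generic sketch of RLE over per-slot some/none flags. It is not the planned vector.Option design, which this PR leaves as stubs; `Run`, `encodeRuns`, and `decodeRuns` are hypothetical names. The idea is that a wide fused record with long stretches of none values collapses into a handful of runs instead of one union tag per slot.

```go
package main

import "fmt"

// Run is one run of identical presence flags (hypothetical type).
type Run struct {
	Some  bool   // true for "some" slots, false for "none" slots
	Count uint32 // number of consecutive slots in the run
}

// encodeRuns collapses per-slot presence flags into runs.
func encodeRuns(present []bool) []Run {
	var runs []Run
	for _, p := range present {
		// Extend the last run when the flag is unchanged.
		if n := len(runs); n > 0 && runs[n-1].Some == p {
			runs[n-1].Count++
			continue
		}
		runs = append(runs, Run{Some: p, Count: 1})
	}
	return runs
}

// decodeRuns expands runs back into per-slot presence flags.
func decodeRuns(runs []Run) []bool {
	var present []bool
	for _, r := range runs {
		for i := uint32(0); i < r.Count; i++ {
			present = append(present, r.Some)
		}
	}
	return present
}

func main() {
	present := []bool{false, false, false, true, true, false}
	runs := encodeRuns(present)
	fmt.Println(len(present), "slots encoded as", len(runs), "runs")
	fmt.Println(decodeRuns(runs))
}
```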

Now that we have option types, we will reconsider how error("missing") and error("quiet") and related support functionality work in a forthcoming PR.

This change breaks the Parquet writer that used to work for optional fields since it can't handle the none type. In a subsequent PR, we will fix this by treating none as null in Parquet.

Docs updates for option types and the format changes will be merged in a future PR.

Co-authored-by: Noah Treuhaft <noah.treuhaft@gmail.com>
  errs := c.popErrs()
  if !valid {
-     c.keepErrs(errs)
+     c.keepErrs(errs[:1])
Member

What's the reasoning behind this change?

Collaborator Author

Duplicate errors over multiple union types. Best to just report one. The checker needs another overall scrubbing with a bit of refactoring to make this all a bit easier.

Comment thread vector/const.go
Comment on lines +98 to +99
case id == super.IDNone:
return &Const{NewNone(length), length}
Collaborator

I'm sure you did this to get tests passing, but I don't think we should be doing this.

Collaborator Author

I agree and didn't want to put that there but I don't know that it's quite factored the right way. We can fix it later.

@mccanne mccanne merged commit 37eebac into main May 1, 2026
4 checks passed
@mccanne mccanne deleted the option-type branch May 1, 2026 21:41
3 participants