This Epic is for tracking the formalization and soundness of the Vortex type system.
There are some rough corners in our current implementation of the Vortex type system that we would ideally like to resolve. This issue is a place to link all of these threads.
Status
Proposed.
It is not clear if we actually need to make changes. But if we do decide to move forward with anything, it will likely cause large refactors and breakage.
Goal / Motivation / Issues
We want the Vortex type system to be both sound and well-implemented. People building systems with Vortex in the future should never run into some of the issues we ourselves have run into. They should be able to fearlessly use our built-in types and add their own extension types without worrying about performance.
Unresolved questions
We have run into several issues (mostly performance-related) in how we have implemented Vortex types. Here are a few of them:
- Although `List` and `ListView` are logically equivalent, they have vastly different performance characteristics and semantics. Choosing to change the canonical list type from `List` to `ListView` (see Tracking Issue: Canonicalize `ListView` over `List` #4699) caused many performance issues that we still have not fully figured out.
- `Primitive` and `Decimal` arrays share an essentially identical physical encoding, but we split them into 2 different arrays.
- `Binary` and `Utf8` use `VarBinView` as their canonical encoding, even though they have different semantics and compression characteristics. This is the opposite problem to `Primitive` vs `Decimal` canonical arrays.
- Should `FixedSizeBinary` be a new logical type? Or should we just have a `Flat` encoding that allows for any aligned fixed-size binary values that would encode `Primitive`, `Decimal`, and `FixedSizeBinary`? Note that this is also similar to `FixedSizeList<u8>[size]`.
- `DecimalArray` Logical and Physical type mismatch #5820

TODO ^ these things should be new tracking issues or discussions.
The issues above seem to imply that some adjustments are needed to ensure that we don't keep running into problems like these as we add more type logic in Vortex (with extension types, with new canonical types like union, etc). What those adjustments are is unclear.
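To make the `List` vs `ListView` point concrete, here is a minimal sketch of why two logically equivalent list types can behave so differently. The structs and methods below (`ListLayout`, `ListViewLayout`, `to_list`, `example`) are hypothetical simplifications, not the Vortex array types or their APIs. They assume only the usual distinction between the two layouts: a list layout stores `len + 1` monotonic offsets, while a list-view layout stores an independent offset and size per list.

```rust
// Hypothetical, simplified layouts for illustration only.
// These are NOT the Vortex array types or their APIs.

/// List-style layout: `offsets` has `len + 1` entries and list `i` spans
/// `elements[offsets[i]..offsets[i + 1]]`, so lists are contiguous and ordered.
struct ListLayout {
    elements: Vec<i32>,
    offsets: Vec<usize>,
}

/// ListView-style layout: every list has its own offset and size, so lists
/// may overlap, appear out of order, or leave gaps in `elements`.
struct ListViewLayout {
    elements: Vec<i32>,
    offsets: Vec<usize>,
    sizes: Vec<usize>,
}

impl ListLayout {
    fn len(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }

    fn get(&self, i: usize) -> &[i32] {
        &self.elements[self.offsets[i]..self.offsets[i + 1]]
    }
}

impl ListViewLayout {
    fn len(&self) -> usize {
        self.offsets.len()
    }

    fn get(&self, i: usize) -> &[i32] {
        let start = self.offsets[i];
        &self.elements[start..start + self.sizes[i]]
    }

    /// Converting to the list layout copies every element, because views
    /// carry no guarantee of being contiguous or in order.
    fn to_list(&self) -> ListLayout {
        let mut elements = Vec::new();
        let mut offsets = vec![0];
        for i in 0..self.len() {
            elements.extend_from_slice(self.get(i));
            offsets.push(elements.len());
        }
        ListLayout { elements, offsets }
    }
}

/// Both values represent the same logical data: [[1, 2], [], [1, 2, 3]].
fn example() -> (ListLayout, ListViewLayout) {
    let list = ListLayout {
        elements: vec![1, 2, 1, 2, 3],
        offsets: vec![0, 2, 2, 5],
    };
    // The view stores the shared prefix [1, 2] only once.
    let view = ListViewLayout {
        elements: vec![1, 2, 3],
        offsets: vec![0, 0, 0],
        sizes: vec![2, 0, 3],
    };
    (list, view)
}
```

In this sketch both layouts return the same slices from `get`, but only the list layout guarantees that elements are stored contiguously in list order. Any code path that relies on that guarantee has to either handle arbitrary views or pay for a copy like `to_list`, which is one plausible source of the kind of performance differences described above.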