This is a tracking issue for adding TurboQuant quantization as a feature in Vortex. Notably, it will not be a new physical encoding for the `Vector` type, but rather a new logical type.
## Motivation
We would like to support the lossy compression (quantization) of data in Vortex. However, it was not entirely clear how we would achieve this in a file format, where many of our invariants rely on the assumption that our physical encodings are all lossless.
After some internal discussions, we have realized that "lossy" data and compression MUST live at the logical layer, not at the physical layer. It is logical because losing data is a full modification of the data, not just a different way of storing it.
For TurboQuant in particular, this means that it cannot be an encoding of the `Vector` type. Too many assumptions break if it were an encoding: unit-normalization breaks for quantized vectors, and scalar functions and canonicalization no longer return the same results as they would on the unquantized vectors.
However, this does not mean we want to remove TQ entirely. There is value in building tooling for users who want to purposefully write TQ-quantized vectors into Vortex files and read them back, knowing that some of the information in the original data has been lost.
Users should be able to write whatever data they want into Vortex, structured however they want. But if they want to use a lossy compression / quantization scheme, they need to transform the data themselves before writing it to Vortex, and they additionally need to make sure that the default Vortex compressor does not recompress the data that they have specially modified.
## Design
It is unclear if this is the ideal way to generally design lossy schemes, but it is certainly the easiest in terms of moving forward.
TurboQuant will live in a new `vortex-turboquant` crate outside of the top-level `vortex` dependency tree, mimicking a third-party crate. It will carry a new `TurboQuant` extension type over a struct array holding all of the components needed to quantize vector data: the norms of the vectors and the codes (indices) into the centroid book (values). The centroids themselves can be constructed at read time (cached on the number of dimensions and the bit width).
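To make the storage layout concrete, here is a minimal sketch in plain Rust (deliberately not using the Vortex API). It assumes, purely for illustration, a scalar quantizer whose centroid book is a uniform grid over `[-1.0, 1.0]` determined entirely by the bit width; the actual TurboQuant centroid construction may differ.

```rust
/// Hypothetical storage for one column of TQ-quantized vectors,
/// mirroring the struct-array layout described above: per-vector
/// norms plus per-component codes. Centroids are NOT stored.
struct TurboQuantArray {
    dims: usize,      // vector dimensionality
    bits: u32,        // code bit width
    norms: Vec<f32>,  // one L2 norm per vector
    codes: Vec<u8>,   // `dims` codes per vector (assumes bits <= 8 here)
}

/// Rebuild the centroid book at read time: a uniform grid of
/// 2^bits points over [-1.0, 1.0]. In practice this would be
/// cached, keyed on (dims, bits).
fn centroid_book(bits: u32) -> Vec<f32> {
    let n = 1usize << bits;
    (0..n)
        .map(|i| -1.0 + 2.0 * (i as f32) / ((n - 1) as f32))
        .collect()
}

fn quantize(vectors: &[Vec<f32>], bits: u32) -> TurboQuantArray {
    let dims = vectors[0].len();
    let book = centroid_book(bits);
    let mut norms = Vec::new();
    let mut codes = Vec::new();
    for v in vectors {
        let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
        norms.push(norm);
        for &x in v {
            // Unit-normalize, then pick the nearest centroid index.
            let u = if norm > 0.0 { x / norm } else { 0.0 };
            let code = book
                .iter()
                .enumerate()
                .min_by(|a, b| {
                    (a.1 - u).abs().partial_cmp(&(b.1 - u).abs()).unwrap()
                })
                .map(|(i, _)| i as u8)
                .unwrap();
            codes.push(code);
        }
    }
    TurboQuantArray { dims, bits, norms, codes }
}
```

Note that only `norms` and `codes` would be persisted in the struct array; the centroid book is derivable from metadata, which is what lets the file stay small.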
Creating a new extension type has the benefit that we do not need to customize the behavior of the compressor: the default compressor will simply compress the inner storage `StructArray` and not canonicalize into a `Vector` array.
But on the flip side, this means that we can no longer canonicalize into a `Vector` array, and we have to reimplement all of the scalar functions on vectors (inner product, cosine similarity, etc.) specifically for TQ.
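As an illustration of what such a reimplementation could look like, here is a hedged sketch of an inner product computed directly on the quantized representation, decoding codes on the fly instead of materializing full vectors first. The uniform centroid grid and the function names are assumptions for the example, not the actual Vortex or TurboQuant API.

```rust
/// Hypothetical decode of a single code back to unit space: a
/// uniform grid of 2^bits centroids over [-1.0, 1.0]. The real
/// TurboQuant centroid book may be constructed differently.
fn decode(code: u8, bits: u32) -> f32 {
    let n = (1usize << bits) as f32;
    -1.0 + 2.0 * (code as f32) / (n - 1.0)
}

/// Approximate inner product of two quantized vectors, given their
/// stored norms and per-component codes. Since quantization
/// normalized each vector, the dot product of the decoded unit
/// vectors is rescaled by both norms.
fn tq_inner_product(
    norm_a: f32,
    codes_a: &[u8],
    norm_b: f32,
    codes_b: &[u8],
    bits: u32,
) -> f32 {
    let dot: f32 = codes_a
        .iter()
        .zip(codes_b)
        .map(|(&a, &b)| decode(a, bits) * decode(b, bits))
        .sum();
    norm_a * norm_b * dot
}
```

The key design point this illustrates: the result is an approximation of the true inner product, which is exactly why this computation must not masquerade as a lossless `Vector` kernel.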
We can still mimic canonicalization by having an `unpack` method that takes a `TurboQuant` extension array and converts it into a `Vector` array. This is actually ideal: we do not want to introduce the idea of computing on "lossy" data into the canonicalization system in Vortex, but we still have this functionality available if we don't want to reimplement scalar function logic on `TurboQuant` arrays.
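A minimal sketch of what `unpack` could do, again in plain Rust with the same assumed uniform centroid grid (the real `unpack` signature and centroid scheme are assumptions here): decode each code back to unit space and undo the normalization with the stored norm. The reconstruction is lossy by design — the output only approximates the original vectors.

```rust
/// Hypothetical `unpack`: reconstruct full vectors from stored
/// norms and codes. `codes` holds `dims` codes per vector, in
/// row-major order, matching the sketched storage layout.
fn unpack(norms: &[f32], codes: &[u8], dims: usize, bits: u32) -> Vec<Vec<f32>> {
    let n = (1usize << bits) as f32;
    norms
        .iter()
        .zip(codes.chunks(dims))
        .map(|(&norm, chunk)| {
            chunk
                .iter()
                // Decode to a unit-space centroid, then rescale by
                // the vector's original L2 norm.
                .map(|&c| norm * (-1.0 + 2.0 * (c as f32) / (n - 1.0)))
                .collect()
        })
        .collect()
}
```

In the real design this would return a canonical `Vector` array rather than `Vec<Vec<f32>>`, so downstream scalar functions work unchanged on the (approximated) data.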
## Steps
## Unresolved questions
TODO
## Implementation history
A lot carried over from #7297
- `Vector` extension type: Vector Extension Type #6964
- `cosine_similarity`: Vortex Fixed-Shape Tensor #6812
- `l2_norm`: Vector Extension Type #6964
- `l2_denorm`: L2 Denorm expression #7329
- `inner_product`: TurboQuant encoding for Vectors #7269
- `sorf` or some "make random" reversible expression: Pull out `L2Denorm` from TurboQuant #7349
- `TurboQuant` metadata to be protobuf #7301
- `L2Denorm(norms, Sorf(matrix, Dict(centroids, codes)))`: Pull out `L2Denorm` from TurboQuant #7349
- `Constant` children #7394
- `InnerProduct` optimizations #7396
- `vector-search-bench` benchmarking crate #7458
- `vortex-tensor` #7525
- `vortex-tensor` even more #7610