Vortex Fixed-Shape Tensor#6812

Merged
connortsui20 merged 14 commits into develop from ct/tensor-experiment
Mar 6, 2026

@connortsui20 connortsui20 commented Mar 5, 2026

Summary

Adds an experimental fixed-shape tensor extension type in a new vortex-tensor crate.

See https://vortex-data.github.io/rfcs/rfc/0024.html for info about the design of this tensor type.

Additionally adds a CosineSimilarity expression that takes 2 tensor arrays and computes the cosine similarity of tensors in the arrays (resulting in a PrimitiveArray).
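
As a rough illustration of the computation (not the code in this PR), the sketch below computes one cosine similarity per pair of tensors. It assumes each tensor array is stored as a flat `f32` buffer with a fixed number of elements per tensor; the function name `cosine_similarity` and the parameter `elems_per_tensor` are hypothetical and are not part of the vortex-tensor API.

```rust
/// Hypothetical helper: cosine similarity between corresponding tensors in
/// two flat buffers. Each buffer is assumed to hold tensors of
/// `elems_per_tensor` values laid out back to back.
fn cosine_similarity(lhs: &[f32], rhs: &[f32], elems_per_tensor: usize) -> Vec<f32> {
    assert!(elems_per_tensor > 0, "tensors must have at least one element");
    assert_eq!(lhs.len(), rhs.len(), "both arrays must hold the same number of values");
    assert_eq!(lhs.len() % elems_per_tensor, 0, "buffer length must be a multiple of the tensor size");

    lhs.chunks_exact(elems_per_tensor)
        .zip(rhs.chunks_exact(elems_per_tensor))
        .map(|(a, b)| {
            // Treat each tensor as a flattened vector: dot(a, b) / (|a| * |b|).
            let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
            let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
            let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
            // An all-zero tensor yields NaN here; how the PR treats that case
            // is not stated on this page.
            dot / (norm_a * norm_b)
        })
        .collect()
}

fn main() {
    // Two arrays, each holding two 2x2 tensors (4 elements per tensor).
    let lhs = [1.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0];
    let rhs = [0.0, 1.0, 0.0, 0.0, 2.0, 4.0, 6.0, 8.0];
    // The first pair is orthogonal; the second pair is parallel.
    let sims = cosine_similarity(&lhs, &rhs, 4);
    println!("{sims:?}");
}
```

The result has one value per tensor pair, which matches the description of the expression returning a primitive array.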

Testing

Adds some very basic tests for cosine similarity and tensor metadata operations.

Future Work

I think this was a good way to check that our ExtVTable is not completely wrong. At the same time, it tells us nothing about what we might want to add to the ExtVTable for extension arrays, because as long as the storage DType is correct, any storage array is valid.

The more interesting expressions have not been implemented here. Those would include:

  • Cast
  • Index / Slice (and lazily get back another tensor array, potentially non-contiguous)
  • Maybe others?

Additional work includes exporting to Arrow, NumPy, and PyTorch. Arrow will require a cheap translation from logical to physical shape, but other than that those conversions should be easy.
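
To give a sense of why the logical-to-physical translation should be cheap, here is a minimal sketch. It assumes the convention used by Arrow's fixed-shape tensor extension type, where the stored shape is the physical one and the logical shape is obtained by indexing it with a permutation. Whether the Vortex type stores a logical shape plus a permutation in this form is an assumption; the function names are hypothetical.

```rust
/// Logical shape from a physical shape and a permutation, assuming the
/// Arrow convention: logical[i] = physical[permutation[i]].
fn logical_shape(physical: &[usize], permutation: &[usize]) -> Vec<usize> {
    assert_eq!(physical.len(), permutation.len());
    permutation.iter().map(|&p| physical[p]).collect()
}

/// Inverse mapping under the same assumption:
/// physical[permutation[i]] = logical[i].
fn physical_shape(logical: &[usize], permutation: &[usize]) -> Vec<usize> {
    assert_eq!(logical.len(), permutation.len());
    let mut physical = vec![0; logical.len()];
    for (i, &p) in permutation.iter().enumerate() {
        physical[p] = logical[i];
    }
    physical
}

fn main() {
    let physical = [10, 20, 30];
    let permutation = [2, 0, 1];

    let logical = logical_shape(&physical, &permutation);
    println!("logical = {logical:?}");

    // Converting back recovers the physical shape.
    let round_trip = physical_shape(&logical, &permutation);
    println!("physical = {round_trip:?}");
}
```

Both directions are a single pass over the dimensions and touch only metadata, never the tensor data itself.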

@connortsui20 connortsui20 requested a review from gatesn March 5, 2026 20:49
@connortsui20 connortsui20 added the changelog/feature A new feature label Mar 5, 2026

@gatesn gatesn left a comment

Looks great! Maybe move cosine similarity into a scalar_fns module since we're going to add more.

You should also get public-api to work somehow?

@connortsui20 connortsui20 marked this pull request as ready for review March 5, 2026 22:30
@connortsui20 connortsui20 requested review from danking and gatesn March 5, 2026 22:30
@connortsui20 connortsui20 changed the title from "Experiment: Vortex Fixed-Shape Tensor" to "Vortex Fixed-Shape Tensor" Mar 5, 2026
@connortsui20 connortsui20 requested a review from AdamGS March 5, 2026 22:31
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
@connortsui20 connortsui20 force-pushed the ct/tensor-experiment branch from 189c70d to 3da1787 Compare March 6, 2026 18:19
@connortsui20 connortsui20 enabled auto-merge (squash) March 6, 2026 18:19
@connortsui20 connortsui20 merged commit 4cbfb33 into develop Mar 6, 2026
49 checks passed
@connortsui20 connortsui20 deleted the ct/tensor-experiment branch March 6, 2026 18:26

codspeed-hq bot commented Mar 6, 2026

Merging this PR will improve performance by 23.63%

⚡ 3 improved benchmarks
✅ 977 untouched benchmarks
🆕 20 new benchmarks
⏩ 1466 skipped benchmarks [1]

Performance Changes

|    | Mode | Benchmark | BASE | HEAD | Efficiency |
|----|------|-----------|------|------|------------|
|    | Simulation | take_map[(0.1, 0.5)] | 2.6 ms | 2.1 ms | +23.63% |
|    | Simulation | take_map[(0.1, 0.1)] | 1,007.5 µs | 908.9 µs | +10.84% |
|    | Simulation | take_map[(0.1, 1.0)] | 4.2 ms | 3.5 ms | +20.81% |
| 🆕 | Simulation | sequence_compress_u32 | N/A | 5.2 ms | N/A |
| 🆕 | Simulation | sequence_decompress_u32 | N/A | 4.1 ms | N/A |
| 🆕 | Simulation | decompress_utf8[(1000, 16)] | N/A | 31 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(1000, 256)] | N/A | 28.7 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(10000, 16)] | N/A | 136 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(10000, 1024)] | N/A | 112.6 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(100000, 16)] | N/A | 1.2 ms | N/A |
| 🆕 | Simulation | decompress_utf8[(10000, 256)] | N/A | 113.3 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(1000, 4)] | N/A | 42.8 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(10000, 4)] | N/A | 209.9 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(100000, 1024)] | N/A | 946.9 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(100000, 4096)] | N/A | 944.7 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(1000000, 256)] | N/A | 9.4 ms | N/A |
| 🆕 | Simulation | decompress_utf8[(100000, 256)] | N/A | 958.3 µs | N/A |
| 🆕 | Simulation | decompress_utf8[(1000000, 1024)] | N/A | 9.3 ms | N/A |
| 🆕 | Simulation | decompress_utf8[(1000000, 4)] | N/A | 19 ms | N/A |
| 🆕 | Simulation | decompress_utf8[(100000, 4)] | N/A | 1.9 ms | N/A |
| ... | ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.


Comparing ct/tensor-experiment (3da1787) with develop (5d6a3c8) [2]

Footnotes

  1. 1466 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on develop (8eee959) during the generation of this report, so 5d6a3c8 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
