@tomsanbear tomsanbear commented Dec 15, 2025

Summary

Introduces compound B-tree scalar indices to Lance, enabling efficient lookups on multi-column predicates using a single index structure instead of intersecting multiple single-column indices at query time.

Motivation

Many workloads filter on multiple columns simultaneously:

  • Multi-tenant + time-series: WHERE tenant_id = 'acme' AND timestamp > T
  • Status + time: WHERE status = 'active' AND created_at BETWEEN X AND Y
  • Hierarchical: WHERE region = 'us-west' AND department = 'engineering'

Compound indices store rows sorted by the combined key, following leftmost prefix semantics.
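
As a rough illustration of what "sorted by the combined key" buys, here is a minimal in-memory sketch that uses a standard-library `BTreeMap` keyed by a `(tenant_id, timestamp)` tuple. It is not the PR's `CompoundBTreeIndex`; the names `build_index` and `tenant_after` are hypothetical, and the sketch assumes each key maps to a single row id.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Included};

/// Hypothetical stand-in for a compound index on (tenant_id, timestamp).
/// Tuple keys compare tenant first, then timestamp, so all rows of one
/// tenant are contiguous and ordered by time.
/// Assumption: keys are unique; a duplicate key would overwrite the earlier row id.
fn build_index(rows: &[(&'static str, i64)]) -> BTreeMap<(&'static str, i64), u64> {
    rows.iter()
        .enumerate()
        .map(|(row_id, &(tenant, ts))| ((tenant, ts), row_id as u64))
        .collect()
}

/// Answers `WHERE tenant_id = tenant AND timestamp > after` with one
/// contiguous range scan over the combined key.
fn tenant_after(
    index: &BTreeMap<(&'static str, i64), u64>,
    tenant: &'static str,
    after: i64,
) -> Vec<u64> {
    index
        .range((Excluded((tenant, after)), Included((tenant, i64::MAX))))
        .map(|(_, &row_id)| row_id)
        .collect()
}
```

Because the scan starts and ends inside one tenant's key range, no second lookup or intersection step is needed for this query shape.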

Query Patterns Supported

  • Full key equality: col1 = X AND col2 = Y AND col3 = Z
  • Prefix lookup: col1 = X or col1 = X AND col2 = Y
  • Prefix + range: col1 = X AND col2 > Y
  • IN-list: col1 IN (...) or col1 = X AND col2 IN (...)
  • IS NULL: col1 = X AND col2 IS NULL
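
The patterns above all follow the leftmost prefix rule. The sketch below shows one way to decide how many leading index columns a query can use. It is a simplified model, not the PR's `CompoundQueryParser`: the `Pred` enum and `usable_prefix_len` are hypothetical names, and it assumes (conservatively, matching the listed patterns) that only equality lets the prefix continue, while a range, IN-list, or IS NULL ends it.

```rust
/// Hypothetical predicate shapes, loosely mirroring the listed query patterns.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Pred {
    Eq,     // col = X
    Range,  // col > Y, col BETWEEN A AND B, ...
    InList, // col IN (...)
    IsNull, // col IS NULL
}

/// Returns how many leading index columns the query can use.
/// `preds[i]` is the predicate on the i-th index column, or `None`
/// when the query does not constrain that column.
fn usable_prefix_len(preds: &[Option<Pred>]) -> usize {
    let mut used = 0;
    for pred in preds {
        match pred {
            // Equality pins the column, so the next column can still be used.
            Some(Pred::Eq) => used += 1,
            // Assumption: any other predicate is usable but ends the prefix.
            Some(_) => {
                used += 1;
                break;
            }
            // A gap in the prefix stops the match (no skip scan).
            None => break,
        }
    }
    used
}
```

Under this model a query that constrains only the second column yields a prefix length of 0 and cannot use the index, while `col1 = X AND col2 > Y` on a three-column index yields 2.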

API

dataset
    .create_index(&["tenant_id", "timestamp"], IndexType::Scalar)
    .with_index_name("tenant_time_idx")
    .execute()
    .await?;

Benchmark Results

Benchmarks run on multi-tenant time-series data with queries like WHERE tenant_id = X AND timestamp > Y:

Single fragment dataset:

| Scenario | No Index | BTree (tenant only) | Dual BTree | Compound | Compound vs BTree | Compound vs Dual |
|---|---|---|---|---|---|---|
| Tenant only | 9.31ms | 456µs | 459µs | 1.13ms | 2.47x slower | 2.45x slower |
| Tenant + narrow range | 10.62ms | 476µs | 377µs | 670µs | 1.41x slower | 1.78x slower |
| Tenant + wide range | 9.38ms | 514µs | 1.67ms | 959µs | 1.87x slower | 1.74x faster |
| Tenant + full range | 8.92ms | 512µs | 3.09ms | 1.15ms | 2.25x slower | 2.68x faster |
| Timestamp only | 4.43ms | 4.43ms | 3.47ms | 4.22ms | 1.05x faster | 1.22x slower |

Multi-fragment dataset (more realistic production scenario):

| Scenario | No Index | BTree (tenant only) | Dual BTree | Compound | Compound vs BTree | Compound vs Dual |
|---|---|---|---|---|---|---|
| Tenant only | 8.55ms | 2.85ms | 2.70ms | 2.92ms | 1.02x slower | 1.08x slower |
| Tenant + narrow range | 8.23ms | 2.08ms | 472µs | 693µs | 3.00x faster | 1.47x slower |
| Tenant + medium range | 7.43ms | 2.31ms | 2.77ms | 2.07ms | 1.11x faster | 1.34x faster |
| Tenant + wide range | 7.69ms | 1.89ms | 767µs | 818µs | 2.32x faster | 1.07x slower |

Production Experience

We've been running this implementation in our product with beta customers and are seeing stable, positive results. The index has been exercised against real multi-tenant time-series workloads.

A Note on the Implementation

I understand this is a large change. Many decisions, particularly around introducing CompoundSargableQuery as a separate type and the associated changes for managing index creation, were made pragmatically to ease the initial implementation and reduce the effort of maintaining this fork until we could upstream it.

Limitations

  • 2-8 columns per index: this bound was chosen arbitrarily; I was unsure where this configuration should live or whether it should be unbounded
  • OR conditions not directly supported
  • LIKE/pattern matching not supported
  • No support for PostgreSQL-style "skip scans"; the leftmost prefix must match

@github-actions github-actions bot added the enhancement New feature or request label Dec 15, 2025
@wjones127
Contributor

It looks like your benchmark only compared to having a BTree index on one of the columns. Could you share results if you create a BTree index on both columns? Our query engine can combine the results of multiple index lookups, so I'd be curious how that compared to a compound index.
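
For readers unfamiliar with the dual-index alternative being asked about: each single-column index returns its own set of matching row ids, and the query engine intersects them. A minimal sketch of that idea follows, using a `BTreeSet<u64>` as a stand-in for whatever row-id representation the engine actually uses (an assumption; `intersect_row_ids` is a hypothetical helper, not a Lance function).

```rust
use std::collections::BTreeSet;

/// Combines two independent single-column index lookups by keeping
/// only the row ids that appear in both result sets.
fn intersect_row_ids(tenant_hits: &BTreeSet<u64>, time_hits: &BTreeSet<u64>) -> Vec<u64> {
    tenant_hits.intersection(time_hits).copied().collect()
}
```

The cost of this approach grows with the size of each per-column result set, even when the final intersection is small, which is the overhead a compound index tries to avoid.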

@wjones127 wjones127 self-assigned this Dec 15, 2025
@tomsanbear
Author

Reran the benchmark, will push an update to it later but here's a look at the initial result:

Benchmarks run on multi-tenant time-series data with queries like WHERE tenant_id = X AND timestamp > Y:

Single fragment dataset:

| Scenario | No Index | BTree (tenant only) | Dual BTree | Compound | Compound vs BTree | Compound vs Dual |
|---|---|---|---|---|---|---|
| Tenant only | 9.31ms | 456µs | 459µs | 1.13ms | 2.47x slower | 2.45x slower |
| Tenant + narrow range | 10.62ms | 476µs | 377µs | 670µs | 1.41x slower | 1.78x slower |
| Tenant + wide range | 9.38ms | 514µs | 1.67ms | 959µs | 1.87x slower | 1.74x faster |
| Tenant + full range | 8.92ms | 512µs | 3.09ms | 1.15ms | 2.25x slower | 2.68x faster |
| Timestamp only | 4.43ms | 4.43ms | 3.47ms | 4.22ms | 1.05x faster | 1.22x slower |

Multi-fragment dataset (more realistic production scenario):

| Scenario | No Index | BTree (tenant only) | Dual BTree | Compound | Compound vs BTree | Compound vs Dual |
|---|---|---|---|---|---|---|
| Tenant only | 8.55ms | 2.85ms | 2.70ms | 2.92ms | 1.02x slower | 1.08x slower |
| Tenant + narrow range | 8.23ms | 2.08ms | 472µs | 693µs | 3.00x faster | 1.47x slower |
| Tenant + medium range | 7.43ms | 2.31ms | 2.77ms | 2.07ms | 1.11x faster | 1.34x faster |
| Tenant + wide range | 7.69ms | 1.89ms | 767µs | 818µs | 2.32x faster | 1.07x slower |

Contributor

@wjones127 wjones127 left a comment

My immediate concerns on this PR aren't about the compound index itself, but about the changes to expression handling and other parts of the generic index code. I think this would be the first index that covers multiple columns, and it needs careful design. I think we need a design discussion at that level before we are ready to move forward on the compound index itself.

I think reviewing both the API for multi-column indices and the concrete impl of the compound index might be too much. I think we should first outline the API changes needed to support multi-column indices (how are queries routed to them and run, for example).

I think the implementation itself looks pretty solid. There are lots of good tests, including property testing which I like to see.

@westonpace
Member

I agree this probably needs to be broken up (11K LOC for one PR is a little daunting 😰). That being said it looks like a lot of well tested work. Thank you for starting this effort!

I also agree with Will that a good start should be on supporting indexes on multiple columns both in the table format and the scanner. This is likely to be useful for other indexes as well. Maybe a good order can be...

  • Table format support for multiple indexes
  • Compound sargable query and its parser
  • Scanner support for multiple indexes
  • Compound btree index search and simple train
  • Distributed training

At a glance it seems very reasonable and there is historical tradition for these kinds of compound indexes. However, like Will, I am also interested in understanding how compound indexes compare to multiple individual indexes.

I do believe they can speed up certain classes of queries. Tenant + narrow range makes sense to me as the winner since the alternative requires crafting the entire bitmap of tenant which can be slow. Although in that case I might wonder if a bitmap index on tenant plus a btree index on the range column would perform similarly to this compound case.

@tomsanbear
Author

Appreciate the feedback, that makes sense. I figured this would need several iterations and probably redesigns. For the most part I wanted to get a POC done on our side to evaluate it against other indexing strategies, and it achieved that.

I guess my next question is: what is the best way to approach the proper design and development stages for this? I'm interested in contributing to this, along with any parallel or prerequisite groundwork that you feel might be needed before a multi-column index can go in.

Introduces compound B-tree scalar indices enabling efficient lookups on
multi-column predicates using a single index structure.

Features:
- CompoundSargableQuery and CompoundScalarQuery types for multi-column queries
- CompoundBTreeIndex with load(), search(), update(), and merge support
- CompoundQueryParser for extracting multi-column predicates from expressions
- Per-column page statistics for query pruning
- Support for prefix lookups, range queries, IN-lists, and IS NULL
- Fragment reuse remapping for compound indices during compaction
- Leftmost prefix rule semantics (2-8 columns)

API:
  dataset.create_index(&["tenant_id", "timestamp"], IndexType::Scalar)
    .with_index_name("tenant_time_idx")
    .execute().await?;

Query patterns supported:
- Full key equality: col1 = X AND col2 = Y AND col3 = Z
- Prefix lookup: col1 = X or col1 = X AND col2 = Y
- Prefix + range: col1 = X AND col2 > Y
- IN-list: col1 IN (...) or col1 = X AND col2 IN (...)
- IS NULL: col1 = X AND col2 IS NULL

Adds a fourth benchmark scenario with two separate BTree indices (one on
tenant_id, one on timestamp) to compare against the compound index.

Key findings:
- Compound index is 2-3x faster than single BTree for multi-column queries
- Dual BTree can outperform compound for very narrow range queries
- Compound wins for medium/wide ranges by avoiding intersection overhead
@tomsanbear tomsanbear force-pushed the feat/compound-index-clean branch from 411d221 to 4871f9c January 15, 2026 17:06
@tomsanbear tomsanbear closed this Jan 15, 2026