High-performance columnar permutation index and filter engine for Apache Arrow.
arrow-view-state is a pure indexing engine — no I/O, no column names, no pagination, no
application state. It separates view-state computation (sorting, filtering, selection) from
data access so that UI layers can reorder and filter millions of Arrow rows without copying
or rewriting the underlying RecordBatches.
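
To make the idea of reordering without copying concrete, here is a minimal std-only sketch of a permutation index. It is not the crate's implementation: `argsort` and `read_window` are hypothetical helpers, and a plain `&[i64]` stands in for an Arrow column. The point is only that sorting produces a list of physical row IDs, and a UI window is a slice of that list.

```rust
/// Build a permutation: position `i` holds the physical row that should
/// appear at virtual row `i` once `keys` is sorted ascending.
/// Hypothetical helper, not part of arrow-view-state.
fn argsort(keys: &[i64]) -> Vec<u32> {
    let mut perm: Vec<u32> = (0..keys.len() as u32).collect();
    // The sort is stable, so equal keys keep their original physical order.
    perm.sort_by_key(|&row| keys[row as usize]);
    perm
}

/// Return the physical row IDs backing virtual rows `start..start + len`,
/// clamped to the end of the permutation. The key data is never touched.
fn read_window(perm: &[u32], start: usize, len: usize) -> &[u32] {
    let begin = start.min(perm.len());
    let end = start.saturating_add(len).min(perm.len());
    &perm[begin..end]
}
```

Only the `u32` row IDs are moved during the sort; the column values stay where they are and can be fetched later for just the rows a window needs.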
- `SortBuilder`: streaming multi-column sort; feed RecordBatches one at a time, finish into a `PermutationIndex`
- `PermutationIndex`: virtual-to-physical row mapping; windowed reads, late materialisation
- `FilterIndex`: sparse filter backed by Roaring Bitmap with set algebra (`and`, `or`, `not`)
- `PhysicalSelection`: physical row IDs ready for Arrow `take` or Parquet row-selection pushdown
- Parallel argsort via Rayon (optional, default-on)
- Memory-mapped storage for large indices that exceed RAM (`persist` feature)
- Hash index for O(1) equality lookups (`hash-index` feature)
- Inverted index for token-based text search (`inverted-index` feature)
- SIMD predicate evaluation via `arrow-ord` (`evaluate` feature)
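
The set algebra on filters can be illustrated with a small std-only sketch. This is an assumption-laden toy, not the crate's `FilterIndex`: `ToyFilter` is a hypothetical type, and a `BTreeSet<u32>` stands in for the Roaring Bitmap. Note that `not` takes the total row count in this sketch, because a set of IDs alone does not know how many rows exist.

```rust
use std::collections::BTreeSet;

/// Toy filter: the set of physical row IDs that pass a predicate.
/// A BTreeSet stands in for a Roaring Bitmap (hypothetical, for illustration).
#[derive(Debug, Clone, PartialEq)]
struct ToyFilter {
    ids: BTreeSet<u32>,
}

impl ToyFilter {
    fn from_ids(ids: impl IntoIterator<Item = u32>) -> Self {
        Self { ids: ids.into_iter().collect() }
    }

    /// Rows present in both filters (intersection).
    fn and(&self, other: &Self) -> Self {
        Self { ids: self.ids.intersection(&other.ids).copied().collect() }
    }

    /// Rows present in either filter (union).
    fn or(&self, other: &Self) -> Self {
        Self { ids: self.ids.union(&other.ids).copied().collect() }
    }

    /// Rows in `0..total_rows` that are absent from this filter (complement).
    fn not(&self, total_rows: u32) -> Self {
        Self {
            ids: (0..total_rows).filter(|id| !self.ids.contains(id)).collect(),
        }
    }
}
```

A compressed bitmap gives the same three operations with far less memory on large, clustered ID sets, which is presumably why the crate uses one.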
[dependencies]
arrow-view-state = "0.1"

For WASM or minimal builds, disable the default parallel feature:

arrow-view-state = { version = "0.1", default-features = false }

use std::sync::Arc;
use arrow_array::{ArrayRef, Int64Array, StringArray, RecordBatch};
use arrow_schema::{Schema, Field, DataType, SortOptions};
use arrow_row::SortField;
use arrow_view_state::{SortBuilder, FilterIndex};
let schema = Arc::new(Schema::new(vec![
    Field::new("name", DataType::Utf8, false),
    Field::new("value", DataType::Int64, false),
]));
let batch = RecordBatch::try_new(schema.clone(), vec![
    Arc::new(StringArray::from(vec!["B", "A", "B", "A"])) as ArrayRef,
    Arc::new(Int64Array::from(vec![10, 20, 30, 40])) as ArrayRef,
]).unwrap();
// Sort by name asc, then value desc.
let fields = vec![
    SortField::new(DataType::Utf8),
    SortField::new_with_options(
        DataType::Int64,
        SortOptions { descending: true, nulls_first: true },
    ),
];
let mut builder = SortBuilder::new(fields);
builder.push(&[batch.column(0).clone(), batch.column(1).clone()]).unwrap();
let sorted = builder.finish().unwrap();
let range = sorted.read_range(0, 2); // windowed read
let filter = FilterIndex::from_ids([0, 2]);
let filtered = filter.apply_to_permutation(&sorted); // sparse filter
let selection = sorted.to_physical_selection(0..2); // late materialisation

See EXAMPLES.md for annotated walkthroughs.

| Example | What it shows |
|---|---|
| Sort & Window | Multi-column sort, windowed read, late materialisation |
| Filter Algebra | Composing FilterIndex with and / or / not |
| Persist to Disk | Save and reload a PermutationIndex via the persist feature |
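
The ordering requested in the quick-start snippet (name ascending, then value descending) can be checked by hand with a std-only sketch over the same four rows. Nothing here calls the crate: `ROWS`, `sort_rows`, and `apply_filter` are hypothetical names, and tuples stand in for the Arrow columns.

```rust
/// Physical rows of the quick-start batch as (name, value) pairs.
const ROWS: [(&str, i64); 4] = [("B", 10), ("A", 20), ("B", 30), ("A", 40)];

/// Sort by name ascending, then value descending, and return the
/// physical row IDs in view order. Hypothetical helper.
fn sort_rows(rows: &[(&str, i64)]) -> Vec<u32> {
    let mut perm: Vec<u32> = (0..rows.len() as u32).collect();
    perm.sort_by(|&a, &b| {
        let (name_a, value_a) = rows[a as usize];
        let (name_b, value_b) = rows[b as usize];
        // Swapping the operands of the second comparison makes it descending.
        name_a.cmp(name_b).then(value_b.cmp(&value_a))
    });
    perm
}

/// Keep the view order, dropping physical rows that are not listed in `keep`.
fn apply_filter(perm: &[u32], keep: &[u32]) -> Vec<u32> {
    perm.iter().copied().filter(|row| keep.contains(row)).collect()
}
```

In this sketch the view order is physical rows 3, 1, 2, 0; the first two-row window is rows 3 and 1, and filtering to IDs 0 and 2 leaves rows 2 then 0, still in sorted order.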

| Flag | Default | Description |
|---|---|---|
| `parallel` | ✓ | Parallel argsort via Rayon |
| `evaluate` | | SIMD predicate evaluation → FilterIndex via arrow-ord |
| `hash-index` | | O(1) equality lookup index |
| `inverted-index` | | Token-based text search index |
| `mmap` | | Memory-mapped temp storage for large sorts |
| `persist` | | Save/load indices to disk (implies `mmap`) |
| `full` | | All of the above |
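
For readers unfamiliar with the two lookup structures behind the `hash-index` and `inverted-index` flags, the following std-only sketch shows the general technique. It makes no claim about the crate's actual types or API: both functions are hypothetical, string slices stand in for Arrow columns, and tokenisation is assumed to be lowercase whitespace splitting.

```rust
use std::collections::HashMap;

/// Equality index: value -> physical row IDs holding exactly that value.
/// Lookups are O(1) on average. Hypothetical helper.
fn build_hash_index(column: &[&str]) -> HashMap<String, Vec<u32>> {
    let mut index: HashMap<String, Vec<u32>> = HashMap::new();
    for (row, value) in column.iter().enumerate() {
        index.entry((*value).to_owned()).or_default().push(row as u32);
    }
    index
}

/// Inverted index: lowercased token -> physical row IDs whose text contains it.
/// Tokens are split on whitespace (an assumption of this sketch).
fn build_inverted_index(column: &[&str]) -> HashMap<String, Vec<u32>> {
    let mut index: HashMap<String, Vec<u32>> = HashMap::new();
    for (row, text) in column.iter().enumerate() {
        for token in text.split_whitespace() {
            let rows = index.entry(token.to_lowercase()).or_default();
            // Rows are visited in order, so checking the last entry is enough
            // to avoid listing a row twice when a token repeats within it.
            if rows.last() != Some(&(row as u32)) {
                rows.push(row as u32);
            }
        }
    }
    index
}
```

Either index yields a list of physical row IDs, which is the same shape a filter over row IDs consumes.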

Minimum supported Rust version: 1.94 (edition 2024).

See CONTRIBUTING.md.

Licensed under either of MIT or Apache-2.0 at your option.