Project high-dimensional embeddings onto a 3D sphere for fast semantic search, spatial queries, category-aware exploration, and interactive visualization.
sphereQL maps vectors from any embedding model (OpenAI, Cohere,
sentence-transformers, etc.) onto spherical coordinates via one of four
projection families — linear PCA, kernel PCA with a Gaussian (RBF)
kernel, Laplacian eigenmap over a k-NN similarity graph, or random
projection — then indexes them with shell/sector partitioning for fast
nearest-neighbor lookups. A Category Enrichment Layer computes
inter-category relationships, classifies bridges (Genuine /
OverlapArtifact / Weak), and builds inner spheres for
high-resolution within-category search. sphereQL auto-tunes its
pipeline per corpus against a scalar QualityMetric; a meta-model
recalls winning configs from past tuner runs when a new corpus arrives.
Callable from Rust, Python, or the browser via WASM.
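To make the "embedding to spherical coordinates" step concrete, here is a minimal sketch of the simplest of the four families, random projection, written in plain Python. It is not sphereQL's implementation: the helper names, the Gaussian projection matrix, and the angle convention (`theta` as azimuth, `phi` as polar angle) are assumptions made for illustration.

```python
import math
import random

def random_projection_matrix(dim, seed=0):
    """Return a 3 x dim matrix of Gaussian entries (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(3)]

def project_to_sphere(vec, matrix):
    """Project vec to 3D, then convert to (r, theta, phi).

    Assumed convention: theta is the azimuth measured in the x-y plane,
    phi is the polar angle measured from the +z axis.
    """
    x, y, z = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # A zero vector has no direction; return the origin.
        return 0.0, 0.0, 0.0
    theta = math.atan2(y, x) % (2.0 * math.pi)
    # Clamp before acos to guard against rounding slightly outside [-1, 1].
    phi = math.acos(max(-1.0, min(1.0, z / r)))
    return r, theta, phi

matrix = random_projection_matrix(dim=4, seed=42)
r, theta, phi = project_to_sphere([0.1, 0.9, 0.3, 0.0], matrix)
print(r, theta, phi)
```

The PCA, kernel PCA, and Laplacian eigenmap families differ in how they choose the three output axes; the final conversion to angles would look the same under this convention.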
Full documentation lives under docs/.
- Quickstart — Rust · Python · WASM
- Architecture — workspace crates and feature flags
- Projections — how the four projection families work
- Auto-tuning & meta-learning — the metalearning framework
- Empirical findings — when does each projection win?
- Examples catalog · Performance · Project status
# Cargo.toml
[dependencies]
sphereql = { version = "0.1", features = ["full"] }

# Python
pip install sphereql

See architecture.md for feature-flag details.
use sphereql::embed::*;
// 1. Build a pipeline from categorized embeddings.
let input = PipelineInput {
    categories: vec![
        "science".into(), "science".into(),
        "cooking".into(), "cooking".into(),
    ],
    embeddings: vec![
        vec![0.1, 0.9, 0.3, 0.0],
        vec![0.2, 0.8, 0.4, 0.1],
        vec![0.9, 0.1, 0.0, 0.5],
        vec![0.8, 0.2, 0.1, 0.4],
    ],
};
let pipeline = SphereQLPipeline::new(input).unwrap();
// 2. Query nearest neighbors.
let query = PipelineQuery { embedding: vec![0.15, 0.85, 0.35, 0.05] };
let results = pipeline.query(SphereQLQuery::Nearest { k: 3 }, &query);

See the Rust quickstart for spatial indexing,
the layout engine, GraphQL, and the full embedding pipeline.
auto-tuning.md covers the PipelineConfig +
auto_tune + MetaModel workflow end-to-end.
import sphereql
categories = ["science", "science", "cooking", "cooking"]
embeddings = [
    [0.1, 0.9, 0.3, 0.0],
    [0.2, 0.8, 0.4, 0.1],
    [0.9, 0.1, 0.0, 0.5],
    [0.8, 0.2, 0.1, 0.4],
]
pipeline = sphereql.Pipeline(categories, embeddings)
results = pipeline.nearest([0.15, 0.85, 0.35, 0.05], k=3)
# Interactive 3D visualization in your browser
sphereql.visualize(categories, embeddings, title="My Embeddings")

The Python bindings cover the full Rust surface: PCA, Kernel PCA,
Laplacian eigenmap, auto_tune, MetaModel, FeedbackAggregator,
and the category enrichment layer. Type stubs (.pyi) are
auto-generated via pyo3-stub-gen. See the
Python quickstart for semantic search,
3D visualization, vector database bridges, and the core type
surface.
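When trying the quickstart, it can help to have an independent baseline to compare against. The sketch below ranks the same four toy embeddings by cosine similarity in the original 4-D space, using only the standard library. The function names are hypothetical and this is not how sphereQL computes neighbors (it searches on the projected sphere), so small ordering differences from `pipeline.nearest` would not necessarily indicate a bug.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_nearest(query, embeddings, k):
    """Return indices of the k embeddings most similar to query."""
    ranked = sorted(
        range(len(embeddings)),
        key=lambda i: cosine_similarity(query, embeddings[i]),
        reverse=True,
    )
    return ranked[:k]

categories = ["science", "science", "cooking", "cooking"]
embeddings = [
    [0.1, 0.9, 0.3, 0.0],
    [0.2, 0.8, 0.4, 0.1],
    [0.9, 0.1, 0.0, 0.5],
    [0.8, 0.2, 0.1, 0.4],
]

top = brute_force_nearest([0.15, 0.85, 0.35, 0.05], embeddings, k=3)
print([categories[i] for i in top])
```

On this data the two "science" vectors rank ahead of either "cooking" vector, which is the behavior one would hope to see preserved after projection.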
cd sphereql-wasm && wasm-pack build --target web

import init, { Pipeline } from './pkg/sphereql_wasm.js';
await init();
const pipeline = new Pipeline(JSON.stringify({
  categories: ["science", "cooking"],
  embeddings: [[0.1, 0.9, 0.3], [0.9, 0.1, 0.0]],
}));
const results = pipeline.nearest(JSON.stringify([0.15, 0.85, 0.35]), 1);

Same bindings coverage as Python. Every pipeline / category /
metalearning method returns typed values via tsify — wasm-pack build
emits a .d.ts with named interfaces, no JSON.parse required. See
the WASM quickstart for category enrichment
in the browser.
| Crate | Role |
|---|---|
| `sphereql` | Umbrella crate with feature flags for selective imports. |
| `sphereql-core` | Spherical math: points, conversions, distance metrics, region types. |
| `sphereql-index` | Spatial indexing with shell + sector partitioning. |
| `sphereql-layout` | Layout engines (Fibonacci spiral, k-means, force-directed). |
| `sphereql-embed` | Projections, query pipeline, Category Enrichment Layer, metalearning framework. |
| `sphereql-graphql` | async-graphql schema: spatial queries (cone/shell/band/wedge), the full category enrichment surface, subscriptions, and a pluggable TextEmbedder trait for text query input. |
| `sphereql-vectordb` | Vector store bridge (InMemory, Qdrant, Pinecone) with hybrid search. |
| `sphereql-python` | Python bindings via PyO3/maturin. |
| `sphereql-wasm` | WASM bindings via wasm-bindgen. |
| `sphereql-corpus` | Shared example corpora (775-concept built-in + 300-concept stress). |

Full dependency graph and crate-by-crate description in architecture.md.
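As a rough illustration of what "shell + sector partitioning" means, the sketch below buckets spherical points by radial shell, azimuth sector, and polar band, then looks up candidates in the query's bucket. The bucket sizes, function names, and key layout are all assumptions chosen for the example; the actual scheme in sphereql-index is described in architecture.md and may differ.

```python
import math

def bucket_key(r, theta, phi, shell_width=0.25, n_azimuth=8, n_polar=4):
    """Map a spherical point to a (shell, azimuth_sector, polar_band) key.

    Assumes theta in [0, 2*pi] and phi in [0, pi]. The default bucket
    sizes are illustrative only.
    """
    shell = int(r // shell_width)
    # min() keeps the upper boundary (theta == 2*pi, phi == pi) in range.
    azimuth = min(int(theta / (2.0 * math.pi) * n_azimuth), n_azimuth - 1)
    polar = min(int(phi / math.pi * n_polar), n_polar - 1)
    return shell, azimuth, polar

def build_index(points):
    """Group item ids by bucket key. points maps id -> (r, theta, phi)."""
    index = {}
    for item_id, (r, theta, phi) in points.items():
        index.setdefault(bucket_key(r, theta, phi), []).append(item_id)
    return index

points = {
    "a": (1.0, 0.10, 1.50),
    "b": (1.0, 0.20, 1.55),
    "c": (1.0, 3.50, 0.40),
}
index = build_index(points)

# Only items sharing the query's bucket are scanned, not the whole corpus.
candidates = index.get(bucket_key(1.0, 0.15, 1.52), [])
print(candidates)
```

A real index would also have to inspect neighboring buckets, since a true nearest neighbor can sit just across a shell or sector boundary.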
sphereQL is at v0.2.0-alpha. The core API is functional and covered by 450+ tests, but may change before 1.0. Known limitations and roadmap are in project-status.md.
Binding parity is protected by a drift check (scripts/check-drift) —
new public items in sphereql-embed / sphereql-layout must either
have a Python/WASM binding or an allowlist entry with a reason in
.bindings-ignore.toml.
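The rule this check enforces can be sketched in a few lines. The example below is not the contents of scripts/check-drift and does not parse the real .bindings-ignore.toml; the function name, the item names `InternalHelper` and `NewThing`, and the dict-based allowlist are hypothetical stand-ins used only to show the logic.

```python
def find_drift(public_items, bound_items, allowlist):
    """Return public items with neither a binding nor a reasoned allowlist entry.

    allowlist maps item name -> reason string; an empty reason does not count.
    """
    missing = []
    for item in sorted(public_items):
        if item in bound_items:
            continue
        if allowlist.get(item, "").strip():
            continue
        missing.append(item)
    return missing

# Hypothetical inputs for illustration.
public_items = {"auto_tune", "MetaModel", "FeedbackAggregator",
                "InternalHelper", "NewThing"}
bound_items = {"auto_tune", "MetaModel", "FeedbackAggregator"}
allowlist = {
    "InternalHelper": "Rust-only diagnostic type",
    "NewThing": "",  # listed, but with no reason given
}

drift = find_drift(public_items, bound_items, allowlist)
print(drift)
```

In this sketch an allowlist entry without a reason is treated the same as a missing entry, mirroring the "with a reason" requirement stated above.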
- Fork the repo and create a feature branch.
- Run `cargo test --workspace --all-features` and `cargo clippy --workspace --all-features --all-targets`.
- For Python changes, run `cd sphereql-python && maturin develop && pytest -v`.
- Open a PR against `main`.

The codebase uses Rust 2024 edition. All CI checks must pass before merge. See testing.md for the full pipeline.