Embedded SQLite micro-framework for hybrid search (FTS5 + vectors), IoT and RAG.
No server. No Docker. No network. One .db file. Runs on anything — Raspberry Pi, industrial PCs, that old Celeron laptop collecting dust on a shelf.
GitHub: github.com/SiberXXX/sqfox | README in Russian
```shell
# Core only (zero dependencies, includes SqliteHnswBackend + SqliteFlatBackend)
pip install sqfox

# With hybrid search (English)
pip install sqfox[search]

# With hybrid search (English + Russian)
pip install sqfox[search-ru]
```

```python
from sqfox import SQFox

with SQFox("app.db") as db:
    db.write("CREATE TABLE sensors (ts TEXT, value REAL)", wait=True)

    # Non-blocking writes from any thread
    db.write("INSERT INTO sensors VALUES (?, ?)", ("2026-03-01T10:00:00", 23.5))

    # Reads — parallel, non-blocking
    rows = db.fetch_all("SELECT * FROM sensors ORDER BY ts DESC LIMIT 10")
```

All writes go through a single writer thread with batching. Multiple threads can read in parallel via WAL mode. PRAGMAs are tuned automatically.
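A minimal sketch of the single-writer idea described above: one thread drains a queue and commits several writes per transaction, so concurrent producers never contend for the write lock. This is an illustrative model under stated assumptions, not sqfox's implementation; the filename `demo.db` is arbitrary.

```python
import queue
import sqlite3
import threading

def writer_loop(db_path: str, q: "queue.Queue[tuple]") -> None:
    """Drain the queue; commit many queued writes in one transaction."""
    conn = sqlite3.connect(db_path, isolation_level=None)  # autocommit mode
    while True:
        sql, params = q.get()              # block for the first write...
        if sql is None:
            break                          # sentinel: shut down
        batch = [(sql, params)]
        while not q.empty():               # ...then grab whatever else is queued
            item = q.get_nowait()
            if item[0] is None:
                q.put(item)                # keep the sentinel for the outer loop
                break
            batch.append(item)
        conn.execute("BEGIN IMMEDIATE")    # take the write lock once per batch
        for s, p in batch:
            conn.execute(s, p)
        conn.commit()                      # one fsync for the whole batch
    conn.close()

q: "queue.Queue[tuple]" = queue.Queue()
t = threading.Thread(target=writer_loop, args=("demo.db", q), daemon=True)
t.start()
q.put(("CREATE TABLE IF NOT EXISTS sensors (ts TEXT, value REAL)", ()))
q.put(("INSERT INTO sensors VALUES (?, ?)", ("2026-03-01T10:00:00", 23.5)))
q.put((None, ()))                          # sentinel: stop the writer
t.join()
```

Batching is what makes this pattern friendly to SD cards: N inserts cost one fsync instead of N.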
```python
from sqfox import SQFox

def my_embed(texts):
    """Your embedding function — any model, any API."""
    raise NotImplementedError("Implement your embedding function")

with SQFox("knowledge.db") as db:
    db.ingest("Short note about SQLite WAL mode", embed_fn=my_embed, wait=True)

    results = db.search("database optimization", embed_fn=my_embed)
    for r in results:
        print(f"[{r.score:.3f}] {r.text}")
```

Always use a chunker for documents longer than ~500 words. Without chunking, the entire document becomes a single vector — this kills search precision.
```python
from sentence_transformers import SentenceTransformer

class QwenEmbedder:
    def __init__(self, dim=256):
        self.model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B", truncate_dim=dim)

    def embed_documents(self, texts):
        return self.model.encode(texts).tolist()

    def embed_query(self, text):
        return self.model.encode(text, prompt_name="query").tolist()

embedder = QwenEmbedder()

with SQFox("rag.db") as db:
    db.ingest("Document text...", embed_fn=embedder, wait=True)
    results = db.search("query text", embed_fn=embedder)
```

sqfox detects `embed_documents` / `embed_query` methods and calls the right one automatically. Plain callables work too.
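The dispatch rule described above can be modeled with a simple `hasattr` check: prefer the purpose-specific method, fall back to calling the object directly. A hypothetical sketch of the idea, not sqfox's internals; `embed_for_ingest` and `embed_for_query` are invented names.

```python
def embed_for_ingest(embed_fn, texts: list[str]) -> list[list[float]]:
    """Prefer the document-specific method; fall back to a plain callable."""
    if hasattr(embed_fn, "embed_documents"):
        return embed_fn.embed_documents(texts)
    return embed_fn(texts)

def embed_for_query(embed_fn, text: str) -> list[float]:
    """Prefer the query-specific method (asymmetric models), else call directly."""
    if hasattr(embed_fn, "embed_query"):
        return embed_fn.embed_query(text)
    return embed_fn([text])[0]

# A plain callable works through the fallback path
def fake_embed(texts):
    return [[float(len(t))] for t in texts]

print(embed_for_query(fake_embed, "abc"))  # [3.0]
```

Duck typing keeps the API open: any object with the right method names, or any callable, plugs in without subclassing.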
sqfox ships with two pure-Python vector backends — zero C dependencies:
- HNSW (auto-selected for most environments) — O(log N) approximate nearest-neighbor search, graph stored as BLOB in SQLite.
- Flat — brute-force KNN, simple and exact.
```python
from sqfox import SQFox, SqliteHnswBackend

# Auto-selected for desktop/server — HNSW (O(log N), pure Python, graph stored in SQLite as BLOB)
db = SQFox("app.db")

# Flat — brute-force KNN, exact search
db = SQFox("app.db", vector_backend="flat")

# HNSW with custom parameters
backend = SqliteHnswBackend(M=16, ef_construction=200, ef_search=64)
db = SQFox("app.db", vector_backend=backend)
```

The HNSW graph is stored alongside documents in the same .db file. SQLite is the source of truth — if the graph is lost or corrupted, it rebuilds automatically from the embedding BLOBs on the next startup.
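What the Flat backend does is plain exhaustive KNN. A dependency-free sketch of that idea using cosine similarity; this is illustrative only, and the `store` dict stands in for whatever storage the real backend uses.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def flat_knn(query: list[float],
             vectors: dict[int, list[float]],
             k: int) -> list[tuple[int, float]]:
    """Score every stored vector against the query: O(N), but exact."""
    scored = [(key, cosine(query, vec)) for key, vec in vectors.items()]
    scored.sort(key=lambda kv: kv[1], reverse=True)
    return scored[:k]

store = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.7, 0.7]}
print([key for key, _ in flat_knn([1.0, 0.1], store, k=2)])  # [1, 3]
```

The linear scan is why Flat stays exact and simple, and why HNSW's O(log N) graph search wins once collections grow past a few thousand vectors.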
Write your own backend — implement the VectorBackend protocol:
```python
from sqfox import SQFox, VectorBackend

class MyBackend:
    def set_writer_conn(self, conn) -> None: ...
    def initialize(self, db_path: str, ndim: int) -> None: ...
    def add(self, keys: list[int], vectors: list[list[float]]) -> None: ...
    def remove(self, keys: list[int]) -> None: ...
    def search(self, query: list[float], k: int, **kwargs) -> list[tuple[int, float]]: ...
    def flush(self) -> None: ...
    def count(self) -> int: ...
    def close(self) -> None: ...

db = SQFox("app.db", vector_backend=MyBackend())
```

```python
from sqfox import SQFox, sentence_chunker, markdown_chunker, html_to_text

with SQFox("docs.db") as db:
    db.ingest(text, chunker=sentence_chunker(chunk_size=500, overlap=1), embed_fn=my_embed, wait=True)
    db.ingest(md_text, chunker=markdown_chunker(max_level=2), embed_fn=my_embed, wait=True)

    clean = html_to_text(raw_html)
    db.ingest(clean, chunker=sentence_chunker(), embed_fn=my_embed, wait=True)
```

Available: `sentence_chunker`, `paragraph_chunker`, `markdown_chunker`, `recursive_chunker`, `html_to_text`. Custom `(str) -> list[str]` callables also work.
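Since any `(str) -> list[str]` callable is accepted, a custom chunker is just a function. Below is a hypothetical fixed-size word chunker with overlap; the name, parameters, and splitting rule are illustrative, not one of sqfox's built-ins.

```python
def word_chunker(chunk_words: int = 120, overlap_words: int = 20):
    """Return a (str) -> list[str] callable that splits on word count."""
    def chunk(text: str) -> list[str]:
        words = text.split()
        if not words:
            return []
        step = max(1, chunk_words - overlap_words)
        chunks = []
        for i in range(0, len(words), step):
            chunks.append(" ".join(words[i:i + chunk_words]))
            if i + chunk_words >= len(words):
                break               # last chunk already reaches the end
        return chunks
    return chunk

print(word_chunker(chunk_words=5, overlap_words=2)(
    "one two three four five six seven eight"
))  # ['one two three four five', 'four five six seven eight']
```

It would plug in the same way as the built-ins, e.g. `db.ingest(text, chunker=word_chunker(), embed_fn=my_embed, wait=True)`.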
```python
from sqfox import SQFoxManager

with SQFoxManager("./databases") as mgr:
    mgr.ingest_to("sensors", "Temperature: 25.3C", embed_fn=my_embed, wait=True)
    mgr.ingest_to("manuals", "Replace bearing at 50k RPM", embed_fn=my_embed, wait=True)

    results = mgr.search_all("bearing vibration", embed_fn=my_embed)
    for db_name, r in results:
        print(f"[{db_name}] [{r.score:.3f}] {r.text}")

    mgr.drop("old_logs", delete_file=True)
```

```python
db.ingest(
    "Pressure sensor calibration procedure",
    metadata={"source": "manual_v3", "equipment": "PS-100"},
    embed_fn=my_embed, wait=True,
)

results = db.search("calibration", embed_fn=my_embed)
print(results[0].metadata)  # {"source": "manual_v3", "equipment": "PS-100"}
```

Two-stage retrieval: fast hybrid search retrieves candidates, then a cross-encoder re-scores them:
```python
from sentence_transformers import CrossEncoder

reranker_model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def my_reranker(query: str, texts: list[str]) -> list[float]:
    pairs = [(query, t) for t in texts]
    return reranker_model.predict(pairs).tolist()

results = db.search(
    "database optimization",
    embed_fn=my_embed,
    reranker_fn=my_reranker,
    rerank_top_n=20,
)
```

```python
with SQFox("app.db") as db:
    db.backup("backup.db")

    def progress(status, remaining, total):
        print(f"Copied {total - remaining}/{total} pages")

    db.backup("backup.db", progress=progress)
```

```python
with SQFox("app.db") as db:
    # Rewrite DB file, reclaim space from deleted rows
    db.vacuum()

    # Or write a compacted copy (no 2x disk space needed)
    db.vacuum(into="compacted.db")
```

AsyncSQFox — async facade with dual thread pools. I/O reads are never blocked by CPU-heavy embedding/reranking:
```python
from contextlib import asynccontextmanager

from fastapi import FastAPI
from sqfox import AsyncSQFox

db = AsyncSQFox("data.db", max_cpu_workers=2)

@asynccontextmanager
async def lifespan(app: FastAPI):
    async with db:
        yield

app = FastAPI(lifespan=lifespan)

@app.post("/ingest")
async def ingest(text: str):
    doc_id = await db.ingest(text, embed_fn=my_embed)
    return {"id": doc_id}

@app.get("/search")
async def search(q: str):
    return await db.search(q, embed_fn=my_embed)
```

Two pools: an I/O pool (the default asyncio executor) for `fetch_one` / `fetch_all` / `backup`, and a CPU pool (a bounded `ThreadPoolExecutor`) for `ingest` / `search` with embeddings. Even with 200 concurrent ingests, only `max_cpu_workers` embeddings run at once.

Thread safety: if your model doesn't support concurrent calls, set `max_cpu_workers=1`.
Single process: use `uvicorn app:app --workers 1`. For multi-worker deployments, use PostgreSQL.
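The dual-pool pattern itself is plain asyncio: route heavy calls to a bounded executor and leave light I/O on the default one. A minimal standalone sketch of the idea described above, not AsyncSQFox's code; `fake_embed` is a stand-in for real model inference.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

cpu_pool = ThreadPoolExecutor(max_workers=2)  # CPU-heavy work: at most 2 at once

def fake_embed(text: str) -> list[float]:
    time.sleep(0.05)                # stand-in for model inference
    return [float(len(text))]

async def ingest(text: str) -> list[float]:
    loop = asyncio.get_running_loop()
    # Heavy work goes to the bounded CPU pool...
    return await loop.run_in_executor(cpu_pool, fake_embed, text)

async def fetch(sql: str) -> str:
    loop = asyncio.get_running_loop()
    # ...while I/O uses the default executor, so reads are never starved.
    return await loop.run_in_executor(None, lambda: sql.upper())

async def main() -> None:
    results = await asyncio.gather(
        ingest("hello"), ingest("world"), fetch("select 1"),
    )
    print(results)  # [[5.0], [5.0], 'SELECT 1']

asyncio.run(main())
```

Because the CPU pool is bounded, even hundreds of concurrent `ingest` awaits queue up behind `max_workers` slots instead of spawning threads, while `fetch` keeps running on the default executor.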
```python
def on_error(sql: str, exc: Exception):
    with open("failed_queries.log", "a") as f:
        f.write(f"{sql}\n{exc}\n\n")

db = SQFox("app.db", error_callback=on_error)
```

Fires on syntax errors, constraint violations, batch aborts, and writer thread crashes. Without a callback, errors still go to the Future object and Python logging.
```
Writer Thread (single)          Reader Threads (many)
        |                               |
  PriorityQueue                 threading.local()
        |                               |
  Batching (N writes             One connection
  per transaction)               per thread
        |                               |
 Writer Connection             Reader Connections
        |                               |
        +--------- SQLite WAL ---------+
                       |
                  One .db file
                       |
            +----------+----------+
            |                     |
      Flat backend           HNSW graph
      (brute KNN)        (BLOB in SQLite)
```
- Writes: All writes go through a `PriorityQueue` to a single writer thread, batched into one transaction (`BEGIN IMMEDIATE`). No `database is locked` errors.
- Reads: Each thread gets its own connection via `threading.local()`. Parallel, non-blocking (WAL mode).
- Search: FTS5 (BM25) and the vector backend (HNSW or Flat) run independently. Results are merged via Relative Score Fusion with adaptive alpha.
- Vector backends: `SqliteHnswBackend` (default, pure-Python HNSW, O(log N), graph serialized as a CSR BLOB in SQLite) or `SqliteFlatBackend` (brute-force KNN). Custom backends via the `VectorBackend` protocol. Pure Python, zero C dependencies.
- Crash safety: Embedding BLOBs are always stored in SQLite (source of truth). The HNSW graph is a rebuildable cache — if corrupted, it auto-recovers from the BLOBs on startup.
- Auto-adaptive: Detects RAM, CPU count, platform (Desktop / Raspberry Pi / Android Termux / SBC), and SD cards. Tunes PRAGMAs automatically: LOW (<1 GB) → 4 MB cache, no mmap; MEDIUM (1–4 GB) → 16 MB cache, 64 MB mmap; HIGH (>4 GB) → 64 MB cache, 256 MB mmap. The vector backend is auto-selected on the first `ingest()`. Incremental vacuum runs silently in the background. Zero config needed.
- Auto-PRAGMA: WAL, `synchronous=NORMAL`, `busy_timeout=5000`, `temp_store=MEMORY`, cache/mmap auto-tuned to RAM, `foreign_keys=ON`.
- Schema: Evolves automatically — `EMPTY → BASE → INDEXED → SEARCHABLE → ENRICHED`. No manual migrations. Idempotent. Resumable backfill.
- Lemmatization: Mixed RU+EN text is handled per word — pymorphy3 for Cyrillic, simplemma for Latin.
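The fusion step can be sketched generically: min-max normalize each result list so BM25 and cosine scores become comparable, then blend with an alpha weight. The normalization choice and the fixed alpha below are illustrative assumptions; sqfox adapts alpha internally.

```python
def _normalize(scores: dict[int, float]) -> dict[int, float]:
    """Min-max scale scores to [0, 1] so the two score spaces are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 1.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def relative_score_fusion(bm25: dict[int, float],
                          vec: dict[int, float],
                          alpha: float = 0.5) -> list[tuple[int, float]]:
    """Blend normalized keyword and vector scores; alpha weights the vector side."""
    nb, nv = _normalize(bm25), _normalize(vec)
    keys = set(nb) | set(nv)
    fused = {k: alpha * nv.get(k, 0.0) + (1 - alpha) * nb.get(k, 0.0) for k in keys}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Doc 2 ranks highly in both lists, so it dominates the fused ranking
print(relative_score_fusion({1: 2.0, 2: 8.0}, {2: 0.9, 3: 0.4}, alpha=0.6)[0])  # (2, 1.0)
```

Documents found by only one of the two searches still participate; they just score zero on the side that missed them.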
| Scenario | Description | Key features |
|---|---|---|
| DIY smart home / automation | Old laptop or Celeron mini-PC on a shelf — boiler control, sensor logging, climate automation. No cloud, no subscription, just Python + sqfox on hardware you already own | Concurrent writes, WAL, batching, backup, Grafana via SQLite plugin |
| Telegram / Discord bot | Local knowledge base for bots, desktop tools, CLI utilities, offline assistants | AsyncSQFox, hybrid search, chunking, metadata, reranking |
| Industrial IoT + Grafana | Edge gateways (Raspberry Pi, Jetson, industrial PCs). Grafana reads the same .db via frser-sqlite-datasource | Concurrent writes, WAL, batching, backup |
| Self-diagnosing edge agent | Two AsyncSQFox instances: telemetry (I/O pool) + knowledge base (CPU pool). Threshold triggers auto-RAG search | Dual-pool isolation, auto-RAG |
| OBD-II smart mechanic | ELM327 reads engine data, sqfox auto-searches service manual on DTC codes | See demo/obd_smart_mechanic.py |
sqfox is not a replacement for PostgreSQL. It is designed for a specific niche.
| Requirement | Use instead |
|---|---|
| Multiple processes writing to one DB | PostgreSQL, MySQL |
| More than ~500K vectors | pgvector, Qdrant, Milvus |
| Multi-server / distributed / HA | PostgreSQL, Elasticsearch, CockroachDB |
| High-frequency data (>10K points/sec) | InfluxDB, ClickHouse |
- SqliteHnswBackend — pure Python HNSW. Faster than brute-force on 10K+ vectors, but slower than C/Rust implementations (usearch, hnswlib). Acceptable for edge/embedded scenarios.
- Lemmatization without context — pymorphy3 picks the most frequent word form (~79% accuracy). Fine for search, not perfect.
- `synchronous=NORMAL` — the last few seconds of commits may be lost on power failure; the DB file won't corrupt. Use `synchronous=FULL` manually if you need guaranteed durability.
- SD card wear — batching reduces fsyncs, but plan card replacement for long-running IoT deployments.
Single process, multiple threads. Up to ~100K+ docs with HNSW backend, or ~1M rows without vectors. Local/embedded deployment. Runs fine on dual-core Celerons, old Core 2 Duo laptops, Raspberry Pi — anything with Python 3.10+. Perfect for DIY home automation, boiler controllers, sensor logging, local knowledge bases. If you're thinking "maybe I need PostgreSQL" — you probably do. sqfox is for cases where PostgreSQL is overkill or impossible.
```python
with SQFox("app.db") as db:
    print(db.diagnostics())
```

```json
{
  "sqfox_version": "0.3.0",
  "python_version": "3.13.7",
  "platform": "Linux-6.1.0-rpi-aarch64",
  "sqlite_version": "3.50.4",
  "path": "app.db",
  "is_running": true,
  "vector_backend": "hnsw",
  "queue_size": 0,
  "schema_state": "ENRICHED",
  "auto": {
    "total_ram_mb": 512,
    "memory_tier": "LOW",
    "cpu_count": 4,
    "platform_class": "RASPBERRY_PI",
    "is_sd_card": false,
    "fts5_available": true,
    "resolved_cache_size_kb": 4000,
    "resolved_mmap_size_mb": 0
  }
}
```

sqfox auto-detects your environment at startup and configures itself:
| Detected | How | Fallback |
|---|---|---|
| RAM | `/proc/meminfo`, `sysctl`, `GlobalMemoryStatusEx` | 1 GB |
| CPU count | `sched_getaffinity` (Docker-aware), `cpu_count()` | 1 |
| Platform | `TERMUX_VERSION` env, `/proc/device-tree/model`, arch | Desktop |
| SD card | Path heuristic (`/media/`, `/mnt/sd`, etc.) | No |
| FTS5 | Probe in `:memory:` | Vector-only search |
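The Linux path of the RAM row can be sketched as a `/proc/meminfo` parse with the documented 1 GB fallback. A simplified illustration covering only one of the probes listed above; the function name is hypothetical.

```python
def detect_ram_mb(meminfo_path: str = "/proc/meminfo", fallback_mb: int = 1024) -> int:
    """Read MemTotal from /proc/meminfo; fall back to 1 GB when unavailable."""
    try:
        with open(meminfo_path) as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    kb = int(line.split()[1])   # MemTotal is reported in kB
                    return kb // 1024
    except OSError:
        pass                                    # non-Linux or unreadable: use fallback
    return fallback_mb

print(detect_ram_mb())
```

The try/except-plus-fallback shape is what makes zero-config detection safe: a failed probe degrades to a conservative default instead of crashing at startup.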
PRAGMA tuning by memory tier:
| Tier | RAM | cache_size | mmap_size |
|---|---|---|---|
| LOW | < 1 GB | 4 MB | 0 (disabled) |
| MEDIUM | 1–4 GB | 16 MB | 64 MB |
| HIGH | > 4 GB | 64 MB | 256 MB |
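The tier table maps directly to a small selection function. A sketch reproducing the documented thresholds; the function name and tuple return shape are illustrative, with cache and mmap sizes expressed in MB as in the table.

```python
def pragma_tier(ram_mb: int) -> tuple[str, int, int]:
    """Map total RAM to (tier, cache_size_mb, mmap_size_mb) per the tier table."""
    if ram_mb < 1024:
        return ("LOW", 4, 0)        # < 1 GB: 4 MB cache, mmap disabled
    if ram_mb <= 4096:
        return ("MEDIUM", 16, 64)   # 1-4 GB: 16 MB cache, 64 MB mmap
    return ("HIGH", 64, 256)        # > 4 GB: 64 MB cache, 256 MB mmap

print(pragma_tier(512))   # ('LOW', 4, 0)
```

A 512 MB Raspberry Pi lands in LOW, which matches the `"memory_tier": "LOW"` seen in the diagnostics example earlier.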
Zero config needed. Just SQFox("data.db") — it works on a Raspberry Pi, an Android phone, or a 64 GB desktop.
To override auto-detection, pass explicit values:
```python
# Force specific settings (disables auto-tuning for these params)
db = SQFox(
    "sensors.db",
    cache_size_kb=8_000,  # override auto: 8 MB
    mmap_size_mb=0,       # override auto: disable mmap
)
```

Additional tips:
- SD card durability: the default `synchronous=NORMAL` may lose the last transaction on power loss (the file won't corrupt). For hard guarantees, run `db.write("PRAGMA synchronous=FULL", wait=True)` after start.
- Disk space: sqfox enables incremental auto-vacuum on new databases and reclaims pages silently during ingestion. For manual compaction: `db.vacuum()` or `db.vacuum(into="compacted.db")`.
- Android / Termux: FTS5 support depends on the Python build. Most Kivy, BeeWare, and Termux Python packages include FTS5. If FTS5 is unavailable, sqfox detects this and works with vector-only search — no crash, no config needed.
- Graceful shutdown: sqfox registers an `atexit` hook so the writer thread is drained on normal process exit. For `SIGKILL` or hard power-off, SQLite WAL journaling protects the database file.
- FTS5 self-healing: On every startup, sqfox runs an FTS integrity check. If the index is corrupt, it auto-rebuilds.
| Platform | Status |
|---|---|
| Linux x86-64 | works |
| Linux ARM64 (Raspberry Pi, industrial PCs) | works |
| macOS Intel / Apple Silicon | works |
| Windows x86-64 | works |
| Alpine Linux (musl) | works |
| 32-bit / Windows ARM64 | works (pure Python, no C extensions) |
All vector backends are pure Python with zero C dependencies, so sqfox runs on any platform with Python 3.10+.
```shell
# IoT + RAG (requires sentence-transformers)
python demo/run_demo.py

# HNSW X-Ray — interactive graph inspector (requires sentence-transformers)
python demo/run_hnsw_xray.py

# Crash & Recovery — corrupt HNSW graph, auto-rebuild (requires sentence-transformers)
python demo/run_crash_recovery.py

# Smart home IoT emulation
python demo/run_smart_home.py

# OBD-II smart mechanic
python demo/run_smart_mechanic.py
```

- Python >= 3.10
- Core: zero dependencies (stdlib only)
- `[search]`: simplemma
- `[search-ru]`: + pymorphy3
- `[search-hnsw]`: usearch, numpy (optional USearch backend)
- SqliteHnswBackend + SqliteFlatBackend: included in core (pure Python, no extra deps)
MIT
