Remex

Unleash the power of your files with local AI.

Remex turns any folder of documents — PDFs, Word files, notes, spreadsheets, code — into a private, searchable knowledge base. Ask questions in plain language and get answers grounded in your own files, with sources cited.

Everything runs on your machine. No cloud account. No API key required for search. Bring your own AI provider (Anthropic, OpenAI, or a local Ollama instance) only when you want synthesised answers.

Remex Studio

Native desktop app for Windows. No terminal required.

⬇ Download the latest release

⚠️ Windows SmartScreen warning: Windows may display a "Windows protected your PC" warning when you download or install Remex Studio. This happens because the app is not yet code-signed with a paid certificate. The software is safe and fully open source, so feel free to audit the source code in this repository. To proceed, click "More info", then "Run anyway".

What you can do


🔍 Semantic search	Vector similarity search across one or more collections simultaneously
🤖 AI Answer	Ask a question, get a synthesised answer with cited sources (Anthropic · OpenAI · Ollama)
📄 12 file formats	Built in: `.pdf` `.docx` `.md` `.txt` `.csv` `.json` `.jsonl`. Via optional package: `.html` `.pptx` `.xlsx` `.epub` `.odt`
🗄 SQLite ingest	Embed rows from any table alongside your files
♻️ Incremental ingest	SHA-256 hash check — only changed files are re-processed
🎯 Source filter	Narrow results to one or more documents before searching or asking AI
🔎 Chunk viewer	Expand any result to read the full chunk, navigate with keyboard arrows
📦 Collections manager	Rename, describe, purge, bulk-delete sources, one-click re-ingest
📤 Export	JSON · CSV · Markdown · BibTeX · RIS · CSL-JSON · Obsidian vault
👁️ Watch folders	Re-ingest automatically when files change inside chosen directories
🔬 All embedding models	`MiniLM`, `bge-base`, `bge-large`, multilingual, `nomic-embed` long-context, custom HuggingFace/FastEmbed names
🌙 Themes	Light, dark, auto (follows OS) + sixteen accent colours
🔎 Searchable query history	Filter past queries by substring
⌨️ Keyboard-driven	Press `?` anywhere in Studio for the full shortcuts reference
⚙️ Optional packages	Install extra file formats, AI integrations, or sentence chunking from `Settings → General` at any time

Remex is free and open-source. Every feature ships in the box — no tiers, no license keys, no payment required.
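
The incremental ingest feature listed above skips files whose SHA-256 hash has not changed. A minimal sketch of that idea follows, assuming a plain dict that maps each path to its last-seen digest. `file_digest` and `changed_files` are hypothetical names used for illustration; they are not part of Remex's API, and Remex's own bookkeeping may differ.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, block_size: int = 1 << 16) -> str:
    """Return the SHA-256 hex digest of a file, reading it in blocks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def changed_files(folder: Path, seen: dict[str, str]) -> list[Path]:
    """Return files whose digest differs from `seen`, updating `seen` in place."""
    changed = []
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        digest = file_digest(path)
        if seen.get(str(path)) != digest:
            seen[str(path)] = digest  # remember the new digest for the next run
            changed.append(path)
    return changed
```

On a second call with the same `seen` dict, only new or modified files are returned, so only those would need re-processing.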

Python CLI & Library

pip install remex-cli                    # core — ingest + query (7 formats)
pip install "remex-cli[formats]"         # + .pptx .xlsx .epub .html .odt
pip install "remex-cli[ai]"              # + Anthropic & OpenAI embeddings / generation
pip install "remex-cli[sentence]"        # + sentence-aware chunking (NLTK)
pip install "remex-cli[api]"             # + FastAPI sidecar (used by Studio)
pip install "remex-cli[all]"             # everything above

Quick start

# Scaffold a project
remex init

# Ingest a folder of documents
remex ingest ./docs

# Semantic search
remex query "how does authentication work?"

# AI-synthesised answer (requires ANTHROPIC_API_KEY, OPENAI_API_KEY, or a running Ollama)
remex query "how does authentication work?" --ai

Command reference

Command	Description
`remex init [path]`	Scaffold `docs/`, `remex.toml`, and `.gitignore` entries
`remex ingest <dir>`	Ingest files from a directory into a collection
`remex ingest-sqlite <db>`	Ingest rows from a SQLite table
`remex query <text>`	Semantic search; add `--ai` for an AI-synthesised answer
`remex sources`	List all ingested source paths in a collection
`remex stats`	Show chunk and source counts
`remex delete-source <path>`	Remove all chunks for a specific source
`remex purge`	Remove chunks whose source file no longer exists on disk
`remex reset`	Wipe an entire collection
`remex list-collections`	List all collections in a database
`remex serve`	Start the FastAPI sidecar on `localhost:8000`

remex <command> --help    # full option reference for any command
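
The `remex purge` row above describes removing chunks whose source file no longer exists on disk. The check behind that can be sketched in a few lines, assuming the ingested sources are available as a list of path strings. `stale_sources` is a hypothetical helper for illustration, not a Remex function.

```python
from pathlib import Path


def stale_sources(sources: list[str]) -> list[str]:
    """Return the source paths that no longer exist on disk."""
    return [s for s in sources if not Path(s).exists()]
```

Anything this returns is a candidate for removal; files that still exist are left alone.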

Use as a library

from remex import ingest, query

# Ingest a folder
result = ingest("./docs", collection_name="my-kb")
print(f"{result.chunks_stored} chunks stored")

# Search
results = query("how does auth work?", collection_name="my-kb")
for r in results:
    print(f"[{r.score:.3f}] {r.source}  →  {r.text[:120]}")
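
When several chunks come from the same file, it can help to keep only the best hit per source. The sketch below relies only on the `.score` and `.source` attributes used in the loop above, and it assumes that a higher score means a closer match. `best_per_source` is a hypothetical helper, and the `demo` objects are stand-ins so the example runs without a database.

```python
from types import SimpleNamespace


def best_per_source(results):
    """Keep the highest-scoring result for each source, best first."""
    best = {}
    for r in results:
        if r.source not in best or r.score > best[r.source].score:
            best[r.source] = r
    return sorted(best.values(), key=lambda r: r.score, reverse=True)


# Stand-in results with the same attributes as in the example above.
demo = [
    SimpleNamespace(score=0.81, source="auth.md", text="Tokens are issued at login."),
    SimpleNamespace(score=0.64, source="api.md", text="Every request carries a token."),
    SimpleNamespace(score=0.77, source="auth.md", text="Tokens expire after one hour."),
]

for r in best_per_source(demo):
    print(f"[{r.score:.3f}] {r.source}")
```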

Configuration

Drop a `remex.toml` in your project root (or run `remex init` to generate one):

[remex]
db              = "./remex_db"
collection      = "my-kb"
embedding_model = "all-MiniLM-L6-v2"

# chunk_size     = 768          # characters per chunk (512-1024 works well)
# overlap        = 150          # ~20% overlap preserves context at boundaries
# min_chunk_size = 50           # discard chunks shorter than this
# chunking       = "recursive"  # "recursive" (default) | "sentence" | "word"
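
To make `chunk_size`, `overlap`, and `min_chunk_size` concrete, here is a minimal sketch of fixed-size character chunking with overlap. It is a plain sliding window written for illustration, not Remex's `"recursive"` strategy, and `chunk_text` is a hypothetical name.

```python
def chunk_text(text: str, chunk_size: int = 768, overlap: int = 150,
               min_chunk_size: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if len(piece) >= min_chunk_size:  # discard fragments that are too short
            chunks.append(piece)
        if start + chunk_size >= len(text):  # this window reached the end
            break
    return chunks
```

With the values shown, each window advances 618 characters, so the last 150 characters of one chunk reappear at the start of the next.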

CLI flags always override `remex.toml` values.

Supported embedding models

Preset	Model	Size	Notes
Light	`all-MiniLM-L6-v2`	22 MB	Default — fast, good accuracy
Balanced	`intfloat/e5-base-v2`	438 MB	Better retrieval quality
Multilingual	`paraphrase-multilingual-MiniLM-L12-v2`	470 MB	50+ languages
Large (Pro)	`BAAI/bge-large-en-v1.5`	1.3 GB	Best English accuracy
E5 Large (Pro)	`intfloat/e5-large-v2`	1.3 GB	Strong retrieval benchmark
Long ctx (Pro)	`nomic-ai/nomic-embed-text-v1.5`	547 MB	8,192-token context window

Any model from SBERT, HuggingFace sentence-similarity, or Ollama can be used by typing the model name directly.
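
Whichever model is chosen, it maps each chunk and each query to a vector, and semantic search ranks chunks by how similar those vectors are. The sketch below uses cosine similarity on toy three-dimensional vectors. The choice of cosine is an assumption, since this README does not state which metric Remex uses, and `cosine` and `rank` are hypothetical names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec: list[float], chunk_vecs: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (chunk name, similarity) pairs, most similar first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings.
toy = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [1.0, 1.0, 0.0]}
for name, score in rank([1.0, 0.0, 0.0], toy):
    print(f"{name}: {score:.3f}")
```

Real embeddings have hundreds of dimensions, but the ranking step works the same way.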

Building from source

Studio requires Rust, Node.js 20+, and the Tauri prerequisites for Windows.

# Python CLI
pip install -e ".[dev]"
pytest

# Studio (dev server with hot-reload)
cd studio
npm install
npm run tauri dev

# Studio (production build)
npm run tauri build

See `studio/README.md` for the full build guide.

Changelog · Contributing · Licensing · GitHub

Python CLI: Apache-2.0 · Studio (v1.3.0+): FSL-1.1-MIT. See `LICENSES.md`.
