Remex

Unleash the power of your files with local AI.



Remex turns any folder of documents — PDFs, Word files, notes, spreadsheets, code — into a private, searchable knowledge base. Ask questions in plain language and get answers grounded in your own files, with sources cited.

Everything runs on your machine. No cloud account. No API key required for search. Bring your own AI provider (Anthropic, OpenAI, or a local Ollama instance) only when you want synthesised answers.



Remex Studio

Native desktop app for Windows. No terminal required.

⬇ Download the latest release

⚠️ Windows SmartScreen warning: Windows may display a "Windows protected your PC" warning when you download or install Remex Studio. This happens because the app is not yet code-signed with a paid certificate. The software is safe and fully open source, and you are welcome to audit the source code in this repository. To proceed, click "More info", then "Run anyway".


What you can do

🔍 Semantic search: Vector similarity search across one or more collections simultaneously
🤖 AI Answer: Ask a question, get a synthesised answer with cited sources (Anthropic · OpenAI · Ollama)
📄 12 file formats: .pdf .docx .md .txt .csv .json .jsonl, plus .html .pptx .xlsx .epub .odt (optional package)
🗄 SQLite ingest: Embed rows from any table alongside your files
♻️ Incremental ingest: SHA-256 hash check; only changed files are re-processed
🎯 Source filter: Narrow results to one or more documents before searching or asking AI
🔎 Chunk viewer: Expand any result to read the full chunk, navigate with keyboard arrows
📦 Collections manager: Rename, describe, purge, bulk-delete sources, one-click re-ingest
📤 Export: JSON · CSV · Markdown · BibTeX · RIS · CSL-JSON · Obsidian vault
👁️ Watch folders: Re-ingest automatically when files change inside chosen directories
🔬 All embedding models: MiniLM, bge-base, bge-large, multilingual, nomic-embed long-context, custom HuggingFace/FastEmbed names
🌙 Themes: Light, dark, auto (follows OS), plus sixteen accent colours
🔎 Searchable query history: Filter past queries by substring
⌨️ Keyboard-driven: Press ? anywhere in Studio for the full shortcuts reference
⚙️ Optional packages: Install extra file formats, AI integrations, or sentence chunking from Settings → General at any time
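
The incremental ingest entry in the list above relies on a SHA-256 hash check. A minimal sketch of that idea follows, using only the Python standard library. The function names and the `known_hashes` mapping are hypothetical and chosen for illustration; the README does not describe how Remex stores or compares hashes internally.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 64 KiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def changed_files(folder: Path, known_hashes: dict[str, str]) -> list[Path]:
    """List files whose current hash differs from the recorded one.

    `known_hashes` (hypothetical) maps a path string to the digest stored
    at the previous ingest. New files are absent from it and count as changed.
    """
    changed = []
    for path in sorted(folder.rglob("*")):
        if path.is_file() and known_hashes.get(str(path)) != file_sha256(path):
            changed.append(path)
    return changed
```

Only the paths returned by `changed_files` would need to be re-chunked and re-embedded; everything else can be skipped.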

Remex is free and open-source. Every feature ships in the box — no tiers, no license keys, no payment required.



Python CLI & Library

pip install remex-cli                    # core — ingest + query (7 formats)
pip install "remex-cli[formats]"         # + .pptx .xlsx .epub .html .odt
pip install "remex-cli[ai]"              # + Anthropic & OpenAI embeddings / generation
pip install "remex-cli[sentence]"        # + sentence-aware chunking (NLTK)
pip install "remex-cli[api]"             # + FastAPI sidecar (used by Studio)
pip install "remex-cli[all]"             # everything above

Quick start

# Scaffold a project
remex init

# Ingest a folder of documents
remex ingest ./docs

# Semantic search
remex query "how does authentication work?"

# AI-synthesised answer (requires ANTHROPIC_API_KEY, OPENAI_API_KEY, or a running Ollama)
remex query "how does authentication work?" --ai

Command reference

Command                     Description
remex init [path]           Scaffold docs/, remex.toml, and .gitignore entries
remex ingest <dir>          Ingest files from a directory into a collection
remex ingest-sqlite <db>    Ingest rows from a SQLite table
remex query <text>          Semantic search; add --ai for an AI-synthesised answer
remex sources               List all ingested source paths in a collection
remex stats                 Show chunk and source counts
remex delete-source <path>  Remove all chunks for a specific source
remex purge                 Remove chunks whose source file no longer exists on disk
remex reset                 Wipe an entire collection
remex list-collections      List all collections in a database
remex serve                 Start the FastAPI sidecar on localhost:8000
remex <command> --help    # full option reference for any command

Use as a library

from remex import ingest, query

# Ingest a folder
result = ingest("./docs", collection_name="my-kb")
print(f"{result.chunks_stored} chunks stored")

# Search
results = query("how does auth work?", collection_name="my-kb")
for r in results:
    print(f"[{r.score:.3f}] {r.source}: {r.text[:120]}")
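
Results can be post-processed like any Python objects. The sketch below keeps only the best-scoring chunk per source file. It assumes each result exposes the `score`, `source`, and `text` attributes used above, and that a higher score means a closer match, which the README does not state. `Hit` is a hypothetical stand-in so the example runs without Remex installed.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for a Remex result (score, source, text)."""
    score: float
    source: str
    text: str


def best_per_source(results, min_score=0.0):
    """Keep the highest-scoring hit for each source at or above min_score."""
    best = {}
    for r in results:
        if r.score < min_score:
            continue
        if r.source not in best or r.score > best[r.source].score:
            best[r.source] = r
    # Highest score first; assumes a larger score means a closer match.
    return sorted(best.values(), key=lambda r: r.score, reverse=True)
```

In real use, the list returned by `query(...)` would be passed in place of the `Hit` objects.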


Configuration

Drop a remex.toml in your project root (or run remex init to generate one):

[remex]
db              = "./remex_db"
collection      = "my-kb"
embedding_model = "all-MiniLM-L6-v2"

# chunk_size     = 768          # characters per chunk (512-1024 works well)
# overlap        = 150          # ~20% overlap preserves context at boundaries
# min_chunk_size = 50           # discard chunks shorter than this
# chunking       = "recursive"  # "recursive" (default) | "sentence" | "word"
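
To show how `chunk_size`, `overlap`, and `min_chunk_size` interact, here is a simplified sliding-window chunker. It is an illustration only: it is not Remex's "recursive", "sentence", or "word" strategy, and the function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 768, overlap: int = 150,
               min_chunk_size: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Simplified illustration only: a plain sliding window, not the
    chunking strategies named in remex.toml.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # each window starts this far after the last
    chunks = []
    start = 0
    while start < len(text):
        piece = text[start:start + chunk_size]
        if len(piece) >= min_chunk_size:  # drop fragments that are too short
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks
```

With the values shown, consecutive chunks share their last and first 150 characters, so a sentence cut at one boundary still appears whole in a neighbouring chunk.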

CLI flags always override remex.toml values.



Supported embedding models

Preset          Model                                  Size    Notes
Light           all-MiniLM-L6-v2                       22 MB   Default; fast, good accuracy
Balanced        intfloat/e5-base-v2                    438 MB  Better retrieval quality
Multilingual    paraphrase-multilingual-MiniLM-L12-v2  470 MB  50+ languages
Large (Pro)     BAAI/bge-large-en-v1.5                 1.3 GB  Best English accuracy
E5 Large (Pro)  intfloat/e5-large-v2                   1.3 GB  Strong retrieval benchmark
Long ctx (Pro)  nomic-ai/nomic-embed-text-v1.5         547 MB  8,192-token context window

Any model from SBERT, HuggingFace sentence-similarity, or Ollama can be used by typing the model name directly.
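
Whichever model is chosen, semantic search comes down to comparing a query vector with stored chunk vectors. The sketch below assumes cosine similarity as the metric, which this README does not confirm, and uses toy vectors instead of real embeddings. The names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs: dict[str, list[float]], k: int = 3):
    """Return the k (chunk_id, similarity) pairs most similar to query_vec."""
    scored = [(cid, cosine_similarity(query_vec, vec))
              for cid, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

A larger model changes the vectors being compared, not the comparison step itself.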



Building from source

Studio requires Rust, Node.js 20+, and the Tauri prerequisites for Windows.

# Python CLI
pip install -e ".[dev]"
pytest

# Studio (dev server with hot-reload)
cd studio
npm install
npm run tauri dev

# Studio (production build)
npm run tauri build

See studio/README.md for the full build guide.



Changelog · Contributing · Licensing · GitHub

Python CLI: Apache-2.0 · Studio (v1.3.0+): FSL-1.1-MIT — see LICENSES.md
