Zero-trust Document Processing by LLMs
ArxCore is a lightweight tool that lets you query your documents in natural language. Its key feature is zero-trust processing: all document processing happens locally on a single CPU, so your sensitive data never leaves your machine.
Upload documents, create searchable memories, and interact with them using AI-powered chat. Supports both local LLMs (via Ollama) and cloud providers.
- ArxCore - Ask your Archives
- Zero-trust processing: All operations run locally on a single CPU
- Document support: Multiple formats (PDF, DOCX, TXT, MD, Python, JavaScript, JSON, YAML, CSV, HTML, XML, logs)
- Memvid storage: Uses Memvid for flexible, file-based local storage, making data easier to share than with traditional vector databases
- Vector search: Create searchable memories from documents
- AI chat: Natural language queries about your documents
- Local & cloud LLMs: Ollama, OpenAI, Google, Anthropic, and custom providers
- Web interface: Modern UI for document management and chat
- REST API: Full programmatic access
- Multi-user sessions: Team collaboration support
| Feature | ArxCore (Memvid) | Vector DBs | Traditional DBs |
|---|---|---|---|
| Storage Efficiency | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐ |
| Setup Complexity | Simple | Complex | Complex |
| Semantic Search | ✅ | ✅ | ❌ |
| Offline Usage | ✅ | ❌ | ✅ |
| Zero-Trust Processing | ✅ | ❌ | ❌ |
| Portability | File-based | Server-based | Server-based |
| Data Sharing | Single files | Database dumps | Database dumps |
| Dependencies | Python + Node.js | Multiple services | Database server |
| Scalability | Millions of chunks | Billions of vectors | Billions of records |
| Cost | Free | $$$$ | $$$ |
- Python 3.8+
- Node.js (for tooling)
- Ollama with the `nomic-embed-text` model (for local embeddings)
# Install Python dependencies
pip install -r requirements.txt
# Install Node.js dependencies (for testing and tooling)
npm install

# Install Ollama (https://ollama.ai/)
curl -fsSL https://ollama.ai/install.sh | sh
# Pull model for memory embedding
ollama pull nomic-embed-text
# Pull model for chatbot
ollama pull deepseek-r1:7b

# Create virtual environment
python -m venv venv
# Activate it
source venv/bin/activate # macOS/Linux
# venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt

# Start web service
npm run serve
# Open web UI with all features
http://localhost:5050
# Alternatively, you can use the shell interface
# Generate memory from documents
npm run generate
# Start interactive chat
npm run chat

- `npm run serve` - Start web service
- `npm run generate` - Create memories from documents
- `npm run chat` - Interactive chat with your memories
- Use the option `npm run <command> -- --help` to get more info
- `npm test` - Run test suite
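If the steps above succeeded, you can verify that local embedding works before generating any memories by calling Ollama's REST API directly (it listens on http://localhost:11434 by default). A minimal sanity check using only the Python standard library:

```python
import json
import urllib.request

# Ask the local Ollama server to embed a short text with nomic-embed-text.
payload = json.dumps({
    "model": "nomic-embed-text",
    "prompt": "ArxCore keeps documents on the local machine.",
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:11434/api/embeddings",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    embedding = json.load(resp)["embedding"]

# nomic-embed-text produces a 768-dimensional vector.
print(f"Embedding dimensions: {len(embedding)}")
```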
For maximum privacy, use local LLMs:
# 1. Install Ollama (https://ollama.ai/)
curl -fsSL https://ollama.ai/install.sh | sh
# 2. Pull embedding model (for memory generation)
ollama pull nomic-embed-text
# 3. Pull chat model
# You can experiment with model size/quantization
# For systems with 16GB+ RAM:
ollama pull gemma3:4b
# or
ollama pull deepseek-r1:7b
# For systems with lower memory (8-16GB):
ollama pull gemma2:2b
# or
ollama pull phi3:mini
# 4. Start Ollama server
ollama serve
# 5. Generate memory from your files
npm run generate -- --help
# 6. Chat in command line
npm run chat -- --help
# or open web UI
http://localhost:5050

Memory Requirements:
- `gemma3:9b` - Requires 24GB+ RAM systems
- `gemma3:4b`, `deepseek-r1:7b` - Suitable for 16-32GB RAM systems
- `gemma2:2b`, `phi3:mini` - Suitable for 8-16GB RAM systems
- See the Ollama model library for more options
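Before pointing ArxCore at a chat model, you can confirm that the one you pulled fits in RAM and responds by sending a one-off, non-streaming request to Ollama's generate endpoint:

```python
import json
import urllib.request

# Send a single non-streaming prompt to the local Ollama server.
payload = json.dumps({
    "model": "deepseek-r1:7b",  # swap in whichever chat model you pulled
    "prompt": "Reply with the single word: ready",
    "stream": False,
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```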
ArxCore provides a REST API documented with OpenAPI 3.0.1:
- Interactive docs: Start the server and visit http://localhost:5050/apidocs
- OpenAPI specs: Available in `docs/openapi.json` and `docs/openapi.yaml` (a minimal client sketch follows below)
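The snippet below only illustrates programmatic access; the `/api/chat` route and the payload shape are assumptions made for the sketch, so consult `docs/openapi.json` or the interactive docs for ArxCore's actual endpoints:

```python
import json
import urllib.request

# Hypothetical route and payload shape, for illustration only --
# check docs/openapi.json for the real ArxCore API.
payload = json.dumps({
    "question": "What does the Q3 report say about revenue?",
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:5050/api/chat",  # hypothetical endpoint
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))
```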
ArxCore can serve multiple users concurrently, but it is designed for small-team collaboration rather than high-scale concurrent access. Its limitations are:
- No file locking during memory creation (a client-side workaround is sketched after this list)
- In-memory session storage (not persistent)
- File system based (not database-backed)
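If several processes or scheduled jobs drive the same installation, one client-side workaround for the missing file locking is to serialize memory creation yourself with an advisory lock. A minimal sketch (POSIX-only, since it uses fcntl; the lock file path is arbitrary):

```python
import fcntl
import subprocess

# Advisory lock so only one process runs memory generation at a time.
# ArxCore itself does not provide this; fcntl works on Linux/macOS only.
with open("/tmp/arxcore-generate.lock", "w") as lock:
    fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until the lock is free
    try:
        subprocess.run(["npm", "run", "generate"], check=True)
    finally:
        fcntl.flock(lock, fcntl.LOCK_UN)
```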
ModuleNotFoundError
# Ensure virtual environment is activated
source venv/bin/activate
which python  # Should show the venv path

PDF Processing Issues

pip install PyPDF2
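After installing PyPDF2, a quick standalone check confirms that text extraction works (replace the file name with one of your own documents):

```python
from PyPDF2 import PdfReader

# Open a PDF and print the extracted text of the first page.
reader = PdfReader("example.pdf")  # replace with one of your documents
print(f"Pages: {len(reader.pages)}")
print(reader.pages[0].extract_text())
```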
LLM API Keys

ArxCore supports commercial LLM providers including OpenAI, Google, and Anthropic, all of which require API credentials. Commercial provider support has been tested primarily with Anthropic; other providers may need additional configuration.
# Set environment variables for commercial providers
export OPENAI_API_KEY="your-key"
export GOOGLE_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"

Large Document Processing

Use smaller chunk sizes for very large documents via the API or configuration files.
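The exact chunk-size parameter names are defined by ArxCore's API and config schema and are not shown here; conceptually, a smaller chunk size just splits a document into more, shorter pieces before embedding, as in this generic sketch:

```python
from typing import List

def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> List[str]:
    """Split text into overlapping chunks of at most chunk_size characters."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Halving chunk_size roughly doubles the number of chunks to embed,
# keeping each one well inside the embedding model's context window.
doc = "lorem ipsum " * 10_000
print(len(chunk_text(doc, chunk_size=512)), len(chunk_text(doc, chunk_size=256)))
```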
- Memvid Project: https://github.com/Olow304/memvid
- ArxCore Repository: https://github.com/qool/arxcore
ArxCore is built on top of Memvid, an innovative data storage technology that enables efficient, file-based vector storage without traditional database dependencies. Memvid's unique approach allows for:
- Portable data storage - Memories stored as simple files that can be easily shared and moved
- Zero-dependency architecture - No need for complex vector database installations
- Efficient retrieval - Fast semantic search capabilities with minimal overhead
Special thanks to the Memvid project for providing the foundational technology that makes ArxCore's zero-trust document processing possible.
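For a sense of what this storage layer looks like, here is a sketch adapted from Memvid's own published examples; the class and method names are assumptions based on the upstream README, so verify them against the Memvid repository before relying on this:

```python
# Assumed API, based on the upstream Memvid README -- verify against
# https://github.com/Olow304/memvid before use.
from memvid import MemvidEncoder, MemvidRetriever

# Encode text chunks into one portable memory file plus an index file.
encoder = MemvidEncoder()
encoder.add_text("ArxCore stores document memories locally.", chunk_size=512)
encoder.build_video("memory.mp4", "memory_index.json")

# Semantic search runs directly against the two files; no server needed.
retriever = MemvidRetriever("memory.mp4", "memory_index.json")
print(retriever.search("where are memories stored?", top_k=3))
```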
Planned improvements:
- Bundle CSS and external JS libraries
- Allow creators to delete memories
- Add support for more LLM services (Hugging Face, OpenRouter, Grok, Together AI...)
- Allow inter-model chat messages
- Experiment with alternative embedding models
MIT License - see LICENSE file for details.
