⚠️ Development Status: Shu is in active development. Themainbranch may not be stable at any given commit. Use tagged releases for the most stable versions. A formal release process is coming soon.
Shu is an extensible AI knowledge platform for individuals, teams, and organizations. It combines a plugin ecosystem, knowledge base and retrieval, conversational interfaces, and background automation into a single platform. This repository contains the core backend API and React-based admin console. Shu currently supports multi-source retrieval-augmented generation (RAG) via its knowledge base and plugin feeds, but RAG is only one part of the broader system.
See our website for more information: https://www.openshu.ai
- Extensible Plugin System: Python plugins with manifest-based discovery and host capabilities
- Plugin Feeds: Scheduled background ingestion from external sources (Gmail, Google Drive, etc.)
- Connected Accounts: OAuth-based provider identity for secure external service access
- Host Capabilities: Plugins access KB ingestion, secrets, HTTP, caching via standardized APIs
- Multi-Source Ingestion: Documents from plugins, file uploads, or direct API
- Vector Similarity Search: Semantic search using sentence-transformers
- Hybrid Search: Combination of semantic and keyword search
- Document Profiling: LLM-powered document and chunk profiling for enhanced retrieval
- Multiple Knowledge Bases: Multi-tenant support with isolated knowledge bases
- Streaming Chat: Real-time LLM responses with server-sent events
- RAG Integration: Automatic context retrieval from knowledge bases
- Multi-Provider LLM Support: OpenAI, Anthropic, Google Gemini, Azure, Ollama
- Conversation Memory: Persistent conversation history with cross-session context
- Configurable Workflows: Compose plugins, KB queries, and LLM synthesis into reusable experiences
- Scheduled Execution: Cron-based triggers for automated workflows
- Step-Based Processing: Multi-step workflows with data flow between steps
- Dual Authentication: Google OAuth + password authentication
- Role-Based Access Control: Admin, Power User, Regular User, Read Only roles
- JWT Token Management: Secure access and refresh token handling
- Provider Identity: Secure OAuth token storage for external service connections
- FastAPI Backend: High-performance async API with OpenAPI documentation
- Background Scheduler: In-process job scheduling for plugin feeds and experiences
- Docker & Kubernetes Ready: Complete containerization with health checks
- Database Migrations: Alembic-based schema management
graph TD
subgraph "External Services"
A[Gmail]
B[Google Drive]
C[Calendar]
D[Other APIs]
end
subgraph "Shu Backend"
subgraph "Plugin Layer"
E[Plugin Registry]
F[Plugin Executor]
G[Feeds Scheduler]
end
subgraph "Core Services"
H[Chat Service]
I[Query Service]
J[Experience Engine]
K[Knowledge Base Service]
end
L[(PostgreSQL + pgvector)]
M[(Redis - optional)]
E --> F
G --> F
F --> A
F --> B
F --> C
F --> D
F --> K
H --> I
H --> L
I --> L
J --> F
J --> I
K --> L
H -.-> M
K -.-> M
end
subgraph "Clients"
N[React Frontend]
O[API Consumers]
end
N --> H
N --> J
N --> K
O --> H
O --> I
- PostgreSQL with pgvector extension
- Redis (optional - enables distributed caching; in-memory fallback for single-node)
- Docker and Docker Compose (recommended)
- Python 3.11+ (for local development)
- Node.js 16+ (for frontend development)
Redis is optional but recommended for production deployments. When available, Shu uses Redis for distributed caching and job queues. Without Redis, Shu falls back to in-memory implementations suitable for single-node deployments.
# macOS
brew install redis && brew services start redis
# Docker
docker run -d --name shu-redis -p 6379:6379 redis:alpine
# Linux (Ubuntu/Debian)
sudo apt update && sudo apt install redis-server
sudo systemctl start redis-server
# Verify
redis-cli ping # Should return "PONG"-
Create or update your
.envat the repository root:# If an example file is present cp .env.example .env # Then edit .env to set SHU_* and provider keys
The docker-compose file at
deployment/compose/docker-compose.ymlwill load this.envautomatically. -
Start the full stack (Postgres + Redis + API + frontend):
Using the Makefile:
make up-full
Or directly with Docker Compose:
docker compose -f deployment/compose/docker-compose.yml --profile frontend up -d
This starts:
shu-postgres(pgvector-enabled Postgres with a persistent volume)shu-db-migrate(one-off job that ensures pgvector and runs Alembic migrations)redisshu-apishu-frontend(React admin panel on port 3000)
-
Backend-only stack (API + Postgres + Redis, no frontend):
make up # or docker compose -f deployment/compose/docker-compose.yml up -dTo run the dev API with auto-reload on port 8000, use
make up-dev(backend only) ormake up-full-dev. -
Verify the system:
# Backend health curl http://localhost:8000/api/v1/health/liveness # Frontend (if started) open http://localhost:3000
-
Stop the stack:
make down # stop containers (docker compose down --remove-orphans) # Optional: also remove volumes (destructive to local DB data) docker compose -f deployment/compose/docker-compose.yml down -v --remove-orphans
- Install backend dependencies
-
(macOS/Linux):
python -m venv venv source venv/bin/activate pip install -r requirements.txt -
(Windows):
python -m venv venv venv\Scripts\activate pip install -r requirements.txt
-
Create and set up the PostgreSQL database:
First, create the database:
createdb shu
Then run the database setup script:
# Set your database URL for the setup script. database.py accepts either # `postgresql://` or `postgresql+asyncpg://` here and normalizes it for # psycopg2/Alembic. export SHU_DATABASE_URL="postgresql+asyncpg://user:password@localhost:5432/shu" # Run full setup (init-db.sql + migrations + requirements check) python backend/scripts/database.py setup # Or provide database URL directly python backend/scripts/database.py setup --database-url postgresql+asyncpg://user:password@localhost:5432/shu
This script will:
- Execute
init-db.sqlto create required PostgreSQL extensions (pgvector, uuid-ossp, etc.) - Configure database settings for optimal performance
- Run Alembic migrations to create all tables
- Verify the setup meets all requirements
Note:
backend/scripts/database.pyaccepts bothpostgresql://andpostgresql+asyncpg://URLs for setup, but when running the Shu API itself you should keepSHU_DATABASE_URLusing the asyncpostgresql+asyncpg://scheme as shown in the next step.Other useful database commands:
python backend/scripts/database.py migrate # Run migrations only python backend/scripts/database.py check # Check status and requirements python backend/scripts/database.py reset --force # Reset database (destructive)
- Execute
-
Set environment variables:
export SHU_DATABASE_URL=postgresql+asyncpg://username:password@localhost:5432/shu export PYTHONPATH=$(pwd)/backend/src # Optional: export SHU_REDIS_URL=redis://localhost:6379 # For distributed caching
-
Start the backend server:
-
Option 1 — Dev helper script (recommended):
python backend/scripts/run_dev.py
-
Option 2 — Run Uvicorn directly from repo root:
uvicorn shu.main:app --app-dir backend/src --reload --host 0.0.0.0 --port 8000
-
Start the frontend (in a new terminal):
cd frontend npm install npm start
Once running, access the generated API documentation:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- OpenAPI JSON: http://localhost:8000/openapi.json
/api/v1/
├── auth/ # Authentication (login, tokens, user management)
├── health/ # Health checks and monitoring
├── chat/ # Chat conversations with LLM + RAG
├── knowledge-bases/ # Knowledge base management (CRUD, documents)
├── query/ # Direct search and retrieval operations
├── plugins/ # Plugin management (registry, feeds, execution)
├── experiences/ # Workflow automation (CRUD and execution)
├── llm/ # LLM provider configuration
└── model-configurations/ # Model configuration management
The Shu React Admin Panel provides a web interface for managing the system:
- Dashboard: System overview with health monitoring and quick statistics
- Knowledge Base Management: Create, edit, and delete knowledge bases
- Experience Management: Create, configure, and execute workflow experiences
- Plugin Feeds: Configure scheduled feeds, trigger runs, and view recent executions (Admin → Feeds)
- Query Tester: Interactive testing of vector similarity and hybrid search
- Health Monitor: System health monitoring
- Development: http://localhost:3000
- Production: Configured via environment variables
For detailed frontend setup and usage, see frontend/README.md.
Shu uses environment variables for configuration. Key settings include:
# Database
SHU_DATABASE_URL=postgresql+asyncpg://user:password@host:5432/database
# Redis (Optional - enables distributed caching for multi-node deployments)
# SHU_REDIS_URL=redis://localhost:6379
# SHU_REDIS_CONNECTION_TIMEOUT=5
# SHU_REDIS_SOCKET_TIMEOUT=5
# API
SHU_API_HOST=0.0.0.0
SHU_API_PORT=8000
SHU_ENVIRONMENT=production
# Google Drive Integration
GOOGLE_SERVICE_ACCOUNT_JSON=/path/to/service-account.json
# Embedding Configuration
SHU_EMBEDDING_MODEL=Snowflake/snowflake-arctic-embed-l-v2.0
SHU_EMBEDDING_DTYPE=float32
# LLM Integration (Required for chat features)
SHU_LLM_ENCRYPTION_KEY=your_fernet_key_here
# OAuth Integration (Required for Gmail/Google Drive)
SHU_OAUTH_ENCRYPTION_KEY=your_fernet_key_hereShu requires encryption keys for secure storage of API keys and OAuth tokens:
# Generate LLM encryption key
python -c "
from cryptography.fernet import Fernet
key = Fernet.generate_key()
print('SHU_LLM_ENCRYPTION_KEY=' + key.decode())
"
# Generate OAuth encryption key (separate from LLM key)
python -c "
from cryptography.fernet import Fernet
key = Fernet.generate_key()
print('SHU_OAUTH_ENCRYPTION_KEY=' + key.decode())
"Add both keys to your .env file or environment variables.
# React Admin Panel
# Optional: if API host differs from the frontend host; else same-origin is used
VITE_API_BASE_URL=http://localhost:8000
# Optional: enable verbose console logging
VITE_DEBUG=true
# Docker dev only: server-side proxy target (NOT exposed to browser)
# DEV_SERVER_API_PROXY_TARGET=http://shu-api-dev:8000Docker Deployment: For Docker networking (dev and production), see DOCKER_FRONTEND_NETWORKING.md.
For complete configuration options, see CONFIGURATION.md.
Shu supports multiple LLM providers for chat and completion features. Configure providers through the admin API:
OpenAI
curl -X POST http://localhost:8000/api/v1/llm/providers \
-H "Content-Type: application/json" \
-d '{
"name": "OpenAI Production",
"provider_type": "openai",
"api_endpoint": "https://api.openai.com/v1",
"api_key": "sk-your_key_here", # pragma: allowlist secret
"supports_streaming": true,
"rate_limit_rpm": 60
}'Anthropic Claude
curl -X POST http://localhost:8000/api/v1/llm/providers \
-H "Content-Type: application/json" \
-d '{
"name": "Anthropic Claude",
"provider_type": "anthropic",
"api_endpoint": "https://api.anthropic.com/v1",
"api_key": "sk-ant-your_key_here" # pragma: allowlist secret
}'Local Ollama
curl -X POST http://localhost:8000/api/v1/llm/providers \
-H "Content-Type: application/json" \
-d '{
"name": "Local Ollama",
"provider_type": "ollama",
"api_endpoint": "http://localhost:11434/v1",
"api_key": null
}'Azure OpenAI
curl -X POST http://localhost:8000/api/v1/llm/providers \
-H "Content-Type: application/json" \
-d '{
"name": "Azure OpenAI",
"provider_type": "azure",
"api_endpoint": "https://your-resource.openai.azure.com/openai/deployments/your-deployment",
"api_key": "your_azure_key_here" # pragma: allowlist secret
}'After setting up providers, configure models for use with knowledge bases:
# Create model configuration
curl -X POST http://localhost:8000/api/v1/llm/providers/{provider_id}/models \
-H "Content-Type: application/json" \
-d '{
"model_name": "gpt-4",
"display_name": "GPT-4 (Latest)",
"context_window": 8000,
"max_output_tokens": 4000,
"supports_streaming": true
}'Shu stores OAuth tokens encrypted at rest.
If you have an existing database that contains plaintext OAuth tokens from earlier
development versions, those rows will not decrypt once SHU_OAUTH_ENCRYPTION_KEY is enabled.
Currently there is no bundled bulk migration script. The recommended approach is to re-connect affected provider accounts so tokens are re-stored encrypted.
See the testing guide for current commands and scope:
- docs/policies/TESTING.md
Typical local runs:
# Backend integration tests
cd backend/src && python -m tests.integ.run_all_integration_tests
# Backend unit tests
python -m pytest backend/src/tests/unit
# Frontend unit tests
cd frontend && npm test -- --watchAll=false# Health checks
curl http://localhost:8000/api/v1/health/liveness
# Knowledge base operations
curl http://localhost:8000/api/v1/knowledge-bases
# Query operations
curl -X POST http://localhost:8000/api/v1/query/{kb_id}/search \
-H "Content-Type: application/json" \
-d '{"query": "test query", "query_type": "hybrid"}'For detailed testing procedures, see TESTING.md.
import requests
# Query Shu API
response = requests.post(
"http://localhost:8000/api/v1/query/kb_id/search",
json={
"query": "your question",
"query_type": "hybrid",
"limit": 5
}
)
results = response.json()backend/
├── src/shu/ # Backend application package
│ ├── main.py # FastAPI application entrypoint
│ ├── api/ # API endpoints
│ ├── core/ # Core utilities (config, database, logging)
│ ├── models/ # SQLAlchemy database models
│ ├── schemas/ # Pydantic request/response schemas
│ ├── services/ # Business logic layer
│ └── ... # Other backend modules
frontend/
├── src/
│ ├── components/ # React components
│ ├── services/ # API services
│ ├── App.js # Main application
│ └── index.js # Entry point
├── public/ # Static assets
└── package.json # Dependencies
- Make changes to source code
- Run tests:
cd backend/src && python -m tests.integ.run_all_integration_tests - Update database schema:
python -m alembic -c backend/alembic.ini revision --autogenerate -m "description" - Apply migrations:
python -m alembic -c backend/alembic.ini upgrade head - Frontend development:
cd frontend && npm start
For comprehensive deployment documentation including worker modes, scaling patterns, and production configurations, see DEPLOYMENT_GUIDE.md.
# Backend only (API + Postgres + Redis, inline workers)
docker compose -f deployment/compose/docker-compose.yml up -d
# Full stack (API + Postgres + Redis + frontend)
docker compose -f deployment/compose/docker-compose.yml --profile frontend up -d
# Add dedicated workers (production pattern)
docker compose -f deployment/compose/docker-compose.yml --profile worker up -d
# Scale the API service
docker compose -f deployment/compose/docker-compose.yml up -d --scale shu-api=3Using Makefile shortcuts:
make up # Backend only
make up-full # Full stack with frontend
make up-worker # Add dedicated workers
make down # Stop all servicesShu is designed for Kubernetes deployment with:
- Health checks for liveness and readiness probes
- Configurable scaling and resource limits
- Independent scaling of API servers and worker pools
- Monitoring and logging integration
See DEPLOYMENT_GUIDE.md for Kubernetes manifests and scaling strategies.
Shu is dual-licensed:
- Community: under the Shu Community License for personal, educational, and non-commercial use.
- Commercial: under the Shu Commercial License for any use in connection with a business or revenue-generating activity.
For commercial licensing terms and contact information, see LICENSE.COMMERCIAL.md.
By contributing to Shu, you agree to the terms of the Contributor License Agreement (CLA).
- Follow Development Standards: See DEVELOPMENT_STANDARDS.md
- Set Up Linting: Run
make setup-hooksto install pre-commit hooks (see LINTING_QUICKSTART.md) - Run Linters: Use
make lint-fixbefore committing to auto-fix issues - Run Tests: Ensure 100% test success rate
- Update Documentation: Keep README and API docs current
- Follow Git Workflow: Use feature branches and pull requests
Shu uses automated linting and formatting to maintain code quality:
- Python: Ruff (linter + formatter) + mypy (type checker)
- Frontend: ESLint + Prettier
- Automation: Pre-commit hooks + GitHub Actions CI
- Coverage: Automatic test coverage tracking (no enforcement)
Quick Start:
# One-time setup
make setup-hooks
# Before committing
make lint-fix
# Run tests with coverage
make test-cov- API Documentation: http://localhost:8000/docs
- Health Checks: http://localhost:8000/api/v1/health/liveness
- React Admin Panel: http://localhost:3000
- Deployment Guide: DEPLOYMENT_GUIDE.md
- Configuration Guide: CONFIGURATION.md
- Testing Guide: TESTING.md
- Linting Guide: LINTING_GUIDE.md
Open Shu Technologies, LLC. has filed patent applications related to the technology used in this software.
- For Community License Users: You are granted a patent license to use these inventions solely as part of this software for Non-Commercial Use, subject to the terms and conditions of the Shu Community License.
- For Commercial Users: If you are using this software under the Commercial License, your patent rights are governed by that agreement.
- No Other Rights: Use of this software outside the terms of the applicable license may constitute patent infringement.
"Shu" and "Open Shu" are trademarks of Open Shu Technologies, LLC.
Shu is an extensible AI operating system for people and teams, currently focused on plugins, knowledge retrieval, and workflow experimentation.