Enterprise-grade registry for managing Large Language Model (LLM) assets, pipelines, datasets, policies, and test suites with comprehensive version control, dependency tracking, and compliance management.
- Asset Management: Version-controlled storage for models, pipelines, datasets, policies, and test suites
- Dependency Tracking: Automatic dependency resolution and circular dependency detection
- Integrity Verification: SHA-256 checksum validation and provenance tracking
- Policy Enforcement: Compliance validation and policy-based governance
- Event System: Real-time event streaming via NATS for asset lifecycle events
- JWT Authentication: Secure token-based authentication with refresh tokens
- RBAC: Role-Based Access Control with permission inheritance
- Rate Limiting: Token bucket algorithm with configurable limits
- API Security: Request signing, CORS, and security headers
- OpenTelemetry: Distributed tracing with Jaeger integration
- Prometheus Metrics: Comprehensive metrics for monitoring
- Structured Logging: JSON-formatted logs with correlation IDs
- Health Checks: Liveness and readiness probes
- Redis Caching: Distributed caching for improved performance
- Connection Pooling: Optimized database connection management
- Async/Await: Non-blocking I/O throughout the stack
- Horizontal Scaling: Stateless design for easy scaling
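The dependency-tracking feature above relies on detecting circular dependencies before they are persisted. A minimal sketch of such a check with depth-first search, using `std` only (the asset IDs and graph representation are illustrative assumptions, not the registry's actual types):

```rust
use std::collections::HashMap;

/// Detect a cycle in an asset dependency graph.
/// Keys are asset IDs; values list the assets they depend on.
fn has_cycle(graph: &HashMap<&str, Vec<&str>>) -> bool {
    // 0 = unvisited (absent), 1 = on the current DFS path, 2 = fully explored
    fn visit<'a>(
        node: &'a str,
        graph: &HashMap<&'a str, Vec<&'a str>>,
        state: &mut HashMap<&'a str, u8>,
    ) -> bool {
        match state.get(node) {
            Some(1) => return true,  // back-edge: cycle found
            Some(2) => return false, // already explored, no cycle through here
            _ => {}
        }
        state.insert(node, 1);
        for &dep in graph.get(node).into_iter().flatten() {
            if visit(dep, graph, state) {
                return true;
            }
        }
        state.insert(node, 2);
        false
    }

    let mut state: HashMap<&str, u8> = HashMap::new();
    graph.keys().any(|&n| visit(n, graph, &mut state))
}

fn main() {
    let mut graph = HashMap::new();
    graph.insert("pipeline", vec!["model", "dataset"]);
    graph.insert("model", vec!["dataset"]);
    graph.insert("dataset", vec![]);
    assert!(!has_cycle(&graph));

    // Introduce a cycle: dataset -> pipeline -> ... -> dataset
    graph.insert("dataset", vec!["pipeline"]);
    assert!(has_cycle(&graph));
    println!("cycle detection ok");
}
```

A registry would run a check like this when a new dependency edge is registered, rejecting the write rather than scanning the whole graph later.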
```
┌───────────────────┐
│     REST API      │  ← JWT Auth, Rate Limiting, RBAC
├───────────────────┤
│   Service Layer   │  ← Business Logic, Policy Validation
├───────────────────┤
│    Data Layer     │  ← PostgreSQL, Redis Cache
├───────────────────┤
│   Event System    │  ← NATS Event Publishing
└───────────────────┘
```
- Language: Rust 1.75+
- Web Framework: Axum
- Database: PostgreSQL 15+
- Cache: Redis 7+
- Message Queue: NATS 2.10+
- Observability: OpenTelemetry, Prometheus, Jaeger
Install the TypeScript SDK for programmatic access to the LLM Registry:

```bash
npm install @llm-dev-ops/llm-registry-sdk
```

Usage:

```typescript
import { LLMRegistryClient } from '@llm-dev-ops/llm-registry-sdk';

const client = new LLMRegistryClient({
  baseURL: 'http://localhost:8080',
  apiToken: 'your-api-token'
});

// List models
const models = await client.listModels();

// Create a model
const model = await client.createModel({
  name: 'my-model',
  version: '1.0.0',
  provider: 'openai'
});
```

Install the command-line interface globally:
```bash
npm install -g @llm-dev-ops/llm-registry
```

Usage:

```bash
# Configure the CLI
llm-registry config --url http://localhost:8080

# List models
llm-registry models list

# Create a model
llm-registry models create --name my-model --version 1.0.0

# Upload an asset
llm-registry assets upload <model-id> ./model.safetensors \
  --name weights --version 1.0.0
```

See the CLI documentation for more commands.
Add the LLM Registry crates to your Cargo.toml:

```toml
[dependencies]
# Core domain types
llm-registry-core = "0.1.0"

# Database layer with migrations
llm-registry-db = "0.1.0"

# Business logic and service layer
llm-registry-service = "0.1.0"

# REST API layer
llm-registry-api = "0.1.0"

# Complete server binary
llm-registry-server = "0.1.0"
```

Usage:
```rust
use llm_registry_core::{Model, Asset};
use llm_registry_service::ModelService;

// Use the service layer in your application
let service = ModelService::new(db_pool);
let models = service.list_models(filters).await?;
```

Install the server binary directly from crates.io:
```bash
cargo install llm-registry-server
```

Then run:

```bash
llm-registry-server
```

Prerequisites:

- Rust 1.75 or later
- Docker and Docker Compose
- PostgreSQL 15+ (or use Docker Compose)
- Redis 7+ (or use Docker Compose)
- NATS 2.10+ (or use Docker Compose)
- Clone the repository:

```bash
git clone https://github.com/globalbusinessadvisors/llm-registry.git
cd llm-registry
```

- Start infrastructure with Docker Compose:

```bash
docker-compose up -d
```

This starts PostgreSQL, Redis, NATS, Prometheus, and Grafana.

- Run database migrations:

```bash
cargo install sqlx-cli
sqlx database create
sqlx migrate run
```

- Build the project:

```bash
cargo build --release
```

- Run the server:

```bash
cargo run --bin llm-registry-server
```

The API will be available at http://localhost:8080.
Build and run with Docker:

```bash
# Build production image
docker build -t llm-registry:latest .

# Run container
docker run -p 8080:8080 \
  -e DATABASE_URL=postgresql://user:pass@host:5432/db \
  -e REDIS_URL=redis://host:6379 \
  -e NATS_URL=nats://host:4222 \
  llm-registry:latest
```

See k8s/README.md for detailed Kubernetes deployment instructions.

```bash
# Deploy to Kubernetes
kubectl apply -f k8s/
```

All protected endpoints require a JWT token:
```bash
# Login to get JWT token
curl -X POST http://localhost:8080/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "password"}'

# Use token in subsequent requests
curl -X GET http://localhost:8080/v1/assets \
  -H "Authorization: Bearer <your-token>"
```

Asset endpoints:

- `POST /v1/assets` - Register a new asset
- `GET /v1/assets` - List assets with filtering and pagination
- `GET /v1/assets/:id` - Get asset by ID
- `PATCH /v1/assets/:id` - Update asset metadata
- `DELETE /v1/assets/:id` - Delete asset
- `GET /v1/assets/:id/dependencies` - Get dependency graph
- `GET /v1/assets/:id/dependents` - Get reverse dependencies

System endpoints:

- `GET /health` - Health check
- `GET /metrics` - Prometheus metrics
- `GET /version` - Version information

Authentication endpoints:

- `POST /v1/auth/login` - Login and get JWT token
- `POST /v1/auth/refresh` - Refresh access token
- `GET /v1/auth/me` - Get current user info
- `POST /v1/auth/logout` - Logout
- `POST /v1/auth/api-keys` - Generate API key (requires developer/admin role)
Configuration can be provided via:
- Environment variables
- Configuration files (TOML)
- Command-line arguments
```bash
# Server
SERVER_HOST=0.0.0.0
SERVER_PORT=8080

# Database
DATABASE_URL=postgresql://user:pass@localhost:5432/llm_registry

# Redis
REDIS_URL=redis://localhost:6379

# NATS
NATS_URL=nats://localhost:4222

# JWT
JWT_SECRET=your-secret-key-change-in-production
JWT_ISSUER=llm-registry
JWT_AUDIENCE=llm-registry-api
JWT_EXPIRATION_SECONDS=3600

# Rate Limiting
RATE_LIMIT_ENABLED=true
RATE_LIMIT_MAX_REQUESTS=100
RATE_LIMIT_WINDOW_SECS=60

# Logging
RUST_LOG=info
```

Project layout:

```
llm-registry/
├── crates/
│   ├── llm-registry-core/      # Core domain types
│   ├── llm-registry-db/        # Database layer
│   ├── llm-registry-service/   # Business logic
│   ├── llm-registry-api/       # REST API layer
│   └── llm-registry-server/    # Server binary
├── migrations/                 # Database migrations
├── config/                     # Configuration files
├── docker/                     # Docker files
├── k8s/                        # Kubernetes manifests
└── deployments/                # Monitoring configs
```
### Running Tests
```bash
# Run all tests
cargo test --workspace

# Run tests with coverage
cargo tarpaulin --workspace --out Html

# Run specific crate tests
cargo test -p llm-registry-core

# Format code
cargo fmt --all

# Lint code
cargo clippy --workspace -- -D warnings

# Security audit
cargo audit
```

Database migrations:

```bash
# Create a new migration
sqlx migrate add <migration_name>

# Run migrations
sqlx migrate run

# Revert last migration
sqlx migrate revert
```

Metrics are exposed at the /metrics endpoint:
```bash
# Access metrics
curl http://localhost:8080/metrics
```

Key metrics:

- `http_requests_total` - Total HTTP requests by method, path, status
- `http_request_duration_seconds` - Request duration histogram
- `db_queries_total` - Database query counts
- `cache_operations_total` - Cache operation counts
- `assets_total` - Total assets by status
Access Grafana dashboard at http://localhost:3000 (default credentials: admin/admin).
Pre-configured dashboards are available in deployments/grafana/.
Traces are sent to Jaeger at http://localhost:16686.
- User logs in with username/password and receives a JWT access token and a refresh token
- Access token is used for API requests (expires in 1 hour)
- Refresh token is used to obtain new access tokens (expires in 7 days)
- API keys can be generated for long-lived access
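The client-side decision implied by this flow (use the access token, refresh it, or re-authenticate) can be sketched as a small state check. The struct and the 1 h / 7 d lifetimes mirror the defaults described above; both are illustrative, not taken from the server code:

```rust
use std::time::{Duration, Instant};

/// Client-side view of the token pair returned by /v1/auth/login.
struct TokenPair {
    access_expires_at: Instant,
    refresh_expires_at: Instant,
}

enum Action {
    UseAccessToken,
    RefreshAccessToken,
    ReAuthenticate,
}

/// Decide what a client should do before its next API call.
fn next_action(tokens: &TokenPair, now: Instant) -> Action {
    if now < tokens.access_expires_at {
        Action::UseAccessToken
    } else if now < tokens.refresh_expires_at {
        Action::RefreshAccessToken
    } else {
        Action::ReAuthenticate
    }
}

fn main() {
    let now = Instant::now();
    let tokens = TokenPair {
        access_expires_at: now + Duration::from_secs(3600),          // 1 hour
        refresh_expires_at: now + Duration::from_secs(7 * 24 * 3600), // 7 days
    };
    // Fresh login: the access token is still valid.
    assert!(matches!(next_action(&tokens, now), Action::UseAccessToken));
    // Two hours in: access token expired, refresh token still valid.
    let later = now + Duration::from_secs(2 * 3600);
    assert!(matches!(next_action(&tokens, later), Action::RefreshAccessToken));
    println!("token lifecycle ok");
}
```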
- admin: Full access to all resources and operations
- developer: Can manage assets and generate API keys
- user: Can read and write assets
- viewer: Read-only access to assets
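With a strict hierarchy like the one above, "permission inheritance" reduces to an ordering check on roles. A sketch only: the real registry may model permissions per resource rather than as a total order:

```rust
/// Roles ordered from least to most privileged, mirroring the list above.
/// Deriving Ord on an enum orders variants by declaration order.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Role {
    Viewer,
    User,
    Developer,
    Admin,
}

/// A role may perform any action whose required role it is at or above.
fn can_act_as(actor: Role, required: Role) -> bool {
    actor >= required
}

fn main() {
    assert!(can_act_as(Role::Admin, Role::Developer)); // admins inherit developer actions
    assert!(can_act_as(Role::User, Role::Viewer));     // users can read
    assert!(!can_act_as(Role::Viewer, Role::User));    // viewers cannot write
    println!("rbac ordering ok");
}
```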
Default limits:
- 100 requests per minute per IP/user
- Configurable per endpoint
- Distributed rate limiting via Redis
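The token bucket algorithm behind these limits can be sketched in a few lines. This is an in-process illustration only; the registry's limiter is distributed via Redis, which this version does not attempt:

```rust
use std::time::{Duration, Instant};

/// Minimal token bucket: holds up to `capacity` tokens,
/// refilled continuously at `refill_per_sec`.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        TokenBucket { capacity, tokens: capacity, refill_per_sec, last_refill: Instant::now() }
    }

    /// Returns true if the request is admitted, consuming one token.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Refill proportionally to elapsed time, capped at capacity.
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    // 100 requests per 60 s window, matching the default limits above.
    let mut bucket = TokenBucket::new(100.0, 100.0 / 60.0);
    let now = Instant::now();
    // A burst of 150 simultaneous requests: only the bucket capacity is admitted.
    let admitted = (0..150).filter(|_| bucket.try_acquire(now)).count();
    assert_eq!(admitted, 100);
    // After 30 s, roughly half the bucket has refilled, so requests flow again.
    let later = now + Duration::from_secs(30);
    assert!(bucket.try_acquire(later));
    println!("token bucket ok");
}
```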
- Asset registration: ~500 req/s
- Asset retrieval: ~2000 req/s
- Cache hit rate: >90%
- P99 latency: <50ms
- Enable Redis caching for frequently accessed assets
- Use connection pooling (configured by default)
- Adjust `max_connections` based on load
- Use CDN for large asset downloads
- Enable compression for API responses
- Change default JWT secret
- Configure TLS/HTTPS
- Set up database backups
- Configure monitoring and alerting
- Set up log aggregation
- Review and adjust rate limits
- Enable RBAC and set up roles
- Configure ingress/load balancer
- Test disaster recovery
- Set up CI/CD pipeline
Horizontal scaling:

```bash
# Scale to 5 replicas
kubectl scale deployment llm-registry-server --replicas=5

# Or use HPA (already configured)
kubectl autoscale deployment llm-registry-server --min=3 --max=10 --cpu-percent=70
```

Database Connection Errors

```bash
# Check database connectivity
psql $DATABASE_URL -c "SELECT 1"
```

Redis Connection Errors

```bash
# Check Redis connectivity
redis-cli -u $REDIS_URL ping
```

High Memory Usage

- Reduce connection pool size
- Enable pagination for large queries
- Check for connection leaks

```bash
# View logs
RUST_LOG=debug cargo run

# Or in Docker
docker logs <container-id>

# Or in Kubernetes
kubectl logs -f deployment/llm-registry-server -n llm-registry
```

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
- API Reference - Complete REST API documentation with examples
- Architecture Guide - System architecture, components, and design patterns
- Security Guide - Security best practices, authentication, and compliance
- Docker Deployment - Production Docker deployment guide
- Testing Guide - Integration test suite documentation
- Contributing Guide - How to contribute to the project
- Code of Conduct - Community guidelines
- Changelog - Version history and changes
- Core: llm-registry-core
- Database: llm-registry-db
- Service: llm-registry-service
- API: llm-registry-api
- Server: llm-registry-server
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Built with:
- Axum - Web framework
- SQLx - Database toolkit
- Tower - Middleware
- OpenTelemetry - Observability
- Prometheus - Monitoring
Status: Production Ready v1.0