Skip to content

LLM-Dev-Ops/registry

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

17 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

LLM Registry

License Rust Crates.io npm SDK npm CLI

Enterprise-grade registry for managing Large Language Model (LLM) assets, pipelines, datasets, policies, and test suites with comprehensive version control, dependency tracking, and compliance management.

Features

Core Capabilities

  • Asset Management: Version-controlled storage for models, pipelines, datasets, policies, and test suites
  • Dependency Tracking: Automatic dependency resolution and circular dependency detection
  • Integrity Verification: SHA-256 checksum validation and provenance tracking
  • Policy Enforcement: Compliance validation and policy-based governance
  • Event System: Real-time event streaming via NATS for asset lifecycle events

Security & Authentication

  • JWT Authentication: Secure token-based authentication with refresh tokens
  • RBAC: Role-Based Access Control with permission inheritance
  • Rate Limiting: Token bucket algorithm with configurable limits
  • API Security: Request signing, CORS, and security headers

Observability

  • OpenTelemetry: Distributed tracing with Jaeger integration
  • Prometheus Metrics: Comprehensive metrics for monitoring
  • Structured Logging: JSON-formatted logs with correlation IDs
  • Health Checks: Liveness and readiness probes

Performance

  • Redis Caching: Distributed caching for improved performance
  • Connection Pooling: Optimized database connection management
  • Async/Await: Non-blocking I/O throughout the stack
  • Horizontal Scaling: Stateless design for easy scaling

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   REST API      β”‚ ← JWT Auth, Rate Limiting, RBAC
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚   Service Layer β”‚ ← Business Logic, Policy Validation
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚   Data Layer    β”‚ ← PostgreSQL, Redis Cache
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚   Event System  β”‚ ← NATS Event Publishing
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Technology Stack

  • Language: Rust 1.75+
  • Web Framework: Axum
  • Database: PostgreSQL 15+
  • Cache: Redis 7+
  • Message Queue: NATS 2.10+
  • Observability: OpenTelemetry, Prometheus, Jaeger

Installation

NPM Packages (TypeScript/JavaScript)

SDK (API Client)

Install the TypeScript SDK for programmatic access to the LLM Registry:

npm install @llm-dev-ops/llm-registry-sdk

Usage:

import { LLMRegistryClient } from '@llm-dev-ops/llm-registry-sdk';

const client = new LLMRegistryClient({
  baseURL: 'http://localhost:8080',
  apiToken: 'your-api-token'
});

// List models
const models = await client.listModels();

// Create a model
const model = await client.createModel({
  name: 'my-model',
  version: '1.0.0',
  provider: 'openai'
});

CLI Tool

Install the command-line interface globally:

npm install -g @llm-dev-ops/llm-registry

Usage:

# Configure the CLI
llm-registry config --url http://localhost:8080

# List models
llm-registry models list

# Create a model
llm-registry models create --name my-model --version 1.0.0

# Upload an asset
llm-registry assets upload <model-id> ./model.safetensors \
  --name weights --version 1.0.0

See the CLI documentation for more commands.

Rust Crates (crates.io)

Add the LLM Registry crates to your Cargo.toml:

[dependencies]
# Core domain types
llm-registry-core = "0.1.0"

# Database layer with migrations
llm-registry-db = "0.1.0"

# Business logic and service layer
llm-registry-service = "0.1.0"

# REST API layer
llm-registry-api = "0.1.0"

# Complete server binary
llm-registry-server = "0.1.0"

Usage:

use llm_registry_core::{Model, Asset};
use llm_registry_service::ModelService;

// Use the service layer in your application
let service = ModelService::new(db_pool);
let models = service.list_models(filters).await?;

Server Binary

Install the server binary directly from crates.io:

cargo install llm-registry-server

Then run:

llm-registry-server

Quick Start

Prerequisites

  • Rust 1.75 or later
  • Docker and Docker Compose
  • PostgreSQL 15+ (or use Docker Compose)
  • Redis 7+ (or use Docker Compose)
  • NATS 2.10+ (or use Docker Compose)

Development Setup

  1. Clone the repository
git clone https://github.com/globalbusinessadvisors/llm-registry.git
cd llm-registry
  1. Start infrastructure with Docker Compose
docker-compose up -d

This starts PostgreSQL, Redis, NATS, Prometheus, and Grafana.

  1. Run database migrations
cargo install sqlx-cli
sqlx database create
sqlx migrate run
  1. Build the project
cargo build --release
  1. Run the server
cargo run --bin llm-registry-server

The API will be available at http://localhost:8080.

Docker Deployment

Build and run with Docker:

# Build production image
docker build -t llm-registry:latest .

# Run container
docker run -p 8080:8080 \
  -e DATABASE_URL=postgresql://user:pass@host:5432/db \
  -e REDIS_URL=redis://host:6379 \
  -e NATS_URL=nats://host:4222 \
  llm-registry:latest

Kubernetes Deployment

See k8s/README.md for detailed Kubernetes deployment instructions.

# Deploy to Kubernetes
kubectl apply -f k8s/

API Documentation

Authentication

All protected endpoints require a JWT token:

# Login to get JWT token
curl -X POST http://localhost:8080/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "password"}'

# Use token in subsequent requests
curl -X GET http://localhost:8080/v1/assets \
  -H "Authorization: Bearer <your-token>"

Endpoints

Asset Management

  • POST /v1/assets - Register a new asset
  • GET /v1/assets - List assets with filtering and pagination
  • GET /v1/assets/:id - Get asset by ID
  • PATCH /v1/assets/:id - Update asset metadata
  • DELETE /v1/assets/:id - Delete asset

Dependencies

  • GET /v1/assets/:id/dependencies - Get dependency graph
  • GET /v1/assets/:id/dependents - Get reverse dependencies

Health & Metrics

  • GET /health - Health check
  • GET /metrics - Prometheus metrics
  • GET /version - Version information

Authentication

  • POST /v1/auth/login - Login and get JWT token
  • POST /v1/auth/refresh - Refresh access token
  • GET /v1/auth/me - Get current user info
  • POST /v1/auth/logout - Logout
  • POST /v1/auth/api-keys - Generate API key (requires developer/admin role)

Configuration

Configuration can be provided via:

  1. Environment variables
  2. Configuration files (TOML)
  3. Command-line arguments

Environment Variables

# Server
SERVER_HOST=0.0.0.0
SERVER_PORT=8080

# Database
DATABASE_URL=postgresql://user:pass@localhost:5432/llm_registry

# Redis
REDIS_URL=redis://localhost:6379

# NATS
NATS_URL=nats://localhost:4222

# JWT
JWT_SECRET=your-secret-key-change-in-production
JWT_ISSUER=llm-registry
JWT_AUDIENCE=llm-registry-api
JWT_EXPIRATION_SECONDS=3600

# Rate Limiting
RATE_LIMIT_ENABLED=true
RATE_LIMIT_MAX_REQUESTS=100
RATE_LIMIT_WINDOW_SECS=60

# Logging
RUST_LOG=info

Development

Project Structure

llm-registry/
β”œβ”€β”€ crates/
β”‚   β”œβ”€β”€ llm-registry-core/      # Core domain types
β”‚   β”œβ”€β”€ llm-registry-db/        # Database layer
β”‚   β”œβ”€β”€ llm-registry-service/   # Business logic
β”‚   β”œβ”€β”€ llm-registry-api/       # REST API layer
β”‚   └── llm-registry-server/    # Server binary
β”œβ”€β”€ migrations/                  # Database migrations
β”œβ”€β”€ config/                      # Configuration files
β”œβ”€β”€ docker/                      # Docker files
β”œβ”€β”€ k8s/                        # Kubernetes manifests
└── deployments/                # Monitoring configs

### Running Tests

```bash
# Run all tests
cargo test --workspace

# Run tests with coverage
cargo tarpaulin --workspace --out Html

# Run specific crate tests
cargo test -p llm-registry-core

Code Quality

# Format code
cargo fmt --all

# Lint code
cargo clippy --workspace -- -D warnings

# Security audit
cargo audit

Database Migrations

# Create a new migration
sqlx migrate add <migration_name>

# Run migrations
sqlx migrate run

# Revert last migration
sqlx migrate revert

Monitoring

Prometheus

Metrics are exposed at /metrics endpoint:

# Access metrics
curl http://localhost:8080/metrics

Key metrics:

  • http_requests_total - Total HTTP requests by method, path, status
  • http_request_duration_seconds - Request duration histogram
  • db_queries_total - Database query counts
  • cache_operations_total - Cache operation counts
  • assets_total - Total assets by status

Grafana

Access Grafana dashboard at http://localhost:3000 (default credentials: admin/admin).

Pre-configured dashboards are available in deployments/grafana/.

Tracing

Traces are sent to Jaeger at http://localhost:16686.

Security

Authentication Flow

  1. User logs in with username/password β†’ receives JWT access token and refresh token
  2. Access token is used for API requests (expires in 1 hour)
  3. Refresh token is used to obtain new access tokens (expires in 7 days)
  4. API keys can be generated for long-lived access

RBAC Roles

  • admin: Full access to all resources and operations
  • developer: Can manage assets and generate API keys
  • user: Can read and write assets
  • viewer: Read-only access to assets

Rate Limiting

Default limits:

  • 100 requests per minute per IP/user
  • Configurable per endpoint
  • Distributed rate limiting via Redis

Performance

Benchmarks

  • Asset registration: ~500 req/s
  • Asset retrieval: ~2000 req/s
  • Cache hit rate: >90%
  • P99 latency: <50ms

Optimization Tips

  1. Enable Redis caching for frequently accessed assets
  2. Use connection pooling (configured by default)
  3. Adjust max_connections based on load
  4. Use CDN for large asset downloads
  5. Enable compression for API responses

Production Deployment

Checklist

  • Change default JWT secret
  • Configure TLS/HTTPS
  • Set up database backups
  • Configure monitoring and alerting
  • Set up log aggregation
  • Review and adjust rate limits
  • Enable RBAC and set up roles
  • Configure ingress/load balancer
  • Test disaster recovery
  • Set up CI/CD pipeline

Scaling

Horizontal scaling:

# Scale to 5 replicas
kubectl scale deployment llm-registry-server --replicas=5

# Or use HPA (already configured)
kubectl autoscale deployment llm-registry-server --min=3 --max=10 --cpu-percent=70

Troubleshooting

Common Issues

Database Connection Errors

# Check database connectivity
psql $DATABASE_URL -c "SELECT 1"

Redis Connection Errors

# Check Redis connectivity
redis-cli -u $REDIS_URL ping

High Memory Usage

  • Reduce connection pool size
  • Enable pagination for large queries
  • Check for connection leaks

Logs

# View logs
RUST_LOG=debug cargo run

# Or in Docker
docker logs <container-id>

# Or in Kubernetes
kubectl logs -f deployment/llm-registry-server -n llm-registry

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Documentation

Core Documentation

Developer Resources

Package Registry

NPM Packages

Rust Crates

Support

Acknowledgments

Built with:


Status: Production Ready v1.0

About

No description, website, or topics provided.

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors