SecureCLI-Tuner Deployment & Monitoring

Minimal runnable repository demonstrating deployment, client interaction, monitoring, and security for the SecureCLI-Tuner model.

Architecture

graph LR
    C[Client] -->|port 8000| V[Validator Proxy]
    subgraph Docker Network
        V -->|port 8000| M[vLLM Server]
    end

Features

  • Deployment: Docker Compose setup for running vLLM behind a Validator Proxy.
  • Client: Python client executing benchmarks.
  • Validation: Server-side output validation that fails closed, blocking dangerous commands and shell chaining operators.
  • Monitoring/Ops: Benchmark script reporting p50/p95 latency and pass/block/error rates; /health endpoint available on the validator.
  • Security: vLLM endpoint isolated to the Docker network; only the validator proxy is exposed.
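As a rough illustration of the fail-closed validation idea (the patterns and function name here are illustrative only; the real rule set lives in the validator code):

```python
import re

# Illustrative patterns only; the repo's validator defines the actual rules.
BLOCKED_PATTERNS = [
    re.compile(r"[;&|]"),                # chaining operators: ;, &&, ||, pipes
    re.compile(r"\brm\s+-rf\b"),         # destructive recursive deletes
    re.compile(r"\bmkfs\b|\bdd\s+if="),  # disk-formatting / raw-write commands
]

def validate_output(command: str) -> bool:
    """Return True only if the generated command matches no blocked pattern.

    Fail closed: any match rejects the whole output.
    """
    return not any(p.search(command) for p in BLOCKED_PATTERNS)
```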

Quickstart

Windows GPU note: For NVIDIA GPU support with Docker, prefer running via WSL2 Ubuntu or use a Linux host.

1) Start the Server Stack

From the deploy/ directory, set environment variables and start the stack.

Linux / macOS

cd deploy
export MODEL_ID="mwill-AImission/SecureCLI-Tuner-V2"
export HF_TOKEN=""
docker compose up -d --build

Windows (PowerShell)

cd deploy
$env:MODEL_ID="mwill-AImission/SecureCLI-Tuner-V2"
$env:HF_TOKEN=""
docker compose up -d --build

Windows (CMD)

cd deploy
set MODEL_ID=mwill-AImission/SecureCLI-Tuner-V2
set HF_TOKEN=
docker compose up -d --build

2) Verify Health

curl http://localhost:8000/health

3) Run the Client

The client connects directly to the proxy on localhost:8000.

cd ..
pip install -r requirements.txt
python client/client.py --file client/test_requests.jsonl
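A minimal sketch of what a JSONL-driven client does with that file. The "prompt" field name and the OpenAI-style chat payload shape are assumptions, not taken from client/client.py; check client/test_requests.jsonl for the actual schema.

```python
import json

def load_requests(lines):
    """Parse one JSON request object per line (JSONL), skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]

def to_payload(req, model="mwill-AImission/SecureCLI-Tuner-V2"):
    """Wrap a request in an OpenAI-style chat payload for the proxy.

    The "prompt" key is a hypothetical field name for illustration.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": req["prompt"]}],
    }
```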

4) Run the Benchmark

python client/benchmark.py
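The p50/p95 figures the benchmark reports can be computed with a simple nearest-rank percentile over the collected request latencies (a sketch; client/benchmark.py may use a different interpolation method):

```python
import math

def percentile(latencies, pct):
    """Nearest-rank percentile of a non-empty list of latencies in seconds."""
    xs = sorted(latencies)
    # Nearest rank: smallest sample with at least pct% of values at or below it.
    k = max(0, math.ceil(pct / 100 * len(xs)) - 1)
    return xs[k]
```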

Documentation

Certification Scope

This repository demonstrates the following capabilities for Ready Tensor Module 2:

  • Deployment architecture: Two-service Docker Compose stack with private vLLM and public validator proxy.
  • Deterministic inference configuration: Server-enforced temperature, top_p, max_tokens, and stream settings.
  • Security boundary enforcement: Fail-closed validator with role allowlist, payload size caps, operator/pattern blocking, and markdown rejection.
  • Monitoring & observability structure: /health endpoint, structured logging, p50/p95 benchmark script.
  • Cost and capacity modeling: Documented in docs/RT_PUBLICATION_DRAFT.md.
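Server-enforced sampling means the proxy can simply overwrite any client-supplied sampling fields before forwarding to vLLM. A minimal sketch, with illustrative values (the enforced settings are defined server-side in this repo, not here):

```python
# Illustrative values only; the repo's validator proxy defines the real ones.
ENFORCED = {"temperature": 0.0, "top_p": 1.0, "max_tokens": 256, "stream": False}

def enforce_sampling(payload: dict) -> dict:
    """Return a copy of the request with sampling fields forced to server values.

    Clients cannot raise temperature or enable streaming: their values are
    discarded, while non-sampling fields pass through unchanged.
    """
    merged = dict(payload)
    merged.update(ENFORCED)
    return merged
```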

This is an internal deployment pattern. It is not a public SaaS endpoint.
