Minimal runnable repository demonstrating deployment, client interaction, monitoring, and security for the SecureCLI-Tuner model.
```mermaid
graph LR
C[Client] -->|port 8000| V[Validator Proxy]
subgraph Docker Network
V -->|port 8000| M[vLLM Server]
end
```
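A compose file realizing this topology might look roughly like the sketch below. The image tag, build context, and network name are assumptions for illustration; the actual stack lives in `deploy/`.

```yaml
services:
  vllm:
    image: vllm/vllm-openai:latest      # assumed image; see deploy/ for the real one
    environment:
      - MODEL_ID=${MODEL_ID}
      - HF_TOKEN=${HF_TOKEN}
    # no `ports:` mapping -- vLLM is reachable only on the internal network
    networks: [internal]
  validator:
    build: ./validator                  # hypothetical build context
    ports:
      - "8000:8000"                     # the only published port
    networks: [internal]
networks:
  internal:
```

The key property is that only the validator publishes a port; the vLLM service is private to the Docker network.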
- Deployment: Docker Compose setup for running vLLM behind a Validator Proxy.
- Client: Python client executing benchmarks.
- Validation: Server-side output validation to strictly block dangerous commands and chaining operators.
- Monitoring/Ops: Benchmark script reporting p50/p95 latency and pass/block/error rates; `/health` endpoint available on the validator.
- Security: vLLM endpoint isolated to the Docker network.
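The fail-closed validation idea can be sketched as below. The patterns, size cap, and function name are hypothetical; the real rules live in the validator proxy.

```python
import re

# Hypothetical blocklists illustrating the fail-closed checks; the actual
# validator's patterns live in the proxy service, not here.
CHAINING_OPERATORS = re.compile(r"(;|&&|\|\||\||`|\$\()")
DANGEROUS_PATTERNS = re.compile(r"\b(rm\s+-rf|mkfs|dd\s+if=|shutdown|reboot)\b")

def validate_output(text: str, max_bytes: int = 4096) -> bool:
    """Return True only if the model output passes every check (fail-closed)."""
    if len(text.encode("utf-8")) > max_bytes:  # payload size cap
        return False
    if "```" in text:                          # markdown rejection
        return False
    if CHAINING_OPERATORS.search(text):        # chaining-operator blocking
        return False
    if DANGEROUS_PATTERNS.search(text):        # dangerous-command blocking
        return False
    return True

print(validate_output("ls -la /var/log"))  # True: plain command passes
print(validate_output("ls; rm -rf /"))     # False: chaining + rm -rf blocked
```

Note the ordering: every check returns `False` on match, and `True` is reached only when all checks pass, so an unanticipated input fails closed rather than open.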
Windows GPU note: For NVIDIA GPU support with Docker, prefer running via WSL2 Ubuntu or use a Linux host.
From the deploy/ directory, set environment variables and start the stack.
Linux / macOS

```bash
cd deploy
export MODEL_ID="mwill-AImission/SecureCLI-Tuner-V2"
export HF_TOKEN=""
docker compose up -d --build
```

Windows (PowerShell)
```powershell
cd deploy
$env:MODEL_ID="mwill-AImission/SecureCLI-Tuner-V2"
$env:HF_TOKEN=""
docker compose up -d --build
```

Windows (CMD)
```bat
cd deploy
set MODEL_ID=mwill-AImission/SecureCLI-Tuner-V2
set HF_TOKEN=
docker compose up -d --build
```

Verify the stack is up:

```bash
curl http://localhost:8000/health
```

The client connects directly to the proxy on localhost:8000.
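A minimal request against the proxy could be sketched as follows. The `/v1/chat/completions` route and payload fields assume an OpenAI-compatible API (which vLLM commonly exposes); `client/client.py` is the authoritative reference for how requests are actually built.

```python
import json
import urllib.request

PROXY_URL = "http://localhost:8000/v1/chat/completions"  # assumed route

def build_payload(prompt: str) -> dict:
    # Sampling parameters (temperature, top_p, max_tokens, stream) are
    # enforced server-side; the client only supplies the prompt.
    return {
        "model": "mwill-AImission/SecureCLI-Tuner-V2",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def send(prompt: str) -> dict:
    """POST the payload to the proxy and return the decoded JSON response."""
    req = urllib.request.Request(
        PROXY_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

payload = build_payload("list files in /tmp")
print(payload["messages"][0]["role"])  # user
```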
```bash
cd ..
pip install -r requirements.txt
python client/client.py --file client/test_requests.jsonl
python client/benchmark.py
```

This repository demonstrates the following capabilities for Ready Tensor Module 2:
- Deployment architecture: Two-service Docker Compose stack with private vLLM and public validator proxy.
- Deterministic inference configuration: Server-enforced temperature, top_p, max_tokens, and stream settings.
- Security boundary enforcement: Fail-closed validator with role allowlist, payload size caps, operator/pattern blocking, and markdown rejection.
- Monitoring & observability structure: `/health` endpoint, structured logging, p50/p95 benchmark script.
- Cost and capacity modeling: Documented in `docs/RT_PUBLICATION_DRAFT.md`.
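The p50/p95 aggregation the benchmark reports can be sketched as below. This is a hypothetical nearest-rank helper, not the repository's `benchmark.py`.

```python
def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile over a list of latency samples (seconds)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank index, clamped to the valid range.
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

# Illustrative latencies: p95 surfaces the slow tail that p50 hides.
latencies = [0.12, 0.15, 0.11, 0.30, 0.14, 0.90, 0.13, 0.16, 0.12, 0.14]
print(f"p50={percentile(latencies, 50):.2f}s  p95={percentile(latencies, 95):.2f}s")
```

Reporting both quantiles matters operationally: the median stays flat while the p95 exposes tail latency from cold starts or queueing at the vLLM server.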
This is an internal deployment pattern. It is not a public SaaS endpoint.