Layer: model-serving
Scope: Configuration and operation of local language model endpoints (vLLM A/B).
Non-scope: No routing logic, no folding logic, no governance logic, no event membrane enforcement.
This repo defines how models are served.
It does not define how they are used.
laforge-model-serving MUST NOT:
- implement routing logic
- decide which model handles a request
- implement folding/reconciliation
- mutate control parameters
- read domain content or runtime logs
laforge-model-serving MUST:
- expose stable OpenAI-compatible endpoints
- expose stable model IDs via /v1/models
- respect explicit seeds if provided
- avoid hidden config mutation
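As a sketch of what "stable model IDs via /v1/models" implies for a client, here is a minimal helper assuming the standard OpenAI-compatible response shape for that endpoint. The model IDs shown (`model-a`, `model-b`) are hypothetical placeholders, not IDs defined by this repo:

```python
# Sketch: extract the model IDs advertised by an OpenAI-compatible
# /v1/models endpoint. Clients above this layer rely on these IDs
# being stable across restarts.

def list_model_ids(models_response: dict) -> list[str]:
    """Return the model IDs from a /v1/models response body."""
    return [entry["id"] for entry in models_response.get("data", [])]

# Example response shape, following the OpenAI API convention.
# The IDs below are hypothetical.
example = {
    "object": "list",
    "data": [
        {"id": "model-a", "object": "model"},
        {"id": "model-b", "object": "model"},
    ],
}

print(list_model_ids(example))  # prints ['model-a', 'model-b']
```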
Token output may vary between runs, even with a fixed seed. Control determinism is enforced above this layer.