The initial experiments can run on baremetal nodes or single OpenShift nodes. In anticipation of future scaling, we'll need to separate concerns so they can be containerized.
For the initial prototype, the most important piece is containerizing the benchmark.
Each example (codebase + issue + patch) should run in one container such that:
- New patches can be injected and reversed.
- Verifiers can be run (both light-weight and heavier unit/perf tests).
Models will be deployed via vLLM. Feedback from tools appended to context via shared volume.
The initial experiments can run on baremetal nodes or single OpenShift nodes. In anticipation of future scaling, we'll need to separate concerns so they can be containerized.
For the initial prototype, the most important piece is containerizing the benchmark.
Each example (codebase + issue + patch) should run in one container such that:
Models will be deployed via vLLM. Feedback from tools appended to context via shared volume.