Prime verifiers environment for predicting the final bucket in VPCT scenes. It consumes JSONL rows with a messages list (user prompt containing the scene JSON) and metadata.final_bucket for the ground truth, then scores model outputs that contain answer(1|2|3).
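The scoring step described above can be sketched as follows. This is a minimal illustration, not the code in vpct.py: the names `parse_answer` and `score` are hypothetical, and taking the last `answer(N)` when several appear is an assumption.

```python
import re

# Hypothetical helpers for illustration; the actual logic in vpct.py may differ.
ANSWER_RE = re.compile(r"answer\(\s*([123])\s*\)", re.IGNORECASE)


def parse_answer(text):
    """Return the bucket from the last answer(N) in text, or None if absent."""
    matches = ANSWER_RE.findall(text)
    # Assumption: when the model writes several answers, the last one counts.
    return int(matches[-1]) if matches else None


def score(completion, final_bucket):
    """Return 1.0 when the parsed bucket equals the ground truth, else 0.0."""
    predicted = parse_answer(completion)
    return 1.0 if predicted is not None and predicted == final_bucket else 0.0
```

A completion with no parseable `answer(N)` scores 0.0 in this sketch rather than raising an error.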
- pyproject.toml: package metadata/deps for Prime env packaging.
- vpct.py: environment implementation exposing load_environment.
Each JSONL row should look like:
{
  "messages": [
    {"role": "user", "content": "<scene JSON here>"}
  ],
  "metadata": {"final_bucket": 2}
}

If metadata.final_bucket is missing, the env falls back to parsing answer(X) from the assistant message (if present).
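As a rough sketch of the label lookup just described, the function below reads `metadata.final_bucket` first and only then looks for `answer(X)` in an assistant message. The name `ground_truth` is hypothetical, and the sketch assumes message content is a plain string and that X is 1, 2, or 3.

```python
import json
import re

# Assumption: valid labels are 1, 2, or 3, written as answer(N).
ANSWER_RE = re.compile(r"answer\(\s*([123])\s*\)")


def ground_truth(row):
    """Return the label from metadata.final_bucket, else from an assistant answer(X)."""
    bucket = (row.get("metadata") or {}).get("final_bucket")
    if bucket is not None:
        return int(bucket)
    # Fallback: scan assistant messages for an answer(X) string.
    for message in row.get("messages", []):
        if message.get("role") == "assistant":
            match = ANSWER_RE.search(message.get("content") or "")
            if match:
                return int(match.group(1))
    return None  # No label available for this row.


line = '{"messages": [{"role": "user", "content": "<scene JSON here>"}], "metadata": {"final_bucket": 2}}'
row = json.loads(line)
print(ground_truth(row))  # 2
```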
The environment redacts finalBucket fields and any answer(X) strings from the user prompt before sending to the model to avoid label leakage.
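One way such redaction could work is sketched below; it is not necessarily how vpct.py does it. The helper names are hypothetical, and the sketch assumes the prompt is a string that is either pure JSON or free text.

```python
import json
import re

# Matches answer(X) for any word-like X, e.g. answer(2) or answer(X).
ANSWER_RE = re.compile(r"answer\(\s*\w+\s*\)", re.IGNORECASE)


def drop_final_bucket(node):
    """Recursively copy a parsed JSON value, omitting every finalBucket key."""
    if isinstance(node, dict):
        return {k: drop_final_bucket(v) for k, v in node.items() if k != "finalBucket"}
    if isinstance(node, list):
        return [drop_final_bucket(v) for v in node]
    return node


def redact_prompt(prompt):
    """Strip finalBucket fields (if the prompt parses as JSON) and answer(X) strings."""
    try:
        prompt = json.dumps(drop_final_bucket(json.loads(prompt)))
    except json.JSONDecodeError:
        pass  # Not pure JSON: only the answer(X) pattern is removed below.
    return ANSWER_RE.sub("", prompt)
```

A limitation of this sketch: when the scene JSON is embedded in surrounding text, the prompt does not parse as a whole and `finalBucket` fields are left in place, so a real implementation would need to locate the JSON span first.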
uv pip install -e .
uv run vf-eval vpct --data-path /path/to/data.jsonl --max-examples 8

vf-eval expects OpenAI-style chat messages; only the first user message is forwarded to the model.
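Selecting the first user message from an OpenAI-style messages list might look like the sketch below. The function name `first_user_message` is hypothetical and is not taken from vpct.py or vf-eval.

```python
def first_user_message(messages):
    """Return the first message whose role is "user", or None if there is none."""
    return next((m for m in messages if m.get("role") == "user"), None)


messages = [
    {"role": "system", "content": "You predict buckets."},
    {"role": "user", "content": "<scene JSON here>"},
    {"role": "user", "content": "a later user turn"},
]
print(first_user_message(messages)["content"])  # <scene JSON here>
```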
data_path defaults to data/test/vpct_test.jsonl when that file is present.
prime env push --auto-bump

Use --team <teamname> or --visibility=PRIVATE as needed. Increase version in pyproject.toml (or rely on --auto-bump) when publishing updates.