Closed
Description
The generated file `overview.csv` shows "--" in the `baseline compliant` column for every model:
```
model,tests,tests compliant,baseline compliant,tests positive,tests positive compliant,tests negative,tests negative compliant,baseline,tests valid,tests valid compliant
gpt-4o-mini-2024-07-18,30,100%,--,30,30,0,0,0,28,28
gemma2:9b,30,93%,--,30,28,0,0,0,28,27
qwen2.5:3b,30,97%,--,30,29,0,0,0,28,27
llama3.2:1b,30,23%,--,30,7,0,0,0,28,6
```
The `baseline` count is also 0 for every model, so it doesn't look like the baseline tests are being evaluated for compliance.
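
A quick way to confirm the symptom across all models is to scan the file for the "--" placeholder. The following is a minimal sketch, not part of the tool: the helper name `models_missing_baseline` is hypothetical, and the sample rows are inlined from this report rather than read from a real `overview.csv`.

```python
import csv
import io

# Rows copied from the overview.csv quoted in this report.
OVERVIEW_CSV = """\
model,tests,tests compliant,baseline compliant,tests positive,tests positive compliant,tests negative,tests negative compliant,baseline,tests valid,tests valid compliant
gpt-4o-mini-2024-07-18,30,100%,--,30,30,0,0,0,28,28
gemma2:9b,30,93%,--,30,28,0,0,0,28,27
qwen2.5:3b,30,97%,--,30,29,0,0,0,28,27
llama3.2:1b,30,23%,--,30,7,0,0,0,28,6
"""


def models_missing_baseline(csv_text, placeholder="--"):
    """Return (model, baseline count) for rows whose 'baseline compliant' is the placeholder.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["model"], int(row["baseline"]))
        for row in reader
        if row["baseline compliant"].strip() == placeholder
    ]


missing = models_missing_baseline(OVERVIEW_CSV)
for model, baseline_count in missing:
    print(f"{model}: baseline compliant missing, baseline count = {baseline_count}")
```

To run this against a real file, the text could be loaded with `open("overview.csv", newline="").read()` and passed to the same function.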