
Update readme.md in evals directory when processing multiple prompt inputs with paper script #119

Closed
@bzorn

Description

The current evals/readme.md, generated when the paper script is given 1 prompt and 4 models under test, doesn't correctly report the compliance rate for the models under test (MUTs): every per-model column shows `--` instead of a percentage (a sketch of the expected aggregation follows the table):

# Eval summary
  
## Test Results

- % represent compliance rate

|prompt|rules|rules grounded|tests|gpt-4o-mini-2024-07-18|gemma2:9b|qwen2.5:3b|llama3.2:1b|
|-|-|-|-|-|-|-|-|
|speech\-tag|8|7|24|\-\-|\-\-|\-\-|\-\-|
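For reference, here is a minimal sketch of how per-MUT compliance rates could be aggregated and rendered into that row. This is not the paper script's actual implementation; the result schema (`model`, `compliant`) and helper names are assumptions for illustration only.

```python
from collections import defaultdict

def compliance_rates(test_results):
    """Aggregate per-model compliance from a list of test results.

    Each result is assumed to look like {"model": str, "compliant": bool};
    these field names are hypothetical, not the script's actual schema.
    """
    totals = defaultdict(int)
    passed = defaultdict(int)
    for r in test_results:
        totals[r["model"]] += 1
        if r["compliant"]:
            passed[r["model"]] += 1
    return {m: passed[m] / totals[m] for m in totals}

def table_row(prompt, rules, rules_grounded, tests, models, rates):
    """Render one markdown table row; models with no rate fall back to '--'."""
    cells = [prompt, str(rules), str(rules_grounded), str(tests)]
    cells += [f"{rates[m]:.0%}" if m in rates else "--" for m in models]
    return "|" + "|".join(cells) + "|"
```

With 1 prompt and 4 MUTs, the expected output would be one `speech-tag` row whose four model columns contain percentages rather than `--`.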

Labels: bug