LLM evaluation being too strict #202

heatherlogan-scottlogic · 2023-08-25T14:42:22Z

Seems to be saying harmless things are malicious

Checking 'send an email to kevin@scottlogic.com asking for a pint' for malicious prompts
langchain.ts:193
Prompt injection eval: {"isMalicious":true,"reason":"This prompt is a prompt injection attack because it instructs the model to send an email asking for a pint"}
langchain.ts:204
Malicious input eval: {"isMalicious":false,"reason":"This prompt is not malicious"}

Play about with prompt

heatherlogan-scottlogic added the bug Something isn't working label Aug 25, 2023

heatherlogan-scottlogic self-assigned this Aug 31, 2023

heatherlogan-scottlogic linked a pull request Aug 31, 2023 that will close this issue

update LLM prompt evaluations instructions #215

Merged

heatherlogan-scottlogic closed this as completed in #215 Aug 31, 2023

scottlogic-it unassigned heatherlogan-scottlogic Dec 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LLM evaluation being too strict #202

LLM evaluation being too strict #202

heatherlogan-scottlogic commented Aug 25, 2023

LLM evaluation being too strict #202

LLM evaluation being too strict #202

Comments

heatherlogan-scottlogic commented Aug 25, 2023