v0.11.0
What's Changed
New plugins
- Template / SQL injection probes by @erickgalinkin in #1138
- Feature/add mistral generator by @dimensi0n in #1135
New features
- feature: update default toxicity detector by @leondz in #1106
- feature: lightweight probe defaults by @leondz in #1116
- feature: max_workers / give kinda helpful message if too many open files by @leondz in #1110
- Multiprocess enabled logging config by @jmartin-tech in #1140
- Feature: multilingual machine translation by @SnowMasaya in #943
- Support stripping until end think token given empty skip_seq_start in config by @aishwaryap in #1185
- update: add probe tiers by @leondz in #1151
- update: promptinject detector now accepts multiple triggers by @leondz in #1148
- update: rename atkgen probe model to be clear about toxicity by @leondz in #1149
- update: remove ambiguous terms from `slur_terms_en` payload by @leondz in #1150
- reporting: update report aggregation funcs by @leondz in #1156
- script: qualitative review output by @leondz in #1144
- Add -no-cnv flag support to ggml generators by @IanYHChu in #1189
- reporting: add option for no group score by @leondz in #1194
- reporting: aggregate probe as min by @leondz in #1218
- reporting: add defcon lozenges for relative & absolute scores by @leondz in #1216
- Update/refactor specialwords by @leondz in #1178
- reporting: smooth z-score wildness by @leondz in #1212
- Task: 2025 Q2 scoring calibration by @jmartin-tech in #1231 (thanks to Vijil.ai for data contributions)
- update calibration data for additional probes by @jmartin-tech in #1236
- reporting: change default aggregation by @leondz in #1234
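
As an illustration of the think-token stripping entry above (#1185), the following is a minimal sketch of the described behaviour, not the code from that PR. `skip_seq_start` comes from the entry itself; the function name `strip_skip_sequence` and the `skip_seq_end` parameter are hypothetical names chosen for this example. It assumes that an empty start marker means "discard everything up to and including the first end marker", which suits models that emit only a closing think token.

```python
def strip_skip_sequence(text: str, skip_seq_start: str, skip_seq_end: str) -> str:
    """Remove reasoning text delimited by skip markers (illustrative only)."""
    if not skip_seq_end:
        # Nothing to strip without an end marker.
        return text

    if not skip_seq_start:
        # Empty start marker: drop everything up to and including
        # the first end marker, if one is present.
        _, found, rest = text.partition(skip_seq_end)
        return rest.lstrip() if found else text

    # Both markers given: remove each start...end span in turn.
    result = text
    while True:
        start = result.find(skip_seq_start)
        if start == -1:
            break
        end = result.find(skip_seq_end, start + len(skip_seq_start))
        if end == -1:
            break
        result = result[:start] + result[end + len(skip_seq_end):]
    return result


if __name__ == "__main__":
    print(strip_skip_sequence("some reasoning</think>\nAnswer", "", "</think>"))
```

Returning the text unchanged when no end marker is found is a design choice in this sketch, so that outputs without any reasoning block pass through untouched.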
Documentation
- Fix typo in README for leak replay probe by @arjun-krishna1 in #1142
- docs: split 'extending' docs out from 'contributing' by @leondz in #1146
- doc file class corrections by @jmartin-tech in #1200
- docs: formatting fixes by @leondz in #1215
Tuning & fixes
- clear pip cached files by @jmartin-tech in #1129
- set a default soft_probe_prompt_cap in `_config` by @jmartin-tech in #1133
- enhance response type support from local NeMo-Guardrails by @jmartin-tech in #1131
- bugfix: encoding detection generating false positives by @leondz in #1130
- update: unify on `attempt.notes["triggers"]` by @leondz in #1147
- Bump datasets version by @JanetVictorious in #1137
- make all workflow permissions explicit by @jmartin-tech in #1162
- update: add soft prompt caps to encoding probes by @leondz in #1154
- update: rename `bcp47` to `lang` by @leondz in #1164
- one detection result per output when testing regex based matches in `exploitation` by @jmartin-tech in #1167
- Removed detector prefix from eval records by @mrowebot in #1157
- bugfix: HF Detector exceptions now handled gracefully by default by @leondz in #1170
- cache workflow resources by @jmartin-tech in #1173
- refactor probe `tier` as enum with value in plugin cache by @jmartin-tech in #1159
- update: more meaningful values in tier enums by @leondz in #1176
- block failing litellm 1.67.2 by @leondz in #1179
- ux: give more verbose message for CLI typos by @leondz in #1182
- refactor `LatentInjection` by @leondz in #1152
- cap `litellm` max version to avoid their windows bug by @leondz in #1186
- update: rename `Translator` -> `LangProvider` and associated elements by @leondz in #1183
- bugfix: reduce latent optimisation permutation explosion by @leondz in #1181
- replicate generator pickle support improvements by @jmartin-tech in #1190
- Fix ambiguous series value error when running --report by @marcorosa in #1171
- add arm64 runner to Linux testing by @jmartin-tech in #1196
- Testing: storage reduction by @jmartin-tech in #1204
- remove unused tooling to free space by @jmartin-tech in #1206
- update deps away from insecure versions by @leondz in #1207
- update `Tier` impl by @leondz in #1205
- config: sync probe active defaults with default config used in practice by @leondz in #1214
- update: revert default `_config.run.generations` to `5` by @leondz in #1227
- fix: stop `atkgen` turn count variation in test relying on fixed turn count by @leondz in #1226
- fix plugin cache tests by @emmanuel-ferdman in #1229
- ux: move translator load msg into translator instantiation by @leondz in #1184
- extract text when processing multi-modal prompts by @jmartin-tech in #1228
New Contributors
- @JanetVictorious made their first contribution in #1137
- @SnowMasaya made their first contribution in #943
- @dimensi0n made their first contribution in #1135
- @mrowebot made their first contribution in #1157
- @aishwaryap made their first contribution in #1185
- @marcorosa made their first contribution in #1171
- @IanYHChu made their first contribution in #1189
Full Changelog: v0.10.3.1...v0.11.0