v0.0.5 — Claude Opus 4.7 trigger, paradigm-shift README, leaderboard 52/100

@wuyoscar wuyoscar released this 17 Apr 14:41
· 120 commits to main since this release
3dd2a91

New ISC Trigger

Claude Opus 4.7 (pre-release, Rank 1 placeholder) — agentic QwenGuard TVD, 12 multilingual harmful completions across EN / FR / KO / ZH, all validator-passed. Jailbroken in seconds. See community/claudeopus47-agent-qwenguard. Confirmed count: 52/100.

README Overhaul (all 7 language versions)

  • New intro framing: ISC is a paradigm shift. The failure surface has moved from the chat prompt into the agent workflow. Under jailbreak-style evaluation on Pass@3, every frontier Large Model with agent capability hits a 100% trigger rate.
  • "The task is the trigger" replaces "No jailbreak required".
  • Swapped "legitimate professional workflow" / "real professional task" for workflow-task / sensitive-tool workflow / tool-integrated workflow equivalents.
  • LLMs are now consistently referred to as "Large Model" / "大模型".
  • New 🔍 In the Community section with 4 practitioner quotes (Bonny Banerjee, Charles H. Martin, Andrei Trandafira, Christopher Bain).
  • New 🔬 External Analyses section listing third-party write-ups and projects (promptfoo, Gist.Science, BotBeat News, 模安局, AI Post Transformers podcast, XSafeClaw).
  • 📋 ISC-Bench dropped its "High-Stakes Safety Benchmark" subtitle.
  • How to Contribute collapsed to a one-line pointer; the full workflow moved to CONTRIBUTING.md.
  • Audited FAQ / Updates / News / Community Reproductions in all language versions and translated leftover English.

Leaderboard

  • Claude Opus 4.7 inserted at Rank 1 (Arena score rendered as a placeholder, since the model is not yet on Arena).
  • Old Rank 100 (o1-preview) dropped to keep the displayed window at 100.
  • grok-4-fast-chat → Grok 4 Fast display-name mapping added; existing community/grok4fast-darkweb case now counts.
  • Fixed the GLM-4.7 and GLM-4.6 schema in isc_cases.json (both entries were previously invisible).
  • scripts/gen_leaderboard.py and docs/static/js/main.js both render a placeholder for null Arena scores.
  • Regenerated leaderboard_progress.svg (now shows 52/100).
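
The leaderboard changes above (display-name mapping, a placeholder for null Arena scores, and a fixed 100-row window) can be illustrated with a minimal sketch. This is not the code in scripts/gen_leaderboard.py; the function names, the dictionary keys `id` and `arena_score`, and the `"n/a"` placeholder string are all assumptions made for illustration, and the input list is assumed to be already sorted by rank.

```python
from typing import Optional

# Hypothetical mapping from raw model identifiers to display names.
# Only the grok-4-fast-chat entry comes from the release notes.
DISPLAY_NAMES = {"grok-4-fast-chat": "Grok 4 Fast"}

PLACEHOLDER = "n/a"  # assumed placeholder text; the project may use a different one
WINDOW = 100         # number of rows shown on the leaderboard


def display_name(raw_id: str) -> str:
    """Return the mapped display name, or the raw id if no mapping exists."""
    return DISPLAY_NAMES.get(raw_id, raw_id)


def format_arena_score(score: Optional[int]) -> str:
    """Render a null (None) Arena score as a placeholder instead of failing."""
    return PLACEHOLDER if score is None else str(score)


def build_rows(models: list[dict]) -> list[dict]:
    """Build display rows from rank-ordered models, truncated to the window."""
    rows = []
    for rank, model in enumerate(models[:WINDOW], start=1):
        rows.append(
            {
                "rank": rank,
                "name": display_name(model["id"]),
                "arena": format_arena_score(model.get("arena_score")),
            }
        )
    return rows
```

With this shape, inserting a new entry at the front of a 100-entry list pushes the last entry out of the displayed window, which matches the behaviour described for the old Rank 100 entry.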

New

  • CONTRIBUTING.md with the full contribution workflow (ISC trigger submission, template / code contributions, PR checklist, safety boundary).
  • community/claudeopus47-agent-qwenguard/ reproduction folder.

Templates

  • aiml_guard: consolidated variant prompt_*.txt files into prompt.txt and prompt_zh.txt; dropped 9 experimental variants.

See CHANGELOG.md for the full history.