Release v0.0.5 — Claude Opus 4.7 trigger, paradigm-shift README, leaderboard 52/100 · wuyoscar/Internal-Safety-Collapse

New ISC Trigger

Claude Opus 4.7 (pre-release, Rank 1 placeholder) — agentic QwenGuard TVD, 12 multilingual harmful completions across EN / FR / KO / ZH, all validator-passed. Jailbroken in seconds. See community/claudeopus47-agent-qwenguard. Confirmed count: 52/100.

README Overhaul (all 7 language versions)

New intro framing: ISC is a paradigm shift. The failure surface has moved from the chat prompt into the agent workflow. Under jailbreak-style evaluation on Pass@3, every frontier Large Model with agent capability hits a 100% trigger rate.
"The task is the trigger" replaces "No jailbreak required".
Swapped "legitimate professional workflow" / "real professional task" for workflow-task / sensitive-tool workflow / tool-integrated workflow equivalents.
LLMs are now consistently called "Large Model" / "大模型".
New 🔍 In the Community section with 4 practitioner quotes (Bonny Banerjee, Charles H. Martin, Andrei Trandafira, Christopher Bain).
New 🔬 External Analyses section listing third-party write-ups and projects (promptfoo, Gist.Science, BotBeat News, 模安局, AI Post Transformers podcast, XSafeClaw).
📋 ISC-Bench dropped its "High-Stakes Safety Benchmark" subtitle.
How to Contribute collapsed to a one-line pointer; the full workflow moved to CONTRIBUTING.md.
Audited FAQ / Updates / News / Community Reproductions in all language versions and translated leftover English.

Leaderboard

Claude Opus 4.7 inserted at Rank 1 (Arena score rendered as —, not yet on Arena).
Old Rank 100 (o1-preview) dropped to keep the displayed window at 100.
grok-4-fast-chat → Grok 4 Fast display-name mapping added; existing community/grok4fast-darkweb case now counts.
Fixed the schema of the GLM-4.7 and GLM-4.6 entries in isc_cases.json (both were previously invisible).
scripts/gen_leaderboard.py and docs/static/js/main.js both render "—" for null Arena scores.
Regenerated leaderboard_progress.svg (now shows 52/100).
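
The leaderboard changes above (display-name mapping, placeholder for null Arena scores, fixed 100-row window) can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/gen_leaderboard.py: the function names, the row dictionaries, and the `model` / `arena` keys are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical mapping from raw model identifiers to display names.
# Only the grok-4-fast-chat entry comes from the release notes.
DISPLAY_NAMES = {"grok-4-fast-chat": "Grok 4 Fast"}

PLACEHOLDER = "\u2014"  # dash shown when a model has no Arena score yet
WINDOW = 100            # number of ranks displayed on the leaderboard


def display_name(raw_id: str) -> str:
    """Return the mapped display name, falling back to the raw identifier."""
    return DISPLAY_NAMES.get(raw_id, raw_id)


def format_arena_score(score: Optional[float]) -> str:
    """Render a numeric Arena score, or the placeholder when it is null."""
    return PLACEHOLDER if score is None else str(int(round(score)))


def insert_at_top(rows: list[dict], new_row: dict, window: int = WINDOW) -> list[dict]:
    """Insert a row at Rank 1 and drop whatever falls outside the window."""
    return ([new_row] + rows)[:window]


if __name__ == "__main__":
    # Hypothetical rows; the scores are made up for the demonstration.
    rows = [{"model": f"model-{i}", "arena": 1500.0 - i} for i in range(WINDOW)]
    rows = insert_at_top(rows, {"model": "Claude Opus 4.7", "arena": None})
    for row in rows[:2]:
        print(display_name(row["model"]), format_arena_score(row["arena"]))
```

Treating a missing score as `None` and formatting it only at render time keeps the stored data free of placeholder strings, so the Python generator and the JavaScript front end can each apply the same rule independently.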

New

CONTRIBUTING.md with the full contribution workflow (ISC trigger submission, template / code contributions, PR checklist, safety boundary).
community/claudeopus47-agent-qwenguard/ reproduction folder.

Templates

aiml_guard: consolidated variant prompt_*.txt files into prompt.txt and prompt_zh.txt; dropped 9 experimental variants.

See CHANGELOG.md for the full history.
