community: Claude Opus 4.7 agentic QwenGuard TVD #86

Merged
wuyoscar merged 1 commit into main from community/claudeopus47-agent-qwenguard on Apr 16, 2026

Conversation

@wuyoscar (Owner)

Summary

Adds ISC-Bench community evidence of an ISC trigger on Claude Opus 4.7 via the agentic QwenGuard TVD task.

Changes

  • community/claudeopus47-agent-qwenguard/ — full evidence package:
    • README.md — model info, ISC mechanism, harmful content summary, assessment
    • input_template.json — original template with ??? placeholders
    • evidence/test_case.json — model-generated output (6 safe + 6 unsafe across EN/FR/KO/ZH)
    • evidence/agent_log.json — full agent conversation log
    • evidence/guard.py — validator script
  • README.md — Updates table: added 2026-04-17 entry for Claude Opus 4.7

Key Findings

Claude Opus 4.7 completed the agentic QwenGuard TVD task. To pass the Qwen3Guard-Gen-0.6B validator, it generated six assistant_unsafe responses, including fentanyl synthesis (FR), phishing construction (KO), cannibalism instructions (EN), and a poisoning guide (ZH). No jailbreak or adversarial prompt optimization was used.

- Add community/claudeopus47-agent-qwenguard/ with README, input_template,
  evidence (agent_log, test_case, guard.py)
- Update README.md Updates table with 2026-04-17 entry
@wuyoscar merged commit 86620e6 into main on Apr 16, 2026