proposal: ATR detection rules as a community safety plugin

ATR (Agent Threat Rules) is an MIT-licensed set of 344 regex detection rules for AI agent threats: prompt injection, tool poisoning, credential exfiltration, privilege escalation, and context manipulation. It ships as an npm package and a Python package (pyatr).

Production deployments: Microsoft Copilot SWE Agent (automated CVE detection loop), Cisco AI Defense (skill scanning pipeline), MISP/CIRCL (taxonomy merged by Alexandre Dulaunoy), OWASP Agentic Security Handbook.

The ADK BasePlugin architecture (cross-cutting policies, after_model_callback / before_tool_callback) is a natural fit for a safety plugin that evaluates agent inputs and tool outputs against this rule corpus.

Concrete proposal: an ATR plugin that hooks before_tool_callback to scan tool arguments against the rules and, optionally, blocks the call on critical/high severity matches. This is similar to how Guardrails AI or LiteLLM's guardrail hooks work, but backed by the community-maintained ATR rule corpus.
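A rough sketch of what the hook could look like, to make the proposal concrete. It is deliberately standalone: it does not import ADK or pyatr, `AtrSketchPlugin`, `Rule`, and `scan` are hypothetical names, and the three rules are stand-ins I made up for illustration, not entries from the ATR corpus. The keyword-only signature of `before_tool_callback` and the "return a dict to skip the tool call, return None to proceed" convention are my assumptions about how the ADK hook behaves; a real plugin would subclass BasePlugin and load rules from pyatr.

```python
import asyncio
import json
import re
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape: id, severity, compiled regex."""
    rule_id: str
    severity: str
    pattern: re.Pattern


# Stand-in rules for illustration only; NOT taken from the ATR corpus.
RULES = [
    Rule("demo-prompt-injection", "high",
         re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)),
    Rule("demo-credential-path", "critical",
         re.compile(r"\.aws/credentials|id_rsa\b")),
    Rule("demo-long-base64", "low",
         re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")),
]

BLOCKING_SEVERITIES = {"critical", "high"}


def scan(tool_args: dict[str, Any]) -> list[Rule]:
    """Serialize the arguments to one string and return every matching rule."""
    text = json.dumps(tool_args, ensure_ascii=False, default=str)
    return [rule for rule in RULES if rule.pattern.search(text)]


class AtrSketchPlugin:
    """Standalone stand-in; a real plugin would subclass ADK's BasePlugin."""

    def __init__(self, block: bool = True) -> None:
        self.block = block

    async def before_tool_callback(
        self, *, tool: Any, tool_args: dict[str, Any], tool_context: Any
    ) -> Optional[dict]:
        # Assumed convention: a dict replaces the tool result, None proceeds.
        blocking = [r for r in scan(tool_args)
                    if r.severity in BLOCKING_SEVERITIES]
        if self.block and blocking:
            return {
                "error": "tool call blocked by ATR sketch",
                "rule_ids": [r.rule_id for r in blocking],
            }
        return None


if __name__ == "__main__":
    plugin = AtrSketchPlugin()
    result = asyncio.run(plugin.before_tool_callback(
        tool=None, tool_args={"path": "~/.aws/credentials"}, tool_context=None))
    print(result)
```

With `block=False` the same class only scans and never short-circuits, which is the "optionally blocks" part of the proposal; logging or reporting of non-blocking matches is left out of the sketch.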

I can write this as a community plugin if the ADK team would accept a PR under examples/community/ or a separate package. Two questions before starting:

1. Is examples/community/ the right landing spot, or would a separate pip-installable package be preferred?
2. Is pyatr (Python) or the npm package the preferred dependency surface for ADK integrations?

ATR repo: https://github.com/Agent-Threat-Rule/agent-threat-rules

proposal: ATR detection rules as a community safety plugin #5740
