[Feature Request] OWASP LLM Top 10 Coverage for AI-Powered Application Testing

## Summary

Strix currently covers a comprehensive set of traditional web vulnerabilities (SQLi, XSS, CSRF, SSRF, IDOR, auth bypass, etc.), which makes it an excellent tool for pentesting conventional web applications. However, as AI-powered applications become increasingly prevalent, there is a growing need to also address the **OWASP Top 10 for Large Language Model Applications (2025 edition)**.

## Motivation

The [OWASP LLM Top 10](https://genai.owasp.org/llm-top-10/) defines the most critical security risks for applications that integrate LLMs. These risks are distinct from traditional web vulnerabilities and are not currently covered by Strix's scanning capabilities.

With agentic AI applications, RAG pipelines, and LLM-backed APIs becoming mainstream targets, security tooling must evolve to address these new attack surfaces.

## Proposed Coverage

We propose adding a dedicated **LLM Security Scan Mode** that maps to the 10 risks:

| ID | Risk | Attack Technique |
|---|---|---|
| LLM01 | Prompt Injection | Direct & indirect prompt injection payloads |
| LLM02 | Sensitive Information Disclosure | Data extraction via crafted prompts |
| LLM03 | Supply Chain | Dependency and model provenance analysis |
| LLM04 | Data & Model Poisoning | Training/fine-tuning data manipulation checks |
| LLM05 | Improper Output Handling | Injection via LLM-generated output (XSS, RCE) |
| LLM06 | Excessive Agency | Tool/action scope and permission boundary testing |
| LLM07 | System Prompt Leakage | Extraction of system/meta prompts |
| LLM08 | Vector & Embedding Weaknesses | RAG poisoning and embedding manipulation |
| LLM09 | Misinformation | Hallucination and output validation bypass |
| LLM10 | Unbounded Consumption | Rate limiting, resource exhaustion, token flooding |

## Suggested Implementation

- Add a new scan mode: `strix --target ./app --mode llm-owasp`
- Integrate a **prompt injection fuzzing engine** leveraging Strix's existing HTTP proxy and browser automation toolkit
- Provide compliance-ready reports tagged with LLM01–LLM10 identifiers
- Allow combining with existing web scans for **hybrid AI+web application coverage**

## References

- [OWASP Top 10 for LLM Applications 2025](https://genai.owasp.org/llm-top-10/)
- [OWASP GenAI Security Landscape Q2 2026](https://genai.owasp.org/resource/al-security-solutions-landscape-for-llm-and-gen-al-apps-q2-2026/)
- Tools with partial LLM coverage for reference: Giskard, StackHawk, Promptfoo

## Additional Context

This feature would position Strix as one of the first **agentic pentesting tools** to natively cover both OWASP Web Top 10 and OWASP LLM Top 10, addressing the full modern attack surface. Given Strix's multi-agent architecture, it is particularly well-suited to simulate the kinds of complex, multi-turn attacks that LLM01 (Prompt Injection) and LLM06 (Excessive Agency) require.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature Request] OWASP LLM Top 10 Coverage for AI-Powered Application Testing #428

Summary

Motivation

Proposed Coverage

Suggested Implementation

References

Additional Context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

ID	Risk	Attack Technique
LLM01	Prompt Injection	Direct & indirect prompt injection payloads
LLM02	Sensitive Information Disclosure	Data extraction via crafted prompts
LLM03	Supply Chain	Dependency and model provenance analysis
LLM04	Data & Model Poisoning	Training/fine-tuning data manipulation checks
LLM05	Improper Output Handling	Injection via LLM-generated output (XSS, RCE)
LLM06	Excessive Agency	Tool/action scope and permission boundary testing
LLM07	System Prompt Leakage	Extraction of system/meta prompts
LLM08	Vector & Embedding Weaknesses	RAG poisoning and embedding manipulation
LLM09	Misinformation	Hallucination and output validation bypass
LLM10	Unbounded Consumption	Rate limiting, resource exhaustion, token flooding

[Feature Request] OWASP LLM Top 10 Coverage for AI-Powered Application Testing #428

Description

Summary

Motivation

Proposed Coverage

Suggested Implementation

References

Additional Context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions