Security: Prompt injection detector bypassed via Unicode homoglyphs #1925

@Liohtml

Summary

The prompt injection detector in src/openhuman/prompt_injection/detector.rs normalizes leet-speak (0→o, 1→i, 3→e, etc.) but does NOT handle:

  • Cyrillic homoglyphs: а (U+0430) for Latin a, о (U+043E) for o, etc.
  • Fullwidth characters: ｉｇｎｏｒｅ (U+FF49, U+FF47, ...) passes through undetected
  • Accented characters: without NFKD decomposition, words like igñore evade the regex rules
  • Confusables from UTS #39: dozens of visually identical characters from other scripts

Location

src/openhuman/prompt_injection/detector.rs, normalize_prompt() function

Impact

High: trivial bypass of prompt injection detection. An attacker substitutes a single Cyrillic character in "ignore previous instructions" and the regex rules never fire.
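
A minimal sketch of why one substitution is enough. The detector's actual regex rules are not shown in this issue, so `naive_rule_fires` below is a hypothetical stand-in that lowercases the input and searches for a literal phrase:

```rust
/// Hypothetical stand-in for one detector rule (not the project's code):
/// lowercase the input and look for a literal ASCII phrase.
fn naive_rule_fires(input: &str) -> bool {
    input.to_lowercase().contains("ignore previous instructions")
}
```

With plain ASCII input such as `"Ignore previous instructions"` the rule fires. With `"ign\u{043E}re previous instructions"` (Cyrillic о in place of Latin o) it does not, because matching compares code points and U+043E is not U+006F, even though the two strings render identically.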

Suggested Fix

  1. Apply Unicode NFKD decomposition before lowercasing
  2. Add confusable mapping from UTS #39 (at minimum Latin↔Cyrillic)
  3. Strip all characters from categories Cf, Mn, Mc that aren't essential to meaning
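
The three steps could be sketched roughly as follows, using only the standard library. This is an illustration under stated assumptions, not the project's `normalize_prompt()`: the names `fold_char` and `normalize_sketch` are hypothetical, the Cyrillic table covers only a handful of letters rather than the full UTS #39 data, and the fullwidth fold plus combining-mark strip only approximate NFKD (a precomposed ñ, U+00F1, would still need real decomposition, and category Mc is not handled). A real fix would likely rely on a proper normalization library instead.

```rust
/// Hypothetical helper: fold one character toward an ASCII skeleton,
/// or return None to drop it entirely.
fn fold_char(c: char) -> Option<char> {
    match c {
        // Zero-width and bidi control characters (a small subset of category Cf).
        '\u{200B}'..='\u{200F}' | '\u{202A}'..='\u{202E}' | '\u{2060}' | '\u{FEFF}' => None,
        // Combining Diacritical Marks block (a subset of category Mn).
        '\u{0300}'..='\u{036F}' => None,
        // Fullwidth forms U+FF01..U+FF5E map onto ASCII U+0021..U+007E.
        '\u{FF01}'..='\u{FF5E}' => char::from_u32(c as u32 - 0xFEE0),
        // A few Cyrillic lookalikes (illustrative subset, not the UTS #39 table).
        '\u{0430}' => Some('a'), // а
        '\u{0435}' => Some('e'), // е
        '\u{043E}' => Some('o'), // о
        '\u{0440}' => Some('p'), // р
        '\u{0441}' => Some('c'), // с
        '\u{0445}' => Some('x'), // х
        '\u{0456}' => Some('i'), // і
        _ => Some(c),
    }
}

/// Hypothetical normalizer: fold characters first, then lowercase,
/// mirroring the "normalize before lowercasing" order suggested above.
fn normalize_sketch(input: &str) -> String {
    input
        .chars()
        .filter_map(fold_char)
        .flat_map(|c| c.to_lowercase())
        .collect()
}
```

Running the regex rules on the output of such a function, rather than on the raw prompt, means the Cyrillic, fullwidth, decomposed-accent, and zero-width variants of "ignore" all collapse to the same ASCII string before matching.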
