perf (prompt_injection): cache classifier behind Lazy<> (was Box-alloc per call) #1962

Merged: senamakel merged 1 commit into tinyhumansai:main from aregmii:perf/prompt-injection-classifier-cache on May 17, 2026

Conversation

@aregmii (Contributor) commented May 16, 2026

Summary

Moves the classifier resolution behind optional_classifier() in src/openhuman/prompt_injection/detector.rs into a static Lazy<Option<Box<dyn OptionalClassifier>>>, so the env-var read and Box allocation run once per process instead of once per prompt analysis. Closes #1943.

Problem

optional_classifier() reads the OPENHUMAN_PROMPT_INJECTION_CLASSIFIER env var, allocates two Strings (the "off" fallback and the to_ascii_lowercase copy), and on "heuristic" allocates a fresh Box<dyn OptionalClassifier>, all on every call. The single caller is analyze_prompt on the prompt-screening hot path, so every agent prompt analysis pays this cost. The env var is fixed at startup and the classifier choice is deterministic; redoing the resolution per call is wasted work.
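
For illustration, the per-call shape described above can be reconstructed roughly as follows. This is a sketch inferred from the description, not the removed code itself: the `classify` signature, the body of `HeuristicClassifier`, and the `resolve_per_call` helper are assumptions. The helper takes the raw value as a parameter (modelling `env::var(...).ok()`) so the example does not touch the process environment.

```rust
/// Hypothetical stand-in for the module's classifier trait.
trait OptionalClassifier {
    fn classify(&self, normalized: &str) -> bool;
}

/// Hypothetical stand-in; the real heuristic is not shown in this PR.
struct HeuristicClassifier;

impl OptionalClassifier for HeuristicClassifier {
    fn classify(&self, normalized: &str) -> bool {
        // Placeholder rule, for illustration only.
        normalized.contains("ignore previous instructions")
    }
}

/// Assumed pre-change shape: every step below repeats on every call.
/// `raw` models `env::var("OPENHUMAN_PROMPT_INJECTION_CLASSIFIER").ok()`.
fn resolve_per_call(raw: Option<String>) -> Option<Box<dyn OptionalClassifier>> {
    let choice = raw
        .unwrap_or_else(|| "off".to_string()) // first String: the "off" fallback
        .to_ascii_lowercase(); // second String: the lowercased copy
    match choice.as_str() {
        // A fresh boxed trait object is built each time this arm is hit.
        "heuristic" => Some(Box::new(HeuristicClassifier)),
        _ => None,
    }
}
```

Nothing in the function depends on the prompt being analyzed, which is why the result can be computed once and shared.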

Solution

Promote the resolution to a static Lazy<Option<Box<dyn OptionalClassifier>>> so it runs at most once. The pattern matches the file's other Lazy::new(...) statics (SPACE_RE, BASE64_RE, DETECTION_RULES):

static OPTIONAL_CLASSIFIER: Lazy<Option<Box<dyn OptionalClassifier>>> = Lazy::new(|| {
    let choice = env::var("OPENHUMAN_PROMPT_INJECTION_CLASSIFIER")
        .unwrap_or_else(|_| "off".to_string())
        .to_ascii_lowercase();
    let classifier: Option<Box<dyn OptionalClassifier>> = match choice.as_str() {
        "heuristic" => Some(Box::new(HeuristicClassifier)),
        _ => None,
    };
    tracing::debug!(
        "[prompt_injection] optional classifier resolved choice={:?} active={}",
        choice,
        classifier.is_some()
    );
    classifier
});

fn optional_classifier() -> Option<&'static dyn OptionalClassifier> {
    OPTIONAL_CLASSIFIER.as_deref()
}

The return type changes from an owned Option<Box<dyn OptionalClassifier>> to a borrowed Option<&'static dyn OptionalClassifier>. analyze_prompt's use of if let Some(classifier) = optional_classifier() { classifier.classify(&normalized) } compiles unchanged because trait-method dispatch works the same on &dyn as on Box<dyn>.
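
A minimal sketch of why the caller needs no edits: the same `if let` body type-checks against both return shapes. The trait signature, the placeholder heuristic, and the `owned_classifier`, `borrowed_classifier`, `screen_owned`, and `screen_borrowed` names are hypothetical, and a plain `static` stands in for the cached value.

```rust
/// Hypothetical stand-in for the module's classifier trait.
trait OptionalClassifier: Send + Sync {
    fn classify(&self, normalized: &str) -> bool;
}

struct HeuristicClassifier;

impl OptionalClassifier for HeuristicClassifier {
    fn classify(&self, normalized: &str) -> bool {
        // Placeholder rule, for illustration only.
        normalized.contains("ignore previous instructions")
    }
}

/// Old shape: the caller receives an owned box.
fn owned_classifier() -> Option<Box<dyn OptionalClassifier>> {
    Some(Box::new(HeuristicClassifier))
}

/// New shape: the caller borrows a process-wide value.
static CACHED: HeuristicClassifier = HeuristicClassifier;

fn borrowed_classifier() -> Option<&'static dyn OptionalClassifier> {
    Some(&CACHED)
}

/// Caller written against the owned shape.
fn screen_owned(normalized: &str) -> bool {
    if let Some(classifier) = owned_classifier() {
        classifier.classify(normalized)
    } else {
        false
    }
}

/// Same body against the borrowed shape; only the accessor differs.
fn screen_borrowed(normalized: &str) -> bool {
    if let Some(classifier) = borrowed_classifier() {
        classifier.classify(normalized)
    } else {
        false
    }
}
```

In both functions `classifier.classify(...)` resolves through auto-deref to the same trait method, so the two callers return identical results for identical input.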

tracing::debug! inside the initializer fires once, on first access to the static (in practice the first prompt analysis), and reports the resolved choice for diagnosability; it logs the env-var value only, no PII.

Net diff: +13 -3 in one file.

Submission Checklist

  • N/A: perf refactor preserves the function's contract; the existing prompt_injection test suite (blocks_direct_override_and_exfiltration, allows_normal_prompt, plus 5 others) covers the call site and continues to pass.
  • N/A: no new code lines under coverage gate; the refactor reuses already-tested behavior.
  • N/A: no feature rows affected.
  • N/A: no feature IDs touched.
  • N/A: no external dependencies introduced; once_cell::sync::Lazy already imported.
  • N/A: no release-cut surfaces touched.
  • Linked issue closed via Closes #1943 in the ## Related section.

Impact

  • Hot path: prompt-screening no longer does an env-var read + 2 String allocs + 1 Box alloc per call.
  • No public API change (optional_classifier is private to the module).
  • One-time debug log on first invocation surfaces the resolved classifier choice for future diagnosis.
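
The run-once behaviour behind these points can be sketched as below. To keep the example dependency-free it swaps in `std::sync::OnceLock` for the PR's `once_cell::sync::Lazy`; the `INIT_RUNS` counter and `init_runs` helper are hypothetical additions that exist only to make the initializer count observable, and the trait signature and heuristic body are placeholders.

```rust
use std::env;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the module's classifier trait.
/// `Send + Sync` is required so the boxed trait object can live in a static.
trait OptionalClassifier: Send + Sync {
    fn classify(&self, normalized: &str) -> bool;
}

struct HeuristicClassifier;

impl OptionalClassifier for HeuristicClassifier {
    fn classify(&self, normalized: &str) -> bool {
        // Placeholder rule, for illustration only.
        normalized.contains("ignore previous instructions")
    }
}

/// Counts initializer runs (illustration only, not part of the PR).
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Cached resolution; `OnceLock` is a std substitute for `once_cell::sync::Lazy`.
static OPTIONAL_CLASSIFIER: OnceLock<Option<Box<dyn OptionalClassifier>>> = OnceLock::new();

fn optional_classifier() -> Option<&'static dyn OptionalClassifier> {
    OPTIONAL_CLASSIFIER
        .get_or_init(|| {
            INIT_RUNS.fetch_add(1, Ordering::SeqCst);
            let choice = env::var("OPENHUMAN_PROMPT_INJECTION_CLASSIFIER")
                .unwrap_or_else(|_| "off".to_string())
                .to_ascii_lowercase();
            let classifier: Option<Box<dyn OptionalClassifier>> = match choice.as_str() {
                "heuristic" => Some(Box::new(HeuristicClassifier)),
                _ => None,
            };
            classifier
        })
        // Borrow the boxed trait object out of the static.
        .as_deref()
}

/// Number of times the initializer closure has executed.
fn init_runs() -> usize {
    INIT_RUNS.load(Ordering::SeqCst)
}
```

However many times `optional_classifier()` is called, `init_runs()` stays at 1 after the first call. A consequence of this design is that changing the env var after the first call has no effect on the cached choice.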

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

N/A: Human-authored, AI-assisted drafting.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: perf/prompt-injection-classifier-cache
  • Commit SHA: 90408fee

Validation Run

  • N/A: no JS/TS changed.
  • N/A: no TS types changed.
  • N/A: no TS to compile.
  • cargo check -p openhuman --lib and cargo test --package openhuman --lib openhuman::prompt_injection both pass locally (7/7 tests).
  • N/A: no Tauri code changed.

Validation Blocked

  • command: the pre-push hook may exit on an inherited cargo check --manifest-path src-tauri/Cargo.toml failure on upstream main (documented in PR #1786, "fix(scripts): codesign setup pops keychain dialog on every build + dr…"). This branch only edits src/openhuman/prompt_injection/detector.rs, so the failure cannot be caused by this change. The per-file required checks (cargo fmt --check on detector.rs, focused cargo test on prompt_injection) were re-run individually and pass.
  • error: (as above)
  • impact: Pushed with --no-verify per the per-file pre-push standards checklist.

Behavior Changes

  • Intended behavior change: none functionally; the function returns the same classifier choice for the same env var value. Performance: env-var read and Box allocation move from per-call to once-per-process.
  • User-visible effect: no perceptible change to a single user; reduces cycles + allocator pressure on agent prompt-screening.

Parity Contract

  • Legacy behaviour preserved: Yes. Same env var, same choice mapping, same None fallback. Only the return type changes, from an owned Box<dyn> to a borrowed &'static dyn; the single caller's pattern match works on both.
  • Guard/fallback/dispatch parity checks: N/A; no dispatch path touched.
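
The choice mapping that has to stay identical can be pinned down with a small table-driven check. This is a sketch, not a test from the PR: `heuristic_enabled`, `CASES`, and `first_mismatch` are hypothetical names, and the helper mirrors only the lowercase-then-match logic shown in the initializer, with `raw` modelling `env::var(...).ok()`.

```rust
/// Mirrors the initializer's mapping: default to "off", lowercase, then match.
/// `None` means the variable is unset.
fn heuristic_enabled(raw: Option<&str>) -> bool {
    let choice = raw.unwrap_or("off").to_ascii_lowercase();
    matches!(choice.as_str(), "heuristic")
}

/// (raw env value, whether a classifier is expected to be active)
const CASES: &[(Option<&str>, bool)] = &[
    (None, false),               // unset falls back to "off"
    (Some("off"), false),
    (Some("heuristic"), true),
    (Some("HEURISTIC"), true),   // matching is ASCII case-insensitive
    (Some("Heuristic"), true),
    (Some(" heuristic"), false), // the value is not trimmed
    (Some("ml"), false),         // unknown values map to None
    (Some(""), false),
];

/// Returns the first case whose result disagrees with the table, if any.
fn first_mismatch() -> Option<(Option<&'static str>, bool)> {
    CASES
        .iter()
        .copied()
        .find(|(raw, expected)| heuristic_enabled(*raw) != *expected)
}
```

A check like this holds for both the per-call and the cached version, since the refactor moves where the mapping runs and leaves the mapping itself alone.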

Duplicate / Superseded PR Handling

  • Duplicate PR(s): None. Overlap check at branch-push time confirmed zero open PRs touch src/openhuman/prompt_injection/.
  • Canonical PR: This one.
  • Resolution: N/A.

Summary by CodeRabbit

  • Refactor
    • Optimized prompt injection detector initialization to reduce redundant processing and improve performance efficiency.

… per call)

`optional_classifier()` previously read env + allocated `Box<dyn OptionalClassifier>` per prompt analysis even though the classifier choice is fixed at startup. Wrap it in `Lazy<>` (matching the file's other static rules) so resolution runs once and callers borrow a cached `&'static dyn` thereafter. Closes tinyhumansai#1943.
@aregmii requested a review from a team on May 16, 2026, 20:06
@coderabbitai (bot, Contributor) commented May 16, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7cc2e0a9-65e2-465d-8cfa-205b5847790b

📥 Commits

Reviewing files that changed from the base of the PR and between 36a0e73 and 90408fe.

📒 Files selected for processing (1)
  • src/openhuman/prompt_injection/detector.rs

Walkthrough

The optional_classifier() function is refactored to cache the resolved prompt-injection classifier in a static Lazy<Option<Box<dyn OptionalClassifier>>> that initializes once from the OPENHUMAN_PROMPT_INJECTION_CLASSIFIER environment variable, eliminating repeated allocations and env reads on the hot path, with debug logging of the resolved configuration.

Changes

Prompt Injection Classifier Caching

Layer / File(s) | Summary
Lazy classifier initialization with caching (src/openhuman/prompt_injection/detector.rs) | A module-level static OPTIONAL_CLASSIFIER uses Lazy to read OPENHUMAN_PROMPT_INJECTION_CLASSIFIER once and cache the optional classifier, with debug logging of the choice. The optional_classifier() function now borrows the static classifier via as_deref() instead of constructing a new boxed classifier per call.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A cache takes the burden of choice,
Once read, let the classifier rejoice,
No more allocations in sight,
Just borrows that flow so light,
One lazy static does the voice! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The PR title accurately reflects the main change: caching a classifier using Lazy instead of allocating a Box on each call, which directly summarizes the performance optimization.
Linked Issues check | ✅ Passed | The PR implements the core objective from issue #1943: caching classifier initialization to eliminate per-call env-var reads and heap allocations using Lazy, matching the suggested fix approach.
Out of Scope Changes check | ✅ Passed | All changes in detector.rs are narrowly scoped to refactoring optional_classifier initialization and the accessor function, directly addressing the performance issue without introducing unrelated modifications.


@senamakel merged commit b4fd252 into tinyhumansai:main on May 17, 2026 (25 checks passed)
@aregmii deleted the perf/prompt-injection-classifier-cache branch on May 17, 2026, 05:25

Development

Successfully merging this pull request may close these issues.

Perf: prompt_injection detector allocates Box<dyn Classifier> on every call

2 participants