False Positive Analysis
Benign code that triggers a false positive:

    import { trace } from "@opentelemetry/api";
    import { redactPii, redactIncremental } from "./privacy/redaction";

    export async function streamAnswer(req, res, llm) {
      const span = trace.getActiveSpan();
      const userPrompt = await req.text();

      // Telemetry keeps only hashes and token counts, not raw prompt/completion text.
      span?.setAttribute("app.prompt.sha256", sha256(userPrompt));
      span?.setAttribute("gen_ai.usage.input_tokens", estimateTokens(userPrompt));

      const safePrompt = redactPii(userPrompt);
      const stream = await llm.responses.stream({ input: safePrompt });

      for await (const token of stream) {
        const safeToken = redactIncremental(token); // stateful chunk redaction before SSE write
        audit.info({ event: "llm_chunk", token_count: estimateTokens(safeToken) });
        res.write(`data: ${JSON.stringify({ delta: safeToken })}\n\n`);
      }
    }

Why this is a false positive:
The current skill correctly warns about prompt/completion logging and missing output filtering, but its evidence model does not distinguish raw GenAI telemetry from privacy-preserving telemetry. A review can over-flag an implementation that stores only prompt hashes, token counts, and redacted streamed chunks because the code still contains span, gen_ai, stream, audit, and chunk patterns.
The fix should not simply ban GenAI observability. It should require evidence that raw prompts, completions, retrieved context, tool arguments, and streamed deltas are either disabled, redacted before export, or stored only as non-reversible metadata.
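One way to make this distinction checkable is to classify telemetry attributes by key before flagging them. The sketch below is illustrative only: the allowlist contents, the hint substrings, and the function names are assumptions made for this example, not part of the skill or of any OpenTelemetry API.

```typescript
// Hypothetical allowlist of attribute keys that carry only non-reversible metadata.
const METADATA_ONLY_KEYS = new Set<string>([
  "app.prompt.sha256",
  "gen_ai.usage.input_tokens",
  "gen_ai.usage.output_tokens",
  "gen_ai.request.model",
]);

// Hypothetical key fragments that suggest raw content may be attached.
const CONTENT_BEARING_HINTS = ["prompt", "completion", "content", "tool.arguments", "retrieved"];

type AttributeVerdict = "metadata_only" | "content_bearing" | "unknown";

// The allowlist is checked first, so "app.prompt.sha256" is not flagged
// merely because its key contains the word "prompt".
function classifyAttribute(key: string): AttributeVerdict {
  if (METADATA_ONLY_KEYS.has(key)) return "metadata_only";
  if (CONTENT_BEARING_HINTS.some((hint) => key.includes(hint))) return "content_bearing";
  return "unknown";
}

// Returns the keys a reviewer still needs redaction or export evidence for.
function keysNeedingEvidence(attributes: Record<string, unknown>): string[] {
  return Object.keys(attributes).filter((key) => classifyAttribute(key) !== "metadata_only");
}
```

Checking the allowlist before the pattern hints is what keeps a hash-only attribute from being reported as raw prompt logging.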
Coverage Gaps
Missed variant 1: SSE token streaming bypasses final output redaction
    const stream = await openai.chat.completions.create({
      model: "gpt-4.1",
      messages,
      stream: true,
    });

    let full = "";
    for await (const part of stream) {
      const delta = part.choices[0]?.delta?.content ?? "";
      full += delta;
      response.write(`data: ${JSON.stringify({ delta })}\n\n`); // raw PHI/PII already sent
    }

    const safeFinal = redactPii(full); // too late for streamed clients
    saveConversation({ user_id, response: safeFinal });

Why it should be caught:
The skill requires output-side PII scanning, but it does not require the scanner to sit before every emission boundary. In streaming UX, the privacy boundary is each token/chunk written to SSE, WebSocket, or callback handlers, not just the final assembled response. A final redaction pass is insufficient once raw deltas have already been sent to browsers, mobile clients, logs, replay buffers, or CDN edge traces.
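A per-chunk gate could look roughly like the following. This is a minimal sketch, not the skill's or any library's implementation: `IncrementalRedactor`, the `EMAIL` pattern, and `redactChunks` are hypothetical names, and the approach only covers PII patterns that contain no whitespace (such as email addresses). It holds back the trailing partial word of each chunk so that a value split across chunk boundaries is redacted before anything is written to the client.

```typescript
// Assumed, deliberately simple email pattern; a real detector would cover more PII types.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

class IncrementalRedactor {
  // Trailing text that has not yet been followed by whitespace.
  private pending = "";

  // Returns only text that is safe to emit now; the unfinished last word is held back.
  push(chunk: string): string {
    this.pending += chunk;
    // Index of the last whitespace character, or -1 if there is none.
    const cut = this.pending.search(/\s(?=\S*$)/);
    if (cut < 0) return "";
    const ready = this.pending.slice(0, cut + 1);
    this.pending = this.pending.slice(cut + 1);
    return ready.replace(EMAIL, "[REDACTED_EMAIL]");
  }

  // Call once when the stream ends to release the held-back tail.
  flush(): string {
    const rest = this.pending;
    this.pending = "";
    return rest.replace(EMAIL, "[REDACTED_EMAIL]");
  }
}

// Example driver: each returned string is what would be written to SSE.
function redactChunks(chunks: string[]): string[] {
  const redactor = new IncrementalRedactor();
  const out = chunks.map((chunk) => redactor.push(chunk));
  out.push(redactor.flush());
  return out;
}
```

The trade-off is latency of up to one word per chunk, and a long run of output without whitespace is held until `flush()`. Patterns that can contain spaces, such as phone numbers or postal addresses, would need a longer look-back window than this sketch provides.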
Missed variant 2: tool-call arguments and retrieval snippets leak through traces/APM
Why it should be caught:
The current skill focuses on prompts, completions, vector stores, and conversation logs. Modern agent stacks also persist retrieval snippets, tool arguments, function-call results, chain-of-thought-like debug events, and GenAI span/event payloads. These traces can be exported to SaaS observability vendors with different retention, access control, and DPA terms than the primary application database.
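As an illustration of what a reviewer might look for, the sketch below redacts tool-call arguments before they are attached to any span or log record. The `SENSITIVE_KEYS` set, the placeholder formats, and the `redactForTrace` name are assumptions for this example; it does not model any particular tracing SDK.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Hypothetical list of argument names treated as sensitive regardless of value type.
const SENSITIVE_KEYS = new Set<string>(["email", "ssn", "phone", "address", "api_key"]);

// Produces a trace-safe copy: sensitive keys are dropped to a marker, other strings
// are reduced to their length, and numbers, booleans, and null pass through.
function redactForTrace(value: Json, key: string = ""): Json {
  if (SENSITIVE_KEYS.has(key.toLowerCase())) return "[REDACTED]";
  if (typeof value === "string") return `[string len=${value.length}]`;
  if (Array.isArray(value)) return value.map((item) => redactForTrace(item, key));
  if (value !== null && typeof value === "object") {
    const out: { [k: string]: Json } = {};
    for (const [childKey, childValue] of Object.entries(value)) {
      out[childKey] = redactForTrace(childValue, childKey);
    }
    return out;
  }
  return value;
}
```

Reducing free-text strings to a length keeps the trace useful for debugging payload shape while avoiding export of the content itself. A deployment that needs more detail in traces would have to justify it against the vendor's retention and DPA terms.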
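Missed variant 3: provider/framework callbacks capture raw content despite application-level redaction
Why it should be caught: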
The redaction parser protects the returned answer, but callback instrumentation can capture the raw prompt, retrieved context, intermediate messages, tool I/O, and raw model output before the parser runs. The skill should ask for callback/tracer privacy configuration, sampling policy, redaction hooks, and provider retention settings.
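One shape a redaction hook can take is a wrapper that sits between the framework and the callback handler, so the handler never sees raw text. The `LlmCallbackHandler` interface below is hypothetical and intentionally minimal; real frameworks expose different and larger callback surfaces, so treat this as a sketch of the ordering rather than an integration.

```typescript
// Hypothetical callback surface; not the API of any specific framework.
interface LlmCallbackHandler {
  onLlmStart?(prompt: string): void;
  onLlmEnd?(output: string): void;
}

// Wraps a handler so redaction runs before the handler observes any content.
function withRedaction(
  inner: LlmCallbackHandler,
  redact: (text: string) => string,
): LlmCallbackHandler {
  return {
    onLlmStart: (prompt: string) => {
      inner.onLlmStart?.(redact(prompt));
    },
    onLlmEnd: (output: string) => {
      inner.onLlmEnd?.(redact(output));
    },
  };
}
```

The evidence a review needs is that every registered handler is wrapped this way (or has content capture disabled), not only that the final answer passes through a redaction parser.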
Edge Cases
Streaming redaction must be incremental/stateful because PII can span token boundaries (john. + doe@example.com).
Some OpenTelemetry GenAI semantic-convention events can carry prompt/completion content; reviews should distinguish content-bearing events from token-count-only metrics.
Debug traces may be disabled in production but enabled in staging with production data copies; environment-specific evidence is needed.
APM vendors, prompt-management tools, and LLM evaluation platforms may store raw prompts/completions outside the app's retention system.
Tool traces can contain sensitive data even when the visible model response is clean.
Sampling is not a privacy control unless sampled traces are still redacted before export.
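The last point can be sketched as an export step in which sampling decides whether a record leaves the process and redaction decides what leaves. The `TraceRecord` shape, `prepareForExport`, and its parameters are hypothetical names chosen for this example, not an exporter API.

```typescript
interface TraceRecord {
  name: string;
  attributes: Record<string, string | number>;
}

// Sampling and redaction both run: every record that survives sampling
// still has its string attributes passed through the redactor.
function prepareForExport(
  records: TraceRecord[],
  shouldSample: (record: TraceRecord, index: number) => boolean,
  redactValue: (value: string) => string,
): TraceRecord[] {
  return records.filter(shouldSample).map((record) => {
    const attributes: Record<string, string | number> = {};
    for (const [key, value] of Object.entries(record.attributes)) {
      attributes[key] = typeof value === "string" ? redactValue(value) : value;
    }
    return { name: record.name, attributes };
  });
}
```

A pipeline that samples 1% of traces but exports them raw still exports raw PII for that 1%, which is why the sampler alone does not count as a control.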
Remediation Quality
Fix resolves the vulnerability
Fix doesn't introduce new security issues
Fix doesn't break functionality
Issues found: Add a dedicated streaming/telemetry privacy gate to the skill. The gate should require reviewers to identify every content emission/export boundary before claiming PII filtering is effective: SSE/WebSocket chunks, callbacks, provider dashboards, OpenTelemetry spans/events, prompt-management logs, tool-call traces, RAG context traces, eval datasets, and replay buffers.
Recommended evidence fields:
| Field | Purpose |
| --- | --- |
| streaming_redaction_boundary | Whether redaction occurs before each streamed delta is sent. |
| incremental_redaction_state | Whether PII spanning token/chunk boundaries is detected. |
| genai_trace_content_policy | Whether prompts/completions/tool args/RAG snippets are disabled, redacted, sampled, or exported raw. |
| trace_export_destinations | APM/LLMOps vendors receiving GenAI spans/events and their retention/DPA status. |
| callback_capture_points | Framework callbacks that see raw prompts, model outputs, tool I/O, and retrieved context. |
| tool_trace_sensitivity | Whether function-call args/results can include PII/PHI/secrets and how they are redacted. |
| environment_trace_parity | Whether dev/staging/prod differ in trace capture while sharing production-like data. |
| replay_buffer_retention | Whether streamed content is stored in browser/server replay queues or CDN logs. |
Suggested scoring guardrails:
If final-response redaction exists but streaming chunks are emitted raw first, mark output filtering as ineffective.
If raw prompt/completion/tool/RAG content is exported to traces without DPA/retention/access-control evidence, classify as at least High.
If telemetry stores only non-reversible hashes, token counts, model IDs, latency, and redacted snippets, do not flag it as raw PII logging.
If callbacks or LLMOps tools are present but their content-capture settings are unknown, mark the review Not Evaluable for telemetry privacy rather than passing it.
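The four guardrails above can be read as a small decision procedure. The sketch below encodes them under stated assumptions: the `TelemetryEvidence` fields are a simplified, hypothetical subset of the recommended evidence fields, the `Finding` shape is invented for illustration, and a "sampled" policy is treated the same as "raw" because sampling alone does not redact anything.

```typescript
type ContentPolicy = "disabled" | "redacted" | "sampled" | "raw" | "unknown";

interface TelemetryEvidence {
  streamsToClient: boolean;
  streamingRedactionBoundary: boolean; // redaction runs before each streamed delta
  finalResponseRedaction: boolean;
  genaiTraceContentPolicy: ContentPolicy;
  exportHasDpaRetentionAccessEvidence: boolean;
  callbacksPresent: boolean;
  callbackCaptureSettingsKnown: boolean;
}

interface Finding {
  area: "output_filtering" | "telemetry_privacy";
  result: "pass" | "fail" | "not_evaluable";
  severity?: "High";
  note: string;
}

function scoreTelemetryPrivacy(e: TelemetryEvidence): Finding[] {
  const findings: Finding[] = [];
  const policy = e.genaiTraceContentPolicy;

  // Guardrail 1: final redaction does not help once raw chunks were already emitted.
  if (e.streamsToClient && e.finalResponseRedaction && !e.streamingRedactionBoundary) {
    findings.push({ area: "output_filtering", result: "fail", note: "raw deltas emitted before final redaction" });
  }

  // Guardrail 4: unknown capture settings are Not Evaluable, never a pass.
  const settingsUnknown = policy === "unknown" || (e.callbacksPresent && !e.callbackCaptureSettingsKnown);
  if (settingsUnknown) {
    findings.push({ area: "telemetry_privacy", result: "not_evaluable", note: "content-capture settings unknown" });
  }

  // Guardrail 2: raw (or merely sampled) content export without supporting evidence is at least High.
  if ((policy === "raw" || policy === "sampled") && !e.exportHasDpaRetentionAccessEvidence) {
    findings.push({ area: "telemetry_privacy", result: "fail", severity: "High", note: "raw content exported without DPA/retention/access-control evidence" });
  }

  // Guardrail 3: metadata-only or redacted telemetry is not flagged as raw PII logging.
  if (!settingsUnknown && (policy === "disabled" || policy === "redacted")) {
    findings.push({ area: "telemetry_privacy", result: "pass", note: "metadata-only or redacted telemetry" });
  }

  // Raw export that does have supporting evidence is outside these four rules
  // and is left to reviewer judgement in this sketch.
  return findings;
}
```

Keeping each guardrail as an independent rule means one review can report both a Not Evaluable callback configuration and a High finding for raw trace export.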
Comparison to Other Tools
| Tool | Catches this? | Notes |
| --- | --- | --- |
| Semgrep | Partial | Custom rules can catch res.write(delta), span.set_attribute(prompt), or callback registration, but they need framework-specific patterns and cannot prove runtime redaction ordering alone. |
| CodeQL | Partial | Can model data flow from prompts to logs/HTTP writes in supported languages, but GenAI streaming callbacks and third-party tracer semantics need custom libraries. |
| Presidio / PII detectors | Partial | Useful for redaction, but they do not prove the detector runs before every stream/trace/export boundary. |
| OpenTelemetry GenAI conventions | Partial | Provides standard GenAI telemetry shape; reviews still need a policy for whether content-bearing events are disabled or redacted before export. |
Overall Assessment
Strengths:
The skill already covers training data privacy, prompt/completion PII, retention, memorization, EU AI Act data governance, and consent.
It correctly treats embeddings and model outputs as privacy-relevant data stores rather than assuming they are anonymous.
The output format already has room for regulatory references and severity.
Needs improvement:
The skill treats output filtering too much like a final response step; streaming systems need per-chunk emission gates.
It does not require a GenAI observability inventory, even though traces and callbacks often capture raw prompts, completions, retrieved context, and tool arguments.
It can over-flag privacy-preserving telemetry because it lacks a distinction between raw content export and metadata-only telemetry.
Priority recommendations:
Add a Streaming and Telemetry Boundary section before the final findings table.
Require reviewers to inventory SSE/WebSocket chunks, callbacks, trace/span/event attributes, prompt-management dashboards, tool traces, and replay buffers.
Add Not Evaluable/pass/fail output fields for redaction order, content-bearing telemetry, export destinations, retention, DPA coverage, and environment parity.
Sources Checked
Current skill reviewed: skills/ai-security/ai-data-privacy/SKILL.md
Skill Being Reviewed
Skill name: ai-data-privacy
Skill path: skills/ai-security/ai-data-privacy/
Bounty Info