sanitizeUrlProtocols incomplete — ws://, wss://, smb://, irc://, ldap:// bypass safe-output URL sanitization

## Summary

`sanitizeUrlProtocols()` in `actions/setup/js/sanitize_content_core.cjs` uses a hardcoded blocklist that covers `http`, `ftp`, `file`, `ssh`, `git`, `data`, `javascript`, `vbscript`, `about`, `mailto`, and `tel` — but omits a range of modern/non-HTTP protocols including `ws://`, `wss://`, `smb://`, `irc://`, `ldap://`, `ldaps://`, `magnet:`, `feed://`, and `rtsp://`. The companion filter `sanitizeUrlDomains()` only processes `https://` and `//` prefixes and also does not catch these schemes. AI-generated content containing such URLs passes through the full `sanitizeContentCore` pipeline unchanged and reaches the GitHub API verbatim.
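To make the failure mode concrete, here is a minimal model of a scheme blocklist. It is not the code in `sanitize_content_core.cjs`: the scheme list is copied from the paragraph above, while the regular expression and the name `blocklistSanitize` are assumptions made for illustration only.

```javascript
// Illustrative model only: NOT the implementation in sanitize_content_core.cjs.
// The scheme list mirrors the summary above; the regex itself is an assumption.
const BLOCKED_SCHEMES = [
  "http", "ftp", "file", "ssh", "git", "data",
  "javascript", "vbscript", "about", "mailto", "tel",
];

// Matches "<blocked scheme>:" plus the rest of the token, stopping at
// whitespace or a closing bracket so markdown link syntax stays intact.
const BLOCKED_URL = new RegExp(
  `\\b(?:${BLOCKED_SCHEMES.join("|")}):[^\\s)\\]]*`,
  "gi"
);

// Hypothetical helper: redacts only URLs whose scheme is on the blocklist.
function blocklistSanitize(text) {
  return text.replace(BLOCKED_URL, "(redacted)");
}

console.log(blocklistSanitize("ftp://example.com/x"));      // scheme is listed
console.log(blocklistSanitize("smb://attacker.com/share")); // scheme is not listed
```

Any scheme that is absent from the list falls through by construction, which is the class of gap this report describes.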

Confirmed against v0.68.3 (`sanitize_content_core.cjs` SHA `159c2fed045bdd850374b084fe92182c9e31b147237944f41aecd765d068e685`).

## Affected Area

Output trust boundary — `sanitizeUrlProtocols()` / `sanitizeContentCore` in the safe-output processing pipeline (`actions/setup/js/sanitize_content_core.cjs`, ~line 178/210).

## Reproduction Outline

1. Locate `sanitize_content_core.cjs` from a v0.68.3 deployment (e.g., `$RUNNER_TEMP/gh-aw/safeoutputs/sanitize_content_core.cjs`).
2. Call `sanitizeUrlProtocols('smb://attacker.com/share')` — observe the URL is returned unchanged (not redacted).
3. Call `sanitizeUrlDomains('ws://evil.com/socket', buildAllowedDomains())` and observe that the URL passes through unchanged.
4. Call `sanitizeContentCore('[click here](smb://attacker.com/share)')` — observe the markdown link survives the full pipeline.
5. Use a gh-aw workflow with a `create_issue` or `add_comment` safe-output and an AI prompt that produces `smb://` or `ws://` content — the URL appears verbatim in the created GitHub issue/comment body.
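Steps 2 to 4 can be scripted with a small harness along these lines. This is a sketch: `findBypasses` and `SAMPLES` are hypothetical names, and the commented usage assumes the module exports its functions under the names used in this report, which the sketch does not verify.

```javascript
// Sample inputs taken from this report.
const SAMPLES = [
  "smb://attacker.com/share",
  "ws://evil.com/socket",
  "irc://irc.libera.chat/#channel",
  "magnet:?xt=urn:btih:abc123",
  "ldap://evil.com/cn=admin",
  "[click here](smb://attacker.com/share)",
];

// Returns the samples that `sanitize` hands back unchanged.
function findBypasses(sanitize, samples = SAMPLES) {
  return samples.filter((input) => sanitize(input) === input);
}

// Usage against a real deployment (export names are an assumption):
//   const path = require("node:path");
//   const core = require(path.join(process.env.RUNNER_TEMP,
//     "gh-aw/safeoutputs/sanitize_content_core.cjs"));
//   console.log(findBypasses(core.sanitizeUrlProtocols));
//   console.log(findBypasses((s) =>
//     core.sanitizeUrlDomains(s, core.buildAllowedDomains())));
//   console.log(findBypasses(core.sanitizeContentCore));
```

A non-empty result from any of the three calls lists the inputs that survived that stage.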

## Observed Behavior

`smb://attacker.com/share`, `ws://evil.com/socket`, `irc://irc.libera.chat/#channel`, `magnet:?xt=urn:btih:abc123`, and `ldap://evil.com/cn=admin` all pass through `sanitizeUrlProtocols` and the full `sanitizeContentCore` pipeline unchanged and are written to GitHub as active markdown links.

## Expected Behavior

All URLs whose scheme is not `https://` or protocol-relative `//` should be redacted (replaced with `(redacted)`) by `sanitizeUrlProtocols()`, matching the behavior already applied to the schemes on the current blocklist. The most robust fix is to invert the logic and allowlist only `https://` and `//` rather than blocklisting known-bad schemes, which eliminates the whole category of blocklist-incompleteness bugs.
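One possible shape for the inverted logic is sketched below. It is not a proposed patch: `allowlistSanitize` and `ALLOWED_SCHEMES` are hypothetical names, and the sketch only covers authority-style URLs of the form `scheme://...`. Schemes written without `//`, such as `magnet:` or `mailto:`, are deliberately out of scope here and would still need separate handling.

```javascript
// Sketch only; names are hypothetical and not taken from the gh-aw codebase.
const ALLOWED_SCHEMES = new Set(["https"]);

// Scheme grammar follows RFC 3986 (a letter, then letters, digits, "+", "-", "."),
// followed by "://" and the rest of the token up to whitespace or a closing bracket.
const AUTHORITY_URL = /\b([a-z][a-z0-9+.-]*):\/\/[^\s)\]]*/gi;

// Redacts every scheme://... URL unless its scheme is on the allowlist.
// Protocol-relative "//host/path" references carry no scheme and are left alone.
function allowlistSanitize(text) {
  return text.replace(AUTHORITY_URL, (url, scheme) =>
    ALLOWED_SCHEMES.has(scheme.toLowerCase()) ? url : "(redacted)"
  );
}

console.log(allowlistSanitize("see ws://evil.com/socket and https://github.com/x"));
```

With this structure a newly invented scheme is rejected by default, so the list only has to change when a scheme should be permitted.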

## Security Relevance

In a cross-prompt injection scenario, an attacker can craft issue content that causes the AI to echo back or generate an `smb://attacker.com/share` URL in a safe-output response. On Windows, any user who views the resulting GitHub issue and clicks the link triggers an outbound SMB request, potentially leaking NTLM credentials to an attacker-controlled Responder listener. `ldap://` links in parsed output could enable injection in downstream tooling. The affected path is the documented sanitization gate for all AI-generated write operations.

## Additional Context

If the current blocklist behavior is intentional for any of these schemes, that assumption should be documented explicitly (e.g., in the safe-outputs reference docs and code comments), as the existing documentation implies comprehensive URL filtering.

**gh-aw version**: v0.68.3

Original finding: https://github.com/githubnext/gh-aw-security/issues/2208




> Generated by [File Issue](https://github.com/githubnext/gh-aw-security/actions/runs/25741778474/agentic_workflow) · ● 244.9K · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+githubnext%2Fgh-aw-security%2Ffile-issue%22&type=issues)


Issue: #31710
