sanitizeUrlProtocols incomplete — ws://, wss://, smb://, irc://, ldap:// bypass safe-output URL sanitization #31710

@szabta89

Summary

sanitizeUrlProtocols() in actions/setup/js/sanitize_content_core.cjs uses a hardcoded blocklist that covers http, ftp, file, ssh, git, data, javascript, vbscript, about, mailto, and tel — but omits a range of modern/non-HTTP protocols including ws://, wss://, smb://, irc://, ldap://, ldaps://, magnet:, feed://, and rtsp://. The companion filter sanitizeUrlDomains() only processes https:// and // prefixes and also does not catch these schemes. AI-generated content containing such URLs passes through the full sanitizeContentCore pipeline unchanged and reaches the GitHub API verbatim.

Confirmed against v0.68.3 (sanitize_content_core.cjs SHA 159c2fed045bdd850374b084fe92182c9e31b147237944f41aecd765d068e685).

Affected Area

Output trust boundary — sanitizeUrlProtocols() / sanitizeContentCore in the safe-output processing pipeline (actions/setup/js/sanitize_content_core.cjs, ~line 178/210).

Reproduction Outline

  1. Locate sanitize_content_core.cjs from a v0.68.3 deployment (e.g., $RUNNER_TEMP/gh-aw/safeoutputs/sanitize_content_core.cjs).
  2. Call sanitizeUrlProtocols('smb://attacker.com/share') — observe the URL is returned unchanged (not redacted).
  3. Call sanitizeUrlDomains('ws://evil.com/socket', buildAllowedDomains()) — observe URL passes through.
  4. Call sanitizeContentCore('[click here](smb://attacker.com/share)') — observe the markdown link survives the full pipeline.
  5. Use a gh-aw workflow with a create_issue or add_comment safe-output and an AI prompt that produces smb:// or ws:// content — the URL appears verbatim in the created GitHub issue/comment body.
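Steps 2 through 4 can be scripted with a small probe harness. This is a minimal sketch: `findBypasses` and `PROBES` are hypothetical names introduced here, and the harness takes the sanitizer as a parameter so it makes no assumption about how `sanitize_content_core.cjs` exports its functions. The probe URLs are the ones listed in this report.

```javascript
// Probe URLs taken from this report.
const PROBES = [
  "smb://attacker.com/share",
  "ws://evil.com/socket",
  "irc://irc.libera.chat/#channel",
  "magnet:?xt=urn:btih:abc123",
  "ldap://evil.com/cn=admin",
];

// Hypothetical helper: returns every probe URL that is still present,
// character for character, in the sanitizer's output.
// `sanitize` is any function (string) => string.
function findBypasses(sanitize, probes = PROBES) {
  return probes.filter((url) => sanitize(url).includes(url));
}
```

If the module exposes `sanitizeUrlProtocols` (an assumption, not verified here), passing it as `sanitize` should return a non-empty array on v0.68.3 according to the behavior described in this report. An empty array means every probe was altered or redacted.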

Observed Behavior

smb://attacker.com/share, ws://evil.com/socket, irc://irc.libera.chat/#channel, magnet:?xt=urn:btih:abc123, and ldap://evil.com/cn=admin all pass through sanitizeUrlProtocols and the full sanitizeContentCore pipeline unchanged and are written to GitHub as active markdown links.

Expected Behavior

Every URL whose scheme is not https:// (or the protocol-relative // form) should be redacted, that is, replaced with (redacted), by sanitizeUrlProtocols(), matching the behavior already applied to the schemes on the existing blocklist. The most robust fix is to invert the logic and allowlist only https:// and // rather than blocklisting known-bad schemes, which eliminates the whole category of blocklist-incompleteness bugs.
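The inverted logic could look roughly like the sketch below. It is illustrative only and not the gh-aw implementation: `allowlistSanitize`, `ALLOWED_SCHEMES`, and both regular expressions are assumptions made for this example. It redacts any `scheme://` URL whose scheme is not on the allowlist, and separately handles markdown link destinations that use a scheme without `//` (such as `magnet:`). Protocol-relative `//` URLs carry no scheme and are left for the domain filter.

```javascript
// Illustrative allowlist; names and patterns are assumptions, not gh-aw code.
const ALLOWED_SCHEMES = new Set(["https"]);
const REDACTED = "(redacted)";

// Any "scheme://..." run. Scheme syntax follows RFC 3986:
// a letter followed by letters, digits, "+", "." or "-".
const AUTHORITY_URL = /\b([a-z][a-z0-9+.-]*):\/\/[^\s)\]>]*/gi;

// Markdown link destination "](scheme:...", which also covers
// schemes written without "//" such as magnet:.
const MD_LINK_DEST = /(\]\(\s*)([a-z][a-z0-9+.-]*):[^)\s]*/gi;

function isAllowed(scheme) {
  return ALLOWED_SCHEMES.has(scheme.toLowerCase());
}

function allowlistSanitize(text) {
  return text
    // Pass 1: redact scheme://... unless the scheme is allowlisted.
    .replace(AUTHORITY_URL, (match, scheme) =>
      isAllowed(scheme) ? match : REDACTED)
    // Pass 2: redact remaining link destinations with a disallowed scheme,
    // keeping the "](" prefix so the markdown stays well-formed.
    .replace(MD_LINK_DEST, (match, prefix, scheme) =>
      isAllowed(scheme) ? match : prefix + REDACTED);
}
```

With this approach a newly invented scheme is redacted by default instead of needing a blocklist update. The sketch does not cover bare scheme-without-`//` URLs outside markdown links, or angle-bracket autolinks; a real fix would need to decide how to treat those without redacting ordinary text such as `key:value`.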

Security Relevance

In a cross-prompt injection scenario, an attacker can craft issue content that causes the AI to echo back or generate an smb://attacker.com/share URL in a safe-output response. On Windows, any user who views the resulting GitHub issue and clicks the link triggers an outbound SMB request, potentially leaking NTLM credentials to an attacker-controlled Responder listener. ldap:// links in parsed output could also enable injection in downstream tooling that consumes them. The affected path is the documented sanitization gate for all AI-generated write operations.

Additional Context

If the current blocklist behavior is intentional for any of these schemes, that assumption should be documented explicitly (e.g., in the safe-outputs reference docs and code comments), as the existing documentation implies comprehensive URL filtering.

gh-aw version: v0.68.3

Original finding: https://github.com/githubnext/gh-aw-security/issues/2208

Generated by File Issue · ● 244.9K ·
