Raise ArgumentError on non-Hash/non-String patch_signals input #25

Closed
andriytyurnikov wants to merge 14 commits into starfederation:main from rubakas:feature/patch-signals-validate-input

Conversation

@andriytyurnikov

Summary

The case/when in ServerSentEventGenerator#patch_signals had no else, so passing nil, an Array, an Integer, or any other non-Hash/non-String value silently produced a malformed event:

event: datastar-patch-signals\n\n

That is just the header, with no data: signals line. The browser receives an effectively empty event, and the caller gets no indication that anything went wrong.

Fix

Add the missing else branch, which raises ArgumentError:

else
  raise ArgumentError,
        "patch_signals expects a Hash or a JSON-encoded String, got #{signals.class}"

The Dispatcher#patch_signals docstring already declares this contract:

# @param signals [Hash, String] signals to merge

This makes the runtime match the documented contract.
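
To make the before/after behavior concrete, here is a minimal sketch of the validation. The method name `patch_signals_event` and the exact data-line layout are assumptions made for illustration, not the SDK's actual implementation; only the else branch and its message come from this PR.

```ruby
require "json"

# Hypothetical stand-in for the SDK's patch_signals serialization.
# The method name and wire layout are assumptions for illustration.
def patch_signals_event(signals)
  lines = ["event: datastar-patch-signals"]

  case signals
  when Hash
    # A Hash is serialized to a single JSON line.
    lines << "data: signals #{JSON.dump(signals)}"
  when String
    # A String is assumed to be JSON already; emit one data line per line.
    signals.each_line(chomp: true) { |line| lines << "data: signals #{line}" }
  else
    # The branch this PR adds: fail loudly instead of emitting a header-only event.
    raise ArgumentError,
          "patch_signals expects a Hash or a JSON-encoded String, got #{signals.class}"
  end

  lines.join("\n") + "\n\n"
end
```

Without the else branch, `patch_signals_event(nil)` would fall through the case and return only the header followed by a blank line, which is the malformed event described in the summary.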

Tests

Four new specs cover the failure modes: nil, Array, Integer, and Symbol. Full suite: 104 examples, 0 failures.

Behavior change

Callers who were accidentally calling patch_signals(nil) or patch_signals(some_array) will now see an ArgumentError immediately instead of producing a silently malformed wire payload. This is the better failure mode: it surfaces caller bugs at the call site instead of leaving the browser with an effectively empty event at runtime.

Test plan

  • CI green
  • Existing patch_signals Hash and String paths still work
  • New ArgumentError specs all pass

Previously CI only exercised Ruby 4.0. Adds a build matrix so unit
tests and the Datastar SDK conformance suite run on every supported
Ruby. Also enables the workflow on pull_request events.

CI runs on ubuntu-latest. Without x86_64-linux locked, Bundler resolves
platform-specific gems at install time on CI, which can cause version
drift relative to local development. Locking the platform keeps CI
reproducible.

bundle install fails on Ruby 3.1 because console 1.34.2 (transitive
dep via async) requires ruby >= 3.2. The dependency tree's effective
floor is 3.2.

Adds an optional generator_class: keyword to Dispatcher.new (defaulting
to ServerSentEventGenerator) and routes the three internal generator
instantiations — stream_one, stream_many's connection generator, and
each per-stream generator inside spawned threads — through it.

Lets downstream wrappers layer behavior (logging, metrics, input
scrubbing, etc.) on top of the SDK by passing a subclass, instead of
monkey-patching the SDK class.
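
A rough sketch of how the injection point might be used. Both classes below are simplified stand-ins, not the SDK's real ones; only the `generator_class:` keyword and its default mirror the commit message, and `LoggingGenerator` is a hypothetical downstream wrapper.

```ruby
# Simplified stand-in for the SDK's generator (assumption, not the real class).
class ServerSentEventGenerator
  def initialize(stream)
    @stream = stream
  end

  def patch_signals(signals)
    @stream << "patch_signals #{signals}"
  end
end

# Simplified stand-in for the SDK's dispatcher. The real one instantiates
# generators in several places; here a single #stream method represents them.
class Dispatcher
  def initialize(generator_class: ServerSentEventGenerator)
    @generator_class = generator_class
  end

  def stream(out)
    yield @generator_class.new(out)
    out
  end
end

# Hypothetical wrapper that records each call before delegating.
class LoggingGenerator < ServerSentEventGenerator
  CALLS = []

  def patch_signals(signals)
    CALLS << signals.class
    super
  end
end

out = Dispatcher.new(generator_class: LoggingGenerator)
                .stream([]) { |sse| sse.patch_signals('{"count":1}') }
```

The subclass adds behavior through `super` and leaves the SDK class itself untouched, which is the point of passing the class in rather than reopening it.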
Before the fix, #redirect interpolated the URL raw inside a single-quoted JS
string:

  window.location = '#{url}'

A ' in url broke out of the literal, letting attacker-influenced
fragments execute as JS. The classic vulnerable pattern:

  datastar.redirect("/page?ref=#{params[:ref]}")

is a Rails idiom that's safe under Rails' own redirect_to, so the
mismatch is a footgun.

Fix: encode the URL with JSON.generate(ascii_only: true,
escape_slash: true). The output is a properly-quoted JS string
literal that:

  - escapes ", \, and control characters,
  - escapes U+2028 / U+2029 (which JavaScript engines predating
    ES2019 treat as line terminators inside string literals, whether
    delimited by " or '),
  - escapes / to \/ so a </script> substring in the URL can't
    prematurely close the surrounding <script> tag during HTML
    parsing (the parser does not recognize <\/script> as the end
    tag, while \/ is a no-op inside a JS string literal).

Output format change: window.location is now wrapped in double
quotes with JSON-style escapes instead of single quotes. Behavior
at runtime is unchanged for safe URLs.

6 new specs cover single quotes, double quotes, backslashes,
</script> breakout, U+2028, and U+2029.
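
The encoding step can be sketched as follows. `redirect_script` is a hypothetical helper name, and the SDK's actual #redirect may wrap the assignment differently; the `JSON.generate` options are the ones named in the commit message and assume a json gem recent enough to support `escape_slash`.

```ruby
require "json"

# Hypothetical helper illustrating the encoding described above.
def redirect_script(url)
  # JSON.generate returns a double-quoted literal with ", \ and control
  # characters escaped. ascii_only: true emits non-ASCII characters
  # (including U+2028 and U+2029) as \uXXXX escapes; escape_slash: true
  # turns / into \/ so "</script>" cannot appear verbatim in the output.
  literal = JSON.generate(url, ascii_only: true, escape_slash: true)
  "window.location = #{literal}"
end

# A single quote no longer terminates the literal, because the literal
# is now delimited by double quotes.
puts redirect_script("/page?ref=';alert(1);//")

# The closing tag is broken up by the escaped slash.
puts redirect_script("/x</script><script>alert(1)</script>")
```

Delegating to the JSON encoder avoids hand-maintaining an escape table, at the cost of the output-format change noted above.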
stream_one's lifecycle contract (handling_sync_errors) fires exactly
one of on_server_disconnect / on_client_disconnect / on_error per
stream — never a combination. stream_many violated this: its
ensure block always fired on_server_disconnect, even when an error
or client disconnect had already triggered the matching callback.

Consumers writing observability/cleanup logic against these
callbacks would see different behavior depending on whether they
called .stream once or many times — a subtle source of bugs.

Fix: track a completed_normally flag in stream_many's control
thread, set it false when handle_streaming_error runs, and only
fire on_server_disconnect in the ensure block when the flag is
still true.

2 new shared examples (run for both ThreadExecutor and
AsyncExecutor):
- streamer raises → on_error fires, on_server_disconnect doesn't
- client disconnects → on_client_disconnect fires, on_server_disconnect doesn't
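
The flag-based fix can be illustrated with a single-threaded reduction of the control flow. This is only a sketch: `run_stream` is a hypothetical name, IOError stands in for whatever the SDK treats as a client disconnect, and the real stream_many coordinates multiple threads.

```ruby
# Hypothetical, single-threaded reduction of the control flow described above.
# Callback names mirror the commit message; everything else is illustrative.
def run_stream(callbacks)
  completed_normally = true
  begin
    yield
  rescue IOError => e
    # Stand-in for a client disconnect (the SDK's exception types may differ).
    completed_normally = false
    callbacks[:on_client_disconnect]&.call(e)
  rescue StandardError => e
    completed_normally = false
    callbacks[:on_error]&.call(e)
  ensure
    # Fire only when neither of the other callbacks ran.
    callbacks[:on_server_disconnect]&.call if completed_normally
  end
end

fired = []
callbacks = {
  on_error: ->(_e) { fired << :on_error },
  on_client_disconnect: ->(_e) { fired << :on_client_disconnect },
  on_server_disconnect: -> { fired << :on_server_disconnect }
}

run_stream(callbacks) { raise "boom" }                   # error path
run_stream(callbacks) { raise IOError, "closed stream" } # client disconnect
run_stream(callbacks) { :ok }                            # normal completion
```

After the three runs, `fired` holds exactly one callback per stream, in the order on_error, on_client_disconnect, on_server_disconnect.
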
@andriytyurnikov
Author

Upstream CI hasn't fired on this PR (workflow is on: push only). Fork CI run on the same commit, green: https://github.com/rubakas/datastar-ruby/actions/runs/24992013161 (Ruby 4.0 on Linux + full Datastar SDK conformance suite). 104 examples, 0 failures — 4 new specs covering nil, Array, Integer, Symbol inputs, plus the existing Hash and String paths still work unchanged.

CI: run matrix against Ruby 3.2, 3.3, 3.4, 4.0
Lock x86_64-linux platform in Gemfile.lock
Allow Dispatcher to inject a custom generator_class
Make multi-stream disconnect callbacks mutually exclusive
Encode redirect URL as a safe JS string literal

Closes the SSE injection surface where attacker-controlled strings
reaching element bodies, script bodies, attribute values, scalar
option values, or array/hash option entries could forge SSE fields
the browser then dispatches as legitimate events. The WHATWG SSE
parser treats \r, \n, and \r\n all as line terminators.

Two scrubbers, applied at the API boundary:

- scrub_body: strip \r only (used for element/script bodies and
  String signal payloads). \n is preserved because the SDK splits
  bodies on \n to emit per-line `data:` fields.
- scrub_option: recursively strip \r and \n from option values
  (Strings, Arrays, Hashes). Option values are written as single
  `data:` lines, so any embedded line terminator forges a field.

Hash signals are unaffected — JSON.dump already escapes \r/\n in
string values, so the serialized payload is always a single safe
line.

execute_script attribute values are scrubbed pre-tag-construction
since they are interpolated raw and bypass patch_elements' splitter.

17 new specs covering element bodies, scalar/array/hash options,
String and Hash signals, remove_elements, execute_script (body
and attributes), redirect, and a top-level reproduction of the
issue's attack vector.
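
A minimal sketch of what the two scrubbers might look like. The method names come from the commit message, but the bodies are assumptions and the SDK's actual implementations may differ. In particular, this sketch scrubs Hash values only and leaves keys untouched.

```ruby
# Hypothetical implementation: strip \r but keep \n, since bodies are
# later split on \n into separate data: lines.
def scrub_body(value)
  value.to_s.delete("\r")
end

# Hypothetical implementation: strip both \r and \n, recursing into
# Arrays and Hash values. Other types pass through unchanged.
def scrub_option(value)
  case value
  when String then value.delete("\r\n")
  when Array  then value.map { |v| scrub_option(v) }
  when Hash   then value.transform_values { |v| scrub_option(v) }
  else value
  end
end

# An option value carrying an embedded line terminator would otherwise
# start a new SSE field on the wire.
forged = "#target\nevent: datastar-patch-signals"
safe = scrub_option(forged)
```

With the newline removed, `safe` is written as a single data: line, so the injected `event:` text arrives as inert option data rather than as a separate field.
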
Scrub CR/LF in SSE outputs to prevent field forging

The case/when in patch_signals had no else branch, so passing
nil, an Array, an Integer, or any other non-Hash/non-String value
silently produced an SSE event with only a header and no data:
line — a malformed patch-signals event the browser would receive
as effectively empty.

The Dispatcher docstring already declares the contract:
@param signals [Hash, String]. This makes the runtime match.

Test coverage for nil, Array, Integer, Symbol.
@andriytyurnikov andriytyurnikov force-pushed the feature/patch-signals-validate-input branch from 18abff3 to 994c107 Compare April 27, 2026 12:17
@andriytyurnikov andriytyurnikov deleted the feature/patch-signals-validate-input branch April 27, 2026 12:17