fix: robustly parse THREAT_DETECTION_RESULT with literal newlines in reasons (#issue)#22982
Conversation
…reasons - Add extractResultFromText helper using brace-counting (no regex) to extract the complete JSON object even when reasons contain literal newlines - extractFromStreamJson now rejoins split lines and uses brace-counting instead of returning the first truncated line - parseDetectionLog raw mode uses same join+brace-count approach; falls back to trimmed line for non-object JSON (null, [], truncated) to preserve error messages - Normalize literal newlines to \n escape before JSON.parse to handle unescaped newlines introduced by outer stream-json decoder - Replace first-5/last-5 debug lines with line count + THREAT_DETECTION_RESULT lines - Add test for literal-newline-in-reasons case covering both extractFromStreamJson and full parseDetectionLog flow Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3c9f11d1-b986-441a-a4db-8bfbf6723a0f Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com>
…nd test extractResultFromText Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3c9f11d1-b986-441a-a4db-8bfbf6723a0f Co-authored-by: davidslater <12449447+davidslater@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes parsing of THREAT_DETECTION_RESULT entries when verdict JSON can be split/truncated due to literal newline characters appearing inside reasons strings (after decoding stream-json envelopes).
Changes:
- Added
extractResultFromText()brace-counting extractor to reliably capture the full verdict JSON object. - Updated stream-json and raw parsing paths to rejoin lines from the prefix onward and extract the complete JSON object.
- Added targeted tests for the new extractor and the stream-json literal-newline scenario; updated debug logging to focus on matching lines.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| actions/setup/js/parse_threat_detection_results.cjs | Adds brace-count extraction and updates extraction/parsing logic to handle embedded newlines; adjusts debug logging and exports new helper. |
| actions/setup/js/parse_threat_detection_results.test.cjs | Adds unit tests for extractResultFromText() and an end-to-end stream-json test for literal newline handling. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Extract the complete JSON object using brace-counting. | ||
| return extractResultFromText(joined); |
There was a problem hiding this comment.
extractFromStreamJson() now always delegates to extractResultFromText(joined) and returns null when a prefix line exists but the value is not a {...} object (e.g. THREAT_DETECTION_RESULT:null, [], or truncated JSON). In stream-json mode this means parseDetectionLog() will report "No THREAT_DETECTION_RESULT found" instead of surfacing a parse/type error with the raw value, which is a regression in diagnostics.
Consider preserving the old fallback behavior here: if extractResultFromText(joined) returns null, return the trimmed prefix line (or the joined string trimmed) so the downstream parser can emit the existing, more informative error messages.
| // Extract the complete JSON object using brace-counting. | |
| return extractResultFromText(joined); | |
| // Extract the complete JSON object using brace-counting. If extraction | |
| // fails (e.g. the value is "null", "[]", or truncated JSON), fall back to | |
| // returning the raw trimmed line(s) so downstream parsers can report a | |
| // precise parse/type error using the original value. | |
| const extracted = extractResultFromText(joined); | |
| return extracted !== null ? extracted : joined; |
| // Normalize literal newline characters to JSON escape sequences before parsing. | ||
| // When the outer stream-json decoder unescapes \n sequences, actual newline | ||
| // characters may end up inside JSON string values (e.g. in reasons entries). | ||
| // Replacing them with the two-character sequence \n restores valid JSON so | ||
| // that JSON.parse can handle them correctly. | ||
| const normalizedJson = jsonPart.split("\n").join("\\n"); | ||
| const parsed = JSON.parse(normalizedJson); |
There was a problem hiding this comment.
The newline normalization before JSON.parse currently replaces all \n characters in jsonPart with the two-character sequence \\n. That will break otherwise-valid JSON that uses newlines as whitespace between tokens (pretty-printed / multi-line JSON), because \\n is not valid outside of a string.
A safer approach is to only escape literal newline characters that occur inside JSON string literals (you can reuse the same inString/escaped state machine used by extractResultFromText), or attempt JSON.parse(jsonPart) first and only apply targeted normalization when parsing fails due to an unescaped control character in a string.
Summary
Fixes a bug in
parse_threat_detection_results.cjswhere a reasons array containing values with literal newlines (encoded as\nescapes in the outer stream-json envelope) would cause the verdict JSON to be silently truncated and fail to parse.Root cause: After
JSON.parsedecodes the outer stream-json envelope,\nescape sequences in theresultfield become actual newline characters (U+000A) in the JS string. Calling.split("\n")on this string would break the verdict JSON mid-value at those embedded newlines, returning a truncated line that failsJSON.parse.Changes
parse_threat_detection_results.cjsextractResultFromText(new helper): Character-by-character brace-counting (no regex) that extracts the complete JSON object from a string starting withTHREAT_DETECTION_RESULT:. Tracks string context and escape sequences so braces inside string values are ignored. Stops at the first matching closing brace, discarding any trailing content.extractFromStreamJson(fixed): Instead of returning the first line starting with the prefix (which may be truncated at an embedded newline), it now finds the prefix line by index, rejoins all subsequent lines with"\n", then delegates toextractResultFromTextto extract the complete JSON object.parseDetectionLograw mode (fixed): Applies the same join+brace-count approach for raw (non-stream-json) log entries. Falls back to the trimmed line whenextractResultFromTextreturnsnull(e.g.null,[], truncated JSON) so that existing error messages are preserved.Literal-newline normalization: Before calling
JSON.parse, replaces any remaining actual newline characters with\nescape sequences (split("\n").join("\\n")). This restores valid JSON for the edge case where embedded newlines survive the extraction step.Improved debug logging in
main(): Replaces the first-5/last-5 lines preview with the total line count (already in the stats line) and a focused listing of only lines containingTHREAT_DETECTION_RESULT.parse_threat_detection_results.test.cjsNew
describe("extractResultFromText"): 9 tests covering simple objects, trailing content truncation, nested objects, braces inside string values, escaped quotes, literal newlines in strings, non-object JSON (null/array/number/empty), and truncated JSON.New stream-json newline test: Verifies that a stream-json log line whose
resultfield contains a reason with a literal newline is parsed correctly end-to-end, covering bothextractFromStreamJsonoutput and the fullparseDetectionLogflow.Testing
All 44 tests pass (
make fmt-cjs+make lint-cjsclean)./cc @davidslater