@bukzor
Contributor

@bukzor bukzor commented Oct 22, 2025

Summary

  • Fixes CDATA sections being silently ignored during XML-to-JSON conversion
  • CDATA content was lost because NodeToJSON only handled TextNode, not CharDataNode
  • Added CharDataNode handling to all text processing in jsonutil.go

Changes

  • Modified internal/utils/jsonutil.go to handle xmlquery.CharDataNode in all switch statements that process text
  • Added test TestCDATASupport demonstrating CDATA content is preserved in JSON output
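
The shape of the fix can be sketched as follows. This is a minimal, self-contained illustration and not the actual code in internal/utils/jsonutil.go: `NodeType`, `Node`, `collectTextBefore`, `collectTextAfter`, and `cdataRoot` are hypothetical stand-ins that mirror the xmlquery node kinds named above, so the example runs without the library.

```go
package main

import (
	"fmt"
	"strings"
)

// NodeType is a hypothetical stand-in for xmlquery.NodeType.
type NodeType int

const (
	ElementNode NodeType = iota
	TextNode
	CharDataNode // a CDATA section
	CommentNode
)

// Node is a simplified stand-in for xmlquery.Node.
type Node struct {
	Type     NodeType
	Data     string
	Children []*Node
}

// collectTextBefore mimics the old behaviour: only TextNode contributes
// text, so CDATA content is silently dropped.
func collectTextBefore(n *Node) string {
	var sb strings.Builder
	for _, c := range n.Children {
		switch c.Type {
		case TextNode:
			sb.WriteString(c.Data)
		}
	}
	return sb.String()
}

// collectTextAfter treats CharDataNode exactly like TextNode.
func collectTextAfter(n *Node) string {
	var sb strings.Builder
	for _, c := range n.Children {
		switch c.Type {
		case TextNode, CharDataNode:
			sb.WriteString(c.Data)
		}
	}
	return sb.String()
}

// cdataRoot builds the tree for <root><![CDATA[1 & 2]]></root>.
func cdataRoot() *Node {
	return &Node{
		Type:     ElementNode,
		Data:     "root",
		Children: []*Node{{Type: CharDataNode, Data: "1 & 2"}},
	}
}

func main() {
	root := cdataRoot()
	fmt.Printf("before: %q\n", collectTextBefore(root))
	fmt.Printf("after:  %q\n", collectTextAfter(root))
}
```

The only difference between the two functions is the extra `CharDataNode` label on the `case`, which is why a missing label is easy to overlook: the switch simply falls through without doing anything.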

Test Plan

  • ✅ Added failing test showing CDATA content was lost
  • ✅ Implemented fix by adding CharDataNode handling
  • ✅ Test now passes, CDATA content preserved
  • ✅ All existing tests still pass

Example

Before: echo '<root><![CDATA[1 & 2]]></root>' | xq -j printed an empty object; the CDATA content was dropped.
After: echo '<root><![CDATA[1 & 2]]></root>' | xq -j correctly outputs {"root":"1 & 2"}

🤖 Generated with Claude Code

@bukzor bukzor force-pushed the fix-160-cdata-support branch from f3653c5 to f54a854 on October 22, 2025 17:06
…JSON

CDATA sections were being silently ignored during XML-to-JSON conversion
because the NodeToJSON function only handled TextNode types, not CharDataNode.
This caused data loss when using the -j flag on XML with CDATA sections.

Changes:
- Add CharDataNode handling to all text processing in jsonutil.go
- Add test demonstrating CDATA content is preserved in JSON output

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@bukzor bukzor force-pushed the fix-160-cdata-support branch from f54a854 to 41aaa0e on October 22, 2025 17:10
@bukzor bukzor marked this pull request as ready for review October 22, 2025 17:14
bukzor added a commit to bukzor/xq that referenced this pull request Oct 22, 2025
PR sibprogrammer#162 revealed that missing node type handlers in switch statements
cause silent data loss - CDATA content was being dropped because
CharDataNode wasn't handled. This PR adds defensive programming to
prevent this entire class of bug.

Changes:
- Added exhaustive switch coverage for all xmlquery.NodeType values
- Added exhaustive switch coverage for xml.Token and html.TokenType
- Added exhaustive switch coverage for json.Token and json.Delim
- Added default cases with panic() to all exhaustive switches
- Added explicit documented defaults to intentionally non-exhaustive switches

This follows the Go stdlib pattern (runtime/panic.go, go/types) of using
panic("unreachable") for "should be impossible" states. If future xmlquery
versions add new node types, or if we overlook a handler, tests will fail
immediately rather than silently corrupting output.

All switches now explicitly handle defaults:
- 8 exhaustive switches panic on unknown values
- 7 non-exhaustive switches have documented intentional behavior
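
The two kinds of switch described above might look like the sketch below. The `NodeType` enum and the `describe` and `isText` functions are hypothetical and exist only for illustration; they are not taken from the referenced commit.

```go
package main

import "fmt"

// NodeType is a hypothetical stand-in for xmlquery.NodeType.
type NodeType int

const (
	DocumentNode NodeType = iota
	ElementNode
	TextNode
	CharDataNode
	CommentNode
)

// describe is an exhaustive switch: every known NodeType has a case, and
// the default panics so that an unhandled value fails loudly.
func describe(t NodeType) string {
	switch t {
	case DocumentNode:
		return "document"
	case ElementNode:
		return "element"
	case TextNode:
		return "text"
	case CharDataNode:
		return "cdata"
	case CommentNode:
		return "comment"
	default:
		panic(fmt.Sprintf("unreachable: unhandled NodeType %d", t))
	}
}

// isText is intentionally non-exhaustive; the default says so explicitly.
func isText(t NodeType) bool {
	switch t {
	case TextNode, CharDataNode:
		return true
	default:
		// Intentionally non-exhaustive: other node types carry no text.
		return false
	}
}

func main() {
	fmt.Println(describe(CharDataNode), isText(CharDataNode), isText(CommentNode))
}
```

The trade-off is that a panic turns a silent omission into a crash, which is desirable in tests but means a genuinely new node type would abort the program at runtime until a case is added.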

Added TestExhaustiveNodeTypeHandling to verify all node types are
processed without panicking. All existing tests pass.
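
TestExhaustiveNodeTypeHandling itself is not shown in this conversation. A check of that kind could be sketched as a loop over every node type that records which ones make a handler panic. All names below (`allNodeTypes`, `handleComplete`, `handleIncomplete`, `unhandled`) are hypothetical, and the sketch is written as a plain program instead of a `testing` test so that it is self-contained.

```go
package main

import "fmt"

// NodeType is a hypothetical stand-in for xmlquery.NodeType.
type NodeType int

const (
	DocumentNode NodeType = iota
	ElementNode
	TextNode
	CharDataNode
	CommentNode
)

// allNodeTypes must list every constant above for the check to be meaningful.
var allNodeTypes = []NodeType{DocumentNode, ElementNode, TextNode, CharDataNode, CommentNode}

// handleComplete has a case for every node type.
func handleComplete(t NodeType) {
	switch t {
	case DocumentNode, ElementNode, TextNode, CharDataNode, CommentNode:
		// handled
	default:
		panic("unreachable")
	}
}

// handleIncomplete forgets CharDataNode, reproducing the class of bug
// described above.
func handleIncomplete(t NodeType) {
	switch t {
	case DocumentNode, ElementNode, TextNode, CommentNode:
		// handled
	default:
		panic("unreachable")
	}
}

// unhandled returns the node types for which handle panics.
func unhandled(handle func(NodeType)) []NodeType {
	var missing []NodeType
	for _, t := range allNodeTypes {
		func() {
			defer func() {
				if recover() != nil {
					missing = append(missing, t)
				}
			}()
			handle(t)
		}()
	}
	return missing
}

func main() {
	fmt.Println("complete handler misses:", len(unhandled(handleComplete)))
	fmt.Println("incomplete handler misses:", len(unhandled(handleIncomplete)))
}
```

In a real test the `missing` slice would be asserted empty, so forgetting a case fails the test instead of silently dropping data.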

This is a non-breaking change - no API modifications, only defensive
additions to switch statements.

Related: sibprogrammer#162
@sibprogrammer sibprogrammer merged commit 41aaa0e into sibprogrammer:master Oct 27, 2025
1 check passed
@sibprogrammer
Owner

Thank you for your work!
