Fix json encoding mismatch #698

Merged
eminano merged 4 commits into main from jsonb-fix-for-upstream
Jan 28, 2026

@eminano (Contributor) commented Jan 28, 2026

Same as #685 with linting fixed.

Thanks @VikramChennai 🙏

VikramChennai and others added 4 commits January 28, 2026 16:17
…tches

When pgx receives a map[string]any for JSONB columns, it re-serializes
using encoding/json, which can produce different output than Sonic
(used to parse wal2json). This mismatch causes 'invalid input syntax
for type json' errors on the target database for complex JSONB with
Unicode, emojis, or special escapes.

Fix: Pre-serialize JSONB/JSON columns to []byte using Sonic in
filterRowColumns() before passing to pgx. This ensures consistent
encoding throughout the pipeline.
- Only serialize map[string]any and []any types with Sonic
- Pass string values through unchanged (fixes schemalog snapshot)
- Add comprehensive tests for all JSONB value types
- Compact code for production readiness
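
The type rules listed in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code merged in the PR: `serializeJSONValueSketch`, `marshalFunc`, and `stdlibMarshal` are hypothetical names, and `encoding/json` stands in for Sonic only so that the example runs with the standard library alone. In the actual fix the encoder is Sonic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalFunc abstracts the JSON encoder so the sketch has no third-party
// dependency. In the PR the encoder is Sonic (for example sonic.Marshal).
type marshalFunc func(v any) ([]byte, error)

// stdlibMarshal is a stand-in encoder (assumption: used here only to keep
// the example self-contained).
func stdlibMarshal(v any) ([]byte, error) { return json.Marshal(v) }

// serializeJSONValueSketch is a hypothetical helper. It pre-serializes
// decoded JSON containers to []byte and returns every other value unchanged.
func serializeJSONValueSketch(v any, marshal marshalFunc) (any, error) {
	switch v.(type) {
	case map[string]any, []any:
		// Decoded JSON objects and arrays are encoded once, up front.
		b, err := marshal(v)
		if err != nil {
			return nil, fmt.Errorf("pre-serializing json value: %w", err)
		}
		return b, nil
	default:
		// Strings, nil, and anything else pass through untouched.
		return v, nil
	}
}

func main() {
	values := []any{
		map[string]any{"greeting": "héllo 🙂"},
		[]any{"a", 1.5, nil},
		`{"already":"encoded"}`,
		nil,
	}
	for _, v := range values {
		out, err := serializeJSONValueSketch(v, stdlibMarshal)
		if err != nil {
			panic(err)
		}
		if b, ok := out.([]byte); ok {
			fmt.Printf("[]byte  %s\n", b)
			continue
		}
		fmt.Printf("%T  %v\n", out, out)
	}
}
```

Taking the encoder as a parameter keeps the type switch independent of any particular JSON library, which is also what makes the sketch testable without Sonic.
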
@eminano eminano force-pushed the jsonb-fix-for-upstream branch from d2a1978 to b9a4985 Compare January 28, 2026 15:18
@eminano eminano requested a review from Copilot January 28, 2026 15:23
@github-actions

Merging this branch will increase overall coverage

| Impacted Packages | Coverage Δ | 🤖 |
|---|---|---|
| github.com/xataio/pgstream/pkg/wal/processor/postgres | 83.50% (+0.13%) | 👍 |

Coverage by file

Changed files (no unit tests)

| Changed File | Coverage Δ | Total | Covered | Missed | 🤖 |
|---|---|---|---|---|---|
| github.com/xataio/pgstream/pkg/wal/processor/postgres/postgres_wal_dml_adapter.go | 93.66% (+0.28%) | 142 (+6) | 133 (+6) | 9 | 👍 |

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/xataio/pgstream/pkg/wal/processor/postgres/jsonb_serialization_test.go

Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes a JSON encoding mismatch issue where JSONB columns were causing 'invalid input syntax for type json' errors in the target database. The problem occurred because pgx was re-serializing JSONB values using encoding/json, which could produce different output than Sonic (used to parse wal2json), especially for complex JSONB with Unicode, emojis, or special escapes.
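
To make the idea of "different output" concrete, the sketch below decodes JSON text into a generic Go value and re-encodes it with `encoding/json`, then compares the bytes. The helper name `roundTrip` is hypothetical. The three differences shown (HTML-sensitive characters escaped, large integers passing through `float64`, object keys sorted) are documented default behaviours of `encoding/json`; the PR does not say which specific difference triggered the error, and all three outputs here are still valid JSON, so this illustrates the mismatch rather than reproducing the bug.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip decodes raw JSON into a generic value and re-encodes it with
// encoding/json, mimicking a decode-then-re-serialize step in a pipeline.
func roundTrip(raw string) (string, error) {
	var v any
	if err := json.Unmarshal([]byte(raw), &v); err != nil {
		return "", err
	}
	out, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	inputs := []string{
		`{"html":"<b>&</b>"}`,        // <, >, & are escaped on output
		`{"n":12345678901234567890}`, // decoded as float64, precision is lost
		`{"b":1,"a":2}`,              // map keys are emitted in sorted order
	}
	for _, raw := range inputs {
		out, err := roundTrip(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("in:   %s\nout:  %s\nsame: %t\n\n", raw, out, raw == out)
	}
}
```

Serializing once with a single encoder and handing the resulting bytes downstream, as the PR does, avoids depending on two encoders agreeing byte for byte.
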

Changes:

  • Pre-serialize JSONB/JSON map and slice values using Sonic before passing to pgx
  • Add comprehensive tests for JSONB serialization with Unicode, emojis, and special characters
  • Ensure string values pass through unchanged to avoid double-encoding
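
The double-encoding concern in the last bullet can be shown in a few lines. This is a sketch under the assumption that a string arriving for a JSON/JSONB column already holds encoded JSON text; `marshalAlways` and `marshalUnlessString` are hypothetical names, and `encoding/json` is used only to keep the example dependency-free.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalAlways re-encodes every value, including strings.
func marshalAlways(v any) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// marshalUnlessString passes strings through unchanged, on the assumption
// that they already contain encoded JSON, and encodes everything else.
func marshalUnlessString(v any) (string, error) {
	if s, ok := v.(string); ok {
		return s, nil
	}
	return marshalAlways(v)
}

func main() {
	doc := `{"a":1}` // already-encoded JSON held in a Go string

	naive, err := marshalAlways(doc)
	if err != nil {
		panic(err)
	}
	kept, err := marshalUnlessString(doc)
	if err != nil {
		panic(err)
	}

	fmt.Println("re-encoded:  ", naive) // a JSON string wrapping the document
	fmt.Println("pass-through:", kept)  // the original JSON object
}
```

Re-encoding turns the object into a quoted JSON string, so the target column would store a string value instead of the intended object.
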

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| pkg/wal/processor/postgres/postgres_wal_dml_adapter.go | Added serializeJSONBValue function to pre-serialize JSONB/JSON columns with Sonic, integrated into filterRowColumns and buildWhereQuery |
| pkg/wal/processor/postgres/jsonb_serialization_test.go | Added comprehensive tests for JSONB handling including maps, arrays, strings, and WHERE clauses with Unicode and special characters |

@eminano eminano merged commit fd72d1c into main Jan 28, 2026
13 checks passed
@eminano eminano deleted the jsonb-fix-for-upstream branch January 28, 2026 15:27
@VikramChennai (Contributor) commented

Appreciate you guys being able to get this in!

Had very little time on my end to clean it up so much appreciated :)

@eminano eminano linked an issue Feb 9, 2026 that may be closed by this pull request

Linked issue: [Snapshot] Invalid input syntax for type json with valid value
