fix(encoding/json): honor json tags on anonymous struct Unmarshal targets#146
fix(encoding/json): honor json tags on anonymous struct Unmarshal targets#146mck wants to merge 6 commits into
Conversation
…iners
Unmarshal only special-cased struct fields/targets whose runtime value was
already an instance of the right shape; anything else fell through to a
generic "objects become Maps" decode with no awareness of Go element
types. Three related gaps, all traced to the same missing piece — the
decoder needs to walk a field/element's declared TypeInfo, not just the
JSON shape it sees:
1. map[string]Struct / map[string][]Struct fields (and further nesting,
e.g. map[string][]Struct) left plain JS objects/arrays in place instead
of real struct instances, so generated struct-field accessors read back
undefined:
type Ref struct{ Name string `json:"name"` }
type Container struct {
Contexts map[string][]Ref `json:"contexts"`
}
json.Unmarshal([]byte(`{"contexts":{"a":[{"name":"x"}]}}`), &c)
// c.Contexts["a"][0].Name read back "" (plain object, not a Ref)
2. An interface{}-typed array of objects left its element objects as raw
plain JS objects instead of this override's Map representation, so a
`case map[string]any:` type switch on an element panicked calling
Map-only methods like .entries():
var v any
json.Unmarshal([]byte(`[{"x":1},{"y":2}]`), &v)
// v.([]any)[0] was a plain object, not the Map every other
// interface{}-typed object value decodes to
3. json.RawMessage wasn't recognized as a map/slice ELEMENT type (only as
a direct field type), and there was no Pointer branch in the
type-driven path at all, so pointer elements/fields
(map[string]*Property, Items *ItemsSpec) leaked plain objects with
undefined Go field reads.
Fix: a recursive decodeValueForType(decoded, typeInfo, opts) that decodes
a raw JSON value per its Go TypeInfo — Struct constructs a real instance
via the registered ctor, Slice/Map recurse per elemType, Pointer decodes
as its pointee, RawMessage captures the raw fragment as bytes, and
anything else (including interface{}) falls back to a new
decodeInterfaceValue that replaces objectToMap and additionally recurses
into array elements (fixing case 2 for nested and top-level arrays alike).
Wired in at both call sites: assignDecodedFieldValue (struct fields) and
assignDecodedValue's new top-level branch, which resolves a var's own
type via the __goType spelling the runtime boxes onto Unmarshal's `any`
target — needed for e.g. `var m map[string]json.RawMessage` where there
is no struct field to hang type info off of.
Extends gs/encoding/json/index.test.ts with one case per bug (map[string]
[]Struct field, interface{} array-of-objects, pointer-to-struct map value
+ field, and RawMessage as both a map and slice element). All four fail
on master and pass with this fix; verified by temporarily reverting just
the source change and confirming the new tests fail with the same errors
described above.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
Addresses review feedback on s4wave#145: pointer-to-struct fields/elements decoded to instances marked via markAsStructValue, but the runtime's pointer type matching (matchesPointerType in gs/builtin/type.ts) requires a struct pointee to stay UNMARKED, since marking is reserved for genuine non-pointer struct values. A decoded *Struct value therefore matched a Struct-by-value type assertion and failed a *Struct assertion, the inverse of correct Go semantics, which would break type switches and pointer-receiver method-set checks on JSON-decoded values. Three related spots, all part of the same gap: - decodeValueForType's Pointer branch recursed into its elemType and let the generic Struct branch mark the result. Now it constructs the pointee directly (mirroring the Struct branch's construction, minus the mark) for struct pointees, and only recurses for non-struct pointees (which carry no marking). - assignDecodedFieldValue's pointer-to-struct field routing called decodeValueForType with the field's elemType, bypassing the Pointer branch (and its now-correct handling) entirely. Now it passes the Pointer TypeInfo itself, so both call sites share one code path. - goTypeDescriptor unconditionally stripped every leading '*' while recursively parsing a __goType spelling, so a nested pointer element type (e.g. the *test.Property in map[string]*test.Property) lost its pointer-ness before a TypeInfo was ever built for it. Now a leading '*' produces a Pointer descriptor instead of being discarded; only the outermost address-of star (the Unmarshal target's own &var) is stripped, by unmarshalTargetGoType, before this function ever runs. Extends gs/encoding/json/index.test.ts with two cases verifying the decoded value matches *Struct and not Struct via the public $.is() API (one for inline struct-field TypeInfo, one for a __goType spelling with a nested pointer element). Both fail on the prior commit and pass with this fix, verified by temporarily reverting just the source change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR extends the gs/encoding/json Unmarshal implementation to honor json:"..." tags on anonymous (inline) struct targets by parsing the target’s __goType spelling into field metadata, and then decoding through the existing per-field/type-driven machinery. It also adds tests covering anonymous struct targets and anonymous structs as map element types.
Changes:
- Add parsing for inline
struct{...}type spellings (including tags) and decode anonymous struct targets/element types using parsed field metadata. - Extend type-driven decoding utilities (
resolveTypeInfo,decodeValueForType,decodeInterfaceValue) and wire them into Unmarshal assignment paths. - Add vitest coverage for anonymous struct decoding and related nested container decoding scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| gs/encoding/json/index.ts | Adds anonymous-struct type-spelling parsing and integrates it into Unmarshal’s type-driven decode/assignment paths. |
| gs/encoding/json/index.test.ts | Adds tests for anonymous struct Unmarshal targets and anonymous struct map element decoding. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const info = resolveTypeInfo(desc) | ||
| if (info !== null) { | ||
| if ( | ||
| ($.isSliceTypeInfo(info) && Array.isArray(decoded)) || | ||
| ($.isMapTypeInfo(info) && isPlainObject(decoded)) | ||
| ) { | ||
| target.value = decodeValueForType(decoded, desc, opts) | ||
| return | ||
| } | ||
| } |
There was a problem hiding this comment.
Good catch, fixed in 38181bf: assignDecodedValue's type-driven dispatch now also matches Pointer TypeInfo, reusing decodeValueForType's existing unmarked Pointer handling. Added a test with a top-level var p *pkg.T style target verifying it matches *Property, not Property.
| for (const field of fields) { | ||
| const jsonName = jsonFieldName(field.name, field.tag) | ||
| if ( | ||
| jsonName === '' || | ||
| !Object.prototype.hasOwnProperty.call(decoded, jsonName) | ||
| ) { | ||
| continue | ||
| } | ||
| const ref = $.varRef(target[field.name]) | ||
| assignDecodedFieldValue(ref, decoded[jsonName], opts, field.type) | ||
| target[field.name] = ref.value | ||
| } |
There was a problem hiding this comment.
Good catch, fixed in 38181bf: assignAnonymousStructFields now checks decoded keys against the field list's json names when disallowUnknownFields is set, same as assignStructFields does for named structs.
| if ($.isMapTypeInfo(info) && isPlainObject(decoded)) { | ||
| target.value = decodeValueForType(decoded, info, opts) | ||
| return | ||
| } | ||
| // Pointer-to-struct fields (e.g. Items *ItemsSpec) start as nil, so the | ||
| // generic path below has no zero-value instance to populate — construct | ||
| // the pointee via the type-driven decoder. Decode with the Pointer | ||
| // TypeInfo itself (not its elemType) so decodeValueForType's Pointer | ||
| // branch handles the pointer-vs-value marking correctly; see its | ||
| // comment. Pointer-to-basic fields keep the generic path (their | ||
| // representation is not a struct instance). | ||
| if ($.isPointerTypeInfo(info) && isPlainObject(decoded)) { |
There was a problem hiding this comment.
Good catch, fixed in 38181bf: assignAnonymousStructFields now constructs a by-value named-struct field directly via decodeValueForType whenever the field's type resolves to a real, ctor-bearing struct type, instead of falling through to the generic path that expects a pre-existing instance.
…nymous decode Addresses review feedback on s4wave#146: three related gaps left by the anonymous-struct decoder, all found by consistency-checking it against the pointer/disallowUnknownFields handling elsewhere in the file. 1. A top-level pointer-to-struct var, e.g. `var p *pkg.T` decoded via `json.Unmarshal(data, &p)`, fell through to the untyped decode and produced a Map instead of a real (unmarked) struct pointee. assignDecodedValue's type-driven dispatch now also matches Pointer TypeInfo, reusing decodeValueForType's existing (correctly unmarked) Pointer handling -- the same path pointer-typed fields and map/slice elements already went through. 2. assignAnonymousStructFields ignored opts.disallowUnknownFields, so Decoder.DisallowUnknownFields() correctly errored for named structs but silently accepted unknown keys on anonymous struct targets. Now checks decoded keys against the field list's json names the same way assignStructFields does for named structs. 3. A by-value named-struct field inside an anonymous struct (e.g. `struct{ Inner pkg.T "json:\"inner\"" }`) has no pre-existing instance for the generic decode path to populate into -- unlike a registered struct's fields, an anonymous struct's own fields start unset (the target object starts as {}). assignAnonymousStructFields now constructs such fields directly via decodeValueForType whenever the field's type resolves to a real, ctor-bearing struct type; an inline TypeInfo with no ctor (as used by struct fields elsewhere in this test file) is left on the existing generic path unchanged. Extends gs/encoding/json/index.test.ts with one case per gap. All three fail on the parent commit and pass with this fix, verified by temporarily reverting just the source change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
paralin
left a comment
There was a problem hiding this comment.
Requesting changes: nested anonymous struct fields still fall through to the generic interface decoder.
At gs/encoding/json/index.ts:1515, assignAnonymousStructFields only special-cases ctor-backed named struct fields before calling assignDecodedFieldValue. For a field like:
struct {
Inner struct {
Name string `json:"name"`
} `json:"inner"`
}parseAnonymousStructFields stores Inner’s type as another anonymous-struct descriptor, not a registered TypeInfo. resolveTypeInfo(field.type) returns null, so the field falls through and is decoded as a generic interface object: a Map keyed by the JSON tag (name) instead of the anonymous struct representation keyed by Go field name (Name). Nested DisallowUnknownFields is skipped for the same reason. A *struct{...} anonymous field has the same failure mode.
Focused repros on this head:
expected { Name: "Ada" }
received Map { "name" => "Ada" }
Fix direction: before the generic assignDecodedFieldValue fallback, route anonymous field descriptors through decodeValueForType(fieldValue, field.type, opts). For pointer descriptors, let decodeValueForType handle *struct{...} as well; its pointer branch can recurse into the anonymous descriptor and preserve the parsed json tags.
Addresses paralin's review on s4wave#146: a field whose own type is itself an inline struct{...} (or *struct{...}) still fell through to the generic interface{} decoder and came back as a Map keyed by the JSON tag instead of the anonymous struct representation keyed by the Go field name. Nested DisallowUnknownFields was skipped for the same reason. parseAnonymousStructFields stores such a field's type as another anonymous-struct descriptor (or a Pointer TypeInfo wrapping one), not a registered TypeInfo, so resolveTypeInfo(field.type) returns null for it, the same as for a missing type. assignAnonymousStructFields therefore left it on the generic assignDecodedFieldValue path, which treats a plain JSON object as an interface{} value (a Map). assignAnonymousStructFields now routes anonymous field descriptors (struct{...} directly, or *struct{...} via a new isAnonymousStructType helper) through decodeValueForType before the generic fallback. decodeValueForType recurses into assignAnonymousStructFields (or, for the pointer case, its Pointer branch falls through to the same anonymous decode for a non-struct-typeinfo pointee), so the field's own parsed json tags and disallowUnknownFields are honored. Extends gs/encoding/json/index.test.ts with three cases (nested struct{...} field, nested *struct{...} field, nested disallowUnknownFields). All three fail on the parent commit and pass with this fix. Verify: bun run typecheck && vitest run gs/encoding/json/index.test.ts Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
|
Fixed in dea5a9e. A field whose own type is itself an inline struct{...} (or *struct{...}) had its type stored as another anonymous-struct descriptor by parseAnonymousStructFields, so resolveTypeInfo(field.type) returned null and it fell through to the generic assignDecodedFieldValue path, decoding as an interface{} Map keyed by the JSON tag with nested DisallowUnknownFields skipped. assignAnonymousStructFields now routes anonymous field descriptors (struct{...} directly, or *struct{...} via a new isAnonymousStructType helper) through decodeValueForType before that fallback, which recurses into assignAnonymousStructFields (the pointer case falls through its Pointer branch to the same anonymous decode for a non-struct-typeinfo pointee), so the parsed json tags and disallowUnknownFields are honored. Added three regression tests (nested struct{...} field, nested *struct{...} field, nested disallowUnknownFields) that fail on the parent commit and pass here. Verify: bun run typecheck && vitest run gs/encoding/json/index.test.ts |
paralin's review on s4wave#145 flagged that the pointer-to-struct decode path added in this branch bypasses UnmarshalJSON: assignDecodedFieldValue routed straight into decodeValueForType and returned, so a *T field whose T implements UnmarshalJSON got field-by-field populated into a fresh instance instead of decoded through the custom hook Go's encoding/json would always prefer. Two spots needed the hook check, matching the fix direction from the review: - assignDecodedFieldValue now checks unmarshalJSONTarget(target.value) before routing into the type-driven constructor. A field that already holds a non-nil pointer whose pointee implements UnmarshalJSON falls through to assignDecodedValue, which invokes the hook on that existing instance in place (matching Go, which decodes into the pointee rather than allocating a replacement). - decodeValueForType's Pointer branch now checks the freshly allocated pointee for UnmarshalJSON before falling back to assignStructFields. This covers the nil-field case (nothing exists yet to check the hook against until after allocation) and also map/slice pointer-to-struct elements, which go through this same branch. Adds two regression tests to gs/encoding/json/index.test.ts: a nil pointer field decoding through the hook, and a non-nil pointer field decoding into the existing pointee in place (checked via object identity, not just field content). Both fail on the prior commit and pass with this fix, verified by temporarily reverting just the source change. Verify: bunx vitest run gs/encoding/json/index.test.ts Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
…gets Part of s4wave#142 (item 5). Stacked on s4wave#145 (fix/json-unmarshal-typed-containers): both need the __goType-spelling parser (goTypeDescriptor) and the type-driven decoder (decodeValueForType) that PR introduces, so this PR extends them rather than duplicating them. Please merge s4wave#145 first. Unmarshal only knew how to populate a target carrying struct-field runtime metadata (a registered, named struct type). An anonymous (inline, unnamed) struct target carries no such metadata: var loose struct { Name string `json:"name"` Age int `json:"age"` } json.Unmarshal([]byte(`{"name":"Ada","age":30}`), &loose) // loose.Name and loose.Age read back "" and 0 -- every field empty so it fell through to the untyped interface{} decode, which replaces the plain-object struct value with this override's Map representation entirely, dropping every field. Fix: an anonymous struct target still carries its Go type spelling (including field names, types, and json tags) via __goType, e.g. `*struct{Name string "json:\"name\""; Age int "json:\"age\""}`. Parses it into the same fieldMetadata shape a registered struct's fields use (parseAnonymousStructFields, splitting on the tag's Go-quoted string literal so a ';' inside a tag can't split a field early), then decodes each field through the existing per-field path (assignDecodedFieldValue/assignAnonymousStructFields) -- tags, nested slices/maps, and RawMessage fields included. Wired into both assignDecodedValue (a top-level anonymous-struct var) and decodeValueForType (an anonymous struct used as a map/slice element type, e.g. `map[string]struct{...}`). Extends gs/encoding/json/index.test.ts with two cases: a top-level anonymous struct target, and one used as a map value type. Both fail on the parent commit (empty fields / a Map instead of the plain object) and pass with this fix. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
…nymous decode Addresses review feedback on s4wave#146: three related gaps left by the anonymous-struct decoder, all found by consistency-checking it against the pointer/disallowUnknownFields handling elsewhere in the file. 1. A top-level pointer-to-struct var, e.g. `var p *pkg.T` decoded via `json.Unmarshal(data, &p)`, fell through to the untyped decode and produced a Map instead of a real (unmarked) struct pointee. assignDecodedValue's type-driven dispatch now also matches Pointer TypeInfo, reusing decodeValueForType's existing (correctly unmarked) Pointer handling -- the same path pointer-typed fields and map/slice elements already went through. 2. assignAnonymousStructFields ignored opts.disallowUnknownFields, so Decoder.DisallowUnknownFields() correctly errored for named structs but silently accepted unknown keys on anonymous struct targets. Now checks decoded keys against the field list's json names the same way assignStructFields does for named structs. 3. A by-value named-struct field inside an anonymous struct (e.g. `struct{ Inner pkg.T "json:\"inner\"" }`) has no pre-existing instance for the generic decode path to populate into -- unlike a registered struct's fields, an anonymous struct's own fields start unset (the target object starts as {}). assignAnonymousStructFields now constructs such fields directly via decodeValueForType whenever the field's type resolves to a real, ctor-bearing struct type; an inline TypeInfo with no ctor (as used by struct fields elsewhere in this test file) is left on the existing generic path unchanged. Extends gs/encoding/json/index.test.ts with one case per gap. All three fail on the parent commit and pass with this fix, verified by temporarily reverting just the source change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
Addresses paralin's review on s4wave#146: a field whose own type is itself an inline struct{...} (or *struct{...}) still fell through to the generic interface{} decoder and came back as a Map keyed by the JSON tag instead of the anonymous struct representation keyed by the Go field name. Nested DisallowUnknownFields was skipped for the same reason. parseAnonymousStructFields stores such a field's type as another anonymous-struct descriptor (or a Pointer TypeInfo wrapping one), not a registered TypeInfo, so resolveTypeInfo(field.type) returns null for it, the same as for a missing type. assignAnonymousStructFields therefore left it on the generic assignDecodedFieldValue path, which treats a plain JSON object as an interface{} value (a Map). assignAnonymousStructFields now routes anonymous field descriptors (struct{...} directly, or *struct{...} via a new isAnonymousStructType helper) through decodeValueForType before the generic fallback. decodeValueForType recurses into assignAnonymousStructFields (or, for the pointer case, its Pointer branch falls through to the same anonymous decode for a non-struct-typeinfo pointee), so the field's own parsed json tags and disallowUnknownFields are honored. Extends gs/encoding/json/index.test.ts with three cases (nested struct{...} field, nested *struct{...} field, nested disallowUnknownFields). All three fail on the parent commit and pass with this fix. Verify: bun run typecheck && vitest run gs/encoding/json/index.test.ts Signed-off-by: Marco Christian Krenn <marco.krenn@gmail.com>
dea5a9e to
16e16f4
Compare
|
Rebased this branch onto the latest #145 (c18d36a, paralin's fix so a non-nil *T implementing UnmarshalJSON decodes through the hook, both when pre-populated and when freshly allocated). The two changes coexist cleanly: the custom UnmarshalJSON hook check runs first in both assignDecodedFieldValue and decodeValueForType's Pointer branch, and everything that isn't a custom unmarshaler still falls through to the anonymous-struct and type-driven handling added here. Ran the full JS suite, the json package tests specifically (34 passing, covering both PRs' regressions), typecheck, lint, and go test ./... — all green. Since this was rebased on #145's tip, #145 should merge first, then this. |
| const space = spec.indexOf(' ') | ||
| if (space < 0) { | ||
| continue | ||
| } | ||
| const name = spec.slice(0, space) |
| // parseAnonymousStructFields parses the field list of an inline struct type | ||
| // spelling ("Name Type" or "Name Type \"tag\"" decls separated by ';') into | ||
| // fieldMetadata. Embedded fields (no explicit name) are skipped: there is no | ||
| // way to address their promoted keys from this spelling alone. | ||
| function parseAnonymousStructFields(body: string): fieldMetadata[] { |
paralin
left a comment
There was a problem hiding this comment.
Requesting changes: anonymous struct parsing currently treats unexported fields as JSON fields.
At gs/encoding/json/index.ts:1394, parseAnonymousStructFields pushes every named field into fieldMetadata:
const name = spec.slice(0, space)
fields.push({
key: name,
name,
type: goTypeDescriptor(spec.slice(space + 1).trim()),
...(tag !== undefined ? { tag } : {}),
})Go's encoding/json ignores unexported fields even when they have a json:"..." tag. Current code populates lowercase fields and also includes them in the DisallowUnknownFields known-name set, so strict mode accepts keys Go would reject.
Focused repros on current head:
expected { name: "keep", Age: 30 }
received { name: "Ada", Age: 30 }
and strict mode:
expected json: unknown field "name"
received undefined
Fix direction: parseAnonymousStructFields should skip unexported named fields before adding them to fieldMetadata. Then ordinary decode leaves lowercase fields untouched, and DisallowUnknownFields rejects JSON keys that only match unexported fields.
I also re-ran the existing focused JSON test and typecheck on the current head:
bun run vitest run gs/encoding/json/index.test.ts
# 34 passed
bun run typecheck
# passed
Part of #142 (item 5).
Stacked on #145 (fix/json-unmarshal-typed-containers): this branch is #145's branch plus its own commits, because this fix extends the __goType-spelling parser (
goTypeDescriptor) and the type-driven decoder (decodeValueForType) that #145 introduces, rather than duplicating them in an independent branch. GitHub doesn't let a cross-fork PR target another fork branch directly, so this is opened against master and shows #145's diff too until #145 merges (after which it auto-shrinks to just the new commits). Please review/merge #145 first, then review only this branch's own two commits (visible individually under the Commits tab):37040341(the item-5 fix) and38181bf7(three follow-up fixes from review, described below).Unmarshalonly knew how to populate a target carrying struct-field runtime metadata (a registered, named struct type). An anonymous (inline, unnamed) struct target carries no such metadata, so it fell through to the untypedinterface{}decode, which replaces the plain-object struct value with this override's Map representation entirely, dropping every field.An anonymous struct target still carries its Go type spelling (including field names, types, and json tags) via
__goType, e.g.*struct{Name string "json:\"name\""; Age int "json:\"age\""}. The fix parses it into the samefieldMetadatashape a registered struct's fields use (parseAnonymousStructFields, splitting on the tag's Go-quoted string literal so a;inside a tag can't split a field early), then decodes each field through the existing per-field path (assignDecodedFieldValue/assignAnonymousStructFields), so tags, nested slices/maps, and RawMessage fields are all included for free. Wired into bothassignDecodedValue(a top-level anonymous-struct var) anddecodeValueForType(an anonymous struct used as a map/slice element type, e.g.map[string]struct{...}).Review also found three related gaps, fixed in the follow-up commit
38181bf7. A top-level pointer-to-struct var (var p *pkg.T) fell through to the untyped decode instead of the same unmarked-pointee Pointer handling used elsewhere;assignDecodedValue's type-driven dispatch now also matches Pointer TypeInfo.assignAnonymousStructFieldsignoredopts.disallowUnknownFields, unlike named-struct decode; it now enforces it the same way. And a by-value named-struct field inside an anonymous struct had no pre-existing instance for the generic decode path to populate into (anonymous structs start with unset fields, unlike registered structs' pre-initialized zero-value instances); it is now constructed directly when the field's type resolves to a real, ctor-bearing struct type.Tests extend
gs/encoding/json/index.test.tswith five cases: a top-level anonymous struct target, one used as a map value type, a top-level pointer-to-struct var,disallowUnknownFieldson an anonymous target, and a by-value named-struct field inside an anonymous struct. Each fails on the commit before its fix and passes with it.Verify:
29 passed. Full suite:
bun run typecheckclean,bun run test:js1282 passed / 4 skipped / 0 failed,go test ./...all packages green,bun run lint:jsclean.Found while transpiling and running a real ~100k-LOC Go interpreter/parser package through goscript.