Background and motivation
The C# compiler just introduced support for union types. Sibling proposals for closed hierarchies and closed enums are not yet available in the compiler preview, so this issue is scoped to unions only. System.Text.Json support for closed hierarchies and closed enums will be tracked separately once their compiler features land.
This issue is a sub-issue of the broader STJ union/closed-types umbrella (#125449), extracting the API surface that is relevant today.
The goal is twofold:
- Provide out-of-the-box serialization support for simple union types whose union cases present no structural ambiguity (e.g. union Result(int, string)).
- Provide a case classifier abstraction for unions whose cases start with the same JSON token kind (e.g. union Pet(Cat, Dog) where both serialize as JSON objects). This abstraction is also extended to the existing polymorphic type infrastructure.
Design summary
Serialization: no discriminator
Union values serialize transparently — the wrapper produced by the C# union keyword is unpacked and the underlying case value is written using its own JSON contract. There is no envelope object, no $type field, no tagging of any kind:
union Result(int, string);
JsonSerializer.Serialize<Result>(new Result(42)); // 42
JsonSerializer.Serialize<Result>(new Result("hello")); // "hello"
This is a deliberate departure from the polymorphism support exposed via [JsonPolymorphic] / [JsonDerivedType], where derived types are written with a $type discriminator. Unions don't have a natural discriminator: any case can be picked by the union's constructors, and two distinct case constructors can produce equal values. Synthesising an artificial discriminator (e.g. the case type name) would push that arbitrary choice into the wire format and lock STJ into it forever. Users who want a tagged representation can keep using [JsonPolymorphic] directly, or attach a custom converter.
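For contrast, the tagged output of the existing polymorphism support can be reproduced today. This is a minimal sketch using only shipped APIs; Shape and Circle are hypothetical types introduced for illustration and are not part of this proposal:

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Serializing through the base type writes the discriminator as the first property.
Console.WriteLine(JsonSerializer.Serialize<Shape>(new Circle { Radius = 1 }));
// prints {"$type":"circle","Radius":1}

// Hypothetical hierarchy, for illustration only.
[JsonPolymorphic]
[JsonDerivedType(typeof(Circle), "circle")]
class Shape { }

class Circle : Shape
{
    public double Radius { get; set; }
}
```

A union wrapping the same Circle value would, under this proposal, omit the $type property and write only the case value's own contract.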
Deserialization: first-token dispatch
Without a discriminator on the wire, the converter has to recover the case type from the JSON value itself. The chosen mechanism is to look at a single thing — the first token of the value — and pick the unique union case whose declared type is compatible with that token kind. The mapping is fixed:
| JsonTokenType | Compatible case types |
| --- | --- |
| Number | numeric primitives (int, long, double, decimal, …) |
| String | string, DateTime, DateTimeOffset, Guid, TimeSpan, Uri, char, byte[], enums |
| True / False | bool |
| StartObject | objects and dictionaries |
| StartArray | arrays and collections |
| Null | null |
Selection is O(1) and does not buffer. union Result(int, string) works cleanly out of the box because int is the only case compatible with Number and string is the only case compatible with String.
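The selection rule can be sketched with shipped APIs only. The following is an illustration of the idea, not the proposed converter: TokenKindFor and PickCase are hypothetical helper names, the type checks cover only part of the table above, and the Null row is left out.

```csharp
#nullable enable
using System;
using System.Collections;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: the token kind a case type is expected to start with.
static JsonTokenType TokenKindFor(Type caseType)
{
    if (caseType == typeof(bool)) return JsonTokenType.True; // stands in for True / False
    if (caseType == typeof(int) || caseType == typeof(long) ||
        caseType == typeof(double) || caseType == typeof(decimal)) return JsonTokenType.Number;
    if (caseType == typeof(string) || caseType == typeof(DateTime) ||
        caseType == typeof(DateTimeOffset) || caseType == typeof(Guid) ||
        caseType == typeof(TimeSpan) || caseType == typeof(Uri) ||
        caseType == typeof(char) || caseType == typeof(byte[]) || caseType.IsEnum)
        return JsonTokenType.String;
    if (typeof(IDictionary).IsAssignableFrom(caseType)) return JsonTokenType.StartObject;
    if (typeof(IEnumerable).IsAssignableFrom(caseType)) return JsonTokenType.StartArray;
    return JsonTokenType.StartObject; // everything else is treated as a JSON object
}

// Hypothetical helper: returns the unique compatible case, or null if none or several match.
static Type? PickCase(JsonTokenType token, IReadOnlyList<Type> cases)
{
    if (token == JsonTokenType.False) token = JsonTokenType.True;
    Type? match = null;
    foreach (Type candidate in cases)
    {
        if (TokenKindFor(candidate) != token) continue;
        if (match is not null) return null; // more than one case claims this token kind
        match = candidate;
    }
    return match;
}

// union Result(int, string): only the first token of the payload is inspected.
var probe = new Utf8JsonReader("\"hello\""u8);
probe.Read();
Console.WriteLine(PickCase(probe.TokenType, new[] { typeof(int), typeof(string) }));
// prints System.String
```

This sketch scans the case list on every call to stay short; a real implementation could precompute one slot per token kind when the metadata is configured, which would give the O(1) selection described above.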
Ambiguous unions
The token-only rule is intentionally narrow, so a number of perfectly valid unions cannot be disambiguated by it:
- union Num(int, long): both cases are Number.
- union When(DateTime, DateTimeOffset): both cases are String.
- union Pet(Cat, Dog): both cases are StartObject.
For these unions, the metadata layer throws InvalidOperationException when the JsonTypeInfo is being configured, and the source generator emits the diagnostic SYSLIB1227 so the failure surfaces at compile time. The user is then expected to attach a custom JsonTypeClassifier that decides which case applies.
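The configuration-time check described above amounts to grouping the cases by token kind and rejecting any group with more than one member. A minimal sketch under stated assumptions: EnsureUnambiguous is a hypothetical name, the token kind of each case is passed in explicitly, and the exception message is invented for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical check: each token kind may be claimed by at most one union case.
static void EnsureUnambiguous(
    string unionName, IEnumerable<(Type CaseType, JsonTokenType Kind)> cases)
{
    foreach (var sameKind in cases.GroupBy(c => c.Kind))
    {
        if (sameKind.Count() > 1)
        {
            string names = string.Join(", ", sameKind.Select(c => c.CaseType.Name));
            throw new InvalidOperationException(
                $"Union '{unionName}' is ambiguous: {names} all start with {sameKind.Key}. " +
                "Attach a JsonTypeClassifier to disambiguate.");
        }
    }
}

// union Result(int, string): one case per token kind, so the check passes.
EnsureUnambiguous("Result", new[]
{
    (typeof(int), JsonTokenType.Number),
    (typeof(string), JsonTokenType.String),
});

// union Num(int, long): both cases are Number, so the check throws.
try
{
    EnsureUnambiguous("Num", new[]
    {
        (typeof(int), JsonTokenType.Number),
        (typeof(long), JsonTokenType.Number),
    });
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message);
}
```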
Why not structural matching or content sniffing?
Two natural-looking alternatives were considered and rejected as defaults:
- Structural matching, where the converter parses the value and tries each case type until one succeeds, would in principle resolve union Pet(Cat, Dog) automatically. But it requires unbounded read-ahead, costs O(n) on every value, and silently chooses an arm when more than one case structurally matches, which is precisely the case where the user most needs an error.
- Content sniffing for ambiguous string forms (e.g. attempting to parse "2024-05-01" as DateTime first, falling back to string) is culture-sensitive, security-sensitive (parser oracles on attacker-controlled input), and produces results that depend on which parsers happen to accept which inputs.
Both belong in user code, opted into via a JsonTypeClassifier. The default behaviour stays predictable, allocation-free, and refuses to guess.
Customization
Customization is exposed as a JsonTypeClassifier delegate produced by a JsonTypeClassifierFactory. The same abstraction also plugs into [JsonPolymorphic] types, so custom discriminator strategies for open hierarchies (e.g. "kind" instead of $type) and ambiguity resolution for unions share one extension point.
The simplest case writes the classifier inline against Utf8JsonReader:
class Dog { public string? Name { get; set; } public string? Breed { get; set; } }
class Cat { public string? Name { get; set; } public int Lives { get; set; } }
public sealed class PetClassifier : JsonTypeClassifierFactory
{
    public override JsonTypeClassifier CreateJsonClassifier(
        JsonTypeClassifierContext context, JsonSerializerOptions options)
    {
        // The reader passed to the classifier is a defensive copy positioned
        // at the start of the value, so it is safe to advance freely.
        return static (ref Utf8JsonReader reader) =>
        {
            if (reader.TokenType is not JsonTokenType.StartObject) return null;
            while (reader.Read() && reader.TokenType is not JsonTokenType.EndObject)
            {
                if (reader.TokenType is JsonTokenType.PropertyName)
                {
                    if (reader.ValueTextEquals("Breed"u8)) return typeof(Dog);
                    if (reader.ValueTextEquals("Lives"u8)) return typeof(Cat);
                    reader.Read();
                    reader.TrySkip();
                }
            }
            return null;
        };
    }
}
[JsonUnion(TypeClassifier = typeof(PetClassifier))]
union Pet(Cat, Dog);
A higher-level alternative parses the value into a JsonNode first — simpler to write, more allocation:
public sealed class JsonNodePetClassifier : JsonTypeClassifierFactory
{
    public override JsonTypeClassifier CreateJsonClassifier(
        JsonTypeClassifierContext context, JsonSerializerOptions options)
    {
        return static (ref Utf8JsonReader reader) =>
        {
            if (JsonNode.Parse(ref reader) is JsonObject obj)
            {
                if (obj.ContainsKey("Breed")) return typeof(Dog);
                if (obj.ContainsKey("Lives")) return typeof(Cat);
            }
            return null;
        };
    }
}
Because the factory runs once per JsonTypeInfo, expensive setup — building lookup tables, walking property metadata, capturing options-derived state — should happen there. The returned classifier delegate then runs on the hot path:
public sealed class PropertyBasedClassifier : JsonTypeClassifierFactory
{
    public override JsonTypeClassifier CreateJsonClassifier(
        JsonTypeClassifierContext context, JsonSerializerOptions options)
    {
        // Built once per JsonTypeInfo: maps a property name to the only candidate declaring it.
        var distinguishing = new Dictionary<string, Type>(StringComparer.OrdinalIgnoreCase);
        var shared = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (JsonDerivedType dt in context.CandidateTypes)
        {
            foreach (JsonPropertyInfo prop in options.GetTypeInfo(dt.DerivedType).Properties)
            {
                // A name declared by more than one candidate (e.g. Name on both Cat and Dog)
                // cannot tell them apart, so it is removed below.
                if (!distinguishing.TryAdd(prop.Name, dt.DerivedType))
                    shared.Add(prop.Name);
            }
        }
        foreach (string name in shared)
            distinguishing.Remove(name);

        // Hot path: return the candidate owning the first distinguishing property found.
        return (ref Utf8JsonReader reader) =>
        {
            if (reader.TokenType is not JsonTokenType.StartObject) return null;
            while (reader.Read() && reader.TokenType is not JsonTokenType.EndObject)
            {
                if (reader.TokenType is JsonTokenType.PropertyName &&
                    distinguishing.TryGetValue(reader.GetString()!, out Type? match))
                    return match;
                reader.Read();
                reader.TrySkip();
            }
            return null;
        };
    }
}
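The "kind" discriminator strategy mentioned at the start of this section can be sketched the same way. The function below has the shape of the proposed JsonTypeClassifier delegate but is written as a plain method over shipped Utf8JsonReader APIs so that it compiles today; ClassifyByKind, Circle, Square and the "circle" / "square" tag values are all hypothetical.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Hypothetical classifier body: finds a top-level "kind" property and maps its value to a type.
static Type? ClassifyByKind(ref Utf8JsonReader reader)
{
    if (reader.TokenType is not JsonTokenType.StartObject) return null;
    while (reader.Read() && reader.TokenType is not JsonTokenType.EndObject)
    {
        // At this point the reader is on a top-level property name.
        bool isKind = reader.ValueTextEquals("kind"u8);
        reader.Read(); // move to the property value
        if (isKind)
        {
            if (reader.TokenType is not JsonTokenType.String) return null;
            if (reader.ValueTextEquals("circle"u8)) return typeof(Circle);
            if (reader.ValueTextEquals("square"u8)) return typeof(Square);
            return null; // unknown tag value
        }
        reader.TrySkip(); // skip nested objects and arrays; a no-op on scalar values
    }
    return null;
}

// The discriminator does not have to be the first property.
var probe = new Utf8JsonReader("{\"side\":2,\"kind\":\"square\"}"u8);
probe.Read(); // position on StartObject, as the classifier expects
Console.WriteLine(ClassifyByKind(ref probe)); // prints Square

// Hypothetical candidate types, for illustration only.
class Circle { public double Radius { get; set; } }
class Square { public double Side { get; set; } }
```

Under the proposal, a factory would presumably return this logic as the delegate from CreateJsonClassifier and be attached through the TypeClassifier property on [JsonPolymorphic].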
Schema generation
JsonSchemaExporter emits an anyOf schema composed from JsonTypeInfo.UnionCases, with shared JsonSchemaType values hoisted to the parent. For example, union(string, int) produces:
{
"anyOf": [
{ "type": "string" },
{ "type": "integer" }
]
}
Schema output is classifier-invariant — the exporter does not invoke TypeClassifier, so swapping or removing a custom classifier does not change the generated schema. This mirrors [JsonPolymorphic], whose schema depends only on the registered DerivedTypes list and not on runtime discriminator resolution.
Source-generated metadata
For each compiler union, the source generator emits a CreateUnionInfo<T>(...) call populating JsonUnionInfoValues<T>. The case list is sorted most-derived-first using the topological-sort helpers shared with the rest of the source generator, so that the switch-based dispatch in the generated constructor and deconstructor never selects a base case before a derived one. For union Pet(Dog, Lab) where Lab : Dog:
var unionInfo = new JsonUnionInfoValues<Pet>
{
    UnionCases = new JsonUnionCaseInfo[]
    {
        new JsonUnionCaseInfo(typeof(Lab)), // most-derived first
        new JsonUnionCaseInfo(typeof(Dog)),
    },
    UnionConstructor = static (Type _, object? value) => value switch
    {
        Lab caseValue0 => new Pet(caseValue0),
        Dog caseValue1 => new Pet(caseValue1),
    },
    UnionDeconstructor = static (Pet value) => value switch
    {
        Lab caseValue0 => (typeof(Lab), caseValue0),
        Dog caseValue1 => (typeof(Dog), caseValue1),
        null => (null, null),
    },
};
The Type parameter of UnionConstructor is unused in the generated form because pattern matching on value is sufficient to pick the right union case constructor; it is preserved on the public delegate for hand-written contracts that may want to dispatch on case type explicitly.
API proposal
namespace System.Text.Json.Serialization;

// Classifier abstraction
public delegate Type? JsonTypeClassifier(ref Utf8JsonReader reader);

public sealed class JsonTypeClassifierContext
{
    public Type DeclaringType { get; }
    public IReadOnlyList<JsonDerivedType> CandidateTypes { get; }
    public string? TypeDiscriminatorPropertyName { get; }
}

public abstract class JsonTypeClassifierFactory
{
    public abstract JsonTypeClassifier CreateJsonClassifier(
        JsonTypeClassifierContext context, JsonSerializerOptions options);
}

// New attribute APIs
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct, AllowMultiple = false, Inherited = false)]
public sealed class JsonUnionAttribute : JsonAttribute
{
    [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicParameterlessConstructor)]
    public Type? TypeClassifier { get; set; }
}

public sealed partial class JsonPolymorphicAttribute
{
    [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicParameterlessConstructor)]
    public Type? TypeClassifier { get; set; }
}

namespace System.Text.Json.Serialization.Metadata;

// ---------------------------------------------------------------------------
// 4a. Public metadata surface for unions.
// Mirrors the per-kind shape used for objects/collections/dictionaries.
// ---------------------------------------------------------------------------
public sealed class JsonUnionCaseInfo
{
    public JsonUnionCaseInfo(Type caseType);
    public Type CaseType { get; }
}

public enum JsonTypeInfoKind
{
    // Existing
    // None = 0,
    // Object = 1,
    // Enumerable = 2,
    // Dictionary = 3,
    Union = 4,
}

public abstract partial class JsonTypeInfo
{
    public JsonTypeClassifier? TypeClassifier { get; set; }
    public IList<JsonUnionCaseInfo>? UnionCases { get; set; }
    public Func<Type, object?, object>? UnionConstructor { get; set; }
    public Func<object, (Type? CaseType, object? CaseValue)>? UnionDeconstructor { get; set; }
}

public sealed partial class JsonTypeInfo<T>
{
    public new Func<Type, object?, T>? UnionConstructor { get; set; }
    public new Func<T, (Type? CaseType, object? CaseValue)>? UnionDeconstructor { get; set; }
}

// Source generator only API surface
[EditorBrowsable(EditorBrowsableState.Never)]
public sealed class JsonUnionInfoValues<T>
{
    public IList<JsonUnionCaseInfo>? UnionCases { get; init; }
    public Func<Type, object?, T>? UnionConstructor { get; init; }
    public JsonTypeClassifier? TypeClassifier { get; init; }
    public Func<T, (Type? CaseType, object? CaseValue)>? UnionDeconstructor { get; init; }
}

[EditorBrowsable(EditorBrowsableState.Never)]
public static partial class JsonMetadataServices
{
    public static JsonTypeInfo<T> CreateUnionInfo<T>(JsonSerializerOptions options, JsonUnionInfoValues<T> unionInfo) where T : notnull;
}
Alternatives considered
- $type on union serialization: unions lack a natural discriminator, particularly when union cases are not objects. Discussed under Serialization: no discriminator.
- Structural matching / content sniffing as defaults: discussed under Why not structural matching or content sniffing?.
- Built-in classifier factories: deferred. The current design exposes only the abstraction; common policies can be added later as concrete factory subclasses without breaking existing code.
Risks
- Breaking changes: none — all new surface.
- Performance: default deserialization is O(1) and zero-buffering. Custom classifiers may opt into read-ahead, with the cost paid only when configured.
Prototype
Branch: json-unions — commit b5c3734.
The prototype additionally implements closed-hierarchy/closed-enum support and the InferDerivedTypes API surface, which are deliberately out of scope for this issue and will be proposed separately once the corresponding compiler features ship.