A fast JSON (de)serializer, built on Sigil with a number of somewhat crazy optimization tricks.
Releases are available on NuGet in addition to this repository.
using(var output = new StringWriter())
{
    JSON.Serialize(
        new
        {
            MyInt = 1,
            MyString = "hello world",
            // etc.
        },
        output
    );
}
There is also a Serialize method that returns a string.
The first time Jil is used to serialize a given configuration and type pair, it will spend extra time building the serializer. Subsequent invocations will be much faster, so if a consistently fast runtime is necessary in your code you may want to "prime the pump" with an earlier "throw away" serialization.
If you need to serialize types that are unknown at compile time (including subclasses and virtual properties), use JSON.SerializeDynamic instead.
JSON.SerializeDynamic does not require a generic type parameter, and can cope with subclasses, object/dynamic members, and DLR-participating types such as ExpandoObject and DynamicObject.
using(var input = new StringReader(myString))
{
    var result = JSON.Deserialize<MyType>(input);
}
There is also a Deserialize method that takes a string as input.
The first time Jil is used to deserialize a given configuration and type pair, it will spend extra time building the deserializer. Subsequent invocations will be much faster, so if a consistently fast runtime is necessary in your code you may want to "prime the pump" with an earlier "throw away" deserialization.
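The "prime the pump" advice above can be sketched as a warm-up routine run once at startup. This is only an illustration: the Question type and the JilWarmup class are hypothetical names, and it assumes the string-based Serialize and Deserialize overloads mentioned above. The warm-up must use the same type and Options as the calls you care about, since serializers are built per pair.

```csharp
using System;
using Jil;

// Hypothetical DTO used only for this illustration.
public class Question
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public static class JilWarmup
{
    // Call once at application startup, before latency matters.
    public static void Prime()
    {
        // The first call for a (type, Options) pair builds the serializer;
        // the resulting string is only reused to warm the deserializer.
        string json = JSON.Serialize(new Question { Id = 1, Title = "warm up" });

        // The deserializer is built separately, so it needs its own throwaway call.
        JSON.Deserialize<Question>(json);
    }
}
```

Calling JilWarmup.Prime() from your application's startup path moves the one-time build cost away from the first real request.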
using(var input = new StringReader(myString))
{
    var result = JSON.DeserializeDynamic(input);
}
There is also a DeserializeDynamic method that works directly on strings.
These methods return dynamic, and support the following operations:
- Casts
  - e.g. (int)JSON.DeserializeDynamic("123")
- Member access
  - e.g. JSON.DeserializeDynamic(@"{""A"":123}").A
- Indexers
  - e.g. JSON.DeserializeDynamic(@"{""A"":123}")["A"]
  - or JSON.DeserializeDynamic("[0, 1, 2]")[0]
- Foreach loops
  - e.g. foreach(var keyValue in JSON.DeserializeDynamic(@"{""A"":123}")) { ... }
    - in this example, keyValue is a dynamic with Key and Value properties
  - or foreach(var item in JSON.DeserializeDynamic("[0, 1, 2]")) { ... }
    - in this example, item is a dynamic and will have values 0, 1, and 2
- Common unary operators (+, -, and !)
- Common binary operators (&&, ||, +, -, *, /, ==, !=, <, <=, >, and >=)
- .Length & .Count on arrays
- .ContainsKey(string) on objects
Jil will only (de)serialize types that can be reasonably represented as JSON.
The following types (and any user defined types composed of them) are supported:
- Strings (including char)
- Booleans
- Integer numbers (int, long, byte, etc.)
- Floating point numbers (float, double, and decimal)
- DateTimes & DateTimeOffsets
  - See Configuration for further details
- TimeSpans
  - See Configuration for further details
- Nullable types
- Enumerations
  - Including [Flags]
- Guids
  - Only the "D" format
- IList<T> implementations
- IDictionary<TKey, TValue> implementations where TKey is a string or enumeration
Jil (de)serializes public fields and properties; the order in which they are serialized is not defined (it is unlikely to be in declaration order). The DataMemberAttribute.Name property and IgnoreDataMemberAttribute are respected by Jil, as is the ShouldSerializeXXX() pattern. For situations where DataMemberAttribute and IgnoreDataMemberAttribute cannot be used, Jil provides the JilDirectiveAttribute, which offers equivalent functionality.
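As a rough illustration of the attributes and the ShouldSerializeXXX() pattern described above, here is a sketch with a hypothetical User type. The member names are invented for the example, and the exact output is not shown because member order is not defined.

```csharp
using System.Runtime.Serialization;
using Jil;

// Hypothetical type; only the attributes and the ShouldSerialize pattern matter here.
public class User
{
    // Written under the name "display_name" instead of "DisplayName".
    [DataMember(Name = "display_name")]
    public string DisplayName { get; set; }

    // Never written or read.
    [IgnoreDataMember]
    public string PasswordHash { get; set; }

    public string Location { get; set; }

    // ShouldSerializeXXX pattern: Location is only written when this returns true.
    public bool ShouldSerializeLocation()
    {
        return !string.IsNullOrEmpty(Location);
    }
}

public static class UserExample
{
    public static string Run()
    {
        return JSON.Serialize(new User { DisplayName = "Ann", PasswordHash = "secret" });
    }
}
```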
Jil's JSON.Serialize and JSON.Deserialize methods take an optional Options parameter which controls:
- The format of DateTimes, DateTimeOffsets, and TimeSpans; one of
  - NewtonsoftStyleMillisecondsSinceUnixEpoch, a string
    - "/Date(##...##)/" for DateTimes
    - "1.23:45:56.78" for TimeSpans
  - MillisecondsSinceUnixEpoch, a number
    - for DateTimes it can be passed directly to JavaScript's Date() constructor
    - for TimeSpans it's simply TimeSpan.TotalMilliseconds
  - SecondsSinceUnixEpoch, a number
    - for DateTimes this is commonly referred to as unix time
    - for TimeSpans it's simply TimeSpan.TotalSeconds
  - ISO8601, a string
    - for DateTimes, e.g. "2011-07-14T19:43:37Z"
    - for TimeSpans, e.g. "P40DT11H10M9.4S"
- Whether or not to exclude null values when serializing dictionaries and object members
- Whether or not to "pretty print" while serializing, which adds extra linebreaks and whitespace for presentation's sake
- Whether or not the serialized JSON will be used as JSONP (which requires slightly more work be done w.r.t. escaping)
- Whether or not to include inherited members when serializing
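A sketch of passing Options to the string-returning Serialize overload follows. The named parameters (prettyPrint, excludeNulls, dateFormat) and the DateTimeFormat enumeration are written as I recall them from Jil's Options constructor, so treat them as assumptions and check them against the version you have installed.

```csharp
using System;
using Jil;

public static class OptionsExample
{
    public static string Run()
    {
        // Assumed constructor shape; any setting left out keeps its default.
        var options = new Options(
            prettyPrint: true,
            excludeNulls: true,
            dateFormat: DateTimeFormat.ISO8601
        );

        return JSON.Serialize(
            new
            {
                When = new DateTime(2011, 7, 14, 19, 43, 37, DateTimeKind.Utc),
                Missing = (string)null // dropped from the output because excludeNulls is set
            },
            options
        );
    }
}
```

Because Options are baked into the generated serializer, it is worth creating an Options instance once and reusing it rather than building a new one per call.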
Jil aims to be the fastest general purpose JSON (de)serializer for .NET. Flexibility and "nice to have" features are explicitly discounted in the pursuit of speed.
These benchmarks were run on a machine with the following specs:
- Operating System: Windows 8.1 Enterprise 64-bit (6.3, Build 9600) (9600.winblue_r3.140827-1500)
- System Manufacturer: Apple Inc.
- System Model: MacBookPro11,3
- Processor: Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz (8 CPUs), ~2.6GHz
- Memory: 16384MB RAM
  - DDR3
  - Dual Channel
  - 798.1 MHz
As with all benchmarks, take these with a grain of salt.
For comparison, here's how Jil stacks up against other popular .NET serializers in a synthetic benchmark:
- Json.NET - JSON library included with ASP.NET MVC, version 6.0.7
- ServiceStack.Text - JSON, CSV, and JSV library; a part of the ServiceStack framework, version 3.9.71
- protobuf-net - binary serializer for Google's Protocol Buffers, version 2.0.0.688
  - does not serialize JSON, included as a baseline
All three libraries are in use at Stack Exchange in various production roles.
Note that the bars in each group of each graph are scaled so that the fastest library is 100.
Numbers, including millisecond timings, can be found in this Google Document.
The Question, Answer, and User types are taken from the Stack Exchange API.
Data for each type is randomly generated from a fixed seed. Random text is biased towards ASCII*, but includes all of Unicode.
*This is meant to simulate typical content from the Stack Exchange API.
The same libraries and same types were used to test deserialization.
Note that the bars in each group of each graph are scaled so that the fastest library is 100.
Numbers, including millisecond timings, can be found in the same Google Document.
Jil has a lot of tricks to make it fast. These may be interesting, even if Jil itself is too limited for your use.
Jil does a lot of IL generation to produce tight, focused code. While possible with ILGenerator, Jil instead uses the Sigil library. Sigil automatically does a lot of the busy work you'd normally have to do manually to produce ideal IL. Using Sigil also makes hacking on Jil much more productive, as debugging IL generation without it is pretty slow going.
Jil's internal serializers and deserializers are (in the absence of recursive types) monolithic and per-type, avoiding extra runtime lookups and giving .NET's JIT more context when generating machine code.
The methods Jil creates also do no Options checking at serialization time; Options are baked in at first use. This means that Jil may create up to 32 different serializers and 8 different deserializers for a single type (though in practice, many fewer).
Perhaps the most arcane code in Jil determines the preferred order to access members, so the CPU doesn't stall waiting for values from memory.
Members are divided up into 4 groups:
- Simple
  - primitive ValueTypes such as int, double, etc.
- Nullable Types
- Recursive Types
- Everything Else
Members within each group are ordered by the offset of the fields backing them (properties are decompiled to determine fields they use).
This is a fairly naive implementation of the idea; there's almost certainly more that could be squeezed out, especially with regard to consistency of gains.
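The grouping and offset ordering can be sketched roughly as below. This is not Jil's code: MemberSketch and its fields are hypothetical, the field offsets are supplied by hand rather than discovered by decompiling properties, and visiting the groups in the order they are listed above is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The four groups, in the order they are listed (the visiting order is an assumption).
public enum MemberGroup { Simple = 0, Nullable = 1, Recursive = 2, EverythingElse = 3 }

// Hypothetical description of one member; Jil derives this information itself.
public sealed class MemberSketch
{
    public string Name;
    public MemberGroup Group;
    public int FieldOffset; // offset of the backing field within the object
}

public static class MemberOrdering
{
    // Group first, then ascending field offset within each group,
    // so that memory is touched in a roughly sequential order.
    public static List<string> Order(IEnumerable<MemberSketch> members)
    {
        return members
            .OrderBy(m => (int)m.Group)
            .ThenBy(m => m.FieldOffset)
            .Select(m => m.Name)
            .ToList();
    }
}
```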
.NET's GC is excellent, but no-GC is still faster than any-GC.
Jil tries to avoid allocating any reference types, with some exceptions:
- a 36-length char[] if any integer numbers, DateTimes, or GUIDs are being serialized
- a 32-length char[] if any strings, user defined objects, or ISO8601 DateTimes are being deserialized
Depending on the data being deserialized, a StringBuilder may also be allocated. If a TextWriter does not have an invariant culture, strings may also be allocated when serializing floating point numbers.
JSON has escaping rules for \, ", and control characters. These can be kind of time consuming to deal with. Jil avoids the work as much as possible in two ways.
First, all known key names are determined once and baked into the generated delegates. Known keys are member names and enumeration values.
Second, rather than look up encoded characters in a dictionary or a long series of branches, Jil does explicit checks for " and \ and turns the rest into a subtraction and jump table lookup. This comes out to ~three branches (with mostly consistently taken paths, good for branch prediction in theory) per character.
This works because control characters in .NET strings (basically UTF-16, but might as well be ASCII for this trick) are sequential, being [0,31].
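Here is a minimal sketch of that per-character strategy in plain C#. It is not Jil's generated IL: an array indexed by the character code stands in for the jump table, the class and method names are made up, and the JSONP-specific separators are not handled.

```csharp
using System;
using System.Text;

public static class JsonEscapeSketch
{
    // Index = the control character's code (0..31); value = its precomputed escape.
    private static readonly string[] ControlEscapes = BuildTable();

    private static string[] BuildTable()
    {
        var table = new string[32];
        for (int i = 0; i < 32; i++)
        {
            table[i] = "\\u" + i.ToString("X4"); // generic form, e.g. \u001F
        }
        // Short forms defined by JSON for the common control characters.
        table['\b'] = "\\b";
        table['\t'] = "\\t";
        table['\n'] = "\\n";
        table['\f'] = "\\f";
        table['\r'] = "\\r";
        return table;
    }

    public static string Escape(string value)
    {
        var sb = new StringBuilder(value.Length + 2);
        sb.Append('"');
        foreach (char c in value)
        {
            if (c == '"') { sb.Append("\\\""); continue; }          // explicit check 1
            if (c == '\\') { sb.Append("\\\\"); continue; }         // explicit check 2
            if (c < 32) { sb.Append(ControlEscapes[c]); continue; } // range check + table lookup
            sb.Append(c);                                           // common case: copy as-is
        }
        sb.Append('"');
        return sb.ToString();
    }
}
```

Since the control range starts at 0, the subtraction is a no-op here; a table covering a range that starts elsewhere would subtract the lower bound before indexing.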
JSONP also requires escaping of line separator (\u2028) and paragraph separator (\u2029) characters. When configured to serialize JSONP, Jil escapes them in the same manner as \ and ".
While number formatting in .NET is pretty fast, it has a lot of baggage to handle custom number formatting.
Since JSON has a strict definition of a number, a Write() implementation without configuration is noticeably faster.
To go the extra mile, Jil contains separate implementations for int, uint, ulong, and long.
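To make the idea concrete, here is a sketch of a configuration-free int writer. It is an illustration rather than Jil's implementation: the names are hypothetical, and it allocates a small buffer per call for simplicity where a real implementation would reuse one.

```csharp
using System;
using System.IO;

public static class IntWriterSketch
{
    public static void WriteInt(TextWriter writer, int value)
    {
        // int.MinValue cannot be negated, so it gets its own path.
        if (value == int.MinValue)
        {
            writer.Write("-2147483648");
            return;
        }

        var buffer = new char[11]; // 10 digits plus an optional sign
        int pos = buffer.Length;
        bool negative = value < 0;
        uint remaining = (uint)(negative ? -value : value);

        // Fill the buffer from the right, one decimal digit at a time.
        do
        {
            buffer[--pos] = (char)('0' + remaining % 10);
            remaining /= 10;
        } while (remaining != 0);

        if (negative)
        {
            buffer[--pos] = '-';
        }

        // No culture, format string, or grouping to consult: just emit the digits.
        writer.Write(buffer, pos, buffer.Length - pos);
    }
}
```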
Jil does not include custom decimal, double, or single Write() implementations, as despite my best efforts I haven't been able to beat the ones built into .NET.
If you think you're up to the challenge, I'd be really interested in seeing code that is faster than the included implementations.
Similarly to numbers, each of Jil's date formats has a custom Write() implementation.
- ISO8601 can be unrolled into a smaller number of / and % instructions
- Newtonsoft-style is a subtraction and division, then fed into the custom long writing code
- Milliseconds since the unix epoch is essentially the same
- Seconds since the unix epoch just has a different divisor
Noticing a pattern?
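The "subtraction and division" in the epoch-based formats can be sketched as follows. The class and method names are hypothetical, and the input is assumed to already be in UTC; the resulting long would then go to a custom integer writer.

```csharp
using System;

public static class DateWriterSketch
{
    // Ticks (100ns units) at 1970-01-01T00:00:00Z.
    private static readonly long UnixEpochTicks =
        new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc).Ticks;

    // One subtraction and one division.
    public static long MillisecondsSinceUnixEpoch(DateTime utc)
    {
        return (utc.Ticks - UnixEpochTicks) / TimeSpan.TicksPerMillisecond;
    }

    // Same shape, different divisor.
    public static long SecondsSinceUnixEpoch(DateTime utc)
    {
        return (utc.Ticks - UnixEpochTicks) / TimeSpan.TicksPerSecond;
    }
}
```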
Jil has a custom Guid writer (which is one of the reasons Jil only supports the D format).
Fun fact about this method: I tested a more branch-heavy version (which removed the byte lookup), and it turned out to be considerably slower than the built-in method due to branch prediction failures. Type 4 Guids being random makes for something quite close to the worst case for branch prediction.
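A lookup-based "D" format writer might look something like the sketch below. It is a simplification, not Jil's writer: it uses a 16-entry nibble table rather than a per-byte table, and all names are invented. The point is that no branch depends on the (random) Guid data itself.

```csharp
using System;

public static class GuidWriterSketch
{
    private const string Hex = "0123456789abcdef";

    // Order in which Guid.ToByteArray() bytes appear in the "D" format; -1 marks a dash.
    // The first three groups are stored little-endian, hence the reversed indexes.
    private static readonly int[] Layout =
    {
        3, 2, 1, 0, -1, 5, 4, -1, 7, 6, -1, 8, 9, -1, 10, 11, 12, 13, 14, 15
    };

    public static string Format(Guid guid)
    {
        byte[] bytes = guid.ToByteArray();
        var buffer = new char[36]; // 32 hex digits + 4 dashes
        int pos = 0;

        foreach (int index in Layout)
        {
            // This branch depends only on the fixed layout, never on the Guid's value.
            if (index < 0)
            {
                buffer[pos++] = '-';
                continue;
            }

            byte b = bytes[index];
            buffer[pos++] = Hex[b >> 4];  // high nibble via table lookup
            buffer[pos++] = Hex[b & 0xF]; // low nibble via table lookup
        }

        return new string(buffer);
    }
}
```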
Although arrays implement IList<T>, the JIT generates much better code if you give it array-ish IL to chew on, so Jil does so.
Many enums end up having sequential values; Jil will exploit this if possible and generate a subtraction and jump table lookup. Non-sequential enumerations are handled with a long series of branches.
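For a sequential enum, the subtraction-plus-table idea can be sketched like this. The Color enum and the class are hypothetical, and an array lookup stands in for the IL jump table Jil generates.

```csharp
using System;

// Hypothetical enum whose values are sequential but do not start at zero.
public enum Color { Red = 3, Green = 4, Blue = 5 }

public static class EnumWriterSketch
{
    private const int Min = (int)Color.Red;

    // Names are quoted ahead of time, so nothing is escaped at serialization time.
    private static readonly string[] Names = { "\"Red\"", "\"Green\"", "\"Blue\"" };

    public static string ToJson(Color value)
    {
        int index = (int)value - Min; // the subtraction

        // A single unsigned comparison rejects values below Min and above the last entry.
        if ((uint)index >= (uint)Names.Length)
        {
            throw new ArgumentOutOfRangeException("value");
        }

        return Names[index]; // the table lookup
    }
}
```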
Just like Jil maintains many different methods for writing integer types, it also maintains different methods for reading them. These methods omit unnecessary sign checks, overflow checks, and culture-specific formatting support.
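As an illustration of a specialized reader, here is a sketch for unsigned 32-bit integers. It is not Jil's code and the names are hypothetical. Because the target type is unsigned there is no sign handling at all, and no culture is consulted; this sketch does keep an overflow check so that it stays safe on arbitrary input.

```csharp
using System;
using System.IO;

public static class IntReaderSketch
{
    // Reads consecutive ASCII digits from the reader and stops at the first non-digit,
    // which is left unread for the caller (e.g. a ',' or '}' in JSON).
    public static uint ReadUInt32(TextReader reader)
    {
        int c = reader.Peek();
        if (c < '0' || c > '9')
        {
            throw new FormatException("Expected a digit");
        }

        uint result = 0;
        while ((c = reader.Peek()) >= '0' && c <= '9')
        {
            reader.Read(); // consume the digit we just peeked at
            result = checked(result * 10 + (uint)(c - '0'));
        }

        return result;
    }
}
```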
Rather than read a member name into a string or buffer when deserializing, Jil will try to match it one character at a time using an automaton.
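A hand-written version of such a matcher, for a hypothetical type with members Id and Name, might look like the sketch below. Jil generates the equivalent logic per type; this illustration assumes the opening quote has already been consumed and ignores escape sequences inside member names.

```csharp
using System;
using System.IO;

public static class MemberMatcherSketch
{
    // Returns 0 for "Id", 1 for "Name", and -1 when the name matches neither.
    // No string or buffer is allocated; each character is checked as it is read.
    public static int Match(TextReader reader)
    {
        switch (reader.Read())
        {
            case 'I':
                if (reader.Read() == 'd' && reader.Read() == '"') return 0;
                return -1;

            case 'N':
                if (reader.Read() == 'a'
                    && reader.Read() == 'm'
                    && reader.Read() == 'e'
                    && reader.Read() == '"') return 1;
                return -1;

            default:
                return -1;
        }
    }
}
```

On a mismatch the reader is left partway through the name, so a real deserializer would still need to skip to the closing quote before moving on.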
If you're serializing to a string (as indicated by using a particular Serialize<T> method), Jil will avoid the overhead of virtually dispatching calls against TextWriter, and instead statically call against its own specialized StringBuilder-esque class. In the general case Jil prefers to write against a TextWriter so as to keep memory pressure low (a real concern in many real world deployments), but when Jil is going to allocate a string anyway, avoiding virtual dispatch results in a noticeable speed up.