Description
This issue follows Discourse comments about unifying the APIs of JSON.jl, LazyJSON.jl and JSON2.jl.
1. Define Julia types for JSON Values
The current API uses Base.String to represent encoded JSON and Base.Dict etc. to represent decoded JSON. The functions JSON.parse and JSON.json are used to convert between the two representations. This API restricts the implementation to be non-lazy. It also precludes the possibility of implementing short-cut methods for JSON derived values.
JSON (JavaScript Object Notation) defines six value types: object, array, number, string, boolean and null.
Defining Julia types to represent these JSON value types will enable us to hide the implementation details (lazy vs eager parsing, encoded string representation vs decoded AST representation, etc).
Treating JSON values as first class types (rather than as something that must be converted to a Base type) allows dispatch on these types and transparent implementation of short-cut methods as needed for efficiency.
const JSON.Value = Union{
JSON.Object,
JSON.Array,
JSON.Number,
JSON.String,
JSON.Bool,
JSON.Null
}
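To make the dispatch argument concrete, here is a minimal sketch of what such types could look like. All names are hypothetical (the `J` prefix is only there so the sketch does not shadow Base types), and the eager field layout is an assumption; a lazy implementation could hide a completely different representation behind the same types.

```julia
# Hypothetical names for illustration only; not an existing JSON.jl API.
struct JObject
    fields::Dict{String,Any}
end
struct JArray
    items::Vector{Any}
end
struct JNumber
    value::Float64
end
struct JString
    value::String
end
struct JBool
    value::Bool
end
struct JNull end

# The union plays the role of JSON.Value in the proposal.
const JValue = Union{JObject,JArray,JNumber,JString,JBool,JNull}

# Dispatch on the JSON value type rather than on a Base type.
kind(::JObject) = "object"
kind(::JArray)  = "array"
kind(::JNumber) = "number"
kind(::JString) = "string"
kind(::JBool)   = "boolean"
kind(::JNull)   = "null"

# A method written against the union accepts any JSON value.
describe(v::JValue) = "JSON " * kind(v)
```

Callers only ever see `JValue` and the methods defined on it, so the fields could later be swapped for byte offsets into an encoded string without changing user code.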
2. Construct JSON Value objects from strings
The implementation might immediately parse encoded strings into Julia collection types, or it might parse to intermediate AST types, or it might just lazily wrap the encoded string. That implementation detail would be hidden from the user (unless there are compelling use cases where user-supplied implementation hints are a big performance win, e.g. a lazy=false option). It seems likely that a combination of sensible defaults and heuristics can achieve good performance in most cases without any need for the user to fiddle with options.
"""
JSON.Value(::AbstractString)::JSON.Value
Create a JSON object from a JSON formatted string.
"""
julia> x = JSON.Value("""{
"object": {"field": "value"},
"array": [1,2,3],
"number": 43,
"bool": true,
"null": null
}""")
julia> x.object.field
"value"
julia> x.array[1]
1
2. Construct JSON Value objects from Julia objects.
The implementation might immediately encode the Julia objects to a JSON string, or it might just wrap them and do nothing, or it might convert them to an intermediate representation. That detail is hidden from the user.
"""
JSON.Value(o)::JSON.Value
Create a JSON object from a Julia object.
"""
julia> x = JSON.Value(Dict(
"object" => Dict("field" => "value"),
"array" => [1,2,3],
"number" => 43,
"bool" => true,
"null" => nothing
))
julia> x.object.field
"value"
julia> x.array[1]
1
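The "just wrap them and do nothing" option can be sketched in a few lines. This is only an illustration under stated assumptions: `SketchValue`, `wrap` and `unwrap` are hypothetical names, and a real implementation would presumably use distinct types per JSON value kind rather than one untyped wrapper.

```julia
# Hypothetical eager wrapper; `SketchValue` is not part of any existing package.
struct SketchValue
    data::Any
end

# Re-wrap containers so nested access keeps working; return scalars unchanged.
wrap(x::Union{AbstractDict,AbstractVector}) = SketchValue(x)
wrap(x) = x

# Read the wrapped object with getfield, because getproperty is overloaded below.
unwrap(v::SketchValue) = getfield(v, :data)

# `x.name` looks up the key "name"; `x[i]` indexes into a wrapped array.
Base.getproperty(v::SketchValue, name::Symbol) = wrap(unwrap(v)[String(name)])
Base.getindex(v::SketchValue, i::Integer) = wrap(unwrap(v)[i])

x = SketchValue(Dict(
    "object" => Dict("field" => "value"),
    "array"  => [1, 2, 3],
    "number" => 43,
    "bool"   => true,
    "null"   => nothing,
))
```

With these definitions `x.object.field` and `x.array[1]` behave as in the example above, and no encoding work happens until a string is actually requested.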
3. Use Base.string to produce JSON encoded strings.
Rather than using JSON.json to produce encoded strings, just use Base.string.
Depending on the JSON.Value implementation, string might just return a preexisting encoded string, or it might have to produce an encoded string from an internal representation.
Base.string(o::JSON.Value)::AbstractString
JSON formatted string representation of a JSON object.
julia> x = JSON.Value(Dict(
"object" => Dict("field" => "value"),
"array" => [1,2,3],
"number" => 43,
"bool" => true,
"null" => nothing
))
julia> string(x)
"{\"object\":{\"field\":\"value\"},\"array\":[1,2,3],\"number\":43,\"bool\":true,\"null\":null}"
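For the case where an encoded string has to be produced from an internal representation, the `string` method could look roughly like this. `EncodableValue` and `encode` are hypothetical names, and the encoder is deliberately simplified: `escape_string` follows Julia's escaping rules, which overlap with but are not identical to JSON's, and non-finite floats are not handled.

```julia
# Hypothetical wrapper and helper names; not an existing API.
struct EncodableValue
    data::Any
end

# One small method per JSON value kind; Bool is more specific than Real.
encode(::Nothing) = "null"
encode(b::Bool) = b ? "true" : "false"
encode(n::Real) = string(n)
# Simplification: Julia escaping is used as a stand-in for JSON escaping.
encode(s::AbstractString) = "\"" * escape_string(s) * "\""
encode(a::AbstractVector) = "[" * join((encode(x) for x in a), ",") * "]"
encode(d::AbstractDict) =
    "{" * join((encode(string(k)) * ":" * encode(v) for (k, v) in d), ",") * "}"

# `string` on the wrapper yields the JSON text.
Base.string(v::EncodableValue) = encode(v.data)
```

A wrapper that already holds the encoded text would instead define `Base.string` to return that text directly, which is the short-cut the proposal has in mind. Note that with a plain `Dict` the key order of the output is not guaranteed.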
4. Use Base.convert to do direct-to-struct parsing.
e.g. like the direct-to-struct parsing feature first implemented in JSON2:
julia> struct MyType
field
end
julia> convert(MyType, JSON.Value("""{"field": "value"}"""))
MyType("value")
The convert methods would be @generated.
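As a rough sketch of the @generated idea, the function below builds the constructor call once per struct type from its field names. It reads from a plain `Dict` rather than a JSON value so that it stays self-contained; `from_fields` and `Point` are hypothetical names, and a real `convert` method would pull fields straight out of the parser instead.

```julia
struct MyType
    field
end

struct Point
    x::Float64
    y::Float64
end

# Hypothetical helper; `from_fields` is not an existing function.
@generated function from_fields(::Type{T}, d::AbstractDict{String}) where {T}
    # Runs once per concrete T: build one lookup expression per struct field.
    args = [:(d[$(String(name))]) for name in fieldnames(T)]
    # The generated body is equivalent to T(d["x"], d["y"], ...).
    return :(T($(args...)))
end

m = from_fields(MyType, Dict("field" => "value"))
p = from_fields(Point, Dict("x" => 1, "y" => 2))
```

Because the field list is resolved when the method is generated, the compiled code contains no loop over field names at run time; the default constructor of `Point` still converts the integer inputs to `Float64`.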
5. Use Base.convert in cases when specific Base types are needed.
Most of the time JSON.Value types that implement AbstractDict, AbstractArray, Base.Real, AbstractString etc. are all that the user will need.
In cases where the user wants a specific type, they can use convert:
julia> convert(Vector{Float64}, JSON.Value("[0.25, 0.5, 1, 2, 4, 8]"))
6-element Array{Float64,1}:
0.25
0.5
1.0
2.0
4.0
8.0
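One way this could fall out almost for free: if the JSON array type implements the AbstractVector interface, Base's generic array conversion already applies to it. The sketch below assumes a hypothetical `SketchArray` backed by an eager `Vector{Any}`; a lazy implementation would implement the same two methods on top of the encoded string.

```julia
# Hypothetical array type; the eager Vector{Any} storage is an assumption.
struct SketchArray <: AbstractVector{Any}
    items::Vector{Any}
end

# Minimal AbstractArray interface: size and scalar indexing.
Base.size(a::SketchArray) = size(a.items)
Base.getindex(a::SketchArray, i::Int) = a.items[i]
Base.IndexStyle(::Type{SketchArray}) = IndexLinear()

a = SketchArray(Any[0.25, 0.5, 1, 2, 4, 8])

# No convert method is defined above; Base's fallback for AbstractArray is used.
v = convert(Vector{Float64}, a)
```

Generic functions such as `length`, `sum` and iteration work on `a` as well, which is the sense in which most users would never need the explicit `convert`.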
6. Backwards compatibility
The existing API could be maintained as follows:
JSON.parse(x; kw...) = JSON.Value(x; kw...)
JSON.json(x) = string(JSON.Value(x))
We could implement parse so that it produces lazy value objects by default and produces non-lazy values only when the dicttype= or inttype= options are supplied. Or we could disable laziness entirely for the JSON.parse interface and say "if you want the new lazy thing, use JSON.Value".
If returning an AbstractDict from parse instead of a Dict causes breakage (or performance regression) in existing code, then we should start out by returning Dict.
7. Implementation
We can cherry-pick implementation details from the various existing JSON codebases, e.g. use the fast float decoder from one and the more robust UTF-16 decoder from another.
It might turn out that using the lazy parser is just as fast as the non-lazy one for doing a full non-lazy parse. In that case we may only need one parser. Or, if there are cases where the existing non-lazy parser has big wins, we can keep both. The user should not be able to tell the difference.