Proposal: Unified JSON API #268

Open
samoconnor opened this issue Nov 6, 2018 · 0 comments

This issue follows Discourse comments about unifying the APIs of JSON.jl, LazyJSON.jl and JSON2.jl.

1. Define Julia types for JSON Values

The current API uses Base.String to represent encoded JSON and Base.Dict etc. to represent decoded JSON. The functions JSON.parse and JSON.json convert between the two representations. This API forces the implementation to be non-lazy. It also rules out short-cut methods for JSON-derived values.

JavaScript Object Notation defines six value types: object, array, number, string, boolean and null.
Defining Julia types to represent these JSON value types will let us hide the implementation details (lazy vs eager parsing, encoded string representation vs decoded AST representation, etc.).

Treating JSON values as first class types (rather than as something that must be converted to a Base type) allows dispatch on these types and transparent implementation of short-cut methods as needed for efficiency.

const JSON.Value = Union{
    JSON.Object,
    JSON.Array,
    JSON.Number,
    JSON.String,
    JSON.Bool,
    JSON.Null
}
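
As a rough illustration of what dispatching on these types could look like, here is a minimal sketch. Everything in it is an assumption made for the example: the module name SketchJSON, the eager field layouts, and the helper functions kind and isscalar are hypothetical and not part of JSON.jl, LazyJSON.jl or JSON2.jl.

```julia
module SketchJSON   # hypothetical name, chosen to avoid clashing with JSON.jl

# Eager wrapper types (an assumption; a lazy variant could instead hold
# the encoded string plus an offset).
struct Object
    fields::Dict{Base.String,Any}
end
struct Array
    items::Vector{Any}
end
struct Number
    value::Float64
end
struct String
    value::Base.String
end
struct Bool
    value::Base.Bool
end
struct Null end

const Value = Union{Object, Array, Number, String, Bool, Null}

# Dispatch on the JSON value type instead of on Base types.
kind(::Object) = "object"
kind(::Array)  = "array"
kind(::Number) = "number"
kind(::String) = "string"
kind(::Bool)   = "bool"
kind(::Null)   = "null"

# One method that covers every JSON value through the Union.
isscalar(v::Value) = !(v isa Union{Object, Array})

end # module

SketchJSON.kind(SketchJSON.Number(43.0))   # "number"
SketchJSON.isscalar(SketchJSON.Null())     # true
```

Because callers only see the Value union and the generic functions defined on it, the field layouts above could be swapped for lazy ones without changing user code.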

2. Construct JSON Value objects from strings

The implementation might immediately parse encoded strings into Julia collection types, or it might parse to intermediate AST types, or it might just lazily wrap the encoded string. That implementation detail would be hidden from the user (unless there are compelling use cases where user-supplied implementation hints are a big performance win, e.g. a lazy=false option). It seems likely that a combination of sensible defaults and heuristics can achieve good performance in most cases without any need for the user to fiddle with options.

"""
    JSON.Value(::AbstractString)::JSON.Value

Create a JSON value from a JSON formatted string.
"""
julia> x = JSON.Value("""{
           "object": {"field": "value"},
           "array": [1,2,3],
           "number": 43,
           "bool": true,
           "null": null
       }""")

julia> x.object.field
"value"

julia> x.array[1]
1
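
One way the `x.object.field` syntax above could be supported is by overloading Base.getproperty on the object type. The sketch below assumes an eager, Dict-backed wrapper; the name Obj is hypothetical and only stands in for whatever JSON.Object ends up being.

```julia
# Hypothetical wrapper type, not part of any existing JSON package.
struct Obj
    fields::Dict{String,Any}
end

# Forward `x.name` to a key lookup. `getfield` reaches the real field,
# because `x.fields` would otherwise call this method again.
Base.getproperty(x::Obj, name::Symbol) = getfield(x, :fields)[String(name)]

x = Obj(Dict("object" => Obj(Dict("field" => "value")),
             "array"  => Any[1, 2, 3]))

x.object.field   # "value"
x.array[1]       # 1
```

A lazy implementation could keep the same getproperty signature and scan the encoded string for the key instead of indexing a Dict.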

2. Construct JSON Value objects from Julia objects.

The implementation might immediately encode the Julia objects to a JSON string, or it might just wrap them and do nothing, or it might convert them to an intermediate representation. That detail is hidden from the user.

"""
    JSON.Value(o)::JSON.Value

Create a JSON value from a Julia object.
"""
julia> x = JSON.Value(Dict(
           "object" => Dict("field" => "value"),
           "array" => [1,2,3],
           "number" => 43,
           "bool" => true,
           "null" => nothing
       ))

julia> x.object.field
"value"

julia> x.array[1]
1

3. Use Base.string to produce JSON encoded strings.

Rather than using JSON.json to produce encoded strings, just use Base.string.
Depending on the JSON.Value implementation, string might just return a preexisting encoded string, or it might have to produce an encoded string from an internal representation.

"""
    Base.string(o::JSON.Value)::AbstractString

JSON formatted string representation of a JSON object.
"""
julia> x = JSON.Value(Dict(
           "object" => Dict("field" => "value"),
           "array" => [1,2,3],
           "number" => 43,
           "bool" => true,
           "null" => nothing
       ))
julia> string(x)
"{\"object\":{\"field\":\"value\"},\"array\":[1,2,3],\"number\":43,\"bool\":true,\"null\":null}"

4. Use Base.convert to do direct-to-struct parsing.

For example, like the direct-to-struct parsing feature first implemented in JSON2:

julia> struct MyType
           field
       end
julia> convert(MyType, JSON.Value("""{"field": "value"}"""))
MyType("value")

The convert methods would be @generated.
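
To make the @generated idea concrete, here is a minimal sketch that builds a struct from a Dict keyed by field name. The helper from_fields is hypothetical, and the sketch deliberately uses a plain AbstractDict as input rather than a JSON.Value, since that type does not exist yet. A real convert method would presumably take the same shape.

```julia
struct MyType
    field
end

# Hypothetical helper: construct `T` from a Dict keyed by field name.
@generated function from_fields(::Type{T}, d::AbstractDict) where {T}
    # Only types are visible at generation time, so emit one key lookup
    # per field; the generated body is `T(d["field"], ...)`.
    args = [:(d[$(String(name))]) for name in fieldnames(T)]
    return :(T($(args...)))
end

m = from_fields(MyType, Dict("field" => "value"))
m.field   # "value"
```

The field list is unrolled once per struct type when the method is generated, so no reflection over fieldnames happens on each call.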

5. Use Base.convert in cases when specific Base types are needed.

Most of the time, JSON.Value types that implement AbstractDict, AbstractArray, Base.Real, AbstractString etc. are all that the user will need.

In cases where the user wants a specific type, they can use convert:

julia> convert(Vector{Float64}, JSON.Value("[0.25, 0.5, 1, 2, 4, 8]"))
6-element Array{Float64,1}:
 0.25
 0.5
 1.0
 2.0
 4.0
 8.0
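
If the JSON array type implements the AbstractVector interface, this conversion comes from Base's generic array methods without any JSON-specific convert method. The sketch below assumes an eager wrapper; the name Arr is hypothetical.

```julia
# Hypothetical array wrapper implementing the minimal AbstractVector interface.
struct Arr <: AbstractVector{Any}
    items::Vector{Any}
end
Base.size(a::Arr) = size(a.items)
Base.getindex(a::Arr, i::Int) = a.items[i]

a = Arr(Any[0.25, 0.5, 1, 2, 4, 8])

# Base's generic convert copies the elements and converts each to Float64.
v = convert(Vector{Float64}, a)
v == [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]   # true
```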

6. Backwards compatibility

The existing API could be maintained as follows:

JSON.parse(x; kw...) = JSON.Value(x; kw...)
JSON.json(x) = string(JSON.Value(x))

We could implement parse so that it produces lazy value objects by default and produces non-lazy values only when the dicttype= or inttype= options are supplied. Or we could disable laziness entirely for the JSON.parse interface and say "if you want the new lazy thing, use JSON.Value".

If returning an AbstractDict from parse instead of a Dict causes breakage (or performance regression) in existing code, then we should start out by returning Dict.

7. Implementation

We can cherry-pick implementation details from the various existing JSON codebases, e.g. use the fast float decoder from one package and the more robust UTF-16 decoder from another.

It might turn out that using the lazy parser is just as fast as the non-lazy one for doing a full non-lazy parse. In that case we may only need one parser. Or, if there are cases where the existing non-lazy parser has big wins, we can keep both. The user should not be able to tell the difference.
