1. Define Julia types for JSON Values
The current API uses Base.String to represent encoded JSON and Base.Dict etc. to represent decoded JSON. The functions JSON.parse and JSON.json convert between the two representations. This API restricts the implementation to be non-lazy. It also precludes short-cut methods for JSON-derived values.
JavaScript Object Notation defines 6 value types: object, array, string, number, boolean, and null.
Defining Julia types to represent these JSON value types will enable us to hide the implementation details (lazy vs eager parsing, encoded string representation vs decoded AST representation, etc).
Treating JSON values as first class types (rather than as something that must be converted to a Base type) allows dispatch on these types and transparent implementation of short-cut methods as needed for efficiency.
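As a rough illustration of dispatching on JSON value types, here is a minimal sketch. Every name in it (JSONValue, JSONObject, kind, and so on) is hypothetical and chosen for the example; it is not the JSON.jl API, and it stores eagerly decoded data purely for simplicity.

```julia
# Sketch only: these type names are hypothetical, not the JSON.jl API.
abstract type JSONValue end

# One concrete Julia type per JSON value type.
struct JSONObject <: JSONValue
    members::Dict{String,JSONValue}
end
struct JSONArray <: JSONValue
    elements::Vector{JSONValue}
end
struct JSONString <: JSONValue
    text::String
end
struct JSONNumber <: JSONValue
    value::Float64
end
struct JSONBool <: JSONValue
    value::Bool
end
struct JSONNull <: JSONValue end

# Methods dispatch on the JSON value type rather than on a Base type.
kind(::JSONObject) = :object
kind(::JSONArray)  = :array
kind(::JSONString) = :string
kind(::JSONNumber) = :number
kind(::JSONBool)   = :boolean
kind(::JSONNull)   = :null

doc = JSONObject(Dict{String,JSONValue}(
    "name" => JSONString("julia"),
    "tags" => JSONArray(JSONValue[JSONNumber(1.0), JSONNull()]),
))
```

Because callers only see the types and their methods, the fields could later hold encoded text instead of decoded values without changing code that uses kind.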
2. Construct JSON Value objects from strings
The implementation might immediately parse encoded strings into Julia collection types, or it might parse to intermediate AST types, or it might just lazily wrap the encoded string. That implementation detail would be hidden from the user (unless there are compelling use cases where user-supplied implementation hints are a big performance win, e.g. a lazy=false option). It seems likely that a combination of sensible defaults and heuristics can achieve good performance in most cases without any need for the user to fiddle with options.
"""
JSON.Value(::AbstractString)::JSON.Value
Create a JSON object from a JSON formatted string.
"""
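The "lazily wrap the encoded string" option might look roughly like the sketch below. LazyJSON, isobject, isarray and asnumber are hypothetical names for this example only, and the whitespace handling is looser than the JSON grammar.

```julia
# Sketch only: `LazyJSON` and its helpers are hypothetical, not the JSON.jl API.
struct LazyJSON
    text::String   # encoded JSON, kept exactly as supplied; nothing is parsed here
end

# Cheap structural checks look only at the first non-blank character.
isobject(v::LazyJSON) = startswith(lstrip(v.text), "{")
isarray(v::LazyJSON)  = startswith(lstrip(v.text), "[")

# Decoding happens only on request; this helper handles a bare JSON number.
asnumber(v::LazyJSON) = parse(Float64, strip(v.text))

v = LazyJSON("{\"a\": [1, 2, 3]}")
n = LazyJSON(" 42.5 ")
```

An eager implementation would do the parsing inside the constructor instead; callers that go through functions like these would not need to change.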
2. Construct JSON Value objects from Julia objects.
The implementation might immediately encode the Julia objects to a JSON string, or it might just wrap them and do nothing, or it might convert them to an intermediate representation. That detail is hidden from the user.
"""
JSON.Value(o)::JSON.Value
Create a JSON object from a Julia object.
"""
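A minimal sketch of the "just wrap them and do nothing" option follows. WrappedJSON and encode are hypothetical names, and only nothing, booleans, integers and vectors are handled; strings are left out because they would need escaping.

```julia
# Sketch only: `WrappedJSON` and `encode` are hypothetical, not the JSON.jl API.
struct WrappedJSON{T}
    obj::T   # the Julia object, stored untouched at construction time
end

# Encoding is deferred until the text is actually requested.
encode(::Nothing)         = "null"
encode(x::Bool)           = x ? "true" : "false"   # more specific than Integer
encode(x::Integer)        = string(x)
encode(x::AbstractVector) = "[" * join((encode(e) for e in x), ",") * "]"
encode(w::WrappedJSON)    = encode(w.obj)

w = WrappedJSON(Any[1, true, nothing])
encoded = encode(w)
```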
3. Use Base.string to produce JSON encoded strings.
Rather than using JSON.json to produce encoded strings, just use Base.string.
Depending on the JSON.Value implementation, string might just return a preexisting encoded string, or it might have to produce an encoded string from an internal representation.
"""
Base.string(o::JSON.Value)::AbstractString
JSON formatted string representation of a JSON object.
"""
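The two cases described for string can be sketched with two toy representations. EncodedJSON and DecodedJSON are hypothetical names used only here; the decoded variant is restricted to a vector of integers to keep the encoder trivial.

```julia
# Sketch only: both types are hypothetical stand-ins for possible JSON.Value internals.
struct EncodedJSON            # wraps text that is already JSON
    text::String
end
struct DecodedJSON            # wraps an already decoded array of integers
    items::Vector{Int}
end

# Already encoded: hand back the stored text unchanged.
Base.string(v::EncodedJSON) = v.text
# Decoded: build the encoded text on demand.
Base.string(v::DecodedJSON) = "[" * join(v.items, ",") * "]"

s1 = string(EncodedJSON("[1, 2]"))
s2 = string(DecodedJSON([1, 2]))
```

Note that this only adds string methods. print and show would still use their defaults unless they were defined as well.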
This issue follows Discourse comments about unifying the APIs of JSON.jl, LazyJSON.jl and JSON2.jl.
4. Use Base.convert to do direct-to-struct parsing.
e.g. like the direct-to-struct parsing feature first implemented in JSON2.
The convert methods would be @generated.
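One way such a generated convert method could work is sketched below. This is an assumption-laden illustration, not JSON2's code or the snippet from the original issue: DecodedObject and Point are hypothetical, and the method starts from already decoded fields rather than from encoded text.

```julia
# Sketch only: `DecodedObject` and `Point` are hypothetical example types.
struct DecodedObject
    fields::Dict{String,Any}   # stand-in for a JSON object value
end

struct Point
    x::Int
    y::Int
end

# Generated once per target type T. For Point the body expands to
# T(v.fields["x"], v.fields["y"]), so no field-name loop runs at call time.
@generated function Base.convert(::Type{T}, v::DecodedObject) where {T}
    args = [:(v.fields[$(String(name))]) for name in fieldnames(T)]
    return :(T($(args...)))
end

p = convert(Point, DecodedObject(Dict{String,Any}("x" => 1, "y" => 2)))
```

A real direct-to-struct parser would read each field straight from the encoded string; the generated-function structure is the part this sketch is meant to show.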
5. Use Base.convert in cases when specific Base types are needed.
Most of the time, JSON.Value types that implement AbstractDict, AbstractArray, Base.Real, AbstractString etc. are all that the user will need.
In cases where the user wants a specific type, they can use convert.
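The snippet from the original issue is not reproduced here. As a stand-in, the sketch below shows a hypothetical JSONArrayView that implements the AbstractVector interface; with only size and getindex defined, Base's existing convert method for arrays already produces a concrete Vector.

```julia
# Sketch only: `JSONArrayView` is a hypothetical stand-in for a JSON array value.
struct JSONArrayView <: AbstractVector{Any}
    items::Vector{Any}
end

# Minimal AbstractArray interface: size and integer indexing.
Base.size(a::JSONArrayView) = size(a.items)
Base.getindex(a::JSONArrayView, i::Int) = a.items[i]

a = JSONArrayView(Any[1, 2, 3])

total = sum(a)                   # generic array code works on the wrapper directly
ints  = convert(Vector{Int}, a)  # a specific Base type, only when it is needed
```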
6. Backwards compatibility
The existing API could be maintained as follows:
We could implement parse so that it produces lazy value objects by default and produces non-lazy values only when the dicttype= or inttype= options are supplied. Or we could disable laziness entirely for the JSON.parse interface and say "if you want the new lazy thing, use JSON.Value".
If returning an AbstractDict from parse instead of a Dict causes breakage (or performance regression) in existing code, then we should start out by returning Dict.
7. Implementation
We can cherry-pick implementation details from the various existing JSON codebases, e.g. use the fast float decoder from one codebase and the more robust UTF-16 decoder from another.
It might turn out that using the lazy parser is just as fast as the non-lazy one for doing a full non-lazy parse. In that case we may only need one parser. Or, if there are cases where the existing non-lazy parser has big wins, we can keep both. The user should not be able to tell the difference.