Pure-perl JSON parser implementation
Perl
Switch branches/tags
Nothing to show

README

JSON::PP
========

This module is a pure-Perl implementation of a JSON encoder/decoder.
Most of the time, you should probably install JSON.pm (which depends on
this module) so that you may use JSON::XS for improved performance later
without code changes.

INSTALLATION

To install this module type the following:

   perl Makefile.PL
   make
   make test
   make install

NAME
    JSON::PP - JSON::XS compatible pure-Perl module.

SYNOPSIS
     use JSON::PP;

     # exported functions, they croak on error
     # and expect/generate UTF-8

     $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref;
     $perl_hash_or_arrayref  = decode_json $utf8_encoded_json_text;

     # OO-interface

     $coder = JSON::PP->new->ascii->pretty->allow_nonref;
     $pretty_printed_unencoded = $coder->encode ($perl_scalar);
     $perl_scalar = $coder->decode ($unicode_json_text);

     # Note that JSON version 2.0 and above will automatically use
     # JSON::XS or JSON::PP, so you should be able to just:
 
     use JSON;

DESCRIPTION
    This module is JSON::XS compatible pure Perl module. (Perl 5.8 or later
    is recommended)

    JSON::XS is the fastest and most proper JSON module on CPAN. It is
    written by Marc Lehmann in C, so must be compiled and installed in the
    used environment.

    JSON::PP is a pure-Perl module and has compatibility to JSON::XS.

  FEATURES
    *   correct unicode handling

        This module knows how to handle Unicode (depending on Perl version).

        See to "A FEW NOTES ON UNICODE AND PERL" in JSON::XS and "UNICODE
        HANDLING ON PERLS".

    *   round-trip integrity

        When you serialise a perl data structure using only data types
        supported by JSON and Perl, the deserialised data structure is
        identical on the Perl level. (e.g. the string "2.0" doesn't suddenly
        become "2" just because it looks like a number). There *are* minor
        exceptions to this, read the MAPPING section below to learn about
        those.

    *   strict checking of JSON correctness

        There is no guessing, no generating of illegal JSON texts by
        default, and only JSON is accepted as input by default (the latter
        is a security feature). But when some options are set, loose chcking
        features are available.

FUNCTIONS
    Basically, check to JSON or JSON::XS.

  encode_json
        $json_text = encode_json $perl_scalar

  decode_json
        $perl_scalar = decode_json $json_text

  JSON::PP::true
    Returns JSON true value which is blessed object. It "isa"
    JSON::PP::Boolean object.

  JSON::PP::false
    Returns JSON false value which is blessed object. It "isa"
    JSON::PP::Boolean object.

  JSON::PP::null
    Returns "undef".

METHODS
    Basically, check to JSON or JSON::XS.

  new
        $json = new JSON::PP

    Rturns a new JSON::PP object that can be used to de/encode JSON strings.

  ascii
        $json = $json->ascii([$enable])
    
        $enabled = $json->get_ascii

    If $enable is true (or missing), then the encode method will not
    generate characters outside the code range 0..127. Any Unicode
    characters outside that range will be escaped using either a single
    \uXXXX or a double \uHHHH\uLLLLL escape sequence, as per RFC4627. (See
    to "OBJECT-ORIENTED INTERFACE" in JSON::XS).

    In Perl 5.005, there is no character having high value (more than 255).
    See to "UNICODE HANDLING ON PERLS".

    If $enable is false, then the encode method will not escape Unicode
    characters unless required by the JSON syntax or other flags. This
    results in a faster and more compact format.

      JSON::PP->new->ascii(1)->encode([chr 0x10401])
      => ["\ud801\udc01"]

  latin1
        $json = $json->latin1([$enable])
    
        $enabled = $json->get_latin1

    If $enable is true (or missing), then the encode method will encode the
    resulting JSON text as latin1 (or iso-8859-1), escaping any characters
    outside the code range 0..255.

    If $enable is false, then the encode method will not escape Unicode
    characters unless required by the JSON syntax or other flags.

      JSON::XS->new->latin1->encode (["\x{89}\x{abc}"]
      => ["\x{89}\\u0abc"]    # (perl syntax, U+abc escaped, U+89 not)

    See to "UNICODE HANDLING ON PERLS".

  utf8
        $json = $json->utf8([$enable])
    
        $enabled = $json->get_utf8

    If $enable is true (or missing), then the encode method will encode the
    JSON result into UTF-8, as required by many protocols, while the decode
    method expects to be handled an UTF-8-encoded string. Please note that
    UTF-8-encoded strings do not contain any characters outside the range
    0..255, they are thus useful for bytewise/binary I/O.

    (In Perl 5.005, any character outside the range 0..255 does not exist.
    See to "UNICODE HANDLING ON PERLS".)

    In future versions, enabling this option might enable autodetection of
    the UTF-16 and UTF-32 encoding families, as described in RFC4627.

    If $enable is false, then the encode method will return the JSON string
    as a (non-encoded) Unicode string, while decode expects thus a Unicode
    string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs to be
    done yourself, e.g. using the Encode module.

    Example, output UTF-16BE-encoded JSON:

      use Encode;
      $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);

    Example, decode UTF-32LE-encoded JSON:

      use Encode;
      $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);

  pretty
        $json = $json->pretty([$enable])

    This enables (or disables) all of the "indent", "space_before" and
    "space_after" flags in one call to generate the most readable (or most
    compact) form possible.

  indent
        $json = $json->indent([$enable])
    
        $enabled = $json->get_indent

    The default indent space length is three. You can use "indent_length" to
    change the length.

  space_before
        $json = $json->space_before([$enable])
    
        $enabled = $json->get_space_before

  space_after
        $json = $json->space_after([$enable])
    
        $enabled = $json->get_space_after

  relaxed
        $json = $json->relaxed([$enable])
    
        $enabled = $json->get_relaxed

  canonical
        $json = $json->canonical([$enable])
    
        $enabled = $json->get_canonical

    If you want your own sorting routine, you can give a code referece or a
    subroutine name to "sort_by". See to "JSON::PP OWN METHODS".

  allow_nonref
        $json = $json->allow_nonref([$enable])
    
        $enabled = $json->get_allow_nonref

  allow_unknown
        $json = $json->allow_unknown ([$enable])
    
        $enabled = $json->get_allow_unknown

  allow_blessed
        $json = $json->allow_blessed([$enable])
    
        $enabled = $json->get_allow_blessed

  convert_blessed
        $json = $json->convert_blessed([$enable])
    
        $enabled = $json->get_convert_blessed

  filter_json_object
        $json = $json->filter_json_object([$coderef])

  filter_json_single_key_object
        $json = $json->filter_json_single_key_object($key [=> $coderef])

  shrink
        $json = $json->shrink([$enable])
    
        $enabled = $json->get_shrink

    In JSON::XS, this flag resizes strings generated by either "encode" or
    "decode" to their minimum size possible. It will also try to downgrade
    any strings to octet-form if possible.

    In JSON::PP, it is noop about resizing strings but tries
    "utf8::downgrade" to the returned string by "encode". See to utf8.

    See to "OBJECT-ORIENTED INTERFACE" in JSON::XS

  max_depth
        $json = $json->max_depth([$maximum_nesting_depth])
    
        $max_depth = $json->get_max_depth

    Sets the maximum nesting level (default 512) accepted while encoding or
    decoding. If a higher nesting level is detected in JSON text or a Perl
    data structure, then the encoder and decoder will stop and croak at that
    point.

    Nesting level is defined by number of hash- or arrayrefs that the
    encoder needs to traverse to reach a given point or the number of "{" or
    "[" characters without their matching closing parenthesis crossed to
    reach a given character in a string.

    If no argument is given, the highest possible setting will be used,
    which is rarely useful.

    See "SSECURITY CONSIDERATIONS" in JSON::XS for more info on why this is
    useful.

    When a large value (100 or more) was set and it de/encodes a deep nested
    object/text, it may raise a warning 'Deep recursion on subroutin' at the
    perl runtime phase.

  max_size
        $json = $json->max_size([$maximum_string_size])
    
        $max_size = $json->get_max_size

    Set the maximum length a JSON text may have (in bytes) where decoding is
    being attempted. The default is 0, meaning no limit. When "decode" is
    called on a string that is longer then this many bytes, it will not
    attempt to decode the string but throw an exception. This setting has no
    effect on "encode" (yet).

    If no argument is given, the limit check will be deactivated (same as
    when 0 is specified).

    See "SSECURITY CONSIDERATIONS" in JSON::XS for more info on why this is
    useful.

  encode
        $json_text = $json->encode($perl_scalar)

  decode
        $perl_scalar = $json->decode($json_text)

  decode_prefix
        ($perl_scalar, $characters) = $json->decode_prefix($json_text)

INCREMENTAL PARSING
    Most of this section are copied and modified from "INCREMENTAL PARSING"
    in JSON::XS.

    In some cases, there is the need for incremental parsing of JSON texts.
    This module does allow you to parse a JSON stream incrementally. It does
    so by accumulating text until it has a full JSON object, which it then
    can decode. This process is similar to using "decode_prefix" to see if a
    full JSON object is available, but is much more efficient (and can be
    implemented with a minimum of method calls).

    This module will only attempt to parse the JSON text once it is sure it
    has enough text to get a decisive result, using a very simple but truly
    incremental parser. This means that it sometimes won't stop as early as
    the full parser, for example, it doesn't detect parenthese mismatches.
    The only thing it guarantees is that it starts decoding as soon as a
    syntactically valid JSON text has been seen. This means you need to set
    resource limits (e.g. "max_size") to ensure the parser will stop parsing
    in the presence if syntax errors.

    The following methods implement this incremental parser.

  incr_parse
        $json->incr_parse( [$string] ) # void context
    
        $obj_or_undef = $json->incr_parse( [$string] ) # scalar context
    
        @obj_or_empty = $json->incr_parse( [$string] ) # list context

    This is the central parsing function. It can both append new text and
    extract objects from the stream accumulated so far (both of these
    functions are optional).

    If $string is given, then this string is appended to the already
    existing JSON fragment stored in the $json object.

    After that, if the function is called in void context, it will simply
    return without doing anything further. This can be used to add more text
    in as many chunks as you want.

    If the method is called in scalar context, then it will try to extract
    exactly *one* JSON object. If that is successful, it will return this
    object, otherwise it will return "undef". If there is a parse error,
    this method will croak just as "decode" would do (one can then use
    "incr_skip" to skip the errornous part). This is the most common way of
    using the method.

    And finally, in list context, it will try to extract as many objects
    from the stream as it can find and return them, or the empty list
    otherwise. For this to work, there must be no separators between the
    JSON objects or arrays, instead they must be concatenated back-to-back.
    If an error occurs, an exception will be raised as in the scalar context
    case. Note that in this case, any previously-parsed JSON texts will be
    lost.

    Example: Parse some JSON arrays/objects in a given string and return
    them.

        my @objs = JSON->new->incr_parse ("[5][7][1,2]");

  incr_text
        $lvalue_string = $json->incr_text

    This method returns the currently stored JSON fragment as an lvalue,
    that is, you can manipulate it. This *only* works when a preceding call
    to "incr_parse" in *scalar context* successfully returned an object.
    Under all other circumstances you must not call this function (I mean
    it. although in simple tests it might actually work, it *will* fail
    under real world conditions). As a special exception, you can also call
    this method before having parsed anything.

    This function is useful in two cases: a) finding the trailing text after
    a JSON object or b) parsing multiple JSON objects separated by non-JSON
    text (such as commas).

        $json->incr_text =~ s/\s*,\s*//;

    In Perl 5.005, "lvalue" attribute is not available. You must write codes
    like the below:

        $string = $json->incr_text;
        $string =~ s/\s*,\s*//;
        $json->incr_text( $string );

  incr_skip
        $json->incr_skip

    This will reset the state of the incremental parser and will remove the
    parsed text from the input buffer. This is useful after "incr_parse"
    died, in which case the input buffer and incremental parser state is
    left unchanged, to skip the text parsed so far and to reset the parse
    state.

  incr_reset
        $json->incr_reset

    This completely resets the incremental parser, that is, after this call,
    it will be as if the parser had never parsed anything.

    This is useful if you want ot repeatedly parse JSON objects and want to
    ignore any trailing data, which means you have to reset the parser after
    each successful decode.

    See to "INCREMENTAL PARSING" in JSON::XS for examples.

JSON::PP OWN METHODS
  allow_singlequote
        $json = $json->allow_singlequote([$enable])

    If $enable is true (or missing), then "decode" will accept JSON strings
    quoted by single quotations that are invalid JSON format.

        $json->allow_singlequote->decode({"foo":'bar'});
        $json->allow_singlequote->decode({'foo':"bar"});
        $json->allow_singlequote->decode({'foo':'bar'});

    As same as the "relaxed" option, this option may be used to parse
    application-specific files written by humans.

  allow_barekey
        $json = $json->allow_barekey([$enable])

    If $enable is true (or missing), then "decode" will accept bare keys of
    JSON object that are invalid JSON format.

    As same as the "relaxed" option, this option may be used to parse
    application-specific files written by humans.

        $json->allow_barekey->decode('{foo:"bar"}');

  allow_bignum
        $json = $json->allow_bignum([$enable])

    If $enable is true (or missing), then "decode" will convert the big
    integer Perl cannot handle as integer into a Math::BigInt object and
    convert a floating number (any) into a Math::BigFloat.

    On the contary, "encode" converts "Math::BigInt" objects and
    "Math::BigFloat" objects into JSON numbers with "allow_blessed" enable.

       $json->allow_nonref->allow_blessed->allow_bignum;
       $bigfloat = $json->decode('2.000000000000000000000000001');
       print $json->encode($bigfloat);
       # => 2.000000000000000000000000001

    See to "MAPPING" in JSON::XS aboout the normal conversion of JSON
    number.

  loose
        $json = $json->loose([$enable])

    The unescaped [\x00-\x1f\x22\x2f\x5c] strings are invalid in JSON
    strings and the module doesn't allow to "decode" to these (except for
    \x2f). If $enable is true (or missing), then "decode" will accept these
    unescaped strings.

        $json->loose->decode(qq|["abc
                                       def"]|);

    See "SSECURITY CONSIDERATIONS" in JSON::XS.

  escape_slash
        $json = $json->escape_slash([$enable])

    According to JSON Grammar, *slash* (U+002F) is escaped. But default
    JSON::PP (as same as JSON::XS) encodes strings without escaping slash.

    If $enable is true (or missing), then "encode" will escape slashes.

  (OBSOLETED)as_nonblessed
        $json = $json->as_nonblessed

    (OBSOLETED) If $enable is true (or missing), then "encode" will convert
    a blessed hash reference or a blessed array reference (contains other
    blessed references) into JSON members and arrays.

    This feature is effective only when "allow_blessed" is enable.

  indent_length
        $json = $json->indent_length($length)

    JSON::XS indent space length is 3 and cannot be changed. JSON::PP set
    the indent space length with the given $length. The default is 3. The
    acceptable range is 0 to 15.

  sort_by
        $json = $json->sort_by($function_name)
        $json = $json->sort_by($subroutine_ref)

    If $function_name or $subroutine_ref are set, its sort routine are used
    in encoding JSON objects.

       $js = $pc->sort_by(sub { $JSON::PP::a cmp $JSON::PP::b })->encode($obj);
       # is($js, q|{"a":1,"b":2,"c":3,"d":4,"e":5,"f":6,"g":7,"h":8,"i":9}|);

       $js = $pc->sort_by('own_sort')->encode($obj);
       # is($js, q|{"a":1,"b":2,"c":3,"d":4,"e":5,"f":6,"g":7,"h":8,"i":9}|);

       sub JSON::PP::own_sort { $JSON::PP::a cmp $JSON::PP::b }

    As the sorting routine runs in the JSON::PP scope, the given subroutine
    name and the special variables $a, $b will begin 'JSON::PP::'.

    If $integer is set, then the effect is same as "canonical" on.

INTERNAL
    For developers.

    PP_encode_box
        Returns

                {
                    depth        => $depth,
                    indent_count => $indent_count,
                }

    PP_decode_box
        Returns

                {
                    text    => $text,
                    at      => $at,
                    ch      => $ch,
                    len     => $len,
                    depth   => $depth,
                    encoding      => $encoding,
                    is_valid_utf8 => $is_valid_utf8,
                };

MAPPING
    See to "MAPPING" in JSON::XS.

UNICODE HANDLING ON PERLS
    If you do not know about Unicode on Perl well, please check "A FEW NOTES
    ON UNICODE AND PERL" in JSON::XS.

  Perl 5.8 and later
    Perl can handle Unicode and the JSON::PP de/encode methods also work
    properly.

        $json->allow_nonref->encode(chr hex 3042);
        $json->allow_nonref->encode(chr hex 12345);

    Reuturns "\u3042" and "\ud808\udf45" respectively.

        $json->allow_nonref->decode('"\u3042"');
        $json->allow_nonref->decode('"\ud808\udf45"');

    Returns UTF-8 encoded strings with UTF8 flag, regarded as "U+3042" and
    "U+12345".

    Note that the versions from Perl 5.8.0 to 5.8.2, Perl built-in "join"
    was broken, so JSON::PP wraps the "join" with a subroutine. Thus
    JSON::PP works slow in the versions.

  Perl 5.6
    Perl can handle Unicode and the JSON::PP de/encode methods also work.

  Perl 5.005
    Perl 5.005 is a byte sementics world -- all strings are sequences of
    bytes. That means the unicode handling is not available.

    In encoding,

        $json->allow_nonref->encode(chr hex 3042);  # hex 3042 is 12354.
        $json->allow_nonref->encode(chr hex 12345); # hex 12345 is 74565.

    Returns "B" and "E", as "chr" takes a value more than 255, it treats as
    "$value % 256", so the above codes are equivalent to :

        $json->allow_nonref->encode(chr 66);
        $json->allow_nonref->encode(chr 69);

    In decoding,

        $json->decode('"\u00e3\u0081\u0082"');

    The returned is a byte sequence "0xE3 0x81 0x82" for UTF-8 encoded
    japanese character ("HIRAGANA LETTER A"). And if it is represented in
    Unicode code point, "U+3042".

    Next,

        $json->decode('"\u3042"');

    We ordinary expect the returned value is a Unicode character "U+3042".
    But here is 5.005 world. This is "0xE3 0x81 0x82".

        $json->decode('"\ud808\udf45"');

    This is not a character "U+12345" but bytes - "0xf0 0x92 0x8d 0x85".

TODO
    speed
    memory saving

SEE ALSO
    Most of the document are copied and modified from JSON::XS doc.

    JSON::XS

    RFC4627 (<http://www.ietf.org/rfc/rfc4627.txt>)

AUTHOR
    Makamaka Hannyaharamitu, <makamaka[at]cpan.org>

COPYRIGHT AND LICENSE
    Copyright 2007-2010 by Makamaka Hannyaharamitu

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.