realityking
Contributor

Encoding with default settings is about 20-30% faster, encoding with JSON_UNESCAPED_UNICODE about 15% faster
Decoding is also ~5% faster

The encoding speed-up comes from avoiding the UTF-8 to UTF-16 conversion. The slight improvement in decoding performance is probably due to more inlining opportunities, as json_utf8_to_utf16 is now only used when decoding.

Binary size is reduced by 56 bytes.

@realityking
Contributor Author

I selected master, but IMO this might as well go into 5.6. I'll leave that up to merger.

@bukka
Member

bukka commented Mar 30, 2014

Hi,

I ran a few tests on your branch. I haven't seen any difference in decoding, which I expected as there are no changes in that code. How did you get 5% and what could be the reason for that? It looks like you just removed the utf16 check? Do you have any tests that show such an improvement?

I have done a few small string tests whose results were more or less the same, but I would probably have to do much more comprehensive testing for it.

Then I tested sample.json from https://code.google.com/p/json-test-suite/ (it's in the zip archive) and ran the test that I uploaded as a gist: https://gist.github.com/bukka/9877180

I got a big decrease in performance:
MASTER

[jakub@localhost json-test-suite]$ php test.php 
JSON decode success
JSON encode success
JSOND encode success
JSOND encode success
JSON decode:  time for 1000 iterations: 6.661762
JSON encode:  time for 1000 iterations: 1.729754
JSOND decode:  time for 1000 iterations: 2.443610
JSOND encode:  time for 1000 iterations: 1.744814

YOUR PATCH

[jakub@localhost json-test-suite]$ php test.php 
JSON decode success
JSON encode success
JSOND encode success
JSOND encode success
JSON decode:  time for 1000 iterations: 6.676425
JSON encode:  time for 1000 iterations: 3.006191
JSOND decode:  time for 1000 iterations: 2.428526
JSOND encode:  time for 1000 iterations: 1.749385

I tested the results with JSOND as well to make sure that the environment is the same. As you can see, JSON encode is almost twice as slow (3.01 s versus 1.73 s). Could you please double check it and run the test on your computer? You can even generate different sample files and test it on them! Maybe there was something wrong with my build or there is some mistake in the code?

One more question: Have you compiled your tests with -O2 (debug disabled)?

I think that such changes need to be properly tested with varied data. I could help with some testing as I do that for jsond anyway. However, I don't think it should go into 5.6, as it could introduce regressions if not properly tested.

Thanks

@realityking
Contributor Author

Very interesting results, thanks for testing.

Decode performance was just an observation from my benchmarks and happened "by accident". I suspect it's due to the removed condition and due to better inlining since json_utf8_to_utf16() is now only used for the decoder.

My build configuration for benchmarks is ./configure --disable-all --disable-cgi --disable-phpdbg --enable-json

As a benchmark I've used these scripts: https://gist.github.com/Rican7/6457237 (small data, 50000 iterations) and https://gist.github.com/realityking/9877752 (large data, 5000 iterations)

Here are results with the large data and default setting for json_encode()
PATCH

rouvens-air-7:php-src rouven$ ./sapi/cli/php bench.php -n 50000
Running benchmark for...
  json
  50000 times

Test completed!!
  Encoding time: 27.572732925415
  Decoding time: 33.475461006165
  Total time:    61.04819393158
  Encoded size:  43682 bytes
  Peak memory:   786432 bytes

PHP-5.6 branch

rouvens-air-7:php-src rouven$ ./php-old bench.php -n 50000
Running benchmark for...
  json
  50000 times

Test completed!!
  Encoding time: 34.740728139877
  Decoding time: 35.299659013748
  Total time:    70.040387153625
  Encoded size:  43682 bytes
  Peak memory:   786432 bytes

I've also repeated your test (minus the jsond part as I don't have that build), and the results for encode go in the same direction as yours (67% worse) but interestingly decode is still slightly faster ;)

PATCH

JSON decode success
JSON encode success
JSON decode:  time for 1000 iterations: 12.785667
JSON encode:  time for 1000 iterations: 5.455190

PHP-5.6 branch

JSON decode success
JSON encode success
JSON decode:  time for 1000 iterations: 13.231535
JSON encode:  time for 1000 iterations: 3.264485

I've also repeated your test with JSON_UNESCAPED_UNICODE and the results are much closer, but still in favor of current master.

PATCH

JSON decode success
JSON encode success
JSON decode:  time for 1000 iterations: 13.062503
JSON encode:  time for 1000 iterations: 2.948170

PHP-5.6 branch

JSON decode success
JSON encode success
JSON decode:  time for 1000 iterations: 13.183959
JSON encode:  time for 1000 iterations: 2.730596

I suspect the big difference is that my tests were mostly ASCII, while the linked test goes to town with Unicode (and manages to freeze TextWrangler). Looking at the code, this makes sense too. I basically traded the UTF-16 conversion for more costly encoding of Unicode (I suspect the snprintf is a part of that). I'll see if I can optimize that part a bit.

@realityking
Contributor Author

By eliminating the utf8_is_valid check (validity is still ensured by the decode function) I got much better (~10%) results. This avoids looping twice through the entire string, but comes at a cost for invalid data. Since valid data is by far the more common case, that seems acceptable.
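To make the trade-off concrete, here is a minimal sketch of validating while writing and rolling back on failure. It is not the patch's code: `out_buf`, `out_append` and `append_quoted` are hypothetical names, the buffer is a fixed-size stand-in for smart_str, no escaping is done, and the validation is structural only (it does not reject overlong forms or surrogates, which the DFA-based decoder does).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size output buffer standing in for smart_str. */
typedef struct {
    char   data[256];
    size_t len;
} out_buf;

static void out_append(out_buf *b, const char *s, size_t n)
{
    assert(b->len + n <= sizeof b->data);
    memcpy(b->data + b->len, s, n);
    b->len += n;
}

/* Copies s into b as a quoted string in a single pass.
 * Returns 0 on success, or -1 after rolling back and writing "null". */
static int append_quoted(out_buf *b, const unsigned char *s, size_t n)
{
    size_t oldlen = b->len;  /* rollback point */
    size_t pending = 0;      /* continuation bytes still expected */
    size_t i;
    int bad = 0;

    out_append(b, "\"", 1);
    for (i = 0; i < n && !bad; i++) {
        unsigned char c = s[i];
        if (pending > 0) {
            if ((c & 0xc0) == 0x80) pending--; else bad = 1;
        } else if (c < 0x80) {
            pending = 0;
        } else if ((c & 0xe0) == 0xc0) {
            pending = 1;
        } else if ((c & 0xf0) == 0xe0) {
            pending = 2;
        } else if ((c & 0xf8) == 0xf0) {
            pending = 3;
        } else {
            bad = 1;
        }
        if (!bad) out_append(b, (const char *)&c, 1);
    }
    if (bad || pending != 0) {
        b->len = oldlen;  /* discard the partial output */
        out_append(b, "null", 4);
        return -1;
    }
    out_append(b, "\"", 1);
    return 0;
}
```

Valid input pays for one loop only; invalid input costs the wasted writes up to the bad byte, which matches the trade-off described above.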

Default settings

JSON decode success
JSON encode success
JSON decode:  time for 1000 iterations: 13.024528
JSON encode:  time for 1000 iterations: 5.016359

JSON_UNESCAPED_UNICODE

JSON decode success
JSON encode success
JSON decode:  time for 1000 iterations: 13.086564
JSON encode:  time for 1000 iterations: 2.493844

So for JSON_UNESCAPED_UNICODE it's now faster than the PHP-5.6 branch; with default settings, however, it's still about 50% slower.

Edit: And for my benchmark script above it's now almost twice as fast. It seems to do rather well with longer strings.

@realityking
Contributor Author

As suspected, snprintf and the code that came with it seem to have been the culprit. By replacing these I'm now up to the following results:

JSON decode success
JSON encode success
JSON decode:  time for 1000 iterations: 13.041896
JSON encode:  time for 1000 iterations: 2.815254

That's around 15% faster than PHP-5.6 for me. My test case is now up to 90% faster. @bukka Could you rerun the test on your setup and see what results you now get?
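For readers wondering what replacing snprintf looks like in practice, a rough sketch follows. It assumes a lowercase hex lookup table similar to the `digits` array indexed in the patch; `hex_digits` and `json_hex4` are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hex digit lookup table; lowercase output is an assumption here. */
static const char hex_digits[] = "0123456789abcdef";

/* Hypothetical helper: writes the 6-byte escape \uXXXX for a 16-bit
 * code unit into out (no NUL terminator) and returns the number of
 * bytes written. No format string has to be parsed at run time. */
static size_t json_hex4(char *out, unsigned int unit)
{
    assert(out != NULL);
    out[0] = '\\';
    out[1] = 'u';
    out[2] = hex_digits[(unit >> 12) & 0xf];
    out[3] = hex_digits[(unit >> 8) & 0xf];
    out[4] = hex_digits[(unit >> 4) & 0xf];
    out[5] = hex_digits[unit & 0xf];
    return 6;
}
```

Calling `json_hex4(buf, 0x00e9)` fills `buf` with the six characters `\u00e9`.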

if (utf16) {
efree(utf16);
if (state != UTF8_ACCEPT) {
buf->len = oldlen;
Contributor Author

Is this actually the correct thing to do, or am I causing some kind of side effect I'm not aware of?

Member

You should probably reset buf->c before you set the new len; otherwise a null byte can be written after the new buf->len. It should be something like

buf->c -= oldlen - buf->len;
buf->len = oldlen;

It would be good to add some tests for it, to check that it generates the same output when there is ill-formed UTF-8.

Contributor Author

The behavior is already tested, for example here https://github.com/php/php-src/blob/master/ext/json/tests/bug43941.phpt

So apparently the append functions cover that, but I can probably add that line without harm. Will test.

Member

No, sorry, don't add that line. I just checked smart_str_appendc_ex and it would screw things up. It's OK as it is.

Contributor Author

Alright I won't ;)

@bukka
Member

bukka commented Mar 31, 2014

I am a bit busy right now but I will have a look and do the testing later... ;)

};

static inline php_uint32
decode(php_uint32* state, php_uint32* codep, php_uint32 byte) {
Member

I would change the name of the function to prevent possible (future) collisions with php codebase. Something like php_json_utf8_decode

Contributor Author

Changed.

@bukka
Member

bukka commented Mar 31, 2014

I finally looked at the code properly and it looks nice! However, you should probably add more tests to check that all UTF-8 byte sequences are handled correctly. Look at the Unicode Standard (Table 3-7 in version 6.2). It would also be good to check surrogate pairs the same way as in the current implementation.

I also ran the test and the regression is fixed:

JSON decode:  time for 1000 iterations: 6.703960
JSON encode:  time for 1000 iterations: 1.397669
JSOND decode:  time for 1000 iterations: 2.453284
JSOND encode:  time for 1000 iterations: 1.755271

Then I would like to add it to jsond and do a bit more testing, if that's OK with you?

@realityking
Contributor Author

Thanks again for testing! As mentioned above, invalid data is already tested, and I've encountered at least 1-, 2- and 4-byte sequences in my testing.

I don't see a check for surrogate pairs in the old code, what do you mean?

As for jsond, feel free to pull this in but I'd like to keep this open separately. IMO it's a change with far less risk and could be merged earlier.

@bukka
Member

bukka commented Mar 31, 2014

You are adding two Unicode escapes for the surrogate pair, which looks correct. I was wondering how this is handled in the previous code. So it looks like a correct change, but the encoded string will be different. I think that such changes should at least be added to the upgrade notes, in case someone did checksums for really weird JSON strings...

@bukka
Member

bukka commented Mar 31, 2014

It works fine in both implementations. I just tested ;)

@bukka
Member

bukka commented Mar 31, 2014

I plan to do some regression tests on surrogates only, which should cover perf, so don't worry about that :)

@realityking
Contributor Author

That's the reason the old code transcoded to UTF-16: it then looped over the individual 16-bit code units and created the Unicode escape sequences (except for those in the ASCII range) for each unit. The new code works on code points, so the code becomes a bit clearer (IMO).

The output should be 100% identical between the old and the new implementation.
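For reference, the surrogate arithmetic behind the two `\u` escapes can be sketched as below. `to_surrogates` is a hypothetical helper, not a function from the patch; it only shows why the constant 0xD7C0 yields the same pair as subtracting 0x10000 first.

```c
#include <assert.h>

/* Splits a supplementary code point (U+10000..U+10FFFF) into a UTF-16
 * surrogate pair. 0xD7C0 equals 0xD800 - (0x10000 >> 10), so the
 * "subtract 0x10000" step is folded into the constant. */
static void to_surrogates(unsigned long cp, unsigned int *hi, unsigned int *lo)
{
    assert(cp >= 0x10000UL && cp <= 0x10FFFFUL);
    *hi = (unsigned int)(0xD7C0 + (cp >> 10));
    *lo = (unsigned int)(0xDC00 + (cp & 0x3FF));
}
```

For U+1F600 this gives the pair D83D DE00, i.e. the escaped form `\ud83d\ude00`.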

@bukka
Member

bukka commented Apr 1, 2014

Yeah, I saw that after the post... :) I asked because I found a bug in decoding where it accepts an invalid surrogate character. I just wanted to check that the encoding is fine... ;)

@nikic
Member

nikic commented Feb 28, 2015

Now that jsond has been merged into master, is this perf improvement still relevant?

@realityking
Contributor Author

Last I checked, jsond didn't change the encoder. I'll get the branch running on PHP 7 in the next few days to check.

@ghost

ghost commented Mar 7, 2015

Can one of the admins verify this patch?

@peterbowey

I re-worked the patch ac2548e to verify (without merge conflicts) against the PHP-5.6.7 branch.

From ac2548e11367da019932589fd599fd1079e2ba8d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Rouven=20We=C3=9Fling?= <me@rouvenwessling.de>
Date: Sun, 30 Mar 2014 07:42:33 +0200
Subject: [PATCH] Improve performance of json_encode().

---
 ext/json/json.c        | 126 +++++++++++++++++++++++++------------------------
 ext/json/utf8decoder.h |  54 +++++++++++++++++++++
 2 files changed, 119 insertions(+), 61 deletions(-)
 create mode 100644 ext/json/utf8decoder.h

diff --git a/ext/json/json.c b/ext/json/json.c
index 71f8cc2..f636467 100644
--- a/ext/json/json.c
+++ b/ext/json/json.c
@@ -29,6 +29,7 @@
 #include "ext/standard/php_smart_str.h"
 #include "JSON_parser.h"
 #include "php_json.h"
+#include "utf8decoder.h"
 #include <zend_exceptions.h>

 #include <float.h>
@@ -365,45 +366,32 @@ static int json_utf8_to_utf16(unsigned short *utf16, char utf8[], int len) /* {{
    size_t pos = 0, us;
    int j, status;

-   if (utf16) {
-       /* really convert the utf8 string */
-       for (j=0 ; pos < len ; j++) {
-           us = php_next_utf8_char((const unsigned char *)utf8, len, &pos, &status);
-           if (status != SUCCESS) {
-               return -1;
-           }
-           /* From http://en.wikipedia.org/wiki/UTF16 */
-           if (us >= 0x10000) {
-               us -= 0x10000;
-               utf16[j++] = (unsigned short)((us >> 10) | 0xd800);
-               utf16[j] = (unsigned short)((us & 0x3ff) | 0xdc00);
-           } else {
-               utf16[j] = (unsigned short)us;
-           }
+   for (j=0 ; pos < len ; j++) {
+       us = php_next_utf8_char((const unsigned char *)utf8, len, &pos, &status);
+       if (status != SUCCESS) {
+           return -1;
        }
-   } else {
-       /* Only check if utf8 string is valid, and compute utf16 length */
-       for (j=0 ; pos < len ; j++) {
-           us = php_next_utf8_char((const unsigned char *)utf8, len, &pos, &status);
-           if (status != SUCCESS) {
-               return -1;
-           }
-           if (us >= 0x10000) {
-               j++;
-           }
+       /* From http://en.wikipedia.org/wiki/UTF16 */
+       if (us >= 0x10000) {
+           us -= 0x10000;
+           utf16[j++] = (unsigned short)((us >> 10) | 0xd800);
+           utf16[j] = (unsigned short)((us & 0x3ff) | 0xdc00);
+       } else {
+           utf16[j] = (unsigned short)us;
        }
    }
+
    return j;
 }
 /* }}} */

-
 static void json_escape_string(smart_str *buf, char *s, int len, int options TSRMLS_DC) /* {{{ */
 {
-   int pos = 0, ulen = 0;
-   unsigned short us;
-   unsigned short *utf16;
+   size_t count;
    size_t newlen;
+   php_uint32 codepoint;
+   php_uint32 state = 0;
+   size_t oldlen;

    if (len == 0) {
        smart_str_appendl(buf, "\"\"", 2);
@@ -431,36 +419,21 @@ static void json_escape_string(smart_str *buf, char *s, int len, int options TSR
                return;
            }
        }
-
-   }
-
-   utf16 = (options & PHP_JSON_UNESCAPED_UNICODE) ? NULL : (unsigned short *) safe_emalloc(len, sizeof(unsigned short), 0);
-   ulen = json_utf8_to_utf16(utf16, s, len);
-   if (ulen <= 0) {
-       if (utf16) {
-           efree(utf16);
-       }
-       if (ulen < 0) {
-           JSON_G(error_code) = PHP_JSON_ERROR_UTF8;
-           smart_str_appendl(buf, "null", 4);
-       } else {
-           smart_str_appendl(buf, "\"\"", 2);
-       }
-       return;
-   }
-   if (!(options & PHP_JSON_UNESCAPED_UNICODE)) {
-       len = ulen;
    }
+   
+   oldlen = buf->len;

    /* pre-allocate for string length plus 2 quotes */
    smart_str_alloc(buf, len+2, 0);
    smart_str_appendc(buf, '"');

-   while (pos < len)
+   for (count = 0; count < len; count++, s++)
    {
-       us = (options & PHP_JSON_UNESCAPED_UNICODE) ? s[pos++] : utf16[pos++];
+       if (php_json_utf8_decode(&state, &codepoint, (php_char8)*s)) {
+           continue;
+       }

-       switch (us)
+       switch (codepoint)
        {
            case '"':
                if (options & PHP_JSON_HEX_QUOT) {
@@ -535,23 +508,54 @@ static void json_escape_string(smart_str *buf, char *s, int len, int options TSR
                break;

            default:
-               if (us >= ' ' && ((options & PHP_JSON_UNESCAPED_UNICODE) || (us & 127) == us)) {
-                   smart_str_appendc(buf, (unsigned char) us);
+               if (codepoint >= ' ' && ((options & PHP_JSON_UNESCAPED_UNICODE) || codepoint <= 0x7f)) {
+                   if (codepoint <= 0x7f) {
+                       smart_str_appendc(buf, (unsigned char) codepoint);
+                   } else if (codepoint <= 0x7ff) {
+                       smart_str_appendc(buf, (unsigned char) (0xc0 + (codepoint >> 6)));
+                       smart_str_appendc(buf, (unsigned char) (0x80 + (codepoint & 0x3f)));
+                   } else if (codepoint <= 0xffff) {
+                       smart_str_appendc(buf, (unsigned char) (0xe0 + (codepoint >> 12)));
+                       smart_str_appendc(buf, (unsigned char) (0x80 + ((codepoint >> 6) & 0x3f)));
+                       smart_str_appendc(buf, (unsigned char) (0x80 + (codepoint & 0x3f)));
+                   } else if (codepoint <= 0x1ffff) {
+                       smart_str_appendc(buf, (unsigned char) (0xf0 + (codepoint >> 18)));
+                       smart_str_appendc(buf, (unsigned char) (0x80 + ((codepoint >> 12) & 0x3f)));
+                       smart_str_appendc(buf, (unsigned char) (0x80 + ((codepoint >> 6) & 0x3f)));
+                       smart_str_appendc(buf, (unsigned char) (0x80 + (codepoint & 0x3f)));
+                   }
                } else {
-                   smart_str_appendl(buf, "\\u", 2);
-                   smart_str_appendc(buf, digits[(us & 0xf000) >> 12]);
-                   smart_str_appendc(buf, digits[(us & 0xf00)  >> 8]);
-                   smart_str_appendc(buf, digits[(us & 0xf0)   >> 4]);
-                   smart_str_appendc(buf, digits[(us & 0xf)]);
+                   if (codepoint <= 0xffff) {
+                       smart_str_appendl(buf, "\\u", 2);
+                       smart_str_appendc(buf, digits[(codepoint & 0xf000) >> 12]);
+                       smart_str_appendc(buf, digits[(codepoint & 0xf00)  >> 8]);
+                       smart_str_appendc(buf, digits[(codepoint & 0xf0)   >> 4]);
+                       smart_str_appendc(buf, digits[(codepoint & 0xf)]);
+                   } else {
+                       smart_str_appendl(buf, "\\u", 2);
+                       smart_str_appendc(buf, digits[((0xD7C0 + (codepoint >> 10)) & 0xf000) >> 12]);
+                       smart_str_appendc(buf, digits[((0xD7C0 + (codepoint >> 10)) & 0xf00)  >> 8]);
+                       smart_str_appendc(buf, digits[((0xD7C0 + (codepoint >> 10)) & 0xf0)   >> 4]);
+                       smart_str_appendc(buf, digits[((0xD7C0 + (codepoint >> 10)) & 0xf)]);
+                       smart_str_appendl(buf, "\\u", 2);
+                       smart_str_appendc(buf, digits[((0xDC00 + (codepoint & 0x3FF)) & 0xf000) >> 12]);
+                       smart_str_appendc(buf, digits[((0xDC00 + (codepoint & 0x3FF)) & 0xf00)  >> 8]);
+                       smart_str_appendc(buf, digits[((0xDC00 + (codepoint & 0x3FF)) & 0xf0)   >> 4]);
+                       smart_str_appendc(buf, digits[((0xDC00 + (codepoint & 0x3FF)) & 0xf)]);
+                   }
                }
                break;
        }
    }

-   smart_str_appendc(buf, '"');
-   if (utf16) {
-       efree(utf16);
+   if (state != UTF8_ACCEPT) {
+       buf->len = oldlen;
+       JSON_G(error_code) = PHP_JSON_ERROR_UTF8;
+       smart_str_appendl(buf, "null", 4);
+       return;
    }
+
+   smart_str_appendc(buf, '"');
 }
 /* }}} */

diff --git a/ext/json/utf8decoder.h b/ext/json/utf8decoder.h
new file mode 100644
index 0000000..24684c3
--- /dev/null
+++ b/ext/json/utf8decoder.h
@@ -0,0 +1,54 @@
+/*
+ Copyright (c) 2008-2009 Bjoern Hoehrmann <bjoern@hoehrmann.de>
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+*/
+
+#include "ext/standard/basic_functions.h"
+
+/*
+ * Copyright (c) 2008-2009 Bjoern Hoehrmann <bjoern@hoehrmann.de>
+ * See http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ for details.
+ */
+
+#define UTF8_ACCEPT 0
+#define UTF8_REJECT 1
+
+typedef unsigned char php_char8;
+
+static const php_char8 utf8d[] = {
+  // The first part of the table maps bytes to character classes that
+  // to reduce the size of the transition table and create bitmasks.
+   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
+   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
+   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
+   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
+   1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,  9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
+   7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,  7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
+   8,8,2,2,2,2,2,2,2,2,2,2,2,2,2,2,  2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
+  10,3,3,3,3,3,3,3,3,3,3,3,3,4,3,3, 11,6,6,6,5,8,8,8,8,8,8,8,8,8,8,8,
+
+  // The second part is a transition table that maps a combination
+  // of a state of the automaton and a character class to a state.
+   0,12,24,36,60,96,84,12,12,12,48,72, 12,12,12,12,12,12,12,12,12,12,12,12,
+  12, 0,12,12,12,12,12, 0,12, 0,12,12, 12,24,12,12,12,12,12,24,12,24,12,12,
+  12,12,12,12,12,12,12,24,12,12,12,12, 12,24,12,12,12,12,12,12,12,24,12,12,
+  12,12,12,12,12,12,12,36,12,36,12,12, 12,36,12,12,12,12,12,36,12,36,12,12,
+  12,36,12,12,12,12,12,12,12,12,12,12, 
+};
+
+static inline php_uint32
+php_json_utf8_decode(php_uint32* state, php_uint32* codep, php_uint32 byte) {
+  php_uint32 type = utf8d[byte];
+
+  *codep = (*state != UTF8_ACCEPT) ?
+    (byte & 0x3fu) | (*codep << 6) :
+    (0xff >> type) & (byte);
+
+  *state = utf8d[256 + *state + type];
+  return *state;
+}

Result: no fuzz or merge conflicts against PHP-5.6.7.
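As a companion to the JSON_UNESCAPED_UNICODE branch of the patch above, which re-encodes each decoded code point as UTF-8, here is a standalone sketch of that encoding step. `utf8_encode` is a hypothetical helper. Note one deliberate difference: the four-byte branch here is bounded by 0x10FFFF, the highest Unicode code point, whereas the patch as posted compares against 0x1ffff and has no branch for larger values.

```c
#include <stddef.h>

/* Encodes code point cp as UTF-8 into out (room for 4 bytes needed).
 * Returns the number of bytes written, or 0 if cp is a surrogate or
 * lies above U+10FFFF. */
static size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp <= 0x7f) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp <= 0x7ff) {
        out[0] = (unsigned char)(0xc0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3f));
        return 2;
    }
    if (cp <= 0xffff) {
        if (cp >= 0xd800 && cp <= 0xdfff) {
            return 0;  /* surrogates are not valid scalar values */
        }
        out[0] = (unsigned char)(0xe0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
        out[2] = (unsigned char)(0x80 | (cp & 0x3f));
        return 3;
    }
    if (cp <= 0x10ffff) {
        out[0] = (unsigned char)(0xf0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3f));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
        out[3] = (unsigned char)(0x80 | (cp & 0x3f));
        return 4;
    }
    return 0;
}
```

With this, U+00E9 encodes to C3 A9 and U+1F600 to F0 9F 98 80.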

@laruence
Member

@bukka what do you think of this?

@peterbowey

The revised patch was applied and tested against the latest PHP 5.6.7 branch on a production PHP web server. Over an 11-hour window, log events indicate that the revised json_encode() runs without code issues. I have no solid benchmarks, only the observation that code using the revised json_encode() keeps working, as given above.

@bukka
Member

bukka commented Mar 30, 2015

I have a modified version of this in the jsond ext ( https://github.com/bukka/php-jsond/blob/4ab59945459ed7bc121b323cfcaa3ab5ba7a2da8/jsond_encoder.c#L179 ). It copies data to the buffer (smart_str) only when necessary (before an escape or at the end). Of course, it needs to do a bit of marking to remember the end of the last flushed run when there is an escape. My initial tests showed that it is faster than this, but I need to test it a bit more. I also removed some unnecessary bits like https://github.com/php/php-src/pull/636/files#diff-7b5584898bd76be07eca1aaba23b75f4R511 (there is no need to do that, as the data is already in the input buffer and can simply be copied). I will probably go for the jsond version.

I have it on my list as one of the priorities and I would like to port it soon and compare it with the current master. I hope to do that in the next few weeks.

I don’t think that this should go to 5.6 as it is not a bug fix and it is a bit invasive for a point release. It can go only to master (PHP 7) IMHO…
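The "copy only when necessary" idea described in this comment can be illustrated roughly as follows. This is not the jsond code: `run_buf`, `run_append` and `escape_runs` are hypothetical names, the buffer is a fixed-size stand-in for smart_str, and only `"` and `\` are escaped to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size output buffer standing in for smart_str. */
typedef struct {
    char   data[256];
    size_t len;
} run_buf;

static void run_append(run_buf *b, const char *s, size_t n)
{
    assert(b->len + n <= sizeof b->data);
    memcpy(b->data + b->len, s, n);
    b->len += n;
}

/* Unescaped bytes are not copied one at a time; the pending run
 * [mark, i) is flushed in one call just before an escape and once
 * more at the end of the input. */
static void escape_runs(run_buf *b, const char *s, size_t n)
{
    size_t mark = 0;  /* start of the run not yet flushed */
    size_t i;

    run_append(b, "\"", 1);
    for (i = 0; i < n; i++) {
        if (s[i] == '"' || s[i] == '\\') {
            run_append(b, s + mark, i - mark);  /* flush pending run */
            run_append(b, "\\", 1);
            run_append(b, s + i, 1);
            mark = i + 1;
        }
    }
    run_append(b, s + mark, n - mark);  /* flush the tail */
    run_append(b, "\"", 1);
}
```

A string with no characters to escape is then written with a single bulk copy instead of one append per byte.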

@bukka
Member

bukka commented Apr 16, 2015

Just a quick update: I have ported this to PHP 7. I also ported the jsond version of it. Both commits can be seen here:
master...bukka:json_utf8_decoder

Both of these were apparent improvements on PHP 5. However, that is not the case for PHP 7, which also got some encoding optimizations from Dmitry. I have run benchmarks on 531 different JSON instances and I see some decrease in performance (for both commits) compared to current master. However, there is plenty of room for improvement, so it might eventually get better. Before I start looking into it, I would like to first improve my benchmark tools to get more complete results. I would also like to do some real application testing (e.g. RESTful services), as looping tests don't always give the correct idea about performance in real apps.

@peterbowey

@bukka
Thanks for that - looks good!
Interesting comment [benchmark] on how this is an 'apparent improvements on PHP 5'... and [not] PHP 7 :)

@laruence
Member

already implemented in jsond

@laruence laruence closed this Dec 26, 2015
@bukka
Member

bukka commented Dec 26, 2015

Just to clarify: this is not part of the jsond that is in core, as it didn't perform as expected. As I said in my previous comment, it will require a bit more testing and improving, which I have not had time for yet...

Anyway it will be different patch if anything so it's cool to close this. Cheers.
