Performance problem or benchmarking issue? #56
My patch dcc6b73 adds your benchmark to aeson. I do get some different numbers when running it on …

@iustin note that the sizes of the encoded JSON values are different:

```haskell
print $ encodeJ jdata
print $ encodeA adata
```
That probably explains why aeson is slower. Now we need to figure out why they differ in size. EDIT: I think the difference can be explained by the fact that …
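For readers following along, here is a hedged sketch of such a size comparison. The file name, the decode plumbing, and the use of `Data.Aeson.Value`/`Text.JSON.JSValue` as generic representations are assumptions, not the thread's actual code:

```haskell
-- Sketch: compare the sizes of the two libraries' encoded output for
-- the same input document. Assumes an "input.json" file on disk.
import qualified Data.Aeson as A
import qualified Data.ByteString.Lazy as BL
import qualified Text.JSON as J

main :: IO ()
main = do
  -- Parse the same document with each library.
  raw <- readFile "input.json"
  let J.Ok jdata = J.decode raw :: J.Result J.JSValue
  braw <- BL.readFile "input.json"
  let Just adata = A.decode braw :: Maybe A.Value
  -- If the encoded sizes differ, the two encoders are not producing
  -- equivalent output, and the encoding timings are not comparable.
  print (length    (J.encode jdata))   -- String length (Chars)
  print (BL.length (A.encode adata))   -- UTF-8 byte length
```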
Bas, thanks for taking a look at this.
What worries me more is that on your benchmark (#57) you get approximately 4x faster decoding with aeson, whereas on my data file it's the same. That probably means there are some JSON constructs that aeson does not handle well on decode. I started these benchmarks mostly because I was expecting, I don't know, a 5x or greater speed difference, even just from the use of ByteString versus String. The fact that we get roughly the same numbers makes me sad… Well, for now I can stay with json and not think yet about migration, that is all. Thanks again!
Do note that I also used GHC-7.4.1-rc1 and you used ghc-6.12.1. That could also make a big difference.
Yeah, but I thought it would make more or less the same difference for the json library itself, too. With my current compiler, against your data file, I get the following:
So yes, it seems that file is somehow "biased" toward aeson on decoding performance (and towards json on encoding). I'll try and sanitise my own JSON file which shows similar performance, maybe it can help uncover some corner-case behaviour in aeson.
I've looked into this a little, and cleaned up the benchmark some. You can see the improved benchmark here. What I see (using GHC 7.2) is that decoding is far faster with aeson (4x to 6x), as I expected. Encoding is only a little faster, though, and this is definitely strange. I can't find any comparative performance numbers from older versions, so I don't yet know whether this is a regression or what. Regardless, now that I know about this, and have a benchmark, I should hopefully be able to do something about it. Thanks for bringing this up!
I managed to improve encoding performance by about 15% in 9169e42. Also, in 833c8fd I noticed that the json encoder wasn't generating a lazy UTF-8 bytestring like the aeson encoder, so I fixed that to make the comparison fairer. There's still a performance gap: aeson's encoding is about 3% slower than json's.
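The fairness fix above amounts to making both encoders pay for UTF-8 encoding. A minimal sketch of what the json side of such a benchmark harness might look like (the helper name is an assumption; json's `encode` produces a `String`, so it is converted to a lazy UTF-8 `ByteString` to match aeson's output type):

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE
import qualified Text.JSON as J

-- json's encode yields a String; pack and UTF-8 encode it so that
-- both libraries' encoding benchmarks produce the same output type
-- and do comparable work.
encodeJsonUtf8 :: J.JSValue -> BL.ByteString
encodeJsonUtf8 = TLE.encodeUtf8 . TL.pack . J.encode
```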
I just released new versions of text and aeson: used together, they improve encoding performance by 20% compared to the previous releases of those packages. I can't currently see a way to make encoding faster again, but we're now faster than the json package (if only by a little). I'm going to leave this open for a little while, to remind me to look at encoding performance again.
Thanks a lot, sounds good. This makes it feasible to move to aeson without a regression in speed, which is excellent. |
Jfyi, I've tested with GHC-7.4.1RC1 and with GHC-7.2.2 (see …)
I just pushed new versions of aeson and text again. The new version of aeson has 33% better encoding performance than 0.4 (so about another 10% on top of yesterday's release). Handily enough, improving aeson involved making some UTF-8 encoding and string breaking improvements to text that will benefit everyone. |
@hvr, do you think you could try with the new releases? I'll take a look when I get a chance, but it would be nice to have some help :-) |
@bos I've added benchmark measurements for the text-0.11.1.12 + aeson-0.5.0.0 combination to the aforementioned gist… It looks a bit better now, although there still seems to be a tendency for GHC-7.4 to optimize the …
I've had a go at this issue using my branch of the bytestring library, which contains the new bytestring builder. The results are encouraging. Using GHC 7.2.1 on an i7 on 64-bit Linux, I get a …

Moreover, I also get a …
I attribute these speedups to the following three improvements:
Here are the results of the benchmarks and links to the corresponding branch of the text repository and the corresponding branch of the aeson repository. Note that json is still 2x faster for encoding. Based on these results, I suggest letting this issue wait some more until my patch to the bytestring library has found its way upstream. Actually, I still have to send it to Duncan first. This will happen in the next two weeks.
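To illustrate the builder approach being discussed: the new bytestring builder fills an output buffer directly from small `Builder` fragments instead of concatenating intermediate strings. A minimal sketch in that style (this is an illustration, not aeson's actual escaping code, which handles the full JSON escape set):

```haskell
-- Sketch of builder-based JSON string rendering using the
-- Data.ByteString.Builder API from bytestring >= 0.10.
import Data.ByteString.Builder
import qualified Data.ByteString.Lazy as BL

-- Render a JSON string literal by appending Builder fragments; the
-- Builder writes into a buffer directly, avoiding intermediate
-- ByteString allocations. Only '"' and '\\' are escaped here.
jsonString :: String -> Builder
jsonString s = charUtf8 '"' <> foldMap escape s <> charUtf8 '"'
  where
    escape '"'  = byteString "\\\""
    escape '\\' = byteString "\\\\"
    escape c    = charUtf8 c

main :: IO ()
main = BL.putStr (toLazyByteString (jsonString "he said \"hi\""))
```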
@meiersi sweet… I really hope that the new bytestring library gets released soon, as I can really use those improvements… :-)
I've merged @meiersi's changes to the text library into the main repo; see for instance b8c8f11923d46b5423ce13a98343865c209e53df. The new functionality will only be available from the text package if the environment contains a version of bytestring >= 0.10.4.0, which for now limits it to the perpetually upcoming GHC 7.8. I had to add the conditionality myself. Merging the corresponding changes to the aeson library looks much messier—I'll definitely need a clean stack of rebased commits to review, and they'll have to come with backwards compatibility baked in. |
That's good news. Thanks for the merge. I've prepared a pull request for …
With the new …, the overall status is that we're now about as fast as we can be without hand-rolling something similar to the buffer-builder package.
Hi,
I'm trying to compare the performance of json (Text.JSON) and aeson, and I get surprising numbers. I apologise in advance if the problem is in fact with my benchmark setup rather than an actual performance issue.
I have the following benchmark program:
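(The benchmark program itself was not captured in this copy of the issue. A hedged reconstruction of what a criterion-based comparison of the two libraries might look like follows; the file name and the exact set of benchmarks are assumptions, and `nf` assumes `NFData` instances exist for the result types, which may require orphan instances for json's types.)

```haskell
-- Sketch of a criterion benchmark comparing json and aeson on the
-- same input file; not the reporter's original program.
import Criterion.Main
import qualified Data.Aeson as A
import qualified Data.ByteString.Lazy as BL
import qualified Text.JSON as J

main :: IO ()
main = do
  raw  <- readFile "data.json"      -- the ~1.1 MB input file
  braw <- BL.readFile "data.json"
  let J.Ok jdata = J.decode raw :: J.Result J.JSValue
      Just adata = A.decode braw  :: Maybe A.Value
  defaultMain
    [ bench "json/decode"  $ nf (J.decode :: String -> J.Result J.JSValue) raw
    , bench "json/encode"  $ nf J.encode jdata
    , bench "aeson/decode" $ nf (A.decode :: BL.ByteString -> Maybe A.Value) braw
    , bench "aeson/encode" $ nf A.encode adata
    ]
```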
Run on an ~1.1 MB JSON input file, with GHC 6.12.1, aeson 0.4.0, and json 0.4.3, it gives the following:
I would expect aeson to be faster, but the numbers are really really close, so I'm not sure what I'm doing wrong here. Any hints? It almost looks like I'm not testing the actual encoding/decoding.
Unfortunately I can't easily provide the actual JSON file, but I can try to produce a sanitised one if that's needed to help debug the issue.