Performance enhancements (mainly) to json_object_to_json_string() #208
Closed — rgerhards wants to merge 9 commits into json-c:master from rgerhards:enh-perf-json_object_object_to_json_string
Conversation
This function and its helpers used to call sprintbuf() even for constant strings. That is expensive, because the call ends up in vprintf() even though no formatting of the string is needed. In other places the code already used printbuf_memappend() for such cases; that is the right route to follow, because it greatly decreases processing time. This patch makes that replacement and keeps using sprintbuf() only where it really is required. In my testing, this resulted in a very noticeable performance improvement (n *times* faster, where n depends on the details of what is formatted and with which options).
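To illustrate the difference, here is a minimal, self-contained sketch of a printbuf-style plain-memory append. The struct and function body are simplified stand-ins for json-c's internals, not the actual implementation; the point is that a constant string goes through a length-checked memcpy() with no format-string parsing at all:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal printbuf, for illustration only. */
struct printbuf { char *buf; int bpos; int size; };

/* Plain memory append: no format parsing, just a length-checked memcpy().
 * This is the kind of fast path the patch switches constant strings to,
 * instead of routing them through sprintbuf() and thus vprintf(). */
static int printbuf_memappend(struct printbuf *p, const char *data, int len)
{
    if (p->bpos + len + 1 > p->size) {
        int newsize = p->size * 2 + len + 1;
        char *t = realloc(p->buf, newsize);
        if (!t)
            return -1;
        p->buf = t;
        p->size = newsize;
    }
    memcpy(p->buf + p->bpos, data, len);
    p->bpos += len;
    p->buf[p->bpos] = '\0';
    return len;
}
```

A constant like `"{ "` appended this way skips the whole varargs/format machinery that a `sprintbuf(p, "{ ")` call would drag in.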
We avoid writing to "memory" just to check the loop termination condition. While this looks like it should make no real difference, the profiler (valgrind's callgrind tool) shows that it actually does. Wall-clock time is also consistently a bit lower than with the previous method. As the change does not hurt anything (no more complex code or the like), I think it is worth it.
Do not call memcpy() for single-byte values; writing them directly involves less overhead and is therefore faster. The effect was verified with valgrind's callgrind profiler and wall-clock timing.
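The contrast is small but concrete. Below is an illustrative sketch (the helper names and the caller-provided buffer are invented for this example, not json-c code): for a single byte, a direct store replaces a function call into memcpy():

```c
#include <string.h>

/* Hypothetical append helpers over a caller-provided buffer; 'pos'
 * tracks the write offset. Illustration only, not json-c internals. */
static void append_bytes(char *buf, int *pos, const char *src, int len)
{
    memcpy(buf + *pos, src, len); /* fine for multi-byte chunks */
    *pos += len;
}

/* For a single byte, a direct store avoids the memcpy() call overhead. */
static void append_char(char *buf, int *pos, char c)
{
    buf[*pos] = c;
    (*pos)++;
}
```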
The regular printbuf calls always keep the string properly terminated. However, this is not required while the printable string is being constructed and is thus simply unnecessary overhead. I have added a new function that defers writing the terminator until the result is finally finished. Both the profiler (valgrind's callgrind) and wall-clock timing show this is a clear improvement.
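A minimal sketch of the idea follows. The struct and function names here are invented for illustration (the PR's actual function name is not shown in this conversation): appends skip the terminator, and a single finishing step writes the NUL once:

```c
#include <string.h>

/* Illustration only: append without terminating on every call, then
 * write the NUL exactly once when the string is complete. */
struct sbuf { char buf[64]; int pos; };

static void sbuf_append_nonul(struct sbuf *s, const char *data, int len)
{
    memcpy(s->buf + s->pos, data, len); /* no terminator written here */
    s->pos += len;
}

static const char *sbuf_finish(struct sbuf *s)
{
    s->buf[s->pos] = '\0'; /* terminate once, at the very end */
    return s->buf;
}
```

For a string built from many small appends, this removes one store per append, which the commit message reports as measurable in both callgrind and wall-clock terms.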
The previous method relied on return codes to convey failure; however, most often the return code was not checked. This could lead to several issues, among them malformedness of the generated string. Also, it is very unlikely to really get into an OOM condition, and if we do, it is extremely likely that the rest of the calling application will not behave correctly either. This patch takes a different approach: if we run out of memory, we ignore the data for which no memory could be allocated. In essence, this can result in the same random malformedness as the previous approach, but we no longer try to signal back to the original caller; note that with the previous approach this did not work reliably anyway. The benefits of this approach are some performance gain, simpler code, and a clearly defined failure state.
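A sketch of that failure policy, with invented names (the real commit is 4969ffd, whose diff is not shown here): on allocation failure the buffer latches a failure flag and silently drops all further data, so no caller ever needs to check a return code:

```c
#include <stdlib.h>
#include <string.h>

/* Illustration of the proposed OOM policy, not json-c's actual code. */
struct obuf { char *buf; int pos; int size; int oom; };

static void obuf_append(struct obuf *o, const char *data, int len)
{
    if (o->oom)
        return;                         /* clearly defined failure state */
    if (o->pos + len + 1 > o->size) {
        int newsize = o->size * 2 + len + 1;
        char *t = realloc(o->buf, newsize);
        if (!t) {
            o->oom = 1;                 /* drop data, remember the failure */
            return;
        }
        o->buf = t;
        o->size = newsize;
    }
    memcpy(o->buf + o->pos, data, len);
    o->pos += len;
    o->buf[o->pos] = '\0';
}
```

Because the failure is latched in one place, callers stay simple and the hot path pays only a single flag check per append.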
String escaping is a surprisingly performance-intense operation. I spent many hours in the profiler, and the root problem seems to be that there is no easy way to detect the character classes that need to be escaped, because these characters are spread all over the ASCII table. I tried several approaches, including call tables, restructuring the case condition, different types of if conditions, and reordering the if conditions.

What worked out best is this: the regular case is that a character must not be escaped, so we want to process that case as fast as possible. To detect it as quickly as possible, we have a lookup table that tells us whether a char needs escaping ("needsEscape", below). This table has a slot for each ASCII code. Note that it uses chars, because anything larger causes worse cache behavior and anything smaller requires bit indexing and masking operations, which are also comparatively costly. So plain chars work best.

What we then do is a single lookup into the table to detect whether we need to escape a character. If we do, we go into the depth of the actual escape handling. But if we do NOT need to escape, we just quickly advance the index and are done with that char. Note that it may look like the extra table lookup costs performance, but quite the contrary is the case: we get a more than 30% performance increase due to it (compared to the latest version of the code that did not do the lookups).
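The description above maps to something like the following sketch. The table contents here are illustrative (marking `"`, `\`, and control characters below 0x20 as needing escapes); the real table in the patch may cover more cases, such as option-dependent slash escaping:

```c
#include <string.h>

/* One byte per character code, nonzero iff the character needs JSON
 * escaping. Illustrative contents: control chars, '"' (0x22), '\\' (0x5c).
 * A char-sized table is cache-friendly and needs no bit masking. */
static const char needsEscape[256] = {
    /* 0x00-0x1f: control characters must be escaped */
    1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,
    1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,
    /* 0x20-0x2f: only '"' (0x22) */
    0,0,1,0,0,0,0,0, 0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,
    /* 0x50-0x5f: only '\\' (0x5c) */
    0,0,0,0,0,0,0,0, 0,0,0,0,1,0,0,0
    /* remaining entries default to 0 */
};

/* Fast path: scan plain characters with a single table lookup each,
 * returning the length of the leading run that needs no escaping. */
static int span_unescaped(const char *s, int len)
{
    int i = 0;
    while (i < len && !needsEscape[(unsigned char)s[i]])
        i++;
    return i;
}
```

The common case (no escaping) costs one load and one compare per character; only when the table says so does the code drop into the slower per-character escape logic.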
With the new approach we can simplify the code a bit. As a side effect, we get a very slight further performance increase.
The default printbuf size of 32 bytes is very conservative and can cause a lot of malloc() activity for all but trivial JSON objects. The new json_global_set_printbuf_initial_size() API permits an application to set a new default size.
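Since this PR was closed, json_global_set_printbuf_initial_size() may not exist in any json-c release; the following is a sketch of how such an API could plausibly work, with the allocation helper name invented for illustration:

```c
#include <stdlib.h>

/* Process-wide default for the initial printbuf allocation, replacing
 * the hard-coded 32 bytes. Sketch of the proposed (closed-PR) API. */
static int printbuf_initial_size = 32;

void json_global_set_printbuf_initial_size(int size)
{
    if (size > 0)
        printbuf_initial_size = size;
}

/* A printbuf constructor would then allocate the configured size up
 * front, avoiding repeated realloc() for typically-sized objects.
 * (Hypothetical helper name, for illustration.) */
static char *printbuf_new_buf(int *size_out)
{
    *size_out = printbuf_initial_size;
    return malloc(printbuf_initial_size);
}
```

An application that serializes large objects could call the setter once at startup, e.g. with 4096, trading a little memory for far fewer reallocations.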
Restored the int return type for the *_to_json_string formatting functions, as otherwise we would need to break a public interface. The performance impact is minimal, as the result is never evaluated.
Closing PR due to https://groups.google.com/forum/#!topic/json-c/OeEBrx1iIvs
This patch set optimizes various aspects of generating the JSON output. Each patch is commented in depth. For my use case, performance is increased by a factor of ~3.
Special consideration should go to 4969ffd, which introduces a new method of out-of-memory handling that replaces the inconsistent previous method. Again, the commit comment has all the details.