This function and its helpers used to call sprintbuf() even for constant
strings. That costs a lot of performance, because the call ends
up in vprintf(), which is unnecessary when there is no formatting
of the string to be done.
In other places in the code, printbuf_memappend() was already used in
such cases. That is the right route to follow, because it greatly
decreases processing time.
This patch performs that replacement and keeps sprintbuf() only
where it really is required. In my testing, this resulted in a very
noticeable performance improvement (n *times* faster, where n
depends on the details of what is formatted and with which options).
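The pattern, in a minimal sketch (the struct and growth policy here are
simplified stand-ins for json-c's printbuf, not the real implementation):

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for json-c's struct printbuf. */
struct printbuf {
    char *buf;
    int size;  /* allocated bytes */
    int bpos;  /* bytes used */
};

/* Append a known-length byte sequence directly -- no printf
 * machinery involved. */
static void printbuf_memappend(struct printbuf *p, const char *data, int len)
{
    if (p->bpos + len + 1 > p->size) {
        int newsize = p->size > 0 ? p->size : 32;
        while (newsize < p->bpos + len + 1)
            newsize *= 2;
        p->buf = realloc(p->buf, newsize);
        p->size = newsize;
    }
    memcpy(p->buf + p->bpos, data, len);
    p->bpos += len;
    p->buf[p->bpos] = '\0';
}
```

So a call like sprintbuf(pb, "{ ") becomes printbuf_memappend(pb, "{ ", 2),
bypassing the vprintf() path entirely for constant strings.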
We avoid writing to memory just in order to check the loop
termination condition. While this looks like it should make
no real difference, the profiler (valgrind's callgrind tool) shows
that it actually does. Also, wall-clock time as measured by time(1)
is consistently a bit lower than with the previous method. As it
doesn't hurt anything (in terms of more complex code or the like),
I think the change is worth it.
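The idea can be illustrated like this (the function name and the scanned
character set are made up for this sketch, not taken from the patch):
the loop bound is an end-pointer comparison, so no sentinel byte ever
has to be written into the buffer just to stop the loop.

```c
#include <stddef.h>

/* Return the number of leading bytes that need no special
 * handling. The loop terminates on an end-pointer comparison;
 * no terminator byte is written into the buffer beforehand. */
static size_t scan_plain(const char *s, size_t len)
{
    const char *p = s;
    const char *end = s + len;

    while (p < end && *p != '"' && *p != '\\')
        p++;
    return (size_t)(p - s);
}
```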
The regular printbuf calls always keep the string properly
terminated. However, this is not required while the printable
string is being constructed and is thus simply unnecessary overhead.
I have added a new function that defers writing the terminator
until the result is finally finished. Both the profiler (valgrind's
callgrind) and wall-clock timing show this is a clear improvement.
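A sketch of the idea (struct and function names here are hypothetical;
the patch's actual function name may differ): appends leave the buffer
unterminated, and a single terminating write happens once the result
is complete.

```c
#include <stdlib.h>
#include <string.h>

struct strbuf {
    char *buf;
    int size;
    int bpos;
};

/* Ensure room for `extra` more bytes. */
static void sb_reserve(struct strbuf *p, int extra)
{
    if (p->bpos + extra > p->size) {
        int newsize = p->size > 0 ? p->size : 32;
        while (newsize < p->bpos + extra)
            newsize *= 2;
        p->buf = realloc(p->buf, newsize);
        p->size = newsize;
    }
}

/* Append without NUL-terminating after every call. */
static void sb_append_unterminated(struct strbuf *p, const char *d, int len)
{
    sb_reserve(p, len);
    memcpy(p->buf + p->bpos, d, len);
    p->bpos += len;
}

/* Write the terminator exactly once, when construction is done. */
static void sb_finish(struct strbuf *p)
{
    sb_reserve(p, 1);
    p->buf[p->bpos] = '\0';
}
```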
The previous method relied on return codes to convey failure;
however, most often the return code was not checked. This could
lead to several issues, among them a malformed generated
string. Also, it is very unlikely to really get into an OOM
condition and, if so, it is extremely likely that the rest of
the (calling) app will not behave correctly anyway.
This patch takes a different approach: if we run out of memory,
we discard the data for which no memory could be allocated. In
essence, this can result in the same random malformedness as the
previous approach, but we no longer try to signal failure back to
the original caller. Note that with the previous approach this did
not work reliably either. The benefit of this approach is that we
gain some performance, have simpler code, and have a clearly
defined failure mode.
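The failure policy can be sketched like this (simplified; json-c's real
printbuf code differs): on allocation failure the append is silently
dropped, and the buffer keeps its previous, still well-defined state.

```c
#include <stdlib.h>
#include <string.h>

struct obuf {
    char *buf;
    int size;
    int bpos;
};

/* Append data; on OOM, drop the data instead of returning an
 * error code that callers would not check anyway. The buffer
 * stays in a consistent state either way. */
static void obuf_append(struct obuf *p, const char *d, int len)
{
    if (p->bpos + len + 1 > p->size) {
        int newsize = p->size > 0 ? p->size : 32;
        char *newbuf;

        while (newsize < p->bpos + len + 1)
            newsize *= 2;
        newbuf = realloc(p->buf, newsize);
        if (newbuf == NULL)
            return; /* out of memory: ignore this append */
        p->buf = newbuf;
        p->size = newsize;
    }
    memcpy(p->buf + p->bpos, d, len);
    p->bpos += len;
    p->buf[p->bpos] = '\0';
}
```

Note the use of a temporary for realloc(): assigning its result directly
to p->buf would leak the old buffer on failure.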
String escaping is a surprisingly performance-intensive operation.
I spent many hours in the profiler, and the root problem seems
to be that there is no easy way to detect the character classes
that need to be escaped; the underlying cause is that these
characters are spread all over the ASCII table. I tried
several approaches, including call tables, restructuring
the case condition, different types of if conditions, and
reordering the if conditions. What worked out best is this:
The regular case is that a character must not be escaped, so
we want to process that as fast as possible. In order to
detect this as quickly as possible, we have a lookup table
that tells us whether a char needs escaping ("needsEscape", below).
This table has a slot for each ASCII code. Note that it uses
chars, because anything larger causes worse cache behavior
and anything smaller requires bit-indexing and masking
operations, which are also comparatively costly. So plain
chars work best. What we then do is a single lookup into the
table to detect whether we need to escape a character. If we do,
we go into the depths of actual escape processing. But if we
do NOT need to escape, we just quickly advance the index
and are done with that char. Note that it may look like the
extra table lookup costs performance, but quite the contrary
is the case: we get a more than 30% performance increase from
it (compared to the latest version of the code that did not
do the lookup).
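The lookup scheme described above can be sketched as follows (the set of
characters marked in the table is illustrative, not json-c's exact
escape set):

```c
#include <stddef.h>

/* One char per byte value: nonzero means "take the slow escape
 * path". Chars (not bits) keep the lookup a single cheap,
 * cache-friendly indexed load. */
static char needsEscape[256];

static void init_needs_escape(void)
{
    int i;

    for (i = 0; i < 32; i++)
        needsEscape[i] = 1;   /* control characters */
    needsEscape['"'] = 1;
    needsEscape['\\'] = 1;
}

/* Fast path: count leading characters that need no escaping.
 * The caller can then copy that run in one block and handle
 * the next character via the slow escape path. */
static size_t count_unescaped(const char *s, size_t len)
{
    size_t i = 0;

    while (i < len && !needsEscape[(unsigned char)s[i]])
        i++;
    return i;
}
```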
The default printbuf size of 32 bytes is very conservative and
can cause a lot of malloc() activity for all but trivial JSON objects.
The new json_global_set_printbuf_initial_size() API permits an
application to set a new default size.
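A sketch of what such a setter could look like (the function name comes
from the patch; the signature, the minimum-size check, and the return
convention are assumptions made for this sketch):

```c
/* Default initial printbuf allocation; 32 matches the old
 * hard-coded value. */
static int printbuf_initial_size = 32;

/* Let the application raise the default. Rejecting values below
 * the old default is an assumed policy, not necessarily what the
 * real API does. Static here only to keep the sketch self-contained. */
static int json_global_set_printbuf_initial_size(int size)
{
    if (size < 32)
        return -1;
    printbuf_initial_size = size;
    return 0;
}
```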