Commit

Fix the stringbuilder post
skuro committed Mar 11, 2013
1 parent 412f553 commit 05ff7c4
Showing 8 changed files with 187 additions and 0 deletions.
11 changes: 11 additions & 0 deletions _posts/2013-03-06-java-stringbuilder-myth.md
@@ -0,0 +1,11 @@
---
title: Java StringBuilder myth debunked
layout: post
primary_img: /img/post/jirasvn.png
categories: [java, performance, development]
meta-description: It's a recurring
---

*NOTE: this post was published before it was ready, see the real one [here][goto]*

[goto]:
176 changes: 176 additions & 0 deletions _posts/2013-03-11-java-stringbuilder-myth-now-with-content.md
@@ -0,0 +1,176 @@
---
title: Java StringBuilder myth debunked -- now with content!
layout: post
primary_img: /img/post/joint.png
categories: [java, performance, development]
meta-description: It's common wisdom that String concatenation with '+' is a poor performing bad practice, but is it really the case?
---

The myth
========

> Concatenating two Strings with the plus operator is the source of all evil
>
> -- Anonymous Java dev

***NOTE**: The source code for the tests discussed here can be found on [Github][github]*

Ever since university I have regarded `String` concatenation in Java
using the '+' plus operator as a deadly performance sin. Recently there was
an internal review at [Backbase R&D](http://www.backbase.com) where this recurring
mantra was dismissed as a myth, because `javac` uses `StringBuilder` under the hood
any time you use the plus operator to join Strings. I set out to test
that claim and to verify how it holds up under different environments.

The test
========

Relying on your compiler to optimize your `String` concatenation means that things
might change heavily depending on the JDK vendor you adopt. As far as platform
support goes for my daily job, three main vendors should be considered:

* Oracle JDK
* IBM JDK
* ECJ -- for developers only

Moreover, while we officially support Java 5 through 6, we are also looking into
supporting Java 7 for our products, which adds up to three Java versions to test on top of
the three vendors. For the sake of <del>laziness</del> simplicity, the `ecj`-compiled
bytecode will be run with a single JDK, namely Oracle JDK7.

I prepared a [Virtualbox](https://www.virtualbox.org/) VM with all the above JDKs
installed, then I developed some classes to express three different concatenation
methods, amounting to three to four concatenations per method invocation,
depending on the specific test case.

Each test class is run a thousand times per test round, with a total of
100 rounds for each test case. The same VM is used to run all the rounds of a given
test case, and it is restarted between test cases, so that the Java
runtime can perform all the optimizations it can without affecting the other test
cases in any way. All JVMs were started with the default options.

More details can be found in the benchmark runner [script](https://github.com/skuro/stringbuilder/blob/master/bench.sh).
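
The round structure described above can be sketched in Java roughly as follows. This is only an illustration and not the code used for the measurements, which are driven by the `bench.sh` script linked above. The class name `RoundTimer`, its `run` method and the use of `System.nanoTime()` are assumptions made for this sketch, and it uses Java 8 lambda syntax for brevity, which is newer than the JDKs tested here.

```java
import java.util.function.Supplier;

/** Hypothetical harness: times several rounds of repeated invocations of one test case. */
final class RoundTimer {
    static final int ROUNDS = 100;                // rounds per test case, as in the post
    static final int EXECUTIONS_PER_ROUND = 1000; // executions per round, as in the post

    // Results are accumulated here so the work done by the test case stays observable.
    private static long sink;

    /** Runs the test case and returns the elapsed nanoseconds of each round. */
    static long[] run(Supplier<String> testCase, int rounds, int executionsPerRound) {
        long[] elapsed = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            for (int i = 0; i < executionsPerRound; i++) {
                sink += testCase.get().length();
            }
            elapsed[r] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        final String base = args.length > 0 ? args[0] : "base";
        long[] times = run(() -> "const1" + base + "const2", ROUNDS, EXECUTIONS_PER_ROUND);
        for (int r = 0; r < times.length; r++) {
            System.out.println("round " + r + ": " + times[r] + " ns");
        }
    }
}
```

Running every round inside the same JVM, as this sketch does, is what gives the JIT the chance to optimize the later rounds.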

The code
========

Full code for both test cases and the test suite is available on [Github][github].

The following different test cases were produced to measure performance
differences of the String concatenation with plus against the direct use of a
`StringBuilder`:

    // String concat with plus
    String result = "const1" + base;
    result = result + "const2";

----

    // String concat with a StringBuilder
    new StringBuilder()
        .append("const1")
        .append(base)
        .append("const2")
        .append(append)
        .toString();

----

    // String concat with an initialized StringBuilder
    new StringBuilder("const1")
        .append(base)
        .append("const2")
        .append(append)
        .toString();

The general idea is to concatenate constant `String`s both before and after
a variable. The difference between the last two cases,
both making explicit use of `StringBuilder`, is that the latter uses the 1-arg
constructor, which initializes the builder with the initial part of the result.
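
To make the difference between the two explicit variants concrete, here is a minimal, self-contained sketch. The class and method names (`BuilderConstructors`, `zeroArg`, `oneArg`) are made up for this illustration and are not the names used in the test suite. It also prints the initial capacities, which the `StringBuilder` Javadoc specifies as 16 characters for the 0-arg constructor and 16 plus the length of the argument for the 1-arg constructor.

```java
/** Illustrative only: the two explicit StringBuilder variants side by side. */
final class BuilderConstructors {
    // 0-arg constructor: empty builder, followed by four appends
    static String zeroArg(String base, String append) {
        return new StringBuilder()
                .append("const1")
                .append(base)
                .append("const2")
                .append(append)
                .toString();
    }

    // 1-arg constructor: the builder already holds "const1", followed by three appends
    static String oneArg(String base, String append) {
        return new StringBuilder("const1")
                .append(base)
                .append("const2")
                .append(append)
                .toString();
    }

    public static void main(String[] args) {
        // Both variants build the same String
        System.out.println(zeroArg("X", "Y").equals(oneArg("X", "Y"))); // prints "true"
        // Initial capacities differ by the length of the constructor argument
        System.out.println(new StringBuilder().capacity());         // prints "16"
        System.out.println(new StringBuilder("const1").capacity()); // prints "22"
    }
}
```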

The results
===========

Enough talk: below are the generated graphs, where
each data point corresponds to a single test round (i.e. 1000 executions of the same
test class).

The discussion of the results and some more juicy details will follow.

![Concatenation with plus][catplus]

----

![Concatenation with StringBuilder][catsb]

----

![Concatenation with initialized StringBuilder][catsb2]

The discussion
==============

Oracle JDK5 is the clear loser here, appearing to be in a B league when compared
to the others. But that's not really the scope of this exercise, and thus we'll
gloss over it for the time being.

That said, there are two other interesting bits I observe in the above graphs. The first is that
there is indeed quite a difference in general between the use of the plus operator and an explicit
`StringBuilder`, *especially* if you're using Oracle Java5, which performs three
times worse than the rest of the crew.

The second observation is that while it generally holds for most of the JDKs that
an explicit `StringBuilder` offers up to twice the speed of the regular plus
operator, **IBM JDK6 seems not to suffer** from any performance loss, always averaging
25ms to complete the task in all test cases.

A closer look at the generated bytecode reveals some interesting details.

The bytecode
============

***NOTE:** the decompiled classes are also available on [Github][github]*

Across all the JDKs tested, `StringBuilder`s are **always** used to implement `String`
concatenation, even in the presence of a plus sign.
Moreover, across all vendors and versions, **there is almost no difference at all**
for the same test case. The only one that stands a bit apart is [`ecj`][ecjplus],
which is the only one to cleverly optimize the `CatPlus` test case to invoke
the 1-arg constructor of the `StringBuilder` instead of the 0-arg version.

Comparing the resulting bytecode exposes what could affect performance in the
different scenarios:

* when concatenating with plus, *a new instance of `StringBuilder`* is created
every time a concatenation happens. This can easily degrade performance
through needless constructor invocations, plus extra stress on
the garbage collector from throwaway instances

* compilers will take you literally and initialize `StringBuilder` with its
1-arg constructor if and only if you write it that way in the original code. This
results in four and three invocations of `StringBuilder.append` for
[CatSB][catsbp] and [CatSB2][catsb2p], respectively.
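
The first point can be illustrated by writing out by hand, as plain Java, roughly what the compilers generate for the plus test case. This is an approximation written for this post and not decompiler output, so the decompiled classes on Github remain the authoritative reference. The class and method names are made up for the example.

```java
/** Hand-written approximation of what the compilers generate for the plus test case. */
final class DesugaredCatPlus {
    // The source form, as shown in "The code" section
    static String withPlus(String base) {
        String result = "const1" + base;
        result = result + "const2";
        return result;
    }

    // Roughly equivalent expansion: a fresh StringBuilder for each concatenating statement
    static String expanded(String base) {
        String result = new StringBuilder().append("const1").append(base).toString();
        result = new StringBuilder().append(result).append("const2").toString();
        return result;
    }

    // Explicit alternative: one builder, no intermediate String or throwaway builder
    static String singleBuilder(String base) {
        return new StringBuilder("const1").append(base).append("const2").toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("X"));      // prints "const1Xconst2"
        System.out.println(expanded("X"));      // prints "const1Xconst2"
        System.out.println(singleBuilder("X")); // prints "const1Xconst2"
    }
}
```

All three methods return the same `String`. What changes is the number of `StringBuilder` instances and intermediate `String`s created along the way.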

The conclusion
==============

Bytecode analysis offers the final answer to the original question.

> Do you need to explicitly use a `StringBuilder` to improve performance? **Yes**

The above graphs clearly show that, unless you're using the IBM JDK6 runtime, you will
lose up to 50% of the performance when using the plus operator, although IBM JDK6 is also the one that performs
slightly worse than the other candidates when `StringBuilder` is used explicitly.

Also, it's quite interesting to see how *JIT optimizations* impact the overall
performance: for instance, even though the bytecode differs between the two
explicit `StringBuilder` test cases, the end result is absolutely the same in the
long run.

![Myth confirmed][myth]

[catplus]: img/post/catplus.png "Concatenation with plus"
[catsb]: img/post/catsb.png "Concatenation with StringBuilder"
[catsb2]: img/post/catsb2.png "Concatenation with initialized StringBuilder"
[github]: https://github.com/skuro/stringbuilder
[ecjplus]: https://github.com/skuro/stringbuilder/blob/master/ecj/CatPlus.class.txt
[catsbp]: https://github.com/skuro/stringbuilder/blob/master/ecj/CatSB.class.txt
[catsb2p]: https://github.com/skuro/stringbuilder/blob/master/ecj/CatSB2.class.txt
[myth]: img/post/myth-confirmed.jpg
Binary file added img/post/catmulti.png
Binary file added img/post/catplus.png
Binary file added img/post/catsb.png
Binary file added img/post/catsb2.png
Binary file added img/post/joint.png
Binary file added img/post/myth-confirmed.jpg
