Specialize the TBinaryProtocol read path up to ~2x speedups #221
Conversation
```scala
override def apply(item: T) = thriftStructSerializer.toBytes(item)
override def invert(bytes: Array[Byte]) = attempt(bytes) { bytes =>
```
What about `Macros.fastAttempt` here, to avoid the closure? :) Maybe no win, but it can't hurt.
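For context, a hedged sketch of the closure concern. The names below are illustrative stand-ins, not bijection's real API: the actual `attempt` helper and `Macros.fastAttempt` live in bijection, and the macro's real expansion may differ.

```scala
import scala.util.{Failure, Success, Try}

object AttemptSketch {
  // Illustrative stand-in for a closure-based `attempt`: each call site
  // allocates a function object for `inv` before doing any work.
  def attemptWithClosure[A, B](b: B)(inv: B => A): Try[A] =
    try Success(inv(b))
    catch { case e: Exception => Failure(e) }

  // What a macro like `fastAttempt` could expand to at the call site:
  // the body is inlined into a plain try/catch, so no closure is allocated.
  def invertInlined(bytes: Array[Byte]): Try[String] =
    try Success(new String(bytes, "UTF-8"))
    catch { case e: Exception => Failure(e) }
}
```

On a hot deserialization path the saved allocation is tiny per call, but it is free to take.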
Perf-sensitive code; might as well take every little bit.
This isn't feasible for dependency reasons, but it would be nice if we didn't duplicate work. We have an implementation of TBinaryProtocol in finagle that minimizes allocations; it would be neat to be able to use your optimizations there too, when possible.
Yeah, alas there's no obvious way to do that. A scrooge-thrift type package with slim deps might be a good middle ground to use for everything.
Looks good. Nice work.
```scala
def readString: String =
  try {
    val size = readI32
    val s = new String(transport.buf, transport.bufferPos, size, "UTF-8")
```
You could try the trick used here:
https://github.com/apache/parquet-mr/blob/d6f082b9be5d507ff60c6bc83a179cc44015ab97/parquet-column/src/main/java/org/apache/parquet/io/api/Binary.java#L90

That is, `Charset.decode` instead of `new String`.
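A minimal sketch of what the suggested trick could look like here, assuming it lands in `readString`. `Utf8Decode` is a hypothetical helper, not part of this PR: it reuses a thread-local `CharsetDecoder` and decodes a `ByteBuffer` view of the transport's backing array, instead of `new String(bytes, ..., "UTF-8")`, which resolves the charset by name on every call.

```scala
import java.nio.ByteBuffer
import java.nio.charset.{Charset, CharsetDecoder}

object Utf8Decode {
  // One decoder per thread: CharsetDecoder is stateful and not thread-safe.
  private val decoder = new ThreadLocal[CharsetDecoder] {
    override def initialValue(): CharsetDecoder =
      Charset.forName("UTF-8").newDecoder()
  }

  // Decode `length` bytes starting at `offset` without copying the array.
  def decode(buf: Array[Byte], offset: Int, length: Int): String = {
    val d = decoder.get()
    d.reset()
    d.decode(ByteBuffer.wrap(buf, offset, length)).toString
  }
}
```

Whether this beats `new String` depends heavily on the JDK version, per the discussion below.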
It doesn't mention any relative performance numbers. Is it much faster?
If this is the one that I'm thinking of, up to 40x faster, but it only applies to java6 and was fixed in java7. But I could be wrong.
Well, we are moving to java8, so not sure it matters much if it was fixed in java7.
On Thu, Jun 18, 2015 at 11:32 AM, Alex Levenson wrote, quoting bijection-thrift/src/main/scala/com/twitter/bijection/thrift/TArrayBinaryProtocol.scala in #221:

```scala
    ((transport.buf(off + 1) & 0xffL) << 48) |
    ((transport.buf(off + 2) & 0xffL) << 40) |
    ((transport.buf(off + 3) & 0xffL) << 32) |
    ((transport.buf(off + 4) & 0xffL) << 24) |
    ((transport.buf(off + 5) & 0xffL) << 16) |
    ((transport.buf(off + 6) & 0xffL) << 8) |
    ((transport.buf(off + 7) & 0xffL))
  }
  def readDouble: Double =
    java.lang.Double.longBitsToDouble(readI64)
  def readString: String =
    try {
      val size = readI32
      val s = new String(transport.buf, transport.bufferPos, size, "UTF-8")
```
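For readers skimming the quoted diff: the specialized read path assembles a big-endian `Long` from eight raw buffer bytes and reinterprets its bits as a `Double`, skipping the generic transport machinery. A self-contained sketch of that technique (with plain `buf`/`off` parameters standing in for the protocol's transport fields):

```scala
object TArrayReadsSketch {
  // Assemble a big-endian Long from 8 bytes: mask each byte to 0xffL
  // (promoting to Long and discarding sign extension), then shift it
  // into place, most significant byte first.
  def readI64(buf: Array[Byte], off: Int): Long =
    ((buf(off) & 0xffL) << 56) |
      ((buf(off + 1) & 0xffL) << 48) |
      ((buf(off + 2) & 0xffL) << 40) |
      ((buf(off + 3) & 0xffL) << 32) |
      ((buf(off + 4) & 0xffL) << 24) |
      ((buf(off + 5) & 0xffL) << 16) |
      ((buf(off + 6) & 0xffL) << 8) |
      (buf(off + 7) & 0xffL)

  // Thrift doubles are the IEEE 754 bit pattern of the Long on the wire,
  // so a bitwise reinterpretation is all that's needed.
  def readDouble(buf: Array[Byte], off: Int): Double =
    java.lang.Double.longBitsToDouble(readI64(buf, off))
}
```

This is byte-for-byte equivalent to `ByteBuffer.getLong` in its default big-endian order, but avoids wrapping a buffer per read.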
I can't really tell; I read a blog post just now that said they fixed it for single-byte charsets, but not yet for multi-byte charsets like UTF-8.
You can see the prior delta in the posted image: the macro version stays below, but the others that do full deserialization converge with this change.