Use Fastutil Collections in Performance Critical Areas #4

Merged
merged 28 commits into from

5 participants

@ningliang

Fastutil collections (http://fastutil.dsi.unimi.it/) are type-specific, more efficient versions of the Java/Scala collection classes that store primitives without boxing. See the website for more details.

For this patch, I've replaced the data structures in the code paths relevant to walks and traversals with fastutil data structures.

@stuhood
Collaborator

How much of an improvement do you see?

@ningliang

I'm about to test this one in our internal service. I expect a large improvement in our young gen pause time. Currently, Scala lists keep 2 objects for every primitive element: one "::" object and another to box the element. In our random walk logic, we keep track of all List[Int] paths leading to every traversed node during the walk, which results in a very large number of objects.
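To make the two-objects-per-element point concrete, here is a minimal sketch using only the Scala standard library. It inspects the runtime representation of a `List[Int]` versus an `Array[Int]`; fastutil's `IntArrayList` is backed by a primitive array of the second kind. The object name `BoxingSketch` and the sample values are illustrative assumptions, not code from this patch.

```scala
object BoxingSketch {
  // A List[Int] is a chain of :: cells; each cell references a boxed java.lang.Integer.
  val path: List[Int] = List(3, 1, 4)

  // An Array[Int] is a JVM int[]: a single object with the elements stored inline.
  val packed: Array[Int] = Array(3, 1, 4)

  // Runtime class of the first list element (the box, not a primitive).
  def listElementClass: Class[_] = path.asInstanceOf[List[AnyRef]].head.getClass

  // Component type of the array (the primitive int type).
  def arrayElementClass: Class[_] = packed.getClass.getComponentType

  def main(args: Array[String]): Unit = {
    println(listElementClass)
    println(arrayElementClass)
  }
}
```

The list side reports `java.lang.Integer` while the array side reports the primitive `int` type, which is why a primitive-backed path needs far fewer allocations per element.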

I can say that it lowered the average response latency from 50ms to 1ms for another code path in our service, which was using a mutable.HashMap[Int, Int] instead of fastutil's Int2IntOpenHashMap. This change also increased the throughput (with respect to garbage collection) from 90% to 96%, and reduced young GC latency by 40%.

@caniszczyk
Owner

If we decide to pull this in, please update the NOTICE file with the attribution.

fastutil is under the Apache License 2.0 (APLv2), so we have no issues bringing it in.

@ningliang

Done, thanks for taking a look.

@dongwang218 commented on the diff
.../twitter/cassovary/graph/DirectedPathCollection.scala
((62 lines not shown))
/**
* @return the total number of unique paths in this collection
*/
def totalNumPaths = {
- pathCountsPerId.keysIterator.foldLeft(0) { case (tot, node) => tot + numUniquePathsTill(node) }
+ var sum = 0

Could you explain why you changed this to a non-functional style?

Mainly for performance. Scala's foldLeft is substantially slower and creates more objects than a plain old while loop.
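For readers following the style question, a minimal sketch of the two forms side by side. The `counts` array is a hypothetical stand-in for the per-node path counts and is not the Cassovary data structure; only the shape of the loop mirrors the diff.

```scala
object SumSketch {
  // Hypothetical per-node unique path counts, used only for illustration.
  val counts: Array[Int] = Array(2, 5, 1, 7)

  // Functional style, as in the removed line: concise, but it goes through
  // an iterator and a closure that operate on boxed values.
  def totalFold: Int = counts.iterator.foldLeft(0) { case (tot, c) => tot + c }

  // Imperative style, as in the added line: a local var and an index-based
  // while loop that reads the primitive array directly.
  def totalWhile: Int = {
    var sum = 0
    var i = 0
    while (i < counts.length) {
      sum += counts(i)
      i += 1
    }
    sum
  }
}
```

Both methods return the same total; the while loop simply avoids the intermediate objects, which matters mostly on hot paths such as walks and traversals.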

@dongwang218 commented on the diff
...in/scala/com/twitter/cassovary/graph/GraphUtils.scala
@@ -34,11 +38,10 @@ class GraphUtils(val graph: Graph) {
* @return Seq of tourist-specific-returned information
*/
- def walk(nodes: Iterator[Node], tourists: Seq[NodeTourist[Any]]) = {

These API changes are not backward compatible; please increase the version to 2.0.0.

Added, thanks!

project/build.properties
@@ -3,6 +3,6 @@
project.organization=com.twitter
project.name=cassovary
sbt.version=0.7.4
-project.version=1.0.2-SNAPSHOT
+project.version=2.0.0-SNAPSHOT

Change this to 2.0.1-SNAPSHOT, and change project/release.properties to 2.0.0.

Cool, updated

@dongwang218 merged commit 938f01f into from
Commits on Mar 27, 2012
  1. @ningtwitter
  2. @ningtwitter
  3. @ningtwitter
Commits on Mar 28, 2012
  1. @ningtwitter
  2. @ningtwitter
  3. @ningtwitter
Commits on Mar 30, 2012
  1. @ningtwitter
  2. @ningtwitter

    almost compiling...

    ningtwitter authored
Commits on Apr 1, 2012
  1. @ningtwitter
Commits on Apr 2, 2012
  1. @ningtwitter
  2. @ningtwitter

    almost compiling

    ningtwitter authored
  3. @ningtwitter
  4. @ningtwitter

    tests almost all passing

    ningtwitter authored
  5. @ningtwitter

    more tests passing

    ningtwitter authored
  6. @ningtwitter

    more tests passing

    ningtwitter authored
  7. @ningtwitter

    more tests passing

    ningtwitter authored
  8. @ningtwitter

    all tests passing

    ningtwitter authored
  9. @ningtwitter
Commits on Apr 3, 2012
  1. @ningtwitter
  2. @ningtwitter
  3. @ningtwitter

    tests passing

    ningtwitter authored
  4. @ningtwitter
  5. @ningtwitter
  6. @ningtwitter

    secondary order on node id in counters. change tests to check for order inck for order in visits counter results.

    ningtwitter authored
  7. @ningtwitter

    update notice

    ningtwitter authored
Commits on Apr 11, 2012
  1. @ningtwitter
  2. @ningtwitter
Commits on Apr 14, 2012
  1. @ningtwitter

    update version info

    ningtwitter authored