This repository has been archived by the owner. It is now read-only.

Use same load balancing policy as fortis-services #118

Merged
merged 1 commit into from Aug 28, 2017

Conversation

@c-w
Contributor

@c-w c-w commented Aug 28, 2017

For some reason we're seeing ~80ms write times from fortis-services to Cassandra but multi-second write times from fortis-spark to Cassandra.

One difference between the two setups is the Cassandra driver. fortis-services uses the NodeJS connector, whose default load-balancing policy differs from that of the Spark connector used in fortis-spark. The Spark connector's default assumes that Spark workers and Cassandra nodes are co-located on the same hosts, which is not the case in our deployment. This change makes the setup consistent so that both projects use the same load-balancing policy.

@c-w c-w requested a review from erikschlegel Aug 28, 2017
Contributor

@jcjimenez jcjimenez left a comment

LGTM with question regarding copy-pasted code.


/**
* Copy-pasted version of DefaultConnectionFactory to over-write the load balancing policy
* There is only one change called out by START/END comments below
Contributor

@jcjimenez jcjimenez Aug 28, 2017

What if we do something like this instead of copy-pasting so much code?

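import com.datastax.driver.core.Cluster
import com.datastax.driver.core.policies.{DCAwareRoundRobinPolicy, TokenAwarePolicy}
import com.datastax.spark.connector.cql.{CassandraConnectionFactory, CassandraConnectorConf, DefaultConnectionFactory}

class FortisConnectionFactory extends CassandraConnectionFactory {
  override def createCluster(conf: CassandraConnectorConf): Cluster = {
    // Reuse the default builder and only swap in the load balancing policy
    val lbp = new TokenAwarePolicy(new DCAwareRoundRobinPolicy.Builder().build())
    DefaultConnectionFactory.clusterBuilder(conf).withLoadBalancingPolicy(lbp).build()
  }
}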

Contributor Author

@c-w c-w Aug 28, 2017

Nice one. Fixed in e86952b.

@c-w
Contributor Author

@c-w c-w commented Aug 28, 2017

Our tests pass but running this in Spark actually crashes:

C:\Repos\project-fortis-spark>spark-submit --class CounterDemo --master local[3] target/scala-2.11/project-fortis-spark-assembly-0.0.6.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Exception in thread "main" java.lang.IllegalArgumentException: Singleton object not available: ConnectionFactory$
        at com.datastax.spark.connector.util.ReflectionUtil$.findGlobalObject(ReflectionUtil.scala:55)
        at com.datastax.spark.connector.cql.CassandraConnectionFactory$$anonfun$fromSparkConf$1.apply(CassandraConnectionFactory.scala:155)
        at com.datastax.spark.connector.cql.CassandraConnectionFactory$$anonfun$fromSparkConf$1.apply(CassandraConnectionFactory.scala:155)
        at scala.Option.map(Option.scala:146)
        at com.datastax.spark.connector.cql.CassandraConnectionFactory$.fromSparkConf(CassandraConnectionFactory.scala:155)

This is because for a Scala object (not a class), getClass.getName returns a name with a trailing dollar sign, so the connector's lookup of the singleton object by that name fails.
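
To illustrate the naming issue, here is a minimal sketch. The empty `FortisConnectionFactory` object is only a stand-in for the real factory, and `FactoryNames.configName` is a hypothetical helper made up for this example; it is not necessarily what the fix commit does. It shows one way to derive a name that a by-name singleton lookup could resolve: drop the trailing `$` from the runtime class name.

```scala
// Stand-in for the real connection factory object (hypothetical, empty on purpose).
object FortisConnectionFactory

// Hypothetical helper, not part of the Spark Cassandra connector.
object FactoryNames {
  // For a Scala `object`, getClass.getName ends with '$' because the compiler
  // emits the singleton as a separate module class. Stripping that suffix
  // yields the plain object name.
  def configName(obj: AnyRef): String =
    obj.getClass.getName.stripSuffix("$")
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Runtime class name of the object, ending in '$'
    println(FortisConnectionFactory.getClass.getName)
    // Same name without the trailing '$'
    println(FactoryNames.configName(FortisConnectionFactory))
  }
}
```

Under this assumption, the stripped name is what would be passed as the value of the connector's connection factory setting, rather than the raw `getClass.getName` result.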

Fixed this in 2f382aa.

@c-w c-w force-pushed the cassandra-load-balancing-policy branch from db9c4e7 to e86952b Aug 28, 2017
@c-w c-w force-pushed the cassandra-load-balancing-policy branch from e86952b to 2f382aa Aug 28, 2017
@c-w
Contributor Author

@c-w c-w commented Aug 28, 2017

This is a ~50% performance improvement over the previous implementation but still orders of magnitude slower than it should be.

@c-w c-w merged commit 2e05e3a into master Aug 28, 2017
2 checks passed
@c-w c-w deleted the cassandra-load-balancing-policy branch Aug 28, 2017
@c-w c-w removed the in progress label Aug 28, 2017

2 participants