New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#156] Share ActorSystem between Scassandra instances. #157

Merged
merged 6 commits into from Jun 2, 2016

Conversation

Projects
None yet
2 participants
@tolbertam
Contributor

tolbertam commented May 6, 2016

PoC for sharing ActorSystem between Scassandra instances. Dramatically reduces the number of threads/work done when running with multiple nodes.

The only downside that i can think of for this is that the ActorSystem is no longer torn down and instead are marked as daemon. We could add a shutdown method that tears it down, but then it can't be reused. Alternatively could make it mutable and allow it to be recreated after shut down.

Before:
screen shot 2016-05-06 at 4 42 03 pm

After:
screen shot 2016-05-06 at 4 41 19 pm

@tolbertam tolbertam changed the title from [156] Share ActorSystem between Scassandra instances. to [#156] Share ActorSystem between Scassandra instances. May 6, 2016

@tolbertam

This comment has been minimized.

Show comment
Hide comment
@tolbertam

tolbertam May 6, 2016

Contributor

Example showing benefit on a windows system. This might not be platform specific, but rather something wrong with virtualization, as I notice little difference in my native OS X environment.

Running RoundRobinPolicyTest#should_round_robin_irrespective_of_topology. This test creates 10 scassandra nodes.

Before (1.0.7)
SUCCESS: should_round_robin_irrespective_of_topology
Test   : 00:00:28
This PR
SUCCESS: should_round_robin_irrespective_of_topology
Test   : 00:00:07
Contributor

tolbertam commented May 6, 2016

Example showing benefit on a windows system. This might not be platform specific, but rather something wrong with virtualization, as I notice little difference in my native OS X environment.

Running RoundRobinPolicyTest#should_round_robin_irrespective_of_topology. This test creates 10 scassandra nodes.

Before (1.0.7)
SUCCESS: should_round_robin_irrespective_of_topology
Test   : 00:00:28
This PR
SUCCESS: should_round_robin_irrespective_of_topology
Test   : 00:00:07
@tolbertam

This comment has been minimized.

Show comment
Hide comment
@tolbertam

tolbertam May 9, 2016

Contributor

This is working pretty well but needs more refinement / testing.

Contributor

tolbertam commented May 9, 2016

This is working pretty well but needs more refinement / testing.

tolbertam added some commits May 6, 2016

[#158] Use akka-slf4j to allow logging configuration via slf4j.
Also reduce logging on server tests, remove application.conf
Speed up tests by closing Cluster instances quickly.
Each test method was delayed in completion for 2 seconds due to the
default graceful shutdown wait nature of netty.  Since there should be
no ongoing activity on the event loop when our tests finish, it is safe
to close.
@@ -51,7 +51,7 @@
private String host = "localhost";
private int port = 8043;
private int socketTimeout = 1000;
private int socketTimeout = 5000;

This comment has been minimized.

@tolbertam

tolbertam May 12, 2016

Contributor

Bumping of timeouts seems a bit ominous, but the main reason I did this was the window was pretty tight previously and tests would occasionally fail because of socket read timeouts. Could never reproduce the timeout issue locally, but happened from time to time in travis.

@tolbertam

tolbertam May 12, 2016

Contributor

Bumping of timeouts seems a bit ominous, but the main reason I did this was the window was pretty tight previously and tests would occasionally fail because of socket read timeouts. Could never reproduce the timeout issue locally, but happened from time to time in travis.

@@ -27,7 +29,15 @@
private Session session;
public CassandraExecutor21() {
cluster = Cluster.builder().addContactPoint(Config.NATIVE_HOST).withPort(Config.NATIVE_PORT)
NettyOptions closeQuickly = new NettyOptions() {

This comment has been minimized.

@tolbertam

tolbertam May 12, 2016

Contributor

The java-driver recently changed it's event loop shutdown to allow a (netty default) grace period of 2 seconds before shutting down the event loop to prevent rejected execution exceptions and such. Adding this cuts ~2 seconds off every test that shuts down a cluster at the end, which was quite a bit.

@tolbertam

tolbertam May 12, 2016

Contributor

The java-driver recently changed it's event loop shutdown to allow a (netty default) grace period of 2 seconds before shutting down the event loop to prevent rejected execution exceptions and such. Adding this cuts ~2 seconds off every test that shuts down a cluster at the end, which was quite a bit.

@@ -62,7 +62,7 @@ akka {
extensions = []
# Toggles whether threads created by this ActorSystem should be daemons or not
daemonic = off
daemonic = on

This comment has been minimized.

@tolbertam

tolbertam May 12, 2016

Contributor

This is done as the ActorSystem is now shared between all Scassandra instances, and thus does not shutdown. Making daemonic true allows the JVM to exit.

@tolbertam

tolbertam May 12, 2016

Contributor

This is done as the ActorSystem is now shared between all Scassandra instances, and thus does not shutdown. Making daemonic true allows the JVM to exit.

system.awaitTermination()
logger.info("Server is shut down")
def shutdown() = this.synchronized {
val stopped = gracefulStop(scassandra, startupTimeoutSeconds + 1 seconds, ShutdownServer(startupTimeoutSeconds seconds))

This comment has been minimized.

@tolbertam

tolbertam May 12, 2016

Contributor

Unfortunately using a PoisonPill here was not enough, it seemed that the actors created by the IO managers would not be guaranteed to be killed and unbound on their listening interfaces when the corresponding actors stopped (PrimingServer/TcpServer), so instead I created a ShutdownServer message that causes the PrimingServer and TcpServer actors to first send an Unbind to their IO listeners, wait for Unbound and then send a PoisonPill to themselves as part of a shutdown process. This ensures any listening interfaces die before the corresponding actors die. This wasn't needed previously as the entire ActorSystem was shut down as part of the process.

@tolbertam

tolbertam May 12, 2016

Contributor

Unfortunately using a PoisonPill here was not enough, it seemed that the actors created by the IO managers would not be guaranteed to be killed and unbound on their listening interfaces when the corresponding actors stopped (PrimingServer/TcpServer), so instead I created a ShutdownServer message that causes the PrimingServer and TcpServer actors to first send an Unbind to their IO listeners, wait for Unbound and then send a PoisonPill to themselves as part of a shutdown process. This ensures any listening interfaces die before the corresponding actors die. This wasn't needed previously as the entire ActorSystem was shut down as part of the process.

activityLog: ActivityLog,
manager : Option[ActorRef]) extends Actor with ActorLogging {
def this(listenAddress: String, port: Int,

This comment has been minimized.

@tolbertam

tolbertam May 12, 2016

Contributor

Seems weird to have 2 constructors here when i could have given a default value to manager, but unfortunately Akka does not seem to handle when Properties are provided without a value for a argument that has a default value.

@tolbertam

tolbertam May 12, 2016

Contributor

Seems weird to have 2 constructors here when i could have given a default value to manager, but unfortunately Akka does not seem to handle when Properties are provided without a value for a argument that has a default value.

@@ -1,1183 +0,0 @@
scassandra.binary.port=8042

This comment has been minimized.

@tolbertam

tolbertam May 12, 2016

Contributor

Not needed as application.conf pulled from src/main instead apparently.

@tolbertam

tolbertam May 12, 2016

Contributor

Not needed as application.conf pulled from src/main instead apparently.

@tolbertam

This comment has been minimized.

Show comment
Hide comment
@tolbertam

tolbertam May 12, 2016

Contributor

This should be ready for review now that I've worked out all the kinks

Contributor

tolbertam commented May 12, 2016

This should be ready for review now that I've worked out all the kinks

cache:
directories:
- "$HOME/.gradle/caches"

This comment has been minimized.

@chbatey

chbatey May 24, 2016

Member

Nice. I hadn't seen this feature of travis

@chbatey

chbatey May 24, 2016

Member

Nice. I hadn't seen this feature of travis

This comment has been minimized.

@tolbertam

tolbertam May 24, 2016

Contributor

It seems to shave some time off the build which is nice. It is probably is a net positive for the travis infra as well as they don't have to go out and fetch the deps every time too.

@tolbertam

tolbertam May 24, 2016

Contributor

It seems to shave some time off the build which is nice. It is probably is a net positive for the travis infra as well as they don't have to go out and fetch the deps every time too.

Show outdated Hide outdated ...-client/src/test/java/org/scassandra/http/client/ActivityClientTest.java
@@ -112,11 +112,11 @@ public void testErrorDuringConnectionRetrieval() {
//then
}
@Test(expected = ActivityRequestFailed.class, timeout = 2500)
@Test(expected = ActivityRequestFailed.class, timeout = 10000)

This comment has been minimized.

@chbatey

chbatey May 24, 2016

Member

GIven this is just a http client test why have you increased the fixedDelay?

@chbatey

chbatey May 24, 2016

Member

GIven this is just a http client test why have you increased the fixedDelay?

This comment has been minimized.

@tolbertam

tolbertam May 24, 2016

Contributor

I increased the base timeouts in the clients from 1 to 5 seconds as I was seeing that travis was a bit finicky from time to time with actual timeouts occurring when we didn't want them to. In the interest of expedience, maybe I can tune the timeout lower just for those tests so they don't wait 5 secs a piece.

@tolbertam

tolbertam May 24, 2016

Contributor

I increased the base timeouts in the clients from 1 to 5 seconds as I was seeing that travis was a bit finicky from time to time with actual timeouts occurring when we didn't want them to. In the interest of expedience, maybe I can tune the timeout lower just for those tests so they don't wait 5 secs a piece.

Show outdated Hide outdated java-client/src/test/java/org/scassandra/http/client/CurrentClientTest.java
@@ -195,10 +195,10 @@ public void testErrorDuringConnectionRetrieval() {
underTest.getConnections();
}
@Test(expected = ConnectionsRequestFailed.class, timeout = 2500)
@Test(expected = ConnectionsRequestFailed.class, timeout = 10000)

This comment has been minimized.

@chbatey

chbatey May 24, 2016

Member

Same question.

@chbatey

chbatey May 24, 2016

Member

Same question.

@@ -26,8 +28,16 @@
private Session session;
public CassandraExecutor20() {
NettyOptions closeQuickly = new NettyOptions() {

This comment has been minimized.

@chbatey

chbatey May 24, 2016

Member

Nice

@chbatey
Show outdated Hide outdated server/src/main/scala/org/scassandra/server/ScassandraServer.scala
f.map { _ => self ? PoisonPill } pipeTo sender
}
}
}

This comment has been minimized.

@chbatey

chbatey May 24, 2016

Member

New line

@chbatey

chbatey May 24, 2016

Member

New line

This comment has been minimized.

@tolbertam

tolbertam May 24, 2016

Contributor

D'oh, I can never seem to get that right in my git config, will fix.

@tolbertam

tolbertam May 24, 2016

Contributor

D'oh, I can never seem to get that right in my git config, will fix.

// wait indefinitely or until interrupted.
val obj = new Object()
obj.synchronized { obj.wait(); }

This comment has been minimized.

@chbatey

chbatey May 24, 2016

Member

Does a signal to the process cause this thread to be interrupted?

@chbatey

chbatey May 24, 2016

Member

Does a signal to the process cause this thread to be interrupted?

This comment has been minimized.

@tolbertam

tolbertam May 24, 2016

Contributor

It should since wait throws InterruptedException, however I should definitely validate this.

@tolbertam

tolbertam May 24, 2016

Contributor

It should since wait throws InterruptedException, however I should definitely validate this.

This comment has been minimized.

@tolbertam

tolbertam May 24, 2016

Contributor

It does cause the thread to be interrupted and for the process to ultimately exit (which is desired I think).

@tolbertam

tolbertam May 24, 2016

Contributor

It does cause the thread to be interrupted and for the process to ultimately exit (which is desired I think).

system.awaitTermination()
var scassandra: ActorRef = _
def start() = this.synchronized {

This comment has been minimized.

@chbatey

chbatey May 24, 2016

Member

Do we have multiple threads calling this?

@chbatey

chbatey May 24, 2016

Member

Do we have multiple threads calling this?

This comment has been minimized.

@tolbertam

tolbertam May 24, 2016

Contributor

Highly unlikely, it's up to the caller for the most part. We could do something clever like make this a typed actor (which will provide a similar API, but manage all the synchronization & lifecycle for us), although not sure whats available with the version of akka we are using.

@tolbertam

tolbertam May 24, 2016

Contributor

Highly unlikely, it's up to the caller for the most part. We could do something clever like make this a typed actor (which will provide a similar API, but manage all the synchronization & lifecycle for us), although not sure whats available with the version of akka we are using.

@chbatey

This comment has been minimized.

Show comment
Hide comment
@chbatey

chbatey May 24, 2016

Member

LGTM apart from a couple of nits/questions

Member

chbatey commented May 24, 2016

LGTM apart from a couple of nits/questions

@tolbertam

This comment has been minimized.

Show comment
Hide comment
@tolbertam

tolbertam May 24, 2016

Contributor

Thank you for the feedback, I'll address all of these today :)

Contributor

tolbertam commented May 24, 2016

Thank you for the feedback, I'll address all of these today :)

Show outdated Hide outdated server/src/main/scala/org/scassandra/server/ScassandraServer.scala
implicit val t: Timeout = timeout
import context.dispatcher
val f = for {

This comment has been minimized.

@tolbertam

tolbertam May 24, 2016

Contributor

I should probably change this to use Future.sequence instead, a for comprehension seems like overkill here.

@tolbertam

tolbertam May 24, 2016

Contributor

I should probably change this to use Future.sequence instead, a for comprehension seems like overkill here.

Process review feedback.
- Add endline to ScassandraServer.scala.
- Use Future.sequence instead of for comprehension.
- Lower timeouts in tests.
- Rename ConnectionsClientBuilder to CurrentClientBuilder to match
  CurrentClient.
@tolbertam

This comment has been minimized.

Show comment
Hide comment
@tolbertam

tolbertam May 24, 2016

Contributor

Added 5d36a8d to address feedback. I also noticed ConnectionsClientBuilder was improperly named so I renamed it CurrentClientBuilder. Will squash and merge if you think it looks good :)

Contributor

tolbertam commented May 24, 2016

Added 5d36a8d to address feedback. I also noticed ConnectionsClientBuilder was improperly named so I renamed it CurrentClientBuilder. Will squash and merge if you think it looks good :)

@chbatey

This comment has been minimized.

Show comment
Hide comment
@chbatey

chbatey Jun 2, 2016

Member

LGTM

Member

chbatey commented Jun 2, 2016

LGTM

@chbatey chbatey merged commit 077dda8 into scassandra:master Jun 2, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

tolbertam added a commit to datastax/java-driver that referenced this pull request Jun 30, 2016

Enhance CCMBridge to better support Windows
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.

tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 6, 2016

Enhance CCMBridge to better support Windows
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.

tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 8, 2016

Enhance CCMBridge to better support Windows
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.

tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 8, 2016

Enhance CCMBridge to better support Windows
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.

tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 11, 2016

Enhance CCMBridge to better support Windows
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.

tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 11, 2016

Enhance CCMBridge to better support Windows
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.

tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 12, 2016

Enhance CCMBridge to better support Windows
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.

olim7t added a commit to datastax/java-driver that referenced this pull request Jul 13, 2016

Enhance CCMBridge to better support Windows
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.

normanmaurer added a commit to normanmaurer/java-driver that referenced this pull request Aug 15, 2016

Enhance CCMBridge to better support Windows
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment