Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#156] Share ActorSystem between Scassandra instances. #157

Merged
merged 6 commits into from Jun 2, 2016

Conversation

@tolbertam
Copy link
Contributor

tolbertam commented May 6, 2016

PoC for sharing ActorSystem between Scassandra instances. Dramatically reduces the number of threads/work done when running with multiple nodes.

The only downside that i can think of for this is that the ActorSystem is no longer torn down and instead are marked as daemon. We could add a shutdown method that tears it down, but then it can't be reused. Alternatively could make it mutable and allow it to be recreated after shut down.

Before:
screen shot 2016-05-06 at 4 42 03 pm

After:
screen shot 2016-05-06 at 4 41 19 pm

@tolbertam tolbertam changed the title [156] Share ActorSystem between Scassandra instances. [#156] Share ActorSystem between Scassandra instances. May 6, 2016
@tolbertam
Copy link
Contributor Author

tolbertam commented May 6, 2016

Example showing benefit on a windows system. This might not be platform specific, but rather something wrong with virtualization, as I notice little difference in my native OS X environment.

Running RoundRobinPolicyTest#should_round_robin_irrespective_of_topology. This test creates 10 scassandra nodes.

Before (1.0.7)
SUCCESS: should_round_robin_irrespective_of_topology
Test   : 00:00:28
This PR
SUCCESS: should_round_robin_irrespective_of_topology
Test   : 00:00:07
@tolbertam
Copy link
Contributor Author

tolbertam commented May 9, 2016

This is working pretty well but needs more refinement / testing.

tolbertam added 5 commits May 6, 2016
Also reduce logging on server tests, remove application.conf
Each test method was delayed in completion for 2 seconds due to the
default graceful shutdown wait nature of netty.  Since there should be
no ongoing activity on the event loop when our tests finish, it is safe
to close.
@@ -51,7 +51,7 @@
private String host = "localhost";

private int port = 8043;
private int socketTimeout = 1000;
private int socketTimeout = 5000;

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 12, 2016

Author Contributor

Bumping of timeouts seems a bit ominous, but the main reason I did this was the window was pretty tight previously and tests would occasionally fail because of socket read timeouts. Could never reproduce the timeout issue locally, but happened from time to time in travis.

@@ -27,7 +29,15 @@
private Session session;

public CassandraExecutor21() {
cluster = Cluster.builder().addContactPoint(Config.NATIVE_HOST).withPort(Config.NATIVE_PORT)
NettyOptions closeQuickly = new NettyOptions() {

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 12, 2016

Author Contributor

The java-driver recently changed it's event loop shutdown to allow a (netty default) grace period of 2 seconds before shutting down the event loop to prevent rejected execution exceptions and such. Adding this cuts ~2 seconds off every test that shuts down a cluster at the end, which was quite a bit.

@@ -62,7 +62,7 @@ akka {
extensions = []

# Toggles whether threads created by this ActorSystem should be daemons or not
daemonic = off
daemonic = on

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 12, 2016

Author Contributor

This is done as the ActorSystem is now shared between all Scassandra instances, and thus does not shutdown. Making daemonic true allows the JVM to exit.

system.awaitTermination()
logger.info("Server is shut down")
def shutdown() = this.synchronized {
val stopped = gracefulStop(scassandra, startupTimeoutSeconds + 1 seconds, ShutdownServer(startupTimeoutSeconds seconds))

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 12, 2016

Author Contributor

Unfortunately using a PoisonPill here was not enough, it seemed that the actors created by the IO managers would not be guaranteed to be killed and unbound on their listening interfaces when the corresponding actors stopped (PrimingServer/TcpServer), so instead I created a ShutdownServer message that causes the PrimingServer and TcpServer actors to first send an Unbind to their IO listeners, wait for Unbound and then send a PoisonPill to themselves as part of a shutdown process. This ensures any listening interfaces die before the corresponding actors die. This wasn't needed previously as the entire ActorSystem was shut down as part of the process.

activityLog: ActivityLog,
manager : Option[ActorRef]) extends Actor with ActorLogging {

def this(listenAddress: String, port: Int,

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 12, 2016

Author Contributor

Seems weird to have 2 constructors here when i could have given a default value to manager, but unfortunately Akka does not seem to handle when Properties are provided without a value for a argument that has a default value.

@@ -1,1183 +0,0 @@
scassandra.binary.port=8042

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 12, 2016

Author Contributor

Not needed as application.conf pulled from src/main instead apparently.

@tolbertam
Copy link
Contributor Author

tolbertam commented May 12, 2016

This should be ready for review now that I've worked out all the kinks


cache:
directories:
- "$HOME/.gradle/caches"

This comment has been minimized.

Copy link
@chbatey

chbatey May 24, 2016

Member

Nice. I hadn't seen this feature of travis

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 24, 2016

Author Contributor

It seems to shave some time off the build which is nice. It is probably is a net positive for the travis infra as well as they don't have to go out and fetch the deps every time too.

@@ -112,11 +112,11 @@ public void testErrorDuringConnectionRetrieval() {
//then
}

@Test(expected = ActivityRequestFailed.class, timeout = 2500)
@Test(expected = ActivityRequestFailed.class, timeout = 10000)

This comment has been minimized.

Copy link
@chbatey

chbatey May 24, 2016

Member

GIven this is just a http client test why have you increased the fixedDelay?

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 24, 2016

Author Contributor

I increased the base timeouts in the clients from 1 to 5 seconds as I was seeing that travis was a bit finicky from time to time with actual timeouts occurring when we didn't want them to. In the interest of expedience, maybe I can tune the timeout lower just for those tests so they don't wait 5 secs a piece.

@@ -195,10 +195,10 @@ public void testErrorDuringConnectionRetrieval() {
underTest.getConnections();
}

@Test(expected = ConnectionsRequestFailed.class, timeout = 2500)
@Test(expected = ConnectionsRequestFailed.class, timeout = 10000)

This comment has been minimized.

Copy link
@chbatey

chbatey May 24, 2016

Member

Same question.

@@ -26,8 +28,16 @@
private Session session;

public CassandraExecutor20() {
NettyOptions closeQuickly = new NettyOptions() {

This comment has been minimized.

Copy link
@chbatey
f.map { _ => self ? PoisonPill } pipeTo sender
}
}
}

This comment has been minimized.

Copy link
@chbatey

chbatey May 24, 2016

Member

New line

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 24, 2016

Author Contributor

D'oh, I can never seem to get that right in my git config, will fix.


// wait indefinitely or until interrupted.
val obj = new Object()
obj.synchronized { obj.wait(); }

This comment has been minimized.

Copy link
@chbatey

chbatey May 24, 2016

Member

Does a signal to the process cause this thread to be interrupted?

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 24, 2016

Author Contributor

It should since wait throws InterruptedException, however I should definitely validate this.

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 24, 2016

Author Contributor

It does cause the thread to be interrupted and for the process to ultimately exit (which is desired I think).

system.awaitTermination()
var scassandra: ActorRef = _

def start() = this.synchronized {

This comment has been minimized.

Copy link
@chbatey

chbatey May 24, 2016

Member

Do we have multiple threads calling this?

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 24, 2016

Author Contributor

Highly unlikely, it's up to the caller for the most part. We could do something clever like make this a typed actor (which will provide a similar API, but manage all the synchronization & lifecycle for us), although not sure whats available with the version of akka we are using.

@chbatey
Copy link
Member

chbatey commented May 24, 2016

LGTM apart from a couple of nits/questions

@tolbertam
Copy link
Contributor Author

tolbertam commented May 24, 2016

Thank you for the feedback, I'll address all of these today :)

implicit val t: Timeout = timeout
import context.dispatcher

val f = for {

This comment has been minimized.

Copy link
@tolbertam

tolbertam May 24, 2016

Author Contributor

I should probably change this to use Future.sequence instead, a for comprehension seems like overkill here.

- Add endline to ScassandraServer.scala.
- Use Future.sequence instead of for comprehension.
- Lower timeouts in tests.
- Rename ConnectionsClientBuilder to CurrentClientBuilder to match
  CurrentClient.
@tolbertam
Copy link
Contributor Author

tolbertam commented May 24, 2016

Added 5d36a8d to address feedback. I also noticed ConnectionsClientBuilder was improperly named so I renamed it CurrentClientBuilder. Will squash and merge if you think it looks good :)

@chbatey
Copy link
Member

chbatey commented Jun 2, 2016

LGTM

@chbatey chbatey merged commit 077dda8 into scassandra:master Jun 2, 2016
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
tolbertam added a commit to datastax/java-driver that referenced this pull request Jun 30, 2016
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 6, 2016
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 8, 2016
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 8, 2016
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 11, 2016
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 11, 2016
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
tolbertam added a commit to datastax/java-driver that referenced this pull request Jul 12, 2016
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
olim7t added a commit to datastax/java-driver that referenced this pull request Jul 13, 2016
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
normanmaurer added a commit to normanmaurer/java-driver that referenced this pull request Aug 15, 2016
Use powershell instead of cmd.

Quotes jvmArgs as windows is a bit more picky about quotes.

Use --quiet-windows on start, this is required for C* versions post
2.2.4* to limit stderr/stdout output from C* on windows.

Also upgrades scassandra to 1.0.9 for better windows support.  See:
scassandra/scassandra-server#157 for more
details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.