-
Notifications
You must be signed in to change notification settings - Fork 981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRILL-7982: Timed out waiting for container port to open #2287
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @luocooong I have already added fix for that n scope of DRILL-7973.
in the following commit.
But since it is not only the one issue related to CI I didn't publish PR yet. I am going to fix CI failures in scope of that ticket.
.withEnv("CASSANDRA_SNITCH", "GossipingPropertyFileSnitch") // Tune Cassandra options for faster startup | ||
.withEnv("JVM_OPTS", "-Dcassandra.skip_wait_for_gossip_to_settle=0 -Dcassandra.initial_token=0") | ||
.withEnv("HEAP_NEWSIZE", "128M") | ||
.withEnv("MAX_HEAP_SIZE", "1024M"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have not much memory on Github CI. Do we need this additional params for our test container? What about Splunk test container then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vdiravka
The Cassandra tests seem to keep timing out and breaking the CI. My recollection of the Cassandra plugin was that Cassandra really does take a long time to start up, possibly more than 1 minute. This was why we implemented it to preserve the connection as long as possible rather than creating tons of new connections.
To my knowledge, the Splunk ones are working fine in the CI.
@luocooong Thanks for this. It makes me wonder, should we add the GossipingPropertyFileSnitch
parameter to the Cassandra plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Cassandra tests seem to keep timing out and breaking the CI. My recollection of the Cassandra plugin was that Cassandra really does take a long time to start up, possibly more than 1 minute. This was why we implemented it to preserve the connection as long as possible rather than creating tons of new connections.
I am aware about timing out test of container for Cassandra plugin. I also fixed that several days ago in my branch. But I didn't merge it, because this is not only the one issue on CI and don't fix CI build. For instance the CI build is restarted several times for this PR and still not green.
To my knowledge, the Splunk ones are working fine in the CI.
My question was about memory options only in the above comment and luocooong
answered that he is going to eliminate them.
It makes me wonder, should we add the GossipingPropertyFileSnitch parameter to the Cassandra plugin?
GossipingPropertyFileSnitch looks Cassandra specific code only.
@vdiravka Thanks for the comments. I saw that the goal of DRILL-7973 is avoid the OOM (use the direct mem). This pull request is ready to speed up the C* startup. no conflict. I am going to remove the heap options if you already have new ways. (Actually, I think the new version of testcontainers will included these setting for desfult). |
@@ -65,7 +66,12 @@ public static void tearDownCluster() { | |||
|
|||
private static void startCassandra() { | |||
cassandra = new CassandraContainer<>("cassandra") | |||
.withInitScript("queries.cql"); | |||
.withInitScript("queries.cql") | |||
.withStartupTimeout(Duration.ofMinutes(3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It look like 2m is enough. There is no one failure for Cassandra Test Container with this value
vdiravka@b5b3a17
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks fine. Please consider possible additional improvement in my comments
@@ -65,7 +66,10 @@ public static void tearDownCluster() { | |||
|
|||
private static void startCassandra() { | |||
cassandra = new CassandraContainer<>("cassandra") | |||
.withInitScript("queries.cql"); | |||
.withInitScript("queries.cql") | |||
.withStartupTimeout(Duration.ofMinutes(2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep same indent for the methods call chain. For example:
HiveParquetTableMetadataProvider metadataProvider = builder
.withEntries(entries)
.withHivePartitionHolder(hivePartitionHolder)
.withHiveStoragePlugin(hiveStoragePlugin)
.withReaderConfig(readerConfig)
.withSchema(schema)
.build();
Line 78 in 4aefcef
.withEntries(entries) |
.withInitScript("queries.cql") | ||
.withStartupTimeout(Duration.ofMinutes(2)) | ||
.withEnv("CASSANDRA_SNITCH", "GossipingPropertyFileSnitch") // Tune Cassandra options for faster startup | ||
.withEnv("JVM_OPTS", "-Dcassandra.skip_wait_for_gossip_to_settle=0 -Dcassandra.initial_token=0"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly we can do event better here:
-Dcassandra.skip_wait_for_gossip_to_settle=0 -Dcassandra.load_ring_state=false -Dcassandra.initial_token=1 -Dcassandra.num_tokens=nil -Dcassandra.allocate_tokens_for_local_replication_factor=nil
https://github.com/sky-uk/cassandra-operator/issues/103#issuecomment-499040927
@luocooong @vdiravka |
@cgivre Can be merged. Let me try more options (by Vitalii) first. I will use the fastest parameter. |
Done. the current options is faster (included the C* startup and tests). Saved 10 seconds. current options (JDK 8), spend 48 seconds
current options (JDK 11), spend 40 seconds
more options (JDK 8), spend 58 seconds
more options (JDK 11), spend 38 seconds
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is too fast :) Even .withStartupTimeout(Duration.ofMinutes(2))
is not needed. But let's keep it to avoid CI fails in future.
+1 LGTM
DRILL-7982: Timed out waiting for container port to open
Description
If the default 60s timeout is not sufficient, it can be altered with the
withStartupTimeout()
method.See also Waiting for containers to start or be ready
Then, the
testcontainers
has merged new feature about theFor faster startup
.See also Tune Cassandra options for faster startup
Documentation
N/A
Testing
Need the unit tests can run 3 times without the
Timed out waiting for XXX
.