running migration fails if query timeouts #126

Closed
ostefano opened this Issue Jun 27, 2017 · 4 comments

ostefano commented Jun 27, 2017

DDL statements (when using Cassandra as storage) can be quite expensive, so they might take longer than usual. Current 'master' fails the migration if any of the migration scripts times out.

Exception in thread "main" org.cognitor.cassandra.migration.MigrationException: Error during migration of script 003_switch_to_uuids.cql while executing 'DROP TABLE IF EXISTS repair_run_by_cluster'
	at org.cognitor.cassandra.migration.Database.execute(Database.java:159)
	at java.util.ArrayList.forEach(ArrayList.java:1249)
	at org.cognitor.cassandra.migration.MigrationTask.migrate(MigrationTask.java:52)
	at com.spotify.reaper.storage.CassandraStorage.<init>(CassandraStorage.java:96)
	at com.spotify.reaper.ReaperApplication.initializeStorage(ReaperApplication.java:202)
	at com.spotify.reaper.ReaperApplication.run(ReaperApplication.java:129)
	at com.spotify.reaper.ReaperApplication.run(ReaperApplication.java:64)
	at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:43)
	at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:85)
	at io.dropwizard.cli.Cli.run(Cli.java:75)
	at io.dropwizard.Application.run(Application.java:79)
	at com.spotify.reaper.ReaperApplication.main(ReaperApplication.java:84)
Caused by: com.datastax.driver.core.exceptions.OperationTimedOutException: [/10.12.81.4:9042] Timed out waiting for server response
	at com.datastax.driver.core.exceptions.OperationTimedOutException.copy(OperationTimedOutException.java:44)
	at com.datastax.driver.core.exceptions.OperationTimedOutException.copy(OperationTimedOutException.java:26)
	at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
	at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
	at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:64)
	at org.cognitor.cassandra.migration.Database.executeStatement(Database.java:167)
	at org.cognitor.cassandra.migration.Database.execute(Database.java:151)
	... 11 more
Caused by: com.datastax.driver.core.exceptions.OperationTimedOutException: [/10.12.81.4:9042] Timed out waiting for server response
	at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onTimeout(RequestHandler.java:766)
	at com.datastax.driver.core.Connection$ResponseHandler$1.run(Connection.java:1267)
	at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
	at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:655)
	at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
	at java.lang.Thread.run(Thread.java:748)

Contributor

adejanovski commented Jun 28, 2017

Hi @ostefano,

All migrations must be applied before Reaper runs; otherwise it cannot work properly.
What you can do is perform the migrations yourself and update the schema_migration table by hand: https://github.com/thelastpickle/cassandra-reaper/tree/master/src/main/resources/db/cassandra

You need to match the version numbers and set applied_successful to True.

ostefano commented Jun 28, 2017

Hi @adejanovski ,

Yes, that makes sense. I was wondering whether we could increase the timeout so that the query succeeds on busy clusters.

Contributor

adejanovski commented Jun 30, 2017

You can raise the client-side timeout by changing the read timeout in the SocketOptions:

cassandra:
  clusterName: "test"
  contactPoints: ["127.0.0.1"]
  keyspace: reaper_db
  queryOptions:
    consistencyLevel: LOCAL_QUORUM
    serialConsistencyLevel: SERIAL
  socketOptions:
    readTimeoutMillis: 20000

You'll then be limited by the server-side timeouts, which are governed by the write_request_timeout_in_ms setting in the cassandra.yaml file. A schema migration is a set of mutations, so it times out if any of those mutations times out.

The stack trace you pasted shows a client-side timeout, though, so you should not need to change the nodes' configuration.
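
To make the client-side versus server-side distinction concrete, here is a minimal Java sketch that uses only the standard library. It is an illustration, not Reaper's or the driver's code: the class `ClientTimeoutSketch`, the methods `executeWithReadTimeout` and `slowDdl`, and the durations are all hypothetical. It simulates a slow DDL statement and a client that waits for at most a given read timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration: none of these names exist in Reaper or the DataStax driver.
class ClientTimeoutSketch {

    enum Outcome { COMPLETED, CLIENT_TIMED_OUT }

    // Waits at most readTimeoutMillis for the simulated server work to finish,
    // in the same spirit as a client-side read timeout bounding the wait for a response.
    static Outcome executeWithReadTimeout(Callable<Void> serverWork, long readTimeoutMillis)
            throws InterruptedException, ExecutionException {
        ExecutorService server = Executors.newSingleThreadExecutor();
        try {
            Future<Void> response = server.submit(serverWork);
            try {
                response.get(readTimeoutMillis, TimeUnit.MILLISECONDS);
                return Outcome.COMPLETED;
            } catch (TimeoutException e) {
                // The client gives up waiting. On a real cluster the statement could
                // still be in progress; here the finally block cancels the simulated work.
                return Outcome.CLIENT_TIMED_OUT;
            }
        } finally {
            server.shutdownNow();
        }
    }

    // Simulates a DDL statement that needs durationMillis to complete on the server.
    static Callable<Void> slowDdl(long durationMillis) {
        return () -> {
            Thread.sleep(durationMillis);
            return null;
        };
    }

    public static void main(String[] args) throws Exception {
        // Statement slower than the read timeout: the client times out.
        System.out.println(executeWithReadTimeout(slowDdl(300), 50));
        // Same kind of statement with a generous read timeout: it completes.
        System.out.println(executeWithReadTimeout(slowDdl(50), 2000));
    }
}
```

In this sketch, raising the read timeout is enough for the slow statement to complete, which corresponds to raising readTimeoutMillis in the configuration above. The server-side limit is deliberately not modeled.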

ostefano commented Jun 30, 2017

Thanks, I missed that option. Nice!
Closing the issue accordingly.

ostefano closed this Jun 30, 2017
