Cannot shut down a Neo4j instance that forms the first member of a cluster #12530

luanne · 2020-06-06T10:37:40Z

I configured the first instance of a causal cluster and started it up.
As expected, it reports:

2020-06-06 10:19:12.124+0000 INFO  ======== Neo4j 4.0.5 ========
2020-06-06 10:19:12.127+0000 INFO  Starting...
2020-06-06 10:19:15.931+0000 INFO  Database 'system' is waiting for a total of 3 core members...

I tried to shut it down to change some config, but could not. I tried neo4j stop, and also tried killing the process after starting it with neo4j console. In both cases I was forced to kill the process by PID.

The logs report:

2020-06-06 10:19:49.948+0000 INFO  Neo4j Server shutdown initiated by request
2020-06-06 10:19:56.186+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:06.245+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:16.280+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:26.345+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:36.351+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:46.357+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:56.436+0000 INFO  Database 'system' is waiting for a total of 3 core members...

Neo4j version: 4.0.5
Operating system: OSX
API/Driver: N/A

Steps to reproduce

1. Start the first instance, configured as a core member of a cluster. Do not start any other instances.
2. Try to shut it down.

Expected behavior

It shuts down gracefully.

Actual behavior

It does not shut down; the Java process has to be killed.


martinfurmanski · 2020-06-08T08:11:47Z

@luanne Could you get us a stack trace of it after you have called stop?

luanne · 2020-06-08T16:15:29Z

Sure

2020-06-06 10:19:12.124+0000 INFO  ======== Neo4j 4.0.5 ========
2020-06-06 10:19:12.127+0000 INFO  Starting...
2020-06-06 10:19:15.931+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:19:26.012+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:19:36.061+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:19:46.161+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:19:49.948+0000 INFO  Neo4j Server shutdown initiated by request
2020-06-06 10:19:56.186+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:06.245+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:16.280+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:26.345+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:36.351+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:46.357+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:20:56.436+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:21:06.493+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:21:16.573+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:21:26.652+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:21:36.725+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:21:46.775+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:21:56.799+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:22:06.840+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:22:16.868+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:22:26.918+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:22:36.962+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:22:46.967+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:22:57.024+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:23:07.112+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:23:17.194+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:23:27.244+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:23:37.265+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:23:47.318+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:23:57.392+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:24:07.400+0000 INFO  Database 'system' is waiting for a total of 3 core members...
2020-06-06 10:24:16.046+0000 ERROR Clustering components for database 'system' have encountered a critical error Encountered error when attempting to reconcile database system from state 'EnterpriseDatabaseState{databaseId=DatabaseId{00000000[system]}, operatorState=STOPPED, failed=false}' to state 'online'
java.lang.IllegalStateException: Encountered error when attempting to reconcile database system from state 'EnterpriseDatabaseState{databaseId=DatabaseId{00000000[system]}, operatorState=STOPPED, failed=false}' to state 'online'
	at com.neo4j.dbms.DbmsReconciler.reportErrorAndPanicDatabase(DbmsReconciler.java:447)
	at com.neo4j.dbms.DbmsReconciler.handleReconciliationErrors(DbmsReconciler.java:432)
	at com.neo4j.dbms.DbmsReconciler.lambda$postReconcile$15(DbmsReconciler.java:381)
	at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908)
	at com.neo4j.dbms.DbmsReconciler.postReconcile(DbmsReconciler.java:379)
	at com.neo4j.dbms.DbmsReconciler.lambda$scheduleReconciliationJob$8(DbmsReconciler.java:246)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1705)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.neo4j.dbms.api.DatabaseManagementException: Unable to start database `DatabaseId{00000000[system]}`
	at com.neo4j.dbms.database.ClusteredMultiDatabaseManager.startDatabase(ClusteredMultiDatabaseManager.java:71)
	at com.neo4j.dbms.database.ClusteredMultiDatabaseManager.startDatabase(ClusteredMultiDatabaseManager.java:31)
	at com.neo4j.dbms.database.MultiDatabaseManager.forSingleDatabase(MultiDatabaseManager.java:112)
	at com.neo4j.dbms.database.MultiDatabaseManager.startDatabase(MultiDatabaseManager.java:98)
	at com.neo4j.dbms.DbmsReconciler.start(DbmsReconciler.java:549)
	at com.neo4j.dbms.Transitions$TransitionFunction.lambda$prepare$0(Transitions.java:219)
	at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:347)
	at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:348)
	at com.neo4j.dbms.DbmsReconciler.doTransitions(DbmsReconciler.java:330)
	at com.neo4j.dbms.DbmsReconciler.lambda$doTransitions$10(DbmsReconciler.java:320)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	... 3 more
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.lifecycle.LifecycleAdapter$3@565fe93d' was successfully initialized, but failed to start. Please see the attached cause exception "Failed to join or bootstrap a raft group with id RaftId{00000000} and members DatabaseCoreTopology{DatabaseId{00000000} [MemberId{8c86fbbc}]} in time. Please restart the cluster.".
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:465)
	at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
	at com.neo4j.causalclustering.common.ClusteredDatabase.start(ClusteredDatabase.java:39)
	at com.neo4j.dbms.database.ClusteredMultiDatabaseManager.startDatabase(ClusteredMultiDatabaseManager.java:67)
	... 13 more
Caused by: java.util.concurrent.TimeoutException: Failed to join or bootstrap a raft group with id RaftId{00000000} and members DatabaseCoreTopology{DatabaseId{00000000} [MemberId{8c86fbbc}]} in time. Please restart the cluster.
	at com.neo4j.causalclustering.identity.RaftBinder$BindingConditions.allowContinue(RaftBinder.java:402)
	at com.neo4j.causalclustering.identity.RaftBinder$BindingConditions.allowContinueBinding(RaftBinder.java:382)
	at com.neo4j.causalclustering.identity.RaftBinder.bindToInitialRaftGroup(RaftBinder.java:206)
	at com.neo4j.causalclustering.identity.RaftBinder.getBoundState(RaftBinder.java:147)
	at com.neo4j.causalclustering.identity.RaftBinder.bindToRaft(RaftBinder.java:139)
	at com.neo4j.causalclustering.core.CoreBootstrap.bindAndStartMessageHandler(CoreBootstrap.java:77)
	at com.neo4j.causalclustering.core.CoreBootstrap.perform(CoreBootstrap.java:62)
	at org.neo4j.kernel.lifecycle.LifecycleAdapter$3.start(LifecycleAdapter.java:86)
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:444)
	... 16 more

Let me know if you want me to send you the debug.log somehow
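
For reference, the gaps between the key events in the log above can be worked out from the timestamps. This is a minimal sketch in Java, with the three timestamps copied from the log; the class and method names (LogGaps, parse, between) are made up for illustration and are not part of Neo4j.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Neo4j: measures gaps between log timestamps.
class LogGaps {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    // Parses the leading timestamp of a log line. The trailing "+0000" offset
    // is ignored because every line in this log uses the same offset.
    static LocalDateTime parse(String timestamp) {
        return LocalDateTime.parse(timestamp.substring(0, 23), FMT);
    }

    // Elapsed time between two log timestamps.
    static Duration between(String from, String to) {
        return Duration.between(parse(from), parse(to));
    }

    public static void main(String[] args) {
        String firstWait = "2020-06-06 10:19:15.931+0000";      // first "waiting for a total of 3 core members"
        String stopRequested = "2020-06-06 10:19:49.948+0000";  // "shutdown initiated by request"
        String bindFailed = "2020-06-06 10:24:16.046+0000";     // "Failed to join or bootstrap a raft group"

        System.out.println("first wait -> bind failure: "
                + between(firstWait, bindFailed).toMillis() + " ms");
        System.out.println("stop request -> bind failure: "
                + between(stopRequested, bindFailed).toMillis() + " ms");
    }
}
```

With these values the bind failure is logged roughly 5 minutes after the first wait message and about 4 minutes 26 seconds after the stop request, so in this log the stop request has no visible effect for at least that long.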

umuzammil · 2020-08-03T12:04:19Z

Hello, any updates on this please? Quick note that this is not limited to the first core that joins (or makes a failed attempt to join) a cluster. Once any/all members enter a state of waiting for others to join, shutdown times out (perhaps indefinitely) for any of them.
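
One possible explanation, not confirmed anywhere in this thread, is that the wait for other core members only ends on success or on timeout and never checks for a stop request. The sketch below is not Neo4j's code. BindingWait, awaitMembers, membersSeen and requestStop are hypothetical names, used only to illustrate a wait loop that also gives up promptly when shutdown is requested.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Illustrative only: not Neo4j's implementation.
class BindingWait {
    enum Outcome { BOUND, TIMED_OUT, STOPPED }

    // Released once by the shutdown path; the waiting thread blocks on it.
    private final CountDownLatch stopRequested = new CountDownLatch(1);

    // Called when shutdown is requested; wakes the waiting thread immediately.
    void requestStop() {
        stopRequested.countDown();
    }

    // Waits until enough members are visible, the timeout expires,
    // or a stop is requested, whichever happens first.
    Outcome awaitMembers(IntSupplier membersSeen, int required,
                         Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (membersSeen.getAsInt() < required) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return Outcome.TIMED_OUT;
            }
            long sleep = Math.min(remaining, pollInterval.toNanos());
            try {
                // Sleeps for at most one poll interval, returning early on stop.
                if (stopRequested.await(sleep, TimeUnit.NANOSECONDS)) {
                    return Outcome.STOPPED;
                }
            } catch (InterruptedException e) {
                // Treat interruption as a stop request and keep the interrupt flag set.
                Thread.currentThread().interrupt();
                return Outcome.STOPPED;
            }
        }
        return Outcome.BOUND;
    }

    public static void main(String[] args) {
        BindingWait wait = new BindingWait();
        // Simulate a stop request arriving 200 ms into a 5-minute wait.
        Thread stopper = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException ignored) {
                // Fall through and request the stop anyway.
            }
            wait.requestStop();
        });
        stopper.start();
        Outcome outcome = wait.awaitMembers(() -> 1, 3,
                Duration.ofMinutes(5), Duration.ofSeconds(10));
        System.out.println(outcome);
    }
}
```

In this sketch a lone member that never sees the other two returns STOPPED shortly after requestStop() is called, instead of blocking until the timeout.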

luanne added the bug label Jun 6, 2020

martinfurmanski added the causal clustering label Jun 8, 2020

mnd999 added the team-cluster label Sep 17, 2020
