Neo4j >1.9.M03 startup failure #258

Closed
pentagon opened this issue May 16, 2013 · 16 comments

Comments

@pentagon

I investigated using the recent neo4j gem with several versions of the Neo4j Java database (the neo4j-community gem, which packages the Java classes). It looks like changes made after version 1.9.M03 cause Neo4j initialization to fail with org.neo4j.kernel.lifecycle.LifecycleException in the Java::OrgNeo4jKernel::EmbeddedGraphDatabase.new(...) call.
1.9.RC1 and 1.9.RC2 have the same issue, so I guess the upcoming 1.9 release will also fail to start with the neo4j gem. To reproduce the issue, for example:
In the Gemfile:

gem 'neo4j', git: 'git://github.com/andreasronge/neo4j.git'
gem 'neo4j-community', '1.9.M05'

On the command line:

$ rails c
>  Neo4j.start
....
LoadError: load error: rails/commands -- java.lang.RuntimeException: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.extension.KernelExtensions@42fc652e' failed to initialize. Please see attached cause exception.
  require at org/jruby/RubyKernel.java:1027
   (root) at script/rails:6

Could anyone please point me to a possible solution for this issue?

@andreasronge
Member

Sorry, I can't really help you right now. When I get time I will see if I can get it working using just the Java libraries (see http://docs.neo4j.org/chunked/preview/ha.html) and then do it from JRuby. The HA Ruby layer is very thin. I guess it's just a configuration issue or a mismatch in JAR file versions.

@pentagon
Author

I think this issue is no longer relevant now that Neo4j 1.9 has been released. Unfortunately, I could not test the neo4j gem against the updated neo4j-* gems, since neo4j-core has a strict dependency on neo4j < 1.9. Are there any plans to update the related gems for the new Neo4j release?
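
For illustration, here is how RubyGems evaluates such a constraint. This is only a sketch: it assumes the dependency is written literally as '< 1.9' (the exact gemspec line is not quoted in this thread), and allowed_by_constraint? is a hypothetical helper name. It shows why the milestone and RC versions could be installed while the final 1.9 release could not.

```ruby
require 'rubygems'

# Hypothetical helper: does `version` satisfy `constraint`?
# Assumption: the neo4j-core constraint is literally '< 1.9'.
def allowed_by_constraint?(version, constraint = '< 1.9')
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# Versions containing letters (M05, RC2) are prereleases and sort
# before the final release, so they still satisfy '< 1.9'.
%w[1.9.M03 1.9.M05 1.9.RC2 1.9].each do |v|
  puts "#{v}: #{allowed_by_constraint?(v) ? 'allowed' : 'rejected'}"
end
```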

@andreasronge
Member

Yes, but I'm not sure when I will get time. Hopefully soon.


@andreasronge
Member

I have now tested it with a local build of the 1.9 jars, and all rspecs, backup tests and a simple HA cluster test work without any modifications.
I will try to release it next week.

@pentagon
Author

I really appreciate that. That sounds great. What is a "simple HA cluster"?

@andreasronge
Member

Strange, now I get the same exception as you when I follow my HA wiki page using locally built 1.9 jars.

Java::JavaLang::RuntimeException: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.lifecycle.LifeSupport@2c2debb0' was successfully initialized, but failed to start. Please see attached cause exception.
    from org.neo4j.kernel.InternalAbstractGraphDatabase.run(InternalAbstractGraphDatabase.java:281)
    from org.neo4j.kernel.ha.HighlyAvailableGraphDatabase.<init>(HighlyAvailableGraphDatabase.java:153)
    from org.neo4j.graphdb.factory.HighlyAvailableGraphDatabaseFactory$1.newDatabase(HighlyAvailableGraphDatabaseFactory.java:47)
    from org.neo4j.graphdb.factory.GraphDatabaseBuilder.newGraphDatabase(GraphDatabaseBuilder.java:207)

After investigating the db/<dbname>/message.log file I found this:

2013-05-28 18:37:57.847+0000 INFO  [o.n.k.i.DiagnosticsManager]: --- STOPPING diagnostics START ---
2013-05-28 18:37:57.847+0000 INFO  [o.n.k.i.DiagnosticsManager]: High Availability diagnostics
Member state:PENDING
State machines:
   AtomicBroadcastMessage:start
   AcceptorMessage:start
   ProposerMessage:start
   LearnerMessage:start
   HeartbeatMessage:start
   ElectionMessage:start
   SnapshotMessage:start
   ClusterMessage:start
Current timeouts:

2013-05-28 18:37:57.847+0000 INFO  [o.n.k.i.DiagnosticsManager]: --- STOPPING diagnostics END ---
2013-05-28 18:37:57.848+0000 INFO  [o.n.k.h.HighlyAvailableGraphDatabase]: Startup failed: Component 'org.neo4j.kernel.lifecycle.LifeSupport@2c2debb0' was successfully initialized, but failed to start. Please see attached cause exception.: Component 'org.neo4j.cluster.client.ClusterClient@38a452a5' was successfully initialized, but failed to start. Please see attached cause exception.: Component 'org.neo4j.cluster.com.NetworkInstance@5208b11e' was successfully initialized, but failed to start. Please see attached cause exception.: Failed to bind to: localhost/127.0.0.1:5002: Address already in use
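
The "Startup failed" line nests several "failed to start" messages, and the useful part is the last one. A minimal Ruby sketch for pulling it out, assuming the nesting always uses the separator text seen in this log (root_cause is a hypothetical helper, not part of Neo4j or the neo4j gem):

```ruby
# Separator text as it appears between nested causes in message.log.
CAUSE_SEPARATOR = 'Please see attached cause exception.: '

# Hypothetical helper: return the innermost cause of a nested
# LifecycleException message, i.e. the text after the last separator.
def root_cause(message)
  message.split(CAUSE_SEPARATOR).last.strip
end

line = "Component 'org.neo4j.cluster.com.NetworkInstance@5208b11e' was " \
       'successfully initialized, but failed to start. Please see attached ' \
       'cause exception.: Failed to bind to: localhost/127.0.0.1:5002: ' \
       'Address already in use'

puts root_cause(line)
# prints "Failed to bind to: localhost/127.0.0.1:5002: Address already in use"
```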

I then stopped all java processes and started only the HA Neo4j rails console, but then it hangs instead.
The output of my message.log:

2013-05-28 18:51:48.588+0000 INFO  [o.n.k.h.HighlyAvailableGraphDatabase]: Started - database is now available
2013-05-28 18:51:48.589+0000 INFO  [o.n.k.h.HighlyAvailableGraphDatabase]: GC Monitor started. 
2013-05-28 18:51:48.667+0000 INFO  [o.n.c.m.p.PaxosClusterMemberAvailability]: Listening at:cluster://127.0.0.1:5002
2013-05-28 18:51:48.674+0000 INFO  [o.n.c.c.ClusterJoin]: Attempting to join cluster of [localhost:5001, localhost:5002, localhost:5003]
2013-05-28 18:51:49.671+0000 INFO  [o.n.c.c.NetworkInstance]: cluster://127.0.0.1:5002 opened a new channel to localhost/127.0.0.1:5002

I'm using this Neo4j HA configuration: https://github.com/andreasronge/neo4j/wiki/Neo4j%3A%3Aha-cluster
This did work with the old 1.9.M03.

@thekendalmiller
Contributor

The first "failed to start" looks like the message when you try to start multiple neo4j's with the same ID or port. I am seeing the 'hang'.
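
To rule out the port conflict before starting another instance, something like the following can be run from Ruby. It is only a sketch using the standard socket library; port_free? is a hypothetical helper name, and 5001..5003 are the cluster ports from the log above.

```ruby
require 'socket'

# Hypothetical helper: true if nothing is listening on host:port.
# It briefly binds a listening socket and closes it again.
def port_free?(port, host = '127.0.0.1')
  server = TCPServer.new(host, port)
  server.close
  true
rescue Errno::EADDRINUSE
  false
end

# Cluster ports taken from the message.log output in this thread.
[5001, 5002, 5003].each do |port|
  puts "#{port}: #{port_free?(port) ? 'free' : 'in use'}"
end
```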

I also tried a simple HA setup using embedded neo4j in java and ended up with the same hang. This discussion seems relevant and I added what I experienced: https://groups.google.com/forum/?fromgroups=#!topic/neo4j/OBXHmZZxEZ0

@thekendalmiller
Contributor

Resolved!
To use neo4j >1.9.M03 (the 1.9 jars in the pull request), start the server and load a page so that Neo4j starts. It will keep waiting until you also start the console and trigger Neo4j to start there (SomeModel.first will work).

Neo4j 1.9.M03 was the last version that would let an HA cluster start without a majority of the nodes in the cluster (according to Peter in the Google Group discussion).

Also, it might be worth adding a note that HA will not work if you have ruby-debug in your Gemfile, because IRB will then always be defined. I tried to find another constant that would work but had no luck, since the railtie runs so early (before_configuration).
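
One possible way around the IRB check is an explicit override. The sketch below is hypothetical throughout: NEO4J_HA_SERVER_ID is a made-up environment variable, ha_server_id is not a method of the neo4j gem, and the mapping of console to id 2 and server to id 1 is an assumption for illustration only.

```ruby
# Hypothetical sketch: choose the HA server id for this process.
# An explicit environment variable wins; otherwise fall back to the
# IRB-based guess, which ruby-debug defeats by always defining IRB.
def ha_server_id(env = ENV, console: !!defined?(IRB))
  override = env['NEO4J_HA_SERVER_ID'] # made-up variable name
  return Integer(override) if override && !override.empty?

  console ? 2 : 1 # assumed mapping, for illustration only
end

puts ha_server_id
```

With ruby-debug in the Gemfile, the server process could then be started with NEO4J_HA_SERVER_ID=1 and the console with NEO4J_HA_SERVER_ID=2, regardless of whether IRB is defined.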

@andreasronge
Member

Fantastic, well done!
Regarding the hang and having to trigger Neo4j to start in the console: maybe we should let Neo4j autostart?
Regarding the ruby-debug issue: maybe there is some Neo4j configuration we can use to discover the cluster, instead of telling it which ports to use depending on whether IRB is defined?

@jtescher

Awesome! HA is working well for me with the 1.9 jars.

@thekendalmiller
Contributor

I think a nice way to solve the hang and having to trigger Neo4j by console might be to start with initial_hosts being only the current process + port. Then when another 'host' starts (console or server), initial_hosts will be the previous host(s) and the new host for this process. It sounds like Neo4j supports this based on HA Setup.

We might be able to use ha.discovery.url and write to the file as server or console processes are started. Or we could keep track of the processes with a file in tmp/ that behaves like a pid file.
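
The file-tracking idea could look roughly like this. It is a sketch only: register_host and the registry path are hypothetical, nothing here is part of the neo4j gem, and it does not remove entries when a process exits.

```ruby
require 'fileutils'

# Hypothetical helper: add this process's host:port to a shared
# registry file and return every host registered so far. The result
# could then be used as initial_hosts for the starting process.
def register_host(registry_path, host)
  FileUtils.mkdir_p(File.dirname(registry_path))
  File.open(registry_path, File::RDWR | File::CREAT, 0o644) do |f|
    f.flock(File::LOCK_EX) # serialize server and console startups
    hosts = f.read.split("\n").reject(&:empty?)
    hosts << host unless hosts.include?(host)
    f.rewind
    f.write(hosts.join("\n") + "\n")
    f.truncate(f.pos)
    hosts
  end
end
```

For example, the server process might call register_host('tmp/neo4j_ha_hosts', 'localhost:5001') and get back only itself, while a console started later with 'localhost:5002' would get both entries.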

I'll probably play with this some more this week as I have time. Also, #240 is the ruby-debug ticket which is what I was experiencing.

@thekendalmiller
Contributor

One other thought as a 'quick' fix. Neo4j is waiting for a majority to be available. If we change the railtie initial_hosts to be only [0,1] the cluster will probably start up assuming that 50% is considered a majority.

I still like the idea of updating initial_hosts as more processes are started, but it might be worth doing the quick fix to unblock people trying Neo4j 1.9.

@andreasronge
Member

Great, I've now released the 1.9 jars.
Ok, I will wait and see if I get a pull request on this, and then I will make a release.

@pentagon
Author

I look forward to testing 1.9 :)

@thekendalmiller
Contributor

Changing initial hosts to be [1,2] did not work. Both server and console had to have started Neo4j.
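
This result is consistent with the cluster requiring a strict majority of initial_hosts, so one running instance out of two is not enough. A small sketch of that arithmetic, assuming "majority" means more than half (the thread does not state the exact rule, and majority and quorum? are hypothetical helper names):

```ruby
# Smallest number of members that is more than half of the cluster.
def majority(cluster_size)
  cluster_size / 2 + 1
end

# True if enough members are running to form a strict majority.
def quorum?(running, cluster_size)
  running >= majority(cluster_size)
end

puts majority(3)   # prints 2
puts majority(2)   # prints 2
puts quorum?(1, 2) # prints false
```

Under this assumption, shrinking initial_hosts from three entries to two does not lower the number of processes that must be started: two are needed either way.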
