Random exits #32

Closed
coeus01 opened this issue May 16, 2016 · 15 comments

@coeus01

coeus01 commented May 16, 2016

Hello, I'm using Neo4j in a docker-compose project. Sometimes I need to start the Neo4j container 3-4 times before it stays up, and it also exits at random (Exited (137)), sometimes after 3 days, sometimes after 3 hours. The same thing happens with docker-compose up as well as with docker start project_db. The other services in the same project work as expected.

docker-compose.yml

version: '2'   
services:   
    db:   
        image: neo4j   
        ports:   
            - "7474:7474"   
        volumes:   
            - "./data:/data"   
        environment:   
            NEO4J_AUTH: "neo4j/neo4j"   
@benbc
Contributor

benbc commented May 16, 2016

@coeus01 Do you see anything in the logs for the failed container? I would have thought the most likely problem is running out of memory. You may need to increase the heap size or cache size; the environment variable you need depends on which version of Neo4j you're using.
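
For context on why memory is a natural first suspect: the Exited (137) status reported above follows the common convention that an exit status above 128 means the process was terminated by a signal, with the signal number equal to the status minus 128. 137 therefore points to signal 9 (SIGKILL), which is what the Linux out-of-memory killer sends, although a docker kill or an expired docker stop grace period ends in the same status. A minimal sketch of that arithmetic follows; the function name is hypothetical and not part of any Docker or Neo4j tooling.

```python
import signal
from typing import Optional


def signal_from_exit_status(status: int) -> Optional[str]:
    """Return the signal name implied by an exit status, or None.

    Assumes the usual convention that status = 128 + signal number.
    """
    if status <= 128:
        return None  # ordinary exit code, not a signal
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


# 137 - 128 = 9, which is SIGKILL on Linux
print(signal_from_exit_status(137))
```

SIGKILL alone does not prove an out-of-memory kill. The State.OOMKilled field in the output of docker inspect for the container is one way to tell the two cases apart.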

@coeus01
Author

coeus01 commented May 16, 2016

@benbc Nothing in the logs. The problem also sometimes happens on docker-compose up, where the output looks like this:

db_1   | 2016-05-12 13:05:58.796+0000 INFO  No SSL certificate found, generating a self-signed certificate..
db_1   | 2016-05-12 13:06:00.809+0000 INFO  Starting...
db_1   | 2016-05-12 13:06:01.836+0000 INFO  Bolt enabled on localhost:7687.
db_1   | 2016-05-12 13:17:06.634+0000 INFO  Starting...
db_1   | 2016-05-12 13:17:09.079+0000 INFO  Bolt enabled on localhost:7687.
db_1   | 2016-05-12 13:28:04.189+0000 INFO  Starting...
db_1   | 2016-05-12 13:28:05.602+0000 INFO  Bolt enabled on localhost:7687.
db_1   | 2016-05-12 13:34:23.909+0000 INFO  Starting...
db_1   | 2016-05-12 13:34:24.508+0000 INFO  Bolt enabled on localhost:7687.
db_1   | 2016-05-12 13:36:18.871+0000 INFO  Starting...
db_1   | 2016-05-12 13:36:22.748+0000 INFO  Bolt enabled on localhost:7687.
db_1   | 2016-05-12 13:37:32.095+0000 INFO  Starting...
db_1   | 2016-05-12 13:37:32.821+0000 INFO  Bolt enabled on localhost:7687.
db_1   | 2016-05-12 13:37:36.578+0000 INFO  Started.
db_1   | 2016-05-12 13:37:38.397+0000 INFO  Remote interface available at http://localhost:7474/
db_1   | 2016-05-12 13:37:39.044+0000 WARN  Failed authentication attempt for 'neo4j' from 127.0.0.1
db_1   | 2016-05-12 13:37:39.346+0000 INFO  Neo4j Server shutdown initiated by request
db_1   | 2016-05-12 13:37:39.390+0000 INFO  Stopping...
db_1   | 2016-05-12 13:37:41.320+0000 INFO  Stopped.
db_1   | 2016-05-12 13:45:15.939+0000 INFO  Starting...
db_1   | 2016-05-12 13:45:19.587+0000 INFO  Bolt enabled on 0.0.0.0:7687.
db_1   | 2016-05-12 13:45:50.703+0000 INFO  Starting...
db_1   | 2016-05-12 13:45:51.382+0000 INFO  Bolt enabled on 0.0.0.0:7687.
db_1   | 2016-05-12 13:45:55.091+0000 INFO  Started.
db_1   | 2016-05-12 13:45:56.355+0000 INFO  Remote interface available at http://0.0.0.0:7474/
db_1   | 2016-05-12 13:45:57.066+0000 INFO  Neo4j Server shutdown initiated by request
db_1   | 2016-05-12 13:45:57.109+0000 INFO  Stopping...
db_1   | 2016-05-12 13:45:57.374+0000 INFO  Stopped.
db_1   | 2016-05-13 05:20:46.266+0000 INFO  Starting...
db_1   | 2016-05-13 05:20:46.890+0000 INFO  Bolt enabled on 0.0.0.0:7687.
db_1   | 2016-05-13 05:20:50.237+0000 INFO  Started.
db_1   | 2016-05-13 05:20:51.540+0000 INFO  Remote interface available at http://0.0.0.0:7474/
db_1   | 2016-05-13 05:20:52.648+0000 INFO  Neo4j Server shutdown initiated by request
db_1   | 2016-05-13 05:20:52.716+0000 INFO  Stopping...
db_1   | 2016-05-13 05:20:53.022+0000 INFO  Stopped.
db_1   | 2016-05-16 05:17:36.086+0000 INFO  Starting...
db_1   | 2016-05-16 05:17:37.043+0000 INFO  Bolt enabled on 0.0.0.0:7687.
db_1   | 2016-05-16 05:17:42.135+0000 INFO  Started.
db_1   | 2016-05-16 05:17:43.614+0000 INFO  Remote interface available at http://0.0.0.0:7474/
db_1   | 2016-05-16 05:17:44.074+0000 INFO  Neo4j Server shutdown initiated by request
db_1   | 2016-05-16 05:17:44.125+0000 INFO  Stopping...
db_1   | 2016-05-16 05:17:44.480+0000 INFO  Stopped.
db_1   | 2016-05-16 08:40:55.584+0000 INFO  Starting...
db_1   | 2016-05-16 08:40:57.331+0000 INFO  Bolt enabled on 0.0.0.0:7687.
db_1   | Neo4j failed to start

Other times it exits earlier:

db_1   | 2016-05-16 08:43:01.494+0000 INFO  No SSL certificate found, generating a self-signed certificate..
db_1   | 2016-05-16 08:43:04.206+0000 INFO  Starting...
db_1   | 2016-05-16 08:43:08.044+0000 INFO  Bolt enabled on localhost:7687.
db_1   | Neo4j failed to start

After a few runs of docker-compose up it works.

@benbc
Contributor

benbc commented May 16, 2016

@coeus01 Do you have any insight into why there are those repeated logs "Starting/Bolt enabled"? Are you trying to start it multiple times there, or is Docker Compose doing that?

@coeus01
Author

coeus01 commented May 16, 2016

@benbc No, they are completely random: sometimes there is only one, other times there are 10.

@benbc
Contributor

benbc commented May 16, 2016

@coeus01 Can you give us as much information as possible about your system? Docker and Neo4j versions, OS and version, system memory available, anything else that might be relevant.

@coeus01
Author

coeus01 commented May 16, 2016

@benbc

  • neo4j 3.0.0
  • Docker version 1.11.1, build
  • docker-compose version 1.7.1, build 6c29830
  • Debian GNU/Linux 8.4 (jessie)
  • Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
  • MemTotal: 4060764 kB

@coeus01
Author

coeus01 commented May 18, 2016

UPDATE: even when I start all the containers with docker start <1> <2> <3> instead of docker-compose up, the neo4j container again takes a few attempts to start and stay up.

@benbc
Contributor

benbc commented May 18, 2016

@coeus01 Thank you, that's useful information. We'll take a look.

@spacecowboy
Contributor

spacecowboy commented May 19, 2016

@coeus01 The problem is in your docker-compose.yml file.

This line:

NEO4J_AUTH: "neo4j/neo4j"

Change the password to anything other than neo4j:

NEO4J_AUTH: "neo4j/foo"

Neo4j does not support setting a custom password equal to the default neo4j password. That's why the container never comes up. An error message is shown if you try to do it in the browser.

We should probably add some logic to the Docker entrypoint script that clearly tells you not to set the password to neo4j.
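
The kind of check suggested here could look roughly like the sketch below. It is illustrative only: it is written in Python rather than as part of the image's real entrypoint script, the function name and messages are hypothetical, and it handles only the user/password form of NEO4J_AUTH shown in this thread.

```python
DEFAULT_PASSWORD = "neo4j"


def check_auth(value: str) -> str:
    """Return an error message for an unusable NEO4J_AUTH value, or "" if it looks fine.

    Only the <user>/<password> form is considered (an assumption of this sketch).
    """
    user, sep, password = value.partition("/")
    if not sep or not user or not password:
        return "NEO4J_AUTH must have the form <user>/<password>"
    if password == DEFAULT_PASSWORD:
        return "NEO4J_AUTH password must not be the default password 'neo4j'"
    return ""


print(check_auth("neo4j/neo4j"))        # rejected: password equals the default
print(check_auth("neo4j/foo") or "ok")  # accepted
```

Failing fast with a message like this would make the misconfiguration visible in the container output instead of leaving a container that never comes up.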

@coeus01
Author

coeus01 commented May 20, 2016

@spacecowboy My password is not neo4j; that was just for the sake of the example. Also, as I said before, it works OK after a few tries, so even if my password were neo4j it wouldn't matter, since I can get it working after a few attempts.

@benbc
Contributor

benbc commented May 25, 2016

@coeus01 We have now reproduced your problem. Setting the password via NEO4J_AUTH doesn't work on Debian Jessie (although it works fine on various Ubuntu versions). We are investigating further.

@benbc benbc self-assigned this May 25, 2016
benbc added a commit that referenced this issue May 26, 2016
… password

This start-up was timing out on Debian Jessie but not other
platforms. The timeout should have been longer anyway because, while
it's reasonable to expect the database to start in 10 seconds, we can
afford to give it a very long timeout to accommodate slow edge cases
since it will only fail if something is badly wrong.

Fixes #32.
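
The reasoning in this commit message can be illustrated with a generic readiness poll. The sketch below is an assumption-laden illustration, not the code from the commit: the function name, the readiness check, and the 600-second figure are all hypothetical. The point it shows is that a long timeout costs nothing in the normal case, because the loop returns as soon as the database reports ready, and the full wait is only spent when something is badly wrong.

```python
import time
from typing import Callable


def wait_until(is_ready: Callable[[], bool], timeout: float, interval: float = 0.5) -> bool:
    """Poll is_ready until it returns True or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True  # ready: return at once, however long the timeout is
        if time.monotonic() >= deadline:
            return False  # gave up: the service never became ready
        time.sleep(interval)


# Stand-in for a real check (for example, probing the server's HTTP port):
# this fake service becomes ready on the third poll.
polls = {"count": 0}


def ready_on_third_poll() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


# A generous timeout does not slow down the normal case.
print(wait_until(ready_on_third_poll, timeout=600, interval=0.01))
```

With a short timeout, a platform that merely starts slowly looks identical to a broken one, which matches the intermittent "Neo4j failed to start" behaviour described earlier in the thread.
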
@benbc
Contributor

benbc commented May 26, 2016

@coeus01 The fix for this will be in Neo4j 3.0.2.

@benbc benbc closed this as completed May 26, 2016
@textbook

textbook commented Jun 6, 2016

@benbc when will this version be available in the Docker Hub? Currently it's still showing 3.0.1, so I think the issue should still show as open.

@benbc
Contributor

benbc commented Jun 6, 2016

@textbook The 3.0.2 image will be published later today.

@textbook

textbook commented Jun 7, 2016

For anyone else wondering, the new image will be available when this PR is accepted: docker-library/official-images#1809
