The joex container restarts every 3 or 4 minutes #976

Closed
aelsenaar opened this issue Aug 1, 2021 · 11 comments · Fixed by #998
Labels: bug, docker

Comments

@aelsenaar

Hi,

I am using Docspell in a Docker (Swarm) environment and deployed it with the compose file, with some modifications. At first everything seems to work fine, but the joex container restarts every 3 or 4 minutes.

Monitoring the log of the joex container, I see no indication of any problem just before it restarts. What can I do to find the cause of these restarts?

Alexander

@aelsenaar
Author

Good to add, I think: when using version 0.23 of the Docker images, the joex container stays up and gets the status 'running'; with the latest (0.25.1) it always keeps the status 'starting'.

@eikek
Owner

eikek commented Aug 2, 2021

Hm, this sounds strange. Could this be a memory problem? Does it happen with 0.24.0, too?

@aelsenaar
Author

Hi,

I doubt it is a memory problem. No indication whatsoever. The container can claim all memory available.

I checked version v0.24.0 and v0.23.0 of the image and both of them have the same problem. After some time it restarts.

I mentioned that version 0.23 was working fine, but I forgot to mention that I am talking about the image eikek0/docspell:joex-v0.23.0. That is the image in the old organization.

Alexander

@eikek
Owner

eikek commented Aug 2, 2021

The 0.23.0 images in both repositories are built from the exact same sources. So I suspect a configuration issue for now. Did you change anything in your config files / env files? Maybe you could look into the current docker-compose setup and see if there are some changes. The changelog to 0.24.0 also contains some notes.

@floli

floli commented Aug 2, 2021

Does the system log (journalctl) show anything conclusive?

@aelsenaar
Author

Hi,

I checked the system logs but nothing there.

I can see the container exits with exit code 137:

[Screenshot: Screenshot_20210803_112022]

Googling this exit code suggests it is related to the OOM killer. But the logs do not mention this and, in the screenshot above, you can also see that 'OOMKilled' is false. Strange.

Is exit code 137 related to something in Docspell?
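
For reference, on Linux an exit code above 128 conventionally means the process was terminated by a signal, the code being 128 plus the signal number. So 137 = 128 + 9, i.e. SIGKILL, which is what the OOM killer sends but also what any other forced kill produces. A minimal Python sketch of that arithmetic (the function name `describe_exit_code` is made up for illustration):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Turn a process/container exit code into a short description."""
    if code > 128:
        # Convention: 128 + N means "terminated by signal N".
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))  # on Linux: killed by SIGKILL
```

This only decodes the number; it cannot tell who sent the signal.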

I will try to install a clean Docspell on another machine and see how it behaves there.

Alexander

@aelsenaar
Author

I installed docker on another machine and followed the docker-compose procedure here

Right after the 'docker-compose up' command, I see the containers starting:
[Screenshot: Docspell-started]

After 10 minutes all the containers are running except joex. That one is 'unhealthy'.
[Screenshot: Docspell-after 10 minutes]

Although unhealthy, it stays up and running and I experience no problems using Docspell. I would expect the status to also go to healthy/running. Is my assumption correct?

Maybe an unhealthy container behaves differently in a Docker swarm and gets restarted? (I cannot find any documentation on that yet.)

@eikek
Owner

eikek commented Aug 3, 2021

The exit code is JVM related; Docspell itself doesn't set any specific exit code. If there are no logs etc., it seems to me that the process is killed in some hard way. I have no experience with Docker swarm, so I don't know how it works or what it does. The health check comes from the joex image. I think it can be improved; it currently only greps for a process. I would assume that it should go to "healthy" at some point… But if joex is working, then the health check is not working correctly :-)
The rc 137 indicates an OOM-related problem. You can use this route to see how much memory got allocated to the running JVM: http://localhost:7878/api/info/system (this is the port of joex). Then you probably need to use tools from the system or Docker swarm to see what it did to the process; maybe dmesg helps here?
Also, 10 minutes seems very long for getting up and running!

@aelsenaar
Author

aelsenaar commented Aug 3, 2021

Hi eikek,

Thanks for the response.

It is healthcheck related! I adjusted my docker-compose file and added the following mock healthcheck to the joex service definition:

healthcheck:
  test: "exit 0"
  interval: 1m
  timeout: 10s
  retries: 2

Now the container gets the status "running" and stays up and running. I think that Docker swarm will 'hard kill' the container when there is a healthcheck definition and the status does not become 'healthy' within some reasonable time.

Alexander

@floli

floli commented Aug 3, 2021

I can confirm that. On a fresh install using the repo's docker-compose file, the state is:

    "State": {
        "Dead": false,
        "Error": "",
        "ExitCode": 0,
        "FinishedAt": "0001-01-01T00:00:00Z",
        "Health": {
            "FailingStreak": 2,
            "Log": [
                {
                    "End": "2021-08-03T20:41:58.619072091+02:00",
                    "ExitCode": 1,
                    "Output": "",
                    "Start": "2021-08-03T20:41:58.550549785+02:00"
                },
                {
                    "End": "2021-08-03T20:42:58.684636824+02:00",
                    "ExitCode": 1,
                    "Output": "",
                    "Start": "2021-08-03T20:42:58.621384851+02:00"
                }
            ],
            "Status": "unhealthy"
        },
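
For anyone who wants to pull this out without reading the full dump: `docker inspect <container>` prints a JSON array whose first element has the "State" object shown above. Below is a small Python sketch that condenses it. The helper names are hypothetical, and the container name used in the note after the code is only a placeholder.

```python
import json
import subprocess


def summarize_health(state: dict) -> str:
    """Condense the "State" object from `docker inspect` into one line."""
    health = state.get("Health")
    if health is None:
        # Containers without a healthcheck have no "Health" key.
        return "no healthcheck defined"
    exit_codes = [entry["ExitCode"] for entry in health.get("Log", [])]
    return (
        f"{health['Status']}: failing streak {health.get('FailingStreak', 0)}, "
        f"recent check exit codes {exit_codes}"
    )


def inspect_state(container: str) -> dict:
    """Fetch the "State" object of one container via the docker CLI."""
    out = subprocess.run(
        ["docker", "inspect", container],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)[0]["State"]
```

Something like `print(summarize_health(inspect_state("my-joex-container")))` would then print a one-line summary; for the state above that is an unhealthy status with a failing streak of 2 and check exit codes [1, 1].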

@eikek
Owner

eikek commented Aug 4, 2021

Thanks for the confirmations! I think the healthcheck is broken and needs to be fixed. It would be better to use a curl command against the info/version endpoint of joex (as the restserver image does).
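
To illustrate the difference from grepping for a process, here is a rough Python sketch of what an HTTP-based probe checks: the service must actually answer a request. It is not the healthcheck that ended up in the image. The URL in the note below is an assumption, pieced together from the joex port 7878 and the info/version endpoint mentioned in this thread.

```python
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True only if a GET on `url` answers with HTTP 200 in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError (4xx/5xx) and socket timeouts are all
        # OSError subclasses, so any of them counts as "not healthy".
        return False
```

A healthcheck command would call `is_healthy("http://localhost:7878/api/info/version")` and exit with 0 or 1 accordingly, which is the same contract as `curl --fail`.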

@eikek added the bug and docker labels Aug 4, 2021
@eikek added this to the Docspell 0.26.0 milestone Aug 6, 2021
@mergify bot closed this as completed in #998 Aug 11, 2021