Having Issue with tini not reaping zombies #130

Open
dbjpanda opened this issue Jan 22, 2019 · 15 comments

Comments

@dbjpanda

dbjpanda commented Jan 22, 2019

When I exec mysqld inside entrypoint.sh, a zombie process appears. Without exec, I always get an extra process: a shell as the immediate child of tini, with mysqld as the child of that shell. Am I doing something wrong?

Dockerfile

COPY entrypoint.sh /
ENTRYPOINT ["tini", "-g", "--"]
EXPOSE 3306

CMD ["/entrypoint.sh"]

entrypoint.sh

#!/bin/sh
set -ex
mkfifo /tmp/mysqld.init
echo 'CREATE DATABASE IF NOT EXISTS test;' > /tmp/mysqld.init &
exec mysqld --init-file="/tmp/mysqld.init"

top

PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   6     1 mysql    S     383m  19%   0   0% mysqld --init-file=/tmp/mysqld.init
  41     0 root     R     1532   0%   0   0% top
   1     0 root     S      764   0%   1   0% tini -g -- /entrypoint.sh
   8     6 root     Z        0   0%   1   0% [entrypoint.sh]

I also tried tini as a subreaper, with something like the line below inside entrypoint.sh, but tini itself becomes a zombie. Without the control operator & it works fine.
tini -s -vvv -- sleep 60 &
And here is the output I get.

[TRACE tini (7)] Registered as child subreaper
[INFO  tini (7)] Spawned child process 'sleep' with pid '8'
....................................................
[TRACE tini (7)] No child to reap
[TRACE tini (7)] No child to reap
[TRACE tini (7)] No child to reap
[DEBUG tini (7)] Received SIGCHLD
[DEBUG tini (7)] Reaped child with pid: '8'
[INFO  tini (7)] Main child exited normally (with status '0')
[TRACE tini (7)] No child to wait
[TRACE tini (7)] Exiting: child has exited
@krallin
Owner

krallin commented Feb 7, 2019

Where you ran top, can you run ps auxwww --forest instead? This should let us see the process hierarchy and make it easier to understand where things went off the rails.

@dbjpanda
Author

dbjpanda commented Feb 7, 2019

@krallin After some trial and error, exec tini -g -- "$@" --init-file="/tmp/mysqld.init" solved my issue, but I am still confused about why the version above doesn't work. I ran top after creating the container, e.g. docker-compose up -d followed by docker-compose exec mariadb top.

@hashhar

hashhar commented Apr 18, 2019

Dockerfile

FROM fabric8/java-centos-openjdk8-jdk

ENV TINI_VERSION=0.18.0
RUN curl -fSL -o /tmp/tini https://github.com/krallin/tini/releases/download/v${TINI_VERSION}/tini && \
    chmod +x /tmp/tini

COPY docker-entrypoint.sh connector-deploy.sh heartbeat.sh Scratch.java /
RUN javac Scratch.java
ENTRYPOINT ["/tmp/tini", "-vvv", "-g", "-w", "--", "/docker-entrypoint.sh"]
CMD ["start"]

docker-entrypoint.sh

#!/bin/bash
set -e
case $1 in
    start)
        ./connector-deploy.sh &
        ./heartbeat.sh &
        exec java Scratch;;
esac
exec "$@"

heartbeat.sh (Runs ps auxwww --forest every 10 seconds)

#!/bin/bash
while true; do
    ps auxwww --forest
    sleep 10
done

connector-deploy.sh (Dies after 10 seconds to become a zombie)

#!/bin/bash

echo "$0: We have started"
while true; do
    sleep 10
    break
done
echo "$0: We are going to become a zombie"

Scratch.java (Simulates a long running task by printing every 20 seconds)

import java.time.Duration;
class Scratch {
  public static void main(String[] args) throws InterruptedException {
    while (true) {
      System.out.println("JAVA: ALIVE");
      Thread.sleep(Duration.ofSeconds(20).toMillis());
    }
  }
}

The output once I build and run the image.

~/tmp/tini-report $ docker run -it --rm foo
[INFO  tini (1)] Spawned child process '/docker-entrypoint.sh' with pid '6'
./connector-deploy.sh: We have started
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
jboss        1  0.0  0.0   4368   648 pts/0    Ss   08:07   0:00 /tmp/tini -vvv -g -w -- /docker-entrypoint.sh start
jboss        6  0.0  0.0  11692  2636 pts/0    D+   08:07   0:00 /bin/bash /docker-entrypoint.sh start
jboss        7  0.0  0.0  11696  2504 pts/0    S+   08:07   0:00  \_ /bin/bash ./connector-deploy.sh
jboss        9  0.0  0.0   4372   648 pts/0    S+   08:07   0:00  |   \_ sleep 10
jboss        8  0.0  0.0  11692  2588 pts/0    S+   08:07   0:00  \_ /bin/bash ./heartbeat.sh
jboss       10  0.0  0.0  51748  3548 pts/0    R+   08:07   0:00      \_ ps auxwww --forest
JAVA: ALIVE
[TRACE tini (1)] No child to reap
...
[TRACE tini (1)] No child to reap
./connector-deploy.sh: We are going to become a zombie
[TRACE tini (1)] No child to reap
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
jboss        1  0.8  0.0   4368   648 pts/0    Ss   08:07   0:00 /tmp/tini -vvv -g -w -- /docker-entrypoint.sh start
jboss        6  2.5  0.1 6485800 24304 pts/0   Sl+  08:07   0:00 java Scratch
jboss        7  0.0  0.0      0     0 pts/0    Z+   08:07   0:00  \_ [connector-deplo] <defunct>
jboss        8  0.0  0.0  11692  2588 pts/0    S+   08:07   0:00  \_ /bin/bash ./heartbeat.sh
jboss       26  0.0  0.0  51748  3444 pts/0    R+   08:07   0:00      \_ ps auxwww --forest
[TRACE tini (1)] No child to reap
JAVA: ALIVE
[TRACE tini (1)] No child to reap
^C[DEBUG tini (1)] Received SIGCHLD
[DEBUG tini (1)] Reaped child with pid: '6'
[INFO  tini (1)] Main child exited normally (with status '130')
[DEBUG tini (1)] Reaped child with pid: '7'
[WARN  tini (1)] Reaped zombie process with pid=7
[TRACE tini (1)] No child to reap
[TRACE tini (1)] Exiting: child has exited

As you can see from the log above, connector-deploy.sh gets marked as a zombie and is not collected until the main child exits. If you go into the container and pkill heartbeat.sh, the same thing happens, i.e. heartbeat.sh becomes a zombie and tini collects the orphaned sleep that was running under heartbeat.sh.

@yosifkit

yosifkit commented Apr 18, 2019

As far as I understand it, your problem is in trying to run more than one thing. Tini can only reap processes that have no parent (i.e. are re-parented to tini via being pid 1 or --subreaper). In both examples here, the zombie processes have a parent. Let me try going through this Java example, since it has ps output to back my conclusions.

  • tini starts with a single child /bin/bash /docker-entrypoint.sh as pid 6
  • pid 6 creates two children (pids 7 and 8)
  • while 7 and 8 are starting/sleeping, pid 6 replaces itself with a java process via exec and is still pid 6
  • pids 7 and 8 still have a parent and so are not re-parented to tini
  • the java process, pid 6, is now the parent of the two shell scripts and is responsible for doing a wait on them when it is sent the SIGCHLD

Tini can't possibly reap these as written since the parent still exists. You could possibly write the "zombie" scripts to do fork and exec in order to divorce themselves from their parent (i.e. like a classic daemon), but there might be a race if you leave in the &. I would really suggest moving to something that will actually monitor the other processes, like a full init system or a "sidecar" container (https://kubernetes.io/docs/concepts/workloads/pods/pod-overview/).
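
The sequence above can be reproduced without Docker or tini. This is only a minimal sketch, assuming a Linux host with /proc mounted; proc_state is a hypothetical helper written for this illustration, and sleep stands in for both the helper script and the long-running main process. A subshell backgrounds a short-lived helper and then replaces itself via exec with a program that never calls wait, so the helper stays a zombie for as long as that program lives:

```shell
#!/bin/sh
# Assumption: Linux with /proc mounted. proc_state is a hypothetical helper.

# Print the one-letter state (R, S, Z, ...) of the process with the given PID.
proc_state() {
    # The state is the first field after the last ")" in /proc/<pid>/stat;
    # stripping up to that point copes with command names containing spaces.
    sed 's/^.*) //' "/proc/$1/stat" | cut -d' ' -f1
}

pidfile=$(mktemp)

(
    sleep 1 &                 # short-lived helper (stands in for connector-deploy.sh)
    echo "$!" > "$pidfile"    # record the helper's PID
    exec sleep 3              # replace the subshell with a program that never waits
) &

sleep 2                       # by now the helper has exited
helper_pid=$(cat "$pidfile")
helper_state=$(proc_state "$helper_pid")
echo "helper $helper_pid is in state $helper_state"

wait                          # let the exec'd sleep finish
rm -f "$pidfile"
```

On a typical Linux system this should report state Z for the helper. Once the exec'd process exits, the zombie is re-parented and whatever init or subreaper adopts it can finally reap it, which matches the "Reaped zombie process with pid=7" line in the log above.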

@hashhar

hashhar commented Apr 19, 2019

@yosifkit I did eventually end up implementing a proper daemon in bash, and in that case tini does reap the process, since it gets reparented to tini when doing the double fork.

"but there might be a race if you leave in the &"

Could you elaborate on the quoted part? What kind of race?

@yosifkit

I was just thinking that it might still be possible to get a zombie:

  • say the entrypoint is pid 6
  • it spawns your new daemon script to run in the background (heartbeat.sh & as pid 7)
    • heartbeat.sh hasn't yet done its own fork + exec
      • maybe it just hasn't been given processor time
      • or that it has a bunch of setup before it does the fork/exec
      • many reasons since it is now being run in parallel to the entrypoint and thus order between them is not guaranteed
  • pid 6 execs to java or whatever
  • pid 7 forks to create pid 8 and pid 7 exits
  • pid 7 is still a child of pid 6 and is unreaped

If you want to ensure that it has completed its fork and exec, just run the script without backgrounding it with &.
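
A different way to close this window from the entrypoint side, sketched here under the assumption of a Linux host with /proc mounted, is to start the helper as ( helper & ). The entrypoint shell waits for the subshell, and the subshell exits as soon as it has forked the helper, so the helper is already orphaned (and re-parented to pid 1 or the nearest subreaper, which would be tini in the setups above) before the entrypoint reaches its exec. In this sketch sleep 3 stands in for the helper script and parent_pid is a hypothetical helper written for this illustration:

```shell
#!/bin/sh
# Assumption: Linux with /proc mounted. parent_pid is a hypothetical helper.

# Print the parent PID of the process with the given PID.
parent_pid() {
    # After the last ")" in /proc/<pid>/stat the fields are: state, ppid, ...
    sed 's/^.*) //' "/proc/$1/stat" | cut -d' ' -f2
}

pidfile=$(mktemp)

# The subshell forks the helper and exits right away. This shell waits for
# the subshell, so by the next line the helper has already been re-parented
# (unless this shell is itself pid 1, in which case the orphan comes back to it).
( sleep 3 & echo "$!" > "$pidfile" )

helper_pid=$(cat "$pidfile")
helper_ppid=$(parent_pid "$helper_pid")
echo "this shell: $$, helper $helper_pid has parent $helper_ppid"

# A real entrypoint would now do something like: exec java Scratch
kill "$helper_pid"            # clean up the stand-in helper
rm -f "$pidfile"
```

Because the subshell runs in the foreground, there is no ordering problem between the helper's detachment and the entrypoint's exec; the cost is that the entrypoint no longer has any handle on the helper.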

@perllaghu

Further to this, we are also experiencing zombie issues, and we've yet to identify a repeatable trigger.

What I can tell you is that we're running Jupyter notebooks in a K8s cluster, using KubeSpawner. The effect is that we have a pod that won't terminate, so the user cannot start a new instance for a later class (or homework).

At a process-level, a happy container looks something like:

 43338 106036 106036  43338 ?            -1 Sl       0   0:00  \_ containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/bed33e403b4bf0bdd536698df1721b5ee8a7b2bce4f0c934f168a1264cc1530f -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runti
106036 106054 106054 106054 ?            -1 Ss    1000   0:00  |   \_ tini -g -- /usr/local/bin/docker-entrypoint.sh jupyter-notebook --ip=0.0.0.0 --port=8888 --NotebookApp.base_url=/user/user_a/ --NotebookApp.token=Xgfcf0fDQfGfM3utR2VFrw --NotebookApp.default_url=/tree --NotebookApp.trust_xheaders=True --NotebookApp.disable_check_xsrf=T
106054 106074 106074 106054 ?            -1 Sl    1000   0:05  |       \_ /opt/conda/bin/python /opt/conda/bin/jupyter-notebook --ip=0.0.0.0 --port=8888 --NotebookApp.base_url=/user/1_s1985869/ --NotebookApp.token=Xgfcf0fDQfGfM3utR2VFrw --NotebookApp.default_url=/tree --NotebookApp.trust_xheaders=True --NotebookApp.disable_check_xsrf=True
106074 106188 106188 106188 ?            -1 Ssl   1000   0:01  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-baa4d67d-2ab9-4321-8265-0cb79e911de2.json
106074   8775   8775   8775 ?            -1 Ssl   1000   0:00  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-c363be05-417c-4a74-a2c6-e2e1d6d2b87e.json

However, my broken container looks like:

 98917  90147  90147  98917 ?            -1 Sl       0   0:00  \_ containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/81cb907d22fe090012823091ba27aee2a0fe8b1339b8170db6db89501a038520 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
 90147  90164  90164  90164 ?            -1 Ss    1000   0:00  |   \_ [tini]
 90164  90182  90182  90164 ?            -1 Rl    1000  25:21  |       \_ /opt/conda/bin/python /opt/conda/bin/jupyter-notebook --ip=0.0.0.0 --port=8888 --NotebookApp.base_url=/user/user_b/ --NotebookApp.token=2cVgcb-8QOGFjpDN8VqTCA --NotebookApp.default_url=/tree --NotebookApp.trust_xheaders=True --NotebookApp.disable_check_xsrf=True
 90182  90818  90818  90818 ?            -1 Rsl   1000  42:58  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-e13ed0b6-bd44-43d2-873b-962ff2f8b297.json
 90182  90920  90920  90920 ?            -1 Rsl   1000  44:41  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-a83a7ee8-ab57-4a74-92a7-c183b31cd5a5.json

The [tini] is clearly the zombie process here.

If I kill the containerd-shim process (sudo kill -9 <pid>), the tini process gets inherited by init (pid 1, as you would expect).
If I try to kill the zombied tini or any of the child processes, nothing happens: they don't go away.

The only solution we've found, so far, is to actually reboot the machine (which needs a cordon & wait process because, well... users)

Do you have any suggestions where we can look?
(we're running

  • Centos7: Linux my_host 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • docker: Docker version 19.03.3, build a872fc2f86
  • Jupyter notebooks based on the May 2019 images : jupyter/r-notebook:abdb27a6dfbb

if that helps)
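
One thing that may help narrow this down: in the broken listing above the STAT column for [tini] reads Ss rather than Z, and ps prints a name in brackets whenever it cannot read the process's command line, so it may be worth confirming the actual process states. A rough sketch, assuming a Linux host with /proc mounted (run on the node, or inside the container's PID namespace); list_zombies is a hypothetical helper, not part of tini or ps:

```shell
#!/bin/sh
# Assumption: Linux with /proc mounted. list_zombies is a hypothetical helper.

# Print "pid ppid" for every process currently in state Z (zombie).
list_zombies() {
    for stat in /proc/[0-9]*/stat; do
        # Processes can vanish mid-scan, so skip entries that fail to read.
        # After the last ")" the fields are: state, ppid, ...
        fields=$(sed 's/^.*) //' "$stat" 2>/dev/null) || continue
        set -- $fields
        if [ "$1" = "Z" ]; then
            pid=${stat#/proc/}
            echo "${pid%/stat} $2"
        fi
    done
}

list_zombies
```

Each output line pairs a zombie's PID with its parent's PID; the parent is the process that would need to call wait for the zombie to go away.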

@krallin
Owner

krallin commented Oct 18, 2019

@perllaghu To help with debugging, can you please:

  • Share the output of running the same ps command within your container's namespace
  • Run tini with -vvv and provide its output

Thanks!

@perllaghu

perllaghu commented Oct 18, 2019

  • Share the output of running the same ps command within your container's namespace

This, I can do the next time I have a stalled image

  • Run tini with -vvv and provide its output

Let me talk to my DevOps people - they may balk at the extra logging (hope not)

@krallin
Owner

krallin commented Oct 18, 2019 via email

@perllaghu

@krallin - here you go....
(I reckon I've got a <0.1% error-rate - which is good)

Installing /srv/src/nbgrader/nbgrader/nbextensions/create_assignment -> create_assignment
Making directory: /opt/conda/share/jupyter/nbextensions/create_assignment/
Copying: /srv/src/nbgrader/nbgrader/nbextensions/create_assignment/create_assignment.css -> /opt/conda/share/jupyter/nbextensions/create_assignment/create_assignment.css
Copying: /srv/src/nbgrader/nbgrader/nbextensions/create_assignment/main.js -> /opt/conda/share/jupyter/nbextensions/create_assignment/main.js
- Validating: OK
Installing /srv/src/nbgrader/nbgrader/nbextensions/formgrader -> formgrader
Making directory: /opt/conda/share/jupyter/nbextensions/formgrader/
Copying: /srv/src/nbgrader/nbgrader/nbextensions/formgrader/main.js -> /opt/conda/share/jupyter/nbextensions/formgrader/main.js
- Validating: OK
Installing /srv/src/nbgrader/nbgrader/nbextensions/validate_assignment -> validate_assignment
Making directory: /opt/conda/share/jupyter/nbextensions/validate_assignment/
Copying: /srv/src/nbgrader/nbgrader/nbextensions/validate_assignment/main.js -> /opt/conda/share/jupyter/nbextensions/validate_assignment/main.js
- Validating: OK
Installing /srv/src/nbgrader/nbgrader/nbextensions/assignment_list -> assignment_list
Making directory: /opt/conda/share/jupyter/nbextensions/assignment_list/
Copying: /srv/src/nbgrader/nbgrader/nbextensions/assignment_list/assignment_list.css -> /opt/conda/share/jupyter/nbextensions/assignment_list/assignment_list.css
Copying: /srv/src/nbgrader/nbgrader/nbextensions/assignment_list/assignment_list.js -> /opt/conda/share/jupyter/nbextensions/assignment_list/assignment_list.js
Copying: /srv/src/nbgrader/nbgrader/nbextensions/assignment_list/main.js -> /opt/conda/share/jupyter/nbextensions/assignment_list/main.js
- Validating: OK
    To initialize this nbextension in the browser every time the notebook (or other app) loads:
    
          jupyter nbextension enable nbgrader --py --sys-prefix
    
Enabling tree extension validate_assignment/main...
      - Validating: OK
Enabling: nbgrader.server_extensions.validate_assignment
- Writing config: /opt/conda/etc/jupyter
    - Validating...
      nbgrader.server_extensions.validate_assignment  OK
Enabling tree extension assignment_list/main...
      - Validating: OK
Enabling: nbgrader.server_extensions.assignment_list
- Writing config: /opt/conda/etc/jupyter
    - Validating...
      nbgrader.server_extensions.assignment_list  OK
Disabling notebook extension create_assignment/main...
      - Validating: OK
Disabling tree extension formgrader/main...
      - Validating: OK
Disabling: nbgrader.server_extensions.formgrader
- Writing config: /opt/conda/etc/jupyter
[I 18:24:06.131 NotebookApp] JupyterLab extension loaded from /opt/conda/lib/python3.7/site-packages/jupyterlab
[I 18:24:06.131 NotebookApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 18:24:06.600 NotebookApp] Loading the assignment_list nbgrader serverextension
[I 18:24:06.603 NotebookApp] Loading the validate_assignment nbgrader serverextension
[I 18:24:06.603 NotebookApp] Serving notebooks from local directory: /home/jovyan
[I 18:24:06.603 NotebookApp] The Jupyter Notebook is running at:
[I 18:24:06.603 NotebookApp] http://jupyter-1-5fmyUser:8888/user/1_myUser/?token=...
[I 18:24:06.603 NotebookApp]  or http://127.0.0.1:8888/user/1_myUser/?token=...
[I 18:24:06.603 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 18:24:08.187 NotebookApp] 302 GET /user/1_myUser/ (10.42.22.180) 0.78ms
[I 18:24:10.616 NotebookApp] 302 GET /user/1_myUser?token=ZUVG4JvJRm6SSu-duIV_rw (10.42.3.0) 0.68ms
[E 18:24:10.701 NotebookApp] Could not open static file ''
[W 18:24:10.840 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 7.71ms referer=https://noteable.edina.ac.uk/user/1_myUser/tree?token=ZUVG4JvJRm6SSu-duIV_rw
[W 18:24:10.910 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.34ms referer=https://noteable.edina.ac.uk/user/1_myUser/tree?token=ZUVG4JvJRm6SSu-duIV_rw
[I 18:24:17.548 NotebookApp] Creating new notebook in 
[W 18:24:18.172 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.32ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Untitled3.ipynb?kernel_name=python3
[W 18:24:18.272 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 3.13ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Untitled3.ipynb?kernel_name=python3
[I 18:24:20.459 NotebookApp] Kernel started: ed8b3a2b-5f15-4007-b988-da23f2f8b75e
[INFO  tini (1)] Spawned child process '/usr/local/bin/docker-entrypoint.sh' with pid '6'
[TRACE tini (1)] No child to reap
<< repeat 95 times >>
[TRACE tini (1)] No child to reap
[I 18:26:20.865 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 399 times >>
[TRACE tini (1)] No child to reap
[I 18:33:57.878 NotebookApp] 302 GET /user/1_myUser/ (10.42.22.180) 0.59ms
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 18:34:27.092 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 18:36:21.069 NotebookApp] Saving file at /Untitled3.ipynb
[W 18:36:46.003 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.17ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%204(1)(1).ipynb
[W 18:36:46.125 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.23ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%204(1)(1).ipynb
[I 18:36:50.409 NotebookApp] Kernel started: 68b808cf-8adb-46f1-b36d-86c272e8687e
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 18:38:51.010 NotebookApp] Saving file at /Workshop 4(1)(1).ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 18:40:20.914 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 18:42:20.507 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 18:44:20.456 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 18:46:20.836 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 18:48:20.561 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 18:50:20.598 NotebookApp] Saving file at /Untitled3.ipynb
[W 18:51:00.156 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.11ms referer=https://noteable.edina.ac.uk/user/1_myUser/edit/mytextfile.txt
[W 18:51:00.364 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.59ms referer=https://noteable.edina.ac.uk/user/1_myUser/edit/mytextfile.txt
[W 18:51:41.809 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.18ms referer=https://noteable.edina.ac.uk/user/1_myUser/edit/mytextfile.txt
[W 18:51:41.939 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.33ms referer=https://noteable.edina.ac.uk/user/1_myUser/edit/mytextfile.txt
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 18:52:21.155 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 18:54:20.942 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 479 times >>
[TRACE tini (1)] No child to reap
[W 19:03:27.429 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.13ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%203.ipynb
[W 19:03:27.679 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.23ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%203.ipynb
[I 19:03:33.161 NotebookApp] Kernel started: 1a484c34-2f9d-4fe7-b675-26a16fd1c9e3
[TRACE tini (1)] No child to reap
<< repeat 138 times >>
[TRACE tini (1)] No child to reap
[I 19:07:36.625 NotebookApp] Saving file at /Workshop 3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 238 times >>
[TRACE tini (1)] No child to reap
[I 19:10:21.600 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 238 times >>
[TRACE tini (1)] No child to reap
[I 19:14:20.944 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 19:17:33.508 NotebookApp] Saving file at /Workshop 3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 981 times >>
[TRACE tini (1)] No child to reap
[W 19:34:41.629 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.61ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%201.ipynb
[W 19:34:41.817 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.36ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%201.ipynb
[I 19:34:44.119 NotebookApp] Kernel started: edbc753d-9258-4c04-8e27-510961cbf39d
[I 19:34:50.671 NotebookApp] Starting buffering for edbc753d-9258-4c04-8e27-510961cbf39d:af1e0ba7d4194840bb4923ca1d1fb89f
[I 19:34:50.884 NotebookApp] Saving file at /Workshop 4(1)(1).ipynb
[W 19:34:54.236 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.10ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%202.ipynb
[W 19:34:54.328 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.28ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%202.ipynb
[I 19:35:01.992 NotebookApp] Kernel started: 7603eb6d-75fb-4326-ac4f-14fe08961914
[I 19:35:22.068 NotebookApp] Starting buffering for 7603eb6d-75fb-4326-ac4f-14fe08961914:c52fce4c124b466faefa5aafd3ce2a61
[W 19:35:24.764 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.32ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%203.ipynb
[W 19:35:24.842 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.08ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%203.ipynb
[I 19:36:03.698 NotebookApp] Starting buffering for 68b808cf-8adb-46f1-b36d-86c272e8687e:46b36756fef649548f77d02fdb68de9c
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 19:36:20.629 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 19:38:20.942 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 19:40:20.496 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 19:42:20.497 NotebookApp] Saving file at /Untitled3.ipynb
[W 19:42:27.985 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.71ms referer=https://noteable.edina.ac.uk/user/1_myUser/edit/data.txt
[W 19:42:41.039 NotebookApp] delete /data.txt
[W 19:42:50.968 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.04ms referer=https://noteable.edina.ac.uk/user/1_myUser/edit/data.txt
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 19:44:20.934 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 579 times >>
[TRACE tini (1)] No child to reap
[I 19:54:20.507 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 358 times >>
[TRACE tini (1)] No child to reap
[I 20:00:20.498 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 238 times >>
[TRACE tini (1)] No child to reap
[W 20:03:49.985 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.62ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%204(1)(1).ipynb
[W 20:03:50.111 NotebookApp] 404 GET /user/1_myUser/static/components/react/react-dom.production.min.js (10.42.3.0) 2.36ms referer=https://noteable.edina.ac.uk/user/1_myUser/notebooks/Workshop%204(1)(1).ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 20:05:53.500 NotebookApp] Saving file at /Workshop 4(1)(1).ipynb
[I 20:06:20.498 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 20:08:20.543 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 20:10:20.558 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 20:12:20.854 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 358 times >>
[TRACE tini (1)] No child to reap
[I 20:18:20.915 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 20:20:20.938 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
[TRACE tini (1)] No child to reap
<< repeat 238 times >>
[TRACE tini (1)] No child to reap
[I 20:24:07.481 NotebookApp] Starting buffering for 1a484c34-2f9d-4fe7-b675-26a16fd1c9e3:d68fe8348db5471cb677b7e4204a3649
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 20:26:20.500 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 20:28:20.584 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 20:30:20.556 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 20:32:20.872 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 20:34:20.598 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 20:36:20.901 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 118 times >>
[TRACE tini (1)] No child to reap
[I 20:38:21.012 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 238 times >>
[TRACE tini (1)] No child to reap
[I 20:42:20.626 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 117 times >>
[TRACE tini (1)] No child to reap
[I 20:44:20.956 NotebookApp] Saving file at /Untitled3.ipynb
[TRACE tini (1)] No child to reap
<< repeat 400 times >>
[TRACE tini (1)] No child to reap

(this is basically a jupyter/r-notebook:1386e2046833 docker notebook, with a bunch of python libraries installed)

@krallin
Owner

krallin commented Oct 22, 2019 via email

@perllaghu

Not now. I'll get you one the next time we have a stalled image
(given this is <0.1%, hopefully it'll be a couple of days)

@perllaghu

I acquired 2 over the weekend... both have essentially the same docker log file.

You asked for the output of the ps command. I've run ps axfj on both VMs, and they have the following sections for the relevant docker images (I've obfuscated the userID):

  PPID    PID   PGID    SID TTY       TPGID STAT   UID   TIME COMMAND
  8154  82824  82824   8154 ?            -1 Sl       0   0:10  \_ containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/0552936e0c33b1a7efca73b168cdb0c09e9f65413a2c356d682d10e256015446 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
 82824  82841  82841  82841 ?            -1 Rs    1000 21115459:07  |   \_ tini -g -vvv -- /usr/local/bin/docker-entrypoint.sh jupyter-notebook --ip=0.0.0.0 --port=8888 --NotebookApp.base_url=/user/1_myUser/ --NotebookApp.token=hVC6ubvaRCG3p8rix0s_Xw --NotebookApp.default_url=/tree --NotebookApp.trust_xheaders=True --NotebookApp.disable_check_xsrf=True
 82841  82859  82859  82841 ?            -1 Rl    1000 884:47  |       \_ /opt/conda/bin/python /opt/conda/bin/jupyter-notebook --ip=0.0.0.0 --port=8888 --NotebookApp.base_url=/user/1_myUser/ --NotebookApp.token=hVC6ubvaRCG3p8rix0s_Xw --NotebookApp.default_url=/tree --NotebookApp.trust_xheaders=True --NotebookApp.disable_check_xsrf=True
 82859  83914  83914  83914 ?            -1 Ssl   1000   0:06  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-0083e685-e062-45e0-bb82-f0dd5e043289.json
 82859 120385 120385 120385 ?            -1 Ssl   1000   0:08  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-5c785b9d-31fd-4159-ac2c-d955e59a3e38.json
 82859 129427 129427 129427 ?            -1 Rsl   1000 888:38  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-1f9db86d-f3d1-42ec-b9aa-51edf51b90e7.json
 82859  14774  14774  14774 ?            -1 Ssl   1000   0:06  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-73eb7d8c-ded5-43f0-8495-89ec6e1bb153.json

and

  PPID    PID   PGID    SID TTY       TPGID STAT   UID   TIME COMMAND
 39799 120077 120077  39799 ?            -1 Sl       0   0:10  \_ containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/ee1e99006421fc886da45e2ed015c30bb4e8e338715e3bc3f6a7d80875916b39 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
120077 120097 120097 120097 ?            -1 Rs    1000 21115368:56  |   \_ tini -g -vvv -- /usr/local/bin/docker-entrypoint.sh jupyter-notebook --ip=0.0.0.0 --port=8888 --NotebookApp.base_url=/user/1_myUser/ --NotebookApp.token=MIwv2i_LRcO-P1XZd_vTBA --NotebookApp.default_url=/tree --NotebookApp.trust_xheaders=True --NotebookApp.disable_check_xsrf=True
120097 120117 120117 120097 ?            -1 Rl    1000 791:42  |       \_ /opt/conda/bin/python /opt/conda/bin/jupyter-notebook --ip=0.0.0.0 --port=8888 --NotebookApp.base_url=/user/1_myUser/ --NotebookApp.token=MIwv2i_LRcO-P1XZd_vTBA --NotebookApp.default_url=/tree --NotebookApp.trust_xheaders=True --NotebookApp.disable_check_xsrf=True
120117 120285 120285 120285 ?            -1 Ssl   1000   0:07  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-05424536-930c-4ebe-924e-e5ed28e6c754.json
120117 120846 120846 120846 ?            -1 Rsl   1000 798:28  |           \_ /opt/conda/bin/python -m ipykernel_launcher -f /home/jovyan/.local/share/jupyter/runtime/kernel-2812103a-cf0d-409d-9dc0-172131c5167c.json
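
For anyone who wants to spot this pattern across many hosts, here is a minimal sketch in Python. It assumes the input is `ps` output in the same column layout as the listings above (PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND) and that TIME is in the `minutes:seconds` form shown there. The function names and the threshold are hypothetical, chosen only for illustration.

```python
def cpu_time_seconds(time_field):
    """Convert a ps TIME field of the form 'MMM:SS' to seconds."""
    minutes, seconds = time_field.split(":")
    return int(minutes) * 60 + int(seconds)


def runaway_processes(ps_lines, threshold_seconds):
    """Return (pid, stat, cpu_seconds, command) for rows above the threshold.

    Assumes ten whitespace-separated columns in this order:
    PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND.
    """
    hits = []
    for line in ps_lines:
        # Split into at most 10 fields so COMMAND keeps its spaces.
        fields = line.split(None, 9)
        # Skip the header row and anything too short to be a process row.
        if len(fields) < 10 or not fields[1].isdigit():
            continue
        pid = int(fields[1])
        stat = fields[6]
        cpu = cpu_time_seconds(fields[8])
        if cpu > threshold_seconds:
            hits.append((pid, stat, cpu, fields[9]))
    return hits
```

Fed the second listing with a threshold of, say, 1,000,000 seconds, this would flag only the `tini` row (PID 120097), since its accumulated TIME is far larger than that of any of its children.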

Lastly, because this may be relevant, here's the pod information from the Rancher API for one of the images:

{
  "annotations": {
    "cni.projectcalico.org/podIP": "10.42.12.144/32",
    "container": "standard_notebook",
    "course_id": "BUST080392019-0SV1SEM1",
    "hub.jupyter.org/username": "1_myUser",
    "org": "the_university_of_edinburgh",
    "org_id": "1",
    "owner": "the_university_of_edinburgh",
    "owner_id": "2",
    "type": "naas",
    "user": "myUser",
    "username": "1_myUser"
  },
  "automountServiceAccountToken": false,
  "baseType": "pod",
  "containers": [
    {
      "command": [
        "jupyter-notebook",
        "--ip=0.0.0.0",
        "--port=8888",
        "--NotebookApp.base_url=/user/1_myUser/",
        "--NotebookApp.token=MIwv2i_LRcO-P1XZd_vTBA",
        "--NotebookApp.default_url=/tree",
        "--NotebookApp.trust_xheaders=True",
        "--NotebookApp.disable_check_xsrf=True"
      ],
      "environment": {
        "CPU_GUARANTEE": "0.05",
        "CPU_LIMIT": "1.0",
        "JPY_API_TOKEN": "14cf56c3e54544bc87b4ae1bc8b98bc2",
        "JUPYTERHUB_ACTIVITY_URL": "http://10.43.217.242:8081/hub/api/users/1_myUser/activity",
        "JUPYTERHUB_ADMIN_ACCESS": "1",
        "JUPYTERHUB_API_TOKEN": "14cf56c3e54544bc87b4ae1bc8b98bc2",
        "JUPYTERHUB_API_URL": "http://10.43.217.242:8081/hub/api",
        "JUPYTERHUB_BASE_URL": "/",
        "JUPYTERHUB_CLIENT_ID": "jupyterhub-user-1_myUser",
        "JUPYTERHUB_HOST": "",
        "JUPYTERHUB_OAUTH_CALLBACK_URL": "/user/1_myUser/oauth_callback",
        "JUPYTERHUB_SERVER_NAME": "",
        "JUPYTERHUB_SERVICE_PREFIX": "/user/1_myUser/",
        "JUPYTERHUB_USER": "1_myUser",
        "JUPYTER_IMAGE": "registry.gitlab.edina.ac.uk:1875/naas/docker_notebooks/standard-notebook:9428-master",
        "JUPYTER_IMAGE_SPEC": "registry.gitlab.edina.ac.uk:1875/naas/docker_notebooks/standard-notebook:9428-master",
        "MEM_GUARANTEE": "1073741824",
        "MEM_LIMIT": "4294967296",
        "NAAS_BASE_URL": "https://noteable.edina.ac.uk",
        "NAAS_COURSE_ID": "BUST080392019-0SV1SEM1",
        "NAAS_COURSE_TITLE": "Fundamentals of Programming for Business Applications (2019-2020)[SEM1]",
        "NAAS_JWT": "xxx",
        "NAAS_ROLE": "Learner"
      },
      "exitCode": null,
      "image": "registry.gitlab.edina.ac.uk:1875/naas/docker_notebooks/standard-notebook:9428-master",
      "imagePullPolicy": "IfNotPresent",
      "initContainer": false,
      "name": "notebook",
      "ports": [
        {
          "containerPort": 8888,
          "name": "notebook-port",
          "protocol": "TCP",
          "sourcePort": 0,
          "type": "/v3/project/schemas/containerPort"
        }
      ],
      "resources": {
        "limits": {
          "cpu": "1",
          "memory": "4294967296"
        },
        "requests": {
          "cpu": "50m",
          "memory": "1073741824"
        },
        "type": "/v3/project/schemas/resourceRequirements"
      },
      "restartCount": 0,
      "state": "running",
      "stdin": false,
      "stdinOnce": false,
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "transitioning": "no",
      "transitioningMessage": "",
      "tty": false,
      "type": "/v3/project/schemas/container",
      "volumeMounts": [
        {
          "mountPath": "/home/jovyan",
          "name": "b95df8aa-c65a-4e7f-a241-5187976bc2da",
          "readOnly": false,
          "type": "/v3/project/schemas/volumeMount"
        }
      ]
    },
    {
      "capAdd": [
        "NET_ADMIN"
      ],
      "entrypoint": [
        "iptables",
        "-A",
        "OUTPUT",
        "-d",
        "169.254.169.254",
        "-j",
        "DROP"
      ],
      "exitCode": 0,
      "image": "jupyterhub/k8s-network-tools:0.8.2",
      "imagePullPolicy": "IfNotPresent",
      "initContainer": true,
      "name": "block-cloud-metadata",
      "privileged": true,
      "resources": {
        "type": "/v3/project/schemas/resourceRequirements"
      },
      "restartCount": 0,
      "state": "terminated",
      "stdin": false,
      "stdinOnce": false,
      "terminationMessagePath": "/dev/termination-log",
      "terminationMessagePolicy": "File",
      "transitioning": "no",
      "transitioningMessage": "Completed",
      "tty": false,
      "type": "/v3/project/schemas/container",
      "uid": 0
    }
  ],
  "created": "2019-10-26T12:47:47Z",
  "createdTS": 1572094067000,
  "creatorId": null,
  "dnsPolicy": "ClusterFirst",
  "fsgid": 100,
  "hostIPC": false,
  "hostNetwork": false,
  "hostPID": false,
  "id": "default:jupyter-1-5fmyUser",
  "imagePullSecrets": [
    {
      "name": "gitlab-edina",
      "type": "/v3/project/schemas/localObjectReference"
    }
  ],
  "labels": {
    "component": "singleuser-server",
    "org": "the_university_of_edinburgh",
    "org_id": "1",
    "owner": "the_university_of_edinburgh",
    "owner_id": "2",
    "type": "naas",
    "username": "1_myUser"
  },
  "links": {
    "remove": "…/v3/project/c-mf7x9:project-l8pzf/pods/default:jupyter-1-5fmyUser",
    "self": "…/v3/project/c-mf7x9:project-l8pzf/pods/default:jupyter-1-5fmyUser",
    "update": "…/v3/project/c-mf7x9:project-l8pzf/pods/default:jupyter-1-5fmyUser",
    "yaml": "…/v3/project/c-mf7x9:project-l8pzf/pods/default:jupyter-1-5fmyUser/yaml"
  },
  "name": "jupyter-1-5fmyUser",
  "namespaceId": "default",
  "nodeId": "c-mf7x9:m-2326d2f782c3",
  "priority": 0,
  "projectId": "c-mf7x9:project-l8pzf",
  "removed": "2019-10-26T16:41:16Z",
  "removedTS": 1572108076000,
  "restartPolicy": "OnFailure",
  "runAsGroup": 0,
  "schedulerName": "default-scheduler",
  "scheduling": {
    "node": {
      "nodeId": null,
      "preferred": [
        "hub.jupyter.org/node-purpose = user"
      ]
    },
    "tolerate": [
      {
        "effect": "NoSchedule",
        "key": "hub.jupyter.org/dedicated",
        "operator": "Equal",
        "type": "/v3/project/schemas/toleration",
        "value": "user"
      },
      {
        "effect": "NoSchedule",
        "key": "hub.jupyter.org_dedicated",
        "operator": "Equal",
        "type": "/v3/project/schemas/toleration",
        "value": "user"
      },
      {
        "effect": "NoExecute",
        "key": "node.kubernetes.io/not-ready",
        "operator": "Exists",
        "tolerationSeconds": 300,
        "type": "/v3/project/schemas/toleration"
      },
      {
        "effect": "NoExecute",
        "key": "node.kubernetes.io/unreachable",
        "operator": "Exists",
        "tolerationSeconds": 300,
        "type": "/v3/project/schemas/toleration"
      }
    ]
  },
  "serviceAccountName": "default",
  "state": "removing",
  "status": {
    "conditions": [
      {
        "lastProbeTime": null,
        "lastTransitionTime": "2019-10-26T12:47:49Z",
        "lastTransitionTimeTS": 1572094069000,
        "status": "True",
        "type": "Initialized"
      },
      {
        "lastProbeTime": null,
        "lastTransitionTime": "2019-10-26T12:47:50Z",
        "lastTransitionTimeTS": 1572094070000,
        "status": "True",
        "type": "Ready"
      },
      {
        "lastProbeTime": null,
        "lastTransitionTime": "2019-10-26T12:47:50Z",
        "lastTransitionTimeTS": 1572094070000,
        "status": "True",
        "type": "ContainersReady"
      },
      {
        "lastProbeTime": null,
        "lastTransitionTime": "2019-10-26T12:47:47Z",
        "lastTransitionTimeTS": 1572094067000,
        "status": "True",
        "type": "PodScheduled"
      }
    ],
    "containerStatuses": [
      {
        "containerID": "docker://ee1e99006421fc886da45e2ed015c30bb4e8e338715e3bc3f6a7d80875916b39",
        "image": "registry.gitlab.edina.ac.uk:1875/naas/docker_notebooks/standard-notebook:9428-master",
        "imageID": "docker-pullable://registry.gitlab.edina.ac.uk:1875/naas/docker_notebooks/standard-notebook@sha256:ce78c583e2424ec501ef2ea24f6f1449affef6857cd09e08967de4cd1dd63319",
        "lastState": {
          "type": "/v3/project/schemas/containerState"
        },
        "name": "notebook",
        "ready": true,
        "restartCount": 0,
        "state": {
          "running": {
            "startedAt": "2019-10-26T12:47:49Z",
            "startedAtTS": 1572094069000,
            "type": "/v3/project/schemas/containerStateRunning"
          },
          "type": "/v3/project/schemas/containerState"
        },
        "type": "/v3/project/schemas/containerStatus"
      }
    ],
    "initContainerStatuses": [
      {
        "containerID": "docker://8fa9679c1f1f57825a240cd1928e7e7c7955b5f84ac49b296477cd3295d4b3cb",
        "image": "jupyterhub/k8s-network-tools:0.8.2",
        "imageID": "docker-pullable://jupyterhub/k8s-network-tools@sha256:5d553df705ec62f6daa18e8baa8097c5b023922dbd22070b6e5b57c1a5e705ce",
        "lastState": {
          "type": "/v3/project/schemas/containerState"
        },
        "name": "block-cloud-metadata",
        "ready": true,
        "restartCount": 0,
        "state": {
          "terminated": {
            "containerID": "docker://8fa9679c1f1f57825a240cd1928e7e7c7955b5f84ac49b296477cd3295d4b3cb",
            "exitCode": 0,
            "finishedAt": "2019-10-26T12:47:48Z",
            "finishedAtTS": 1572094068000,
            "reason": "Completed",
            "signal": 0,
            "startedAt": "2019-10-26T12:47:48Z",
            "startedAtTS": 1572094068000,
            "type": "/v3/project/schemas/containerStateTerminated"
          },
          "type": "/v3/project/schemas/containerState"
        },
        "type": "/v3/project/schemas/containerStatus"
      }
    ],
    "nodeIp": "172.16.21.145",
    "phase": "Running",
    "podIp": "10.42.12.144",
    "qosClass": "Burstable",
    "startTime": "2019-10-26T12:47:47Z",
    "startTimeTS": 1572094067000,
    "type": "/v3/project/schemas/podStatus"
  },
  "terminationGracePeriodSeconds": 30,
  "transitioning": "yes",
  "transitioningMessage": "",
  "type": "pod",
  "uid": 1000,
  "uuid": "d01670d3-f7ee-11e9-b002-00505681593b",
  "volumes": [
    {
      "hostPath": {
        "kind": "",
        "path": "/path/to/home/dir"
      },
      "name": "b95df8aa-c65a-4e7f-a241-5187976bc2da",
      "type": "/v3/project/schemas/volume"
    }
  ],
  "workloadId": null
}

@perllaghu

We updated our platform to more recent docker_stack base images (Ubuntu Bionic, Jupyterhub 1.0.9, and Jupyter notebook 6.0.0) in early November, and all has been fine since.

I can only assume there was an issue between those two, which was resolved when we upgraded those components.

From 3 or 4 a week to none in 3 weeks - this is a definite win!!

I'm happy to close this issue.
