[BUG] - Mounting an NFS share as the user's home locks up ipykernel #1820
Comments
UPDATE:
Mount info...
So it is likely something about NFS, perhaps some sort of lockfile failure. NFS is known for file lock problems.
UPDATE 2: At the moment of failure the system reports... So is there anything that can be done? I have no control over the source of the storage.
Have you looked at SO?
Yes I did... It is irrelevant. I did track it down to being caused by NFS file locking. If I can determine which file is being locked, I could possibly move it out of the home directory (directory symlink?). I am also investigating getting NFS file locking to work, but I don't have control of the source systems. So really the issue boiled down to...
Thank you Bidek56, for the suggestion about asking on the Jupyter Forum. A user there (big thanks to kevin-bates) was able to point to some information about NFS problems, which resolved the problem: "Notebook fails on NFS mount, lockd not available". For reference, it is caused by the SQLite history in ipykernel.
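For anyone landing here later, one possible workaround is sketched below. It is not necessarily the exact fix applied in this thread. IPython documents a `HistoryManager.hist_file` option that accepts `:memory:`, which keeps the SQLite history out of the NFS-mounted home. The helper name `disable_on_disk_history` and the use of the default profile directory are assumptions made for illustration.

```python
from pathlib import Path

# Lines to place in the kernel config. `get_config` is provided by IPython
# when it loads the file; ":memory:" keeps the history database off disk.
CONFIG_BLOCK = (
    "c = get_config()\n"
    'c.HistoryManager.hist_file = ":memory:"\n'
)


def disable_on_disk_history(home: Path) -> Path:
    """Append the in-memory history setting to the kernel config under `home`.

    Hypothetical helper: assumes the default IPython profile is in use.
    """
    profile_dir = home / ".ipython" / "profile_default"
    profile_dir.mkdir(parents=True, exist_ok=True)
    config_path = profile_dir / "ipython_kernel_config.py"

    existing = config_path.read_text() if config_path.exists() else ""
    if CONFIG_BLOCK not in existing:
        with config_path.open("a") as fh:
            # Keep any existing settings and start on a fresh line.
            if existing and not existing.endswith("\n"):
                fh.write("\n")
            fh.write(CONFIG_BLOCK)
    return config_path
```

Calling `disable_on_disk_history(Path("/home/jovyan"))` once before the server starts should be enough. The trade-off is that kernel history no longer persists between sessions.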
Addendum... Perhaps this should be added to the docker-stacks troubleshooting documentation with regard to NFS-mounted homes?
Feel free to submit a PR, this stack has a documentation and recipe section. Perhaps add an entire recipe based on your findings.
What docker image(s) are you using?
minimal-notebook
OS system and architecture running docker image
RHEL7 x86_64
What Docker command are you running?
It runs in a swarm environment...
and then with a local mount of an NFS mount
The contents of the directory have the correct uid:gid of 1000:100 and are an exact replica of
/home/jovyan
within the image.
How to Reproduce the problem?
The first command above (no mount, just the image itself) works perfectly fine.
I get the token from the logs, log in, and can run ipykernel notebooks.
The second command (with a local bind mount of an NFS mount) also logs in fine, Lab pages are visible, and terminals can be opened. But as soon as you try to run an ipykernel the web site locks up.
The server is still running but no longer responds to any queries, from anywhere.
The log output ends with the line
[I 2022-10-31 16:47:10.952 ServerApp] Creating new notebook in
and goes no further.
If the directory is mounted elsewhere, I can see and access it, and it looks perfectly fine, with the expected UIDs and files.
Mounting it as just the work sub-directory also works fine, even if you start a Jupyter notebook while Lab is in the mounted work sub-directory. As such it likely involves the home 'dot' files.
This system of mounting has been used for many other docker environments and has been in use by users for more than three years with code-server, NPM, Apache, and Laravel software. Only Jupyter Notebook locks up, despite many attempts to get it working, on and off over many years. This is the first time I have traced the problem far enough to show that it only happens with the mounted home, while everything works fine without the mounted home.
Command output
No response
Expected behavior
ipykernel starts and notebook page appears.
Actual behavior
The notebook page does not appear, and a refresh results in a timeout to read data.
Anything else?
The NFS mount that docker does a bind mount to is...
It is used for the homes of many other docker containers for our users (running code-server).
The only thing I can think of is that perhaps there is a lockfile that does not work with the NFS mount docker is bind mounting against.
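One way to test the lockfile suspicion without starting Jupyter is to check whether SQLite can create and write a database inside the mounted directory. The sketch below runs that check in a child process with a timeout, so a stalled lock shows up as `"timeout"` rather than freezing the caller. The function name `probe_sqlite`, the probe file name, and the timeout value are all made up for illustration.

```python
import subprocess
import sys
from pathlib import Path

# Child script: create a table and commit, which makes SQLite take file locks.
_CHILD = """
import sqlite3, sys
conn = sqlite3.connect(sys.argv[1], timeout=2)
conn.execute("CREATE TABLE IF NOT EXISTS probe (x INTEGER)")
conn.execute("INSERT INTO probe VALUES (1)")
conn.commit()
conn.close()
"""


def probe_sqlite(directory, timeout_s=15.0):
    """Return "ok", "error", or "timeout" for a SQLite write in `directory`."""
    db_path = Path(directory) / "sqlite_lock_probe.db"
    try:
        result = subprocess.run(
            [sys.executable, "-c", _CHILD, str(db_path)],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    finally:
        # Best-effort cleanup of the probe file (Python 3.8+ for missing_ok).
        db_path.unlink(missing_ok=True)
    return "ok" if result.returncode == 0 else "error"
```

Running `probe_sqlite("/home/jovyan")` inside the container with and without the NFS bind mount should show whether the two cases differ. A child stuck in an uninterruptible NFS wait may not be killable, so treat this as a rough diagnostic only.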