JupyterLab 4.0.0a31 collaborative notebook sometimes does not respect nbformat_minor in file #13606

Open
Wh1isper opened this issue Dec 16, 2022 · 11 comments

Comments

@Wh1isper
Contributor

Description

I think this is caused by a problem in the NotebookModel constructor: the value it initializes overwrites the original value from the file.

Here we initialize the value:
https://github.com/jupyterlab/jupyterlab/blob/master/packages/notebook/src/model.ts#L115
and the value is 4:
https://github.com/jupyterlab/jupyterlab/blob/master/packages/nbformat/src/index.ts#L22

As far as I know, the CRDT algorithm determines the order of operations based on the timestamp. It is possible that the time of this initialization operation is after the server initializes the yroom, so that the algorithm will consider it an active modification:
https://github.com/jupyter-server/jupyter_server_ydoc/blob/v0.4.0/jupyter_server_ydoc/handlers.py#L211

So, when nbformat_minor in the file is 5, there is a high probability that the nbformat_minor=4 set in the constructor will override it, accidentally downgrading the file.

I lack front-end knowledge, but I verified with breakpoints that jupyter_server_ydoc does not tamper with nbformat_minor. The modified value arrives through self.room.document.source, that is, from the web client.
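
The suspected ordering problem can be illustrated with a toy model. The sketch below is not the Yjs algorithm and not JupyterLab code: `LwwMap` and `open_notebook` are hypothetical names, and the stamps are plain integers standing in for whatever ordering the CRDT uses. It only shows that, under a last-writer-wins rule, the constructor default wins whenever it is ordered after the value loaded from the file.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class LwwMap:
    """Toy last-writer-wins map: each key keeps the value with the highest stamp."""

    _entries: Dict[str, Tuple[int, Any]] = field(default_factory=dict)

    def set(self, key: str, value: Any, stamp: int) -> None:
        current = self._entries.get(key)
        # Accept the incoming write only if it is newer than the stored one.
        if current is None or stamp > current[0]:
            self._entries[key] = (stamp, value)

    def get(self, key: str) -> Any:
        return self._entries[key][1]


def open_notebook(default_stamp: int, load_stamp: int) -> int:
    """Return the nbformat_minor that survives for the given write order."""
    doc = LwwMap()
    doc.set("nbformat_minor", 5, load_stamp)     # value loaded from the file
    doc.set("nbformat_minor", 4, default_stamp)  # default set by the constructor
    return doc.get("nbformat_minor")


# Intended order: the constructor default is older than the load from disk.
intended = open_notebook(default_stamp=1, load_stamp=2)
# Suspected race: the constructor default is ordered after the load.
raced = open_notebook(default_stamp=2, load_stamp=1)
print(intended, raced)  # prints "5 4"
```

In the first call the file's value survives; in the second the default of 4 replaces it, which matches the downgrade described here.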

Reproduce

Try opening this file in collaborative mode:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c8d2523-c9e8-4102-8949-8ec23e2be509",
   "metadata": {},
   "outputs": [],
   "source": [
    "print('hello world')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
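
To check whether the file was downgraded after a collaborative session, the version fields can be read back from the saved notebook. This is a minimal sketch using only the standard library; `nbformat_version` and `was_downgraded` are hypothetical helper names, not part of any Jupyter package.

```python
import json
from typing import Tuple


def nbformat_version(notebook_json: str) -> Tuple[int, int]:
    """Return (nbformat, nbformat_minor) from the raw text of a notebook file."""
    nb = json.loads(notebook_json)
    return nb["nbformat"], nb["nbformat_minor"]


def was_downgraded(before: Tuple[int, int], after: Tuple[int, int]) -> bool:
    """True if the (major, minor) version pair decreased between two reads."""
    return after < before


# Minimal stand-ins for the file contents before and after the session.
original = '{"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}'
saved = '{"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 4}'
print(was_downgraded(nbformat_version(original), nbformat_version(saved)))  # prints "True"
```

In practice the two strings would come from reading the notebook file before opening it and again after JupyterLab has saved it.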

Expected behavior

nbformat_minor should not be downgraded. Cell ids are important and helpful for backend code execution in jupyter_kernel_executor.
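
Cell ids were introduced in nbformat 4.5, so a notebook that declares minor version 4 but still carries ids is in an inconsistent state. The sketch below detects that case on a notebook loaded as a dict. `cells_with_unexpected_ids` is a hypothetical helper, and whether JupyterLab keeps or drops the ids after the downgrade is not stated in this thread.

```python
from typing import Any, Dict, List


def cells_with_unexpected_ids(nb: Dict[str, Any]) -> List[str]:
    """Return the ids of cells that carry an id although the notebook
    declares a format version older than 4.5."""
    version = (nb.get("nbformat", 0), nb.get("nbformat_minor", 0))
    if version >= (4, 5):
        return []  # ids are expected from 4.5 onwards
    return [cell["id"] for cell in nb.get("cells", []) if "id" in cell]


# The notebook from this issue, as it would look after the downgrade
# if the cell id were kept (an assumption for illustration).
downgraded = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "id": "4c8d2523-c9e8-4102-8949-8ec23e2be509",
            "metadata": {},
            "outputs": [],
            "source": ["print('hello world')"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}
print(cells_with_unexpected_ids(downgraded))
```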

@Wh1isper
Contributor Author

Wh1isper commented Dec 16, 2022

The code comment says:

this will be overridden by the initialization coming from the document provider.

But that's not what I've seen: the downgrade does happen.

After trying export const MINOR_VERSION: number = 5; the problem no longer seems to occur, so I'm basically sure the race condition exists.

@Wh1isper
Contributor Author

It looks like it is caused by "clearing the cache and forcing a refresh". If the object is reinitialized, does that cause this situation?

@hbcarlos
Member

Hi @Wh1isper,

As far as I know, the crdt algorithm determines the order of operations based on the timestamp

Indeed that's why we make sure not to override the content on disk from the constructor.

and it is possible that the time of this initialization operation is after the server initializes yroom, so that the algorithm will consider it an active modification

No, the Context's constructor synchronously instantiates the NotebookModel, and then the DocumentProvider (which opens the WebSocket and makes the back-end initialize the content). The timestamp of the initialization of nbformat and other attributes in the Notebook's constructor will always be earlier than the timestamp of the update loading the content from the back-end.

When a second client joins, the document is already in memory in the back-end (the timestamp in the back-end is newer), but before sending it to the new client, the back-end creates a snapshot (an update with the current state of the document) and sends that to the client. Again, since we make a snapshot, the timestamp of the initialization of nbformat in the Notebook's constructor is earlier than that of the update coming from the back-end.

It looks like it is caused by "clearing the cache and forcing a refresh". If the object is reinitialized, does that cause this situation?

Maybe. I'm trying to reproduce it to ensure we don't have the race condition, but I'm unable to reproduce it.

@hbcarlos
Member

@Wh1isper What version of JupyterLab are you using? Are you using the collaborative mode?

@Wh1isper
Contributor Author

Wh1isper commented Dec 19, 2022

@hbcarlos jupyterlab 4.0.0a31 with collaborative mode

@Wh1isper
Contributor Author

Some additional information

pip list

Package Version Editable project location


aiofiles 0.8.0
aiohttp 3.8.3
aiosignal 1.3.1
aiosqlite 0.17.0
anyio 3.6.2
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.3
asttokens 2.1.0
async-lru 1.0.3
async-timeout 4.0.2
attrs 22.1.0
Babel 2.11.0
backcall 0.2.0
beautifulsoup4 4.11.1
bleach 5.0.1
certifi 2022.9.24
cffi 1.15.1
charset-normalizer 2.1.1
contourpy 1.0.6
cycler 0.11.0
debugpy 1.6.3
decorator 5.1.1
defusedxml 0.7.1
entrypoints 0.4
executing 1.2.0
fastjsonschema 2.16.2
fonttools 4.38.0
fqdn 1.5.1
frozenlist 1.3.3
idna 3.4
importlib-metadata 5.0.0
importlib-resources 5.10.0
ipykernel 6.17.1
ipython 8.6.0
ipython-genutils 0.2.0
isoduration 20.11.0
jedi 0.18.1
Jinja2 3.1.2
json5 0.9.10
jsonpointer 2.3
jsonschema 4.17.0
jupyter_client 7.4.7 /mnt/d/JupyterLegacy/jupyter_client
jupyter_core 5.1.0
jupyter-events 0.5.0
jupyter_kernel_client 0.1.2
jupyter_kernel_executor 0.2.3
jupyter-lsp 1.5.1
jupyter_server 2.1.0.dev0 /mnt/d/JupyterLegacy/jupyter_server
jupyter_server_fileid 0.6.0
jupyter_server_terminals 0.4.2
jupyter_server_ydoc 0.4.0
jupyter-ydoc 0.2.2
jupyterlab 4.0.0a31
jupyterlab-pygments 0.2.2
jupyterlab_server 2.16.3
kiwisolver 1.4.4
MarkupSafe 2.1.1
matplotlib 3.6.2
matplotlib-inline 0.1.6
mistune 2.0.4
multidict 6.0.3
nbclassic 0.4.8
nbclient 0.7.0
nbconvert 7.2.4
nbformat 5.7.0
nest-asyncio 1.5.6
notebook 6.5.2
notebook_shim 0.2.2
numpy 1.23.5
packaging 21.3
pandocfilters 1.5.0
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.3.0
pip 22.3.1
pkgutil_resolve_name 1.3.10
platformdirs 2.5.3
prometheus-client 0.15.0
prompt-toolkit 3.0.32
psutil 5.9.4
ptyprocess 0.7.0
pure-eval 0.2.2
pycparser 2.21
Pygments 2.13.0
pyparsing 3.0.9
pyrsistent 0.19.2
python-dateutil 2.8.2
python-json-logger 2.0.4
pytz 2022.6
PyYAML 6.0
pyzmq 24.0.1
requests 2.28.1
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
Send2Trash 1.8.0
setuptools 65.5.1
six 1.16.0
sniffio 1.3.0
soupsieve 2.3.2.post1
stack-data 0.6.0
terminado 0.17.0
tinycss2 1.2.1
tomli 2.0.1
tornado 6.2
tqdm 4.64.1
traitlets 5.6.0
typing_extensions 4.4.0
uri-template 1.2.0
urllib3 1.26.12
watchfiles 0.18.1
wcwidth 0.2.5
webcolors 1.12
webencodings 0.5.1
websocket-client 1.4.2
wheel 0.38.4
y-py 0.5.4
yarl 1.8.2
ypy-websocket 0.5.0
zipp 3.10.0

I am using the following script as the entry point, with Python 3.8 in WSL (Ubuntu 20.04):

import re
import shutil
import sys
import os
from pkg_resources import load_entry_point

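# Start from a clean state: remove the previous server directory,
# the file ID database directory, and the YStore database.
shutil.rmtree('./server_dir', ignore_errors=True)
shutil.rmtree('./id', ignore_errors=True)
if os.path.exists('.jupyter_ystore.db'):
    os.remove('.jupyter_ystore.db')
# Recreate the directories and copy in a fresh notebook for the server to load.
os.makedirs('./server_dir')
os.makedirs('./id')
shutil.copy('Untitled.ipynb', 'server_dir/Untitled.ipynb')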


if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.argv.append('--debug')
    sys.argv.append('--collaborative')
    sys.argv.append('--config=jupyter_server_config.py')
    path_fix = os.environ.get('PATH').split(':')
    sys.path.extend(path_fix)
    sys.exit(
        load_entry_point('jupyterlab', 'console_scripts', 'jupyter-lab')()
    )

@Wh1isper
Contributor Author

Jupyter server config

import os

_here = os.path.dirname(__file__)

c.ServerApp.ip = '0.0.0.0'  # listen on all IPs
c.ServerApp.token = ''  # disable authentication
c.ServerApp.allow_origin = '*'  # allow access from anywhere
c.ServerApp.disable_check_xsrf = True  # allow cross-site requests

c.ServerApp.root_dir = os.path.abspath(os.path.join(_here, './server_dir'))
import jupyter_server_fileid

c.FileIdExtension.file_id_manager_class = jupyter_server_fileid.manager.LocalFileIdManager
c.LocalFileIdManager.db_path = os.path.join(_here, './id/db.sqlit')

@Wh1isper
Contributor Author

@hbcarlos
I made two screen recordings. The first one succeeded: the page rendered promptly once it was opened (I'm referring to the code highlighting).

succeed.mp4

The second screen recording shows a failure: the code highlighting is very slow to appear. Using a breakpoint here, I found that self.room.document.source changed: https://github.com/jupyter-server/jupyter_server_ydoc/blob/v0.4.0/jupyter_server_ydoc/handlers.py#L347

failed.mp4

@hbcarlos
Member

Hi @Wh1isper, thanks for such detail. I can reproduce it now.

I reproduced this issue by shutting down the server and relaunching it without closing the client (the browser tab with the document open). Then, when reloading the client, I can see that the `nbformat_minor` attribute changed.

Are you closing the client every time you shut down the server, or do you just reload the existing client?

@Wh1isper
Contributor Author

@hbcarlos In my case the problem can occur both ways: closing and reopening the tab, or just refreshing the page.

@hbcarlos
Member

hbcarlos commented Jan 3, 2023

@Wh1isper I believe this issue is related to #13550

@krassowski krassowski added this to the 4.0.x milestone May 24, 2023