D4.15: Exploratory support for live notebook collaboration #89
The Jupyter Notebook is a web application that enables the creation and sharing of executable documents containing live code, equations, visualizations and explanatory text. One of the main uses of Jupyter is as an interactive computing environment, building interactive notebook documents (notebooks for short). T4.2 (#70) is focused on improving the process of collaborating on Jupyter notebooks in the various ways researchers and educators use them. D4.6 (see #95) improved collaborating on notebooks via traditional version-control systems.
This deliverable aims to improve the process of "live collaboration" on notebooks, where two authors on different computers are editing the same notebook at the same time, and they are kept in sync over the Internet.
There have been prior efforts to enable live collaboration on notebooks, notably Google Colaboratory and CoCalc (formerly SageMathCloud). These efforts add live collaboration on notebooks to an existing, fully hosted cloud service. In some cases the software can also be downloaded and run as free, open source software, but it is not integrated into existing hosted notebook servers.
Our goal is to learn from these examples and build, in collaboration with the Jupyter community, an official live collaboration implementation into the JupyterLab notebook application, so that it can be available to all Jupyter users.
The target usage scenario for this effort is to enable live collaboration on notebooks whenever two or more users have access to the same hosted notebook server, e.g. hosted with JupyterHub. Example applications include researchers co-authoring an analysis, or an instructor helping a student through examples. This should require no additional infrastructure, and not rely on any commercial hosted service. In particular, it should be usable anywhere that Jupyter is already used, without opting into other systems. This last point will be the primary differentiator for the official Jupyter implementation, as opposed to those currently existing in hosted services. We also aim to keep the additional runtime dependencies for the server component minimal: a Python runtime, already needed to run the Jupyter notebook implementation, and a Node.js runtime, already needed to run certain aspects of the JupyterLab application. No databases or external services will be required.
Existing real-time collaboration implementations have had to heavily modify or entirely re-implement the Jupyter notebook client application in order to achieve real-time collaboration. This can create a maintenance burden, since the modifications must be kept working with updates to the Jupyter software as the two projects diverge. Delivering a working real-time collaboration implementation in JupyterLab itself should allow projects similar to (or possibly including) CoCalc to have real-time collaboration with less effort, even if they choose to keep their own implementation. The basis of both the server and client sides should be reusable both inside and outside JupyterLab, enabling real-time collaboration on documents other than notebooks and outside the Jupyter ecosystem.
While the focus of this implementation is to sync two users with access to the same notebook server, decoupling the real-time server from the notebook server could allow collaboration on documents shared across implementations.
Review done! It is well written, and it made me curious :-) I inserted a few TODOs with questions that readers may ask themselves. I also made minor changes to the introduction for fully consistent terminology across paragraphs. Given that the report explicitly mentions CoCalc, you may want to ping William for feedback as well.
With this we can press the button, and be done. Thanks!
It would make a great blog post, by the way, for all those excited to see live collaboration approaching.