Enable Real Time Collaboration in our Hubs #3027

Open
yuvipanda opened this issue Nov 18, 2021 · 20 comments
Labels
enhancement Issues around improving existing functionality

Comments

@yuvipanda
Contributor

JupyterLab and Retrolab have realtime collaboration features now. They are a bit experimental, but work reasonably well. We should enable them in our hubs.
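
For reference, a minimal sketch of one way this might be switched on, assuming the hubs run JupyterLab 3.x, where real-time collaboration is enabled by starting the server with the `--collaborative` flag. The helper name `singleuser_args` is hypothetical, and this is not necessarily how the datahub deployment does it.

```python
from typing import List


def singleuser_args(enable_rtc: bool) -> List[str]:
    """Return extra command-line arguments for a user's Jupyter server.

    Hypothetical helper for illustration only.
    """
    args: List[str] = []
    if enable_rtc:
        # JupyterLab 3.x enables real-time collaboration with this flag.
        args.append("--collaborative")
    return args


# In a jupyterhub_config.py, the list could be assigned to c.Spawner.args
# (assumption: the hub passes extra arguments to user servers that way).
print(singleuser_args(enable_rtc=True))
```

Keeping the switch in one place like this would make it easy to turn the feature off per hub if problems show up.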

@balajialg balajialg added this to To do in Datahub Project Board via automation Nov 18, 2021
@balajialg balajialg added the enhancement Issues around improving existing functionality label Nov 18, 2021
yuvipanda added a commit to yuvipanda/datahub that referenced this issue Nov 19, 2021
To use real time collaboration in JupyterLab, we
need jupyterlab-contrib/jupyterlab-link-share#10
to be fixed.
jupyterlab-contrib/jupyterlab-link-share#21 is
a proposed solution, and I'm testing it out here to see if it
works properly.

Ref berkeley-dsep-infra#3027
@yuvipanda
Contributor Author

Coming soon to data100

image

@yuvipanda
Contributor Author

Step 1: Find the 'Share' menu item in JupyterLab

image

Step 2: Read the warning, make sure you understand it, and copy the link to be shared with someone else

image

Step 3: Profit? Real Time Collaboration 'works' now!

yuvipanda added a commit to yuvipanda/datahub that referenced this issue Nov 19, 2021
yuvipanda added a commit to yuvipanda/datahub that referenced this issue Nov 19, 2021
yuvipanda added a commit to yuvipanda/datahub that referenced this issue Nov 19, 2021
Checked with @davidwagner who is super excited for this

Ref berkeley-dsep-infra#3027
yuvipanda added a commit to yuvipanda/datahub that referenced this issue Nov 19, 2021
Brings in jupyterlab-contrib/jupyterlab-link-share#27
as retrolab is going to be (hopefully) used by data8

Ref berkeley-dsep-infra#3027
yuvipanda added a commit to yuvipanda/datahub that referenced this issue Nov 22, 2021
Users are reporting some save issues on Piazza, and I
wasn't able to debug this right now. Turning this off to
see if it helps.

Ref berkeley-dsep-infra#3027
yuvipanda added a commit to yuvipanda/datahub that referenced this issue Nov 24, 2021
@balajialg balajialg added this to In progress in 2021-12 Sprint Board Dec 1, 2021
@balajialg balajialg moved this from In progress to To do in 2021-12 Sprint Board Dec 1, 2021
@balajialg
Contributor

balajialg commented Dec 8, 2021

@yuvipanda @ericvd-ucb and I found an interesting behavior of the RTC feature when we were exploring this functionality a few days ago in the Data 8 hub. I had a notebook file in the root folder of the tree. So I created a shareable link and then shared it with @ericvd-ucb and Elias. Once they accessed the notebook, any folder created in each of our directories was accessible for all of us.

Am I right to assume that whenever a file in a directory is shared with other users, the directory it resides in automatically becomes a shared directory, and any additions or deletions in that directory are visible to everyone with the shareable link? If that is the case, I would like to add this behavior to our existing documentation.

@balajialg
Contributor

@yuvipanda Need to enable this functionality in Ischool hub too!

yuvipanda added a commit to yuvipanda/datahub that referenced this issue Dec 9, 2021
yuvipanda added a commit to yuvipanda/datahub that referenced this issue Dec 9, 2021
@yuvipanda
Contributor Author

@balajialg so you're really giving full access to your entire server, rather than any particular file. This is fully equivalent to sharing an 'admin access' link with another user, although you can't shut down your own server with this link. So anything anyone with the link does is treated exactly like you are doing it yourself.
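
To illustrate why the link is equivalent to full access, here is a rough sketch, assuming the share link is simply the user's server URL with the server's access token attached as a `token` query parameter. The function names, the example URL, and the token value are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def make_share_link(server_url: str, token: str) -> str:
    """Build a share link by attaching the server token to the Lab URL."""
    return f"{server_url.rstrip('/')}/lab?{urlencode({'token': token})}"


def token_from_link(link: str) -> str:
    """Recover the token from a share link, as any recipient could."""
    return parse_qs(urlsplit(link).query)["token"][0]


# Hypothetical server URL and token, for illustration only.
link = make_share_link("https://hub.example.edu/user/alice/", "s3cr3t")
print(link)

# The link names no file and no recipient. Whoever holds it holds the
# token, so the server treats their requests as coming from the owner.
print(token_from_link(link))
```

Nothing in such a link scopes it to a notebook or a person, which matches the behavior described above.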

@balajialg balajialg moved this from To do to In progress in 2021-12 Sprint Board Dec 9, 2021
@balajialg
Contributor

balajialg commented Dec 14, 2021

This documentation captures the essence of today's discussion and the potential next steps! Pasting it below for future reference!

Security Considerations for adopting RTC

  • there is a share menu that generates a link with a token that can be sent to other users
  • when a user shares a link with another user, the recipient can see and modify all the files in the sharer's home directory on this particular hub, including other assignments and homework
  • this permission goes away when the user's server stops, either explicitly from the control panel or after 60 minutes of inactivity
  • there is no ability to revoke the link, but you can stop the server and the link becomes invalid
  • anyone can forward the link to the next person - there is no link-level user control. Users would have to be careful where they share links while that server is active since anyone on the internet could get control.
  • currently, the Data 8 hub has its own file directory and only that class is on it, whereas the main Datahub holds all of the user's classes (e.g., connectors and modules)
  • a user needs to leave the session to go to work on their own files
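
The link lifetime described in the list above can be sketched as follows. This is a toy model, not the hub's actual code: `ServerState` and `link_is_valid` are hypothetical names, and the only facts taken from the discussion are that the link works while the server is running and that the server stops after 60 minutes of inactivity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# From the discussion: idle servers are stopped after 60 minutes.
IDLE_TIMEOUT = timedelta(minutes=60)


@dataclass
class ServerState:
    """Minimal stand-in for a user's server (hypothetical)."""
    running: bool
    last_activity: datetime


def link_is_valid(server: ServerState, now: datetime) -> bool:
    """A share link works only while the owner's server is up."""
    if not server.running:
        # Stopping the server is the only way to invalidate the link.
        return False
    # Otherwise the link lasts until the idle timeout stops the server.
    return now - server.last_activity < IDLE_TIMEOUT


start = datetime(2021, 12, 14, 12, 0)
print(link_is_valid(ServerState(True, start), start + timedelta(minutes=30)))
print(link_is_valid(ServerState(True, start), start + timedelta(minutes=61)))
print(link_is_valid(ServerState(False, start), start))
```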

Minimum requirements for use by students in a collaborative 'project'

Tier 1

  • Only allow users who already have an account on the hub
  • Add the ability to revoke access once the link is out
  • Be 'clearer' who is the host. A guest of the share might want to visit their own server and/or share it, so this could be confusing.
  • Display informed consent - the link share dialog's text should be customizable
  • a read-only collaboration that is time-bound (aspirational use case)

Tier 2

  • Only allow users whose names are explicitly chosen by the user initiating the share
  • Have a concept of 'projects', so that people can be given access only to particular projects

Maybe useful

  • time limits for each share link?
  • readonly is very helpful
  • Must be limited by
  • limit by username
  • projects or directories
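
As a starting point for discussion, the requirements above could be sketched as a single access check. Everything here is hypothetical (the `ShareLink` fields, the `may_access` function, the user names); none of it exists in jupyterlab-link-share today, and project or directory scoping is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional, Set


@dataclass
class ShareLink:
    """Hypothetical share link carrying the controls listed above."""
    owner: str
    # Tier 2: explicitly chosen names; an empty set means any hub user.
    allowed_users: Set[str] = field(default_factory=set)
    read_only: bool = True                 # read-only sharing
    expires_at: Optional[datetime] = None  # time limit per link
    revoked: bool = False                  # Tier 1: revocable


def may_access(link: ShareLink, user: str, hub_users: Set[str],
               now: datetime, write: bool = False) -> bool:
    """Decide whether `user` may use `link` at time `now`."""
    if link.revoked:
        return False
    if link.expires_at is not None and now >= link.expires_at:
        return False
    if user not in hub_users:
        return False  # Tier 1: only users with an account on the hub
    if link.allowed_users and user not in link.allowed_users:
        return False  # Tier 2: only explicitly named users
    if write and link.read_only:
        return False
    return True


hub = {"alice", "bob", "carol"}
now = datetime(2021, 12, 14, 12, 0)
link = ShareLink(owner="alice", allowed_users={"bob"})
print(may_access(link, "bob", hub, now))              # named guest, reading
print(may_access(link, "bob", hub, now, write=True))  # blocked: read-only
print(may_access(link, "carol", hub, now))            # blocked: not named
```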

@yuvipanda
Contributor Author

Based on suggestions by @ericvd-ucb, I'm going to disable jupyterlab-link-share and RTC on all hubs except the dlab hub. Let's write out a doc that lays out the security posture, and let individual instructors make calls on when to enable this for now. Once we have better access control, we can enable it more broadly.

@yuvipanda
Contributor Author

I'm also going to leave it enabled in the ischool hub, as @balajialg is talking to those folks about how they can trial it.

@balajialg
Contributor

balajialg commented Feb 3, 2022

  • Plan user research with instructors/GSIs during March

@balajialg
Contributor

RTC is currently disabled across all hubs; the reasons are outlined in issue #3517!

@balajialg balajialg added this to To do in 2022-08 Sprint Board Aug 4, 2022
@balajialg balajialg removed this from To do in 2022-08 Sprint Board Aug 4, 2022
@fperez
Collaborator

fperez commented Aug 13, 2022

Quick question @afshin - have any of the recent changes made on the Yjs side in Lab had any impact on these data loss issues? I know the big refactoring is due for 4.0, but I was wondering if some of the recent work has made its way into 3.4.5 in a manner that would change our risk assessment?

I'd very much love to have RTC available, but we can't really take any chances (this semester I'm teaching a huge class of ~ 1,200+, not my smaller one from the spring where I could afford to be a bit more risk-tolerant and conduct experiments).

Thx for any input you might have.

@afshin

afshin commented Aug 14, 2022

@fperez @yuvipanda @balajialg, That bug still exists in every current release and pre-release of JupyterLab. It will be resolved when we merge this PR which refactors RTC in JupyterLab.

This PR is backward-incompatible, so it'll ship in a 4.0 alpha but it will not land in a 3.x release. We're pushing hard on merging it so we can iterate and get back to truly testing RTC in the wild.

@fperez
Collaborator

fperez commented Aug 15, 2022

Understood, thx for the clarification @afshin! It helps us plan on our end: we're probably going to hold off on RTC for this semester, at least in very large courses like D100 where flexibility/risk have to be tightly controlled (smaller courses have a different dynamic).

@balajialg balajialg added this to To do in 2022-09 Sprint Board Sep 1, 2022
@balajialg balajialg removed this from To do in 2022-09 Sprint Board Sep 1, 2022
@balajialg balajialg added this to To Do in 2022-10 Sprint Board via automation Sep 1, 2022
@balajialg balajialg removed this from To Do in 2022-10 Sprint Board Oct 31, 2022
@VictoriaHoll

@aculich and I are interested in enabling this feature on the general DataHub as well as the D-Lab DataHub! Do you have an estimate of when this might be available? Thank you!

@balajialg
Contributor

balajialg commented Nov 23, 2022

Thanks, @VictoriaHoll, for raising the request! We are waiting for the JupyterLab 4.0 release to actually test the RTC feature.

@afshin Do you have any inputs on when we can expect to test the RTC feature in our hubs? We have had multiple requests to enable RTC during the past 4-5 months (like the comment above).

@fperez
Collaborator

fperez commented Nov 23, 2022

If we can get 3.6.0 out the door with the RTC machinery enabled soon, I'm inclined to test it again in Stat 159. 159 is a safer environment to test this in, as I have "only" ~70 students, a very knowledgeable GSI, and the focus is precisely collaborative research.

We did last time (sp22), and that's how we found the data loss issues! It was a bit scary, but ultimately it led to the right fixes going in.

@balajialg
Contributor

balajialg commented Nov 23, 2022

@fperez Thanks both for your willingness to test RTC as part of Stat 159 and for the update that RTC is part of the 3.6 minor release. I had earlier assumed it would only arrive with the 4.0 release.

@fperez
Collaborator

fperez commented Nov 23, 2022

Sure! All the big changes are meant for 4.0, but 3.6 has the key backports that should (that's why we need to test! :) fix the bugs we encountered in the spring.

@balajialg
Contributor

@fperez Got it. That makes sense.

5 participants