Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cancellation of file save operation #241

Merged
merged 2 commits into from
Mar 4, 2024

Conversation

davidbrochart
Copy link
Collaborator

When we schedule a file saving and one has already been scheduled but not yet completed (for instance, by typing a character and then another one less than one second later), we need to cancel the previously scheduled one. The proper way to do it is not by calling .cancel() on the task and awaiting it, but by handling the CancelledError exception inside the task. In our case, we want to do nothing if the task is cancelled, so we just return.

Copy link
Contributor

Binder 👈 Launch a Binder on branch davidbrochart/jupyter_collaboration/fix-save-cancel

@davidbrochart davidbrochart added the bug Something isn't working label Feb 28, 2024
@Zsailer
Copy link
Member

Zsailer commented Feb 28, 2024

@davidbrochart are the UI tests pretty flaky? At quick glance, these failures appear unrelated. The first few failures are due to the collaborations panel being slow to update.

@ellisonbg
Copy link

In the current implementation what triggers a save to disk? Wondering how these re-entrant saves happen and if there is a deeper issue for us to solve.

@davidbrochart
Copy link
Collaborator Author

Yes I think failures are unrelated, we will need to look into this.

@ellisonbg
Copy link

Some high level thinking on this:

  • I think we want to avoid triggering saves (even if there is a pause) on user actions (editing the document as this will increase load on the server as the number of users and edits increases.
  • A simpler approach would be to set a periodic timer (say every 60 seconds) and save to disk if there have been changes in that 60s.
  • This would guarantee that re-entrant saves are impossible and simplify the logic.

Thoughts?

@Zsailer
Copy link
Member

Zsailer commented Feb 28, 2024

I think we want to avoid triggering saves (even if there is a pause) on user actions (editing the document as this will increase load on the server as the number of users and edits increases.

I totally agree here.

A simpler approach would be to set a periodic timer (say every 60 seconds) and save to disk if there have been changes in that 60s.

I was going to suggest the same thing.

I think the this PR still fixes a bug and should be merged. If two saves are queued, we can cancel one. But I agree with @ellisonbg, a "periodic backup" instead of user-action-driven saving mechanism would avoid slamming the server, especially as these documents scale larger.

@ellisonbg
Copy link

Sounds good, let's review and merge this.

@Zsailer
Copy link
Member

Zsailer commented Feb 28, 2024

@davidbrochart, I think this codebase could really use a lot more commenting, because the async task flow is hard to hold in the brain 😅. I'm mentioning this, because I don't want to annoy you when asking for more comments as-we-go in these code reviews. 😄

@davidbrochart
Copy link
Collaborator Author

davidbrochart commented Feb 28, 2024

  • I think we want to avoid triggering saves (even if there is a pause) on user actions (editing the document as this will increase load on the server as the number of users and edits increases.

Actually no, it's the opposite: the more users and edits, the less saving happens, because every edit reschedules the save for later.

  • A simpler approach would be to set a periodic timer (say every 60 seconds) and save to disk if there have been changes in that 60s.

That's not how Google Docs works, and as a user it feels strange to me: sometimes I will see the document being saved right after a change, sometimes a long time after. I think Google Docs has set some kind of standard, as far as this should behave.

  • This would guarantee that re-entrant saves are impossible and simplify the logic.

They are impossible already.

@dlqqq
Copy link
Collaborator

dlqqq commented Feb 28, 2024

@ellisonbg I agree, and came to similar observations while reading through the source. Will post the issue back here in the thread once it's ready.

@Zsailer
Copy link
Member

Zsailer commented Feb 28, 2024

That's not how Google Docs works, and as a user it feels strange to me: sometimes I will see the document being saved right after a change, sometimes a long time after. I think Google Docs has set some kind of standard, as far as this should behave.

The server-side serialization of the document to its backend storage is opaque to the user in Google Docs, right? It's running as a cloud service, not a local application like most of our users. So I don't know if we can directly compare here.

In a "Jupyter cloud case", the UX for this doesn't necessarily need to show when the saving happens in the backend, i.e. it can be opaque too.

Copy link
Collaborator

@dlqqq dlqqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is correct, but @Zsailer is right to call out the need for additional doc comments.

@davidbrochart There should be a doc comment saying something like, "Note to developers: make sure all await statements are wrapped in the try block below."

@Zsailer I can offer some insight into why this PR works. Basically, calling task.cancel() does not actually cancel the task. Instead, it raises a asyncio.CancelledError exception the next time that the task is re-entered by the event loop. This occurs immediately after every await statement. This PR moves all await statements into the try block and catches the asyncio.CancelledError exception silently. This allows the task self._maybe_save_document(...) to be cancelable as expected.

Lastly, as @ellisonbg called out, we will want to change this logic in the near future, so I think leaving this untested is OK for the time being.

@davidbrochart
Copy link
Collaborator Author

@Zsailer I think it's important to show that edits are saved or not, for instance JupyterLab will still warn you that changes are not saved if you try to exit when changes are not saved. If you didn't have that, you could really close everything and loose your changes.

Lastly, as @ellisonbg called out, we will want to change this logic in the near future, so I think leaving this untested is OK for the time being.

@dlqqq I think I made good arguments against Brian's comments.

@Zsailer
Copy link
Member

Zsailer commented Feb 28, 2024

I think it's important to show that edits are saved or not, for instance JupyterLab will still warn you that changes are not saved if you try to exit when changes are not saved. If you didn't have that, you could really close everything and loose your changes.

Fair enough. I just opened Google docs to see what they do here. I didn't realize they actually show a "Saving..." message at the top of the doc every time a new keystroke happens.

Screenshot 2024-02-28 at 2 50 51 PM

@davidbrochart davidbrochart merged commit 4538d02 into jupyterlab:main Mar 4, 2024
19 of 20 checks passed
@davidbrochart davidbrochart deleted the fix-save-cancel branch March 4, 2024 08:10
@davidbrochart
Copy link
Collaborator Author

Released in v2.0.3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants