Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC 95: Concurrent editing notifications #95

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

gasman
Copy link
Contributor

@gasman gasman commented Mar 13, 2024

Rendered view

Currently, if two users edit the same page or snippet at the same time, the second user to save their changes will overwrite the first user's changes. This doesn't seem to be a common problem, and can often be avoided through communication - however, it will need to be addressed in advance of implementing auto-saving, as this increases the likelihood of users overwriting each other's changes unknowingly.

The "holy grail" of avoiding edit conflicts would be a real-time collaborative editing feature where changes are reflected immediately on the other user's screen, but this will not happen any time soon. A locking mechanism has been proposed, but this raises further implementation questions (for example, who has authority to override a lock in the case that an editor leaves a page open and goes on holiday?) and would be overly disruptive.

This RFC proposes an approach where users are notified of any other users editing the same item. Users will not be blocked from editing, but will be provided with information about the potential for conflicts, and how to resolve them when they do occur.

@gasman gasman added the 1:New label Mar 13, 2024
@laymonage laymonage self-requested a review March 15, 2024 09:36
@laymonage laymonage assigned laymonage and unassigned gasman Mar 22, 2024
Copy link
Member

@laymonage laymonage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think I'm quite happy with the general approach here (an EditingSession model, a client-side behaviour that pings the server, etc.). I have a few questions, but they're somewhat closer to implementation details than the general concept.

We add a 'ping' endpoint view that receives an EditingSession ID as a parameter, validates that the EditingSession belongs to the current user, and updates the 'last seen' timestamp to the current time. It returns a JSON response containing a list of all EditingSessions other than the current one that match this object and have a timestamp within the last minute.

> [!NOTE]
> It may also be necessary to perform a permission check to verify that the current user has edit permission on the object, so that users cannot use the 'ping' endpoint as a means to track user activity outside of their own area of the site. (This cannot happen if going through the 'edit' view is the only way to acquire an EditingSession ID, but see the "Handling unknown EditingSession IDs" section below for a setup where this could be bypassed.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet another reason why we need something like #92 so we can check the permissions for non-page models without relying on the viewset's permission_policy 😄


The list of active users will also be included in the initial rendered response of the edit view, so that we don't have to wait for the initial 'ping' response for this information to be displayed.

When a user leaves the edit view (as determined by a [`visibilitychange` event](https://developer.mozilla.org/en-US/docs/Web/API/Document/visibilitychange_event) changing visibility to 'hidden'), the 'release' endpoint is called (using [`sendBeacon`](https://developer.mozilla.org/en-US/docs/Web/API/Navigator/sendBeacon)) to delete the EditingSession.
Copy link
Member

@laymonage laymonage Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also consider "idle" sessions? When a user switches to a different tab, it is not uncommon for them to return to the editing tab not long after. Do we want to immediately release the session, or mark it as inactive? Or, maybe just leave it as-is, and if the timestamp is e.g. 30 seconds ago < timestamp < 60 seconds ago), display it as idle.

idle indicator on google docs

If so, we might need to use pagehide event (or beforeunload, but that seems less ideal) to call the 'release' endpoint.


### Handling unknown EditingSession IDs

Given the above cleanup process, and the potential for a `visibilitychange` event to occur without the user actually leaving the page, it's possible that a browser could resume sending 'ping' requests for a given EditingSession ID some time after the record has been deleted. In this situation, we would want the user to reappear in the list of active sessions. To handle this, requests to the 'ping' endpoint should pass the object's ID and content type in addition to the EditingSession ID, and in the case that the EditingSession is not found, the view will create a new one. The ID of the EditingSession (whether retrieved or newly created) will also be included in the response, so that subsequent 'ping' requests can be made with the new ID.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we adding a UniqueConstraint on content_type, object_id, and user? If so, maybe we could use the content_type and object_id as the identifier when interacting with the API? I'm not sure if there would be a need to use the EditingSession's ID directly if this were the case.

I suppose it depends on how we want to handle "multiple sessions of the same user" scenario. Making the EditingSession unique on the generic FK and the user would prevent us from differentiating the sessions, but I don't know if we need that. It doesn't make much sense displaying the same user multiple times in the header (and if we make the query distinct – perhaps we should make it unique in the first place?).

We're using the revision IDs to handle the concurrent editing notifications, and it doesn't seem to require the EditingSession directly (other than for piggybacking the requests to update the client JS state). If a user saves a new revision in a different tab, the current spec doesn't seem to have any issues displaying it when the user clicks save. Unless I'm missing something...

Copy link
Member

@tm-kn tm-kn Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels to me that the editing session is an instance of editing the object, not an instance of a user editing an object, at least on the backend side of things. I think that's what you want to be modelling. In the UI, you can still display it as a single user, even if one user has multiple editing sessions. It would make sense to generate a separate identifier (e.g. UUID) for each editing session and identify it that way (with filtering by the user ID from the session as well).

I don't see a reason for abstracting editing sessions in this way at the database level. I think the database should store each instance of an editing session (one user may multiple tabs where each tab is an editing session). If the UI or public API needs a simplified version, it can be done separately in the querying code.

I think the implementation should be as simple as possible. An example of unnecessary complexity you are introducing is, if there's one session only and you use the update_or_create() call, is there a possibility that one session calls the release endpoint when you close the tab and that deletes the session, and the user disappears for a short period and is no longer "currently editing", even if they are still editing.

I can't see what outcome enforcing one editing session per user in the database would bring. I can only see negatives such as: introducing unnecessary complexity in the code and making the database work harder for zero gain, plus introducing complexity usually means you multiply number of surprises you will come by when you implement this.

Also, an example reason why you might want to display multiple sessions on the UI is for users to realise that they have the same object opened in multiple tabs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I was also slightly concerned about the additional complexity that the unique constraint would bring. OK then, I think we can proceed without making it unique per user+object. I also thought about adding a UUID as well, although I'm not sure what it would be useful for (other than preventing enumeration).


### Base implementation - tracking active sessions

A minimal implementation would simply show a list of other users with an active editing session on the same object, without any notification of when those users actually make changes. This would be the only functionality available for non-versioned snippets, since we need a revision ID to identify whether the last saved version is the one that the current user is editing.
Copy link
Member

@laymonage laymonage Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the audit logs (perhaps only those with the wagtail.edit action) be a reasonable substitute instead of the revisions? It would allow us to have the feature for non-versioned snippets. Maybe use revisions if supported, and fall back to the audit logs, or maybe just use the latter to simplify the implementation.

We might also be able to use the BaseLogEntry.content_changed flag to only show the warning if the other user's "save" action actually changed the content. If we use revisions, we'll always have to display the warning, even if the other user might have only clicked "save" without changing anything. Not sure how reliable the flag is, though.

Also, we could detect if there's a wagtail.delete action and display some information that the object was deleted by which user. If we use revisions, we can still do this by making the ping API's response indicate that the object is no longer available (and optionally query the audit log to include the information of which user deleted it). Speaking of which, I don't think we've discussed the deletion case when we initially designed the feature...


### Clean-up of outdated sessions

Since we cannot guarantee that the 'release' endpoint will be called in all cases (for example, if the user's browser crashes), it is possible that stale EditingSession records will be left in the database. These will not affect functionality (since we disregard sessions older than a minute), but over time they may impair performance and bloat the database. These could be cleaned up with a scheduled task, but the extra work for site owners to set this up is probably not worth the benefit. I therefore propose that we perform this clean-up in the edit view, at the point where an EditingSession record is created. This will delete all EditingSession records that are more than one hour old.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be more of an implementation detail – but if we're using generic FK to the object, we probably will need to provide a model mixin with a default GenericRelation and tell the developers to add it to their model so that the EditingSession gets deleted when the model instance is deleted. Or perhaps, tell developers to add the GenericRelation themselves.

Or, if that sounds like a hassle when you have a lot of models, maybe we need to introduce a management command to be run periodically after all...

Maybe we can get away with doing the deletion in the generic DeleteView instead, but this won't cover cases where the deletion happens programmatically.

@abigail-hampson abigail-hampson assigned gasman and unassigned laymonage Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: 🕵️ Reviewing
Development

Successfully merging this pull request may close these issues.

None yet

3 participants