First reported in #16. The collaboration server is currently written with a scalable serverless architecture hosted on Zeit Now. We want a separate codebase for the local version. Because the Zeit Now code was built for a commercial project, we can't open-source it, but we can build a new version that implements the same API.
Here is the full specification:
Universal Data Tool Collaborative Editing Server
Goals
Users should be able to collaborate with other users to complete the labeling of a dataset together
Users should receive notifications as work is completed or started by other users
Users should receive "updates" from other users in less than 500ms
The "Settings" should be able to be edited by any user
Any user should be able to upload new data
Collaborative links should be shareable
The first time someone enters collaboration mode a dialog should explain how to share the link etc.
Out of Scope
Should not require any login
Collaborative editing on a per-sample basis
Collisions should take "last person who submitted edit"
Key Technologies
better-sqlite3 is an npm module that makes the connection to sqlite very fast and simple
Architecture
The following endpoints are used...
POST /udt/session: Creates a link to a UDT session. Whoever initiates collaboration mode calls this. It is called exactly once to start a session. A session lasts indefinitely. Returns the url to the session.
GET /udt/session/<session_id>: Gets the latest version of the UDT JSON file by getting the latest session_state (see DB Architecture)
GET /udt/session/<session_id>/diffs: Gets recent diffs for the JSON file
The requestor must provide the querystring parameter since=<ISODATE> indicating that they would like the diffs since the last time they polled.
The UDT will poll this every 250-500ms. Most of the time it'll return an empty array of patches.
Responds with { patches: Array<JSONDiffPatch>, hashOfLatestState, latestVersion }
PATCH /udt/session/<session_id>: Sends a JSONDiffPatch object with changes
Request contains { patch, mySessionStateId }
patch is applied against the latest session state to generate a new session state.
mySessionStateId isn't used (for now)
Should return { hashOfLatestState, latestVersion }
PATCH /udt/session/<session_id>/sample: Creates, modifies, or deletes a sample
This endpoint should be used instead of the /udt/session/<session_id> endpoint for updating, creating or deleting samples because it can handle certain edge cases better.
A request contains { operation, sampleIndex, [newInput], [newOutput], [previousInput] }
operation can be "DELETE", "CREATE", "UPDATE"
newInput is the taskData[sampleIndex] that the UDT observes when it sends the request
If "UPDATE" or "DELETE", use previousInput to find the true sample index. (i.e. do a deep comparison to find the sampleIndex using the latest version of the state).
newOutput is the new output for "UPDATE" operations. It is optional because the user may not want
The sampleIndex provided by the requestor should not be used directly.
Should return { hashOfLatestState, latestVersion }
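The index lookup described for the /sample endpoint can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the function names are hypothetical, "CREATE" is assumed to append to the end of taskData, and a missing sample is assumed to be reported as an error.

```python
import copy
from typing import Any, Optional


def find_true_sample_index(task_data: list, previous_input: Any, hinted_index: int) -> Optional[int]:
    """Locate the sample whose input deep-equals previous_input in the latest state."""
    # Fast path: the requestor's index still points at the same sample.
    if 0 <= hinted_index < len(task_data) and task_data[hinted_index] == previous_input:
        return hinted_index
    # Otherwise scan the latest state; == on dicts and lists compares deeply.
    for index, sample in enumerate(task_data):
        if sample == previous_input:
            return index
    return None  # the sample no longer exists, e.g. another user deleted it


def apply_sample_operation(state: dict, request: dict) -> dict:
    """Return a new state with the CREATE, UPDATE, or DELETE operation applied."""
    new_state = copy.deepcopy(state)
    task_data = new_state.setdefault("taskData", [])
    task_output = new_state.setdefault("taskOutput", [])
    # Pad outputs so both lists can be indexed in parallel.
    task_output.extend([None] * (len(task_data) - len(task_output)))

    operation = request["operation"]
    if operation == "CREATE":
        # Assumption: new samples are appended; the spec does not say where they go.
        task_data.append(request["newInput"])
        task_output.append(request.get("newOutput"))
        return new_state

    index = find_true_sample_index(task_data, request["previousInput"], request["sampleIndex"])
    if index is None:
        raise LookupError("sample not found in the latest state")

    if operation == "DELETE":
        del task_data[index]
        del task_output[index]
    elif operation == "UPDATE":
        if "newInput" in request:
            task_data[index] = request["newInput"]
        if "newOutput" in request:
            task_output[index] = request["newOutput"]
    else:
        raise ValueError(f"unknown operation: {operation}")
    return new_state
```

For example, if another user deleted the first sample while the requestor was editing the second, the requestor's sampleIndex of 1 is stale, and the deep comparison against previousInput finds the sample at index 0 instead.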
Example
Let's look at a typical collaborative workflow to see how these endpoints work:
After User1 engages collaboration mode, an API request is sent to POST /udt/session. User1's editor parses the response and creates a link for them to share.
User1 shares the link with their team (only User2) and begins to edit
User2 uses the link to join the session. They get the latest version of the UDT JSON by calling GET /udt/session/<session_id>. They know the session_id because it's embedded in the link.
User2 edits something in the settings. The UDT makes a request to PATCH /udt/session/<session_id> with a JSONDiffPatch containing their changes.
User1 polls GET /udt/session/<session_id>/diffs?since=<last_version> to get the latest patches. User1's editor sees that there's a patch to apply from User2. They apply the patch, and display a notification for the user.
User1 begins to edit a sample. This triggers a request to PATCH /udt/session/<session_id>/sample changing the taskData[sampleIndex].isBeingEdited to true.
User1 finishes editing a sample. This triggers a request to PATCH /udt/session/<session_id>/sample changing the taskData[sampleIndex].isBeingEdited to true and taskOutput[sampleIndex] to their newOutput
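The create, patch, and poll steps of this workflow can be simulated with a small in-memory stand-in for the server. Everything here is an assumption made for illustration: the class and method names are hypothetical, a patch is simplified to a shallow merge of top-level keys rather than a real JSONDiffPatch, polling uses a version number (as in the since=<last_version> step above) rather than an ISO date, and hashOfLatestState is assumed to be a SHA-256 of the canonical JSON.

```python
import hashlib
import json
import secrets


def state_hash(udt_json: dict) -> str:
    """Hash a state deterministically (assumed scheme: SHA-256 of sorted-key JSON)."""
    canonical = json.dumps(udt_json, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class InMemorySessionStore:
    """Hypothetical stand-in for the collaboration server's session endpoints."""

    def __init__(self):
        self.sessions = {}  # session_id -> ordered list of session states

    def create_session(self, udt_json: dict) -> str:
        # Mirrors POST /udt/session: called once, returns the id embedded in the link.
        session_id = secrets.token_hex(4)
        self.sessions[session_id] = [{"version": 0, "udt_json": udt_json, "patch": None}]
        return session_id

    def get_latest(self, session_id: str) -> dict:
        # Mirrors GET /udt/session/<session_id>.
        return self.sessions[session_id][-1]

    def apply_patch(self, session_id: str, patch: dict) -> dict:
        # Mirrors PATCH /udt/session/<session_id>: patch the latest state, last writer wins.
        latest = self.get_latest(session_id)
        new_json = {**latest["udt_json"], **patch}  # simplified shallow-merge patch
        self.sessions[session_id].append(
            {"version": latest["version"] + 1, "udt_json": new_json, "patch": patch}
        )
        return self._summary(session_id)

    def get_diffs(self, session_id: str, since_version: int) -> dict:
        # Mirrors GET /udt/session/<session_id>/diffs: usually an empty patch list.
        patches = [
            state["patch"]
            for state in self.sessions[session_id]
            if state["version"] > since_version
        ]
        return {"patches": patches, **self._summary(session_id)}

    def _summary(self, session_id: str) -> dict:
        latest = self.get_latest(session_id)
        return {
            "hashOfLatestState": state_hash(latest["udt_json"]),
            "latestVersion": latest["version"],
        }
```

A client would remember the latestVersion from each response and pass it to the next poll, so that a poll with nothing new returns an empty patches array.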
Database Architecture
One table called session_state representing each state of the JSON file. It contains the following columns:
session_state_id uuid randomly generated
short_id text randomly generated: represents the session id
udt_json jsonb: The state of the UDT file
patch jsonb: The patch that created this version from the previous version
previous_session_state_id uuid: Identifier for previous state
version integer: Integer identifying the revision number
created_at timestamptz: Timestamp on creation
The database will have the following constraints applied
UNIQUE previous_session_state_id
Each session can only have one subsequent state. This prevents certain race conditions.
The database will have the following SQL triggers:
Delete session_states that are older than 1 hour AND not the latest state
Triggered when a session state is inserted.
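As a rough sketch, the table, constraint, and trigger above could be expressed in SQLite like this. Several things are assumptions: the column types are mapped to SQLite's TEXT and INTEGER (the uuid, jsonb, and timestamptz types listed above are not native to SQLite), the trigger name and the open_database helper are hypothetical, and the cleanup is scoped to the session being written to. Python's built-in sqlite3 module is used here only to keep the example self-contained.

```python
import sqlite3

SCHEMA = """
CREATE TABLE session_state (
    session_state_id TEXT PRIMARY KEY,       -- randomly generated uuid
    short_id TEXT NOT NULL,                  -- the session id
    udt_json TEXT NOT NULL,                  -- serialized state of the UDT file
    patch TEXT,                              -- patch that produced this version
    previous_session_state_id TEXT UNIQUE,   -- only one successor per state
    version INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);

-- On every insert, drop states of the same session that are older than
-- one hour and are not the latest version.
CREATE TRIGGER prune_old_session_states
AFTER INSERT ON session_state
BEGIN
    DELETE FROM session_state
    WHERE short_id = NEW.short_id
      AND created_at < strftime('%Y-%m-%dT%H:%M:%fZ', 'now', '-1 hour')
      AND version < (SELECT MAX(version) FROM session_state WHERE short_id = NEW.short_id);
END;
"""


def open_database(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database (in memory by default) and create the schema."""
    connection = sqlite3.connect(path)
    connection.executescript(SCHEMA)
    return connection
```

With the UNIQUE constraint in place, two requests that both try to build on the same previous state cannot both succeed: the second insert fails with an integrity error and can be retried against the new latest state.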