A source of lexicographically-ordered "position strings" for collaborative lists and text.
In a collaborative list (or text string), you need a way to refer to "positions" within that list that:
- Point to a specific list element (or text character).
- Are global (all users agree on them) and immutable (they do not change over time).
- Can be sorted.
- Are unique, even if different users concurrently create positions at the same place.
This package gives you such positions, in the form
of lexicographically-ordered strings. Specifically, PositionSource.createBetween
returns a new "position string" in between two existing position strings.
These strings have the bonus properties:
-
(Non-Interleaving) If two
PositionSource
s concurrently create a (forward or backward) sequence of positions at the same place, their sequences will not be interleaved.For example, if Alice types "Hello" while Bob types "World" at the same place, and they each use a
PositionSource
to create a position for each character, then the resulting order will be "HelloWorld" or "WorldHello", not "HWeolrllod". -
If a
PositionSource
creates positions in a forward (increasing) sequence, their lengths as strings will only grow logarithmically, not linearly.
Position strings are printable ASCII. Specifically, they
contain alphanumeric characters, ','
, and '.'
.
Also, the special string PositionSource.LAST
is '~'
.
- Fractional indexing, a related scheme that satisfies 1-3 but not 4-6.
- List CRDTs
and how they map to position strings.
PositionSource
uses an optimized variant of that link's string implementation, described in algorithm.md. - Paper about interleaving in collaborative text editors.
- list-positions, a similar but more comprehensive library. It provides helper data structures (e.g., a
List<T>
class) and more efficient usage options.
Install with npm:
npm i --save position-strings
Creating position strings:
import { PositionSource } from "position-strings";
// At the start of your app:
const source = new PositionSource();
// When the user types `char` at `index`:
const position = source.createBetween(
myListPositions[index - 1],
myListPositions[index]
// If index is 0 or myListPositions.length, the above behaves reasonably,
// since undefined defaults to PositionSource.FIRST or LAST.
);
myListPositions.splice(index, 0, position);
myList.splice(index, 0, char);
// Or insert { position, char } into a database table, ordered map, etc.
If your list is collaborative:
import { findPosition } from "position-strings";
// After creating { char, position }, also broadcast it to other users.
// When you receive `remote = { char, position }` from another user:
const index = findPosition(remote.position, myListPositions).index;
myListPositions.splice(index, 0, remote.position);
myList.splice(index, 0, remote.char);
// Or insert `remote` into a database table and query
// "SELECT char FROM table ORDER BY position".
// Or insert `remote` into an ordered map, etc.
To use cursors:
import { Cursors, PositionSource } from "position-strings";
let cursor: string = PositionSource.FIRST;
// When the user deliberately moves their cursor to `cursorIndex`:
cursor = Cursors.fromIndex(cursorIndex, myListPositions);
// Or run the algorithm in the `Cursors.fromIndex` docs.
// When the text changes, update the displayed cursor:
cursorIndex = Cursors.toIndex(cursor, myListPositions);
// Or run the query in the `Cursors.toIndex` docs.
constructor(options?: { ID?: string })
Constructs a new PositionSource
.
It is okay to share a single PositionSource
between
all documents (lists/text strings) in the same JavaScript runtime.
For efficiency (shorter position strings),
within each JavaScript runtime, you should not use
more than one PositionSource
for the same document.
An exception is if multiple logical users share the same runtime;
we then recommend one PositionSource
per user.
@param options.ID
A unique ID for this PositionSource
. Defaults to
IDs.random()
.
If provided, options.ID
must satisfy:
- It is unique across the entire collaborative application, i.e.,
all
PositionSource
s whose positions may be compared to ours. This includes pastPositionSource
s, even if they correspond to the same user/device. - It does not contain
','
or'.'
. - The first character is lexicographically less than
'~'
(code point 126).
If options.ID
contains non-alphanumeric characters, then created
positions will contain those characters in addition to
alphanumeric characters, ','
, and '.'
.
createBetween(
left: string = PositionSource.FIRST,
right: string = PositionSource.LAST
): string
Returns a new position between left
and right
(left < new < right
).
The new position is unique across the entire collaborative application,
even in the face of concurrent calls to this method on other
PositionSource
s.
@param left
Defaults to PositionSource.FIRST
(insert at the beginning).
@param right
Defaults to PositionSource.LAST
(insert at the end).
readonly ID: string
The unique ID for this PositionSource
.
static readonly FIRST: string = ""
A string that is less than all positions.
static readonly LAST: string = "~"
A string that is greater than all positions.
function findPosition(
position: string,
positions: ArrayLike<string>
): { index: number; isPresent: boolean };
Returns { index, isPresent }
, where:
index
is the current index ofposition
inpositions
, or where it would be if added.isPresent
is true ifposition
is present inpositions
.
If this method is inconvenient (e.g., the positions are in a database
instead of an array), you can instead compute
index
by finding the number of positions less than position
.
For example, in SQL, use:
SELECT COUNT(*) FROM table WHERE position < $position
See also: Cursors.toIndex
.
@param positions
The target list's positions, in lexicographic order.
There should be no duplicate positions.
Utilities for working with cursors in a collaborative list or text string.
A cursor points to a particular spot in a list, in between two list elements (or text characters). This class handles cursors for lists that use our position strings.
A cursor is represented as a string.
Specifically, it is the position of the element
to its left, or PositionSource.FIRST
if it is at the beginning
of the list. If that position is later deleted, the cursor stays the
same, but its index shifts to next element on its left.
You can use cursor strings as ordinary cursors, selection endpoints, range endpoints for a comment or formatting span, etc.
static fromIndex(index: number, positions: ArrayLike<string>): string
Returns the cursor at index
within the given list of positions. Invert with Cursors.toIndex
.
That is, the cursor is between the list elements at index - 1
and index
.
If this method is inconvenient (e.g., the positions are in a database instead of an array), you can instead run the following algorithm yourself:
- If
index
is 0, returnPositionSource.FIRST = ""
. - Else return
positions[index - 1]
.
@param positions
The target list's positions, in lexicographic order.
There should be no duplicate positions.
static toIndex(cursor: string, positions: ArrayLike<string>): number
Returns the current index of cursor
within the given list of
positions. Inverse of Cursors.fromIndex
.
That is, the cursor is between the list elements at index - 1
and index
.
If this method is inconvenient (e.g., the positions are in a database
instead of an array), you can instead compute
index
by finding the number of positions less than
or equal to position
.
For example, in SQL, use:
SELECT COUNT(*) FROM table WHERE position <= $position
See also: findPosition
.
@param positions
The target list's positions, in lexicographic order.
There should be no duplicate positions.
Utitilies for generating PositionSource
IDs (the options.ID
constructor argument).
static random(options?: { length?: number; chars?: string }): string
Returns a cryptographically random ID made of alphanumeric characters.
@param options.length
The length of the ID, in characters.
Default: IDs.DEFAULT_LENGTH
.
@param options.chars
The characters to draw from. Default: IDs.DEFAULT_CHARS
.
If specified, only the first 256 elements are used, and you achieve
about log_2(chars.length)
bits of entropy per length
.
static pseudoRandom(
rng: seedrandom.prng,
options?: { length?: number; chars?: string }
): string
Returns a psuedorandom ID made of alphanumeric characters,
generated using rng
from package seedrandom.
Note: If you install
@types/seedrandom
yourself instead of relying on our dependency, install version2.4.28
, even thoughseedrandom
itself has version3.0.5
.
Pseudorandom IDs with a fixed seed are recommended for tests and benchmarks, to make them deterministic.
@param options.length
The length of the ID, in characters.
Default: IDs.DEFAULT_LENGTH
.
@param options.chars
The characters to draw from. Default: IDs.DEFAULT_CHARS
.
If specified, only the first 256 elements are used, and you achieve
about log_2(chars.length)
bits of entropy per length
.
static validate(ID: string): void
Throws an error if ID
does not satisfy the
following requirements from PositionSource
's constructor:
- It does not contain
','
or'.'
. - The first character is lexicographically less than
'~'
(code point 126).
static readonly DEFAULT_LENGTH: number = 10
The default length of an ID, in characters.
static readonly DEFAULT_CHARS: string =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
Default characters used in IDs: alphanumeric chars.
Firebase text-editor uses position-strings to implement collaborative (plain) text editing on top of Firebase RTDB. Each character is stored together with its position, and a Firebase query is used to list the characters in order.
The app also demonstrates using Cursors
to track the local user's selection start and end.
Position string length is our main performance metric. This determines the memory, storage, and network overhead due to a collaborative list's positions.
Additionally, each
PositionSource
instance uses some memory, andPositionSource.createBetween
takes some time, but these are usually small enough to ignore.
To measure position string length in a realistic setting, we benchmark against Martin Kleppmann's text trace. That is, we pretend a user is typing into a collaborative text editor that attaches a position string to each character, then output statistics for those positions.
For the complete trace (182k positions, 260k total edits) typed by a single PositionSource
, the average position length is 33 characters, and the max length is 55.
For a more realistic scenario with 260 PositionSource
s (a new one every 1,000 edits), the average position length is 111 characters, and the max length is 237. "Rotating" PositionSource
s in this way simulates the effect of multiple users, or a single user who occasionally reloads the page. (The extra length comes from referencing multiple IDs per position: an average of 8 IDs/position x 8 chars/ID = 64 chars/position.)
If we only consider the first 10,000 edits, the averages decrease to 23 characters (single PositionSource
) and 50 characters (new PositionSource
every 1,000 edits).
More stats for these four scenarios are in stats.md. For full data, run npm run benchmarks
(after npm ci
) and look in benchmark_results/
.
- In realistic scenarios with multiple
PositionSource
s, most of the positions' length comes from referencing IDs. By default, IDs are 8 random alphanumeric characters to give a low probability of collisions, but you can pass your own shorter IDs toPositionSource
's constructor. For example, you could assign IDs sequentially from a server. - A set of positions from the same list compress reasonably well together, since they represent different paths in the same tree. In particular, a list's worth of positions should compress well under gzip or prefix compression. However, compressing individual positions is not recommended.
PositionSource.createBetween
is optimized for left-to-right insertions. If you primarily insert right-to-left or at random, you will see worse performance.