How to identify changes / changed nodes? #376

Open
michaelrampl opened this issue Jan 14, 2024 · 4 comments

@michaelrampl

michaelrampl commented Jan 14, 2024

Assuming that I want to update another data structure with changes from yrs transactions, how do I approach that?

My idea was to use some identifiers that are the same in both data structures and update or insert after every y-crdt commit.

I thought about using the changed_parent_types for that, e.g.

for branch_ptr in remote_txn.changed_parent_types() {
    println!("* {:?}", branch_ptr);
}

Output

* YText(start: (<4005326611#2>))
* YArray(start: (<4005326611#0>))

Even though this gives me some start position, which looks like a crdt id, I can't access these variables directly.

michaelrampl changed the title from "How to update another datastructure with changes" to "How to identify changes / changed nodes?" on Jan 14, 2024
@michaelrampl
Author

The reason I'm having two data structures is that I need some extra functionality, like the ability to generate drawings. If there is a possibility to implement traits of y-crdt to have my own structs inside a document, I would be happy as well.

So perhaps you could either give an example on how to expand y-crdt structures with custom functionality or how to merge with a custom data structure after updates, since I guess those are the two most common scenarios in which one could use it.

@Horusiath
Collaborator

Horusiath commented Jan 15, 2024

Probably the easiest way would be to subscribe a deep observable callback on the root-type shared collection whose changes you want to track. Any changes made to that collection and its nested collections will bubble up as events that can be caught at the root and reacted to there.

@michaelrampl
Author

michaelrampl commented Jan 17, 2024

That's a good start, but it doesn't fully answer my question. Imagine I have one array in my UI representing all the paragraphs. Additionally, I have a crdt doc with one array containing TextRefs representing the same paragraphs. How do I best sync (create, update, delete) entries in the UI array based on the crdt array without recreating it entirely on every update? Do you have a best practice for that? My current approach would be to either store references / pointers (e.g. every UI array entry holds a reference to a TextRef) or to use IDs (the same way). For the first one, you have to guarantee that individual references (e.g. TextRef, ArrayRef, MapRef) won't change with merges. For the second one, I'm not sure if I can use e.g. the id of an ArrayEvent insert/delete operation https://docs.rs/yrs/latest/yrs/types/array/struct.ArrayEvent.html#method.inserts or the type_id() https://doc.rust-lang.org/nightly/core/any/struct.TypeId.html or any other attribute to globally identify a node.

@dmonad
Contributor

dmonad commented Jan 19, 2024

There is no definite answer to this. And it really depends on the application / UI framework you are using.

Bartosz can correct me, but I think that a Shared Type Ref is simply an object that holds a reference to a shared type (currently it's a pointer, but Bartosz said some time ago that he wants to make a change so that it is an ID). In any case, individual references won't change with merges.

For the second one I'm not sure If I can use e.g. the id of an ArrayEvent insert/delete operation https://docs.rs/yrs/latest/yrs/types/array/struct.ArrayEvent.html#method.inserts

The result of method.insert shouldn't be relevant to you. I would even argue that we shouldn't return a value. You should use the observe API instead to update your UI/model. The Y.Text event, for example, spits out a delta that describes the changes that you need to make to your existing text to sync up with remote users.
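To make the delta idea concrete for the paragraph-array case, here is a minimal sketch of applying a retain/insert/delete delta to a local Vec in place, so the UI array is never rebuilt from scratch. DeltaOp and apply_delta are hypothetical names made up for this illustration; they only mimic the general shape of a delta and are not the yrs types, so the events you actually receive would need to be mapped onto something like this.

```rust
/// Hypothetical delta operation (an assumption for this sketch, not a yrs type).
/// It mimics the retain/insert/delete shape of a change delta.
#[derive(Debug, Clone, PartialEq)]
pub enum DeltaOp<T> {
    /// Skip over `n` unchanged elements.
    Retain(usize),
    /// Insert these elements at the current position.
    Insert(Vec<T>),
    /// Remove `n` elements starting at the current position.
    Delete(usize),
}

/// Applies `delta` to `local` in place, touching only the affected positions.
/// Out-of-range retains and deletes are clamped to the end of the vector.
pub fn apply_delta<T>(local: &mut Vec<T>, delta: Vec<DeltaOp<T>>) {
    let mut cursor = 0usize;
    for op in delta {
        match op {
            DeltaOp::Retain(n) => cursor = (cursor + n).min(local.len()),
            DeltaOp::Insert(items) => {
                let inserted = items.len();
                // An empty range at `cursor` turns splice into a pure insertion.
                drop(local.splice(cursor..cursor, items));
                cursor += inserted;
            }
            DeltaOp::Delete(n) => {
                let end = (cursor + n).min(local.len());
                drop(local.drain(cursor..end));
            }
        }
    }
}
```

With a UI array of paragraphs, a delta of "retain 1, delete 1, insert two entries" only touches the second slot and leaves every other entry (and whatever UI state hangs off it) alone.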

The simple observe method allows you to observe events on a single type. Deep Observe allows you to observe changes on that type and any of its children. You can use path to figure out where to find the changed type.
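As a rough illustration of using a path, the sketch below walks a local model from its root down to the node that a deep-observe event refers to. Segment, Node and resolve_mut are hypothetical names invented for this example, and the assumption is that a path is a list of map keys and array indices counted from the observed root; check the actual event API for the exact types.

```rust
use std::collections::HashMap;

/// Hypothetical path segment (assumed shape: a map key or an array index).
#[derive(Debug, Clone, PartialEq)]
pub enum Segment {
    Key(String),
    Index(usize),
}

/// Hypothetical local (non-CRDT) model that mirrors the nesting of the shared types.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Text(String),
    List(Vec<Node>),
    Map(HashMap<String, Node>),
}

/// Follows `path` from `root` and returns the node it points at,
/// or None if a segment does not match the local structure.
pub fn resolve_mut<'a>(root: &'a mut Node, path: &[Segment]) -> Option<&'a mut Node> {
    let mut current = root;
    for segment in path {
        current = match current {
            Node::Map(entries) => match segment {
                Segment::Key(key) => entries.get_mut(key)?,
                Segment::Index(_) => return None,
            },
            Node::List(items) => match segment {
                Segment::Index(i) => items.get_mut(*i)?,
                Segment::Key(_) => return None,
            },
            // Text is a leaf, so no further segment can be resolved below it.
            Node::Text(_) => return None,
        };
    }
    Some(current)
}
```

The local model never stores a CRDT id here; the path from the event is enough to locate the matching local node and then apply the change to it.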

There shouldn't be any need to store IDs of the Ycrdt model on your data structure.

Generally, there are two approaches to make an application collaborative. Each has different tradeoffs.

1) Two way binding

Assume you want to make some existing application (e.g. Vim) collaborative. Vim already has its own data structure for working with text. In order to make Vim collaborative, you would create a two-way binding from the Vim text model to a Y.Text type. That means:

  • Every time Y.Text changes (observe/observeDeep), you reflect the changes to the Vim text model.
  • Every time the Vim text model changes, you reflect the changes to Y.Text.

Yjs was built to make existing UI frameworks collaborative. Hence, most applications using Yjs are built using two-way bindings.

Preferably, you sync at the root level, i.e., the different models don't know anything about each other. In this case, the Vim model doesn't need to hold references to the Y.Text type. Everything can be achieved using deep observables. It is helpful if the application model can also be observed for changes. Otherwise, you need to diff the models to figure out what changed (this is what I do in y-prosemirror, for example - it's quite complex!).
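For the case where the application model cannot be observed, the simplest form of diffing is to compare the previous and current text and reduce the difference to a single edit by stripping the common prefix and suffix. The sketch below shows only that idea; TextEdit and simple_diff are hypothetical names, this is not the y-prosemirror algorithm, and the offsets are counted in Unicode scalar values, which may need converting to whatever offset unit the shared text type uses.

```rust
/// Hypothetical description of one edit: at `index`, remove `remove`
/// characters and then insert `insert`.
#[derive(Debug, Clone, PartialEq)]
pub struct TextEdit {
    pub index: usize,
    pub remove: usize,
    pub insert: String,
}

/// Reduces the difference between `old` and `new` to a single edit by
/// stripping the common prefix and suffix. Returns None if nothing changed.
/// Offsets are in Unicode scalar values (chars), not bytes or UTF-16 units.
pub fn simple_diff(old: &str, new: &str) -> Option<TextEdit> {
    let old: Vec<char> = old.chars().collect();
    let new: Vec<char> = new.chars().collect();

    let prefix = old.iter().zip(&new).take_while(|(a, b)| a == b).count();
    // The suffix must not overlap the prefix in the shorter string.
    let max_suffix = old.len().min(new.len()) - prefix;
    let suffix = old
        .iter()
        .rev()
        .zip(new.iter().rev())
        .take(max_suffix)
        .take_while(|(a, b)| a == b)
        .count();

    let remove = old.len() - prefix - suffix;
    let insert: String = new[prefix..new.len() - suffix].iter().collect();
    if remove == 0 && insert.is_empty() {
        None
    } else {
        Some(TextEdit { index: prefix, remove, insert })
    }
}
```

The resulting edit can then be replayed on the shared text as one delete followed by one insert at the same index. Structured documents need considerably more than this, which is where the complexity mentioned above comes from.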

This approach obviously has some overhead, because all data is stored twice. The advantage is that there is a clear separation between shared data and "local data". The local data can also contain additional information that is relevant to the local application (e.g. the currently highlighted text, or other kinds of configurations that shouldn't be shared between all users).

2) Use Ycrdt as the source of truth

You build your application around Ycrdt. Your application model holds references to shared type refs. When a user performs a change, the change is applied to the Ycrdt type. The Ycrdt type then emits a change event that you can use to update the UI.
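The flow described here (user change goes into the shared type, the shared type emits an event, the UI updates from the event) can be sketched with a small stand-in. SharedList, ListEvent and bind_view are hypothetical and are not part of any Ycrdt library; a real shared type would additionally emit the same events for remote changes, which is what makes this one code path sufficient.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical change event emitted by the stand-in shared list.
#[derive(Debug, Clone, PartialEq)]
pub enum ListEvent {
    Inserted { index: usize, value: String },
    Removed { index: usize },
}

/// Hypothetical stand-in for a shared array type: it owns the data and
/// notifies observers after every change.
pub struct SharedList {
    items: Vec<String>,
    observers: Vec<Box<dyn Fn(&ListEvent)>>,
}

impl SharedList {
    pub fn new() -> Self {
        SharedList { items: Vec::new(), observers: Vec::new() }
    }

    pub fn observe(&mut self, callback: impl Fn(&ListEvent) + 'static) {
        self.observers.push(Box::new(callback));
    }

    pub fn insert(&mut self, index: usize, value: &str) {
        let index = index.min(self.items.len());
        self.items.insert(index, value.to_string());
        self.emit(ListEvent::Inserted { index, value: value.to_string() });
    }

    pub fn remove(&mut self, index: usize) {
        if index < self.items.len() {
            self.items.remove(index);
            self.emit(ListEvent::Removed { index });
        }
    }

    fn emit(&self, event: ListEvent) {
        for observer in &self.observers {
            observer(&event);
        }
    }
}

/// Creates a UI-side list of rendered rows that is updated only from events,
/// never written to directly by user actions.
pub fn bind_view(list: &mut SharedList) -> Rc<RefCell<Vec<String>>> {
    let view = Rc::new(RefCell::new(Vec::new()));
    let sink = Rc::clone(&view);
    list.observe(move |event| {
        let mut rows = sink.borrow_mut();
        match event {
            ListEvent::Inserted { index, value } => rows.insert(*index, format!("<p>{value}</p>")),
            ListEvent::Removed { index } => {
                rows.remove(*index);
            }
        }
    });
    view
}
```

User actions call insert or remove on the shared list only; the rendered rows change as a side effect of the emitted events.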

This approach is less complex. However, there is no clear separation between "local state" and shared data.
