New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persistent Undo #2021

Open
jupart opened this Issue Apr 30, 2018 · 12 comments

Comments

Projects
None yet
7 participants
@jupart
Copy link

jupart commented Apr 30, 2018

Vim has a feature called persistent_undo that is a nice quality-of-life feature I really like.

The implementation in nvim is like: It writes some kind of binary file to VIM_UNDO_DIR for each file. The undo files are written with their full paths separated by %, like %home%name%path%to%changed%file.c.

Does kak plan to support something like this one day? I think I'd like to implement a plugin for it either way, but I'm not sure where to start other than just to dig into other plugins.

@danr

This comment has been minimized.

Copy link
Contributor

danr commented May 1, 2018

Note: the undo history is not exposed to plugins right now so there is no way to support this as a plugin without changing the c++ source code as well.

@mawww

This comment has been minimized.

Copy link
Owner

mawww commented May 1, 2018

Exposing the necessary data so that a support script can take care of storing/restoring would be my preferred way. The problem is how to expose the undo data. This is related to the requests for a nice interface to the undo tree, we need to expose the undo tree data to the external/script world and let them do their thing (a graphical UI should be able to get the undo data and provide its own interface).

So, first steps would be to make it possible to get the undo tree from Kakoune. For that a text format must be defined. An undo tree is a tree of history nodes. history nodes can be reduced to a list of modifications and a timestamp. modifications are a tuple (insert/erase, buffer position, text content).

Second phase would be to make it possible to set the undo tree from a script. This is tricky because we need to validate that the undo tree is valid with the current state of the buffer.

@yshui

This comment has been minimized.

Copy link

yshui commented Jul 17, 2018

@mawww A history node should also contain its parent node. It is a undo tree after all.

And for validating, we can probably store and verify a checksum of the buffer.

@mawww

This comment has been minimized.

Copy link
Owner

mawww commented Jul 18, 2018

@yshui Right, the parent would be an index in the array. (so history node would be a tuple (parent index, timestamp, list of modifications).

Validating with a checksum is not safe enough, this is arbitrary data coming from an external source, Kakoune cannot trust it to be valid, it needs to either check it, or tolerate invalid undo data.

@yshui

This comment has been minimized.

Copy link

yshui commented Jul 18, 2018

@mawww I was thinking more about "how would the script figure out if the undo data is still valid" aspect.

I agree kakoune needs to check the undo data, but that's probably not difficult to do.

@mawww

This comment has been minimized.

Copy link
Owner

mawww commented Jul 26, 2018

Its pretty involved because each undo node in the undo tree applies to its parent buffer state, so in order to check that an undo node is valid, we need to compute its parent buffer state. Trying to do that without having to apply the modifications to the buffer each time might be a bit tricky to implement, I need to take a look.

For the script, it can do that however it wants, Kakoune will only provide an interface to set the undo tree, and complain if the undo tree is invalid based of the current buffer state. Its true that an eventual script doing undo tree restoration will likely use a checksum or similar to early validate the data its loading and notify the user if its out of date.

@dirkson

This comment has been minimized.

Copy link

dirkson commented Aug 4, 2018

+1 for this. After having persistent undo, I really can't imagine switching to an editor without it.

@eraserhd

This comment has been minimized.

Copy link
Contributor

eraserhd commented Dec 30, 2018

OK, below is my ... well, third or fourth ... draft of design for a format for history. Thoughts?

(I'm planning on doing the coding, whatever the final design, FYI.)

Goals

  1. Provide all information necessary for a "Persistent Undo" plugin.
  2. Provide all information necessary for a tool like parinfer-rust to inspect
    all the changes to the buffer since it was last seen.
  3. Enable for "undojoin"-like functionality: easily amend the current history
    node.
  4. Be easy to parse from POSIX shell.

Nice to have:

  1. Use one variable, allowing history to be replaced with set-option.
  2. Have the format be compact, to avoid environment variable limits.
  3. Able to support cursor positions or timestamps for each node in the future.

Data Items

.HistoryNode

  1. Parent
  2. Redo Child
  3. Undo group (a vector of modifications)

.Modification

  1. Insert/Erase
  2. Buffer coordinate
  3. Text

Format

The data is represented in ( entity, attribute, value ) triplets. "Entity"
values are the ids of history items being described. "Attributes" are
the single-character identifiers r for redo child, m for modification, and
p for parent. Additonal attributes such as selections or timestamps can be
added at a later time.

 0 r 1                   # node 1 is the redo child for node 0
 0 m '+|1.2|hello world' # node 0 has a modification adding "hello world" at 1.2
 0 m '-|2.7|foo'         # node 0 has a modification removing "foo" at 2.7
 1 p 0                   # node 1 has parent 0
 1 r 2                   # node 2 is the redo child for node 1
 1 m '+|2.2|bar'         # node 1 has a modification adding "bar" at 2.2
 1 a _                   # 1 is the active history_id
 2 p 1
 ...

While possible to further normalize the data by introducing modification ids
and describing modifications with ( entity, attribute, value ) triplets, I
figure this inflates the size of the data unnecesarily: the structure of
modifications seems more fixed than the structure of history nodes, and since
it is only three fields without special cases needing more or fewer attributes
(like our history root), let's not pay the penalty for extensibility.

Modifications are represented as op|coord|text strings. text is at the
end so it can contain arbitrary text, including pipe symbols, and still be
easily parsed by shell, like so:

 op="${change%%|*}"
 change="${change#*|}"
 coord="${change%%|*}"
 text="${change#*|}"
@Screwtapello

This comment has been minimized.

Copy link
Contributor

Screwtapello commented Dec 30, 2018

Kakoune has an undo tree, not just linear history. Therefore, it doesn't make much sense for a HistoryNode to have "a redo child" since it may have many redo children. How about:

  • a HistoryNode has a single parent
  • HistoryNodes do not mention anything about their children; whoever's reading the data can reconstruct that information from all the parent links
  • As well as HistoryNodes, the history data should include:
    • the ID of the "tip" HistoryNode, the one that "redo" steps toward
    • the distance from node 0 to the current state of the buffer

For example, if I start with an empty buffer and do:

ihello<esc>i world<esc>ui planet<esc>u

...I would expect the history tree to look like:

0 ""
|
1 "hello"
|\
| 2 "hello world"
|
3 "hello planet"

There are four nodes (0-3) and two tips (2, 3). The current tip is the second one (1, in zero-based numbering), so the linear undo chain would be 0 → 1 → 3. The current state of the buffer is 1 step along that chain, so the history would also need to include:

t 1
s 1

Instead of storing the index of the tip, we could store the actual tip node ID, but what if the given node ID wasn't a tip? Instead of storing the index of the current state, we could store the actual state node ID, but what if the given node ID wasn't on the path from the origin to the selected tip? Doing it this way makes it easier for tools to validate history, since there's fewer things that can go wrong.

There should also be some kind of signature for the current buffer state, to guarantee we aren't loading undo-history for the wrong file, or a file that's been externally modified.

Another thing to think about: do HistoryNodes represent the states of a buffer (like Git commit IDs) or the transitions between states?

Lastly, I know mawww said (insert/erase, position, content), but I wonder if it wouldn't be simpler to model things as: (position, old text, new text):

  • to apply a change, instead of having to check an "insert/erase" flag and do different things, we can just delete the length of the old text and insert the new text every time.
  • to undo a change, just call the "apply change" function with the old and new parameters swapped.
  • this also allows us to model a change or replace operation in a single node, instead of combining an insert and a delete.
@eraserhd

This comment has been minimized.

Copy link
Contributor

eraserhd commented Dec 30, 2018

On the point of redo child and modelling the “tip”:

one of my earlier drafts worked this way, but then I thought about this situation: you jump to a different, internal history node with some kind of undo-tree plugin. This node has multiple children and doesn’t appear on the path from the root to the redo tip. Then you type U to redo. What do we do? We can’t reach the previous tip from our new node, so we have no way of using that to figure out what redo must do.

In reality, there isn’t just one linear undo history, but as many as the tree has leaves, and Kakoune needs to decide what U means for every internal node.

Perhaps there is an algorithm? Maybe redo always travels towards the newest (most recently added) grandchild? Does this work if we jump then edit?

(I’m thinking about it more now, and I think it does work...)

Anyway, the “redo child” already exists in Kakoune’s data structures, which is why I modelled it here. If we can compute it, then we can remove it from the text format.

Anyway, way past my bedtime. I’ll think about the rest tomorrow.

@Screwtapello

This comment has been minimized.

Copy link
Contributor

Screwtapello commented Dec 30, 2018

... you jump to a different, internal history node with some kind of undo-tree plugin. This node has multiple children and doesn’t appear on the path from the root to the redo tip. Then you type U to redo. What do we do?

You got me to stop imagining how history might work, and actually look at the code. It turns out that Kakoune's internal state does indeed store a "redo child" for each node, and this value is essentially set arbitrarily as you wander the tree. Therefore, properly saving the undo history requires properly serializing the redo child data.

On the other hand, the "redo child" information will be mostly predictable (I bet most history nodes have exactly one child) and including a lot of redundant data violates "Have the format be compact". It also makes the format more difficult to parse, since the parser now has to validate that the redo_child is an actual child of the node in question.

How about this: we just record the history_id of the current buffer state, and don't record anything about "redo child" or "tip". At load time, we set all the "redo_child" fields to point at the highest-numbered child, except for nodes on the path from the root to the current state. That's simple, and only breaks if the user makes one change, undoes, makes a second change, uses <a-u> to go back to the first change, and undoes again.

Alternatively, do that, but record the "tip" (calculated by following "redo_child" fields as far as you can) as I described. Still pretty simple, and all the history nodes on the undo/redo chain will be correct at load time. Other history nodes will be "wrong", but the only way to get to them is via <a-u> and <a-U>. Those commands rewrite "redo_child" as they traverse the tree, and they can traverse the tree in an arbitrarily complex order (since they iterate by node ID rather than depth-first-search or whatever), so in general it's not a good idea to make assumptions about what "redo" will do after jumping to a history node, even today.

@eraserhd

This comment has been minimized.

Copy link
Contributor

eraserhd commented Dec 31, 2018

I've created this design document, and I'm keeping it updated based on chats.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment