byte positions, more information in listener_add #5086

andymass · 2019-10-19T01:02:57Z

Is your feature request related to something that is currently hard to do? Please describe.

I'm working with a server utility that needs buffer updates to update its internal model of the text. listener_add callbacks seem designed for this purpose:

    func Listener(bufnr, start, end, added, changes)
        changes = {lnum, end, added, col}

However, I can't figure out how to make this work with my use-case, which requires both line/col and byte positions, e.g., an insertion of two chars looks like:

class Edit {
    start_byte = 5, old_end_byte = 5, new_end_byte = 5 + 2,
    start_point = (0, 5), old_end_point = (0, 5), new_end_point = (0, 5 + 2),
}

Column position doesn't matter much; I can just use the start/end of lines if necessary. But I can't figure out how to get byte positions at all. It won't work, for instance, to keep track of the buffer text before and after, because we only have the current buffer's state and there may be numerous disjoint changes in a single call to Listener. The callback doesn't receive the text before/after or any byte counts.

Describe the solution you'd like

It would be helpful to receive byte positions as well as line/column. If the edit is confined to a single line, I would also expect to be able to determine its length in bytes.

Neovim solved this issue by adding "old_bytecount" to their listener callback:
neovim/neovim@b0e2619

With that, it's possible to determine the new end byte, since you can inspect the buffer. This doesn't work in Vim (as far as I know) because the changes are batched up.

Describe alternatives you've considered

I considered taking the changes and forming the smallest enclosing contiguous region, calling this a single edit, and using line2byte. This seems less efficient than using the granular listener changes.
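The "smallest enclosing region" alternative can be sketched as below. This is only an illustration in Python, not Vim code: `enclosing_region` is a hypothetical helper, and each change is assumed to be a dict with the `lnum`, `end` and `added` keys named above. It combines the entries naively and ignores that later entries may use line numbers that already reflect earlier changes.

```python
def enclosing_region(changes):
    """Collapse a batch of listener changes into one line-granular region.

    Each change is assumed to be a dict with 'lnum' (first changed line),
    'end' (first line below the change) and 'added' (lines added, negative
    when lines were deleted). Line numbers are combined naively.
    """
    start = min(c["lnum"] for c in changes)   # earliest changed line
    end = max(c["end"] for c in changes)      # furthest line below any change
    added = sum(c["added"] for c in changes)  # net change in line count
    return start, end, added


batch = [
    {"lnum": 3, "end": 4, "added": 0, "col": 1},
    {"lnum": 10, "end": 12, "added": -2, "col": 1},
]
region = enclosing_region(batch)
```

The cost mentioned above is visible here: two small edits seven lines apart become one region spanning lines 3 to 11, all of which would have to be reported as changed.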


brammool · 2019-10-19T12:11:03Z


Do I understand correctly that you only need the byte count of the affected text? That sounds doable. You don't need the byte offset from the start of the buffer? I hope not, because that would be expensive.

The Neovim patch is confusing: the title mentions old_byte_size, the comment old_bytecount, and the implementation appears to use dirty_bytes. Anyway, it doesn't look similar to what Vim does, so doing the same is not helpful.

We could add the byte count of the affected lines before the change, if that is sufficient. Does it help to add the byte count afterwards?


andymass · 2019-10-30T23:04:13Z

> You don't need the byte offset from the start of the buffer?

Unfortunately, I do ultimately need the absolute byte offsets of the start and end of both the old and the new text. If there were exactly one edit, it would suffice to use line2byte for the start and end positions, provided the callback supplied the old byte count and the end position of the affected text.
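For the single-edit case, the arithmetic could look like the following Python sketch. It is an illustration under assumptions, not an existing API: `old_bytecount` is the hypothetical extra field discussed in this thread, `single_edit` and `line_bytes` are made-up names, and the buffer is modelled as a list of strings whose lines each end in a one-byte newline. Offsets are 0-based and line-granular, unlike Vim's 1-based line2byte().

```python
from dataclasses import dataclass


@dataclass
class Edit:
    start_byte: int
    old_end_byte: int
    new_end_byte: int
    start_point: tuple
    old_end_point: tuple
    new_end_point: tuple


def line_bytes(line):
    """UTF-8 size of one line plus its trailing newline."""
    return len(line.encode("utf-8")) + 1


def single_edit(lines, lnum, end, added, old_bytecount):
    """Build a line-granular Edit for exactly one change.

    lines: buffer contents AFTER the change, without newlines.
    lnum, end, added: as in a listener change entry (1-based line numbers;
    end is the first line below the change, in the old numbering).
    old_bytecount: HYPOTHETICAL field, bytes the changed lines held before.
    """
    start_byte = sum(line_bytes(l) for l in lines[: lnum - 1])
    new_end = end + added  # first line below the change, new numbering
    new_bytecount = sum(line_bytes(l) for l in lines[lnum - 1 : new_end - 1])
    return Edit(
        start_byte=start_byte,
        old_end_byte=start_byte + old_bytecount,
        new_end_byte=start_byte + new_bytecount,
        start_point=(lnum - 1, 0),
        old_end_point=(end - 1, 0),
        new_end_point=(new_end - 1, 0),
    )


# Line 2 changed from "wörld" (7 bytes with newline) to "wörld!!".
edit = single_edit(["hello", "wörld!!"], lnum=2, end=3, added=0, old_bytecount=7)
```

This only holds when the buffer being inspected differs from the old state by that one change.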

But since there are multiple changes per callback, I think this wouldn't work, so we would need both the old and the new byte counts of the affected text. The complication is that each entry in the list describes one partial change, while the buffer I can inspect already reflects all of the changes.

prabirshrestha · 2020-05-03T01:34:18Z

Curious if there was any solution for this. I was starting to look at the change listener to add it to vim-lsp and faced the same issue. This would impact folks using multi-byte characters, and LSP itself mainly uses UTF-16.
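To illustrate why multi-byte characters matter here, the following Python sketch converts a UTF-8 byte column into a UTF-16 code-unit column. The function name `byte_col_to_utf16_col` is made up for this example, and it assumes the byte column falls on a character boundary.

```python
def byte_col_to_utf16_col(line, byte_col):
    """Convert a 0-based UTF-8 byte offset within a line to UTF-16 code units.

    Raises UnicodeDecodeError if byte_col falls in the middle of a character.
    """
    prefix = line.encode("utf-8")[:byte_col].decode("utf-8")
    # Each UTF-16 code unit is two bytes; characters outside the BMP take two units.
    return len(prefix.encode("utf-16-le")) // 2


line = "aé😀b"  # UTF-8 sizes: 1, 2, 4, 1 bytes; UTF-16 sizes: 1, 1, 2, 1 units
col_before_b = byte_col_to_utf16_col(line, 7)
```

For pure ASCII lines the two columns coincide, which is why the mismatch tends to show up only once non-ASCII text is edited.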

andymass · 2020-05-06T21:12:02Z

@prabirshrestha my concern was not UTF-16 but keeping track of absolute byte positions. The solution there was to keep an array of byte positions, which you update whenever a listener is called. You could probably keep track of UTF-16 start-of-line positions in a similar way, assuming you could compute the number of UTF-16 encoded bytes in a particular line.
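A minimal sketch of such an array, in Python for illustration only: it stores one byte length per line and splices the affected range on each update. The class name `LineByteIndex` and its methods are hypothetical, and how the new line contents are fetched from the buffer inside the listener is left out.

```python
class LineByteIndex:
    """Per-line UTF-8 byte lengths (newline included), kept in sync by splicing."""

    def __init__(self, lines):
        self.lengths = [len(l.encode("utf-8")) + 1 for l in lines]

    def apply_change(self, lnum, end, new_lines):
        """Replace old lines lnum..end-1 (1-based, end exclusive) with new_lines."""
        self.lengths[lnum - 1 : end - 1] = [
            len(l.encode("utf-8")) + 1 for l in new_lines
        ]

    def line_start(self, lnum):
        """Absolute 0-based byte offset of the start of 1-based line lnum."""
        return sum(self.lengths[: lnum - 1])


index = LineByteIndex(["ab", "cd", "ef"])
before = index.line_start(3)
# Line 2 ("cd") is replaced by two lines, "c" and "ddd".
index.apply_change(2, 3, ["c", "ddd"])
after = index.line_start(4)
```

Storing lengths rather than absolute offsets keeps each update local to the changed lines; the price is that `line_start` sums a prefix on every lookup, which a prefix-sum tree could avoid for large files. The same structure would work for UTF-16 by swapping the length function.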

andymass added the enhancement label Oct 19, 2019

prabirshrestha mentioned this issue May 23, 2020

Typing delay in large files - performance natebosch/vim-lsc#273

Closed
