-
Notifications
You must be signed in to change notification settings - Fork 715
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] experiment: stream buffer modifications to external process #3160
Conversation
There's a Murmur3 implementation already available in |
Thanks for the hint. The problem md5 solves is that PATH_MAX (4096 on my system) is longer than NAME_MAX (255), since I have to map an absolute path to a name for shm_open(). I need something collision free. I don't know how good MurmurHash3 is for that. Usually a cryptographic hash is used for that kind of thing. Switching to another hash for mapping or another method altogether (a counter?) later, won't be problem. I just mentioned it, because it's currently the biggest change, I guess there will be more difficult things to resolve, before the patch can be accepted. If it even works... 🤔 |
to extract information from the buffers interactively, allowing: * tree-sitter based highlighting * better kak-lsp/tree integration In my opinion the only obstacle for fluid integration is the need to send the full buffer to the external process on every update. Kakoune reacts very fast to external commands, I have scripts that issue quite a lot of commands in the background, Kakoune handles these smoothly. Disclaimer: Don't get your hopes up too high, I don't know if patch can reach acceptance.
@ganwell This is really interesting. I'm not sure what I think about it... It definitely is connected to a bunch of things I'm doing or pondering.
Also, I want to add buffer access to my 9p branch, #3116. It sure would be cool to come up with a mechanism that would allow us to pool efforts on these three or four points? (This branch also has the problem with unique names for buffers.) I have some skepticism about whether using a shared-memory database could improve performance, but I guess I'm also not sure exactly how it would work. Do client programs still stream the buffer contents? Is there generally one |
Shared-memory is the wrong term: shm_open creates a file on /dev/shm, other than that its just a normal file, the idea would work over the unix-socket, too. I see you extended the socket protocol, may be I use the unix-socket later. Currently shm_open is more about a simpler implementation for the prototype. With shm_open we have multiple readers for free, with a socket we have to send the modifications to each client. The journal-protocol will improve performance and memory-usage. A sync-client opens the original file using mmap, so pages are shared with other clients and the kakoune-server (copy on write). Kakoune streams serialized Buffer::Modification objects to the journal. The sync-client follows the journal (think tail -f) and applies all the modifications to its copy of the buffer (MAP_PRIVATE). The sync-client triggers a callback, which for example updates a tree-sitter AST. When the user saves the file, the clients update the buffer from the file-system (I don't know if mmap has functionality for that, or it means remapping the file) and the journal is truncated. New modifications are streamed in. If the idea works, I would create a rust library that implements the buffer-sync and tree-sitter-updates. Other projects can either use the library, running a binary per project (ie one for kak-lsp, one for kak-tree) or an umbrella-project could integrate these in one big binary. I'd rather have multiple binaries: unix-philosophy. A tree-sitter-based highlighter can read the journal and send range-commands into the unix-socket, no shell calls needed. Most other use-cases still need shell calls, but they have an up to date buffer ready (a persistent process is needed). I am not sure if the async nature of the journal, means that data-races make any plugin using it buggy, that's why I want to do a prototype with highlighting. I guess the timestamps should solve synchronization problems, but I don't understand the implications, yet. |
Streaming is implemented 🎉Please note none of this code is intended for merging, it is all just prototype quality. What I've learned
NextI will try if actual syntax-highlighting works. I guess I need to serialize the timestamp too. |
Why don't you write a client instead of having the server write everything to a file? You could create a new UI type that sends serialized information on the socket. |
I guess that is what I mean with "switch to the unix-socket": Extend the client protocol, with another type of client. In the end it is the best solution and has the best chance to be accepted. |
Hello, It is still unclear to me what you are trying to solve with this, do we have use cases where transmitting the whole buffer has unreasonable overhead ? Assuming it is the case, this seems to stream writes to the undo tree, I think a better approach would be to finalize the work on exposing the undo tree, then once you have a If this proves a bit too complex client side, we could expose say '$kak_modifications_from_ID_to_ID` that would expose the list of modifications to apply to get to some history id from another given one. @eraserhd @ul What are your requirements for transfering buffer data and buffer updates, is transferring full buffer too costly ? Is there an approach you expect to be easier to work with ? |
LSP requires sending entire buffer content to language server on any change. Therefore streaming changes might improve only Kakoune -> kak-lsp communication as less data needs to be transferred via stdio and then socket. It would come at additional cost of applying changes on the kak-lsp side before sending content to the language server. It wouldn't add cost of keeping the old buffer in kak-lsp memory as we already (unfortunately) have to do it to deal with position translation (https://github.com/ul/kak-lsp/#positioncharacter-interpretation). I guess for buffers less than hundreds of megabytes benefit would be marginal, if any. But hard to say without benchmarks. kak-tree might benefit from streaming changes as tree-sitter supports incremental parsing. But this comes at the cost of making kak-tree stateful. Right now any kak-tree command is just a one-shot executable invocation. In both cases, it's easier to work with the whole buffer transmission. I didn't notice much lag when using this strategy, but I must admit that I used Kakoune and kak-{lsp,tree} only on a high-end laptop. |
External realtime highlightingMy use-case is external, realtime highlighting using tree-sitter. Tree-sitter is an incremental parser. It has better highlighting quality than regexes, because it knows the context and what symbols refer to. It does incremental reparsing, which is much faster than parsing the whole file on every change. When I researched the possibilities, I read in several issues that external highlighting causes too much load. Also Tree-sitter wants edits (modifications). I didn't think about exposing the modifications to the environment, when I started this. [1] So I had the idea to stream the modifications: The modifications stream into the external highlighter and it sends highlighting commands back. ExperimentHowever I don't think this is a must-have feature. Kakounes current highlighting is good. It is interesting that LSP requires sending the whole buffer on every change. I assumed the way tree-sitter works would be common practice. This means there are a lot of implementations out there, that work fine by sending the whole buffer. Also TextDocumentSyncKind.Incremental suggests that LSP supports sending edits? Was it added later? @ul I am prepared to drop idea at any stage, if it is clear that the benefit is not worth the cost.. But I was even more skeptic before I run the simple buffer-client. It was surprisingly easy and even the current crude implementation did not cause any noticeable additional load. @mawww Assuming you wanted external, realtime highlighting and you wanted to send edits as tree-sitters interface suggests, how would you implement it? Do you think it would be worth the effort and cost? [1] I will test InsertChar/Delete plus exposing the modifications. If it really causes too much load, maybe a rate limited InsertChar/Delete would help? |
@ganwell What issues did you read that external highlighting caused too much load? I haven't seen any here, and Kakoune gets away with a lot of stuff that is conventionally "too slow". Sending the entire buffer twice on each keystroke (they are diffed) has had no performance issues for me. One of my machines is a 3Ghz AMD and the other is a modern Macbook. If there are machines or environments that don't perform well, we should look at whether we can improve the performance first. It's possible that parsing highlighter ranges pauses somehow. I've seen a pause that I can't explain in I'm personally excited about tree-sitter and external highlighters, and I'm pretty sure they could be run with the existing mechanisms. At least I haven't seen any counter-evidence yet. Also, I'll bet it would be best to run external highlighters on NormalIdle. I imagine in Kakoune it would use a |
I can't find them anymore, there was one where @mawww said that calling a shell on InsertChar would be a bad idea, but it was a long time ago (like around issue 500). The others were issues where people wanted to implement this or that in kakoune instead of using hooks,
I agree with you, if you use NormalIdle, it takes a very large file for any performance issues to appear. But I don't understand how this can be realtime. If you use NormalIdle, you type something, then you have to exit insert-mode, wait a tiny moment and then the text gets highlighted. On the other hand I just tried InsertIdle and you have to type very fast to block updates. I think InsertIdle plus NormalIdle is a good solution. But I still like to feed edits into tree-sitter and I want to only generate highlights for the text that has changed. #2673 solves that right? @mawww what do you think about InsertIdle plus NormalIdle plus exposing modifications? I am now completely convinced that streaming is not needed and I am glad, I wanted to focus on the tree-sitter integration and got distracted by this.
|
I was hoping to get back to #2673 before Clojure/conj this year, but that's only 1.5 weeks away, so I think it isn't going to happen :( #2673 isn't ready yet, and it was (for me) motivated by parinfer - but not for performance reasons. There are cases where parinfer wants to know exactly what's changed in order to know how to respond. For example, if you have |
Yes, you are right. I was about to answer that it's not widely supported (as it was at the time I checked it in the past), but I went and double-checked a few "popular" language servers, and most of them advertise incremental sync in server capabilities response. |
Correct. |
I started working on sync based on the history expansion. https://github.com/ganwell/hitch This is super WIP and the names are just working titles. |
to extract information from the buffers interactively, allowing:
tree-sitter based highlighting
better kak-lsp/tree integration
This issue communicates that I am working on this, currently everything I have to do is straight forward: Write a journal of modifications since last save, then create a test-program that uses the journal, and then we see....
In my opinion the only obstacle for fluid integration is the need to send the full buffer to the external process on every update. Kakoune reacts very fast to external commands, I have scripts that issue quite a lot of commands in the background, Kakoune handles these smoothly.
Disclaimer: Don't get your hopes up too high, it uses shm_open() (POSIX) and a public domain md5 implementation. My C++ isn't too good and I haven't mastered the kakoune code-style, yet. I will improve this, but first I want a running PoC. I don't know if patch can reach acceptance.