Notes on plugin architecture, for discussion #43

Merged
merged 1 commit into from
Apr 25, 2017

Conversation

raphlinus
Member

Ideas on building an architecture for plugins. Comments and suggestions welcome!

@raphlinus
Member Author

Thought this would be automatic, but I guess I have to add it by hand: Rendered

@leavengood

Nice document, I didn't know the syntax highlighting line state technique, but it makes a lot of sense.

My gut feeling is that syntax highlighting might be worth putting into the core, but a reasonable step to experiment with plugin-based syntax highlighting would be to develop a Rust crate with most of the functionality and then expose that in a simple plugin binary. If later it was deemed that syntax highlighting is indeed better in the core, the crate could be used directly there (maybe with some small changes as needed.) One advantage of the plugin approach is that different methods of syntax highlighting could be tried (such as LPeg vs Sublime's YAML files with regexp) without having to change the core or add dependencies to Lua or whatever. This could also allow for different approaches to syntax themes if that was deemed interesting, with all syntax highlighting plugins just sending spans with color information that the front-end faithfully renders. The other option is to leave all theming in the front-end with some sort of established naming convention for syntax scopes (identifier, string, etc) which are mapped to colors.

As for the configuration language I would lean towards TOML but as you note the fact that YAML is also needed to parse Sublime's new syntax definitions does make things more complicated. Though I doubt supporting both would add too much bloat to the core.

Beyond this I need to do more research on RPC approaches and differential synchronization before commenting.

Thanks for writing this document!


## Security

Plugins can potentially

Fill this in before merge :)

@disordinary

@leavengood you'd want to decouple colour from the backend and as you said just send through scopes with the content, similar to CSS - the benefit obviously being theming which can take advantage of whatever niceties the platform has.
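
A minimal sketch of the scope-based theming idea discussed above, assuming hypothetical names (`Span`, `Theme`, `color_for`) and made-up scope names and colors; none of this is xi's actual API. The plugin would send only scope names with each span, and the front-end would resolve a scope to a color by trying the full name and then successively shorter dotted prefixes, loosely in the way CSS selectors resolve to styles.

```rust
use std::collections::HashMap;

/// A styled region as a plugin might report it: a byte range plus a scope
/// name. Hypothetical type, for illustration only.
pub struct Span {
    pub start: usize,
    pub end: usize,
    pub scope: String,
}

/// A front-end theme: maps scope names (or scope prefixes) to RGB colors.
pub struct Theme {
    rules: HashMap<String, u32>,
    default: u32,
}

impl Theme {
    pub fn new(default: u32) -> Self {
        Theme { rules: HashMap::new(), default }
    }

    pub fn add_rule(&mut self, scope: &str, color: u32) {
        self.rules.insert(scope.to_string(), color);
    }

    /// Resolve a scope by trying the full name first, then shorter dotted
    /// prefixes: "string.quoted.double" -> "string.quoted" -> "string".
    /// Falls back to the default color when nothing matches.
    pub fn color_for(&self, scope: &str) -> u32 {
        let mut candidate = scope;
        loop {
            if let Some(&color) = self.rules.get(candidate) {
                return color;
            }
            match candidate.rfind('.') {
                Some(idx) => candidate = &candidate[..idx],
                None => return self.default,
            }
        }
    }
}
```

With this split, the back-end and plugins never see colors at all, so each front-end is free to map the same scopes onto whatever theming facilities its platform offers.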

@fjbfy51888-github

Loading latest commit…

@fjbfy51888-github

#44 Breaking changes coming up

@chriskrycho

Major upside to TOML over YAML is its readability, of course. I'd also note that doing anything for the sake of compatibility with Sublime is… maybe not a great plan. ST is incredibly popular, of course, but I think you're probably better with doing whatever's actually best and providing a conversion mechanism. I like YAML in principle, but it is incredibly hard to write correctly in my experience.

@raphlinus
Member Author

Since posting the above, I've done a lot more reading, and am now convinced that OTs are too complicated, differential synchronization relies on a poorly defined "patch" operation, and CRDTs are the One True Way. I'll update the concrete proposal after I've had time to refine the details a bit, but in the meantime here are some readings I highly recommend for those who want to follow along:

I recommend starting with Marc Shapiro's 2011 talk at Microsoft Research explaining the CRDT concept and the underlying mathematical concept of monotonic semi-lattice.

The concept of CRDTs actually started with the collaborative editing use case, before being generalized to broader cloud-flavored problems (it now powers eventual consistency in Riak, among other things). The WOOT paper (2006) is very good at comparing the new approach to operational transforms, but I'm definitely not going to adopt the details of that implementation.

Lastly, I recommend reading A comprehensive study of Convergent and Commutative Replicated Data Types (2011), which surveys many of the use cases and goes into more detail about the duality of the state-based and operation-based approaches to implementing CRDTs. It also provides a very useful retrospective on the application to collaborative text editing (WOOT and TreeDoc), placing the somewhat complex implementation details into a clean mathematical framework.
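
As a rough illustration of the state-based approach mentioned above, here is a minimal sketch of a grow-only set, one of the simplest CRDTs. It is not the design proposed in this PR and not WOOT or TreeDoc; the type and method names (`GSet`, `merge`) are made up for the example. The point is that `merge` is a semilattice join (set union), so it is commutative, associative, and idempotent, and replicas converge regardless of the order in which they exchange state.

```rust
use std::collections::BTreeSet;

/// Grow-only set (G-Set): a very simple state-based CRDT.
/// Illustrative only; names are assumptions, not from any xi code.
#[derive(Clone, Debug, PartialEq, Eq, Default)]
pub struct GSet {
    items: BTreeSet<String>,
}

impl GSet {
    pub fn new() -> Self {
        GSet::default()
    }

    /// Local update: the state only ever grows (monotonic).
    pub fn insert(&mut self, item: &str) {
        self.items.insert(item.to_string());
    }

    /// Join of two states: the least upper bound under set inclusion,
    /// which is simply the union.
    pub fn merge(&self, other: &GSet) -> GSet {
        GSet {
            items: self.items.union(&other.items).cloned().collect(),
        }
    }

    pub fn contains(&self, item: &str) -> bool {
        self.items.contains(item)
    }
}
```

A text-editing CRDT needs considerably more structure than this (ordering of characters, deletion), but the same three algebraic properties of `merge` are what make convergence work.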

And yes, I'll definitely want to flesh out the security section. A situation like neovim is completely unacceptable imho.

@disordinary

As someone who has only had a limited need to develop for mobile platforms, I'm wondering whether communicating between the view layer and the editor itself via stdin and stdout, rather than FFI, might cause issues on mobile platforms?

@chriskrycho

chriskrycho commented May 6, 2016

That's been a point of some curiosity on my part as well. I don't believe you can do XPC this way on iOS in general, because of the sandboxing rules and restrictions on running things like Python.

@leavengood

Personally I think mobile platforms are the last target for something like a programmer's editor, which as far as I know is what Xi is meant to be.

Therefore I think for the moment mobile should not be a concern. But long term the core could probably be used as a library on mobile rather than through the stdin/stdout mechanism.

@chriskrycho

chriskrycho commented May 6, 2016

Define 'mobile platform', though: I'm typing this from an iPad Pro, which has astounded me with how capable it is even for programming tasks. I think the question of whether that's in line with @raphlinus' goals is a totally fair one, but distinct from whether the platform is capable of supporting it (and that one has a clear answer: it is). There are of course plenty of reasons why mobile devices (whether Android, iOS, or other) are not a go-to yet, some of them technical and some of them cultural, but I wouldn't count on that remaining the case over even the medium term.

To be clear, I think opting not to support mobile devices is totally fair and reasonable! I note it not because it's a worry for me with regard to this project—about which, honestly, I'm still just kind of giddy—but more just as a point of curiosity and consideration.

I'll leave my thoughts there, as I don't want to derail this particular issue, and like I said I'm mostly just curious.

@disordinary

@leavengood I don't know, like @chriskrycho I use a fully functioning Bluetooth keyboard on an iPad Pro and could easily see myself hacking on it while on the road. I've also used a keyboard with Android in the past for writing.

The other thing is that while there are Sublime, Atom, Visual Studio Code, etc. on PC platforms, there isn't a decent cross-platform text editor that works on mobile platforms, which are increasingly becoming productivity tools. I think it's reasonable to think about it, and it could be that the core is able to communicate with a client either over stdin/stdout or as an embedded library (although that would likely break all plugins and syntax highlighting as currently discussed).

So very likely the goal of this project is not to support mobile platforms at all, which as @chriskrycho mentioned is perfectly fine, it's just worth thinking about at this early stage.

@raphlinus
Member Author

raphlinus commented May 6, 2016

Yes, I am thinking about mobile. If anything, I think asynchrony is even more important to get right because the input method is in a separate process and generally does more complex edits than a typical hardware keyboard. Figuring out the interaction between that and, say, undo, is not simple.

Subprocesses work great on Android, but (as I understand it) are not allowed on iOS App Store apps. Running the front-end and core in separate threads should be fine. The plugin story will be more restrictive, but I think at least some subset should still be doable (say, ones written in Rust and bundled with the app, probably also JavaScript). I believe that JSON for the IPC is still viable on mobile, but I can imagine replacing or augmenting it with a binary transport (maybe even using shared memory buffers to cut down on copies).

I definitely want to make xi good on desktops first before getting too distracted by mobile, but I totally agree it's worth thinking about and planning for.

To the RFC, I'm planning on updating it soon. I've explored CRDT-land and do believe I can come up with a design that's a lot simpler than the fully distributed peer-to-peer protocols, yet allows high degrees of asynchrony from the plugins and input method. More soon.

@leavengood

You make fair points, guys. I'm still using my MacBook Pro as my development machine and my iPad Air as a consumption machine, but I can see how we are moving toward mobile devices being productivity machines too.

In the time I've thought about this project I think the safest and most flexible approach is to build all the various pieces as standalone parts that have clear interfaces and that can be used as either a library or through the stdin/stdout mechanism (which really would just be a sort of "frontend" on the library part.) If most of the main components are in Rust it would always be an option to compile them into the core. For example I'm starting work on a Rust syntax highlighting crate which hopefully can prove useful and it could serve as a testbed for trying out SH in a plugin. If that proves too slow or otherwise is not working well, the same code should be usable directly by the core.

And to bring this back to the topic of mobile, if a lot of the pieces of Xi are like that then they could be compiled into the core for mobile. But overall I think we just need to see what happens.

@rkusa
Contributor

rkusa commented May 6, 2016

@raphlinus What use cases do you have in mind regarding the OT/CRDT stuff? Multiple frontends connected to one backend? Or also multiple backends connected together? Or is it mostly about asynchronous updates from plugins?

@raphlinus
Member Author

@rkusa What I plan on implementing in short order is a simple engine that evaluates the CRDT serially, for asynchronous updates from plugins. However, I'm imagining that the engine could be replaced with one that actually does distributed CRDT without affecting the rest of the editor.

@rkusa
Contributor

rkusa commented May 6, 2016

@raphlinus Thanks for the info. Since in the plugin case the backend is the single point all plugins report to, OT would work perfectly fine. I read the papers you linked some time ago, but if I remember correctly, using CRDTs for collaborative/concurrent text updates introduces some overhead (e.g. in data structure) compared to OT. Anyway, I think most CRDTs are easier to understand/implement, and since CRDTs are better with regard to additional future features: +1 for CRDTs

raphlinus added a commit that referenced this pull request May 7, 2016
My latest thinking on how to make async plugins and undo work well.
This is relevant to pull request #43 (plugin architecture RFC).
@trishume
Collaborator

trishume commented Jul 5, 2016

@raphlinus I just read through this and thought I might weigh in on syntax highlighting specifically:

There's a possible middle ground between including syntax highlighting in the core, and having it be just another plugin that uses a JSON RPC API to place edits on the rich text annotations.

As I calculated earlier, syntax highlighting on large files can be quick if the serialization overhead is low. But I worry about using the normal JSON API, especially since an API that adds edits to the rich text sounds like it would have high overhead. With parse state caching, streaming, and incremental rendering you could make it work for common cases, but there would still be laggy cases, like changing the syntax highlighting mode at the bottom of a long file or scrolling quickly soon after opening a file.

I'm thinking that perhaps there could be a special API for plugins that want to annotate buffers with Sublime-style scopes that uses a custom binary protocol (perhaps with Cap'n Proto/Thrift). This could have low enough overhead that RPC is not an issue.

Alternatively, just a special JSON protocol optimized for low overhead might work. Perhaps something like sending large batches of updates to text styles as JSON arrays to avoid re-sending key names a billion times.
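
To make the batching idea concrete, here is a minimal sketch comparing two encodings of the same style updates: one JSON object per update (repeating every key name) versus a single message that names the fields once and then sends a flat array of numbers. The field names (`line`, `start`, `end`, `style`) and the message shape are assumptions for illustration, not xi's protocol, and the JSON is built by hand with `format!` only to keep the example free of external crates.

```rust
/// One style update: (line, start column, end column, numeric style id).
/// The field layout is an assumption made for this example.
pub type StyleUpdate = (u32, u32, u32, u32);

/// Verbose encoding: one JSON object per update, so every key name is
/// repeated for every span.
pub fn encode_verbose(updates: &[StyleUpdate]) -> String {
    let objects: Vec<String> = updates
        .iter()
        .map(|&(line, start, end, style)| {
            format!(
                "{{\"line\":{},\"start\":{},\"end\":{},\"style\":{}}}",
                line, start, end, style
            )
        })
        .collect();
    format!("[{}]", objects.join(","))
}

/// Batched encoding: the key names appear once, followed by a flat array
/// holding four numbers per update.
pub fn encode_batched(updates: &[StyleUpdate]) -> String {
    let mut numbers: Vec<String> = Vec::with_capacity(updates.len() * 4);
    for &(line, start, end, style) in updates {
        numbers.push(line.to_string());
        numbers.push(start.to_string());
        numbers.push(end.to_string());
        numbers.push(style.to_string());
    }
    format!(
        "{{\"fields\":[\"line\",\"start\",\"end\",\"style\"],\"data\":[{}]}}",
        numbers.join(",")
    )
}
```

The batched form pays a fixed cost for the field list, so it only wins once a message carries more than a handful of updates, which is the expected case for highlighting.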

Also, I bet you'll need a special case in the CRDTs for syntax highlighting so as not to overload them with millions of edits per second. An easy way is to not store past highlighting and just re-highlight on undo. Also, if you plan on relying heavily on caching to make undo fast, you may also have to cache past parse states (or worse, rely on plugins to cache past parse states).

Another thing to think about (can be combined with any of the above) is the representation of semantic annotations. I really like Sublime/TextMate scopes, and that is what syntect outputs. The thing is, without a good representation they're super costly, because with strings you can end up with something like a 50-character string for every 5 characters of text. I use a string interning method that also allows fast prefix checks for super fast colouring; you might want to borrow that from syntect (or just use it as a library, it is public). When giving data to plugins you could just send them the numbers, which are all you need to check equality and prefixes, and then give them an API to access the interning table so they can stringify them and intern their own scopes/selectors.
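
One way such interning could look is sketched below. This is an illustration of the general technique, not syntect's actual implementation: the names (`Interner`, `Scope`, `is_prefix_of`, `resolve`) and the layout (up to four atoms of 16 bits each packed into a `u64`) are assumptions chosen to keep the example short. Each dotted atom gets a small integer id, so equality is one integer comparison and a prefix check is a mask and a comparison.

```rust
use std::collections::HashMap;

/// A scope packed into one integer: up to four atoms, 16 bits each,
/// first atom in the most significant bits; 0 marks an unused slot.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Scope(u64);

/// Interning table shared by whoever needs to turn ids back into names.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, u16>,
    names: Vec<String>,
}

impl Interner {
    pub fn new() -> Self {
        Interner::default()
    }

    fn atom_id(&mut self, atom: &str) -> Option<u16> {
        if let Some(&id) = self.ids.get(atom) {
            return Some(id);
        }
        if self.names.len() >= u16::MAX as usize {
            return None; // table full
        }
        // Ids start at 1 so that 0 can mean "empty slot".
        let id = (self.names.len() + 1) as u16;
        self.names.push(atom.to_string());
        self.ids.insert(atom.to_string(), id);
        Some(id)
    }

    /// Intern a dotted scope name. Returns None for more than four atoms
    /// or when the atom table is full.
    pub fn intern(&mut self, scope: &str) -> Option<Scope> {
        let mut packed = 0u64;
        let mut count = 0u32;
        for atom in scope.split('.') {
            if count == 4 {
                return None;
            }
            let id = self.atom_id(atom)?;
            packed |= u64::from(id) << (48 - 16 * count);
            count += 1;
        }
        Some(Scope(packed))
    }

    /// Turn a packed scope back into its dotted string form.
    pub fn resolve(&self, scope: Scope) -> String {
        let mut atoms = Vec::new();
        for i in 0..4u32 {
            let id = ((scope.0 >> (48 - 16 * i)) & 0xFFFF) as usize;
            if id == 0 {
                break;
            }
            atoms.push(self.names[id - 1].as_str());
        }
        atoms.join(".")
    }
}

impl Scope {
    fn atom_count(self) -> u32 {
        (0..4u32)
            .filter(|&i| ((self.0 >> (48 - 16 * i)) & 0xFFFF) != 0)
            .count() as u32
    }

    /// True if every atom of `self` matches the leading atoms of `other`.
    pub fn is_prefix_of(self, other: Scope) -> bool {
        let used = self.atom_count();
        if used == 0 {
            return true;
        }
        let mask = !0u64 << (64 - 16 * used);
        (other.0 & mask) == self.0
    }
}
```

A plugin holding only the packed numbers can do equality and prefix tests without ever touching strings, and needs the interning table only when it wants to display a scope or register a new one.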

My only non-highlighting related comment is that 20 undos is way too few for me. I use undo all the time. I undo over 50 things more often than I open large files even, so I would take an editor that offered long undo history over large file support (although I rarely use both at the same time, so that might be an out).

@raphlinus
Member Author

@trishume Sorry for taking so long to respond. Life has been busy, it's been hard to find time for xi. I'm basically going to start coding on a prototype to get myself unstuck. I think I have a pretty good design. It's more line oriented than what I was thinking before (

To address some of the things you raised: I'll definitely send deltas; sending the whole file goes against the spirit of the project. I'll avoid difficulties with undo and the like by making syntax highlighting a transient layer, basically downstream of all edits to the buffer. There's a small amount of logic needed to handle deltas (you move the spans along with the underlying text they annotate, then the plugin may or may not replace the spans on any given line later). I'll have a "get" RPC from the plugin to the core which is about sending chunks of up to a megabyte or so (but aligned on line boundaries, unlike my previous ideas). This lets the plugin keep as much or as little state as it likes; in the latter extreme, it can just scan through the whole file, fetching chunks and replacing the spans line-by-line.
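
The "move the spans along with the text" step could look roughly like the sketch below. This is an assumption-laden illustration, not the prototype's code: `Span`, `Delta`, and `shift_spans` are hypothetical names, offsets are plain byte offsets, and a delta is modeled as a single replacement of the range `start..end` by `inserted` bytes (with `start <= end` assumed). Spans that collapse to nothing because their text was deleted are dropped; a plugin could later replace the spans on affected lines.

```rust
/// A highlighted byte range with a numeric style id (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
    pub style: u32,
}

/// A single edit: bytes `start..end` are replaced by `inserted` new bytes.
/// Assumes `start <= end`.
#[derive(Clone, Copy, Debug)]
pub struct Delta {
    pub start: usize,
    pub end: usize,
    pub inserted: usize,
}

/// Map one offset from the old text to the new text.
fn map_offset(offset: usize, d: &Delta) -> usize {
    if offset <= d.start {
        // At or before the edit: unchanged.
        offset
    } else if offset >= d.end {
        // After the edit: shift by the net change in length.
        offset - (d.end - d.start) + d.inserted
    } else {
        // Inside the replaced range: collapse to the start of the edit.
        d.start
    }
}

/// Carry spans across an edit, dropping any span that becomes empty.
pub fn shift_spans(spans: &[Span], d: &Delta) -> Vec<Span> {
    spans
        .iter()
        .filter_map(|s| {
            let start = map_offset(s.start, d);
            let end = map_offset(s.end, d);
            if start < end {
                Some(Span { start, end, style: s.style })
            } else {
                None
            }
        })
        .collect()
}
```

One design choice baked into `map_offset` is that text inserted exactly at a span's start becomes part of that span, while text inserted exactly at its end does not; other policies are equally plausible, and since the layer is transient the plugin can correct the result when it re-highlights the line.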

The prototype is an experiment. If performance is good, I'll stick with it, and this will make things very good for people wanting to experiment with alternative techniques for syntax highlighting (PEG parsers, etc). If performance is not awesome, we very much could move to a more optimized interprocess communication model, maybe with binary serialization, maybe with shared memory buffers. But at that point, I think it would be simpler (and more performant) to just bring it into the core and share ropes across thread boundaries. But we'll see.

I hear you regarding the undos. In fact, a large part of what's been distracting me has been researching Operational Transforms and CRDTs. The current design is very much n^2 in the number of undos, so increasing it too much is expensive. I have some ideas for how to reduce that, but unfortunately it's pretty complicated, so I'd like to get back to implementing basic functionality (including highlighting!) before spending too much more time on it.

@raphlinus
Member Author

A heads-up for people following this issue: I've made considerable progress on the plugin protocol recently, to the point where it wouldn't be crazy for someone to start prototyping a plugin for syntax highlighting. I've written a bit more at a thread on the xi_editor subreddit.

@raphlinus raphlinus mentioned this pull request Jul 22, 2016
2 tasks
@raphlinus raphlinus merged commit 7342e10 into master Apr 25, 2017
@raphlinus
Member Author

I'm going to merge this stuff even though it leaves the docs in a far-from-perfect state. Having the content in the repo and iterating from there seems better than holding PRs open for a long time without activity.

@raphlinus raphlinus deleted the plugin-rfc branch May 6, 2017 03:39
lord pushed a commit to lord/xi-editor that referenced this pull request Oct 31, 2018
My latest thinking on how to make async plugins and undo work well.
This is relevant to pull request xi-editor#43 (plugin architecture RFC).

9 participants