structured/"rich" text, text annotations/overlay #1767

phmarek · 2015-01-03T14:47:07Z

With the new plugin structure and things like the MessagePack API, it becomes much easier to have external plugins; one thing that is still awkward is the coloring of information.

I'm thinking about a plugin here (slimv, to be exact, although many others would have the same issue) that wants to have its own buffer to display arbitrary information. To get some highlighting (of "active" fields and other parts), there have to be syntax rules that operate via string matching, and that is

cumbersome (needs separators that don't exist in the text)
not nice re cursor movement (think concealcursor, conceallevel)
slow (for big buffers)
unnecessary.

How about being able to specify "classes" for (parts of) text, so that the syntax coloring rules can be applied directly without needing the RE engine in between? I imagine sending not a plain string for a line, but something like ["a string ", { class: "Error", text: "some error string"}, "more plain text"].
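To make the proposed line format concrete, here is a minimal Python sketch of how a receiver might split such a line into plain text plus highlight spans. The function name `flatten_rich_line` and the `(start, end, class)` span tuples are assumptions made for illustration; nothing like this is specified in the thread, and columns are counted in characters, not bytes.

```python
def flatten_rich_line(segments):
    """Return (plain_text, spans) for a line made of strings and dicts.

    Each span is a (start, end, class_name) tuple with an exclusive end.
    """
    parts = []
    spans = []
    col = 0
    for seg in segments:
        if isinstance(seg, str):
            text, cls = seg, None
        else:
            text, cls = seg["text"], seg.get("class")
        if cls is not None:
            spans.append((col, col + len(text), cls))
        parts.append(text)
        col += len(text)
    return "".join(parts), spans


# The example line from the proposal above.
line = ["a string ", {"class": "Error", "text": "some error string"}, "more plain text"]
text, spans = flatten_rich_line(line)
```

With this shape, the class names can be mapped to highlight groups directly, without any regex matching over the text.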

If that could be stored in Neovim directly it should make lots of things easier - especially for the plugins if they could store (some) arbitrary information in the dictionary as well! (Currently such things have to be put into the line, concealed in some way, and then matched and parsed out again, which is awful.)

To give a specific example - currently a line looks like this:

{[10] "types" []} = {[11] #<HASH-TABLE :TEST EQUALP :COUNT 3 {1009D95AD3}> []} {<3> [remove entry] <>}

while the visual display is

"types"  = #<HASH-TABLE :TEST EQUALP :COUNT 3 {1009D95AD3}>  [remove entry]

(and includes colors, of course).


justinmk · 2015-01-03T16:54:45Z

> How about being able to specify "classes" for (parts of) text, so that the syntax coloring rules can be applied directly without needing the RE engine

#719 (comment) mentions scintillua, which looks like a very good alternative to regex for describing syntax.

It doesn't solve your more general request, which I interpret as "text properties". Text properties are very important but will require careful design, and won't be possible until we achieve more fundamental changes such as abstracting the buffer structure. Currently in the n/vim core, a buffer is basically a character array (memline.c); trying to bolt-on text properties will result in terrible performance.

fmoralesc · 2015-01-03T19:09:59Z

vis's basic data structure (a piece chain) seems good for this kind of thing, but moving to that could require a major rewrite of the regexp engine.

aktau · 2015-01-03T20:28:39Z

> vis's basic data structure (a piece chain) seems good for this kind of thing, but moving to that could require a major rewrite of the regexp engine.

I saw vis appear on HN too, and it intrigues me a lot. Though at some points in the description I thought: we can't omit that (feature) as he has done, (n)vim needs it.

Being able to mmap files into the memory space is really cool, but it already breaks down when input conversion needs to be done, as @martanne said (plus the fact that vim currently scans over the entire file to determine the encoding, which is fine since it has to read it into allocated memory anyway).

justinmk · 2015-01-03T20:35:28Z

@aktau If a buffer is opened with :e ++enc=utf8 we could avoid conversion and scanning. And we could provide a user option that says "always assume utf8".

aktau · 2015-01-03T20:38:30Z

> @aktau If a buffer is opened with :e ++enc=utf8 we could avoid conversion and scanning. And we could provide a user option that says "always assume utf8".

Yes, if the fenc (forced or not) is the same as enc, then a read-only mmap could be done. Otherwise not so much.

Most likely this is the majority case. But for example this would break down when opening binary files which are likely to fail the utf-8 test, which is usually when people really need large file support.

I'm also not sure of the impact of such a split-up on the manageability of the code. As I understand it, the current mem{line,file} combo is actually quite clean.

phmarek · 2015-01-03T20:39:56Z

@aktau as soon as there's a NUL byte in the first few kBytes, the encoding should be seen as "binary", so the UTF8 test shouldn't matter here.
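The heuristic described here can be sketched in a few lines of Python. This only illustrates the idea of probing the start of a file for a NUL byte; the function name `looks_binary` and the 8 KiB probe size are assumptions, and this is not a description of how Vim or Neovim actually detect encodings.

```python
def looks_binary(data: bytes, probe_size: int = 8192) -> bool:
    """Guess that data is binary if a NUL byte appears in the first probe_size bytes."""
    return b"\x00" in data[:probe_size]


has_nul = looks_binary(b"\x7fELF\x02\x01\x01\x00")  # NUL near the start
plain = looks_binary(b"just some text\n")            # no NUL at all
```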

phmarek · 2015-01-03T20:41:38Z

Well, it's equally possible to keep that memory layout, and "just" have an additional map (array or hash or tree or whatever) that can deliver additional details for parts of a line... don't know whether that's the same kind of rewrite, though.

aktau · 2015-01-03T20:41:51Z

> @aktau as soon as there's a NUL byte in the first few kBytes, the encoding should be seen as "binary", so the UTF8 test shouldn't matter here.

As far as I can remember (got an unfinished blog post about this), if enc is utf-8 (which it most likely is), and fenc is something else, conversion will take place. I'm not even sure if binary is a vim encoding, I don't think so.

aktau · 2015-01-03T20:43:05Z

> Well, it's equally possible to keep that memory layout, and "just" have an additional map (array or hash or tree or whatever) that can deliver additional details for parts of a line... don't know whether that's the same kind of rewrite, though.

Like some sort of conversion overlay, you mean. This would work well for latin-X to utf-8 or the reverse, but more distinct encodings would probably suffer.

aktau · 2015-01-03T21:08:16Z

All that said, I think perhaps an on-the-fly conversion overlay could work. It would only be generated when a certain piece of the buffer is actually requested. I shudder at the thought of implementing this without any crazy bugs though. The thought-stuff sure is enticing.
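A rough Python sketch of such an on-the-fly overlay follows. It keeps the raw bytes untouched and decodes a line only when it is requested, caching the result. The class name `ConversionOverlay` and its methods are hypothetical, and the line index relies on the encoding being ASCII-compatible (for example latin-1), which matches the "latin-X" case mentioned above and would not hold for encodings such as UTF-16.

```python
class ConversionOverlay:
    """Decode lines of a raw byte buffer lazily, caching what was requested."""

    def __init__(self, raw: bytes, encoding: str = "latin-1"):
        self._raw = raw
        self._encoding = encoding
        self._cache = {}  # line index -> decoded str
        # Record the byte offset at which every line starts. Searching for
        # the newline byte is only valid for ASCII-compatible encodings.
        self._starts = [0]
        pos = raw.find(b"\n")
        while pos != -1:
            self._starts.append(pos + 1)
            pos = raw.find(b"\n", pos + 1)
        if len(self._starts) > 1 and self._starts[-1] == len(raw):
            self._starts.pop()  # a trailing newline does not open a new line

    def line_count(self) -> int:
        return len(self._starts)

    def line(self, index: int) -> str:
        """Return one decoded line, converting it on first access only."""
        if index not in self._cache:
            start = self._starts[index]
            if index + 1 < len(self._starts):
                end = self._starts[index + 1]
            else:
                end = len(self._raw)
            chunk = self._raw[start:end].rstrip(b"\n")
            self._cache[index] = chunk.decode(self._encoding)
        return self._cache[index]

    def converted_lines(self) -> int:
        """Number of lines that have actually been decoded so far."""
        return len(self._cache)


raw = "café\nnaïve\nplain\n".encode("latin-1")
overlay = ConversionOverlay(raw, "latin-1")
second = overlay.line(1)  # only this line is decoded
```

Note that building the line index still scans the whole buffer once, so this sketch saves conversion work but not the initial pass over the file.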

justinmk · 2015-01-03T21:34:56Z

On the other hand, large files support seems like yet another concept that could be added on as a plugin. In the case of binary files, syntax/highlighting is obviously not needed, so a "view" of a file could be fed to nvim and the usual motions and non-syntax plugins could work on the partial buffer.

In the case of a large log file, for which syntax/highlighting would be needed, nothing is lost because vim already has maxlines and synmaxcol values which limit the lines evaluated by the syntax engine.

Some problems I can think of with this "view" approach:

in-file search (/) and :vimgrep won't work. We would need to fall back to an external search tool (which likely wouldn't support vim-style regex).
we would need to modify nvim core to understand the concept of "deferred content". E.g., we only send the current view, but let nvim know the actual line/column count (and other parameters I haven't thought of)

Personally I really prefer trying to leverage robust external solutions and only enhancing the core by adding hooks.

justinmk · 2015-01-03T21:39:47Z

Another reason I like the plugin approach is that in the common case, loading the file in memory is not really a problem and avoids complication. When people load large files they are unhappy about one of two things:

too slow
not enough features

If you load a giant C# (10 MB) file in Visual Studio, it will buckle (I know this for a fact). Add ReSharper and you might as well get some coffee. So you must either choose fast or good, and that means it is reasonable to disable some features (vim regex, whole-file analysis) on very large files.

I am interested to hear other cases that I am missing which would break with the "partial view" approach.

aktau · 2015-01-03T21:43:58Z

> If you load a giant C# (10 MB) file in Visual Studio, it will buckle (I know this for a fact). Add ReSharper and you might as well get some coffee. So you must either choose fast or good, and that means it is reasonable to disable some features (vim regex, whole-file analysis) on very large files.

I wasn't actually thinking about 10MB files as large. It sounds like peanuts. I was thinking more of 1-50GB files, which would have trouble fitting in main memory. Off the top of my head, I don't know how well (n)vim does with a 10MB source file (syntax highlighted and all), but I would consider it a failure if we don't solve that (in case it has issues).

justinmk · 2015-01-03T22:27:31Z

n/vim (with syntax highlighting and neocomplete) has no problem at all on the same 10 MB C# file (obviously VS/ReSharper are doing a lot more work on that file, so I don't mean to compare the two). I only raised that example to point out that one cannot expect all features in all scenarios.

Migrating the buffer data structure of n/vim is pretty close to a total rewrite. I find it much more interesting to see how far we can get with alternative solutions.

phmarek · 2015-01-04T14:02:27Z

Hmm, to get back to my original request ... how about providing some kind of rich text buffer with some restrictions?

readonly, i.e. only modifiable via replacing whole lines
highlighting only valid within a line, so it needs to be repeated for each line in a paragraph
cannot be saved or loaded

justinmk · 2015-01-04T14:48:58Z

Is text properties not a correct interpretation of your original request? I don't understand what is new in the rich text buffer you describe.

phmarek · 2015-01-04T14:53:40Z

Yeah, text properties might be a good name for it, too.

I'd need not only the highlighting class name, though - storing arbitrary data as well would be nice.

justinmk · 2015-01-04T15:23:25Z

Storing arbitrary data in association with a piece of text, and that association follows the text as edits are made. I believe the existing marks logic could be extended to do this, though it may not be scalable.
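As a toy illustration of data that "follows the text", the Python sketch below tracks properties on a single line and shifts them when text is inserted. The names `TextProperty` and `LineProperties` are made up for this example; the real marks logic is not shown in the thread. Every insert walks all properties, which hints at why a naive extension may not scale.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TextProperty:
    start: int  # inclusive column
    end: int    # exclusive column
    data: Any   # arbitrary plugin data attached to the range


class LineProperties:
    """One line of text plus properties that move with the text."""

    def __init__(self, text: str):
        self.text = text
        self.props = []

    def add(self, start: int, end: int, data: Any) -> TextProperty:
        prop = TextProperty(start, end, data)
        self.props.append(prop)
        return prop

    def insert(self, col: int, new_text: str) -> None:
        """Insert text at col and adjust every property (O(n) per edit)."""
        self.text = self.text[:col] + new_text + self.text[col:]
        shift = len(new_text)
        for prop in self.props:
            if col <= prop.start:
                # Insertion before the range: move the whole range right.
                prop.start += shift
                prop.end += shift
            elif col < prop.end:
                # Insertion inside the range: the range grows.
                prop.end += shift

    def text_of(self, prop: TextProperty) -> str:
        return self.text[prop.start:prop.end]


lp = LineProperties('"types" = value')
key = lp.add(0, 7, {"id": 10})
lp.insert(0, ">> ")  # the property still covers "types" afterwards
```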

phmarek · 2015-01-04T16:09:32Z

> Storing arbitrary data in association with a piece of text, and that association follows the text as edits are made.

Sounds right, although for my use case no (user-)edits are needed. I'd just replace whole lines via RPC.

As for an easy example, think about netrw directory listings with coloring, like ls does. Perhaps with optional other highlighting, e.g. files newer than an hour, files bigger than X, or something like that, to get more colorized items in a line.

tarruda · 2015-01-04T16:19:14Z

How about allowing arbitrary key/value pairs in the :highlight command (e.g. :highlight SomeGroup rgba=#e1e1e1cc) and simplifying the association of highlight groups with arbitrary ranges? These arbitrary key/value pairs would be consumed only by UIs that are interested in them.

The advantage is that we reuse the existing mechanism for decorating text.

fmoralesc · 2015-01-04T17:52:05Z

@phmarek If you can compute the position of the text to highlight since it is static, shouldn't it be possible to use matchaddpos()? That said, it seems to me that would be even more cumbersome than what we currently have; I use the concealed tags method in vim-pad and I know what you mean about it being not as clean as one would want.

Perhaps introducing a virtual key to tag separators (let's say <HSep>, like <SNR>) would help that, so instead of

 {[10] "types" []} = {[11] #<HASH-TABLE :TEST EQUALP :COUNT 3 {1009D95AD3}> []} {<3> [remove entry] <>}

you could have

<Hsep>10 "types" 10<Hsep> = <Hsep>11 #<HASH-TABLE :TEST EQUALP :COUNT 3 {1009D95AD3}> 11<Hsep> <Hsep>3 [remove entry] 3<Hsep>

@tarruda That would be quite helpful, and not only for UIs.

phmarek · 2015-01-04T18:04:03Z

@fmoralesc That might be an option, too, but not much cleaner IMO.

And, in the long run, I'd like to shoot for having a "text" property named img, to have inline images, and this here would be an "easy" first step ;P

bfredl · 2015-01-04T18:09:47Z

matchaddpos works well for e.g. a read-only output buffer, but it's a little bit inconvenient since it only modifies the current window. In a plugin I want to dynamically highlight an output buffer that need not be the current window. Switching current window back and forth kind-of works, but is not entirely reliable. Also it would be more convenient if the added highlighting were associated with a buffer and not a window (if the window is closed and the buffer then reopened, the matches need not be re-added). A bufferwise matchaddpos is perhaps something to consider?
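The difference between window-local and buffer-local matches can be sketched as a small Python model. The class `BufferMatches`, the window-to-buffer dictionary, and the `(line, col, length)` position tuples are assumptions for illustration only and do not correspond to any existing Neovim API.

```python
from collections import defaultdict


class BufferMatches:
    """Store positional highlights per buffer instead of per window."""

    def __init__(self):
        self._by_buffer = defaultdict(list)
        self._next_id = 1

    def add(self, bufnr, group, positions):
        """Register highlights for a buffer; it need not be displayed anywhere."""
        match_id = self._next_id
        self._next_id += 1
        self._by_buffer[bufnr].append((match_id, group, list(positions)))
        return match_id

    def for_window(self, window_to_buffer, winid):
        """Look up the matches for whatever buffer the window shows."""
        return self._by_buffer.get(window_to_buffer[winid], [])


matches = BufferMatches()
windows = {1000: 7}                        # window 1000 shows buffer 7
matches.add(7, "Error", [(1, 5, 3)])       # (line, col, length)

del windows[1000]                          # the window is closed...
windows[1001] = 7                          # ...and the buffer reopened elsewhere
survived = matches.for_window(windows, 1001)
```

Because the matches are keyed by buffer number, nothing has to be re-added when the buffer reappears in a new window.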

fmoralesc · 2015-01-04T18:22:48Z

@phmarek Sure, I was only thinking of the issue of having to specify different separators depending on the contents of the region, which that would solve.

I think @tarruda's suggestion could help for implementing img. Actually, nothing stops a UI even now from interpreting text like

  ![image](path)

as something to be displayed as an image; it's just that all UIs currently assume a grid of text. Expanding the :hi command would allow providing hints to UIs about this:

 syn match cmImage /!\[.\+\](.\+)/
 hi cmImage type=img

I've thought a mechanism like this could allow plugins like NerdTree to be displayed as native lists (like this), and special buffers to be displayed using non-fixed-width fonts.

@bfredl 👍

justinmk · 2015-01-04T18:44:21Z

Not sure highlight is the right mechanism, rather :syntax. But extending vimscript seems unnecessary to me in this case. We should only reuse the internal structures, but expose the functionality via the API only.

fmoralesc · 2015-01-04T18:54:11Z

@justinmk Probably. Extending both would be helpful.

(As to extending :syntax, I was thinking of adding a conceahhl attr to it, to allow highlighting different conceals differently, which has been a pain for me for a while at vim-pandoc-syntax).

Extending vimscript can also benefit vanilla vim, if the code could be proposed to vim_dev (I know...)

justinmk · 2015-01-04T23:23:17Z

Sure if vim_dev accepts, but otherwise extending vimscript really complicates the burden of compatibility (or managing, documenting, and providing solutions for incompatibility) and always requires difficult, time-hungry decisions. Incompatibility is always an option (and sometimes we will choose it), but we can also work on making our API really nice to work with (from vimscript, too, using rpcnotify() and friends).

tarruda · 2015-01-05T00:15:56Z

> But extending vimscript seems unnecessary to me in this case. We should only reuse the internal structures, but expose the functionality via the API only.

The vimscript changes are minimal, all I'm proposing is to allow arbitrary key/value pairs after the :highlight GROUP command(as opposed to allowing only fixed keys such as guifg, ctermfg, etc). Highlight group data would be stored and passed as dictionaries, and UIs will only extract the information they support.

For example, if a TUI and a GUI are connected to the same instance and need to display a highlight group, the TUI would check for ctermfg/ctermbg while the GUI would check for guifg/guibg and even richer information such as alpha level or images as suggested by @fmoralesc.
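The idea of passing highlight groups around as dictionaries and letting each UI pick what it understands could look roughly like the Python sketch below. The helper names `parse_highlight` and `attrs_for_ui`, and the exact key sets per UI, are assumptions; the `rgba` key comes from the earlier example in this thread and is not an existing :highlight argument. Values containing spaces are not handled.

```python
def parse_highlight(args: str):
    """Split 'Group key=value ...' into (group, {key: value})."""
    group, *pairs = args.split()
    attrs = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        attrs[key] = value
    return group, attrs


def attrs_for_ui(attrs: dict, supported: set) -> dict:
    """Keep only the attributes a given UI knows how to render."""
    return {key: value for key, value in attrs.items() if key in supported}


# Hypothetical capability sets for two kinds of UI.
TUI_KEYS = {"ctermfg", "ctermbg"}
GUI_KEYS = {"guifg", "guibg", "rgba"}

group, attrs = parse_highlight("SomeGroup ctermfg=9 guifg=#e1e1e1 rgba=#e1e1e1cc")
tui_view = attrs_for_ui(attrs, TUI_KEYS)
gui_view = attrs_for_ui(attrs, GUI_KEYS)
```

Unknown keys are simply ignored by a UI that does not support them, which is what keeps the scheme backwards compatible.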

For associating the highlight information with arbitrary positions, I vote for @bfredl's suggestion: a buffer-aware matchaddpos().

Besides being backwards compatible and allowing arbitrary formatting to be associated with text, this has the advantage of simplifying code (I estimate about 40% of syntax.c could be removed).

felipesere · 2019-09-05T07:06:48Z

@bfredl how is your tree-sitter work coming along? I had to open some multi-megabyte HTML files (terms and conditions) and had to switch highlighting off to move the cursor. That is how I stumbled here!

bfredl · 2019-09-05T10:00:14Z

@felipesere It will be one of my priorities after 0.4 is released (very soon, hopefully).

adaszko · 2019-10-24T19:34:04Z

I'm not sure if this is off-topic but #1767 (comment) got me thinking: Will it be possible, once the tree-sitter branch is merged, to use language-specific syntax objects like identifier, function_definition directly from VimScript? I'm thinking of cases where for instance the iskeyword option isn't precise enough (think e.g. Rust with its foo!() and foo! being a keyword if foo is a macro, but !foo not being a keyword in if !foo {}). I see a huge potential in this. It always struck me as odd that movements like ( don't make much sense in almost any programming language, only when writing prose. With tree-sitter merged in, there could be language-specific, precise text objects available under a single key in Vim! Another example is targets.vim implementing a function argument text object. It easily gets confused when there are commas within a function argument text object. I believe tree-sitter has the potential to fix that shortcoming.

justinmk · 2019-10-30T19:58:05Z

> Will it be possible, once the tree-sitter branch is merged, to use language-specific syntax objects like identifier, function_definition directly from VimScript

Yes. Directly from Lua, which is accessible from Vimscript. Initially however, we will provide only a query API. We will document patterns for using the API to query syntax, which can be used to create mappings. Later, we will think about adding first-class Normal-mode commands for common cases like "around Function" (af, if), etc.

bfredl · 2024-03-22T07:49:02Z

A lot of work has been done in this area. More focused issues / drafts are tracking pieces that are missing.

justinmk added the discuss label Jan 3, 2015

tarruda mentioned this issue Jan 5, 2015

[WIP] "abstract_ui" fixes and improvements #1657

Merged

justinmk mentioned this issue Jan 18, 2019

Dynamic Syntax Highlighting #9521

Closed

dlants mentioned this issue Aug 21, 2019

Linters get confused easymotion/vim-easymotion#402

Open

justinmk added the extmarks, treesitter, and virtualtext labels and removed the gsoc label Jan 21, 2020

justinmk modified the milestones: todo, 0.5 Jan 21, 2020

alerque mentioned this issue May 12, 2020

Replace RegEx based syntax with external AST parser vim-pandoc/vim-pandoc-syntax#300

Open

3 tasks

janlazo modified the milestones: 0.5, 0.5.1 Dec 26, 2020

janlazo mentioned this issue Dec 26, 2020

Structural Editing : Tracking Issue #12842

Open

janlazo modified the milestones: 0.5.1, 0.6 Feb 15, 2021

clason removed the virtualtext label Jul 29, 2021

resolritter mentioned this issue Nov 30, 2021

inline virt_text which occupies virtual columns #16466

Closed

clason modified the milestones: 0.6, 0.7 Nov 30, 2021

ykuksenko mentioned this issue Feb 23, 2022

Use custom highlight attributes to support images, vector graphics and fonts neovide/neovide#1144

Open

dyamito mentioned this issue Dec 31, 2022

UI: exotic text decorations #21603

Open

bfredl modified the milestones: 0.9, 0.10 Feb 2, 2023

bfredl mentioned this issue Feb 4, 2023

non-monospace text rendering #22125

Closed

bfredl closed this as completed Mar 22, 2024

neovim locked as resolved and limited conversation to collaborators Mar 28, 2024
