Implement colon separated CSI parameters #22
Comments
Because control sequences sometimes need to look at multiple parameters together, we can't simply split into multiple function calls. It looks like xterm implements subparams by basically sticking them into the params list but then also keeping a separate array which indicates which are subparams and which are top levels params. Since this naturally makes it possible to exceed the 16 param limit for normal use, xterm caps the params list at 30 instead. A similar approach here would have comparable performance to the existing implementation. It would allow the |
AFAIK the only sequences with subparameters in the standard are colors, which could be handled as a special case by the parser. From what I've seen, this is what other parsers do. What you suggest is something along the lines of adding a parameter to the csi callback of type The first approach might allow us to keep backward compatibility, but the latter is certainly more generic. |
The parameter format is specified generically such that any control character could be associated with 1 or more params. Here's the relevant section from ECMA-48 My suggestion is that we would allow the params list to be larger and also have something like a &[bool] of the same length which indicates which params are actually subparams. I wouldn't actually call it a suggestion though -- this is just what xterm does. We could probably have a nicer data structure in Rust. |
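To make the "larger params list plus a parallel `&[bool]`" idea concrete, here is a minimal sketch. The type `FlaggedParams`, its methods, and the cap of 32 are assumptions made for illustration; this is neither xterm's code nor the vte crate's API. Omitted values are simply stored as 0 here.

```rust
/// Hypothetical cap on stored values (top-level params and subparams combined).
const MAX_PARAMS: usize = 32;

/// Hypothetical fixed-capacity list: every value lives in `values`, and
/// `is_sub[i]` records whether `values[i]` was introduced by a colon.
pub struct FlaggedParams {
    values: [u16; MAX_PARAMS],
    is_sub: [bool; MAX_PARAMS],
    len: usize,
}

impl FlaggedParams {
    /// Parse only the parameter bytes of a CSI sequence, e.g. b"0;4:3".
    pub fn parse(bytes: &[u8]) -> Self {
        let mut p = FlaggedParams {
            values: [0; MAX_PARAMS],
            is_sub: [false; MAX_PARAMS],
            len: 0,
        };
        let mut current: u16 = 0;
        let mut current_is_sub = false;
        for &b in bytes {
            match b {
                b'0'..=b'9' => {
                    // Saturate instead of overflowing on very long digit runs.
                    current = current.saturating_mul(10).saturating_add(u16::from(b - b'0'));
                }
                b';' | b':' => {
                    p.push(current, current_is_sub);
                    current = 0;
                    // The separator decides the kind of the NEXT value.
                    current_is_sub = b == b':';
                }
                _ => {} // Anything else is ignored in this sketch.
            }
        }
        p.push(current, current_is_sub);
        p
    }

    fn push(&mut self, value: u16, is_sub: bool) {
        // Values beyond the cap are silently dropped; no allocation happens.
        if self.len < MAX_PARAMS {
            self.values[self.len] = value;
            self.is_sub[self.len] = is_sub;
            self.len += 1;
        }
    }

    /// All stored values in order of appearance.
    pub fn values(&self) -> &[u16] {
        &self.values[..self.len]
    }

    /// Parallel flags: `true` means the value at that index is a subparam.
    pub fn flags(&self) -> &[bool] {
        &self.is_sub[..self.len]
    }
}
```

A handler would receive both slices and decide for itself whether the flagged values matter, so the parser does not need to interpret the sequence.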
A simple example of what I was thinking: Playground link.
Thanks for clarifying with an example. The slice ranges work for me as long as we can avoid dynamic allocations.
Stumbled over your issue here through GH suggestions lol. Maybe I can share a few insights from our side (we implemented this a few months ago in xterm.js). Imho the crucial part here is not to limit this to certain final functions or certain param values, as the spec clearly states that any param can have those substring param extensions (as cited by @jwilm above). I think some emulators get this wrong. Following the spec, this is a perfectly legal example:
Whether those extensive subparams make sense at all is up to the final sequence function to decide. Thus a GP sequence parser should pass subparams along as they occur without losing the associated param they are meant to extend. Took me a while to find a suitable structure for that, in the end we went with a single parser wide data structure with certain limits (per sequence: up to 32 params, up to 32 subparams in total, a single value clamped to -1 .. 2³¹ - 1), that gets borrowed by the sequence functions to avoid re-allocations. Since our parser is based on the VT500 parser described on VT100.net, this subparam handling also applies to DCS params (we did not introduce another action to distinguish between CSI_PARAM and DCS_PARAM). Well, the DCS parsing of the VT500 parser is not ECMA compatible in the first place, thus I think the subparam extension does not hurt anyone in the DCS field. Also all well-defined DCS sequences keep working. Maybe this helps somewhat to get things sorted out for your parser. |
Thanks for the input! I 100% agree that the parser should not in any way try to interpret the sequence, only provide the library user with information about how the parameters were read. I guess the main issue I have with the implementation for now is what should the interface look like (I'm also sure this will become less of a problem once I actually formulate the requirements I expect from the interface in this comment, stop sitting on the issue for eight months and actually put together a PR 😄) What I would like to achieve is to support these use cases:
This leads me to believe that we have to pass the slice as we do now, and then some additional metadata, which in the end might not be the slice Range, but a custom struct that is Copy, Index and maybe something else. I will experiment for a bit and open a PR once I feel I have a hint of a solution. |
Some more remarks from my side: Technically the params in the example above would translate into something like this (as "int arrays" in C): { { 1, 2, 2147483647 }, { 3, 4 }, { 5, -1 }, { -1, 6 } } (we use -1 as placeholder for omitted values). While the old impl was just a simple array with index access to the values, this one needs to store the slice offsets somehow. After puzzling around and doing some benchmarks I ended up with these containers instead: int params[32] = { 1, 3, 5, -1, ... };
int subparams[32] = { { 2, 2147483647 }, { 4 }, { -1 }, { 6 }, ... }; I basically pulled the first value of those subparam groups into the old params array we had before, and put the excess subparams in a new container (+ the slice offsets into a third one). I did not choose this slightly more complicated model for backwards compatibility; still, it would serve any sequence w/o subparams the old way. The main reason to do this was perf: reshaping everything to use slices was quite a dent in the perf, thus I optimized it for the usual case with no subparams. I have not looked at your current impl, maybe you can reuse parts of that idea as well. Note that the type exhibits the same params interface to functions as before; subparams are only "on top" (if the function cares for those at all).
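A rough Rust rendering of that three-container layout may help when comparing it to the other proposals. All names here (`SplitParams`, `add_param`, `add_subparam`, `subparams_of`) are hypothetical, and the sketch only assumes what the comment states: the first value of each group goes into `params`, the remaining values go into a flat `subparams` buffer, and a third array records which range of that buffer belongs to which param. It is not the xterm.js implementation.

```rust
/// Hypothetical cap, following the "up to 32 params, up to 32 subparams" limits mentioned above.
const MAX: usize = 32;

pub struct SplitParams {
    /// First value of every group; -1 stands for an omitted value.
    params: [i32; MAX],
    /// All remaining values of all groups, stored back to back.
    subparams: [i32; MAX],
    /// `sub_range[i]` is the (start, end) range in `subparams` owned by `params[i]`.
    sub_range: [(u8, u8); MAX],
    len: usize,
    sub_len: usize,
}

impl SplitParams {
    pub fn new() -> Self {
        SplitParams {
            params: [0; MAX],
            subparams: [0; MAX],
            sub_range: [(0, 0); MAX],
            len: 0,
            sub_len: 0,
        }
    }

    /// Start a new group with its leading value.
    pub fn add_param(&mut self, value: i32) {
        if self.len == MAX {
            return;
        }
        self.params[self.len] = value;
        // A fresh group owns an empty range until subparams arrive.
        self.sub_range[self.len] = (self.sub_len as u8, self.sub_len as u8);
        self.len += 1;
    }

    /// Attach a subparam to the most recently added param.
    pub fn add_subparam(&mut self, value: i32) {
        if self.len == 0 || self.sub_len == MAX {
            return;
        }
        self.subparams[self.sub_len] = value;
        self.sub_len += 1;
        self.sub_range[self.len - 1].1 = self.sub_len as u8;
    }

    /// The "old" interface: plain index access, no slicing needed.
    pub fn params(&self) -> &[i32] {
        &self.params[..self.len]
    }

    /// The extension "on top": subparams of the param at `index`.
    pub fn subparams_of(&self, index: usize) -> &[i32] {
        if index >= self.len {
            return &[];
        }
        let (start, end) = self.sub_range[index];
        &self.subparams[start as usize..end as usize]
    }
}
```

Filling it with the groups `{1, 2, 2147483647}`, `{3, 4}`, `{5, -1}`, `{-1, 6}` leaves `params()` as `[1, 3, 5, -1]`, so a handler that ignores subparams never touches the second buffer.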
That seems like a good idea, I will try to take inspiration from it. I'll see what I can come up with.
Yepp, well C plays "nicely" here and would just store those values in linear memory as The slice indices are stored separately like this (yes in C everything has to be done by hand 😸):
which is a quite compact memory layout (with a hard coded limit of 256 subparams due to the Ofc this is made with C in mind, not sure if you can do it similar in rust, which comes with much nicer default types out of the box. In C at least always doing the slicing descent is much worse in runtime than treating the first value special by pulling it "in front". |
Please note that VTE should work without |
I know, that comment was supposed to indicate that the implementation will probably go a different way, but apparently I wasn't clear enough 😄
This adds support for CSI subparameters like `\x1b[38:2:255:0:255m`, which allows truecolor SGR commands to be combined with other SGR parameters like bold text without any ambiguity.

Subparameters are stored in a single list together with all other parameters, with a separate slice indicating which parameter is a subparameter and how long each subparameter list is. This allows for static memory allocation and good performance while still having the option for dynamic sizing of the parameters. Since subparameters are now also counted as parameters, the number of allowed parameters has been increased from `16` to `32`.

Since the existing structures combine the handling of parameters for CSI and DCS escape sequences, it is now also possible for DCS parameters to have subparameters, even though that is currently never used. Considering that DCS is rarely supported by terminal emulators, handling these separately would likely just cause unnecessary issues. Performance should also be better with this shared subparam structure than with two separate structures for DCS and CSI parameters.

The only API provided for accessing the list of parameters is an iterator. This is intentional, to make the internal structure clear and allow for easy optimizations downstream. Since it makes little sense to access parameters out of order, this limitation should not have any negative effect on performance.

The main drawback is that direct access to the first value of a parameter while ignoring its subparameters is less efficient, since it requires indexing a slice after iterating to the element. While such access is often useful, it is mostly done for the first few parameters, which keeps the overhead negligible. At the same time, this forces people to support subparameters or at least consider their existence, which should make it more difficult to implement things improperly downstream.
Fixes alacritty#22.
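The storage scheme and iterator-only access described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the crate's actual code: `GroupedParams`, `extend`, `push`, and `GroupIter` are names invented here, and the sketch only relies on what the message states (one flat list, a side array holding group lengths, a cap of 32, access through an iterator).

```rust
/// Cap taken from the commit message (raised from 16 to 32).
const MAX_PARAMS: usize = 32;

pub struct GroupedParams {
    /// Top-level parameters and subparameters, stored in order of appearance.
    values: [u16; MAX_PARAMS],
    /// `group_len[i]` is the size of the group that starts at `values[i]`.
    group_len: [u8; MAX_PARAMS],
    len: usize,
    /// Number of values in the group that is still being filled.
    open: usize,
}

impl GroupedParams {
    pub fn new() -> Self {
        GroupedParams { values: [0; MAX_PARAMS], group_len: [0; MAX_PARAMS], len: 0, open: 0 }
    }

    /// Store a value that was terminated by ':' (the group continues).
    pub fn extend(&mut self, value: u16) {
        if self.len < MAX_PARAMS {
            self.values[self.len] = value;
            self.len += 1;
            self.open += 1;
        }
    }

    /// Store a value that was terminated by ';' or the final byte (the group ends).
    pub fn push(&mut self, value: u16) {
        if self.len < MAX_PARAMS {
            self.values[self.len] = value;
            self.len += 1;
            self.open += 1;
        }
        if self.open > 0 {
            // Record the group length at the index where the group started.
            self.group_len[self.len - self.open] = self.open as u8;
            self.open = 0;
        }
    }

    /// The only accessor: iterate over groups as slices.
    pub fn iter(&self) -> GroupIter<'_> {
        GroupIter { params: self, index: 0 }
    }
}

pub struct GroupIter<'a> {
    params: &'a GroupedParams,
    index: usize,
}

impl<'a> Iterator for GroupIter<'a> {
    type Item = &'a [u16];

    fn next(&mut self) -> Option<Self::Item> {
        let params = self.params;
        if self.index >= params.len {
            return None;
        }
        let n = params.group_len[self.index] as usize;
        if n == 0 {
            return None; // Trailing group that was never closed.
        }
        let group = &params.values[self.index..self.index + n];
        self.index += n;
        Some(group)
    }
}
```

With this shape, a handler for `\x1b[38:2:255:0:255;1m` would see two items, `[38, 2, 255, 0, 255]` and `[1]`, and reads the leading value of each with `group[0]`, which is the "indexing a slice after iterating" cost mentioned above.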
I was looking into implementing styled underlines in alacritty, like what you might see in kitty or GNOME VTE based terminals: normal underline, double underline, and curly underline (undercurl), each optionally in a color different from the foreground.

Looking at the kitty implementation (spec), the terminal emulator has to handle not only semicolon separated parameters (`\e[0;4m`), but also colon separated parameters (`\e[4:3m`).

I'm not an expert on the standard defining how those sequences should be handled, so I was going by kitty's source code. What's being done in their codebase is not applicable to this repo, because they have the entire sequence buffered up until the command byte ('m' in this case) and then parse it, dispatching every semicolon separated parameter as its own sequence, with special cases for color parameters. In that way, a command like `\e[0;4;58:2::186:93:0m` would be split into three function calls: handle `0m`, handle `4m`, handle `58:2::186:93:0m`.

This could be implemented in the vte crate (I think) by adding a dimension to the `params` array, and multiple calls to the Performer implementor. Not sure what the performance cost would be though.
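For illustration, the split described in this issue can be sketched on an already buffered parameter string. The function name `split_groups` is made up for this example, and it allocates, so it only shows the grouping and not how kitty or the vte crate actually implements it. Empty fields (and, in this sketch, anything that does not parse as a number) become `None`.

```rust
/// Split buffered CSI parameter text into semicolon separated groups,
/// each holding its colon separated fields.
pub fn split_groups(params: &str) -> Vec<Vec<Option<u16>>> {
    params
        .split(';')
        .map(|group| {
            group
                .split(':')
                // An omitted field such as the gap in "58:2::186" fails to parse.
                .map(|field| field.parse::<u16>().ok())
                .collect()
        })
        .collect()
}
```

Applied to `0;4;58:2::186:93:0`, this yields three groups, matching the three handler calls described above, with the empty field in the color group preserved as `None`.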