Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revamp wcwidth #1470

Closed
wants to merge 4 commits into from
Closed

Revamp wcwidth #1470

wants to merge 4 commits into from

Conversation

jerch
Copy link
Member

@jerch jerch commented May 24, 2018

This PR addresses 1. and 2. of issue #1467.

No tests yet.

@Tyriar I made as little as possible changes to the existing wcwidth implementation. To still be able to use the closured lookup table I simply extended the old factory like design. Therefore it looks a bit messy now, I can refactor it to a more readable flat design if needed.

Edit:
I was not sure where to put the set options method and added it to InputHandler for now.

Edit2:
The options method understands 2 settings:

  • ambiguous: Can be used to override the width of all ambiguous characters at once (much like in iterm2). I inspected several other wcwidth implementations - since most of them alter the width for ambiguous chars all together I did not split them into separate segments either. If this is needed the custom setting can be used instead.
  • custom: Can be used to adjust the wcwidth for specific characters in the form {charCode: width}. This setting overrides any other width rule. Not applicable for ASCII chars due a performance optimization (should never be an issue).

The factory method understands 2 more settings nul and control. I did not expose them since it is not a good idea to change their behavior.

@jerch jerch mentioned this pull request May 25, 2018
@@ -110,6 +110,7 @@ export interface ICompositionHelper {
export interface IInputHandler {
parse(data: string): void;
print(data: string, start: number, end: number): void;
setWcwidthOptions(opts: {ambiguous?: 0 | 1 | 2, custom?: {[key: number]: 0 | 1 | 2}}): void;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea to expose this as a setting? What other terminals have you checked that also do this?

Copy link
Member Author

@jerch jerch May 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the ambiguous setting is comparable to the setting in iterm2, although they only offer to set full width on or off (here 2 or 1, we still support the current "default" by omitting this setting). It might be useful to expose this so people can hot patch the width of ambiguous chars if they encounter issues. I think VTE also supports this by an env variable. There is even a test file, that will show the difference (just open the attachment 00example.txt here mintty/mintty#88 (comment) in an editor in xterm.js with ambiguous unset, set to 1 and set to 2). I also gonna add some tests to show/test the difference.
The custom setting might be helpful internally, if the systems wcwidth does not agree with our implementation - we could simply overload it with the system setting to avoid cursor and line ending problems. I would not expose this the user since it might break more than it will help. Still we could offer an addon or something similar to do the nasty low level stuff.

Local terminals are normally not affected by this since they can use the systems wcwidth (and will have automatically the same settings). We cant do that in a browser component since we have no access to the C land. In VSCode (and similar "local" terminals) this could be achieved by some additional node package with access the system wcwidth (some C++ binding), in the demo it gets more tricky to load those data, maybe by some addon with a server addition.

See also my comment here #1467 (comment) which kinda addresses this problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it might be useful to allow someone to pass a custom dictionary, in the case where we want to match with the system's wcwidth, it might be better if we can instead pass a function, because that matches the API POSIX gives us.

So, I'd recommend that the option be supplied as custom: (key: number) => 0 | 1 | 2 | undefined, where undefined causes fall-through behavior. If someone wants to use an object, then they can implement it as a function stub (e.g. (key) => {123: 2}[key]).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bgw Yupp thats better, gonna change it.

@Tyriar Tyriar added this to the 3.6.0 milestone Jun 1, 2018
@Tyriar Tyriar added the work-in-progress Do not merge label Jul 4, 2018
@Tyriar Tyriar removed this from the 3.6.0 milestone Aug 8, 2018
@jerch
Copy link
Member Author

jerch commented Aug 31, 2018

Closing this for now since it needs some more thinking about the right way to deal with the unicode mess behind, see #1467 (comment) for further details.

@jerch jerch closed this Aug 31, 2018
@Tyriar Tyriar added the reference A closed issue/pr that is expected to be useful later as a reference label Aug 31, 2018
@jerch jerch mentioned this pull request Aug 31, 2018
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
reference A closed issue/pr that is expected to be useful later as a reference work-in-progress Do not merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants