This is based on the work of @MrRio in his PR #535, so thanks to him for the work, I couldn't have done this without him.
This PR makes hterm add a <span> around every unicode character, even when they have default style. The span has the width of the font, which basically forces it to be a monospace font.
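As a rough illustration of the wrapping described above, here is a minimal sketch. It is not the code from this PR or hterm's API: `wrapUnicodeChars`, `escapeHtml`, and the single `cellWidthPx` parameter are hypothetical, and it assumes every non-ASCII character occupies exactly one cell.

```javascript
// Hypothetical helper for illustration only (not hterm's actual API).
// Escapes the characters that are significant in HTML text content.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Wraps each non-ASCII code point in a fixed-width <span> and leaves
// ASCII characters untouched. Assumption: one cell per character.
function wrapUnicodeChars(text, cellWidthPx) {
  let html = '';
  // for...of iterates by code point, so surrogate pairs stay together.
  for (const ch of text) {
    if (ch.codePointAt(0) > 0x7f) {
      html += `<span style="display:inline-block;width:${cellWidthPx}px">${ch}</span>`;
    } else {
      html += escapeHtml(ch);
    }
  }
  return html;
}
```

A real implementation would also need to give wide characters a width of two cells, which this sketch leaves out.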
See this screenshot with vtop :
And you can see that the cursor is still well positioned, even with my prompt abusing wide characters :
There are still some problems that can't be solved (in my opinion, if anyone has a brilliant idea I'll take it)
This is probably not a problem in prompts because there are spaces around, and wide chars are not side-by-side all that often. We could theoretically measure each character before inserting it, but then we'd need to fix hterm's cursor position.
What's left to do:
All feedback appreciated !
matchesContainer should only return false when the line contains unicode chars (or more narrowly, if the previous char contains unicode chars)
As promised, here is a screenshot of HyperTerm at the same revision but on my debian VM. The problem is that some characters are larger than others, even though the font is monospace
This is more obvious on the following 2 screenshots. All I did was change the font size from 14 to 16.
The only way I see to fix this would be to wrap every character in a <span>, like I do currently with unicode chars, but I'm afraid this would hurt performance. Maybe we could make it optional, so that users who have problems can try to activate it?
Can it be done for every char within a line if that line contains a unicode char?
@MrRio The debian problem is not related to unicode unfortunately, so no there's no way to detect when it's needed (other than maybe measuring each char independently and comparing that to the font-width). In any case I think it's a subject for another PR, I'll try to get to it after this one is closed.
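The "measure each char and compare it to the font width" idea could look something like the sketch below. Everything here is hypothetical: `charsNeedingContainer` and the injected `measure` function are illustrative names, and the tolerance is an arbitrary assumption. In a browser, `measure` might be backed by a canvas 2D context's `measureText(ch).width`.

```javascript
// Hypothetical sketch: returns the characters whose rendered width is not a
// whole number of cells. `measure` is injected so the logic can run without
// a DOM; it takes one character and returns its width in pixels.
function charsNeedingContainer(text, cellWidthPx, measure, tolerancePx = 0.5) {
  const offenders = [];
  for (const ch of text) {
    const width = measure(ch);
    // A well-behaved glyph is one cell wide, or two for wide characters.
    const cells = Math.max(1, Math.round(width / cellWidthPx));
    if (Math.abs(width - cells * cellWidthPx) > tolerancePx) {
      offenders.push(ch);
    }
  }
  return offenders;
}
```

Measuring every character this way has its own cost, so this only shows the detection logic, not a recommendation to run it on every insert.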
Regarding the "matchesContainer" function, I added finer control on that. I'd appreciate it if someone else could test everything on setups other than mine (MacOS, other fonts, other prompts, other workflows...) because the way we monkeypatch hterm like this makes it very hard to ensure I covered every case. I think Hyperterm should use its own fork of hterm, it would be easier to maintain and scale (at the price of a higher entry cost). But again, that's another subject.
After reviews and tests, this PR should be almost ready to merge.
I'd like to propose the runes library for splitting unicode strings, because unicode-string-utils doesn't support some emoji subsets (👩👩👧👦, ❤️, 👍🏽, etc.).
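To show why splitting by code point is not enough for such emoji, here is a small sketch. Note that it does not use runes or unicode-string-utils: it swaps in the standard `Intl.Segmenter` API instead, and `splitGraphemes` is a hypothetical helper name. It assumes a runtime that ships `Intl.Segmenter` (recent Node.js and Chromium versions do).

```javascript
// Hypothetical helper built on the standard Intl.Segmenter API.
// Splits a string into user-perceived characters (grapheme clusters).
function splitGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

// Thumbs-up followed by a skin tone modifier: two code points, one glyph.
const thumbsUp = '\u{1F44D}\u{1F3FD}';

// Array.from splits by code point, so the modifier is separated.
console.log(Array.from(thumbsUp).length);
// Grapheme segmentation keeps the emoji and its modifier together.
console.log(splitGraphemes(thumbsUp).length);
```

A splitter that stops at code points would put the base emoji and its modifier into separate containers, which is the kind of breakage reported here.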
@dotcypress Hey, thanks for the feedback. Can you provide me with steps that produce unexpected result because of unicodeStringUtils? I haven't been able to create a problematic test case on my computer.
Try curl https://gist.githubusercontent.com/dotcypress/35da7cec74643bc095bbe5c9a14f66f6/raw/51fa6f1e6080ce39a2d712175f077373d8b580cb/test.txt | cat
Hyperterm with your changes:
What are the performance implications of creating so many <span> elements?
@rauchg I didn't measure it precisely, but I don't think having many spans is a big problem (atom has way more I think, for example). A potential problem would be in applications that write a lot of unicode characters, like vtop for example, because new spans are created constantly (as opposed to a fixed, high number of spans). It could be noticeable on slower or average computers.
Friendly ping @BenoitAverty 🤗
Hello matheuss. I'm not sure how to advance this PR. I think it's ready to merge, but I was only able to test on linux. I don't think Emoji support is done in this PR, but the cursor is OK and the char-width issues are mostly fixed, if we ignore platform-specific problems like the one I mentioned on debian (these problems come from chromium so we can't do anything about them except add a flag that enables a span around each char, which we should probably do in another PR).
I'm more than happy to work on this again if someone finds a bug, a regression, or something that should be added before merging.
Don't hesitate to ask, or tell me when I should rebase.
Thanks for looking at this :)
Amazing @BenoitAverty 😄 I'll test it today and then merge/give feedback 🙌
Add container with a fixed width around unicode characters
Make container width dynamic and fix wide chars clipping issue
Finer control on the creation of containers for text inserted in the …
@matheuss Any update on the status of this?
@winneon I just discussed it with @rauchg recently, and: we're merging it very soon, before the next release 😄
@rauchg and I discussed this PR recently, and we agreed that we needed some benchmarks. The first thing that came to our minds was @indutny's latetyper. But it's not very useful in this context because it only types zs – I tried it and this PR produced almost the same results that Hyper 0.8.3 produced.
With that in mind, I moved to more visual (and less scientific) tests.
(Both GIFs were recorded @ 30fps and downscaled from 1416x1080 to 800x610)
I don't think that having lots of spans is a problem (looking into my Atom right now, I see 2550; with vtop open on this PR, I see 2879). Apparently, the problem begins when you have lots of moving spans. As per the GIF above, some frames are taking way too much time to render.
If you have something like 3 special chars per line, you should be fine – 3 static spans per line will not need much CPU to be rendered. But when you're running something like vtop, with hundreds of moving spans, it's likely that you'll experience some lags – at least with a setup like mine (details below).
I also noticed that if I resize the window while vtop is running, Hyper freezes for ~10 seconds – such behavior is not present on v0.8.3.
PS: my setup: MacBook 13" Retina, 2.6GHz i5.
PS2: both builds (v0.8.3 and this PR) were running in production mode (npm run dist). Also, I can guarantee that the scenario between each screenshot/GIF was the same.
With that in mind, @BenoitAverty @MrRio and @rauchg: how could we improve this approach? Is there room for improvement? Are there any other approaches worth a try?
Thanks for taking the time to do this 👍
I don't have any idea currently on a completely different approach. The problem is that there's no way to avoid certain characters being larger than others. Given that fact, the only way to do anything on those characters is to wrap them in a span, whatever we do with that span (use width currently).
One idea that I had at the beginning was to wrap several characters in spans instead of only one, and set the width of that bigger span to the theoretical width of the entire string, but this didn't work for some reason (probably a margin issue or something). @MrRio tried it before me and had the same result.
A dirty workaround would be to put this PR behind a configuration flag. That way people experiencing cursor issues could fix it, and people experiencing performance issues could go back to current behaviour for the time being. I see several drawbacks to this:
If I have some time, I'll try again with the larger spans, and if I have another idea I'll share it here, but currently I don't have any other idea.
After much investigation with @matheuss, here is some more information:
The problem with vtop happens because most fonts don't have the braille characters used to draw the graphs. As a result, chromium falls back on the first font it finds with braille, which is often not monospaced (or at least not the same size). Apple Braille on MacOS, DejaVu Sans on linux (DejaVu Sans Mono doesn't have braille unfortunately)
This particular problem can be solved by including another font, for example FreeMono which has braille and is monospaced. This can be done without creating any more spans, but:
Anyway, that explains the problem with vtop but doesn't solve the problem with chars that are larger even in a monospaced font.
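For the font-fallback idea, a minimal sketch of building the font list is shown below. `fontFamilyWithFallback` is a hypothetical helper, and appending the generic `monospace` keyword is my assumption, not something from this thread. It relies on standard CSS behaviour: for each character, the browser uses the first font in the list that has a glyph for it.

```javascript
// Hypothetical helper: builds a CSS font-family value that lists the user's
// fonts first, then a monospaced fallback that covers braille (FreeMono in
// the discussion above), then the generic monospace keyword.
function fontFamilyWithFallback(primaryFonts, fallbackFont) {
  // Font names containing whitespace are quoted to keep the list unambiguous.
  const quote = (name) => (/\s/.test(name) ? `"${name}"` : name);
  return [...primaryFonts, fallbackFont]
    .map(quote)
    .concat('monospace')
    .join(', ');
}

console.log(fontFamilyWithFallback(['DejaVu Sans Mono'], 'FreeMono'));
```

The fallback font still has to be installed or bundled for this to have any effect.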
Hi, I'm the original author of hterm. If you'd like to try to upstream this fix into hterm rather than monkeypatching or forking, I can help.
@rginda that'd be amazing! Do you have any ideas on how to approach this?
Thank you SO MUCH for your work on this, @BenoitAverty and @MrRio!
We (@rauchg) discussed extensively the performance issues that this method introduces, and we concluded that it's better to merge it as it is for now (so our next release can have it) and work on the performance improvements later.
Once again, thank you! 🙌
Since @rginda chimed in, this is a good case study to review together. @MrRio the creator of vtop can also comment accordingly.
We had to start wrapping special characters in <span> to introduce a "double width" through custom styling. But that's introducing a huge performance regression in vtop (and perhaps other terminal apps).
I suspect the issue is that (without having read any code yet :P) MrRio re-paints aggressively by clearing the screen and re-appending lines a bunch of times per second. Which is what he should do, and I'm sure lots of terminal apps work this way.
I think that's because, with this PR in place, we're seeing a huge number of <span> elements being created per second, and that seems to be the root cause of the slowdown.
If that theory is correct (and even if it's not), I think an optimization worth investigating is to recycle x-row by not acting on the ANSI clear instruction immediately. I suspect that this would involve hterm working with something like a v-dom, where it makes the changes that would be eventually rendered to the screen in-memory, and then dumps the resulting state of the terminal whenever there's time or at a certain interval.
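A minimal sketch of that buffering idea, under stated assumptions: `BufferedScreen` and its methods are hypothetical and unrelated to hterm's real classes, rows are plain strings, and `renderRow` is an injected callback standing in for the actual DOM update. A clear is only recorded in memory, so a row that is cleared and then rewritten with identical text never reaches the renderer.

```javascript
// Hypothetical sketch, not hterm code: keeps the desired screen state in
// memory and only renders rows whose content actually changed.
class BufferedScreen {
  constructor(rowCount, renderRow) {
    this.pending = new Array(rowCount).fill('');  // state the app asked for
    this.rendered = new Array(rowCount).fill(''); // state currently on screen
    this.renderRow = renderRow;                   // injected: (index, text) => void
  }

  // Records the clear in memory; nothing is rendered yet.
  clear() {
    this.pending.fill('');
  }

  setRow(index, text) {
    this.pending[index] = text;
  }

  // Renders only the rows that differ from what is on screen and
  // returns how many rows were updated.
  flush() {
    let updated = 0;
    for (let i = 0; i < this.pending.length; i++) {
      if (this.pending[i] !== this.rendered[i]) {
        this.renderRow(i, this.pending[i]);
        this.rendered[i] = this.pending[i];
        updated++;
      }
    }
    return updated;
  }
}
```

In a browser, `flush` could be scheduled with `requestAnimationFrame` or a timer, so an app that repaints many times per second triggers at most one DOM update per frame.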
So, my questions are:
Wanted to jump in as I tried to fix this a while ago for Blink too. I identified it on Spacemacs and just running the UTF-demo.
The font fallback is how I'm solving this at the moment (falling back to Menlo in my case, and maybe you guys can package it?). It really is good in 99.9% of the cases.
I tried spans and larger spans on Unicode chars. Larger spans will cause other glitches on the screen in certain sections (like "vertical lines"). The reason why I dumped it is that, as @BenoitAverty wondered and I can confirm, with other fonts (if I recall, I tried with Anonymous?) it actually happens on almost all the characters, and I couldn't find a reliable way to decide when to apply it.
@rauchg's suggestion is a great idea, but if I understand it, it might require changing the rendering completely to always go through a renderbuffer, and that seems like a big change to hterm for something that really is a problem with the font.
Hope we can all figure out something together! :)
@matheuss: re: Upstreaming patches to libapps/hterm.
Google can only accept patches from people who have signed their contributor license agreement. IANAL/TL;DR: You won't try to go after Google for patents or other IP rights based on your contributions, and you're under no obligation to support the work you contribute.
If you're ok with that, sign and continue:
This command clones the repo and fetches a commit-msg hook needed by the code review server:
git clone https://chromium.googlesource.com/apps/libapps && (cd libapps && curl -Lo `git rev-parse --git-dir`/hooks/commit-msg https://gerrit-review.googlesource.com/tools/hooks/commit-msg ; chmod +x `git rev-parse --git-dir`/hooks/commit-msg)
Then make your changes and post them to gerrit for review:
git push origin HEAD:refs/for/master
This will upload a new change to the libapps project on chromium-review.googlesource.com
Add me (firstname.lastname@example.org) as a reviewer, and I'll have a look or find someone else who can. Hyperterm folks should feel free to comment on the review too.
It probably makes sense to try some smaller patches first to get the hang of it.
I've always wondered if hterm.Screen would work better if it just had span for each character cell. I think there'd be a performance hit for append-only uses like cat /var/log/messages, but a win for tmux, vtop, less, and other full-screen apps. My initial assumption was that append-only was the primary use case.
We could have both screen types, and swap based on the whether the terminal was displaying the primary vs alternate screen. Then the issue would be that some things that work in the alternate screen wouldn't work as well in the primary screen.
I found a couple of problems related to the unicode issue.
@rginda @rauchg @BenoitAverty @matheuss
@rginda I think most terminal applications render each character separately to force them in a grid, instead of rendering the text itself. The way to do that for hterm would indeed be to make a grid using a span for each char.
The performance hit would be big though, but could be mitigated using virtual dom.
The main problem is that it's probably a complete rewrite of the hterm.Screen module 😅
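To make the span-per-cell grid idea more concrete, here is a rough sketch of a single row. It is not a proposal for the real hterm.Screen: `GridRow`, `escapeHtml`, and the `cell` class name are hypothetical, and it assumes every character occupies exactly one cell (wide characters would need two).

```javascript
// Hypothetical helper: escapes characters that are significant in HTML text.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical sketch of one terminal row stored as a fixed array of cells.
class GridRow {
  constructor(columns) {
    this.cells = new Array(columns).fill(' ');
  }

  // Overwrites cells starting at `column`; text past the last column is dropped.
  write(column, text) {
    for (const ch of text) {
      if (column >= this.cells.length) break;
      this.cells[column++] = ch;
    }
  }

  // Emits one <span> per cell so that each character is forced into the grid.
  toHtml() {
    return this.cells
      .map((ch) => `<span class="cell">${escapeHtml(ch)}</span>`)
      .join('');
  }
}
```

With a fixed-width, inline-block style on `.cell`, overwriting a character only changes the content of one span, which is why this layout might suit full-screen apps better than append-only output.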
@BenoitAverty Yes, it would be a complete rewrite of hterm.Screen. In fact, it would make sense to start with a new file so the two implementations could be compared side-by-side. The rewrite would probably end up much simpler than the current version, and much more predictable in the face of emoji and not-really-monospace fonts.
I don't know for sure that it would be a big performance hit. It may be, but the only way to know is to try it out and measure it under different use cases. hterm.ScrollPort already takes care of the Virtual DOM, and probably won't care if you swap out hterm.Screen for a different implementation.
Agreed that it wouldn't necessarily decrease performance in a noticeable way, if done right (disclaimer: based on my intuition)
Now it shifts the text one character to the left when I'm inside tmux (works perfectly otherwise).