Invisible character from YouTube chat copying #3570

Open
alexherbo2 opened this issue Jun 29, 2020 · 6 comments

@alexherbo2
Contributor

Copy this line.  There is a character between [​].
@alexherbo2 alexherbo2 added the bug label Jun 29, 2020
@Screwtapello
Contributor

hexdump -C says:

00000000  43 6f 70 79 20 74 68 69  73 20 6c 69 6e 65 2e 20  |Copy this line. |
00000010  20 54 68 65 72 65 20 69  73 20 61 20 63 68 61 72  | There is a char|
00000020  61 63 74 65 72 20 62 65  74 77 65 65 6e 20 5b e2  |acter between [.|
00000030  80 8b 5d 2e 0a                                    |..]..|
00000035

The bytes between [] are E2 80 8B, which is the UTF-8 encoding of U+200B ZERO WIDTH SPACE. The h and l keys in Kakoune navigate by Unicode code-point, but not all code-points are visible on-screen, so sometimes the cursor disappears when a zero-width string is selected. Unicode combining characters (which are also zero-width) have the same effect: try pasting Z̭̯̹͈͢a̜̰͇̘̿̎͋ͥḽ̝̺̳̩̩̗g͎̺̩̞o̧ ̤̭̀ͬͧͅt͚̯͉̃̇̇ĕ̟͇͑xt into Kakoune and see what happens.
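For anyone who wants to check a pasted line without reading a hexdump, here is a minimal sketch in Python. It assumes that the Unicode general categories Mn, Me and Cf are a good enough stand-in for "occupies no terminal cell"; that is an approximation for illustration, not how Kakoune itself decides widths, and `find_zero_width` is a hypothetical helper name.

```python
import unicodedata

# Assumption: nonspacing marks (Mn), enclosing marks (Me) and format
# characters (Cf) are treated as zero-width. This is an approximation,
# not a full wcwidth() implementation.
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}


def find_zero_width(text):
    """Return (index, code point, name) for each code point that is
    likely to occupy no cell on screen."""
    hits = []
    for index, char in enumerate(text):
        if unicodedata.category(char) in ZERO_WIDTH_CATEGORIES:
            name = unicodedata.name(char, "<unnamed>")
            hits.append((index, f"U+{ord(char):04X}", name))
    return hits


# The bytes E2 80 8B from the hexdump, written as escapes so nothing
# invisible hides in the source.
raw = b"between [\xe2\x80\x8b]."
line = raw.decode("utf-8")

for hit in find_zero_width(line):
    print(hit)
```

The indices are code-point offsets, which matches how h and l move in Kakoune, so a hit at index 9 means the tenth press-worth of cursor position on that line is the invisible one.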

@lenormf
Contributor

lenormf commented Jun 29, 2020

Related #1447

@oc34n-1

oc34n-1 commented Apr 12, 2023

@Icantjuddle

Out of curiosity, what is the "correct" behavior here?

@Screwtapello
Contributor

I think the expectation (at least in this comment field, and GUI text editors in general) is that nobody will ever edit text that contains a sequence of code-points they couldn't type from their regular keyboard. Once you start getting into the weirder parts of Unicode control sequences, like "right-to-left override" or emoji that are ligatures of other emoji, or (on some platforms) once you involve characters beyond the Basic Multilingual Plane, all bets are off and all kinds of weirdness can occur.

Kakoune's behaviour here isn't the weirdest thing I've seen, and it allows surgical editing of individual code-points which is probably what you want in a programming-oriented editor.

Possibly the nicest behaviour would be to temporarily expand zero-width characters to take an entire terminal cell (padding them with a space character) when selected so they'd be visible, but I have no idea how complex that would be to implement, and it would be a pretty niche feature.
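To make that idea concrete, here is a rough sketch in Python of the padding step only. It is not based on Kakoune's rendering code; `is_zero_width` and `display_text` are hypothetical names, and the category check is the same approximation of "zero-width" by Unicode general category, which is an assumption.

```python
import unicodedata


def is_zero_width(char):
    # Assumption: approximate "zero-width" by general category
    # (nonspacing mark, enclosing mark, format character).
    return unicodedata.category(char) in {"Mn", "Me", "Cf"}


def display_text(text, selected):
    """Build the string to draw for `text`, where `selected` is the
    code-point index under the cursor. A selected zero-width code point
    gets a space in front of it so it occupies a cell of its own;
    everything else is drawn unchanged."""
    pieces = []
    for index, char in enumerate(text):
        if index == selected and is_zero_width(char):
            pieces.append(" " + char)
        else:
            pieces.append(char)
    return "".join(pieces)
```

Only the drawn string changes; the buffer keeps its original code points, so moving the cursor away collapses the padding again. A real implementation would also have to shift the rest of the line by one cell while the padding is shown, which is probably where the complexity lies.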
