WeeChat renders soft-hyphens as spaces #1659

Closed
Mikaela opened this issue Jun 7, 2021 · 8 comments

@Mikaela
Contributor

Mikaela commented Jun 7, 2021

Bug summary

Soft-hyphens appear as spaces instead of being invisible. This happens in neither curl nor Lynx, so I think the terminal emulator may be innocent. https://en.wikipedia.org/wiki/Soft_hyphen

Steps to reproduce

1. Send content containing soft hyphens to IRC, such as the page title quoted under Current behavior
2. See spaces where there shouldn't be spaces

Current behavior

Screenshot of incorrect behaviour

The below is rendered correctly by web browsers, so it has no spaces that you see in the screenshot above.

2021-158 15:54:49 EEST <mikaela1> /title https://www.hs.fi/kaupunki/art-2000008034266.html
2021-158 15:54:53 EEST <@MI1> R​-66Y/IRC@Etro: Ulko­mainos­yhtiö ja Helsingin kaupungin liikenne­laitos poistavat perus­suomalaisten paheksuntaa herättäneet vaali­mainokset - Kaupunki | HS.fi

Expected behavior

Soft-hyphens aren't visible and the words appear written correctly.

Suggested solutions

I don't know how it should be fixed technically.

Additional information

I cannot reproduce the issue with either curl 7.76.1 or Lynx version 2.8.9rel.1 (08 Jul 2018)


  • WeeChat version: WeeChat 3.2-rc1 (git: v3.2-rc1) [compiled on Jun 4 2021 23:07:23] & WeeChat 3.0.1 [compiled on Feb 12 2021 00:00:00]
  • OS, distribution and version: Debian 11 Testing & Fedora 34
  • Terminal: MATE Terminal 1.24.1
  • Terminal multiplexer (screen/tmux/…/none):  tmux & none
@Mikaela Mikaela added the bug Unexpected problem or unintended behavior label Jun 7, 2021
@trygveaa
Member

trygveaa commented Jun 7, 2021

There was some talk around this on #weechat. Apparently people see different results. How it's shown varies by terminal emulator, but also between different people using the same terminal emulator.

In all the terminals I tested, it is displayed the same for me in WeeChat as when I print the character directly to the terminal emulator outside of WeeChat (I tested a bunch of terminal emulators, many VTE based (including MATE Terminal which the author uses), alacritty, kitty, konsole, qterminal, st, terminology, urxvt, xterm).

I asked @Mikaela to run python -c 'print("soft\u00adhyphen")' and they said it appeared as soft hyphen, so the same as in WeeChat. This indicates that the issue is in the terminal emulator, not in WeeChat, since this command just prints the characters directly to the terminal emulator.
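
To check which code points actually reach the terminal, independent of how the terminal draws them, invisible characters can be made explicit before printing. This is a minimal sketch; the `reveal` helper is a hypothetical name and not part of WeeChat or any tool mentioned here.

```python
import unicodedata


def reveal(text: str) -> str:
    """Replace format characters (Unicode category Cf, which includes
    U+00AD SOFT HYPHEN) with a visible <U+XXXX> tag."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Cf":
            out.append(f"<U+{ord(ch):04X}>")
        else:
            out.append(ch)
    return "".join(out)


print(reveal("soft\u00adhyphen"))  # prints: soft<U+00AD>hyphen
```

Piping the output of another command through such a helper shows whether the soft hyphen is present in the data, regardless of whether the terminal draws it as a space, a hyphen, or nothing.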

I'm not sure why it wasn't displayed as a space with curl for you, I would need the exact method you checked it with curl to say more about that. It is displayed as a space for me with curl in MATE Terminal.

Lynx seems to strip this character, because it's not visible in lynx in any of my terminal emulators.

@Mikaela
Contributor Author

Mikaela commented Jun 7, 2021

I think I ran curl https://www.hs.fi/kaupunki/art-2000008034266.html | grep "<title>" and forgot to mention how I used curl, which in hindsight may have affected the test.

@trygveaa
Member

trygveaa commented Jun 7, 2021

Hm, it still appears as a space in MATE Terminal with that command for me.

@flashcode
Member

So, based on the discussion here, is it a WeeChat issue or not?

@flashcode flashcode added the waiting info Waiting for info from author of issue label Jun 27, 2021
@trygveaa
Member

Hm, there was some more talk on #weechat. We concluded that terminal emulators can't really show the soft hyphen correctly, because they don't know where a line wraps (e.g. where the chat area ends and a side bar begins). WeeChat (or ncurses) is the one wrapping a line into multiple lines by inserting newlines between words. Therefore, there is no way for a terminal emulator to know if the soft hyphen should be rendered or not.

Since WeeChat doesn't break words on soft hyphens (or regular hyphens), the soft hyphens should never be rendered. Since some terminals render soft hyphens (some as a space, some as a hyphen), WeeChat should probably handle this, I suppose by just stripping out the soft hyphens.
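
The stripping idea can be sketched in a few lines. This is only an illustration of the approach, not WeeChat's implementation (WeeChat is written in C); the `strip_soft_hyphens` name is hypothetical.

```python
SOFT_HYPHEN = "\u00ad"


def strip_soft_hyphens(text: str) -> str:
    """Remove every U+00AD so the terminal never has to decide how to draw it."""
    return text.replace(SOFT_HYPHEN, "")


# A word from the reported title, with its soft hyphens written explicitly.
word = "Ulko\u00admainos\u00adyhtiö"
print(strip_soft_hyphens(word))  # prints: Ulkomainosyhtiö
```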

@flashcode
Member

@Mikaela, @trygveaa: I wrote a specification that solves this issue and some others with Unicode chars: https://specs.weechat.org/specs/2022-003-fix-unicode-display.html

Please tell me what you think about the proposed changes before I implement them.

I can make them available on a testing branch before merging into master.

@flashcode flashcode self-assigned this Dec 3, 2022
@flashcode flashcode added this to the 3.8 milestone Dec 3, 2022
@flashcode
Member

I pushed the branch unicode-fixes for tests: https://github.com/weechat/weechat/tree/unicode-fixes

Please ping me if you find differences with the specification or display bugs (chat and bars).

@trygveaa
Member

This looks good, as long as we don't want WeeChat to wrap the line on soft-hyphens. Technically, if the line can fit the part of the word before the soft-hyphen, but not the whole word, it might be more correct to write the part before the soft-hyphen, insert a normal hyphen, wrap the line, and then write the part of the word after the soft-hyphen. But maybe not worth doing; I don't think soft-hyphens are that common, so it is probably a lot of work for little gain.
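
For reference, the wrapping alternative described above could look roughly like this. It is a sketch under simplifying assumptions: every character is assumed to be one column wide, and `split_at_soft_hyphen` and its `room` parameter are hypothetical names, not anything from WeeChat or the specification.

```python
from typing import Tuple

SOFT_HYPHEN = "\u00ad"


def split_at_soft_hyphen(word: str, room: int) -> Tuple[str, str]:
    """Split `word` so that the first part fits in `room` columns.

    Returns (head, tail). When a break is made, head ends with a visible "-"
    and tail keeps its remaining soft hyphens so it can be split again.
    When no break point fits, head is "" and tail is the whole word.
    """
    parts = word.split(SOFT_HYPHEN)
    # Try the longest prefix first, then progressively shorter ones.
    for i in range(len(parts) - 1, 0, -1):
        head = "".join(parts[:i]) + "-"
        if len(head) <= room:
            return head, SOFT_HYPHEN.join(parts[i:])
    return "", word


head, tail = split_at_soft_hyphen("vaali\u00admainokset", 8)
print(head)  # prints: vaali-
print(tail)  # prints: mainokset
```

A real implementation would also need to measure display width rather than character count, which is part of why this may be a lot of work for little gain.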

@flashcode flashcode removed the waiting info Waiting for info from author of issue label Dec 10, 2022