help syntax don't support help tags with unicode characters #2213

Closed
mildred opened this issue Oct 16, 2017 · 6 comments

@mildred
mildred commented Oct 16, 2017

Here is defined the syntax highlighting pattern for help tags:

syn match helpHyperTextJump "\\\@<!|[#-)!+-~]\+|" contains=helpBar
syn match helpHyperTextEntry "\*[#-)!+-~]\+\*\s"he=e-1 contains=helpStar
syn match helpHyperTextEntry "\*[#-)!+-~]\+\*$" contains=helpStar

This pattern doesn't match non-ASCII (UTF-8 encoded Unicode) characters. More specifically, the class [#-)!+-~] matches only printable ASCII characters, so no alphabetic character beyond ASCII is accepted. Using a negated character class such as [^ "*\t] instead of [#-)!+-~] would match all Unicode characters (and control characters, but those are not supposed to be present in text files, except for tabs).

An ASCII table shows that the only characters outside the #-) range, the ! character, and the +-~ range are the control characters below 32, the space, the double quote ", and the star *.
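The claim about the ASCII table can be checked mechanically. The following is a minimal Python sketch (not Vim script, and not part of the original report) that spells out the class [#-)!+-~] as code-point ranges and lists what it rejects. The variable names are illustrative only. Note that DEL (127) is also rejected, in addition to the control characters below 32.

```python
# Code points accepted by the Vim class [#-)!+-~], written out as ASCII ranges.
allowed = (
    set(range(ord("#"), ord(")") + 1))    # the range  #-)
    | {ord("!")}                          # the single character  !
    | set(range(ord("+"), ord("~") + 1))  # the range  +-~
)

# Every 7-bit code point the class rejects.
rejected = [c for c in range(128) if c not in allowed]

# Of those, the printable ones (32..126); the rest are control characters.
printable_rejected = [chr(c) for c in rejected if 32 <= c < 127]

print(printable_rejected)
```

Anything above code point 127 is outside the class as well, which is why non-ASCII tag names are not highlighted.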

Note that :helptags already supports Unicode tags; it's just syntax highlighting that is lagging behind. I currently use this in my .vimrc:

augroup vimrc_filetype_help
  au FileType help syn match helpHyperTextJump	"\\\@<!|[^ \þ!*]\+|" contains=helpBar
  au FileType help syn match helpHyperTextEntry	"\*[^ \t!*]\+\*\s"he=e-1 contains=helpStar
  au FileType help syn match helpHyperTextEntry	"\*[^ \t!*]\+\*$" contains=helpStar
augroup END

Also reported with neovim in neovim/neovim#7402

@k-takata
Member

Is \þ in helpHyperTextJump a typo for \t?

@mildred
Author

mildred commented Oct 17, 2017

yes, it is, sorry

augroup vimrc_filetype_help
  au FileType help syn match helpHyperTextJump	"\\\@<!|[^ \t!*]\+|" contains=helpBar
  au FileType help syn match helpHyperTextEntry	"\*[^ \t!*]\+\*\s"he=e-1 contains=helpStar
  au FileType help syn match helpHyperTextEntry	"\*[^ \t!*]\+\*$" contains=helpStar
augroup END
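To illustrate the difference between the two character classes, here is a rough Python sketch, assuming the Vim patterns translate directly to Python's `re` syntax (`\+` becomes `+`; everything else is kept as written in the snippet above). The sample tag names are hypothetical, and this only approximates what Vim's syntax engine does with 'encoding' set to utf-8.

```python
import re

# Python translations of the tag-entry patterns (assumed equivalent to the Vim ones).
ascii_only = re.compile(r"\*[#-)!+-~]+\*")  # class used by the shipped help syntax
negated = re.compile(r"\*[^ \t!*]+\*")      # class used in the .vimrc workaround

# Hypothetical help tags: one pure ASCII, one containing a non-ASCII letter.
samples = ["*ascii-tag*", "*café-tag*"]

# For each sample, record (matched by ascii_only, matched by negated).
results = {
    s: (bool(ascii_only.fullmatch(s)), bool(negated.fullmatch(s)))
    for s in samples
}

for tag, (old, new) in results.items():
    print(f"{tag}: ascii_only={old} negated={new}")
```

One difference worth noting: the workaround class excludes ! and allows ", whereas the original class allows ! and excludes ".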

@brammool
Contributor

I prefer to keep the help tags ASCII, so that they still work when 'encoding' is some code page. Some people still use that. And there is not much point in non-ASCII characters in help tags.

@k-takata
Member

But isn't it useful for translated help?
Near the end of :help help-translated:

Hints for translators:
- Do not translate the tags.  This makes it possible to use 'helplang' to
  specify the preferred language. You may add new tags in your language.

@vim-ml
vim-ml commented Oct 19, 2017 via email

@chrisbra
Member

closing then.

alerque pushed a commit to preservim/vim-textobj-quote that referenced this issue Sep 1, 2021
Vim does not support non-ASCII characters in help documents. They can
lead to an error when building help tags.  See the following link for
details:

vim/vim#2213 (comment)

However, if we remove the non-ASCII characters from just the first line
of the document, then help tags build correctly.

The alternative is to rename the document from *.txt to *.??x, which
seems more confusing than changing one line. (In the future, if we have
to choose between removing *all* the non-ASCII characters and changing
the filename, I would support changing the filename.)

This should fix #29. Thanks to @Freed-Wu for teaching us about this
corner case.