mkdx TOC does not support utf8 title #130

Closed
4 tasks
Cyperwu opened this issue Sep 25, 2020 · 11 comments

@Cyperwu

Cyperwu commented Sep 25, 2020

OS type:

  • [x] Unix
  • [ ] Windows
  • [ ] Other ([SPECIFY])

Vim:

  • [ ] vim
  • [x] neovim
  • [ ] Other ([SPECIFY])

Vim version:

NVIM v0.5.0-nightly

Reproduce steps:

  1. Open a markdown file.
  2. Paste the following example into the file:
     ## 测试
  3. Then attempt to generate the TOC.

Expected:

# Table of Contents

- [Table of Contents](#table-of-contents)
  - [测试](#测试)

## 测试

Actual:

outputs:

# Table of Contents

- [Table of Contents](#table-of-contents)
  - [测试](#)

## 测试
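
The empty fragment in [测试](#) is consistent with a slug step that keeps only ASCII word characters. The following is a minimal Python sketch of that failure mode, not mkdx's actual code: both function names and both patterns are assumptions made for illustration, and the CJK Unified Ideographs block (U+4E00 to U+9FFF) is just one possible range to allow.

```python
import re

def ascii_only_fragment(title: str) -> str:
    """Hypothetical slugger that keeps only ASCII letters, digits, '-' and '_'."""
    slug = title.strip().lower().replace(" ", "-")
    return re.sub(r"[^a-z0-9_-]", "", slug)

def cjk_aware_fragment(title: str) -> str:
    """Same idea, but also keeps the CJK Unified Ideographs block (U+4E00-U+9FFF)."""
    slug = title.strip().lower().replace(" ", "-")
    return re.sub(r"[^a-z0-9_\-\u4e00-\u9fff]", "", slug)

print(repr(ascii_only_fragment("测试")))   # every character is stripped
print(repr(cjk_aware_fragment("测试")))    # the heading survives
```

With the ASCII-only pattern the heading collapses to an empty string, which would produce a link like (#); with the wider class it is kept as-is.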
@SidOfc SidOfc self-assigned this Sep 25, 2020
@SidOfc SidOfc added the bug:report Something does not work as intended / expected. label Sep 25, 2020
@SidOfc
Owner

SidOfc commented Sep 25, 2020

Yep that looks like a bug indeed, will check it out somewhere this week.

Cheers for the report 👍

@Cyperwu
Author

Cyperwu commented Sep 25, 2020

Thanks! There is also another bug:

## 1. test

generates

[1. test](#1-test)

The period was missing. Should I open a new issue or leave it here?

@SidOfc
Owner

SidOfc commented Sep 25, 2020

@Cyperwu so it actually generated [1 test](#1-test) instead of [1. test](#1-test)?

@SidOfc
Owner

SidOfc commented Sep 25, 2020

OK, so the issue with Chinese characters should now be fixed by including the Unicode range for Chinese characters (see https://stackoverflow.com/questions/41318003/how-to-match-chinese-characters-with-grep).

About the other bug: I checked what gets generated for # 1. some heading and I can't find any errors in the link text or the actual link. It ends up like this: [1. some heading](#1-some-heading), which is correct in both the text and the generated fragment link on GH.
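
To make the distinction between link text and fragment concrete, here is a rough Python sketch. It only approximates how GitHub derives fragments and is not taken from mkdx; the function name toc_entry and the exact pattern are assumptions for illustration.

```python
import re

def toc_entry(heading: str) -> str:
    """Build a '[text](#fragment)' link from an ATX heading line.

    The link text is the heading verbatim; only the fragment is slugged.
    """
    text = heading.lstrip("#").strip()
    fragment = text.lower()
    # Drop everything except word characters, spaces and hyphens.
    # This is the step that removes the period from the fragment.
    fragment = re.sub(r"[^\w\- ]", "", fragment)
    fragment = fragment.replace(" ", "-")
    return f"[{text}](#{fragment})"

print(toc_entry("# 1. some heading"))
```

The period stays in the visible text and is dropped only from the fragment, which matches the output quoted above.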

@Cyperwu
Author

Cyperwu commented Sep 27, 2020

Sorry, I checked my setup and found that it was caused by my markdown preview tool, so that other bug does not actually exist.
And I tried out the latest commit, which raises E945: Range too large in character class when generating TOC.

SidOfc added a commit that referenced this issue Sep 27, 2020
…rontmatter last line showing up in generated TOC
@SidOfc
Owner

SidOfc commented Sep 27, 2020

I can't reproduce this on either neovim (0.5.0) or regular vim (8.1); it works even after set re=1. I did however find some useful information in :h E945 which shows how to "fix" it. I've pushed the commit to master, so you can try it out by updating.

I also patched another bug in there: YAML front matter was showing up in the generated table of contents, because in markdown --- is used for titles as well as for YAML front matter.
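
The ambiguity can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the plugin's implementation: find_headings and its skip_front_matter flag are hypothetical, and only ATX headings and --- underlined headings are detected.

```python
def find_headings(lines, skip_front_matter=True):
    """Collect heading texts, optionally skipping a leading YAML front matter block."""
    start = 0
    if skip_front_matter and lines and lines[0].strip() == "---":
        # Front matter runs from the first line to the next '---' line.
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                start = i + 1
                break
    body = lines[start:]
    headings = []
    for i, line in enumerate(body):
        if line.startswith("#"):
            # ATX heading: '# Title'
            headings.append(line.lstrip("#").strip())
        elif line.strip() and i + 1 < len(body) and body[i + 1].strip() == "---":
            # Setext-style heading: a text line underlined with '---'
            headings.append(line.strip())
    return headings

doc = ["---", "title: Notes", "---", "", "# Intro", "", "Details", "---"]
print(find_headings(doc, skip_front_matter=False))
print(find_headings(doc))
```

Without the skip, the last front matter line (title: Notes) looks like a heading underlined with ---; with it, only Intro and Details are returned.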

@Cyperwu
Author

Cyperwu commented Sep 28, 2020

It still raises E945, but the TOC is generated. Updating the TOC afterwards doesn't throw the error.

Here is a piece of the error message:

Error detected while processing function mkdx#GenerateOrUpdateTOC[10]..mkdx#GenerateTOC[7]..324[23]..326:
line    6:
E945: Range too large in character class
E945: Range too large in character class
...
E945: Range too large in character class
Error detected while processing function mkdx#GenerateOrUpdateTOC[10]..mkdx#GenerateTOC[85]..<lambda>4[1]..327[1]..326:
line    6:
E945: Range too large in character class
E945: Range too large in character class
E945: Range too large in character class
Error detected while processing function mkdx#GenerateOrUpdateTOC[10]..mkdx#GenerateTOC[79]..326:
line    6:
E945: Range too large in character class
Error detected while processing function mkdx#GenerateOrUpdateTOC[10]..mkdx#GenerateTOC[82]..<lambda>4[1]..327[1]..326:
line    6:
E945: Range too large in character class
Error detected while processing function mkdx#GenerateOrUpdateTOC[10]..mkdx#GenerateTOC[85]..<lambda>4[1]..327[1]..326:
line    6:
E945: Range too large in character class
E945: Range too large in character class
E945: Range too large in character class

I found some code in vim-markdown-toc. They support Chinese and Korean specifically, but I think that's not the common use case for markdown writing.

@SidOfc
Owner

SidOfc commented Sep 28, 2020

Hmm, very strange, and also not very easy for me to fix as I don't get this error at all :(

What I think I'll try to do is split this one big range up into multiple smaller ones that are no more than 256 chars apart since that is the limit of the old regex engine. Unfortunately though I'll also have to ask for your feedback on it once I've committed some change since I can't reproduce it myself 😅

I'll keep you posted when I've worked more on it, will try to apply another patch after work.

@SidOfc
Owner

SidOfc commented Sep 28, 2020

alright @Cyperwu can you update again and give it another shot?

The ranges are now split and should all be < 256 characters apart as per the Vim limit. The only thing I can imagine that would still cause it to fail is that they are all in the same character class, now as quite a few small ranges rather than one giant range.
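
The splitting idea can be sketched in Python like this. It is not the committed patch: the helper names are hypothetical, and the U+4E00 to U+9FFF range (CJK Unified Ideographs) is an assumption, since the thread does not state which range the plugin uses.

```python
def split_range(start: int, end: int, max_span: int = 256):
    """Split the inclusive code point range [start, end] into sub-ranges
    whose endpoints are at most max_span - 1 apart."""
    ranges = []
    lo = start
    while lo <= end:
        hi = min(lo + max_span - 1, end)
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges

def vim_char_class(ranges):
    """Render the sub-ranges as a single Vim character class,
    writing each endpoint as a backslash-u escape."""
    body = "".join(f"\\u{lo:04x}-\\u{hi:04x}" for lo, hi in ranges)
    return f"[{body}]"

chunks = split_range(0x4E00, 0x9FFF)
print(len(chunks))  # number of sub-ranges
print(vim_char_class(chunks[:2]))
```

Every sub-range spans at most 255 code points between its endpoints, so each one stays under the 256 limit even though they all sit in the same character class.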

Cheers for your patience and feedback so far, looking forward to more info 👍

@Cyperwu
Author

Cyperwu commented Sep 29, 2020

Cheers, your patch works!

@SidOfc
Owner

SidOfc commented Sep 29, 2020

Good stuff, will close this, thanks for reporting!

@SidOfc SidOfc closed this as completed Sep 29, 2020