Whitespace handling for auto ID generation only considers space and tab #6710

Closed
jkboxomine opened this issue Jan 5, 2020 · 4 comments


@jkboxomine jkboxomine commented Jan 5, 2020

What version of Hugo are you using (hugo version)?

$ hugo version
Hugo Static Site Generator v0.63.0-DEV darwin/amd64 BuildDate: unknown

Does this issue reproduce with the latest release?

Yes

Whitespace characters other than space or tab (for example, non-breaking space) are not taken into account for auto ID generation when autoHeadingIDType is set to "github" or "github-ascii". For instance, the heading below produces an auto ID as follows.

## Template(space)Name(nbsp)Convention => template-nameconvention

The code below seems to be related to this issue.

https://github.com/gohugoio/hugo/blob/master/markup/goldmark/autoid.go#L88-L90
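
To make the report concrete, here is a minimal sketch of the behavior described above. It is not Hugo's actual code: `headingID` and its `unicodeSpace` flag are hypothetical names, and the slug rules are simplified. It only illustrates how checking for space and tab alone drops a non-breaking space, while a check based on `unicode.IsSpace` would turn it into a hyphen.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// headingID is a hypothetical, simplified slug function (not Hugo's implementation).
// With unicodeSpace == false, only ' ' and '\t' are turned into hyphens.
// With unicodeSpace == true, every rune accepted by unicode.IsSpace is.
func headingID(heading string, unicodeSpace bool) string {
	var b strings.Builder
	for _, r := range strings.ToLower(strings.TrimSpace(heading)) {
		isSep := r == ' ' || r == '\t'
		if unicodeSpace {
			isSep = unicode.IsSpace(r)
		}
		switch {
		case isSep:
			b.WriteRune('-')
		case unicode.IsLetter(r) || unicode.IsDigit(r) || r == '-' || r == '_':
			b.WriteRune(r)
		}
		// Any other rune is dropped, including NBSP when unicodeSpace is false.
	}
	return b.String()
}

func main() {
	h := "Template Name\u00a0Convention" // space, then non-breaking space
	fmt.Println(headingID(h, false))     // template-nameconvention
	fmt.Println(headingID(h, true))      // template-name-convention
}
```

Whether the second result is the desired one depends on what "github" compatibility is supposed to mean, which is the open question in this issue.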

Regarding the nbsp example: it's common for nbsp to get mixed into Markdown content, especially when the content is migrated from other web-based tools such as Confluence. We cannot expect every project to clean up its content, so this issue needs to be fixed.


@matteocontrini matteocontrini commented Jan 5, 2020

Member

@bep bep commented Jan 5, 2020

@jkboxomine can you provide a failing test case (that can be copy-pasted)?

Author

@jkboxomine jkboxomine commented Jan 5, 2020

@bep, it looks like that's the correct behavior for GitHub-compliant auto heading IDs.

https://raw.githubusercontent.com/jkboxomine/goldmark-headingid/master/README.md
https://github.com/jkboxomine/goldmark-headingid#test-for-github-heading-id-generation-temporary

Just as a side note, it seems like copy-pasting the raw text mostly converts nbsp to space. You can force the input of nbsp in your editor by some key combination (e.g. option-space in VSCode).

Thanks.

Member

@bep bep commented Jan 5, 2020

@matteocontrini IsSpace returns true for both vertical and horizontal space, which may or may not be correct in this case. I can't find GitHub's implementation of this; they seem to do it in some common "HTML pipeline" ...
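
As a rough illustration of the vertical/horizontal distinction, assuming the IsSpace in question behaves like Go's standard `unicode.IsSpace`: the sketch below wraps it in a hypothetical helper, `isHorizontalSpace`, that rejects line and paragraph separators. The helper and its exclusion list are assumptions for illustration, not something taken from Hugo or GitHub.

```go
package main

import (
	"fmt"
	"unicode"
)

// isHorizontalSpace is a hypothetical helper: it accepts runes that
// unicode.IsSpace accepts, except those that break lines or paragraphs.
func isHorizontalSpace(r rune) bool {
	switch r {
	case '\n', '\v', '\f', '\r', '\u0085', '\u2028', '\u2029':
		return false // vertical whitespace
	}
	return unicode.IsSpace(r)
}

func main() {
	runes := []rune{' ', '\t', '\u00a0', '\u3000', '\n', '\u2028', 'a'}
	for _, r := range runes {
		fmt.Printf("%U IsSpace=%t horizontal=%t\n",
			r, unicode.IsSpace(r), isHorizontalSpace(r))
	}
}
```

Since a Markdown heading normally sits on a single line, the two checks would likely agree for most real input; the difference matters mainly as a question of intent.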

@bep bep closed this in 9b6e614 Jan 5, 2020