Header anchor with diacritics #807
Comments
Note: In HTML5, there is no such restriction on the ID attribute:

I don't know of any general way to translate accented characters to ASCII equivalents. In your case, there is an obvious translation -- just drop the accents -- but this won't be true in general, e.g. for Chinese. So for full generality one would have to use something like percent-encoding -- but without percent signs, of course. It would be ugly and it would be hard for users to calculate the IDs on the fly.

I just tried some documents with links to the Unicode anchors, and they seem to work fine in modern browsers, even with an HTML 4 doctype. So I'm inclined not to worry about being "correct" in this respect. The alternative seems to me worse, and I don't see much advantage.
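To make the "percent-encoding without percent signs" idea concrete, here is a minimal Python sketch. The function name `percent_free_id` and the exact scheme (lowercase, spaces to dashes, UTF-8 percent-encoding with the `%` removed) are assumptions made for illustration; nothing like this is implemented in pandoc.

```python
from urllib.parse import quote


def percent_free_id(heading: str) -> str:
    """Hypothetical ID scheme: percent-encode the UTF-8 bytes, then drop '%'."""
    slug = heading.strip().lower().replace(" ", "-")
    # quote() leaves ASCII letters, digits and '-' untouched and encodes
    # every other character as %XX per UTF-8 byte.
    return quote(slug, safe="-").replace("%", "")


print(percent_free_id("Nejaký text"))  # nejakC3BD-text
print(percent_free_id("中文"))          # E4B8ADE69687
```

The results are pure ASCII but, as noted above, ugly and hard to work out by hand.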
Yes, for HTML5 the output really is valid. I use the Nette framework, and it "webalizes" strings like this. Yes, this solution has a problem with, e.g., Chinese. The problem is not that it fails in browsers, but that it is not compatible with other renderers, e.g. GitHub. Ideally, in my view, the ID would be generated by the "Nette" algorithm, and if that returns an empty string, the current algorithm would be used (only replacing spaces with dashes). The problem is that this solution is again not compatible with GitHub. :(

My reason for this issue, e.g.:

# Test
[Nejaký text](#nejaký-text)
## Nejaký text

When I generate a PDF using Pandoc, the output is correct. I don't know of a solution to this problem that doesn't involve changing the ID generator in pandoc. It would be nice to have at least a command-line argument for this feature. For now I must choose between correct output on GitHub and correct output via Pandoc.

PS: Currently, from ### Údaje o zákazníkovi I get <h3 id="údaje-ozákazníkovi"> Údaje o zákazníkovi</h3>. Why not <h3 id="údaje-o-zákazníkovi"> Údaje o zákazníkovi</h3>?
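The proposal above (strip accents to get an ASCII ID, and fall back to the current behaviour when nothing is left) could be sketched in Python roughly as follows. This is not Nette's actual algorithm and not pandoc's code; the function names and the exact fallback are assumptions for illustration only.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose characters and drop the combining marks (accents)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ascii_id_with_fallback(heading: str) -> str:
    """Hypothetical ID generator: ASCII slug if possible, else a Unicode one."""
    # Drop whatever still has no ASCII equivalent after removing accents.
    ascii_only = strip_accents(heading).lower().encode("ascii", "ignore").decode("ascii")
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only).strip("-")
    if slug:
        return slug
    # Fallback: keep the original characters, only join words with dashes.
    return "-".join(heading.lower().split())


print(ascii_id_with_fallback("Údaje o zákazníkovi"))  # udaje-o-zakaznikovi
print(ascii_id_with_fallback("中文 标题"))              # 中文-标题
```

As the comment itself points out, the fallback branch still produces IDs that may differ from what GitHub generates.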
On the last point: I just tried it, and I got what you expected,
Unfortunately, I don't know of a Haskell library that provides a
Note that you can now specify the header ID explicitly in pandoc (though this probably won't work in GitHub):
You can also use explicit HTML anchor tags if you need something that works in both.
OK, it wasn't actually too hard to create the needed function from the official Unicode tables. I'll try to incorporate this.
Probably the best approach is to add a new markdown extension for strict IDs, and use it with markdown_github. It appears that GitHub just completely ignores characters that don't have ASCII equivalents when generating the ID.
How?
Try
if you want to ensure that the identifiers are ASCII.
@jgm (concerning
@Wolf-at-SO yes that could easily be done. The relevant source file is
@jgm Thanks,
For an ID to be generated correctly, it must contain only the characters [a-z0-9-].
Currently, from:
is generated:
The correct output is:
This behavior would match that of GitHub Markdown (e.g. README.md).
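The `[a-z0-9-]` requirement stated above can be checked mechanically. The following Python sketch uses a hypothetical helper, `is_strict_id`, that tests whether an identifier consists only of those characters.

```python
import re

# Only lowercase ASCII letters, digits and dashes; at least one character.
STRICT_ID = re.compile(r"[a-z0-9-]+")


def is_strict_id(identifier: str) -> bool:
    """Return True if the whole identifier matches [a-z0-9-]+."""
    return STRICT_ID.fullmatch(identifier) is not None


print(is_strict_id("udaje-o-zakaznikovi"))  # True
print(is_strict_id("údaje-o-zákazníkovi"))  # False
```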