Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non ascii headers do not slugify nicely. #212

Closed
tobegit3hub opened this issue Nov 11, 2014 · 32 comments
Closed

Non ascii headers do not slugify nicely. #212

tobegit3hub opened this issue Nov 11, 2014 · 32 comments

Comments

@tobegit3hub
Copy link

If we have a

web

in the markdown file, it will generate a hyperlink like "#web". This is what we expect but for other languages, it just returns "#_1" or "#_2".

@SvenDowideit
Copy link

mmm, wait a moment....

ah, I'm not actually aware of 'other language' support in mkdocs - /me throws it on the list of things to read.

@tobegit3hub
Copy link
Author

Now we may write documents in Chinese and the hyperlink looks a little wired. It's not a big deal but it would better to fix it 😃

@tomchristie
Copy link
Member

So what the reporter is describing here is that the anchor link is generated using any ASCII characters in the heading text.

@tobegit3hub
Copy link
Author

Thanks @tomchristie and sorry for my bad description. Can we use Unicode instead of ASCII for generate anchor links?

@tomchristie
Copy link
Member

Anchor links do need to be ascii, as URLs are restricted to that.
We might want to look at something like https://github.com/dimka665/awesome-slugify
Some notes from @pydanny on it, here: http://www.pydanny.com/awesome-slugify-human-readable-url-slugs-from-any-string.html

@tomchristie tomchristie changed the title The hyperlink is not well for other languages Non ascii headers do not slugify nicely. Dec 8, 2014
@d0ugal
Copy link
Member

d0ugal commented Dec 8, 2014

Huh, that project is neat. I have no idea if this is accurate, but it appears to work...

Python 3.4.2 (default, Oct  9 2014, 17:09:51) 
[GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import slugify
>>> from slugify import slugify
>>> slugify("您好")
'Nin-Hao'

@pydanny
Copy link

pydanny commented Dec 8, 2014

👍 for using awesome-slugify

@tomchristie
Copy link
Member

@d0ugal
Copy link
Member

d0ugal commented Dec 8, 2014

The python-markdown TOC extension (which generates the slugs for us) accepts a slugify function, so it should really easy to integrate awesome-slugify.

Would it be worth making this an optional feature? It could cause quite a few broken links for some people relying on the current behavior.

@tomchristie
Copy link
Member

Personally I'd say til we hit 1.0 ask em to suck it up.

@tomchristie
Copy link
Member

Tho we could tie it in with eg #279

@facelessuser
Copy link
Contributor

If/when you implement awesome-slugify, It would be cool if you consider exposing it to any/all extensions that may be included. That way they don't have to do any gymnastics to get at the TOC class instance's slugify.

Maybe just make it simple. If a extension has a slugify config option defined, you could replace its value with the internal slugify. That way a 3rd party extension could take advantage of the same slugify and be in sync.

Just for each plugin you would need to:

if 'slugify' in config:
    config['slugify'] = mkdoc_slugify_method

That way a user could just define the slugify key, and then get the method. The value wouldn't matter as mkdocs would just overwrite it.

markdown_extensions:
    myextension:
         some_config: true
         slugify:

@d0ugal
Copy link
Member

d0ugal commented May 1, 2015

@facelessuser that sounds a bit too magical or me. There is no guarantee that the different slugify functions would have the same method signature.

@facelessuser
Copy link
Contributor

I guess, I haven't looked at awesome-slugify close enough to understand the possible drawbacks of using it in this way. I guess the best bet is to shy away from other plugins that may need to augment headers or craft a special one to use the instance contained within toc; it's doable if needed, so no worries.

@robodude666
Copy link

I don't agree with having mkdocs dependent on there being a slugify package installed. What if I have my own slugify function I want to use; I'd have to wrap it into a Slugify class, and hope none of my other dependencies rely on awesome-slugify ?

I'd rather see mkdocs convert the yaml toc slugify string into a module import and pass the toc extension a proper value... though, that seems too specific to me. I wouldn't expect mkdocs to support similar requirements for all markdown extensions... though technically, the markdown toc extension should simply just accept an import string and do the import itself.

@d0ugal
Copy link
Member

d0ugal commented Jun 25, 2015

@robodude666 Sorry, I don't follow you...

I don't agree with having mkdocs dependent on there being a slugify package installed. What if I have my own slugify function I want to use; I'd have to wrap it into a Slugify class, and hope none of my other dependencies rely on awesome-slugify ?

Why would you need to wrap anything if we use autsome-slugify? and why would it matter if your dependencies use it?

I'd rather see mkdocs convert the yaml toc slugify string into a module import and pass the toc extension a proper value... though, that seems too specific to me.

The "yaml toc slugify string"?

Are you objecting to idea in #212 (comment)? or something else?

@robodude666
Copy link

I'm opposing the solution provided by #297. It limits users of mkdocs customizing the toc slugify parameter by forcing there to be a slugify.Slugify class with the user's slugify code.

If I configure the mkdocs.yml file as follows:

markdown_extensions:
    - toc:
        permalink: "P"
        slugify: "mycodebase.mkdocs_stuff.toc_slugify"
    - admonition:

I would expect the slugify parameter specified would be imported based on the string provided.

@waylan
Copy link
Member

waylan commented Jun 25, 2015

the markdown toc extension should simply just accept an import string and do the import itself

While I'm not completely opposed to that idea (as the lead dev of Python-Markdown), it most likley won't happen for some time. Python-Markdown it currently in "maintainance mode" as we work towards 3.0. In other words the 2.x line will only receive bug fixes going forward with no new features.

I'd rather see mkdocs convert the yaml toc slugify string into a module import and pass the toc extension a proper value...

I'm assuming you mean the the string assigned to the extension config in the YAML config file should be recognized by MkDocs as using Python dot notation and be imported. I agree that that is a little to magic for my tastes.

However, the PyYaml library does support special types which will result in the value being a Python object rather than a string. You just have to explicitly define the type in your YAML file and YAML will attempt to import the object and assign it as the value.

Therefore, the following would presumably work today with no changes to MkDocs or Python-Markdown (untested - I haven't checked the signature of awesome-slugify to verify if it will work):

markdown_extensions:
    - toc:
         slugify: !!python/name:slugify.slugify

A possible alternative would be to wait for the Plugin API (see #206) to land. Then you could use a post-config hook (which would be called after the config was set up but before anything else runs) to modify the config by replacing the string (which uses Python dot notation) with the Python object. Personally, outside if using the special YAML types, I think this is probably the best way for users to modify the behavior in this type of situation.

@waylan
Copy link
Member

waylan commented Jun 25, 2015

Giving this some additional thought, a Plugin wouldn't even need to replace a dot notation string with a Python object. It could just add the slugify key to the extension configs with the appropriate value. In fact, my opinion is that the appropriate fix for this issue would be to use a Plugin. If someone wants to use awesome-slugify, they can use the "Awesome Slugify Plugin" and if they want to use "Other Cool Slugify" then they can use the "Other Cool Slugify Plugin".

Of course, given PyYaml's ability to assign Python types, I think this whole thing is moot. The only reason to even consider a plugin is if it is decided that explicitly defining the type in YAML is ugly and not something you want to document for users. "Use this plugin" is certainly easier to explain to noobs.

Either way, I don't see any reason to leave either this issue or #297 open. But, of course, that is not my call.

@robodude666
Copy link

@waylan @d0ugal Can confirm that PyYaml's special types work with mkdocs:

markdown_extensions:
    - toc:
         slugify: !!python/name:mycodebase.myawesome.slugify

Allows me to define the dot notation path to my slugify function and it's used appropriately. With that said, I would agree with @waylan that this issue is a non-issue.

Might be nice to note in the markdown_extensions that this functionality from PyYaml is available; although it might only be an advanced user feature?

@facelessuser
Copy link
Contributor

I actually never made the connection this could be done, even though I knew about pyyaml's special types. This is a useful tip.

@tomchristie
Copy link
Member

explicitly defining the type in YAML is ugly and not something you want to document for users.

Bothers me that it's even something that's possible to do. :-/

@tomchristie
Copy link
Member

@waylan @d0ugal Can confirm that PyYaml's special types work with mkdocs

If they can, I'd suggest that we prevent them from being able to do so. Yergh.

@d0ugal
Copy link
Member

d0ugal commented Jun 26, 2015

Yeah, this is the default in PyYAML which I agree is kinda gross. We can stop it by using yaml.safe_load. I had considered changing it before, but I don't think it is a big issue for projects like MkDocs.

IMO yaml.load should be called dangerous_load and the default should be safe_load... but that is another matter.

@tomchristie
Copy link
Member

"I don't think it is a big issue for projects like MkDocs."

...

"is kinda gross"

Agreed. Not a big issue here, just icky.

@facelessuser
Copy link
Contributor

I would think it could be prevented once a plugin API allows this. This has actually been something I have wanted as well, and the yaml "danger" load allows me to accomplish it now. With that said, I totally understand why a maintainer would want to kill it, but it now saves me from doing coding gymnastics as a user to override and/or acquire the slugify method used in toc and and share it with other extensions that touch the headers.

@robodude666
Copy link

I agree with @facelessuser. It shouldn't be prevented until there's another approved method for the same result.

Until we have extensions, it's the only way right now to customize the slugification without having to author a subclass of the TOC extension.

@d0ugal
Copy link
Member

d0ugal commented Jun 26, 2015

Yeah, my thoughts are heading in this direction too. Let's leave this as it is for now, support this sort of customisation via extensions later and maybe switch to safe_load in the YAML. I am not sure the swtich is really needed, it grosses me out, but I'll never use it anyway.

So, now I am wondering if we I should just close this issue and direct people to track #206

@d0ugal d0ugal removed this from the 1.0.0 milestone Jun 26, 2015
@d0ugal
Copy link
Member

d0ugal commented Jun 26, 2015

It doesn't sound like anyone is advocating that we change the default. Users have a way of changing the default now and we want to make that easier with extensions later.

So, yeah, there is nothing that needs to happen here. Thanks all for the discussion, that was really useful.

I'll happily re-open if anyone can put forward a compelling case to do so.

@ngtrian
Copy link

ngtrian commented Mar 13, 2017

I have fix a problem with pymdownx

pip install pymdown-extensions

mkdocs.yml

- toc:
   slugify: !!python/name:pymdownx.slugs.uslugify

@rapon233
Copy link

After all, what should I modify in mkdocs.yml to make use of awesome-slugify?
Installed from pip, and then I'm confused with the code you mentioned above: @waylan

markdown_extensions:
    - toc:
         slugify: !!python/name:slugify.slugify

If my mkdocs dir is

newdoc
    -docs
        -git
            -git1.md
            -git2.md
        -test
            -test.md
    -index.md
    -mkdocs.yml

It would be great to your reply!

@waylan
Copy link
Member

waylan commented Dec 10, 2019

@rapon if you read my comments surrounding my example, you will note that it was unsure of the correct import path for the slugify lib. My example was simply meant to serve as an example of how it might be done. However, the other examples provide real working examples. The differences between them is that each is using a different library.

The key is that the part after !!python/name: is dependent on which lib you are using. You need to check the documentation for that lib and adjust accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants