Taxonomies names does not match urlized #5687

Open
allanjude opened this issue Feb 10, 2019 · 12 comments
@allanjude allanjude commented Feb 10, 2019

The documentation here says:
https://gohugo.io/content-management/taxonomies/#preserve-taxonomy-values

"Therefore, if you want to have a taxonomy term with special characters such as Gérard Depardieu instead of Gerard Depardieu, set the value for preserveTaxonomyNames to true in your site config."

"Note that if you use preserveTaxonomyNames and intend to manually construct URLs to the archive pages, you will need to pass the taxonomy values through the urlize template function."

However, in v0.54.0 this is not the behaviour I am seeing. preserveTaxonomyNames is false, but names are not normalized.

I have defined a taxonomy called "author" which has people's names in it.

However, when I do:

    {{ with $.Site.GetPage (printf "/author/%s" ( $author | urlize )) }}
      <a href="{{ .RelPermalink }}">{{ $author }}</a>
    {{ end }}

It doesn't get the page for authors with accents in their names:

{{ "Olivier Cochard-Labbé" | urlize }} = olivier-cochard-labb%C3%A9

but the taxonomy is: olivier-cochard-labbé

How do I encode the url the same way that the taxonomy is encoded, or get the taxonomy to be urlize'd?
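
The mismatch described above can be reproduced outside Hugo. The following is a minimal Go sketch, not Hugo's actual implementation: `slug` and `slugEncoded` are hypothetical helpers that only approximate the two forms seen in this report (the un-encoded taxonomy name and the percent-encoded `urlize` output). It assumes the difference comes down to a final percent-encoding step.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// slug is a simplified stand-in for the un-encoded form: lowercase, with
// spaces turned into hyphens. Hypothetical helper, NOT Hugo's real code.
func slug(s string) string {
	return strings.ReplaceAll(strings.ToLower(s), " ", "-")
}

// slugEncoded additionally percent-encodes the slug. This extra step is
// assumed to be what makes the urlize output differ from the taxonomy name.
func slugEncoded(s string) string {
	return url.PathEscape(slug(s))
}

func main() {
	name := "Olivier Cochard-Labbé"
	fmt.Println(slug(name))        // olivier-cochard-labbé
	fmt.Println(slugEncoded(name)) // olivier-cochard-labb%C3%A9
}
```

The two strings differ only in the non-ASCII character, which is consistent with the lookup failing only for names that contain accents.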

@gcushen gcushen commented May 2, 2019

@bep this is a very common problem being reported with Academic theme since Hugo 0.54/0.55.

Does it appear to be a Hugo bug or should there be extra logic applied to non-ASCII characters when using GetPage and urlize to get a taxonomy page as above?

Thanks :)

@gcushen gcushen commented May 5, 2019

@a-fortunato a-fortunato commented May 31, 2019

Hey,
I was struggling with this too, but I managed to make it work by adding removePathAccents = true to the config file.

(I found the answer here #1180 and here https://discourse.gohugo.io/t/problem-with-taxonomies-in-foreign-languages/7955/10).

That should be enough for languages that mainly have to deal with accents, but it seems there are still some problems with other special characters, as explained here: #3476.

Good luck!

@gcushen gcushen commented Jun 18, 2019

@a-fortunato thanks for the input with the possible workaround! The bug remains unfixed, though: the URL passed to GetPage is invalid and does not match the actual taxonomy page URL unless non-ASCII characters are removed (for example via the workaround).

gcushen added a commit to gcushen/hugo-academic that referenced this issue Jun 18, 2019
gcushen added a commit to sourcethemes/academic-kickstart that referenced this issue Jun 18, 2019
rhewett added a commit to rhewett/hugo-academic that referenced this issue Jun 19, 2019

@vintikzzz vintikzzz commented Jul 25, 2019

I got it working by changing the theme templates from this

{{ $series := .Params.series | urlize }}

to this

{{ $series := .Params.series | anchorize }}

The anchorize function doesn't do URL encoding.

@HughP HughP commented Jul 25, 2019

It is not just accents. In my case I am using the saltillo (U+A78B LATIN CAPITAL LETTER SALTILLO) and it still breaks.

@vintikzzz vintikzzz commented Jul 25, 2019

@HughP have you tried my solution?
I had Russian text that looks like this: "Смотри фильмы онлайн с любого сайта". anchorize just converts it to "смотри-фильмы-онлайн-с-любого-сайта" (that works), whereas urlize turns it into "%D1%81%D0%BC%D0%BE%D1%82%D1%80%D0%B8-%D1%84%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B-%D0%BE%D0%BD%D0%BB%D0%B0%D0%B9%D0%BD-%D1%81-%D0%BB%D1%8E%D0%B1%D0%BE%D0%B3%D0%BE-%D1%81%D0%B0%D0%B9%D1%82%D0%B0" (doesn't work).

@HughP HughP commented Jul 26, 2019

@vintikzzz Yes that works for me. But I don't really understand the difference in what is happening.

@vintikzzz vintikzzz commented Jul 26, 2019

@HughP in my case there was code like this

{{ $series := .Params.series | urlize }}
{{ $posts := index .Site.Taxonomies.series $series }}

.Site.Taxonomies.series is a map of weighted pages, like so

map[
  смотри-фильмы-онлайн-с-любого-сайта:[
    WeightedPage(0,"Смотри фильмы онлайн с rutracker.org")
    WeightedPage(0,"Смотри фильмы онлайн с rutor.org")]
  что-нового:[
    WeightedPage(0,"Представляем Webtor 1.0")
  ]
]

and .Params.series is Смотри фильмы онлайн с любого сайта
As you can see, the key in the map is the anchorized value, not the urlized one.
urlize and anchorize work the same for the ASCII character set, because the URL encoding that urlize additionally applies doesn't change ASCII characters (except reserved ones). That's why, I think, many theme developers never run into this problem. Romance languages (French, Italian, etc.) mostly use ASCII characters; the main difference is the accents, and that's why removePathAccents = true solves the issue for them without changing theme templates.

@amoutiers amoutiers commented Aug 15, 2019

I have the same problem. I think Hugo should have a function that calculates a key from a name (a term name here), such as toKey, toIndex, keyize or indexize, using the same logic Hugo uses internally to calculate the index/key.

anchorize and urlize are not meant for this purpose; users reach for them because no such function exists.

@stale stale bot commented Dec 13, 2019

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help.
If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open.
If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.
This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

@stale stale bot added the Stale label Dec 13, 2019

@gcushen gcushen commented Dec 15, 2019

@bep using urlize with GetPage, as per the Hugo docs, can still fail as demonstrated above. If we don't fix this GetPage behaviour with urlize then perhaps we can implement a dedicated function to resolve the inconsistency in Hugo such as @amoutiers suggested above?

@stale stale bot removed the Stale label Dec 15, 2019
gcushen added a commit to gcushen/hugo-academic that referenced this issue Dec 15, 2019
Attempt to use anchorize rather than urlize for fetching user profiles for users with non-ASCII usernames. Apparently `anchorize` does not perform URL encoding unlike `urlize`, so may be better suited for use with `GetPage`.

See gohugoio/hugo#5687