New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop removing stopwords from auto-generated page slugs #4899
Comments
Just want to throw my +1 here - I support this change. It should at least be an option, so it won't modify the default behavior others are used to. It's annoying having to manually modify every slug, but it's so much cleaner in my opinion if the URL matches the title as closely as possible. |
I've posted to django-developers to solicit feedback on making this change upstream in Django, which feels like the right place for this to happen. |
@Scotchester and I submitted django/django#12945 (tracked in Django ticket 11157), and it was merged this morning. In Django 3.2+, Django's version of urlify.js will no longer strip stopwords from slugs. This doesn't affect Wagtail (or this issue) because Wagtail has its own copy of urlify.js. We could update Wagtail's copy of urlify.js to similarly remove this stopword behavior, now that it's been adopted by Django. Perhaps an even better approach, though, would be to consider removing Wagtail's copy entirely, and instead just relying on Django's copy that comes with the admin app. The default Wagtail project template does include |
As far as I remember, the desire to avoid a dependency on django.contrib.admin did come out of a real-world requirement (although all I can immediately find is #1567, which is probably more user error than "it's essential that this works without Django admin"), and I think it's still a good thing to adhere to - I'd be in favour of updating Wagtail's copy of urlify.js (assuming the new version is compatible with pre-3.2 versions of Django, of course). |
This is awesome, thanks to all who've put effort in! |
Issue Summary
When a new Wagtail page is saved without the user entering a slug, its slug is auto-generated from the page's title. Wagtail's slug generation uses a copy of the logic used by Django's
SlugField
(the Wagtail codebase contains a copy of the "bit of JavaScript" mentioned there). This logic includes the removal of certain stopwords if the page title contains only ASCII characters. For example, a page with title "To be or not to be, that is the question" will be given a slug "be-or-not-be-question", not "to-be-or-not-to-be-that-is-the-question" as one might expect (this can be entered manually if desired).This is longstanding behavior (since #8) but I'm opening this issue to solicit feedback on whether Wagtail should stop removing these stopwords, motivated in particular by #4881 and related discussion in #4884.
Here is the logic that performs stopword stripping, which includes the list of words that are removed:
wagtail/wagtail/admin/static_src/wagtailadmin/js/vendor/urlify.js
Lines 158 to 170 in 882f8f3
This behavior comes from Django's urlify.js, which is used in the Django admin in conjunction with
ModelAdmin.prepopulated_fields
onSlugField
fields, of whichPage.slug
is one. There's no backend code related to removal of these words.As @jjanssen mentions on #4884, the behavior of having a title like "Before the Sunrise" converted to a slug of just plain "sunrise" feels quite unexpected. Another example would be three pages named "Before Wagtail Space", "At Wagtail Space", and "Since Wagtail Space". All three of these would be given the same auto-generated slug: "wagtail-space".
Another downside of the existing logic is the unequal treatment given to English over other languages. Only English stopwords are removed, and if a title contains even a single Unicode character, no stopwords are. Worse, stopwords can be removed even though they may be valid words in other languages (see django#12905).
Granted, the current behavior does given users the ability to override the removal of stopwords if desired -- by retyping the slug -- but this is an extra step they'd have to remember to do. Especially given the complexity introduced by something like #4881, might it be better to default to a slightly simpler behavior?
The most straightforward change would be to remove the lines excerpted above, but keep the rest of the slug generation logic, which would maintain similarity with Django.
One consideration is around deviating from Django's logic. Because
Page.slug
is aSlugField
, it might be best to stay in step with Django slug creation, perhaps proposing any change in functionality there. On the other hand, Wagtail should be first and foremost focused on the needs of users, not on maintaining compatibility with Django admin behavior. Additionally, Wagtail is already using a copy of the logic instead of directly pulling in the Django JS file (and has to work to keep it up-to-date with Django changes, see for example #4897). The fact that Django's stopword list hasn't changed since 2005 (!) might be a sign that the Django community is happy with the existing logic -- or the use of DjangoSlugField
s with auto-generation might be an edge case that doesn't get much attention.Wagtail could instead consider making this logic customizable, so that users can define their own slugging behavior if it differs from the Wagtail default. This is at least theoretically possible now by including an alternate
wagtailadmin/js/vendor/urlify.js
file in a project's templates, although maybe not as simple or well-contained as one might like; it'd be nicer if you could override just the stopwords piece while using the default unicode character handling, for example. Customization of urlify.js logic in Django seems to have been discussed a fair bit in Django tickets and on the django-developers mailing list, but the decision was made not to implement.)Other potential questions to consider:
Steps to Reproduce
wagtail start myproject
./manage.py migrate && ./manage.py createsuperuser
./manage.py runserver
, visit http://localhost:8000/admin, and login with the superuser you just created.Technical details
The text was updated successfully, but these errors were encountered: