Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slug: Use urlencoding to support non-ASCII characters #70691

Merged
merged 5 commits into from Jul 10, 2023

Conversation

sakjur
Copy link
Contributor

@sakjur sakjur commented Jun 26, 2023

What is this feature?

Support a combination of URL safe and non-URL safe characters in our slug generation.

Why do we need this feature?

The previous generation of the slugify function first tries to generate a slug without special characters based on a small character translation map, and if that's successful that is used as the slug. This omits a number of characters that aren't supported in URLs. If no characters can be used to generate a slug, a urlencoded version of the string is used. If that string is more than 50 characters, a UUID is used instead.

The problem here lies in the urlencoding, because that only happens when there are exclusively non-slug compatible characters, so a dashboard named using both latin characters and non-latin characters could conflict based on just the latin part.

This solution squashes together the two options and urlencodes all the characters that aren't explicitly supported, replaced, or omitted, and leaving the URL safe characters as-is. The 50 character size limit is also now applied to all slugs, not only urlencoded slugs.

flowchart TD

old[Old flow]-->oldGenASCII[Generate slug based on the latin and latin-like characters in the title]
oldGenASCII-->oldASCIIEmpty{Slug empty?}
oldASCIIEmpty-->|Yes| oldUrlencodeString[Generate a URL-encoded variant for the slug]
oldASCIIEmpty-->|No| oldReturn[Return the resulting slug]
oldUrlencodeString-->oldTooLong{Is the slug more than 50 characters?}
oldTooLong-->|Yes| oldUUIDv5[Generate a UUIDv5 based on the title]
oldTooLong-->|No| oldReturn
oldUUIDv5-->oldReturn

new[New flow]-->newGen[Generate slug, urlencode non-latin like characters]
newGen-->newEmpty{Slug empty or more than 50 characters?}
newEmpty-->|Yes| newUUIDv5[Generate a UUIDv5 based on the title]
newEmpty-->|No| newReturn[Return the resulting slug]
newUUIDv5-->newReturn

Who is this feature for?

Fixes a bug for editors managing dashboards.

Which issue(s) does this PR fix?:

Fixes #66373

Special notes for your reviewer:

While this is a bug fix, I've chosen not to put this up for backporting to allow for testing the fix for a bit of an extended time. This has a lower urgency than #65184 and #70723 .

Please check that:

  • It works as expected from a user's perspective.
  • If this is a pre-GA feature, it is behind a feature toggle.
  • The docs are updated, and if this is a notable improvement, it's added to our What's New doc.

@sakjur sakjur added this to the 10.1.x milestone Jul 4, 2023
@sakjur sakjur marked this pull request as ready for review July 4, 2023 09:42
@sakjur sakjur requested review from a team as code owners July 4, 2023 09:42
@sakjur sakjur requested review from vtorosyan, kalleep, zserge, mildwonkey and suntala and removed request for a team July 4, 2023 09:42
Copy link
Contributor

@mildwonkey mildwonkey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I tested it out locally using the example in the linked issue and everything worked (and looks) great:

Screenshot 2023-07-10 at 2 14 56 PM

@sakjur sakjur merged commit 5c19272 into main Jul 10, 2023
19 checks passed
@sakjur sakjur deleted the emil/202306/slug-urlencode-non-latin-chars branch July 10, 2023 19:05
@alexandnpu
Copy link

I really appreciate your work to get this fixed!

Will this be backported to v9.4.x? @sakjur

@sakjur
Copy link
Contributor Author

sakjur commented Jul 11, 2023

@alexandnpu Thank you!

Unfortunately this’ll have to wait until 10.1, as for 9.4 specifically, this has a hard dependency on #70723 being fixed (…to avoid breaking even more slugs) and that’s done at the earliest in the incoming patch releases for 9.5 and 10.0. The severity of this is slightly lower than that of #70723, so I made the call that we don’t have to backport this. Timing-wise, this shouldn’t affect when the patch is released.

@alexandnpu
Copy link

hmmm, I can wait for v10.1. Is it safe to upgrade from v9.4.x to v10.1?
will there be something incompatible?

@sakjur
Copy link
Contributor Author

sakjur commented Jul 25, 2023

@alexandnpu Sorry for the slow response! For most people, upgrading from 9.4 to 10.1 should be safe, we always recommend backing up the database beforehand to be able to rollback if something unexpected happens. There is a documentation page for breaking changes in v10, which lists quite a few features, but it shouldn't be anything that affects most installations.

@alexandnpu
Copy link

alexandnpu commented Jul 26, 2023

@sakjur , thanks for your info, the changes listed in your link seem to be OK to me. I will give v10.1 a try when it is out.

@ricky-undeadcoders ricky-undeadcoders modified the milestones: 10.1.x, 10.1.0 Aug 1, 2023
@DanCech
Copy link
Collaborator

DanCech commented Aug 9, 2023

While this change works in theory, we recently ran into an issue where a literal percentage symbol in a dashboard title (and since this change url-encoded into the slug) resulted in a dashboard that would not load (possibly due to inconsistent url-encoding when using slugs in urls). In any case putting a pre-url-encoded character into a slug is not a good idea, since we should be url-encoding all components at the time we assemble urls, and if we pre-encode the slug then we should end up with a double-encoded url.

At the very least we need to add % to the list of omitted characters, but I would strongly recommend that we revisit this change and avoid producing slugs that rely on urlencoding to be url-safe.

@mildwonkey
Copy link
Contributor

Hey @DanCech - I am not entirely sure how this PR caused that issue; a literal % in a dashboard title is stripped for the slug (i added some test cases locally to check):

	results["hello-playground"] = "Hello, playground"
	results["hello-playground"] = "Hello%, %playground"
	results["hello-playground"] = "Hello, playground"

Do you have any more info about the issue you saw? I wonder if there's another issue; maybe we're using the slug in place of title for a search somewhere still

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dashboard Title Slug does not support the mix of English and Chinese in 9.4.1
6 participants