Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace redirect with case-insensitive routing #142

Merged
merged 4 commits into from Jun 21, 2022

Conversation

vito
Copy link
Contributor

@vito vito commented Jun 19, 2022

馃憢 Loving this project so far!

I'm switching an app over to it to see how it goes and I noticed that Bud normalizes paths to lowercase. This is a problem for my app because it uses base64-encoded digests to route to content-addressed objects. I could switch to a case-insensitive database lookup, but that introduces the admittedly low possibility of collisions.

I've seen other websites in the wild that use case sensitive routes, for example url shorteners, so I don't think this is too unusual.

This PR removes the redirect and instead changes the router to use a case-insensitive match, since I figure the goal is for URLs to be case-insensitive. I'm not too opinionated here, just wanted to get a PR out to spark discussion. :)

Another option could be to convert casing for only the static portions of the path based on the route, but that's a bigger lift.

Thanks for making Bud!

@matthewmueller
Copy link
Contributor

matthewmueller commented Jun 19, 2022

Thrilled to hear Alex! It's so cool seeing Bass Loop using Bud. Bass also looks awesome. Looks like you're bringing the simplicity of functional programming to deploying infrastructure!

This is a problem for my app because it uses base64-encoded digests to route to content-addressed objects. I could switch to a case-insensitive database lookup, but that introduces the admittedly low possibility of collisions.

Ah, I can totally see the problem. Something like: TUlUIExpY2Vuc2UKCkNvcHl gets redirected to tuluiexpy2vuc2ukcknvchl.

This PR removes the redirect and instead changes the router to use a case-insensitive match, since I figure the goal is for URLs to be case-insensitive. I'm not too opinionated here, just wanted to get a PR out to spark discussion. :)

Thanks for opening this PR! I originally did this for SEO purposes, so https://github.com/livebud/bud and https://github.com/livebud/BUD wouldn't be treated as different routes in your analytics dashboards.

I had a look at what Github does and I think I like their approach:

  1. They are case-insensitive, so curl https://github.com/livebud/bud and curl https://github.com/livebud/BUD render the same HTML.
  2. Both HTML responses include <link rel="canonical" href="https://github.com/livebud/bud" /> in their <head> to say that the lowercase version is the canonical route.

So I think your PR makes sense once the tests pass!

If you feel like adding 2., you can adjust the default layout in this file. I'm not sure we're passing enough context to the views yet to provide a full URL like that, so this could also be a follow-up task once I revisit views for custom layouts, error pages, etc.

@vito vito force-pushed the sensitive-paths branch 2 times, most recently from 5966737 to 3b7f622 Compare June 20, 2022 04:15
@vito
Copy link
Contributor Author

vito commented Jun 20, 2022

Thanks. :)

2. Both HTML responses include <link rel="canonical" href="https://github.com/livebud/bud" /> in their <head> to say that the lowercase version is the canonical route.

Cool, that makes sense.

So I think your PR makes sense once the tests pass!

Great! I got the tests passing, but I'm not sure of my approach. There were a few places that iterated over bytes, not runes. The "伪" ("\xce\xb1") route would be inserted fine, but then inserting "尾" ("\xce\xb2") would see"\xce" as a common prefix and set "伪" and "尾" as children of path:"\xce" in the routing tree. This somehow led to /尾 being returned for requests to /伪.

So I changed Tokens to operate on runes rather than bytes, which affects Size, At, and Split. Took a while to figure out!

I'm interested in doing 2., but pushing just these changes for now; I'll try to find time to do it soon, so feel free to wait or merge as-is (if they actually pass in CI).

Copy link
Contributor

@matthewmueller matthewmueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor tweaks then I think we're good to go!

@@ -41,9 +42,10 @@ func (tokens Tokens) At(i int) string {
for _, token := range tokens {
switch token.Type {
case PathToken, SlashToken:
for j := 0; j < len(token.Value); j++ {
chars := []rune(token.Value)
for j := 0; j < len(chars); j++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you try ranging over token.Value? IIRC, you can do:

for char := range token.Value {
  // do something
}

And it'll range over the runes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I'll switch to that - I thought about it but didn't recognize that it "did the right thing" with the index key, i.e. skipping numbers for multi-byte runes.

package/router/radix/tree.go Show resolved Hide resolved
package/router/radix/tree_test.go Show resolved Hide resolved
vito added 2 commits June 20, 2022 21:10
otherwise all utf8 characters have a "common prefix" which allowed one
route to return a different one somehow.

adds an explicit test for matching unicode, too. having this mixed in
with the other test made things pretty difficult to debug
Copy link
Contributor

@matthewmueller matthewmueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, thanks Alex!

@matthewmueller matthewmueller merged commit f26db87 into livebud:main Jun 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants