Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path matching semantics update #2057

Closed
blakeembrey opened this issue Apr 17, 2014 · 15 comments
Closed

Path matching semantics update #2057

blakeembrey opened this issue Apr 17, 2014 · 15 comments
Labels
Milestone

Comments

@blakeembrey
Copy link
Member

I would like to make a breaking change update to the way paths match, but I'm interested in everyones opinions before I implement any changes. My biggest gripe right now is that the special characters outside of a custom matching group restricts the characters that can be used inside of a matching group. I want to change this behaviour so only regex characters can be used inside a matching group and characters outside will be escaped.

Examples:

  • /test? - currently valid, makes the t character optional which really looks like a bug
  • /:test(.*) - currently invalid, periods are being escaped and asterisks are being turned into matching any character groups
  • /(\\d+) - currently won't be indexed properly since we don't have any regex group logic
  • /\\* - currently invalid, can't escape the meaning of special characters and will still become a match all route
@jonathanong
Copy link
Member

  1. /test? should match both / and /test. non-strict mode, /asdf/test? should match /asdf as well.
  2. /:test should essentially be /:test(.*) anyways, so (.*) would be unnecessary.
  3. is that a regex expression without a name? without a name they're useless anyways, so i don't mind

@blakeembrey
Copy link
Member Author

@jonathanong Sorry, these were incredibly trivial examples - let me think of something more complex and realistic.

  1. I don't think the question mark should actually have any matching ability outside of a regex group, but that behaviour would be tricky to get right.
  2. Any regex using * and . is currently impossible to create because of the replacement rules. E.g. /:test(.+), /:test(\\d*), etc.
  3. Is there a value in keeping non-named matching groups around? I can imagine it being useful in a number of cases, but if not we could transform them into non-matching.

@jonathanong
Copy link
Member

Is there a value in keeping non-named matching groups around? I can imagine it being useful in a number of cases, but if not we could transform them into non-matching.

i don't see any value and wouldn't mine deprecating them or something

i'm okay with the other two gotcha's as long as they're documented.

@chirag04
Copy link
Contributor

app.get('/about', info.about);

app.get('/:country', home.city_selection);

//generic renderer 
app.use(function(req, res){
        var somedata = {};
    res.render('website.ejs', somedata);
});

citiy_selection and about will just set the site_state and call next().

Control should then goto the generic renderer to display the site.

However when i visit about page citiy_selection route is also executed. What is it that i am doing wrong here?

@blakeembrey
Copy link
Member Author

@chirag04 Is info.about calling next()? Since both /about and /:country would match /about, they run sequentially.

@chirag04
Copy link
Contributor

@blakeembrey Yes, both info.about and home.city_selection call next so that control moves to the generic renderer.

Is there a way to not execute multiple matching routes?

How should i go about solving this apart from not calling next in info.about?

@blakeembrey
Copy link
Member Author

@chirag04 There isn't really a way to do this without something hacky, but in reality you don't want to write routes that do fall through like that. Can you pull the function out use it in both places?

@chirag04
Copy link
Contributor

I want to make it generic. I dont want to pull that function out and duplicate in both places.

However i am using this approach for now:

Instead of calling next, each about and city_selection will call render_site function.

Eg:

var render_site = require('../helpers/render_site');

about: function(req, res, next) {
        req.data.site_state = 'about full-page';
        req.data.page_title = 'About Us';
        render_site(req, res);
    }

Render site moved out of app.use wrapper into an independent module

module.exports = function(req, res) {
    res.render('website.ejs', req.data);
}

This seems to work for now.

@dougwilson
Copy link
Contributor

Hey @blakeembrey, would you be willing to put together a list of the incompatibilities between 0.x and 1.x of path-to-regexp and how people an migrate their routes? I've been thinking a lot about it, and this is probably why I have been hesitant to bump the dep in the 5.x branch yet, because it seems like it's not a simple migration path (unless you think otherwise? let me know!). Ideally we would be able to describe the path differences and how to migrate your routes. It would be even cooler if we could come up with a way to detect if a path string may need changing in Express 3 -> 4 to give a way for people to do something to determine what they need to change in their app (this idea is not flushed out yet).

@dougwilson
Copy link
Contributor

Oh, and as an additional though, @blakeembrey, I'm not really looking for a full list of new features in 1.x (at least for this), more of what kind of paths that would be broken/behave differently between the two versions and how to fix them to work in 1.x. I hope that helps!

@blakeembrey
Copy link
Member Author

Sure, I can look at putting something together. I think 99% of uses are the same, it's just the tricky edge ones (and hacks that may not have even been intended with 0.1.x but worked).

@blakeembrey
Copy link
Member Author

There's also https://github.com/pillarjs/path-to-regexp#compatibility-with-express--4x, though it's less exhaustive. Should give you an idea on the core changes that aren't supported now though. Most changes were around consistency and proper usage of path-to-regexp.

@dougwilson
Copy link
Contributor

Yea, the tricky ones are what I'm looking for. I think the goal is just to get the best we can :) I looked at that section of the README, but it didn't seem quite what I was looking for, mainly just enough words to help people understand how to identify a route that would break.

@blakeembrey
Copy link
Member Author

Things that will be broken (may be incomplete):

  • RegExp groups
    • No nesting groups
      • /(\d([a-z]+)), the inner paren will be literal
  • RegExp outside of groups
    • Anything outside of the () is literal
      • /\d+ will break, use /(\d+)
  • No expansion of * inside groups
    • /(*) will break, use proper RegExp or asterisk outside of a group
      • /* or /(.*)
  • Proper "segment handling"
    • E.g. /:userId is treated as a path segment and matches accordingly when using repeated params

Advantages:

  • Consistent path matching behaviour
  • Well-documented (also, consistent and understandable algorithm)
  • Path reversal support
  • Understanding of "segments" - /foo/:part?/bar does do weird stuff like only match when /foo//bar
    • This is more important for matching end of paths - found lots of places in old code where it'd match paths with double trailing slashes like /foo//
  • Actually really awesome, if we use https://github.com/pillarjs/path-match with the engine PR I have outstanding on the router, we can get automatic splat support (E.g. the raw segments when doing /:path*)
  • Can actually write regexps without them breaking, no idea how many times people discover they need to do a hack like \\d{0,} instead of using * in the group

@dougwilson
Copy link
Contributor

These changes have now landed on the 5.0 branch 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants