Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Routing #54

Merged
merged 21 commits into from
Jan 24, 2019
Merged

Routing #54

merged 21 commits into from
Jan 24, 2019

Conversation

johanjanssens
Copy link
Member

@johanjanssens johanjanssens commented Jan 20, 2019

Route resolving

This PR implements a complete new routing system using a route resolvers approach, implementing a prioritised queue.

A route resolver follow a similar mechanism to a response transports. A resolver receives the router object which gives it access to the response and request and allow it to resolve the request and modify the request and response accordingly.

A resolver has two key methods:

  1. resolve(DispatcherRouterInterface $router): to resolve the request
  2. generate($path, array $query): to generate a new route based on a path and optional query info.

The router implements a prioritised queue, resolvers are called in FIFO order, or based on their priority. If the resolver returns not FALSE the resolver queue will be halted and the result returned by the router.

The router will always return a fully qualified route as a HttpUrl object. It's up the implementing code to render the route or just cast it to string.

1. Defining routes

Routes are defined in the page frontmatter using the 'route' property. Example: route: the-page-route

Note: If the route property is not defined the default route will be set to the absolute path of the page.

2. Route Aliases

Multiple routes can be defined for the same page, the first route is considered the canonical route and used to create the canonical url for the page.

route:
- page/route
- my-page-alias

Note: If multiple routes are defined and you wish to include the absolute page route you need to manually add it again, if you define it first it will be the canonical route.

3. Dynamic routes

Wildcards let you capture multiple routes at once. Wildcards are available anywhere in your route pattern and there aren’t any limits to how many you can use.

Wildcards need to be placed between [...]and can contain a named parameter prefixed by a match typer. Example: /users/[digit:id]

Wildcards can be made optional when they are appended with a ? Example Example: /users/[digit:id]?

Match Types

You can use the following limits on your named parameters.

- *                    // Match all request URIs
- [digit]              // Match numeric character
- [digit:id]           // Match an numeric characters as 'id'
- [alpha:action]       // Match alphanumeric characters as 'action'
- [:action]            // Match anything up to the next / or end of the URI as 'action'
- [create|edit:action] // Match either 'create' or 'edit' as 'action'
- [*]                  // Catch all (lazy, stops at the next trailing slash)
- [*:trailing]         // Catch all as 'trailing' (lazy)
- [**:trailing]        // Catch all (possessive - will match the rest of the URI)
- .[:format]?          // Match an optional parameter 'format' - a / or . before the block is also optional

The character before the colon (the 'match type') is a shortcut for one of the following regular expressions

'digit' => '[0-9]++'
'alnum' => '[0-9A-Za-z]++'
'alpha' => '[A-Za-z]++',
'*' => '.+?'
'**' => '.++'
'' => '[^/\.]++'

Note: You can define your own expressions for each resolver using the types resolver config option.

4. External Redirects

The router supplies a redirect resolver to handle permanent redirects. Redirects can be configured using the new 'redirects' config option and use the format: array('/path' => '/target')

Path and/or target can be dynamic in which case named path arguments will be replaced in the target (if they exist). Example: array('/oldblog/[digit:year]' => '/newblog/[:year]')

The target can be a local route, or an external url. Example: array('/blog/[digit:year]' => 'http://blog.site.com/[:year]')

5. Canonical Url

A canonical url is always set in the response headers (even if the canonical route is used). This ensure clean self-referential url's. See also: http://www.thesempost.com/using-rel-canonical-on-all-pages-for-duplicate-content-protection/

Note: Pages doesn't use the Joomla routing system but we are using a little trick to set the canonical url in Joomla if it needs it. See: 6bd7a93

Note: The canonical url is not also being used to calculate the cache key for the http cache. This avoid cache duplication.

Notes

  • The view classes no longer return a route (HttpUrl) object instead they return the route as a string. For the html view the route is returned un-classified, all other views return a fully classified route. See: 3e33c43

  • If the limit is hardcoded in the selector the page router will transform offset=x into page=x where x is offset/limit - 1. The paginator template helper will also not show a limit selector in this case.

Additional improvements

  • This PR also adds a `collection' property to the page entity to allow to return collection as config object. See: da20b40

  • This PR adds a new dateable model behavior to allow to easily filter a collection by year, month and day. See: 711a7b5

  • This PR adds news basepath:// and baseurl:// shortcuts to the template to make it easier to qualify url's in a template. See: a1fc515

  • This PR add a new ComPagesPageRegistry::getRoutes($page) method allows to retrieve the routes for a specific page. See: 56ea039

The dateable behavior allows to filter a list of pages by date and exposes
three new states:

- year
- month
- day

If a state is set it will return a list of pages that match.
- basepath:// will return the base path
- baseurl:// will return the base url including url authority
- Rename match to resolve
- Do not extend from KDispatcherRouter
- Add getPage() method to retrieve the active page
- Add getPath() method to retrieve active path
- Always return a fully qualify url object
- Set canonical self-referencing url in response as header
- Refactor router to support multiple routes per page, the first route
is considered the canoncial
- Add getCanonicalUrl() method to router
- Set canonical url in the response headers
- Ensure Joomla received the canonical url
- Use canonical url to calculate the cache key
- a route resolver follow a similar mechanism to response transports. A resolver
receives the router object which gives it access to the response and can resolve
the request, if the resolver returns not FALSE the resolver queue will be halted
and the resolved route returned by the router.

- The resolver also implements reverse routing through the generate method. Here
the resolver will receive the path and the query to generate a route for. if
the resolver returns not FALSE the resolver queue will be halted and the generated
route returned by the router.
Redirects can be configured using the new 'redirects' config option and
use the format: array('/path' => '/target')

Both path and target can be dynamic in which case named path arguments
will be replaced in the target.

Example: array('/oldblog/[digit:year]' => '/newblog/[:year]')
@johanjanssens johanjanssens added this to the 0.6.0 milestone Jan 20, 2019
@johanjanssens johanjanssens self-assigned this Jan 20, 2019
@johanjanssens johanjanssens merged commit 1f0e27b into master Jan 24, 2019
@johanjanssens johanjanssens deleted the feature/50-routing branch January 24, 2019 00:02
@amazeika amazeika self-requested a review February 7, 2019 15:42
@@ -183,47 +186,38 @@ public function getMetadata()
return $metadata;
}

public function getRoute($route = '', $fqr = true, $escape = true)
public function getRoute($page = '', $query = array(), $escape = false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johanjanssens I feel that something is missing here. All is well as long as a $page is provided. However I see things not working as expected if route() is called without args. On this last case I would expect to get a route for the current page (states included). In this case $page should default to $this->getPage(). Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the code to allow for an empty route and use the active page to generate it. This scenario will not happen much, but is now also supported.

Note: changes still need to be committed.

* @throws UnexpectedValueException
* @return ComPagesDispatcherRouterInterface
*/
public function getResolver($resolver, $config = array());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johanjanssens I probably would get rid of this here. The get resolver logic could perfectly fit inside attachResolver. I don't see getResolver being called from anywhere, not really an interface method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is the same as the one in the transport which also has a getTransport() and an attachTransport(). Trying to keep the API's constitent and provide flexibility. If we refactor this we should review other API's too that use a similar approach.

* @param array $config An optional associative array of configuration settings
* @return ComPagesDispatcherRouterInterface
*/
public function attachResolver($resolver, $config = array());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johanjanssens I think addResolver would be cleaner?, on top of that I would add an additional one to add multiple resolvers at once => addResolvers. It's cosmetic, but you make use this convention on other parts of the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reply as above. API is kept consistent with the transport API.

There is also a subtle difference between add and attach, when you add something to a collection you can no longer retrieve what you added, when you attach something you can always get access to what you attached, hence why we have an attachX and a getX.

* @param KControllerResponseInterface $response A response object
* @return ComPagesDispatcherRouterInterface
*/
public function setResponse(KControllerResponseInterface $response);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johanjanssens It doesn't feel quite right that the router package requires knowledge of the response object. This is very AD HOC as this gets used exclusively for the cannonical URL setting. Instead I would add a request only dependency here (setRequest/getRequest), which is what I would really expect and leave the cannonical URL setting outside the router package. I'd say that should happen here => https://github.com/joomlatools/joomlatools-pages/blob/master/code/site/components/com_pages/dispatcher/http.php#L59 at the time you first resolve the route.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is by design and intended. The role of a router and route resolvers is to parse the request and transform it in such a way that is understandable for the underlying application, by giving the router

The router receives the response object and through that also has access to the request, one of the router resolvers resolve the request.

If you follow your logic then:

  1. All the code that now sits in the response object should be moved to the dispatcher.
  2. The dispatcher would always need to set a canonical, or if wouldn't need to make a decision to set it or not, which breaks encapsulation of the resolver and negates the resolver approach.
  3. If you only pass in the request a redirect resolver cannot be implemented easily either. Then the dispatcher would again be made more thick, need to make decisions if the returned route is a redirect or not, which it cannot easily do. Again you break encapsulation that way and you negate the resolver queue approach.

$response = $router->getResponse();
$path = $route->getPath();

if($page = $this->getObject('page.registry')->getPage($path))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johanjanssens Same feeling here as with the setCannonicalUrl call made below. When the router resolves a request it should only return a route. I believe the resolver is trying to do too much here. This whole block should perhaps be moved outside the router package and put in the pages dispatcher.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above. A router is not just url matcher or url generator, it has a wider responsibility.

@johanjanssens johanjanssens mentioned this pull request Mar 1, 2019
@johanjanssens johanjanssens mentioned this pull request Jun 2, 2019
@johanjanssens johanjanssens mentioned this pull request Apr 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants