New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make unknown routes do server-side refresh #4645

Closed
wants to merge 2 commits into
base: master
from

Conversation

4 participants
@ryantm
Copy link
Contributor

ryantm commented Jan 10, 2017

There are URLs that are known to the Rails backend or HTTP server but
not known to the Ember.js router. Since the Rails backend can return a
404 page just as well as Ember.js, lets default to always trying to
query the backend if the Ember.js router would have shown a 404 page
with the "unknown" route.

This should address some of the concerns in posts like:

https://meta.discourse.org/t/left-clicking-raw-links-return-404-errors/31314/7
https://meta.discourse.org/t/left-clicking-same-domain-links-to-urls-not-on-the-whitelist-is-broken/54722

It is still necessary to leave in the SERVER_SIDE_ONLY routes, because
there are some routes that are prefixed by valid Ember.js routes, and
we want to short-circuit the Ember.js router in those cases, because
it will think that the path is to a route when it should be going to a
backend route. For example, a .rss route is simply a topic URL with
.rss added to the end. The Ember.js router recognizes this as a topic
URL even though it should be an RSS URL.

make unknown routes do server-side refresh
There are URLs that are known to the Rails backend or HTTP server but
not known to the Ember.js router. Since the Rails backend can return a
404 page just as well as Ember.js, lets default to always trying to
query the backend if the Ember.js router would have shown a 404 page
with the "unknown" route.

This should address some of the concerns in posts like:

  https://meta.discourse.org/t/left-clicking-raw-links-return-404-errors/31314/7
  https://meta.discourse.org/t/left-clicking-same-domain-links-to-urls-not-on-the-whitelist-is-broken/54722

It is still necessary to leave in the SERVER_SIDE_ONLY routes, because
there are some routes that are prefixed by valid Ember.js routes, and
we want to short-circuit the Ember.js router in those cases, because
it will think that the path is to a route when it should be going to a
backend route. For example, a .rss route is simply a topic URL with
.rss added to the end. The Ember.js router recognizes this as a topic
URL even though it should be an RSS URL.
@discoursebot

This comment has been minimized.

Copy link

discoursebot commented Jan 10, 2017

You've signed the CLA, ryantm. Thank you! This pull request is ready for review.

@eviltrout

This comment has been minimized.

Copy link
Member

eviltrout commented Jan 11, 2017

I'm a little uneasy about this. We can't assume that every 404 on the client side has an equivalent server side route. What worries me, is if the page reloads like this, and the new page that loads somehow bootstraps ember and causes a 404, that would be a redirect loop.

@ryantm

This comment has been minimized.

Copy link
Contributor

ryantm commented Jan 11, 2017

@eviltrout, Thanks for reviewing my pull request!

We are not assuming every 404 on the client side has an equivalent server side route. In the case that it doesn't, it will be caught by the catchall 404 route on the server side, which returns the 404 page in a way that doesn't cause a redirect loop. I tested that before submitting the PR.

You seem to be hypothesizing that there are Discourse URLs that respond with a 200 response and load ember.js so they can display a client-side 404 page. These URLs are already broken and this PR would make them a little more broken. This problem seems less important than making known good URLs usable. It also seems like this redirect loop would be something easy to fix by fixing the Rails routes.

@eviltrout

This comment has been minimized.

Copy link
Member

eviltrout commented Jan 16, 2017

We are not assuming every 404 on the client side has an equivalent server side route. In the case that it doesn't, it will be caught by the catchall 404 route on the server side

Right, but in that case it means the browser does a lot more work for a 404 than it needs to previously.

You seem to be hypothesizing that there are Discourse URLs that respond with a 200 response and load ember.js

Yes, it's quite possible no such routes exist and I understand they can be fixed. Redirect loops can be super terrible user experiences, that in my mind this is a trade off between potentially having them and just whitelisting the odd route we missed.

I am leaning towards just whitelisting routes as we are made aware of them rather than opening this hole.

@DavidEGrayson

This comment has been minimized.

Copy link

DavidEGrayson commented Jan 16, 2017

As a web administrator, it is surprising that when I add a resource on the server side, I am unable to post working links to that resource in a forum post. It is also hard for me to discover this issue: after adding the resource, I would test its URL by putting it in my browser, and that test would succeed. It would not (and did not) occur to me to test whether I can link to it from the body of a forum post. I would rather close that hole, and fix any redirect loops for broken URLs if they actually come up.

Right, but in that case it means the browser does a lot more work for a 404 than it needs to previously.

I don't think that any users will be annoyed that some rare broken links on the forum will load at the same speed as a typical link on the web instead of being accelerated with a javascript. I say those links are rare because the only way to avoid a server request when processing a broken link is if the link goes to the same domain as the Discourse app and the structure of the link itself is so wrong that it could not possibly point to anything valid. In that sentence, it sounded like you were not talking about a redirect loop, just the cost of doing one normal request to the server.

All of the arguments you have made so far against this have been about making the already broken URLs not get worse. I'm not sure how to convince you that being able to link to working and tested URLs is more important.

If you are still leaning towards adding more entries to the SERVER_SIDE_ONLY whitelist instead, there are a bunch of patterns I would want to add. Off the top of my head, I'd want to whitelist /assets/*, /uploads/*, /stylesheets/*, /raw/*. For people like us who want our old phpBB links to keep working, I would want to whitelist /viewtopic.php*, /viewforum.php*, and /atom.php* (the last one is pretty specific to us). Would you allow those to be whitelisted?

@eviltrout

This comment has been minimized.

Copy link
Member

eviltrout commented Jan 17, 2017

I had a couple of other core members review this thread and all of us shared similar concerns about this approach.

I'm going to close this for now. https://meta.discourse.org would be a better place to debate this technique. In the meantime, we'll happily accept any PRs that fix routes that are only available on the server side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment