Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASCII comes in two flavors that need to be UTF-8ified. #30119

Closed
wants to merge 2 commits into from

Conversation

begrif
Copy link

@begrif begrif commented Aug 7, 2017

Summary

Discourse found an issue where URL parameters were entering the code as ASCII-8BIT instead of UTF-8. Examining the source tree for URL unescape code, I found that almost everywhere does not specify an encoding, relying on the UTF-8 default in Ruby. Almost everywhere.

In ActionDispatch there is a unescape_uri that attempts to preserve the encoding if it is not US-ASCII. Changing that to UTF-8-ify ASCII-8BIT fixes the issue seen in Discourse.

Other Information

It is noteworthy that the ASCII-8BIT encoding is used throughout as the encoding of choice before %-escaping a URL string. This encoding mirrors the widespread usage of UTF-8 in URLs getting encoded as a stream of octets instead of codepoints. Clearly ASCII-8BIT is intended to be convertible back to UTF-8.

@rails-bot
Copy link

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @schneems (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@begrif
Copy link
Author

begrif commented Aug 7, 2017

Forgot to include a link to the page at Discourse describing this:

https://meta.discourse.org/t/automatic-encoding-of-parsed-url-params/67232

@robin850
Copy link
Member

robin850 commented Aug 7, 2017

Hi @begrif,

Thank you for the patch and congratulations on sending your first pull request to Rails ! 🎉

Looking at the Discourse tickets, it looks like 9220935 could fix this issue. Did you try to replicate this issue from a fresh check-out of the Rails repository ? The aforementioned commit is pretty recent and not part of any release.

@begrif
Copy link
Author

begrif commented Aug 7, 2017

@robin850 No, I have not tried a non-released Rails. The description of the problem and the fix look like that patch has a similar goal.

@robin850
Copy link
Member

robin850 commented Aug 7, 2017

Could you give the steps to reproduce the error with a Discourse application please ? I installed it and tried to create a tag with the земля name but I can't reproduce the issue. This patch has also been properly reverted on my local copy.

@begrif
Copy link
Author

begrif commented Aug 7, 2017

Sure. There are a few steps.

  1. Install Discourse.
  2. Go to settings page, and change the slug generation method to "encoded"
    http://your-discourse-site/admin/site_settings/category/all_results?filter=slug
  3. Create a topic with non-English words. I used Russian. People have also reported it with Persian.
    Эта тема не английская
  4. Copy the URL for the topic created, open a new tab, and paste it in to go to that topic.
    Clicking on links within the app does not work, because Ember handles those routes. It needs to be a fresh tab to make the link go through the Rails routing.

In the failing case, the page never loads due to excessive redirects. In the working case the page loads.

@robin850
Copy link
Member

robin850 commented Aug 8, 2017

@begrif : I managed to reproduce the issue as you described and managed to run Discourse on current master (just had to remove spork-rails and discourse-qunit-rails from the Gemfile and fix *_filter calls to *_action). There is an error on the page because Discourse isn't ready to run on current master but at least the page is properly redirecting.

Also I was wrong in my first comment ; the aforementioned commit has been back-ported on the 5-1-stable branch so you should be able to remove the force_encoding thing from Discourse once it runs on 5.1.

We can certainly close this issue but if I'm wrong, feel free to comment and we can reopen. Thank you very much !

@robin850 robin850 closed this Aug 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants