Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escape anchor once in #url_for with Hash syntax #43293

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

intrip
Copy link
Contributor

@intrip intrip commented Sep 23, 2021

Summary

Fixes #43289

@rails-bot rails-bot bot added the actionpack label Sep 23, 2021
@intrip intrip force-pushed the 43289-escape-hash-anchor-once branch from 04d5d9d to c8d6ea4 Compare September 23, 2021 10:37
@intrip intrip changed the title Escape anchor once in #url_for Escape anchor once in #url_for with Hash syntax Sep 23, 2021
@intrip intrip force-pushed the 43289-escape-hash-anchor-once branch 2 times, most recently from 0839ea6 to 111b49b Compare September 23, 2021 10:39
@p8
Copy link
Member

p8 commented Sep 24, 2021

@intrip I'm not quite sure what is the correct/expected behaviour here.
Only Hash escapes the values when calling to_param.
Space is escaped to +

"% ".to_param
=> "% "
["% "].to_param
=> "% "
{foo: "% "}.to_param
=> "foo=%25+"

But the escape_fragment method works a bit different as it escapes a space to %20 instead of +

ActionDispatch::Journey::Router::Utils.escape_fragment("% ".to_param)
=> "%25%20"
ActionDispatch::Journey::Router::Utils.escape_fragment(["% "].to_param)
=> "%25%20"
ActionDispatch::Journey::Router::Utils.escape_fragment({foo: "% "}.to_param)
=> "foo=%2525+"

I found the following Stackoverflow post that describes what should be used:

When data that has been entered into HTML forms is submitted, the form field names and values are encoded and sent to the server in an HTTP request message using method GET or POST, or, historically, via email. The encoding used by default is based on a very early version of the general URI percent-encoding rules, with a number of modifications such as newline normalization and replacing spaces with "+" instead of "%20". The MIME type of data encoded this way is application/x-www-form-urlencoded, and it is currently defined (still in a very outdated manner) in the HTML and XForms specifications.

So, the real percent encoding uses %20 while form data in URLs is in a modified form that uses +. So you're most likely to only see + in URLs in the query string after an ?.

https://stackoverflow.com/a/1634293

@intrip
Copy link
Contributor Author

intrip commented Sep 24, 2021

@p8 good point! I'll check and come back with more info.

@intrip
Copy link
Contributor Author

intrip commented Sep 24, 2021

@p8 The stackoverflow post you linked makes sense. I've also checked URI RFC and Form spec which confirms that.

I think the correct behaviour is to always use only escape_fragment to encode URL fragments and do not encode them as application/x-www-form-urlencoded even when a Hash is given in input.

I'll change the codebase accordingly and ping you again when done 🙂. Thanks!

@intrip intrip changed the title Escape anchor once in #url_for with Hash syntax WIP: Escape anchor once in #url_for with Hash syntax Sep 24, 2021
@intrip intrip force-pushed the 43289-escape-hash-anchor-once branch from 111b49b to 82df220 Compare September 27, 2021 12:15
@intrip intrip force-pushed the 43289-escape-hash-anchor-once branch from 82df220 to 057c063 Compare September 27, 2021 12:18
#
# This method is also aliased as +to_param+.
def to_query(namespace = nil)
def to_query(namespace = nil, escape: true)
Copy link
Contributor Author

@intrip intrip Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the keyword argument breaks clients which relies on *args delegation. I'm not sure how big of an impact this will be but the fix on the clients is quite simple.

# The string pairs "key=value" that conform the query string
# are sorted lexicographically in ascending order.
#
# This method is also aliased as +to_param+.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant because RDoc already shows such info: "Also aliased as: to_param"

@intrip intrip changed the title WIP: Escape anchor once in #url_for with Hash syntax Escape anchor once in #url_for with Hash syntax Sep 27, 2021
@intrip intrip force-pushed the 43289-escape-hash-anchor-once branch 3 times, most recently from 4c19b9d to 62d85d3 Compare September 27, 2021 12:41
@intrip
Copy link
Contributor Author

intrip commented Sep 27, 2021

@p8 I've changed the implementation as outlined in #43293 (comment).
Can you please check again? 🙏

@intrip intrip force-pushed the 43289-escape-hash-anchor-once branch from 62d85d3 to 45c2cc5 Compare September 27, 2021 15:01
@p8
Copy link
Member

p8 commented Sep 27, 2021

Thanks for working on this @intrip!
I'm still unsure what the correct solution is here.
Hopefully someone else will have a look.

@intrip intrip force-pushed the 43289-escape-hash-anchor-once branch from 45c2cc5 to c1d3a65 Compare September 28, 2021 07:37
@intrip intrip force-pushed the 43289-escape-hash-anchor-once branch from c1d3a65 to 0059867 Compare October 6, 2021 10:47
@rails-bot
Copy link

rails-bot bot commented Jan 4, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Thank you for your contributions.

@rails-bot rails-bot bot added the stale label Jan 4, 2022
@intrip
Copy link
Contributor Author

intrip commented Jan 4, 2022

ping

@rails-bot rails-bot bot removed the stale label Jan 4, 2022
@rails-bot
Copy link

rails-bot bot commented Apr 4, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Thank you for your contributions.

@rails-bot rails-bot bot added the stale label Apr 4, 2022
@intrip
Copy link
Contributor Author

intrip commented Apr 4, 2022

pong

@rails-bot rails-bot bot removed the stale label Apr 4, 2022
jonathanhefner added a commit to jonathanhefner/rails that referenced this pull request May 23, 2022
Prior to this commit, `Hash#to_param` was an alias for `Hash#to_query`,
which meant `Hash#to_param` CGI-escaped query string keys and values,
like other `to_query` methods, but *unlike* other `to_param` methods.
This caused double escaping in some cases, such as when passing a `Hash`
as an `anchor` argument to `url_for`.

This commit adds an internal-use-only optional `escape` block to
`to_query` methods, and `Hash#to_param` now uses this block to bypass
escaping, while still being implemented in terms of `Hash#to_query`.

Fixes rails#43289.
Closes rails#43293.

Co-authored-by: Jacopo <beschi.jacopo@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

path_for escapes anchor/fragment twice
2 participants