-
Notifications
You must be signed in to change notification settings - Fork 363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How can I reference an ESI URL that includes an encoded HTML entity with Varnish? #2034
Comments
I'm not sure we should expand html entities. Why not urlencode the ' ? |
fgsch, +1. HTML entities do no form legal URLs, and wouldn't work any better in A HREF or IMG links. URL encoding as with the %20 in the example is the right solution. More to the point, Varnish shouldn't slow down everything else to run a decoder over what is best understood as an error in the src link. pkaeding, the best solution would be for your app to decode HTML entities and re-encode them with URL encoding before passing along the string. |
Because https://www.w3.org/TR/esi-lang is not particularly clear about this, I'll seek clarification and we'll follow up then |
That's unfortunate, but I understand the pragmatism of the performance argument. (I would argue that HTML entities do work in A HREF or IMG links when the user agent is set up to properly interpret HTML. I gather that Varnish is not really interpreting the HTML, rather just looking for the specific ESI tokens that it cares about. This seems like a reasonable trade-off, all things considered.) Thanks for the clarification! I will modify my app to work around this. |
ESI is an XML-based language, so &-encoded character escapes are allowed there syntactically. However, semantically that attribute is a URI, not an IRI. Putting non-ascii content in that field isn't specified. Really, it should create some sort of error condition, but the outcome would be the same -- "don't do that." Understand that the ESI spec is getting close to two decades old, was created before internationalisation was even partly understood in Internet protocols, and hasn't been standardised or publicly maintained. So it's not surprising that these sorts of ambiguities pop up. |
Backport review: Not a bug. The ESI parsing will stay as it is, so there is nothing to backport. |
I am using Varnish 4.1.2, and trying to include ESI content. Sometimes, the URL for the included content might include encoded HTML entities, such as
'
('
).For example, I have the following in the HTML returned by my app for the outer shell:
Expected Behavior
This should result in an ESI request for a path (
/esi/map/alice's%20house
) with 3 segments and no query string, which when fully decoded, result in:esi
map
alice's house
The apostrophe in the third segment is HTML encoded because it is being transmitted in an HTML document. The space is URL encoded because it is part of a URL.
Current Behavior
However, Varnish seems to not decode the HTML encoded entity before making the ESI request. It sends a request for
/esi/map/alice's%20house
, which is a bad request, because&
is not legal in the path or a URL.Steps to Reproduce
Context
This is preventing me from using ESI to serve content that uses user-submitted data for one path segment of the URL.
Your Environment
The text was updated successfully, but these errors were encountered: