Add html_attr_relaxed escaping strategy #4743
@fabpot WDYT, would you endorse such an additional strategy?
tests/Runtime/EscaperRuntimeTest.php (outdated)

    if (\in_array($literal, $immune)) {
        $this->assertEquals($literal, (new EscaperRuntime())->escape($literal, 'html_attr_relaxed'));
    } else {
        $this->assertNotEquals(

    public function testHtmlAttributeRelaxedEscapingConvertsSpecialChars()
    {
        foreach ($this->htmlAttrSpecialChars as $key => $value) {

Shouldn't we test `:`, `@`, `[`, or `]` as well here (they are not part of `$this->htmlAttrSpecialChars`)?
Short: I don't think we need to.
I don't fully understand why the tests are written and organized the way they are. It seems this has been inherited from the old Zend Framework Escaper tests.
`testHtmlAttributeRelaxedEscapingConvertsSpecialChars` and `testHtmlAttributeEscapingConvertsSpecialChars` use `$htmlAttrSpecialChars` to test for a few samples that
- alnums are not escaped
- a few "immune" chars (common to both `html_attr` and `html_attr_relaxed`) are not escaped
- two examples beyond ASCII 0xFF are escaped
- other examples like `<>&"` are being escaped.
In addition to that, we have `testHtmlAttributeEscapingEscapesOwaspRecommendedRanges` and `testHtmlAttributeRelaxedEscapingEscapesOwaspRecommendedRanges`. Those tests iterate over the full 0x01 to 0xFF range and check every single character in it. They expect escaping in all cases except the 0-9, a-z, A-Z ranges and the explicitly given lists of immune characters.
So, I think we're safe as-is.
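To illustrate the idea behind the range tests, here is a minimal, self-contained sketch. It is not the code from `EscaperRuntimeTest`: `sketchEscape()` and `unescapedInRange()` are hypothetical helpers, the hex-entity output is a simplification, and the two immune lists are written out by hand for this example.

```php
<?php
// Hypothetical stand-in for the escaper: leaves ASCII alphanumerics and the
// given immune characters untouched and hex-encodes every other byte.
function sketchEscape(string $char, array $immune): string
{
    if (preg_match('/^[0-9a-zA-Z]$/D', $char) === 1 || in_array($char, $immune, true)) {
        return $char;
    }

    return sprintf('&#x%02X;', ord($char));
}

// Walks the whole 0x01..0xFF range and collects every byte that comes back
// unchanged, mirroring the "test every single character" approach.
function unescapedInRange(array $immune): array
{
    $unescaped = [];
    for ($code = 0x01; $code <= 0xFF; ++$code) {
        $char = chr($code);
        if (sketchEscape($char, $immune) === $char) {
            $unescaped[] = $char;
        }
    }

    return $unescaped;
}

// Immune lists as assumed for this sketch.
$immuneStrict = [',', '.', '-', '_'];
$immuneRelaxed = [',', '.', '-', '_', ':', '@', '[', ']'];

// The only bytes that differ between the two strategies are the four
// characters added by the relaxed list.
$extra = array_values(array_diff(
    unescapedInRange($immuneRelaxed),
    unescapedInRange($immuneStrict)
));
echo implode(' ', $extra), "\n";
```

Because the loop covers every byte in the range, any character that is neither alphanumeric nor on the immune list must come back escaped, which is why separate samples for `:`, `@`, `[` and `]` would add little.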
Feedback addressed. Thank you! When this gets merged, I'd like to use it as the escaping strategy for attribute names in #3930.
Thank you @mpdude.
This adds `html_attr_relaxed`, a relaxed variant of the `html_attr` escaping strategy. The difference is that `html_attr_relaxed` does not escape the `:`, `@`, `[` and `]` characters. These are used by some front-end frameworks in attribute names to wire special handling/value binding. See https://v2.vuejs.org/v2/guide/syntax.html#v-bind-Shorthand for an example.
The HTML 5 spec does not exclude any of these characters from attribute names (html.spec.whatwg.org/multipage/syntax.html#attributes-2).
However, at least XML processors will treat the colon as the XML namespace separator.
HTML 5 allows XML only on SVG and MathML elements, and only for pre-defined namespace prefixes (developer.mozilla.org/en-US/docs/Web/API/Attr/localName#:~:text=That means that the local,different from the qualified name). Any other `something:` prefix is simply passed on as part of the local attribute name.
According to engine.sygnal.com/research/html5-attribute-names, all current browser implementations handle at least the colon fine, and the aforementioned Vue.js documentation suggests that this is also the case for `@`.
Note also that Symfony UX only conditionally escapes attribute names, and it has `:` and `@` in its safe list: https://github.com/symfony/ux/blob/c9a3e66b8ac53e870097e8a828913e57204398e7/src/TwigComponent/src/ComponentAttributes.php#L82
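As a rough illustration of the difference on framework-style attribute names, the following sketch compares a strict and a relaxed safe set. It is a simplified stand-in and not Twig's implementation: `sketchEscapeAttr()`, `STRICT_SAFE` and `RELAXED_SAFE` are hypothetical names, input is assumed to be ASCII, and every unsafe byte is turned into a plain hex entity.

```php
<?php
// Hypothetical stand-in: hex-encodes every byte that is not in the given
// regex character class. ASCII input is assumed.
function sketchEscapeAttr(string $value, string $safeClass): string
{
    return preg_replace_callback(
        '/[^' . $safeClass . ']/',
        static fn (array $m): string => sprintf('&#x%02X;', ord($m[0])),
        $value
    );
}

// Safe sets as assumed for this sketch: the relaxed one additionally
// allows ':', '@', '[' and ']'.
const STRICT_SAFE = 'a-zA-Z0-9,.\-_';
const RELAXED_SAFE = 'a-zA-Z0-9,.\-_:@\[\]';

foreach ([':href', '@click', 'v-bind:items[0]'] as $name) {
    printf(
        "%s | strict: %s | relaxed: %s\n",
        $name,
        sketchEscapeAttr($name, STRICT_SAFE),
        sketchEscapeAttr($name, RELAXED_SAFE)
    );
}
```

With the relaxed set, the three sample names pass through unchanged, while characters such as `"` or `<` are still encoded under both sets.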
Closes #3614.