Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for mandatory_oneof and mandatory_anyof attribute constraints #938

Closed
westonruter opened this issue Feb 7, 2018 · 19 comments · Fixed by #4285
Closed

Add support for mandatory_oneof and mandatory_anyof attribute constraints #938

westonruter opened this issue Feb 7, 2018 · 19 comments · Fixed by #4285
Assignees
Labels
QA passed Has passed QA and is done Sanitizers Validation
Projects
Milestone

Comments

@westonruter
Copy link
Member

westonruter commented Feb 7, 2018

Feature description

Someone had a <iframe src="about:blank"> in a post on https://wordpress.org/support/topic/google-console-error/#post-9946202

This is resulting in an <amp-iframe> being rendered but without a src attribute. The result is in valid AMP because src is mandatory.

The whitelist sanitizer should be removing the <amp-iframe> in this case I should think. But otherwise, the iframe sanitizer itself can preemptively remove the iframe to ensure validity. Perhaps a placeholder could be put there in its place.


Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

  1. The iframe sanitizer should preemptively remove the iframe to ensure validity
  2. When <iframe src="about:blank">is in a post the whitelist sanitizer should be removing the <amp-iframe>

Implementation brief

QA testing instructions

Demo

Changelog entry

@westonruter westonruter added this to the v0.7 milestone Feb 8, 2018
@westonruter westonruter removed this from the v0.7 milestone Mar 7, 2018
@westonruter
Copy link
Member Author

@westonruter westonruter changed the title Fully remove iframes that have illegal src Fully remove iframes that have illegal src (add support for mandatory_oneof attribute constraint) Jul 26, 2019
@westonruter westonruter added this to the v1.3 milestone Jul 26, 2019
@westonruter
Copy link
Member Author

This came up yet again, but for a different reason: https://wordpress.org/support/topic/the-tag-iframe-is-missing-a-mandatory-attribute/

@westonruter
Copy link
Member Author

westonruter commented Aug 9, 2019

amp-script has mandatory_anyof (as of #3003):

  attrs: {
    name: "script"
    mandatory_anyof: "['script', 'src']"
  }
  attrs: {
    name: "src"
    mandatory_anyof: "['script', 'src']"
    value_url: {
      protocol: "https"
      allow_relative: false
    }
    blacklisted_value_regex: "__amp_source_origin"
  }

Compare with amp-iframe which has mandatory_oneof:

  attrs: {
    name: "src"
    mandatory_oneof: "['src', 'srcdoc']"
    value_url: {
      protocol: "data"
      protocol: "https"
      allow_relative: true  # Will be set to false at a future date.
    }
    blacklisted_value_regex: "__amp_source_origin"
  }

Neither of these constraints are being applied currently.

Per @choumx, “mandatory_oneof requires that exactly one attribute is present”. This makes sense as amp-iframe allows for both src and srcdoc since not all browsers support the latter.

@westonruter westonruter changed the title Fully remove iframes that have illegal src (add support for mandatory_oneof attribute constraint) Add support for mandatory_oneof and mandatory_anyof attribute constraints Aug 9, 2019
@dreamofabear
Copy link

Good catch that amp-script should use mandatory_oneof.

@westonruter
Copy link
Member Author

@choumx Should it? I thought this made sense because amp-script should be configured to use a script either in an external location (via src attribute) or locally (via script attribute pointing to a script[type=text/plain]). No?

@dreamofabear
Copy link

Correct, so an amp-script that has both src and script should be invalid. Which is what mandatory_oneof enforces.

@westonruter
Copy link
Member Author

So then should amp-iframe actually use mandatory_anyof because both src and srcdoc should be allowed? Per MDN:

image

So this should be valid AMP:

<amp-iframe 
    width="640" 
    height="480" 
    layout="responsive" 
    src="https://example.com"
    srcdoc="&lt;!doctype html&gt; &lt;html&gt; &lt;head&gt; &lt;title&gt;Example Domain&lt;/title&gt; &lt;meta charset=&quot;utf-8&quot; /&gt; &lt;meta http-equiv=&quot;Content-type&quot; content=&quot;text/html; charset=utf-8&quot; /&gt; &lt;meta name=&quot;viewport&quot; content=&quot;width=device-width, initial-scale=1&quot; /&gt; &lt;style type=&quot;text/css&quot;&gt; body { background-color: #f0f0f2; margin: 0; padding: 0; font-family: &quot;Open Sans&quot;, &quot;Helvetica Neue&quot;, Helvetica, Arial, sans-serif; } div { width: 600px; margin: 5em auto; padding: 50px; background-color: #fff; border-radius: 1em; } a:link, a:visited { color: #38488f; text-decoration: none; } @media (max-width: 700px) { body { background-color: #fff; } div { width: auto; margin: 0 auto; border-radius: 0; padding: 1em; } } &lt;/style&gt; &lt;/head&gt; &lt;body&gt; &lt;div&gt; &lt;h1&gt;Example Domain&lt;/h1&gt; &lt;p&gt;This domain is established to be used for illustrative examples in documents. You may use this domain in examples without prior coordination or asking for permission.&lt;/p&gt; &lt;p&gt;&lt;a href=&quot;http://www.iana.org/domains/example&quot;&gt;More information...&lt;/a&gt;&lt;/p&gt; &lt;/div&gt; &lt;/body&gt; &lt;/html&gt;"
    >
  <span placeholder>...</span>
</amp-iframe>

But it is not.

@dreamofabear
Copy link

Do you know if there's a related bug on such browsers? Asking because it looks like amp-iframe converts srcdoc to a data URI under the hood: https://github.com/ampproject/amphtml/blob/a033bc12b94cde56887bdde7ea543c089d3502de/extensions/amp-iframe/0.1/amp-iframe.js#L255-L262

@westonruter
Copy link
Member Author

Interesting. I wasn't aware of that. If that's the case, where srcdoc is pushed into src with a data: URL, then indeed mandatory_oneof seems fine.

@MackenzieHartung MackenzieHartung added this to Backlog in Ongoing Sep 17, 2019
@MackenzieHartung MackenzieHartung modified the milestones: v1.3, v1.3.1 Sep 17, 2019
@swissspidy swissspidy modified the milestones: v1.3.1, v1.4 Oct 17, 2019
@westonruter westonruter removed this from the v1.4 milestone Oct 17, 2019
@westonruter westonruter moved this from Backlog to To Do in Ongoing Jan 16, 2020
@kienstra kienstra self-assigned this Feb 6, 2020
@kienstra kienstra moved this from To Do to In Progress in Ongoing Feb 12, 2020
@kienstra
Copy link
Contributor

kienstra commented Feb 12, 2020

Implementation: mandatory_oneof

Hi @westonruter,
Hope your week's off to a great start!

For the mandatory_oneof implementation, what do you think about this? I'll also add mandatory_anyof support.

diff --git a/bin/amphtml-update.py b/bin/amphtml-update.py
index 469d74469..7d9bf7d42 100644
--- a/bin/amphtml-update.py
+++ b/bin/amphtml-update.py
@@ -708,6 +708,10 @@ def GetValues(attr_spec):
        if attr_spec.HasField('mandatory'):
                value_dict['mandatory'] = attr_spec.mandatory
 
+       if attr_spec.HasField('mandatory_oneof'):
+               mandatory_oneof = attr_spec.mandatory_oneof.lstrip('[').rstrip(']').split(', ')
+               value_dict['mandatory_oneof'] = [oneof.strip("'") for oneof in mandatory_oneof]
+
        # Add allowed value
        if attr_spec.value:

diff --git a/includes/sanitizers/class-amp-tag-and-attribute-sanitizer.php b/includes/sanitizers/class-amp-tag-and-attribute-sanitizer.php
index 6786a213d..b3c6a0540 100644
--- a/includes/sanitizers/class-amp-tag-and-attribute-sanitizer.php
+++ b/includes/sanitizers/class-amp-tag-and-attribute-sanitizer.php
@@ -57,6 +57,7 @@ class AMP_Tag_And_Attribute_Sanitizer extends AMP_Base_Sanitizer {
        const INVALID_BLACKLISTED_VALUE_REGEX      = 'INVALID_BLACKLISTED_VALUE_REGEX';
        const DISALLOWED_PROPERTY_IN_ATTR_VALUE    = 'DISALLOWED_PROPERTY_IN_ATTR_VALUE';
        const ATTR_REQUIRED_BUT_MISSING            = 'ATTR_REQUIRED_BUT_MISSING';
+       const MANDATORY_ONEOF_ATTR_MISSING         = 'MANDATORY_ONEOF_ATTR_MISSING';
        const INVALID_LAYOUT_WIDTH                 = 'INVALID_LAYOUT_WIDTH';
        const INVALID_LAYOUT_HEIGHT                = 'INVALID_LAYOUT_HEIGHT';
        const INVALID_LAYOUT_AUTO_HEIGHT           = 'INVALID_LAYOUT_AUTO_HEIGHT';
@@ -729,6 +730,19 @@ class AMP_Tag_And_Attribute_Sanitizer extends AMP_Base_Sanitizer {
                        return null;
                }
 
+               // If there is a 'mandatory_oneof' value and exactly one of the required attributes isn't present, remove the element.
+               if ( $this->is_missing_mandatory_oneof_attribute( $merged_attr_spec_list, $node ) ) {
+                       $this->remove_invalid_child(
+                               $node,
+                               [
+                                       'code'       => self::MANDATORY_ONEOF_ATTR_MISSING,
+                                       // Maybe also include an 'attributes' value.
+                                       'spec_name'  => $this->get_spec_name( $node, $tag_spec ),
+                               ]
+                       );
+                       return null;
+               }
+
                // Add required AMP component scripts.
                $script_components = [];
                if ( ! empty( $tag_spec['requires_extension'] ) ) {
@@ -774,6 +788,8 @@ class AMP_Tag_And_Attribute_Sanitizer extends AMP_Base_Sanitizer {
                return 0 !== count( $this->get_missing_mandatory_attributes( $attr_spec, $node ) );
        }
 
+       private function is_missing_mandatory_oneof_attribute( $attr_spec, DOMElement $node ) {}
+
        /**
         * Get list of mandatory missing mandatory attributes.
         *

Also, It'll be good to avoid checking the same mandatory_oneof values multiple times.

For example, the same mandatory_oneof value "['end-date', 'timeleft-ms', 'timestamp-ms', 'timestamp-seconds']" appears multiple times:

attrs: {
    name: "timeleft-ms"
    mandatory_oneof: "['end-date', 'timeleft-ms', 'timestamp-ms', 'timestamp-seconds']"
    value_regex: "\\d+"
  }
  attrs: {
    name: "timestamp-ms"
    mandatory_oneof: "['end-date', 'timeleft-ms', 'timestamp-ms', 'timestamp-seconds']"
    value_regex: "\\d{13}"
  }
  attrs: {
    name: "timestamp-seconds"
    mandatory_oneof: "['end-date', 'timeleft-ms', 'timestamp-ms', 'timestamp-seconds']"
    value_regex: "\\d{10}"
  }

@westonruter
Copy link
Member Author

Seems like a good approach at first glance.

@kienstra
Copy link
Contributor

Thanks!

@kienstra kienstra moved this from In Progress to Ready for Review in Ongoing Feb 13, 2020
@kienstra
Copy link
Contributor

Whenever this is ready for QA, these testing instructions should be fine: #4285 (comment)

@westonruter westonruter added this to the v1.5 milestone Feb 15, 2020
@kienstra
Copy link
Contributor

Updated Testing Instruction

Please use these instead.

  1. As the description of Add support for mandatory_oneof and mandatory_anyof attribute constraints #938 mentions, adding this to a Custom HTML block:
<iframe src="about:blank">

...should result in the <amp-iframe> being stripped, and the validation errors now include:

mandatory-oneof

  1. And adding this to a Custom HTML block:
<amp-list width="400" height="400"></amp-list>

...should result in this validation error:
amp-list-here

@westonruter westonruter moved this from Ready for Review to Ready for QA in Ongoing Feb 16, 2020
@csossi
Copy link

csossi commented Feb 25, 2020

Hi @kienstra - seeing the validation error as described above for the 2nd case, but looks different for the first. What I'm seeing is:
image

@kienstra
Copy link
Contributor

kienstra commented Mar 3, 2020

Hi @csossi,
Sorry for the delay.

Maybe there was a recent deployment that fixed this.

But it looks like adding the markup from case 2 above to a Custom HTML block:

<amp-list width="400" height="400"></amp-list>

...results in the expected message:

expected-message

Do you see that also?

Thanks, Claudio!

@csossi
Copy link

csossi commented Mar 10, 2020

Verified in QA (able to replicate message above)

@csossi csossi added the QA passed Has passed QA and is done label Mar 10, 2020
@csossi csossi moved this from Ready for QA to Done in Ongoing Mar 10, 2020
@kienstra
Copy link
Contributor

Nice, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
QA passed Has passed QA and is done Sanitizers Validation
Projects
Ongoing
  
Done
Development

Successfully merging a pull request may close this issue.

7 participants