Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sanitization reporting #912

Merged
merged 43 commits into from Feb 9, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
0a44284
Add initial callback for listening for element/attribute removal in w…
westonruter Jan 27, 2018
e9f394a
Issue #843: Tracking for removed nodes and attributes.
Jan 28, 2018
d7104b8
Issue #843: Correct a failed Travis build by excluding a PHPCS rule.
Jan 28, 2018
616262f
Issue #843: Add a method to process markup for AMP validtity.
Jan 28, 2018
4714062
Issue #843: Track removed iframes in a helper method.
Jan 28, 2018
e209005
Issue #843: Initial registration of the REST endpoint for validation.
Jan 29, 2018
aed76c6
Issue #864: Support <amp-carousel> in 'Gallery' widget.
Jan 30, 2018
30c666f
Issue #843: Report removed attributes and nodes in a histogram.
Jan 30, 2018
76b0f17
Revert "Issue #864: Support <amp-carousel> in 'Gallery' widget."
Jan 30, 2018
0b7c3fc
Issue #843: Align equals signs vertically.
Jan 30, 2018
71744e5
Issue #843: Prepare to add headers to frontend GET requests.
Jan 30, 2018
da408bf
Issue #843: Align an equals sign to correct the failed Travis build.
Jan 30, 2018
ab3909a
Issue #864: Validation data in the response header.
Feb 1, 2018
90090b0
Issue #864: Remove an extra conditional, nest the 'mutation_callback.'
Feb 1, 2018
83dac87
Issue #864: Rename function to finish_output_buffering().
Feb 1, 2018
70521ef
Issue #843: Remove the extra variabl in the @return tag.
Feb 1, 2018
cc4fe85
Issue #843: Add processed markup to REST API response.
Feb 1, 2018
0b6a4ee
Issue #843: Output the 'processed_markup' at the bottom of the response.
Feb 1, 2018
db49a33
Issue #843: Merge in 'develop' branch and resolve conflicts.
Feb 1, 2018
c92bee4
Issue #843: Apply validation to post update on wp-admin/post.php.
Feb 1, 2018
6b5593b
Issue #843: Verify the nonce before validating on 'save_post'.
Feb 1, 2018
a168a5c
Issue #843: Sanitize $_GET value in addition to the 'ignore' comment.
Feb 1, 2018
a52ae05
Issue #843: Correct Travis error by changing error text.
Feb 1, 2018
0472a33
Issue #843: Report node removal in the rest of the sanitizers.
Feb 4, 2018
6d2350f
Merge branch 'develop' of https://github.com/Automattic/amp-wp into a…
westonruter Feb 5, 2018
3de38b8
Issue #843: Call the 'mutation_callback' for more attribute removals.
Feb 5, 2018
46e7084
Issue #843: Merge in develop, resolve conflicts.
Feb 5, 2018
972ac2c
Issue #843: Merge in develop again, resolve conflicts.
Feb 7, 2018
d2ab9ee
Issue #843: Rename test to 'get_buffer' instead of 'prepare_response'.
Feb 7, 2018
cb0cfc9
Issue #843: Fix an issue in the error message.
Feb 7, 2018
084acaa
Issue #843: Revert renaming of methods, adjust unit tests.
Feb 8, 2018
13ab280
Issue #843: Remove special characters, update documentation.
Feb 8, 2018
f996673
Issue #843: Change @const to @var for constants.
Feb 8, 2018
f8aeca8
Issue #843: Rename class to 'AMP_Validation_Utils'
Feb 8, 2018
69317b8
Issue #843; Use constants instead of string literals.
Feb 8, 2018
583a6bc
Issue #843: Add nonce verification for the editor message.
Feb 8, 2018
cd910ff
Issue #843: Align comments in addition to variable names.
Feb 8, 2018
a6d071a
List the AMP-invalid elements and attributes that have been saved
westonruter Feb 8, 2018
b40729b
Only report mutations when node/attribute is removed due to invalidity
westonruter Feb 9, 2018
9954c11
Improve naming of callback and removal methods
westonruter Feb 9, 2018
15d9186
Remove add_header until use case is confirmed
westonruter Feb 9, 2018
bb7a175
Skip reporting removal of iframe's former parent node
westonruter Feb 9, 2018
29caa8a
Skip sanitization reporting when saving post that doesn't support AMP
westonruter Feb 9, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
1 change: 1 addition & 0 deletions amp.php
Expand Up @@ -95,6 +95,7 @@ function amp_after_setup_theme() {
add_action( 'wp_loaded', 'amp_add_options_menu' );
add_action( 'parse_query', 'amp_correct_query_when_is_front_page' );
AMP_Post_Type_Support::add_post_type_support();
AMP_Validation_Utils::init();
}
add_action( 'after_setup_theme', 'amp_after_setup_theme', 5 );

Expand Down
1 change: 1 addition & 0 deletions includes/class-amp-autoloader.php
Expand Up @@ -83,6 +83,7 @@ class AMP_Autoloader {
'AMP_DOM_Utils' => 'includes/utils/class-amp-dom-utils',
'AMP_HTML_Utils' => 'includes/utils/class-amp-html-utils',
'AMP_Image_Dimension_Extractor' => 'includes/utils/class-amp-image-dimension-extractor',
'AMP_Validation_Utils' => 'includes/utils/class-amp-validation-utils',
'AMP_String_Utils' => 'includes/utils/class-amp-string-utils',
'AMP_WP_Utils' => 'includes/utils/class-amp-wp-utils',
'AMP_Widget_Archives' => 'includes/widgets/class-amp-widget-archives',
Expand Down
26 changes: 18 additions & 8 deletions includes/class-amp-theme-support.php
Expand Up @@ -760,13 +760,27 @@ public static function finish_output_buffering() {
*
* @since 0.7
*
* @param string $response HTML document response.
* @param string $response HTML document response. By default it expects a complete document.
* @param array $args {
* Args to send to the preprocessor/sanitizer.
*
* @type callable $remove_invalid_callback Function to call whenever a node is removed due to being invalid.
* }
* @return string AMP document response.
* @global int $content_width
*/
public static function prepare_response( $response ) {
public static function prepare_response( $response, $args = array() ) {
global $content_width;

$args = array_merge(
array(
'content_max_width' => ! empty( $content_width ) ? $content_width : AMP_Post_Template::CONTENT_MAX_WIDTH, // Back-compat.
'use_document_element' => true,
'remove_invalid_callback' => null,
),
$args
);

/*
* Make sure that <meta charset> is present in output prior to parsing.
* Note that the meta charset is supposed to appear within the first 1024 bytes.
Expand All @@ -780,11 +794,7 @@ public static function prepare_response( $response ) {
1
);
}
$dom = AMP_DOM_Utils::get_dom( $response );
$args = array(
'content_max_width' => ! empty( $content_width ) ? $content_width : AMP_Post_Template::CONTENT_MAX_WIDTH, // Back-compat.
'use_document_element' => true,
);
$dom = AMP_DOM_Utils::get_dom( $response );

// First ensure the mandatory amp attribute is present on the html element, as otherwise it will be stripped entirely.
if ( ! $dom->documentElement->hasAttribute( 'amp' ) && ! $dom->documentElement->hasAttribute( '⚡️' ) ) {
Expand All @@ -798,7 +808,7 @@ public static function prepare_response( $response ) {
// @todo If 'utf-8' is not the blog charset, then we'll need to do some character encoding conversation or "entityification".
if ( 'utf-8' !== strtolower( get_bloginfo( 'charset' ) ) ) {
/* translators: %s is the charset of the current site */
trigger_error( esc_html( sprintf( __( 'The database has the %s encoding when it needs to be utf-8 to work with AMP.', 'amp' ), get_bloginfo( 'charset' ) ), E_USER_WARNING ) ); // phpcs:ignore WordPress.PHP.DevelopmentFunctions.error_log_trigger_error
trigger_error( esc_html( sprintf( __( 'The database has the %s encoding when it needs to be utf-8 to work with AMP.', 'amp' ), get_bloginfo( 'charset' ) ) ), E_USER_WARNING ); // phpcs:ignore WordPress.PHP.DevelopmentFunctions.error_log_trigger_error
}

$response = "<!DOCTYPE html>\n";
Expand Down
2 changes: 1 addition & 1 deletion includes/sanitizers/class-amp-audio-sanitizer.php
Expand Up @@ -81,7 +81,7 @@ public function sanitize() {
* @see: https://github.com/ampproject/amphtml/issues/2261
*/
if ( 0 === $new_node->childNodes->length && empty( $new_attributes['src'] ) ) {
$node->parentNode->removeChild( $node );
$this->remove_invalid_child( $node );
} else {
$node->parentNode->replaceChild( $new_node, $node );
}
Expand Down
38 changes: 38 additions & 0 deletions includes/sanitizers/class-amp-base-sanitizer.php
Expand Up @@ -304,4 +304,42 @@ public function maybe_enforce_https_src( $src, $force_https = false ) {

return $src;
}

/**
* Removes an invalid child of a node.
*
* Also, calls the mutation callback for it.
* This tracks all the nodes that were removed.
*
* @since 0.7
*
* @param DOMElement $child The node to remove.
* @return void.
*/
public function remove_invalid_child( $child ) {
$child->parentNode->removeChild( $child );
if ( isset( $this->args['remove_invalid_callback'] ) ) {
call_user_func( $this->args['remove_invalid_callback'], $child, AMP_Validation_Utils::NODE_REMOVED );
}
}

/**
* Removes an invalid attribute of a node.
*
* Also, calls the mutation callback for it.
* This tracks all the attributes that were removed.
*
* @since 0.7
*
* @param DOMElement $element The node for which to remove the attribute.
* @param string $attribute The attribute to remove from the node.
* @return void.
*/
public function remove_invalid_attribute( $element, $attribute ) {
$element->removeAttribute( $attribute );
if ( isset( $this->args['remove_invalid_callback'] ) ) {
call_user_func( $this->args['remove_invalid_callback'], $element, AMP_Validation_Utils::ATTRIBUTE_REMOVED, $attribute );
}
}

}
16 changes: 8 additions & 8 deletions includes/sanitizers/class-amp-blacklist-sanitizer.php
Expand Up @@ -73,13 +73,13 @@ private function strip_attributes_recursive( $node, $bad_attributes, $bad_protoc
$attribute = $node->attributes->item( $i );
$attribute_name = strtolower( $attribute->name );
if ( in_array( $attribute_name, $bad_attributes, true ) ) {
$node->removeAttribute( $attribute_name );
$this->remove_invalid_attribute( $node, $attribute_name );
continue;
}

// The on* attributes (like onclick) are a special case.
if ( 0 === stripos( $attribute_name, 'on' ) && 'on' !== $attribute_name ) {
$node->removeAttribute( $attribute_name );
$this->remove_invalid_attribute( $node, $attribute_name );
continue;
} elseif ( 'a' === $node_name ) {
$this->sanitize_a_attribute( $node, $attribute );
Expand Down Expand Up @@ -112,10 +112,10 @@ private function strip_tags( $node, $tag_names ) {
for ( $i = $length - 1; $i >= 0; $i-- ) {
$element = $elements->item( $i );
$parent_node = $element->parentNode;
$parent_node->removeChild( $element );
$this->remove_invalid_child( $element );

if ( 'body' !== $parent_node->nodeName && AMP_DOM_Utils::is_node_empty( $parent_node ) ) {
$parent_node->parentNode->removeChild( $parent_node );
$this->remove_invalid_child( $parent_node );
}
}
}
Expand All @@ -134,13 +134,13 @@ private function sanitize_a_attribute( $node, $attribute ) {
$old_value = $attribute->value;
$new_value = trim( preg_replace( self::PATTERN_REL_WP_ATTACHMENT, '', $old_value ) );
if ( empty( $new_value ) ) {
$node->removeAttribute( $attribute_name );
$this->remove_invalid_attribute( $node, $attribute_name );
} elseif ( $old_value !== $new_value ) {
$node->setAttribute( $attribute_name, $new_value );
}
} elseif ( 'rev' === $attribute_name ) {
// rev removed from HTML5 spec, which was used by Jetpack Markdown.
$node->removeAttribute( $attribute_name );
$this->remove_invalid_attribute( $node, $attribute_name );
} elseif ( 'target' === $attribute_name ) {
// _blank is the only allowed value and it must be lowercase.
// replace _new with _blank and others should simply be removed.
Expand All @@ -150,7 +150,7 @@ private function sanitize_a_attribute( $node, $attribute ) {
$node->setAttribute( $attribute_name, '_blank' );
} else {
// Only _blank is allowed.
$node->removeAttribute( $attribute_name );
$this->remove_invalid_attribute( $node, $attribute_name );
}
}
}
Expand Down Expand Up @@ -219,7 +219,7 @@ private function replace_node_with_children( $node, $bad_attributes, $bad_protoc

// Remove the node from the parent, if defined.
if ( $node->parentNode ) {
$node->parentNode->removeChild( $node );
$this->remove_invalid_child( $node );
}
}

Expand Down
3 changes: 2 additions & 1 deletion includes/sanitizers/class-amp-iframe-sanitizer.php
Expand Up @@ -74,7 +74,7 @@ public function sanitize() {
* @see: https://github.com/ampproject/amphtml/issues/2261
*/
if ( empty( $new_attributes['src'] ) ) {
$node->parentNode->removeChild( $node );
$this->remove_invalid_child( $node );
continue;
}

Expand Down Expand Up @@ -193,4 +193,5 @@ private function build_placeholder( $parent_attributes ) {

return $placeholder_node;
}

}
2 changes: 1 addition & 1 deletion includes/sanitizers/class-amp-img-sanitizer.php
Expand Up @@ -74,7 +74,7 @@ public function sanitize() {
}

if ( ! $node->hasAttribute( 'src' ) || '' === $node->getAttribute( 'src' ) ) {
$node->parentNode->removeChild( $node );
$this->remove_invalid_child( $node );
continue;
}

Expand Down
9 changes: 6 additions & 3 deletions includes/sanitizers/class-amp-tag-and-attribute-sanitizer.php
Expand Up @@ -704,6 +704,9 @@ private function sanitize_disallowed_attributes_in_node( $node, $attr_spec_list

foreach ( $attrs_to_remove as $attr ) {
$node->removeAttributeNode( $attr );
if ( isset( $this->args['remove_invalid_callback'], $attr->name ) ) {
call_user_func( $this->args['remove_invalid_callback'], $node, AMP_Validation_Utils::ATTRIBUTE_REMOVED, $attr->name );
}
}
}

Expand Down Expand Up @@ -809,7 +812,7 @@ private function delegated_sanitize_disallowed_attribute_values_in_node( $node,
( true === $attr_spec_list[ $attr_name ][ AMP_Rule_Spec::VALUE_URL ][ AMP_Rule_Spec::ALLOW_EMPTY ] ) ) {
$node->setAttribute( $attr_name, '' );
} else {
$node->removeAttribute( $attr_name );
$this->remove_invalid_attribute( $node, $attr_name );
}
}
}
Expand Down Expand Up @@ -1448,13 +1451,13 @@ private function remove_node( $node ) {
*/
$parent = $node->parentNode;
if ( $node && $parent ) {
$parent->removeChild( $node );
$this->remove_invalid_child( $node );
}
while ( $parent && ! $parent->hasChildNodes() && $this->root_element !== $parent ) {
$node = $parent;
$parent = $parent->parentNode;
if ( $parent ) {
$parent->removeChild( $node );
$this->remove_invalid_child( $node );
}
}
}
Expand Down
2 changes: 1 addition & 1 deletion includes/sanitizers/class-amp-video-sanitizer.php
Expand Up @@ -93,7 +93,7 @@ public function sanitize() {
* See: https://github.com/ampproject/amphtml/issues/2261
*/
if ( 0 === $new_node->childNodes->length && empty( $new_attributes['src'] ) ) {
$node->parentNode->removeChild( $node );
$this->remove_invalid_child( $node );
} else {
$node->parentNode->replaceChild( $new_node, $node );
}
Expand Down