Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for comments #871

Closed
wants to merge 39 commits into from

Conversation

Projects
None yet
3 participants
@DavidCramer
Copy link
Contributor

commented Jan 17, 2018

See #797, #862.

@DavidCramer DavidCramer requested review from westonruter and ThierryA Jan 17, 2018

// Set a target if needed.
if ( ! $node->hasAttribute( 'target' ) ) {
$node->setAttribute( 'target', '_blank' );

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 17, 2018

Member

Per the spec: https://www.ampproject.org/docs/reference/components/amp-form#target

Indicates where to display the form response after submitting the form. The value must be _blank or _top.

The default I think should be _top as that is closer to the default behavior for a form. The actual default is _self but that's not valid AMP. So I think this should be doing:

if ( ! in_array( $node->hasAttribute( 'target' ), array( '_top', '_blank' ) ) ) {
    $node->setAttribute( 'target', '_top' );
// Correct action.
if ( $node->hasAttribute( 'action' ) && ! $node->hasAttribute( 'action-xhr' ) ) {
$action_url = $node->getAttribute( 'action' );
$action_url = str_replace( 'http:', '', $action_url );

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 17, 2018

Member

It's possible that http: appears somewhere else in the URL. So then this should instead use a regex like: preg_replace( '#^http:(?=//)#', '', $action_url ) to ensure it only replaces at the beginning of the string.

$action_url = str_replace( 'http:', '', $action_url );
$node->setAttribute( 'action', $action_url );
if ( 'post' === $node->getAttribute( 'method' ) ) {

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 17, 2018

Member

Per above, this could compare with $method.

continue;
}
if ( ! $node->hasAttribute( 'action' ) || '' === $node->getAttribute( 'action' ) ) {

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 17, 2018

Member

Per the spec: https://www.ampproject.org/docs/reference/components/amp-form#target

This attribute is required for method=GET. For method=POST, the action attribute is invalid, use action-xhr instead.

Is the second part of this conditional intended to check if 'get' === $node->getAttribute( 'method' )? Since the default method is get then I think there should be something above that does:

$method = 'get';
if ( $node->hasAttribute( 'method' ) ) {
    $method = strtolower( $node->getAttribute( 'method' ) );
}

And then use the $method variable going forward.

}
if ( ! $node->hasAttribute( 'action' ) || '' === $node->getAttribute( 'action' ) ) {
$node->parentNode->removeChild( $node );

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 17, 2018

Member

I don't think this is right. If method is get and there is no action, then what this really should do I think is just set the action to be esc_url_raw( 'https://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'] ), because the default action for a form is the URL that the form appears on.

@@ -35,7 +33,7 @@ public static function get_dom_from_content( $content ) {
* Add utf-8 charset so loadHTML does not have problems parsing it.
* See: http://php.net/manual/en/domdocument.loadhtml.php#78243
*/
$result = $dom->loadHTML( '<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body>' . $content . '</body></html>' );
$result = $dom->loadHTML( $document );

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 17, 2018

Member

Changes here are conflicting in develop.

@westonruter

This comment has been minimized.

Copy link
Member

commented Jan 17, 2018

For handling form method=POST submissions I think we can pretty cleanly handle them via https://www.ampproject.org/docs/reference/components/amp-form#redirecting-after-a-submission

Usually a POST request should be accompanied by a redirect. So we should be able to handle that via:

add_filter( 'wp_redirect', function( $location ) {
    $location = wp_sanitize_redirect( $location );
    header( "AMP-Redirect-To: $location" );
    header( "Access-Control-Expose-Headers: AMP-Redirect-To", false );
    return false; // Prevent header( 'Location:' ). I'm not sure this is necessary, or if we can just return $location.
}, 1000 );

Some more examples in general I've found at https://ampbyexample.com/components/amp-form/

// @todo Add more validation checking and potentially the whitelist sanitizer.
$output = $dom->saveHTML();

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 17, 2018

Member

@DavidCramer I chatted about this with @ThierryA, but I think we should avoid having to load the entire response into the DOM for sanitization. I think that we should limit sanitization to just the post content, 3rd-party widgets, and other “leaf nodes” which we know are liable to be invalid AMP.

Is this right? Or am I off track? My worry is that loading the HTML into a DOMDocument for every request will be too heavy. But at the same time, if we're already spinning up a DOMDocument for every instance of the_content() maybe it is actually better to just do one for the entire HTML. This would certainly simplify somethings for sanitizing widgets (cc @kienstra). It could also mean that we wouldn't have to have this ad hoc logic to search for elements requiring components: https://github.com/Automattic/amp-wp/blob/e60cb152a50e2ffbf5504d41aee859ad1d0e0baa/includes/class-amp-theme-support.php#L448-L455

Maybe we should decide to not prematurely optimize and instead keep things simple at first and just go ahead and sanitize the entire response. Thoughts?

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 18, 2018

Member

See findings at #875 (comment)

It seems we should definitely not sanitize the entire output buffer.

This comment has been minimized.

Copy link
@DavidCramer

DavidCramer Jan 18, 2018

Author Contributor

@westonruter whats the consensus now on this? It looks like your findings indicated it may be a good idea to sanitize the whole buffer output. I personally think this is the most efficient. I spoke with @ThierryA about this yesterday and he was going to do some tests, like you did.

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 18, 2018

Member

I'm leaning toward sanitizing the entire output buffer now.

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 18, 2018

Member

@DavidCramer do you want to build off of #875 (comment) in a new PR to serve as a basis for what you're introducing here with form sanitization?

@westonruter

This comment has been minimized.

Copy link
Member

commented Jan 19, 2018

@DavidCramer FYI: There won't need to be a get_scripts() method defined in the form sanitizer once #882 is merged.

@DavidCramer DavidCramer changed the title [WIP] - Add form sanitization for comment submissions [WIP] - Add support for comments Jan 19, 2018

@ThierryA

This comment has been minimized.

Copy link
Collaborator

commented Jan 19, 2018

@DavidCramer #882 was just merged to develop so you can apply @westonruter suggestion here.

$url = substr( $url, 5 );
}
// @todo Identify arguments and make filterable/settable.
$template .= '<amp-list src="' . esc_attr( $url ) . '" height="400" single-item="true" layout="fixed-height">';

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 19, 2018

Member

What about calling the actual comments walker here too and provide the content as placeholder/fallback content? Maybe that is overkill.

This comment has been minimized.

Copy link
@DavidCramer

DavidCramer Jan 19, 2018

Author Contributor

That's not a bad idea. However, it would increase page weight. But setting some kind of fallback is great idea.

DavidCramer added some commits Jan 22, 2018

// Get the action URL.
if ( ! $node->hasAttribute( 'action' ) ) {
$action_url = esc_url_raw( '//' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'] ); // WPCS: ignore. input var okay, sanitization ok.

This comment has been minimized.

Copy link
@DavidCramer

DavidCramer Jan 22, 2018

Author Contributor

@westonruter I don't normally like using $_SERVER if I can avoid it, but this is from your suggestion here: #871 (comment)
I feel this should be a little cleaner, but can't think how. any suggestions?

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 23, 2018

Member

I think this is right what you have here, except that there should a beautiful wp_unslash() call added to it, since $_SERVER vars are auto-slashed for back-compat reasons:

esc_url_raw( '//' . wp_unslash( $_SERVER['HTTP_HOST'] ) . wp_unslash( $_SERVER['REQUEST_URI'] ) );

The one for HTTP_HOST is somewhat overkill.

This comment has been minimized.

Copy link
@DavidCramer

DavidCramer Jan 23, 2018

Author Contributor

agreed on the overkill. but will unslash the request uri.

}
return 'element';
}

This comment has been minimized.

Copy link
@DavidCramer

DavidCramer Jan 22, 2018

Author Contributor

@westonruter component scripts are have custom-element in the script tags, however amp-mustache used for amp-list does not, instead it has custom-template. see https://www.ampproject.org/docs/reference/components/amp-mustache
It looks like this is the only component that's different, however I separated it to its own method in case the spec changes, or additionals are added in future.

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 23, 2018

Member

I don't think get_component_type would be needed if we sanitize the entire body, right?

This comment has been minimized.

Copy link
@DavidCramer

DavidCramer Jan 23, 2018

Author Contributor

@westonruter I think we still do, since this is done after the sanitisation and the scripts are placed in. All the scripts are custom-element but the template is custom-template.

@DavidCramer

This comment has been minimized.

Copy link
Contributor Author

commented Jan 22, 2018

@westonruter Regarding moving the sanitisation from the_content to the output buffer: I needed this in order to have the comments to work, which I have done in my dev, but have not added to this PR. Is that still under discussion or could I do a PR on it based on what I've already got?

@westonruter

This comment has been minimized.

Copy link
Member

commented Jan 23, 2018

@DavidCramer we'll go ahead and sanitize the entire body, or actually, the entire html document. I was also going to work on a PR, so if you can get one up first that is good as well.

$GLOBALS['comment'] = $comment; // WPCS: override ok.
$comment->comment_date = get_comment_date();
$comment->comment_time = get_comment_time();
$comment->comment = amp_get_comments_recursive( $id, $comment->comment_ID );

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 23, 2018

Member

I believe this will have performance problems, as it could result in a lot of DB queries. If you look at wp_list_comments() I believe you can see it gets a flat list of all comments for the post, and then it passes it into Walker::paged_walk() which then goes about identifying which comments are children of other comments.

This comment has been minimized.

Copy link
@westonruter

westonruter Jan 23, 2018

Member

That is, AMP_Comment_Walker:: paged_walk() here.

This comment has been minimized.

Copy link
@DavidCramer

DavidCramer Jan 23, 2018

Author Contributor

@westonruter Agreed. Can revise it to be more efficient.

This comment has been minimized.

Copy link
@DavidCramer

DavidCramer Jan 23, 2018

Author Contributor

@westonruter how about this? b7c5963

@DavidCramer DavidCramer changed the title [WIP] - Add support for comments Add support for comments Jan 24, 2018

@DavidCramer

This comment has been minimized.

Copy link
Contributor Author

commented Jan 25, 2018

@westonruter You can start reviewing this, but there is an issue with the url template strings I need to fix. I'm needing to head off now, but will finish when I'm back. If not by the time you get on, I'll sort it in the morning.

@westonruter

This comment has been minimized.

Copy link
Member

commented Jan 25, 2018

there is an issue with the url template strings I need to fix

Shoot. This is a side effect of switching from saveXML to saveHTML in #891.

Similar to what was done in #895 with AMP_DOM_Utils::convert_amp_bind_attributes(), we need to do a temporary replacement of amp-mustache tokens during parsing and serialization, then swap them back after. I'm working on the change.

@westonruter

This comment has been minimized.

Copy link
Member

commented Jan 26, 2018

We talked it over and we decided to try the route of amp-live-list instead and see how it goes. 🎉

@westonruter westonruter deleted the add/862-support-comment-submissions branch Jul 5, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.