Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Carousel: issues modifying markup in some environments #13531

Open
jeherve opened this issue Sep 24, 2019 · 3 comments
Open

Carousel: issues modifying markup in some environments #13531

jeherve opened this issue Sep 24, 2019 · 3 comments
Assignees
Labels
[Feature] Carousel A fullscreen modal appearing when clicking on an image in a gallery or tiled gallery. [Pri] High [Status] In Progress [Type] Bug When a feature is broken and / or not performing as intended

Comments

@jeherve
Copy link
Member

jeherve commented Sep 24, 2019

Following #13446 (reverted in #13564), I'm seeing some issues in some environments, where:

I'll need to revise my implementation to take this into account. Here is a summary of the current problems:
https://stackoverflow.com/q/58096669/1045686

I may have to give up on using DOMDocument here after all.

Noting that this is happening on WordPress.com for example, so this is blocking D32610-code.


Original issue: #13428
Related: #13604

@jeherve jeherve added [Type] Bug When a feature is broken and / or not performing as intended [Feature] Carousel A fullscreen modal appearing when clicking on an image in a gallery or tiled gallery. [Pri] BLOCKER labels Sep 24, 2019
@jeherve jeherve added this to the 7.8 milestone Sep 24, 2019
@jeherve jeherve self-assigned this Sep 24, 2019
@mdawaffe
Copy link
Member

When are we looking for <ul>? Content from the block editor?

I ask because the gallery_style filter should always have a <div> opening tag.


A doctype and HTML wrappper is added around the HTML we are modifying.

I'm not sure that LIBXML_HTML_NOIMPLIED is sufficient here. DOMDocument always needs a root node, and the output from the gallery_style filter never has one. (The output from the the_content filter may not have one either.) For example:

$dom->loadHTML( '<style>foo</style><div class="hello">', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD );
var_dump( $dom->saveHTML() ); // <style>foo<div class="hello"></div></style>

So we should provide a root element, and we should probably make up a fake type of element to protect against losing content if the input is broken in peculiar ways. Perhaps something like:

$tag = "jetpack-" . mt_rand( 10000, 99999 );
$dom->loadHTML( "<$tag>$html</$tag>", LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD );
…
return substr( trim( $dom->saveHTML() ), strlen( $tag ) + 2, -1 * ( strlen( $tag ) + 3 ) );

The div tag we're editing (we are adding a new attribute) also gets autoclosed.

Part of the problem is that we're using this one function (Jetpack_Carousel->add_data_to_container()) in two different contexts:

  1. In the gallery_style filter, which contains only the opening part of the <div>.
  2. In the the_content filter, which contains the full HTML of the post.

For 1, we could run the function, then remove the closing tag.

For 2, we can just run the function as is.


DOMDocument can also do some weird things with character encodings if we're not careful. Let's be sure to test everything with fancy UTF-8 characters and probably some data in other charsets.

@stale
Copy link

stale bot commented Apr 23, 2020

This issue has been marked as stale. This happened because:

  • It has been inactive in the past 6 months.
  • It hasn’t been labeled `[Pri] Blocker`, `[Pri] High`.

No further action is needed. But it's worth checking if this ticket has clear reproduction steps and it is still reproducible. Feel free to close this issue if you think it's not valid anymore — if you do, please add a brief explanation.

@lsl
Copy link
Contributor

lsl commented Aug 14, 2020

Edit, this might not be super useful after all: #13547 (comment) seems like the encoding needs to be known up front?

I recently had to figure something similar out for the block pattern translation stuff we've been doing, this snippet is possibly useful.

libxml_use_internal_errors( true );
$dom = new \DomDocument();
$dom->loadHTML( "<html><body>" . mb_convert_encoding( $post_content, 'HTML-ENTITIES', 'UTF-8' ) . "</body></html>" ), LIBXML_HTML_NODEFDTD );
// do some dom stuff
$new_content = preg_replace( [ '/^<html><body>/', '/<\/body><\/html>$/' ], '', $dom->saveHTML() );
// if no changes made this should pass
assert($new_content === $post_content);

mb_convert_encoding was to prevent the mojibake we were seeing in some attributes. Manually adding and removing the html/body wrap works around how LIBXML_HTML_NOIMPLIED causes a whitespace rewrite which breaks block validation in some cases.

I think you'd still need to close off and then unclose that div in one of the cases but that should be able to work the same way as the html/body wrap.

There may be more to this, I'm not completely sure about the contexts of use here but it's certainly a similar case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Carousel A fullscreen modal appearing when clicking on an image in a gallery or tiled gallery. [Pri] High [Status] In Progress [Type] Bug When a feature is broken and / or not performing as intended
Projects
None yet
3 participants