Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ContentExtractor: remove logic for wp img lazy loading #251

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Kdecherf
Copy link
Collaborator

I triggered a bug on a page with a wordpress image lazy loading system
based on noscript tags. In the case of this page, two noscript tags were
nested for the same image. The page fetch returned two different results
depending on the libxml version installed on the Debian and Alpine
systems I used for the test.

As far as we already have a logic to move data-lazy-* values in src and
srcset, I consider that the whole logic around the noscript tag is not
needed anymore.

The page used to trigger the bug: https://petapixel.com/2021/01/20/a-face-mask-can-double-as-a-flash-diffuser-for-better-portraits/

A HTML snippet of this page:

<p>
   <img loading="lazy" src="https://petapixel.com/assets/uploads/2021/01/maskback-800x701.jpg" alt width="800" height="701" class="aligncenter size-large wp-image-505005 jetpack-lazy-image" data-lazy-srcset="https://petapixel.com/assets/uploads/2021/01/maskback-800x701.jpg 800w, https://petapixel.com/assets/uploads/2021/01/maskback-320x281.jpg 320w, https://petapixel.com/assets/uploads/2021/01/maskback.jpg 1080w" data-lazy-sizes="(max-width: 800px) 100vw, 800px" data-lazy-src="http://petapixel.com/assets/uploads/2021/01/maskback-800x701.jpg?is-pending-load=1" srcset="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7">
   <noscript>
      <img loading="lazy"  alt="" width="800" height="701"  data-srcset="https://petapixel.com/assets/uploads/2021/01/maskback-800x701.jpg 800w, https://petapixel.com/assets/uploads/2021/01/maskback-320x281.jpg 320w, https://petapixel.com/assets/uploads/2021/01/maskback.jpg 1080w"  data-src="http://petapixel.com/assets/uploads/2021/01/maskback-800x701.jpg" data-sizes="(max-width: 800px) 100vw, 800px" class="aligncenter size-large wp-image-505005 lazyload" src="data:image/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==" />
      <noscript>
         <img loading="lazy" src="http://petapixel.com/assets/uploads/2021/01/maskback-800x701.jpg" alt="" width="800" height="701" class="aligncenter size-large wp-image-505005" srcset="https://petapixel.com/assets/uploads/2021/01/maskback-800x701.jpg 800w, https://petapixel.com/assets/uploads/2021/01/maskback-320x281.jpg 320w, https://petapixel.com/assets/uploads/2021/01/maskback.jpg 1080w" sizes="(max-width: 800px) 100vw, 800px" />
      </noscript>
   </noscript>
</p>

This change was enough to fix the issue for this page.
However I'm not fully aware of all use cases for this lazy loading/noscript mechanism so maybe more discussion is needed to ensure that we don't break anything else.

I triggered a bug on a page with a wordpress image lazy loading system
based on noscript tags. In the case of this page, two noscript tags were
nested for the same image. The page fetch returned two different results
depending on the libxml version installed on the Debian and Alpine
systems I used for the test.

As far as we already have a logic to move data-lazy-* values in src and
srcset, I consider that the whole logic around the noscript tag is not
needed anymore.

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
@coveralls
Copy link

coveralls commented Jan 22, 2021

Coverage Status

Coverage decreased (-0.02%) to 96.479% when pulling a0c464d on Kdecherf:fix/lazy-wp-noscript into e115409 on j0k3r:master.

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
@jtojnar
Copy link
Collaborator

jtojnar commented Jan 22, 2021

See also #240

@jtojnar
Copy link
Collaborator

jtojnar commented Jan 22, 2021

One issue is that this preserves the noscript tag. Ideally we would remove it:

<p><a href="https://uxmovement.com/category/buttons/" title="Buttons" class="st-term-108 st-accent-bg">Buttons</a></p><div class="st-post-content st-entry st-clr" itemprop="text"><p>A user made the mistake of posting their private information publicly. They had accidentally set their account to public when they wanted it set to private. What inherently caused this serious error wasn’t the user. It was a bad toggle button design.</p><p>The user misperceived the toggle button that controls the public and private modes. The cue for the active state looked inactive to them. Their misperception was caused by the inverted color cues.</p><p>Inverted color cues give the active state a dark or light color, while the inactive one gets the inverse. The problem is that users can interpret either state as active due to their equal color contrast.</p><h2>Equal Color Contrast Ratios</h2><p>Color contrast indicates visual prominence, and prominence indicates an option selection. The active state should have a much higher color contrast than the inactive one. But when you use inverted color cues, it isn’t clear which button is more prominent.</p><p><img data-lazyloaded="1" src="https://uxmovement.com/wp-content/uploads/2021/01/icc-equalcontrastratio.png" class="aligncenter size-full wp-image-32206" alt="" width="510" height="222" /></p><noscript><img class="aligncenter size-full wp-image-32206" src="https://uxmovement.com/wp-content/uploads/2021/01/icc-equalcontrastratio.png" alt="" width="510" height="222" /></noscript><p>When you calculate the color contrast ratio of the toggle buttons, you get the same value. The same value means both buttons are competing for attention, resulting in a tie.</p><p>Some think the solution is to apply a color hue to the active selection. However, this would only cause users to confuse <a href="https://uxmovement.com/buttons/why-toggle-buttons-should-never-look-like-action-buttons/">toggle buttons with action buttons</a>. This confusion occurs because they’re sharing the same cues.<img data-lazyloaded="1" src="https://uxmovement.com/wp-content/uploads/2021/01/icc-cueconfusion.png" class="aligncenter size-full wp-image-32197" alt="" width="510" height="222" /></p><noscript><img class="aligncenter size-full wp-image-32197" src="https://uxmovement.com/wp-content/uploads/2021/01/icc-cueconfusion.png" alt="" width="510" height="222" /></noscript><p>They’re both buttons, but they have different affordances. A toggle button represents selected options and states. An action button performs a process or command on the system. Different affordances need different cues.</p><p>Instead of using inverted colors or color hues, the solution is to use visual depth. Depth cues use spatial contrast to help users distinguish the active state.</p><p><img data-lazyloaded="1" src="https://uxmovement.com/wp-content/uploads/2021/01/icc-cuetouse.png" class="aligncenter size-full wp-image-32198" alt="" width="600" height="150" /></p><noscript><img class="aligncenter size-full wp-image-32198" src="https://uxmovement.com/wp-content/uploads/2021/01/icc-cuetouse.png" alt="" width="600" height="150" /></noscript><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5526975/" target="_blank" rel="noopener">Research shows</a> that elements closer in space have a significant impact on visual working memory. Users place more attention on closer elements than ones further away. Therefore, active toggles should appear closer than inactive ones. To achieve this visual effect, you have to apply surface shading and occlusion to your buttons.</p><h2>Access Full Article</h2><p>The full article goes into detail on how to apply surface shading and occlusion on different background colors. You’ll see more examples and learn design techniques to enhance your depth cues. Subscribe to access the full article.</p><p class="c1"><a class="st-theme-button" href="https://uxmovement.substack.com/subscribe?coupon=8f63b414">Access Full Article</a></p></div><section class="st-related-posts-wrap st-clr"></section>

@Kdecherf
Copy link
Collaborator Author

@jtojnar I don't see it as an issue as in this way it preserves the actual page content.

I'm not sure we should strip this kind of content, any though @j0k3r?

@j0k3r
Copy link
Owner

j0k3r commented Jan 25, 2021

If we didn't remove that noscript tag, we might have duplicated image in the content. Am I right?

@Kdecherf
Copy link
Collaborator Author

If we didn't remove that noscript tag, we might have duplicated image in the content. Am I right?

It depends. Browsers should respect it, thus not showing duplicates.
However we do have duplicates inside the DOM (even if they are not actively shown).

@j0k3r
Copy link
Owner

j0k3r commented Jan 28, 2021

What do you think @jtojnar ? I'm ok to keep the noscript

@jtojnar
Copy link
Collaborator

jtojnar commented Jan 28, 2021

I think it should be removed since otherwise the image might be duplicated (think webview in an app with javascript disabled for security). Could we just add a htmlawed rule removing all noscript elements at the end of processing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants