Guard beacon script against saving not expected values into the database #6599
Comments
AC 1 & 3 need to be refined, I believe; they are quite open to interpretation here. We really need examples to reproduce; otherwise we won't be able to validate the development or assess exactly what we should guard against. Typically, it might not be straightforward to distinguish relative URLs from the rest of the possible values in a |
@DahmaniAdame Would you be able to provide steps to reproduce the problem with Chrome extension? |
Use an extension that adds an overlay with an image on it, like HarpaAI for example, and run the script. There should be entries related to the icon used on the overlay. There should be an entry similar to this:
|
Examples of what can be excluded:
As for data URIs, they are accepted as LCP elements. |
I guess there are 2 ways of handling those exceptions: preventing them from being identified as candidates in the first place, or filtering out some given patterns before storing in the DB. |
@DahmaniAdame @benorfaz what do you think? I'd agree with Mathieu. If so, @DahmaniAdame could you add exact steps for this one? We'd be adding each case separately then. |
Option 1 sounds right. First thing, let's drop the ones with I couldn't find the plugin that served that, but you can use the below for testing: More for external services using node to serve adaptive images. I don't have anything on hand for now. We can dismiss it until we have a case.
Install the HarpaAI extension on Chrome and test - https://chromewebstore.google.com/detail/harpa-ai-automation-agent/eanggfilgoajaocelnaflolkadkeghjp?authuser=2
I don't have any example on this one. @piotrbak where did you experience that? |
@DahmaniAdame I experienced it once on the testing website; it was no longer there after cleaning the data. I suspect it'll be reproducible with a big div element that's set with a background like this. |
@piotrbak can you provide a sample of the code for engineering/QA to take into consideration? |
Scope a solution ✅ src/wp-rocket/assets/js/lcp-beacon.js In
Estimate the effort ✅ [S] |
We can also do the same thing on the backend in case anything slips through the frontend |
As a final decision, we need to exclude the following URLs:
Correct? |
I was thinking of doing that from the PHP side, exactly here:
not from the JS side @Khadreal |
Okay. I'm thinking we could do that on both sides. I'll add grooming for the backend |
@DahmaniAdame FYA |
@wp-media/engineering-plugin-team I created PR #6605 to validate the idea, and it seems to be working and covering many cases. Can you please take a look? Once the idea is validated, I'll fix the tests and may add some more test cases. |
Thanks @wordpressfan ! The safeguard/bail-out with elementInfo and empty( $image->type ) look OK to me.
@DahmaniAdame & @piotrbak, can you confirm that all elements that we should capture for LCP/ATF should always be URLs to a file of image type? Meaning, the URL we store should always be directly and explicitly pointing to a .jpg, .png or something like this? @wordpressfan how does it behave for this image for instance? http://1.gravatar.com/avatar/d7a973c7dab26985da5f961be7b74480?s=120&d=mm&r=g |
Yes, but that only applies to the collected |
@DahmaniAdame so this URL should not make it to the DB? |
@MathieuLamiot this one should be accepted. It's a valid image. The reason why I added .php and .js specifically as part of the exclusions is because they will process images on the fly without caching them. If we preload them, the time needed to process will have negative effects on INP as well. Anything else is fair game. |
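To illustrate the rule described in the comment above (exclude URLs whose path ends in .php or .js, accept anything else such as the Gravatar URL), here is a minimal sketch. The function name `isExcludedByExtension`, the `EXCLUDED_EXTENSIONS` list and the fallback base URL are hypothetical and are not taken from lcp-beacon.js or from the PR; the actual implementation may differ.

```javascript
// Hypothetical list and helper, for illustration only (not WP Rocket code).
const EXCLUDED_EXTENSIONS = ['.php', '.js'];

// Returns true when the URL path ends with one of the excluded extensions.
// The query string is ignored, so "?s=120&d=mm" does not affect the result.
function isExcludedByExtension(src, base = 'https://example.com/') {
  let pathname;
  try {
    // The base is only an assumption so that "//host/path" values can be parsed.
    pathname = new URL(src, base).pathname.toLowerCase();
  } catch (e) {
    return true; // values that cannot be parsed are treated as excluded
  }
  return EXCLUDED_EXTENSIONS.some((ext) => pathname.endsWith(ext));
}
```

With this sketch, the Gravatar URL quoted earlier in the thread is not excluded, because its path has no .php or .js suffix, while a URL such as https://example.com/resize.php?src=a.jpg (a made-up example) is.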
@MathieuLamiot checking the mime is still applicable if we have a filter to allow URLs that are still valid but don't have an image mime type, like I would consider them more of an edge case and they will usually come from a specific provider. We can maintain a list of auto exclusions if needed. That way, we are sure that it's a valid image. |
Done, now we have a filter to validate the image src |
Possibly still occurring in some cases: #6814 (comment) |
Describe the bug
In specific conditions it's possible that our beacon will add unexpected data to the db:
Expected behavior
We should guard the src against incorrect values. Accepted values are: http, https, possibly with //
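As a rough illustration of the expected behavior above, the check could be sketched as follows. The helper name `hasAcceptedScheme` is hypothetical and does not come from lcp-beacon.js. The sketch applies the stated rule literally (http, https, or a leading //), so extension URLs, relative paths and data URIs are all rejected; whether data URIs should pass is a separate question raised in the comments.

```javascript
// Hypothetical helper, for illustration only (not WP Rocket code).
// Accepts values starting with "http://", "https://" or "//".
function hasAcceptedScheme(src) {
  if (typeof src !== 'string') {
    return false; // null, undefined, objects, etc. are never accepted
  }
  return /^(https?:)?\/\//i.test(src.trim());
}

// Example: keep only the candidates that pass the guard before saving.
const candidates = [
  'https://example.com/hero.jpg',
  '//cdn.example.com/hero.jpg',
  'chrome-extension://some-extension-id/icon.png',
  '/wp-content/uploads/hero.jpg',
];
const accepted = candidates.filter(hasAcceptedScheme);
```

Here `accepted` keeps only the first two entries. All URLs in the example are made up for the purpose of the sketch.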
Acceptance Criteria (for WP Media team use only)
http, https, // can be added to the database correctly