Delete search index entries under more specific conditions #7266
Shouldn’t we also check for |
Hmm, do you mean in case a controller only sends this header during a |
Co-authored-by: Martin Auswöger <martin@auswoeger.com>
LGTM
Co-authored-by: Leo Feyer <1192057+leofeyer@users.noreply.github.com>
Thank you @fritzmg.
In a customer's Contao instance where indexing of protected pages is enabled, I noticed the following issue: if the URL of a protected and already indexed page is requested without a valid login for that page, the URL gets deleted from the search index.
To fix this, I first thought of ignoring the `401` and `403` status codes in our `SearchIndexListener` for the delete operation. However, I then realized that there are other status codes where the same holds true: a `503` status code is only temporary, and any URL responding momentarily with that status code should not be removed from the index (neither would Google; it would only remove the URL from the index if that status code persisted over a longer period of time).
Then I realized the same holds true for the `500` status code. Any error happening under a specific URL might only be temporary, and thus the URL should not be removed from the index (again, Google would only remove it if the status code persisted over a longer period of time).
Thus I decided to completely revamp the conditions under which a URL should be deleted from the index:
- `404` or `410`, always delete from the index.
- `X-Robots-Tag` contains `noindex`, always delete from the index.
- `noindex`, always delete from the index.
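As a rough illustration of the first two conditions, here is a minimal, language-neutral sketch in Python. It is not the actual `SearchIndexListener` code: the function name `should_delete_from_index` and its parameters are hypothetical, and treating every other response (such as `401`, `403`, `500`, or `503`) as "keep" is an assumption drawn from the reasoning above.

```python
from typing import Mapping

# Status codes that signal the page is permanently gone.
GONE_STATUS_CODES = {404, 410}


def should_delete_from_index(status_code: int, headers: Mapping[str, str]) -> bool:
    """Hypothetical helper: decide whether a URL should leave the search index."""
    # A 404 or 410 response always removes the URL.
    if status_code in GONE_STATUS_CODES:
        return True

    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    robots = normalised.get("x-robots-tag", "")

    # An X-Robots-Tag header containing "noindex" also removes the URL.
    # Anything else (e.g. 401, 403, 500, 503) is assumed to be kept.
    return "noindex" in robots.lower()
```

With this sketch, a temporary `503` or an unauthenticated `401` leaves the existing index entry untouched, while a `404` or an explicit `noindex` header removes it.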