Add "nofollow" rel attribute to links in facets/filters in PUI #1797
This change asks search engine crawlers not to follow the links in facets. Such filtered search pages are not information sources themselves, and do not have distinct titles, so I don't think it is desirable for them to appear in search results.
Description
Soon after we launched, we noticed Amazonbot was the first crawler to hit our public user interface. It honours robots.txt instructions but does not support sitemaps, so it just spiders through all the links in the HTML. Unfortunately, that means it spent most of its time trying every combination of facets in the collections list, until it reached some kind of preset limit and shut down. Instead of throwing wave after wave of search results pages at it, this change allows it to crawl to every collection (and subject, agent, etc) by following the pagination links, but not waste time following the links in the filters.
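As a rough illustration of the intended effect (not the PUI's actual code, which is not shown here), the sketch below models a crawler that honours `rel="nofollow"`: it keeps pagination links and skips facet links. The markup, URLs, and the names `LinkCollector` and `classify_links` are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Sort anchor hrefs into those a nofollow-honouring crawler
    would follow and those it would skip."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.skipped = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel is a space-separated token list; compare case-insensitively.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.skipped.append(href)
        else:
            self.followed.append(href)


# Hypothetical markup: one pagination link and one facet link.
SAMPLE_HTML = """
<a href="/repositories/resources?page=2">Next</a>
<a rel="nofollow" href="/repositories/resources?filter=subjects">Maps</a>
"""


def classify_links(html):
    """Return (followed, skipped) href lists for the given HTML."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.followed, parser.skipped


if __name__ == "__main__":
    followed, skipped = classify_links(SAMPLE_HTML)
    print("follow:", followed)
    print("skip:", skipped)
```

A crawler behaving this way still reaches every record through the pagination links, while the combinatorial facet URLs are never queued.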
Googlebot does not support the rel attribute for internal links, only for links out to other web sites, but it seems to be smart enough not to follow facet links anyway (or maybe that is because we submit sitemaps via the Google Search Console). Bingbot does support it. So this is not a definitive solution, but stopping even a few crawlers will reduce some load.
Note that if pull requests #1778 and #1792 are approved, they will increase the number of potential permutations of facets that can be applied.
Related JIRA Ticket or GitHub Issue
N/A
How Has This Been Tested?
This change has been on our production system as part of a local plug-in for six months, with no issues.
Screenshots (if appropriate):
N/A
Types of changes
Checklist: