next.angular.io is being indexed by google in spite of robots.txt disallowing it #21313

Closed
IgorMinar opened this issue Jan 4, 2018 · 7 comments

Comments

@IgorMinar
Contributor

IgorMinar commented Jan 4, 2018

example query: https://www.google.com/search?q=language+service+site%3Aangular.io

current result:

[Screenshot of the search result, taken 2018-01-04 at 10:21:30 am]

note: this issue was discovered as part of the investigation into #21272

@IgorMinar
Contributor Author

it seems that only the root url next.angular.io is being indexed. nothing else is: https://www.google.com/search?q=angular+site%3Anext.angular.io

Maybe this is the expected behavior?

@IgorMinar
Contributor Author

I think I know what's going on. This note found in the noindex meta tag docs explains it:

Important! For the noindex meta tag to be effective, the page must not be blocked by a robots.txt file. If the page is blocked by a robots.txt file, the crawler will never see the noindex tag, and the page can still appear in search results, for example if other pages link to it.

We do block next.angular.io via robots.txt, so the robot knows that the URL exists but isn't allowed to crawl it, and therefore never sees the page content or its noindex tag. That's why there is no search snippet under the result ("No information is available for this page").
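
To illustrate the interaction described in that note, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the page HTML, and the helper name `crawler_sees_noindex` are assumptions made up for illustration; they are not the actual next.angular.io files, and real crawlers are more elaborate than this.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt that blocks every crawler from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Assumed page that carries a noindex meta tag.
PAGE_HTML = (
    '<html><head><meta name="robots" content="noindex"></head>'
    "<body>docs</body></html>"
)


def crawler_sees_noindex(robots_txt: str, url: str, page_html: str,
                         user_agent: str = "Googlebot") -> bool:
    """Return True only if a well-behaved crawler would fetch the page
    and therefore get a chance to read its noindex meta tag."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, url):
        # Blocked by robots.txt: the page is never downloaded, so the
        # noindex tag is never seen. The URL itself can still be listed.
        return False
    # Simplified check; a real crawler parses the HTML properly.
    return 'content="noindex"' in page_html


blocked = crawler_sees_noindex(ROBOTS_TXT, "https://next.angular.io/", PAGE_HTML)
unblocked = crawler_sees_noindex("", "https://next.angular.io/", PAGE_HTML)
print(blocked, unblocked)
```

With the blocking robots.txt the tag is never read; with an empty robots.txt the crawler fetches the page and can honor the tag.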

A better solution would be to remove the robots.txt restriction and instead use a noindex meta tag on all pages. This is fully compatible with the second solution proposed to resolve the issue of 404 pages being indexed.
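
A rough sketch of what "noindex on all pages" could look like, assuming the tag is injected into the served HTML based on the host name. The function `add_noindex`, its `noindex_hosts` parameter, and the injection approach are hypothetical; this is not how the angular.io build or deployment actually works.

```python
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def add_noindex(html: str, host: str,
                noindex_hosts: tuple = ("next.angular.io",)) -> str:
    """Insert a noindex meta tag right after <head> for hosts that
    should stay out of search results (hypothetical helper)."""
    if host not in noindex_hosts or NOINDEX_TAG in html:
        # Production host, or tag already present: leave the page alone.
        return html
    head_open = "<head>"
    idx = html.find(head_open)
    if idx == -1:
        # No plain <head> tag found; this simple sketch gives up.
        return html
    insert_at = idx + len(head_open)
    return html[:insert_at] + NOINDEX_TAG + html[insert_at:]


page = "<html><head><title>Docs</title></head><body></body></html>"
print(add_noindex(page, "next.angular.io"))
print(add_noindex(page, "angular.io"))
```

For this to have any effect, robots.txt must allow crawling of those pages, since a crawler has to fetch a page to read the tag.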

@gkalpak
Member

gkalpak commented Jan 12, 2018

You mean remove the noindex metatag on angular.io, but leave it on next.angular.io?

@IgorMinar
Contributor Author

@gkalpak yes

@ngbot ngbot bot added this to the Backlog milestone Jan 23, 2018
@ngbot ngbot bot modified the milestones: Backlog, needsTriage Feb 26, 2018
@petebacondarwin
Member

This does not yet appear to be resolved, although the search results for next.angular.io now show the Angular Elements page instead of the Language Service page.

@IgorMinar
Contributor Author

resolved - no longer a problem! closing!

@angular-automatic-lock-bot

This issue has been automatically locked due to inactivity.
Please file a new issue if you are encountering a similar or related problem.


This action has been performed automatically by a bot.

@angular-automatic-lock-bot angular-automatic-lock-bot bot locked and limited conversation to collaborators Jun 1, 2020