New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

It looks like angular.io may no longer be indexing #21272

Closed
StephenFluin opened this Issue Jan 3, 2018 · 14 comments

Comments

Projects
None yet
4 participants
@StephenFluin
Member

StephenFluin commented Jan 3, 2018

I'm submitting a...


[X] Regression (a behavior that used to work and stopped working in a new release)
[ ] Bug report  
[ ] Feature request
[ ] Documentation issue or request
[ ] Support request => Please do not submit support request here, instead see https://github.com/angular/angular/blob/master/CONTRIBUTING.md#question

Current behavior

Angular.io pages are not included or previewed in the google search index.

https://www.google.com/search?q=Angular+Language+Service+site%3Aangular.io&oq=Angular+Language+Service+site%3Aangular.io

Expected behavior

Angular.io pages are included and previewed in the google search index.

Minimal reproduction of the problem with instructions

Visit https://www.google.com/search?q=Angular+Language+Service+site%3Aangular.io

I've also noted this on several other sites, so it may be something on the Angular side of things.

https://www.google.com/search?q=site%3Afluin.io

Environment


Angular version: 5.x.x
@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 3, 2018

I checked webmaster tools and here is what I see:

On the surface there doesn't seem to be any problem - the number of documents hosted on angular.io that are in the index is about the same as before.

When I however try to fetch the language service doc as the google bot, the rendering fails with error:
[DocViewer]: Error preparing document 'guide/language-service'. and empty document viewer:

screen shot 2018-01-03 at 12 27 26 am

There is no more info about what went wrong.

It's as if the request to fetch the json doc failed, but in that case, we usually render a "network error" page, but that is not happening here. I tried several other urls and most of them suffer from this issue (which would explain the low quality search results).

Example urls that don't render in Google bot:

Example urls that do render in Google bot:

The only thing that pages that do vs do not work have in common is the number of path segments in the url. Single path segment seems to work, multiple don't work. I don't know if this is just a coincidence or not.

Just to be clear: all of these urls were indexable in the past, so something in aio or in google bot must have changed to cause this regression.

@StephenFluin's repro also surfaced two related issues that we should also address:

  • next.angular.io is being indexed, when it shouldn't be (similarly v2.angular.io and v4.angular.io should not be indexed)
  • the 404 pages for some old urls for the previous angular.io site are being indexed as regular documents (we should setup redirects for these + ensure that bogus urls that render as 404 do not end up in the index)
@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 3, 2018

I was thinking that maybe the change that introduced lazy-loaded components could be responsible for this, but /api page indexes just fine and it does use a lazy-loaded component as far as I know.

@arjunyel

This comment has been minimized.

Contributor

arjunyel commented Jan 3, 2018

Would potentially using Rendertron be a good fix for the Googlebot? https://angularfirebase.com/lessons/seo-angular-part-1-rendertron-meta-tags/

@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 3, 2018

@arjunyel angular.io is our proving ground for ensuring that Angular apps work well out of the box, so we want to avoid setting up custom infrastructure to make SEO work. But thanks for the suggestion.

@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 3, 2018

Once we fix this we should change our error logging setup to send errors to some external storage when google bot is crawling the site. General instructions can be found here: https://developers.google.com/search/docs/guides/debug-rendering

gkalpak added a commit to gkalpak/angular that referenced this issue Jan 3, 2018

refactor(aio): use one argument for `DocViewer` error reporting
Pass one argument to `logger.error()` to improve error reporting in
environments that do not handle more than one arguments well (e.g.
Googlebot's web rendering service).

Related to angular#21272.

kara added a commit that referenced this issue Jan 4, 2018

refactor(aio): use one argument for `DocViewer` error reporting (#21293)
Pass one argument to `logger.error()` to improve error reporting in
environments that do not handle more than one arguments well (e.g.
Googlebot's web rendering service).

Related to #21272.

PR Close #21293
@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 4, 2018

ok, @gkalpak's PR is now in and we have more info about the problem. Here is what I got:

[DocViewer] Error preparing document 'guide/quickstart': 
TypeError: Cannot read property 'toLowerCase' of undefined at 
https://angular.io/0.f2591f718907b6ae5a8f.chunk.js:1:196 at Array.forEach (native) at l 
(https://angular.io/0.f2591f718907b6ae5a8f.chunk.js:1:158) at r 
(https://angular.io/0.f2591f718907b6ae5a8f.chunk.js:1:484) at new t
 (https://angular.io/0.f2591f718907b6ae5a8f.chunk.js:1:31662) at Ke 
(https://angular.io/main.bec31ad514fe5bf71717.bundle.js:1:83560) at qe
 (https://angular.io/main.bec31ad514fe5bf71717.bundle.js:1:82013) at In 
(https://angular.io/main.bec31ad514fe5bf71717.bundle.js:1:96674) at xn 
(https://angular.io/main.bec31ad514fe5bf71717.bundle.js:1:95462) at Object.Gn [as createRootView] 
(https://angular.io/main.bec31ad514fe5bf71717.bundle.js:1:102128)
@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 4, 2018

This stacktrace maps back to:

/**
* Get attribute map from element or ElementRef `attributes`.
* Attribute map keys are forced lowercase for case-insensitive lookup.
* @param el The source of the attributes
*/
export function getAttrs(el: HTMLElement | ElementRef): StringMap {
const attrs: NamedNodeMap = el instanceof ElementRef ? el.nativeElement.attributes : el.attributes;
const attrMap = {};
Object.keys(attrs).forEach(key =>
attrMap[(attrs[key] as Attr).name.toLowerCase()] = (attrs[key] as Attr).value);
return attrMap;
}

Specifically line:

attrMap[(attrs[key] as Attr).name.toLowerCase()] = (attrs[key] as Attr).value);

@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 4, 2018

And this path gets invoked when we try to instantiate an embedded component with attributes like:

<code-example path="security/src/app/inner-html-binding.component.html" title="src/app/inner-html-binding.component.html">
</code-example>

Which explains why certain pages work and others don't. Only those that don't have embedded components with attributes work and this are typically the top level pages like most marketing pages.

@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 4, 2018

I debugged this more and found the root cause of the problem. When google bot evaluates:

 Object.keys(attrs)

for

<code-example path="security/src/app/inner-html-binding.component.html"></code-example>

We get an array of ['0', 'length'] while in evergreen browsers we get just ['0'].

When we then try to find the attribute with name length we get undefined and the rest of the code then fails.

IgorMinar added a commit to IgorMinar/angular that referenced this issue Jan 4, 2018

fix(aio): don't use Object.keys on NamedNodeMap to prevent SEO errors
Apparently Object.keys on NamedNodeMap work differently with googlebot :-(

Fixes angular#21272

IgorMinar added a commit to IgorMinar/angular that referenced this issue Jan 4, 2018

fix(aio): don't use Object.keys on NamedNodeMap to prevent SEO errors
Apparently Object.keys on NamedNodeMap work differently with googlebot :-(

There are not tests since we don't have a way to write tests for googlebot.

Fixes angular#21272

IgorMinar added a commit to IgorMinar/angular that referenced this issue Jan 4, 2018

fix(aio): don't use Object.keys on NamedNodeMap to prevent SEO errors
Apparently Object.keys on NamedNodeMap work differently with googlebot :-(

There are not tests since we don't have a way to write tests for googlebot,
but I did manually verify that after this fix googlebot correctly renders
several of the previously broken pages.

Fixes angular#21272
@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 4, 2018

I have a fix #21305

kara added a commit to kara/angular that referenced this issue Jan 4, 2018

refactor(aio): use one argument for `DocViewer` error reporting (angu…
…lar#21293)

Pass one argument to `logger.error()` to improve error reporting in
environments that do not handle more than one arguments well (e.g.
Googlebot's web rendering service).

Related to angular#21272.

PR Close angular#21293

@petebacondarwin petebacondarwin added this to MERGE in aio Jan 4, 2018

kara added a commit that referenced this issue Jan 4, 2018

fix(aio): don't use Object.keys on NamedNodeMap to prevent SEO errors (
…#21305)

Apparently Object.keys on NamedNodeMap work differently with googlebot :-(

There are not tests since we don't have a way to write tests for googlebot,
but I did manually verify that after this fix googlebot correctly renders
several of the previously broken pages.

Fixes #21272

PR Close #21305

@kara kara closed this in ef78538 Jan 4, 2018

@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 4, 2018

The fix is now in production:

screen shot 2018-01-04 at 1 18 43 pm

I requested the reindexing of the site via webmaster tools

@petebacondarwin petebacondarwin removed this from MERGE in aio Jan 5, 2018

@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 5, 2018

@IgorMinar

This comment has been minimized.

Member

IgorMinar commented Jan 6, 2018

I'm doing a bit more of post-mortem on this and what's odd is that the problematic code has been around since forever, 8 months to be precise. And I'm quite certain that it's been in the execution path since the same time:

https://github.com/angular/angular/blame/cdc66f616467e8623e04b49c2a029ba207a073bd/aio/src/app/shared/attribute-utils.ts#L14-L15

screen shot 2018-01-06 at 9 47 24 am

But the search traffic was affected only in mid December:

screen shot 2018-01-06 at 9 49 41 am

Now one could argue that the traffic hit was due to holiday season, but in that case the slump should start later and be less abrupt. So I analyzed the google analytics data and noticed that we saw similar looking curve for over number of sessions at angular.io:

screen shot 2018-01-06 at 9 52 47 am

If we look at the breakdown of traffic sources we'll see that the top 3 sources are organic search, referrals (links from other sites), and direct traffic (people clicking on bookmarks or typing angular.io into the browsers):

screen shot 2018-01-06 at 9 59 45 am

As you can see the search traffic is the most important source of traffic contributing over 70% of all traffic. With such big contribution the search traffic alone can easily skew the overall session count curve.

So let's see what the curve looks like for for sessions originating from "organic search" alone:

screen shot 2018-01-06 at 9 56 57 am

That's a similar looking curve!!

Now what about the curve for sessions acquired via referrals (links):

screen shot 2018-01-06 at 9 56 22 am

That looks very different! With just a little bit of seasonality around holidays.

Now what about the directly acquired sessions:

screen shot 2018-01-06 at 9 56 39 am

That also looks quite normal, with just a bit of seasonality around holidays.

Conclusion

Given that we haven't changed the problematic code in 8 months and that we experienced significant loss of traffic originating from search engines starting around December 11, 2017, I believe that something has changed in crawlers during this period of time which cased most of the site to be de-indexed, which then resulted in the traffic loss.

@MrCroft

This comment has been minimized.

MrCroft commented Mar 3, 2018

I just ran into the same problem.
While I didn't have any Object.keys(obj).forEach() in my test app, seeing your solution made me think about the *ngFor in my templates and went into polyfills.ts and uncomment the following line under /** IE9, IE10 and IE11 requires all of the following polyfills. **/:

import 'core-js/es6/array';

Then, everything showed up fine in the Search Console (under both "how Google sees it" and "how a visitor sees it"), except one listing.
The only difference in that listing was that I was using *ngIf and the index:

<div *ngIf="(favoriteVideos | async)?.items; let videos; else loading">
    <h1>Videos</h1>
    <div class="videos-listing">
        <mat-card *ngFor="let video of videos; index as i">
          <mat-card-content>
            <a [routerLink]="['/video', video.id, video.slug]">
              <img *ngIf="i < 6" [src]="video.image" [alt]="video.name" width="247" height="409">
              <img *ngIf="i >= 6" #lazyImage [attr.data-src]="video.image" [alt]="video.name" width="247" height="409">
            </a>
          </mat-card-content>

          <mat-card-footer>
            ...
          </mat-card-footer>
        </mat-card>
    </div>
</div>

<ng-template #loading>Loading Videos...</ng-template>

The content was showing up under "how Google bot sees it", but wasn't showing up under "how a visitor sees it"

Feeling crazy, I just tried replacing:

<img *ngIf="i < 6" [src]="video.image" [alt]="video.name" width="247" height="409">
<img *ngIf="i >= 6" #lazyImage [attr.data-src]="video.image" [alt]="video.name" width="247" height="409">

with:

<img [src]="video.image" [alt]="video.name" width="247" height="409">

And that did the trick. But I have no clue why? And also, now I have no lazy loading. I mean, even before, I should at least see the first 6 videos under both "How Google bot sees it" and under "How a visitor sees it".

I have to say, in my opinion, it's a bummer for SEO that Google cares so much about how your site looks with JavaScript enabled, even when it gets all the html from the server. Or is this not about SEO?... Maybe this part of the Search Console is only about the user experience.

EDIT: Putting back the lazy-load on the images, doing another build... now I get the expected result. No clue why it didn't work on the first few fetches, but now it does 😮

Angular: 5.2.7

leo6104 added a commit to leo6104/angular that referenced this issue Mar 25, 2018

refactor(aio): use one argument for `DocViewer` error reporting (angu…
…lar#21293)

Pass one argument to `logger.error()` to improve error reporting in
environments that do not handle more than one arguments well (e.g.
Googlebot's web rendering service).

Related to angular#21272.

PR Close angular#21293

leo6104 added a commit to leo6104/angular that referenced this issue Mar 25, 2018

fix(aio): don't use Object.keys on NamedNodeMap to prevent SEO errors (
…angular#21305)

Apparently Object.keys on NamedNodeMap work differently with googlebot :-(

There are not tests since we don't have a way to write tests for googlebot,
but I did manually verify that after this fix googlebot correctly renders
several of the previously broken pages.

Fixes angular#21272

PR Close angular#21305
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment