Google search results for mithril.js.org show html vomit (see picture) #2114

Closed
kylebakerio opened this issue Apr 3, 2018 · 15 comments
Labels
Area: Documentation For anything dealing mainly with the documentation itself Type: Bug For bugs and any other unexpected breakage

@kylebakerio

[Image: google-result-mithril]

(Image should be entirely self-explanatory, title then some.)

@pygy
Member

pygy commented Apr 3, 2018

Thanks, it looks like we don't serve valid HTML.

https://validator.w3.org/check?uri=https%3A%2F%2Fmithril.js.org&charset=%28detect+automatically%29&doctype=Inline&group=0

Adding a doctype would be a good start...

Indeed, there are far fewer errors when parsing in HTML5 mode: https://validator.w3.org/nu/?doc=https%3A%2F%2Fmithril.js.org%2F

@pygy pygy added Type: Bug For bugs and any other unexpected breakage Area: Documentation For anything dealing mainly with the documentation itself labels Apr 3, 2018
@codeclown
Contributor

That's just the code example from the page, Google happened to pick it up as the page description. The way to affect that would be to add a <meta name="description" content="..."> tag on the page.
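
A minimal sketch of that suggestion, assuming the docs are produced by a Node script that fills an HTML layout string. The helper names `escapeAttr` and `addMetaDescription` and the sample layout are hypothetical and not taken from the Mithril build.

```javascript
// Hypothetical helper: escape text so it is safe inside a double-quoted attribute.
function escapeAttr(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: insert a meta description right after the opening <head> tag.
function addMetaDescription(html, description) {
  const tag = `<meta name="description" content="${escapeAttr(description)}">`;
  // No "g" flag, so only the first <head ...> opening tag is touched.
  return html.replace(/<head(\s[^>]*)?>/i, (open) => `${open}\n${tag}`);
}

// Sample layout, made up for illustration.
const layout = "<html><head><title>Mithril</title></head><body></body></html>";
console.log(addMetaDescription(layout, 'A "modern" client-side framework'));
```

Google may still choose its own snippet, so this only influences the result rather than guaranteeing it.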

@pygy
Member

pygy commented Apr 10, 2018

Good catch @codeclown!

@kylebakerio
Author

So, where is the repo for that file so we can add it? This bug is still present, just checked.

@pygy
Member

pygy commented Apr 26, 2018

https://github.com/MithrilJS/mithril.js/blob/next/docs/layout.html

The trouble is that we may then get the same description for all pages (Google doesn't always honour meta tags). IIRC the home page is the only one that shows up in search results, so it may not matter much, but we'd need some systematic testing to be sure.

@orbitbot
Member

Currently, when searching for e.g. site:mithril.js.org keys, you get the start of the actual page body content. I don't know whether the meta description would affect this. If not, then just making it reasonably generic would probably work.

@kylebakerio
Author

kylebakerio commented Apr 26, 2018

https://support.google.com/webmasters/answer/35624?hl=en

Make sure that every page on your site has a meta description.

Differentiate the descriptions for different pages.

Programmatically generate descriptions.

You can, alternatively, prevent snippets from being created and shown for your site in Search results. Use the tag to prevent Google from displaying a snippet for your page in Search results.

^ Several good tips from Google here. We could programmatically generate the meta tags from the opening content of each documentation page. Alternatively, we could turn off snippets for a specific page as an easier solution.

I'm open to doing this if it's welcome.
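
The idea of reusing a page's opening content as its description could look roughly like this. It is a sketch only: `openingParagraph` is a hypothetical name, the 155-character default is an assumed limit rather than anything Google specifies, and the rule for what counts as prose is deliberately naive.

```javascript
// Hypothetical helper: use the first plain paragraph of a Markdown doc as its description.
function openingParagraph(markdown, maxLength = 155) {
  // Paragraph-level blocks are separated by blank lines.
  const blocks = markdown.split(/\r?\n\s*\r?\n/);
  const isProse = (block) => {
    const text = block.trim();
    // Skip empty blocks, headers, horizontal rules, list items, and raw HTML.
    return text !== "" && !/^(#|---|[-*]\s|<)/.test(text);
  };
  const first = blocks.find(isProse);
  if (first === undefined) return null;

  const text = first.replace(/\s+/g, " ").trim();
  if (text.length <= maxLength) return text;

  // Cut at the last space before the limit so that no word is split.
  const cut = text.slice(0, maxLength);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + "...";
}

// Sample document, made up for illustration.
const sample = "# Title\n\n- [Nav](a.md)\n\n---\n\nMithril is a framework.\nIt is small.\n";
console.log(openingParagraph(sample));
```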

@tivac
Contributor

tivac commented Apr 27, 2018

Go for it, @finetype!

@kylebakerio
Author

kylebakerio commented May 6, 2018

Finally sat down to work on this today.

It would be "easy" to add snippets programmatically, except for one problem: how to grab the right piece of text? The doc files aren't sufficiently standardized, unfortunately. Some ideal descriptions are under the first ---, some are under the first ##, most are under ###, and some are directly underneath the navigation table, above the ---. Any attempt to programmatically extract a snippet from these docs will result in something pretty brittle and unwieldy.

I propose one of three solutions:

  1. Add a "meta description" section to the bottom of all the files. (Proposal: I can just copy some meaningful chunk of text by hand into such a section and put it under a #### Meta Description header on each doc).
  2. We could only add a custom snippet (or remove snippets) for that main index.md file that is causing the problem specifically mentioned in this bug report--that'd probably be the 80/20 solution.
  3. Remove snippets from the docs altogether with <meta name="robots" content="nosnippet">. (This would fix the problem, would be nearly effortless, and would prevent this problem from happening on other docs as well, but then we don't get snippets at all, which are nice when they work.)

I'd prefer to go forward with (1), but would like some feedback before going that route.
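
A rough sketch of how options (1) and (3) could be combined, assuming each doc file may carry a `#### Meta Description` section as proposed above. The function names `extractMetaDescription` and `metaTagFor` are hypothetical. The fallback uses the robots `nosnippet` directive, which is the form Google documents for suppressing snippets.

```javascript
// Hypothetical helper for option (1): read the text under a
// "#### Meta Description" header, if the doc file has one.
function extractMetaDescription(markdown) {
  const lines = markdown.split(/\r?\n/);
  const start = lines.findIndex((line) => /^####\s+Meta Description\s*$/i.test(line));
  if (start === -1) return null; // the caller decides on a fallback

  const body = [];
  for (const line of lines.slice(start + 1)) {
    if (/^#{1,6}\s/.test(line)) break; // the next header ends the section
    body.push(line.trim());
  }
  const text = body.join(" ").replace(/\s+/g, " ").trim();
  return text === "" ? null : text;
}

// Builds the head tag: a description when one exists, otherwise the
// robots "nosnippet" directive from option (3).
function metaTagFor(markdown) {
  const description = extractMetaDescription(markdown);
  if (description === null) return '<meta name="robots" content="nosnippet">';
  const escaped = description.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return `<meta name="description" content="${escaped}">`;
}

// Sample document, made up for illustration.
const doc = "# Keys\n\nBody text.\n\n#### Meta Description\n\nKeys map DOM elements\nto data items.\n";
console.log(metaTagFor(doc));
console.log(metaTagFor("# No description here\n"));
```

With this shape, pages get hand-written descriptions as they are added, and pages without one simply show no snippet.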

@leothorp
Contributor

leothorp commented May 6, 2018 via email

@tivac
Contributor

tivac commented May 6, 2018

I'm for 3, being that it's the most obvious win.

@kylebakerio
Author

Heh. I'll just wait for more responses, I guess? lol. Thanks for that tiebreaker, @pygy. ;P

@kylebakerio
Author

kylebakerio commented May 6, 2018

I re-ran the HTML validator--a lot of its error messages were because it was evaluating the page as if it were HTML 4.01, but if you switch it to HTML5 mode, we get much nicer output: https://validator.w3.org/nu/?showsource=yes&doc=https%3A%2F%2Fmithril.js.org%2F

I have added a doctype and a lang="en" to the html tag, as well as alt text to the logo, but the extra "p" closing tags are interesting... My best guess is that those are being inserted erroneously by the marked library, but there is some odd custom handling around code blocks that may be causing it. If you look at the source we generate, the incorrect closing </p> tags are all after code blocks on that page.

@kylebakerio
Author

kylebakerio commented May 6, 2018

Huh... maybe that's a bug in the validator (which is acknowledged as experimental). I see two <p> tags, one inside another. While that's odd, it seems correct to have two closing tags...

Looking at it now, I think the problem is in marked, though. Those inner p tags are written directly as p tags within the markdown, e.g. index.md, and the outer layer of p tags are probably added by marked around that.

Doesn't seem to cause any real issues, though, fwiw.
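
For what it's worth, the validator's complaint is consistent with how HTML parsing works: a <p> start tag implicitly closes any paragraph that is still open, so the second </p> has nothing left to close. A rough sketch of a checker for that pattern follows. The name `findStrayParagraphClosers` is made up here, and the regex scan only approximates a real HTML parser (it ignores comments, code block contents, and the other tags that also close a paragraph).

```javascript
// Hypothetical checker: returns the string offsets of </p> tags that have no open <p> to close.
function findStrayParagraphClosers(html) {
  // Matches <p ...> and </p>, but not <pre>, <param>, and similar tags.
  const tagPattern = /<(\/?)p(?=[\s>])[^>]*>/gi;
  const stray = [];
  let open = false;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const isClosing = match[1] === "/";
    if (!isClosing) {
      // A new <p> implicitly closes an already-open one, so at most one is ever open.
      open = true;
    } else if (open) {
      open = false;
    } else {
      stray.push(match.index);
    }
  }
  return stray;
}

// Nested paragraphs as described above: the second </p> is reported.
const nested = "<p><p>Hello</p></p>";
console.log(findStrayParagraphClosers(nested));
```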

@dead-claudia dead-claudia added this to Needs triage in Triage/bugs via automation Oct 28, 2018
@dead-claudia dead-claudia moved this from Needs triage to High priority in Triage/bugs Oct 28, 2018
@StephanHoyer
Member

Seems to be no issue anymore.

Triage/bugs automation moved this from High priority to Closed Feb 21, 2022