This repository has been archived by the owner on Feb 24, 2022. It is now read-only.

Add keywords #22 #43

Merged
merged 1 commit into from Aug 19, 2016


jaredlockhart
Collaborator

No description provided.

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 72a794a on JaredKerim-Mozilla:22 into 3a6a13b on mozilla:master.

@jaredlockhart jaredlockhart merged commit 813565e into mozilla:master Aug 19, 2016
@@ -37,6 +37,9 @@ const canonicalUrlRules = buildRuleset('url', [
['link[rel="canonical"]', node => node.element.href],
]);

const keywordsRules = buildRuleset('keywords', [
['meta[name="keywords"]', node => node.element.content],
]);
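The new rule returns the raw `content` string of the keywords tag. By convention that string is a comma-separated list, so a consumer that wants individual keywords would need to split it. A minimal sketch, assuming comma-separated values; `splitKeywords` is a hypothetical helper and is not part of this PR:

```javascript
// Hypothetical helper (not in the PR): turn the raw `content` string of
// <meta name="keywords"> into an array of trimmed, non-empty keywords.
// Assumes the values are comma-separated.
function splitKeywords(content) {
  if (typeof content !== 'string') {
    return []; // missing or non-string content: no keywords
  }
  return content
    .split(',')
    .map(keyword => keyword.trim())
    .filter(keyword => keyword.length > 0);
}

console.log(splitKeywords('gear, headlamps, , LED Lights'));
```

Empty entries (for example from a trailing comma) are dropped rather than returned as empty strings.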
Contributor

@pdehaan pdehaan Aug 19, 2016

UPDATE: Filed as #47


Not sure if we want to add support for article:tag, which I've seen a few times in "the wild":

Example: https://www.engadget.com/2016/08/19/the-best-headlamps/
Source:

<meta property="og:url" content="https://www.engadget.com/2016/08/19/the-best-headlamps/">
<meta property="og:title" content="The best headlamps">
<meta property="og:description" content="Go for the Black Diamond Spot.">

<meta property="og:image" content="https://s.aolcdn.com/dims5/amp:7a9ea64b5117cd0b2d3e3df595c52b67aa6a6709/t:1200,630/q:80/?url=https%3A%2F%2Fs.aolcdn.com%2Fhss%2Fstorage%2Fmidas%2F7697ed6dc5ea00ddff3537c34c17dde3%2F204221484%2F01-headlamps-2000.jpg">
<meta property="og:image:width" content="1200">
<meta property="og:image:height" content="630">

<meta property="og:type" content="article">
<meta property="article:tag" content="BlackDiamond">
<meta property="article:tag" content="BlackDiamondRevolt">
<meta property="article:tag" content="BlackDiamondSpot">
<meta property="article:tag" content="CoastFL75">
<meta property="article:tag" content="gadgetry">
<meta property="article:tag" content="gadgets">
<meta property="article:tag" content="gear">
<meta property="article:tag" content="headlamp">
<meta property="article:tag" content="headlamps">
<meta property="article:tag" content="LED Lights">
<meta property="article:tag" content="ONeill">
<meta property="article:tag" content="partner">
<meta property="article:tag" content="Shining Buddy">
<meta property="article:tag" content="syndicated">
<meta property="article:tag" content="The Revolt">
<meta property="article:tag" content="thewirecutter">
<meta property="article:tag" content="Vitchelo">
<meta property="article:tag" content="VitcheloV800">
<meta property="article:tag" content="wirecutter">

Interestingly, I can't even see a <meta name="keywords" /> on that page...

Also, it looks like they do have the same values repeated for swiftype tags:

<meta class="swiftype" name="tags" data-type="string" content="BlackDiamond">
<meta class="swiftype" name="tags" data-type="string" content="BlackDiamondRevolt">
<meta class="swiftype" name="tags" data-type="string" content="BlackDiamondSpot">
<meta class="swiftype" name="tags" data-type="string" content="CoastFL75">
<meta class="swiftype" name="tags" data-type="string" content="gadgetry">
<meta class="swiftype" name="tags" data-type="string" content="gadgets">
<meta class="swiftype" name="tags" data-type="string" content="gear">
<meta class="swiftype" name="tags" data-type="string" content="headlamp">
<meta class="swiftype" name="tags" data-type="string" content="headlamps">
<meta class="swiftype" name="tags" data-type="string" content="LED Lights">
<meta class="swiftype" name="tags" data-type="string" content="ONeill">
<meta class="swiftype" name="tags" data-type="string" content="partner">
<meta class="swiftype" name="tags" data-type="string" content="Shining Buddy">
<meta class="swiftype" name="tags" data-type="string" content="syndicated">
<meta class="swiftype" name="tags" data-type="string" content="The Revolt">
<meta class="swiftype" name="tags" data-type="string" content="thewirecutter">
<meta class="swiftype" name="tags" data-type="string" content="Vitchelo">
<meta class="swiftype" name="tags" data-type="string" content="VitcheloV800">
<meta class="swiftype" name="tags" data-type="string" content="wirecutter">

And again for AMP, using ld+json:

<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Article",
    "url": "https://www.engadget.com/2016/08/19/the-best-headlamps/",
    "author": "The Wirecutter",
    "headline": "The best headlamps",
    "datePublished": "2016-08-19 12:23:00.000000",
    ...
    "articleBody": "...",
    "articleSection": "Gear",
    "keywords": ["BlackDiamond","BlackDiamondRevolt","BlackDiamondSpot","CoastFL75","gadgetry","gadgets","gear","headlamp","headlamps","LED Lights","ONeill","partner","Shining Buddy","syndicated","The Revolt","thewirecutter","Vitchelo","VitcheloV800","wirecutter"],
    ...
    "dateModified": "2016-08-19 12:39:44.000000"
  }
</script>
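For reference, reading `keywords` out of a block like this needs only `JSON.parse` on the script's text. A rough sketch, not something this PR implements; `keywordsFromJsonLd` is a hypothetical name, and the handling of a comma-separated string is an assumption, since some pages may publish `keywords` as a single string instead of an array:

```javascript
// Hypothetical helper: extract `keywords` from the text content of one
// <script type="application/ld+json"> element.
function keywordsFromJsonLd(scriptText) {
  let data;
  try {
    data = JSON.parse(scriptText);
  } catch (err) {
    return []; // malformed JSON-LD: report no keywords
  }
  const keywords = data && data.keywords;
  if (Array.isArray(keywords)) {
    // Array form, as in the Engadget example.
    return keywords
      .filter(keyword => typeof keyword === 'string')
      .map(keyword => keyword.trim());
  }
  if (typeof keywords === 'string') {
    // Assumption: a plain string is a comma-separated list.
    return keywords
      .split(',')
      .map(keyword => keyword.trim())
      .filter(keyword => keyword.length > 0);
  }
  return [];
}

console.log(keywordsFromJsonLd('{"@type": "Article", "keywords": ["gear", "headlamp"]}'));
```

Note that the excerpt above contains `...` placeholders, so it would not parse as-is; the real page source would.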

Not sure if we want to add the latter two right now, or leave those for the AMP and swiftype implementation bugs.
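If more than one of the three sources listed above were ever read, the repeated values would have to be merged. One possible sketch, assuming each source has already been reduced to an array of strings; `mergeKeywords` is a hypothetical helper, and comparing case-insensitively is an assumption rather than anything decided here:

```javascript
// Hypothetical helper: merge keyword arrays from several sources,
// keeping the first spelling seen and dropping later duplicates.
// Assumption: keywords that differ only by case count as duplicates.
function mergeKeywords(...lists) {
  const seen = new Set();
  const merged = [];
  for (const list of lists) {
    for (const keyword of list) {
      const key = keyword.toLowerCase();
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(keyword);
      }
    }
  }
  return merged;
}

console.log(mergeKeywords(['gear', 'Headlamp'], ['headlamp', 'partner']));
```

Order follows the order of the arguments, so whichever source is passed first wins on spelling.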

It also brings up a semi-related question I keep forgetting to ask. Given that Open Graph, swiftype, and others can have multiple tags that match a ruleset, does Fathom or our parser convert those to an array, or will it just pluck the first matching value (giving us one keyword instead of an array of keywords)?

For example, will it work for tags like this:

<meta class="swiftype" name="tags" data-type="string" content="BlackDiamond">
<meta class="swiftype" name="tags" data-type="string" content="BlackDiamondRevolt">
<meta class="swiftype" name="tags" data-type="string" content="BlackDiamondSpot">
<meta class="swiftype" name="tags" data-type="string" content="CoastFL75">
...
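To make the two possible behaviours concrete, here is a sketch using plain DOM calls. It does not use Fathom's API and says nothing about what Fathom or the parser actually does; `firstContent` and `allContents` are hypothetical names, and `doc` is assumed to be a DOM `Document` (or anything else exposing `querySelector` and `querySelectorAll`):

```javascript
// Hypothetical helper: "pluck the first match" behaviour.
// Returns one string, or null when nothing matches.
function firstContent(doc, selector) {
  const element = doc.querySelector(selector);
  return element ? element.getAttribute('content') : null;
}

// Hypothetical helper: "collect every match" behaviour.
// Returns an array with one entry per matching tag that has content.
function allContents(doc, selector) {
  return Array.from(doc.querySelectorAll(selector))
    .map(element => element.getAttribute('content'))
    .filter(content => typeof content === 'string' && content.length > 0);
}
```

In a browser, `firstContent(document, 'meta.swiftype[name="tags"]')` would return a single string for the tags above, while `allContents(document, 'meta.swiftype[name="tags"]')` would return an array with one entry per tag. The same applies to `meta[property="article:tag"]`.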

3 participants