This repository has been archived by the owner on Aug 3, 2024. It is now read-only.

Add canonical tag to homepage for SEO #1367

Closed
AriFordsham opened this issue Mar 25, 2021 · 3 comments

Comments

@AriFordsham

AriFordsham commented Mar 25, 2021

Proposal

Add the Cabal homepage field as a rel=canonical link on the contents page of the generated Haddocks.
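
For illustration only, the proposal amounts to emitting one extra element in the head of the generated contents page. The sketch below is not Haddock's code (Haddock is written in Haskell); the function name `canonical_link_tag` and the example URL are hypothetical and only show the shape of the output.

```python
from html import escape


def canonical_link_tag(homepage: str) -> str:
    """Render a rel=canonical link element for the given homepage URL."""
    # Escape the URL so quotes or ampersands cannot break the attribute value.
    return f'<link rel="canonical" href="{escape(homepage, quote=True)}">'


# Hypothetical homepage value, standing in for the field from a .cabal file.
print(canonical_link_tag("https://example.com/my-package"))
```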

Discussion

Where should a Google search for a package take you?

  1. To the package's Hackage page?
  2. To the project homepage (most often the GitHub repo)?

If the answer is (2), which is my view since homepages tend to contain fuller descriptions and point to further resources, then adding a canonical tag should help search engines pick this up.

@gbaz
Contributor

gbaz commented Mar 25, 2021

This is a misunderstanding of the canonical tag. When there are multiple "copies" of the exact same page (e.g. if a page appears under two subdomains) then the canonical tag is supposed to disambiguate which of those pages is the "canonical" one, and which the copy.

The canonical tag is not supposed to point to a page with different contents, and using it like this will only confuse search engines, not help them.

@AriFordsham
Author

AriFordsham commented Mar 25, 2021

@gbaz Yes and no.
Canonicals are generally used when the content is largely the same, even on non-identical pages, e.g. cross-posts of a blog post on different sites (with different styling, headers, etc.).

The Google Search docs consider "content you provide on a blog for syndication to other sites is replicated in part or in full on those domains" a legitimate use case for rel=canonical.

Many Haddock contents pages contain a README identical to the one on the GitHub repo or homepage, although others may differ.

But I substantially agree with your point: it may not be wise, since it won't be uniformly correct.

Are there any other SEO techniques that would support the correct resolution of this question, whatever it is?

@AriFordsham
Author

Sorry, rel=canonical is specified in RFC 6596, which describes its purpose as "to designate an Internationalized Resource Identifier (IRI) as preferred over resources with duplicative content."

The following excerpts are relevant:

  3. The Canonical Link Relation

The target (canonical) IRI MUST identify content that is either
duplicative or a superset of the content at the context (referring)
IRI.

  • As an example, each component page (e.g., page-1.html,
    page-2.html) of a multi-page article MAY specify the "view-all"
    version (e.g., page-all.html), the superset of their content,
    as the target IRI. This is because the content from each
    component page is contained within the view-all version. Given
    this implementation, applications can mark page-1.html and
    page-2.html as duplicates of page-all.html, process content
    only from page-all.html, and disregard the component pages.
    All references can then be made to the view-all version
    (page-all.html, the target IRI), and no content will have been
    lost in this process.

  • Using the same example above, page-2.html SHOULD NOT designate
    page-1.html as the target (canonical) IRI because this may
    cause a loss of data. When page-2.html designates page-1.html
    as the canonical, only content from the target IRI,
    page-1.html, will be processed. page-2.html may be marked as a
    duplicate of page-1.html and its content disregarded.

  5. Recommendations

Before adding the canonical link relation, verification of the
following is RECOMMENDED:

  1. The content of the context IRI is duplicated within the content
    of the target (canonical) IRI.

So while Haddock contents might comply, they also might not, so including this in Haddock is probably a bad idea.
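
To make the superset requirement quoted above concrete, here is a rough sketch of the check it implies. It assumes pages can be compared as lists of paragraphs, which is a simplification; the function name `is_superset` and the sample page contents are made up for illustration and are not part of the RFC or of Haddock.

```python
def is_superset(context_paragraphs, target_paragraphs):
    """Return True if every paragraph of the context page appears in the target page."""
    # Compare on stripped text so surrounding whitespace does not matter.
    target = {p.strip() for p in target_paragraphs}
    return all(p.strip() in target for p in context_paragraphs)


# Made-up contents mirroring the RFC's multi-page article example.
page_1 = ["Intro", "Part one"]
page_2 = ["Part two", "Conclusion"]
page_all = page_1 + page_2

print(is_superset(page_1, page_all))  # True: page-all.html may be the canonical
print(is_superset(page_2, page_1))    # False: page-2.html content would be lost
```

A Haddock contents page whose text is not fully contained in the project homepage would fail this kind of check, which is the case the RFC warns against.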
