Specify how AMP pages are supposed to be referenced/linked from their canonical pages #498

Closed
jamesreggio opened this issue Oct 7, 2015 · 32 comments

Comments

@jamesreggio

It's possible that I've just missed this detail in the sea of documentation, but it seems to me that there's no clear way to signal the existence of an AMP page from a canonical/ordinary webpage.

The <link rel="canonical"> tag exists for an AMP page to reference its canonical page. How should the canonical page reference its AMP page?

This seems like an important feature for search-engine discovery and optimistic redirection to AMP pages on mobile browsers. I would expect a tag like <link rel="alternate" type="application/amp+html"> or similar to do the trick (though I'm not sure what the AMP MIME type is, if one has been established).

@jmadler jmadler self-assigned this Oct 7, 2015
@jmadler
Contributor

jmadler commented Oct 7, 2015

Good point!

FWIW, the MIME type for AMP HTML documents is the same as all other HTML documents: text/html

@Gregable
Member

Gregable commented Oct 7, 2015

There is a mechanism:
<link rel="amphtml" href="{amp version}">

I don't see it documented though.

@dvoytenko
Contributor

This appears in many examples, but I don't see a Markdown doc for it. We should add one.

        The canonical document for this article should be linked, as above.

        The canonical document should also have a corresponding <link> tag
        within pointing at this AMP HTML file: 

          <link rel="amphtml" href="http://example.ampproject.org/article-metadata.amp.html" /> 

        It is possible that this AMP HTML document is the canonical document
        for this article, in which case, the canonical URL should point to this
        document, and no "amphtml" link is required.
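
For readers following along, here is a minimal Python sketch of how a consumer might discover the AMP version from a canonical page using the `<link rel="amphtml">` tag quoted above. The names `AmpLinkFinder` and `find_amp_url` are hypothetical and not part of any AMP tooling. The sketch only reads static markup with the standard-library `html.parser`, so it will not see a tag injected by JavaScript.

```python
from html.parser import HTMLParser
from typing import Optional


class AmpLinkFinder(HTMLParser):
    """Remembers the href of the first <link> whose rel includes 'amphtml'."""

    def __init__(self) -> None:
        super().__init__()
        self.amp_href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> elements matter, and the first match wins.
        if tag != "link" or self.amp_href is not None:
            return
        attr_map = dict(attrs)
        # rel is a whitespace-separated token list; compare case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "amphtml" in rel_tokens and attr_map.get("href"):
            self.amp_href = attr_map["href"]


def find_amp_url(html_text: str) -> Optional[str]:
    """Return the AMP URL advertised by a canonical page, or None."""
    finder = AmpLinkFinder()
    finder.feed(html_text)
    return finder.amp_href
```

Calling `find_amp_url` on a page with no such tag returns `None`, which matches the case where the AMP document is itself the canonical one.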

@jamesreggio
Author

Ah, excellent. That answers my question.

I'll leave this issue open as a reminder that the documentation can/should be improved.

@dvoytenko
Contributor

Yes, please keep open. We will fix.

@Meggin
Contributor

Meggin commented Oct 7, 2015

Jordan, do you want to re-assign this one to me? I can make the docs better.

@cramforce
Member

Let's fix the spec first. Then the docs.

@kevinmarks

rel="alternate" is more accurate; rel="alternate" type="text/html" media="handheld" is a noted example

http://microformats.org/wiki/rel-alternate#With_media

if you do want to create rel="amphtml" please use the registry here:

http://microformats.org/wiki/existing-rel-values#HTML5_link_type_extensions

@cramforce
Member

I'm kind of on the fence here. We'd been using amphtml but it is not impossible to change.

How would the rel=alternate version look, for example?

type="text/html+amp" ?

Don't really want to introduce a new mime type :)

@adactio

adactio commented Oct 8, 2015

Rel values can be combined (space separated) so how about allowing both:

<link rel="amphtml" href="/path/to/amp.html">

and

<link rel="alternate amphtml" href="/path/to/amp.html">

@veganstraightedge

👍 @adactio's suggestion.

@jamesreggio
Author

I'm actually somewhat opposed to the multiple rels approach recommended by @adactio, if only because I'd imagine a lot of parsers are not written to handle that case well. (I've never seen multiple rels used before in the wild.)

@cramforce
Member

I wonder if there is any concrete benefit of rel=alternate? It seems any application that would prefer AMP HTML would need to have a clear notion of AMP anyway.

@adactio

adactio commented Oct 12, 2015

@cramforce It’s more for the other way around: aggregators gathering multiple alternates e.g. a JSON version, an RDF version, an AMP version.

@jamesreggio You have led a very sheltered existence. :-) Space separated rel values are very much the norm, and anyone writing a parser is aware of that.
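
To make the aggregator case concrete, here is a small Python sketch that gathers every `<link>` whose space-separated rel value includes `alternate`, so a `rel="alternate amphtml"` link is picked up next to other alternates. `AlternateCollector` and `collect_alternates` are hypothetical names for illustration only, and the tuple layout is an assumption, not anything specified by AMP.

```python
from html.parser import HTMLParser


class AlternateCollector(HTMLParser):
    """Records every <link> whose rel token list includes 'alternate'."""

    def __init__(self) -> None:
        super().__init__()
        # Each entry is (rel_tokens, type, href).
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # Split on whitespace so multi-valued rel attributes are handled.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "alternate" in rel_tokens:
            self.alternates.append(
                (rel_tokens, attr_map.get("type"), attr_map.get("href"))
            )


def collect_alternates(html_text):
    """Return all alternate links found in the given markup."""
    collector = AlternateCollector()
    collector.feed(html_text)
    return collector.alternates
```

A link that carries only `rel="amphtml"` is deliberately skipped here, which is the trade-off being debated: generic aggregators find the AMP version only if it is also marked `alternate`.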

@cramforce
Member

Filed internal bug at Google to check on whether we support the space separated version for discovering AMP documents already.

Tagged this bug "discovery" for everyone else following along on spec changes related to finding AMP documents.

@julien51

@jamesreggio Agreed with @adactio, this is actually pretty common in the RSS world...

@julien51

And talking about RSS, it would be nice if one of the recommendations was to also include this <atom:link rel="amphtml" ... /> in the feed's entries (RSS or Atom) themselves, to avoid fetching both the HTML and the AMP when trying to poll resources from a feed.

Something like this:

<item>
    <title>AMPed up</title>
    <link>https://adactio.com/journal/9646</link>
    <atom:link href="https://adactio.com/journal/9646/amp" rel="amphtml" />
    <description>
        <![CDATA[
        ...
        ]]>
    </description>
    <pubDate>Sat, 10 Oct 2015 15:02:39 GMT</pubDate>
    <guid>https://adactio.com/journal/9646</guid>
</item>
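
A feed consumer could read that link without fetching the HTML page at all. The following Python sketch assumes the feed declares the `atom` prefix as `http://www.w3.org/2005/Atom` on an enclosing element (the snippet above omits that declaration), and `amp_links_from_feed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the atom: prefix used in the feed.
ATOM_NS = "http://www.w3.org/2005/Atom"


def amp_links_from_feed(feed_xml: str) -> dict:
    """Map each item's <link> URL to its rel="amphtml" atom:link href."""
    root = ET.fromstring(feed_xml)
    result = {}
    for item in root.iter("item"):
        canonical = item.findtext("link")
        # atom:link elements live in the Atom namespace, unlike RSS <link>.
        for atom_link in item.findall(f"{{{ATOM_NS}}}link"):
            rel_tokens = (atom_link.get("rel") or "").split()
            if "amphtml" in rel_tokens:
                result[canonical] = atom_link.get("href")
    return result
```

Items without an `amphtml` link are simply left out of the returned mapping.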

@Gregable
Member

I don't know if we will accept the space separated version on a non-amp page referring to an amp page, but we don't validate space separated rel values in an AMP document currently, and we should.

@joshcp

joshcp commented Nov 12, 2015

From the perspective of parsing the canonical page to find an AMP version, would a <link> inserted via JavaScript be acceptable?

@cramforce
Member

@joshcp Very good question!
Pinging @amplesample

@julien51

Hmm. My hunch is that it would be quite ambitious to expect everyone consuming HTML pages to be able to execute JavaScript to identify an AMP version of the document. After all, even today, Google isn't 100% able to execute JavaScript from all pages it crawls.

@joshcp

joshcp commented Nov 12, 2015

@julien51 my expectation wouldn't be that everyone consuming HTML pages would execute JavaScript. The HTML page is canonical, so if a visitor doesn't execute JS they still get served the content. I see AMP as an enhancement of user experience, not core to the user experience.

The question is more with regard to Google's parser in particular, and generally speaking how future implementations of AMP consuming apps are expected to work.

It seems like there's an expectation that apps displaying AMP pages will be able to execute JavaScript. Also, Google is able to detect JavaScript-adaptive mobile website configurations, so I would hope a JS-inserted AMP <link> would be detected by Google and other future AMP parsers.

@julien51

Well, for what it's worth I run an API which aims at being able to consume AMP pages but is completely unable to execute JavaScript. Putting JS as a way to detect/identify AMP pages would completely block us from doing so.

I'm not 100% sure what the benefit is to allow JS to specify AMP resources, compared to other commonly used methods such as <link> elements and Link headers for example.

@cramforce
Member

Pretty much agree with everyone here. Let's just decide that JS injection is not OK. We want AMP to be usable by lots of platforms, and while Google might be able to handle JS, not everyone can.

@joshcp

joshcp commented Nov 12, 2015

@julien51 according to the link you shared, your app would still be able to consume the content via HTML. If a <link> is inserted into the HTML via JavaScript, that is an enhancement to the experience made available to parsers that execute JavaScript.

@cramforce For what it's worth, I run a JavaScript-adaptive mobile platform that would like to be able to inject a <link> to an external AMP page on the canonical page. Why exclude us from being able to generate AMP versions for those parsers that can handle it? And for parsers that can't, there's still the HTML fallback.

@cramforce
Member

@joshcp I would not say that crawlers should ignore JS-based injection of the <link> tag, but I would not mandate support for it. Do you use fragment URLs or actual URLs (based on pushState) for your permalinks?

@joshcp

joshcp commented Nov 12, 2015

@cramforce our platform generates mobile views of existing websites, using a small JS snippet to reconfigure the page for mobile (for example, load only mobile-specific images and CSS). We're not generating URLs with JS, just reconfiguring the content at any given URL. The platform works with both actual and fragment URLs.

@joshcp

joshcp commented Nov 12, 2015

@cramforce to clarify, are you saying that JS injection of the <link> tag is OK, and that crawlers that support JS should recognize it, while acknowledging that not all crawlers/parsers support JS?

@cramforce
Member

@joshcp We're happy for AMP pages to be discovered any way they can be. But I would not expect the JS solution to work with most AMP platforms.

@Gregable
Member

Gregable commented Dec 9, 2015

The validator now supports whitespace-separated values in the link tag's rel attribute.

@adactio

adactio commented Dec 9, 2015

👍

@niutech
Contributor

niutech commented Apr 28, 2017

What I have recently suggested on the AMP HTML Discuss forum is to add the Link: <....>; rel="amphtml" response header alongside the existing <link href="..." rel="amphtml"> tag, so that a user agent does not have to download the whole response body in order to redirect to the AMP version of a web page. Please consider adding it to the spec.
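
As a rough illustration of that proposal, a client could inspect the Link response header before reading the body. The Python sketch below is a simplified parser with the hypothetical name `amp_url_from_link_header`. It assumes no parameter value contains a comma, so it is not a complete implementation of the Link header grammar.

```python
import re
from typing import Optional

# One link-value: <target> followed by its parameters, up to the next comma.
# Assumption: parameter values do not themselves contain commas.
_LINK_RE = re.compile(r'<([^>]*)>([^,]*)')
# The rel parameter, quoted or unquoted.
_REL_RE = re.compile(r';\s*rel\s*=\s*"?([^";]*)"?', re.IGNORECASE)


def amp_url_from_link_header(header_value: str) -> Optional[str]:
    """Return the target of the first link whose rel includes 'amphtml'."""
    for match in _LINK_RE.finditer(header_value):
        target, params = match.group(1), match.group(2)
        rel = _REL_RE.search(params)
        # rel may hold several space-separated relation types.
        if rel and "amphtml" in rel.group(1).lower().split():
            return target
    return None
```

Because only the header is examined, a user agent could issue a HEAD request and decide whether to fetch the AMP URL instead of downloading the full canonical page.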
