Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Signed Exchanges: understand AMP runtime's needs for serving origin identity #20359

Closed
Gregable opened this issue Jan 16, 2019 · 10 comments
Closed

Comments

@Gregable
Copy link
Member

Gregable commented Jan 16, 2019

@ampproject/wg-ui-and-a11y

With Signed Exchanges, AMP documents will be operating on the publisher origin, even if the document is served from the AMP Cache. By default, the AMP runtime will not be able to determine if the document was served by the AMP cache or the Publisher.

The open question is if the runtime makes any decisions based on the origin being *.cdn.ampproject.org, etc. A hypothetical example could be rewriting images in amp-bind with a cdn.ampproject.org cache URL.

If the AMP runtime does not make any such decisions, this issue can simply be closed. If it does, signed exchanges can provide metadata in the document which the runtime can use to make that decision. We should determine what the syntax of this metadata should be.

@honeybadgerdontcare for the syntax of the <html amp transformed=> attribute which may be sufficient.

If not, we can also do something like:

<meta name="amp-cache-origin" content="cdn.ampproject.org">

@honeybadgerdontcare
Copy link
Contributor

The AMP Packager transformer transformerIdentifier.go will add the attribute transformed with value google;v=1 to the <html> tag. The value is validated by the AMP Validator.

This follows the form of the AMP-Cache-Transform header. It explicitly states the CDN performing the transform and the transformer version used.

E.g. <html amp transformed="google;v=1">

@twifkak
Copy link
Member

twifkak commented Jan 16, 2019

(Nit: It differs from the AMP-Cache-Transform header in that the version number isn't quoted.)

@honeybadgerdontcare
Copy link
Contributor

It is worth noting that the transformers used in version one include server side rendering transformer which also has hints to the Runtime that server side rendering was performed (i-amphtml-layout on <html> tag) and for inline AMP CSS transformer (<style amp-runtime i-amphtml-version=###>).

@ampprojectbot ampprojectbot added this to the Pending Triage milestone Jan 24, 2019
@dreamofabear dreamofabear assigned jridgewell and unassigned aghassemi Jan 28, 2019
@dreamofabear
Copy link

/to @jridgewell

@alanorozco
Copy link
Member

My 2c: There is a manageable number of hard origin checks throughout the runtime. In many cases this is just a proxy signal for whether the document is being loaded in a Google viewer context, so the proposed API should be sufficient.

@Gregable
Copy link
Member Author

Do you have thoughts on which API would be preferred? Is transformed=google sufficient, or should the cache provide the actual origin?

@dreamofabear
Copy link

Currently there are several configurable data (including origin) that other caches can theoretically override:

amphtml/src/config.js

Lines 24 to 47 in 6f63baf

const env = self.AMP_CONFIG || {};
const thirdPartyFrameRegex = typeof env['thirdPartyFrameRegex'] == 'string' ?
new RegExp(env['thirdPartyFrameRegex']) : env['thirdPartyFrameRegex'];
const cdnProxyRegex = typeof env['cdnProxyRegex'] == 'string' ?
new RegExp(env['cdnProxyRegex']) : env['cdnProxyRegex'];
/** @type {!Object<string, string|boolean|RegExp>} */
export const urls = {
thirdParty: env['thirdPartyUrl'] || 'https://3p.ampproject.net',
thirdPartyFrameHost: env['thirdPartyFrameHost'] || 'ampproject.net',
thirdPartyFrameRegex: thirdPartyFrameRegex || /^d-\d+\.ampproject\.net$/,
cdn: env['cdnUrl'] || 'https://cdn.ampproject.org',
/* Note that cdnProxyRegex is only ever checked against origins
* (proto://host[:port]) so does not need to consider path
*/
cdnProxyRegex: cdnProxyRegex ||
/^https:\/\/([a-zA-Z0-9_-]+\.)?cdn\.ampproject\.org$/,
localhostRegex: /^https?:\/\/localhost(:\d+)?$/,
errorReporting: env['errorReportingUrl'] ||
'https://amp-error-reporting.appspot.com/r',
localDev: env['localDev'] || false,
};

There are also a few places where cdn.ampproject.org logic is hardcoded e.g. parsing and generating cache URLs.

Providing the cache origin would be nice (obviates cdnUrl above) but insufficient to decouple runtime from cache. If that's not necessary for SxG then transformed=google should suffice.

@Gregable
Copy link
Member Author

I see. Well, presumably other caches could also provide their own javascript URL on the documents as well.

I then propose that we stick with transformed=google... for now until we have a real use case to consider. Some of these decisions depend on the number of caches we'd need to support. For example, if there are only 2 or 3, then it's reasonable to encode these parameters into every page. If there are hundreds, then we would not want to bloat the javascript file size with the configs.

Wherever we have the hard origin checks, we should replace that with a call that checks against transformed=google.... Note that the AMP cache sets this already, so it should be sufficient to only consider this html attribute.

@honeybadgerdontcare
Copy link
Contributor

It may be worth noting that eventually the concept of transformed AMP will come to pages served at origin. The actual attribute value hasn't been determined yet (e.g. transformed=self or transformed=amp-toolkit). When these are seen by an AMP cache they can choose to ingest them and apply their own transforms on top of them causing the final attribute value to be that of the cache. In Google's case transformed=self would become transformed=google;v=#.

Also, an AMP page served from origin could declare transformed=google;v=#. So whatever the AMP runtime does in the cases where it thinks it's from an AMP cache, it'll need to be something that doesn't fully break the page.

@Gregable
Copy link
Member Author

We discussed this for ~5 minutes offline and concluded that for now transformd=google... is a sufficient marker for AMP runtime.

The other suggestion that was raised is that other caches could implement these modifications in the future by mandating an alternative source for v0.js which makes the necessary changes to the runtime config.

Closing this issue. @jridgewell for filing any issues that for runtime changes that may be needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants