Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design Review 2020-07-15 15:30 UTC (cross origin iframe, wg-caching) #28471

Closed
mrjoro opened this issue May 18, 2020 · 16 comments
Closed

Design Review 2020-07-15 15:30 UTC (cross origin iframe, wg-caching) #28471

mrjoro opened this issue May 18, 2020 · 16 comments
Assignees
Labels
Type: Design Review WG: Caching: Triaged Initial triage from wg-caching complete. Remove label if new input required. WG: caching
Milestone

Comments

@mrjoro
Copy link
Member

mrjoro commented May 18, 2020

Time: 2020-07-15 15:30 UTC (add to Google Calendar)
Location: Video conference via Google Meet

The AMP community holds weekly engineering design reviews. We encourage everyone in the community to participate in these design reviews.

If you are interested in bringing your design to design review, read the design review documentation and add a link to your design doc or issue by the Monday before your design review.

When attending a design review please read through the designs before the design review starts. This allows us to spend more time on discussion of the design.

We rotate our design review between times that work better for different parts of the world as described in our design review documentation, but you are welcome to attend any design review. If you cannot make any of the design reviews but have a design to discuss please let mrjoro@ know on Slack and we will find a time that works for you.

@mrjoro mrjoro added this to the Docs Updates milestone May 18, 2020
@mrjoro mrjoro self-assigned this May 18, 2020
@mrjoro
Copy link
Member Author

mrjoro commented May 18, 2020

The Validation & Caching Working Group will be presenting their periodic update at this Design Review (10-15 minutes).

/cc @ampproject/wg-caching @Gregable

@Gregable Gregable added WG: caching WG: Caching: Triaged Initial triage from wg-caching complete. Remove label if new input required. labels May 18, 2020
@zhouyx
Copy link
Contributor

zhouyx commented Jul 14, 2020

I'd like to discuss cross origin iframe usage in AMP.
This doc describes the current CORS iframe usage in AMP, and the proposal.

@lannka
Copy link
Contributor

lannka commented Jul 15, 2020

Notes:

Yuxuan: propose a guideline to require special approval for all new 3rd party background iframe usages in AMP.

Yuxuan open question: is WebWorker a good alternative? any security consideration?

Dima: is CORS worker allowed? we might need to share workers. Can be amp-script an alternative ?
Yuxuan: is it designed for background script? (no user interaction)
Dima: need a protocol for an invisible amp-script to interact with the page.
Devin (was that Devin?): amp-script already has all the security is already (meta tag SHA check)

Will: how to measure the impact of cross-origin ? Yuxuan: hard to measure
Will: do cross-origin iframe run in different process. Dima: not guaranteed most browsers do. browsers has web worker limit

Dima: alternatively, we can queue all 3rd party js in one single iframe / worker, and somehow make sure it run in a separate process.
Dima: we could use friendly iframe w/o origin to emulate foreign worker
Hongfei: that solves security concern, but compromises performance?

Will: step back, the questions are: 1) where 3rd party js should run. 2) are the run all useful. like amp-analytics, we could have different vendors to share js.

Next steps:

  • Recommend Permutive to not use iframe.
  • More research on the Dima's iframe/worker queue idea, and finalize the guideline.

@twifkak
Copy link
Member

twifkak commented Jul 15, 2020

Devin (was that Devin?): amp-script already has all the security is already (meta tag SHA check)

Yes, it was me. I was noting the required script hash for cross-origin amp-script (added in #23691). If talking about making components use amp-script as a backend, then it seems like the spiritual analogue here would be hard-coding the script hash into the amphtml repo. This means any updates to the script would have to be at a new URL, followed by a commit of the new URL and hash to the amphtml repo. This eliminates the need to unobfuscate the code, but doesn't allow for faster release cycles than AMP. But maybe my spiritual analogue is incorrect.

@zhouyx
Copy link
Contributor

zhouyx commented Jul 20, 2020

Thanks all.

I looked at the script hash requirement for cross-origin amp-script, based on #23691 (comment) that's added to mitigate use of amp-script as a script gadget.
Personally I think the fact that the script source is fixed based on the extension or the publisher's source origin already mitigate such concern a bit. Because one cannot use the component to load their own script.
@joshfg what do you think about the script hash requirement and the release cycle restriction?

@dvoytenko I did some research regarding running iframe in separate thread, the closet thing I can find is Chrome's site isolation project, which push iframe from different domains to new processor. From what I learnt, the feature was introduced for security reason, and is only partially enabled to Android Chrome due to memory usage concerns.

@joshfg
Copy link
Contributor

joshfg commented Jul 21, 2020

@joshfg what do you think about the script hash requirement and the release cycle restriction?

Unfortunately I don't think either of these would work for our use case.

Just to give some background for anyone else reading: we have an open design proposal for an AMP component, which we've been discussing with Yuxuan. We think that it would be an option for us to open source most of our JS SDK in an AMP component, however there is a dynamic part of our script which we build for each publisher. This means that when a publisher deploys our component, they'd need to pass their Permutive publisher ID as configuration, and our component would load the dynamic part from our CDN (e.g. from https://cdn.permutive.com/<publisher_id>.js). This dynamic part is an important aspect of our product: it enables us to decision using data on-device which is better for speed, privacy and reducing the amount of data that needs to be sent into the cloud.

For our use case, I see two problems with the script hash requirement & release cycle restriction:

  1. The dynamic part of our script is per publisher, it would be infeasible for us to maintain a complete list of script hashes (for every publisher who is a customer) in our component in the AMP repo.
  2. This publisher specific script is highly dynamic. We rebuild it for the publisher, whenever the publisher changes their configuration in our dashboard. This can happen very frequently (e.g. daily, or more). We need these changes to go live quickly for our product to be effective, so waiting 1-2 weeks for the next AMP release wouldn't work here.

What are these security measures intending to achieve? If we were to have our own AMP component loading this script, the origin would be fixed in the component itself (to cdn.permutive.com) so it should not be possible to load a script from another origin.

@twifkak
Copy link
Member

twifkak commented Jul 21, 2020

Thanks for clarifying the use-case. It's possible that, as @zhouyx says, the script gadget concerns are mitigated elsewise; I haven't read up on the details yet (nor should my saying "yes this seems secure" be considered authoritative as I'm not an expert in this area -- cue the old saying about proof of absence vs absence of proof). I just wanted to point out something worth considering.

@dvoytenko
Copy link
Contributor

@zhouyx

I did some research regarding running iframe in separate thread, the closet thing I can find is Chrome's site isolation project, which push iframe from different domains to new processor. From what I learnt, the feature was introduced for security reason, and is only partially enabled to Android Chrome due to memory usage concerns.

Yes. That's the one I meant. Do you see any reference on how one can achieve a separate thread for site isolation for a about:blank iframe when the sandbox="allow-same-origin" is NOT configured?

And, what about a worker with a importScripts('https://3p.net/some-script.js')? That's a guaranteed thread with some isolation that we should have already some experience with, I believe.

@zhouyx
Copy link
Contributor

zhouyx commented Jul 22, 2020

Do you see any reference on how one can achieve a separate thread for site isolation for a about:blank iframe when the sandbox="allow-same-origin" is NOT configured?

I couldn't find anything about about:blank iframe. And site isolation seems to only care about pushing site to different process not separate thread. Here's what I read

"To support a site-per-process policy in a multi-process web browser, we need to identify the smallest unit that cannot be split into multiple processes. This is not actually a single page, but rather a group of documents from the same web site that have references to each other. Such documents have full script access to each other's content, and they must run on a single thread, not concurrently. This group may span multiple frames or tabs, and they may come from multiple sub-domains of the same site."
"In addition, top-level documents may contain iframes from different web sites. These iframes have their own security context and must be rendered in a process based on their own site, not the site of their parent frame."

And, what about a worker with a importScripts('https://3p.net/some-script.js')? That's a guaranteed thread with some isolation that we should have already some experience with, I believe.

I believe the worker when created runs in a separate thread. But scripts imported by importScripts runs in the same thread as the worker, Loading 3p scripts leads to security concerns, and importScripts is blocked in worker dom.

@dvoytenko
Copy link
Contributor

But worker dom still uses a cross-origin script, no? The method of fetching it might be different, but isn't the end result still similar?

@zhouyx
Copy link
Contributor

zhouyx commented Jul 24, 2020

True.
Here's my understanding of how worker runs cross-origin scripts in AMP today. <amp-script> fetch the cross-origin script, verify the script hash is the same as the one provided in meta. (This is the meta tag SHA check mentioned above) and then run cross-origin script in a sandboxed web worker.
And yes that's a guaranteed thread with some isolation, mostly due to the sandboxing strategy. We can definitely leverage that to run 3P vendor scripts if we could get rid of the script hash safely.

@dvoytenko
Copy link
Contributor

I see. It looks like this hash is used more for integrity than security? I.e. no one on the serving stack validates the cache, do they? Assuming the integrity is less critical with a 3p service, what other security nuances do we need to validate for a worker?

I think to sum it up, we could explore two options:

  1. A worker with all 3p scripts. A guaranteed single separate thread for all related vendors. But security parameters might be unsatisfactory.
  2. A x-origin thread. It could be an iframe running on getDefaultBootstrapBaseUrl. Or maybe even it could be a srcdoc iframe without sandbox=allow-same-origin. It should check off security boxes. But it might be hard to guarantee thread isolation.

@dvoytenko
Copy link
Contributor

And the nice part here, potentially, is that the API to both a worker and thread would probably be exactly the same.

@zhouyx
Copy link
Contributor

zhouyx commented Jul 24, 2020

Based on #23691 (comment), the hash is added to mitigate use of amp-script as a script gadget.

In the case of multiple 3P scripts, are you suggesting running them in one worker/iframe? Or the AMP runtime will queue them and run each one in a new worker/iframe.

And the nice part here, potentially, is that the API to both a worker and thread would probably be exactly the same.

That'd be nice. The API should remained the same regardless of the underlying design.

@dvoytenko
Copy link
Contributor

That'd be nice. The API should remained the same regardless of the underlying design.

I meant mostly that it's just postMessage/onMessage in both cases.

In the case of multiple 3P scripts, are you suggesting running them in one worker/iframe? Or the AMP runtime will queue them and run each one in a new worker/iframe.

Yes to former. One worker/iframe - all 3p scripts for the analytics use cases. It's basically a way to give the 3p scripts a restricted singular access to the doc but in a separate thread.

Based on #23691 (comment), the hash is added to mitigate use of amp-script as a script gadget.

Can't say that I understand it all. But in our case the scripts will not be allowed to contribute any DOM and the cross-origin API is strictly postMessage.

@mrjoro mrjoro changed the title Design Review 2020-07-15 15:30 UTC (Africa/Europe/western Asia) Design Review 2020-07-15 15:30 UTC (cross origin iframe, wg-caching) Jul 28, 2020
@mrjoro mrjoro closed this as completed Aug 25, 2020
@zhouyx
Copy link
Contributor

zhouyx commented Sep 11, 2020

Re: The running 3p scripts in workers discussion. We proposed an I2I #30193. Feedback is welcome!

Let us know if the proposal will help with your use case in AMP given that iframe in the background is not allowed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Design Review WG: Caching: Triaged Initial triage from wg-caching complete. Remove label if new input required. WG: caching
Projects
None yet
Development

No branches or pull requests

7 participants