New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Receivers of ranged responses must ensure all ranges come from the same underlying resource #703

Closed
jakearchibald opened this Issue May 27, 2015 · 35 comments

Comments

Projects
None yet
@jakearchibald
Copy link
Collaborator

jakearchibald commented May 27, 2015

Say an element makes two ranged requests and receives two responses, and the serviceworker handles both. If the element accepts opaque responses (video), the serviceworker could mix opaque data from different urls, or opaque data with non-opaque data.

@sirdarckcat - could you quickly sum up an attack based on this? My brain's kinda reset and I can't think of anything bad you can do here other than lie to yourself.

+@slightlyoff

@sirdarckcat

This comment has been minimized.

Copy link

sirdarckcat commented May 28, 2015

@annevk

This comment has been minimized.

Copy link
Member

annevk commented Jun 9, 2015

Hmm, synthetic 206/304 should probably be a network error for APIs other than fetch() and XMLHttpRequest.

@annevk

This comment has been minimized.

Copy link
Member

annevk commented Jun 9, 2015

Also, that doesn't work as the Range header is not revealed to the service worker since it's set in the network layer, after the service worker.

@sirdarckcat

This comment has been minimized.

Copy link

sirdarckcat commented Jun 10, 2015

It is leaked, at least in Chrome.

@sirdarckcat

This comment has been minimized.

Copy link

sirdarckcat commented Jun 10, 2015

Test this:
http://sirdarckcat.github.io/range/
vs.
https://sirdarckcat.github.io/range/ (you might need to refresh once or twice to make sure the SW is registered).

Here it's just demonstrating truncation as the 3 bytes it replaces are the same (Ogg), but you can construct a more complicated service worker that mixes and matches.

@kinu

This comment has been minimized.

Copy link
Contributor

kinu commented Jun 11, 2015

/cc @horo-t @mattto interesting, is it implementation issue then? It's possibly related to the Chrome implementation where we actually intercept requests in the network layer.

@sirdarckcat

This comment has been minimized.

Copy link

sirdarckcat commented Jun 13, 2015

@annevk

This comment has been minimized.

Copy link
Member

annevk commented Jun 13, 2015

That is only true if there are no implementation bugs at play here, right? But either way I guess range requests deserve more inspection.

@horo-t

This comment has been minimized.

Copy link

horo-t commented Jun 15, 2015

When the file size is big, audio/video element will send multiple requests.

  • First, audio/video element sends a GET request(1).
  • The user clicks the seek bar of audio/video element, the browser will send a Range request(2).

I don't understand what "network layer" means in this situation.
Are you saying that the second Range request(2) should not be handled by the ServiceWorker?

@annevk

This comment has been minimized.

Copy link
Member

annevk commented Jun 15, 2015

@horo-t no, it should be handled by the SW. What is unclear to me is what layer should be responsible for setting the range headers?

  • API
  • Fetch
    • Before service worker
    • Service worker
    • After service worker
    • Cache/Network

That determines how observable they are (most headers such as User-Agent and If-Modified-Since are set in the Cache/Network part and therefore not observable to service workers).

Separately we need to decide what happens with synthetic 206 responses.

@mayhemer you probably want to review this thread.

@slightlyoff

This comment has been minimized.

Copy link
Contributor

slightlyoff commented Oct 28, 2015

/cc @mnot

@jakearchibald jakearchibald added this to the Version 1 milestone Oct 28, 2015

@mnot

This comment has been minimized.

Copy link
Member

mnot commented Nov 12, 2015

@annevk I think the audio/video element (or other consumer) will be generating the Range headers in this situation.

There's another use case where they're generated in cache/network, but that's when the HTTP cache has a partial response already there and it wants to "fill it in." When that happens, it really is opaque to SW and the API, because the response is a 200 (unless the API asked for a partial response to begin with, see above).

I think this needs to be updated with the outcome of discussion at TPAC...

@mattto

This comment has been minimized.

Copy link
Member

mattto commented Jan 22, 2016

Chrome has changed this since the bug was filed. @horo-t can correct me if I'm wrong, but my understanding is now we disallow mixing ranged responses from different origins. Furthermore, this change didn't have much to do with service worker, except that we are sure to use the response URL, not the original request URL in case the SW did respondWith(fetch(some-other-origin)). Also, we don't allow mixing of responses between network and SW-generated via new Response().

See https://code.google.com/p/chromium/issues/detail?id=505829

@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Jan 26, 2016

F2F:

  • cache.put should prevent adding 206 responses.
  • HTTP spec says a server can respond 200 in responses to a ranged request
  • Do browsers allow this? If not, file bugs
  • If browsers unprepared to fix, change cache.match to help
@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Apr 11, 2016

F2F: Need to do research around this

  • What do browsers do today without SW when it comes to 200 responses to a ranged request
  • Is returning a 200 response for a 4gb video a reasonable thing to do performance
  • Whoa, you can provide multiple ranges - do browsers support this?

@jakearchibald jakearchibald self-assigned this Apr 11, 2016

@NekR

This comment has been minimized.

Copy link

NekR commented Apr 12, 2016

Do browsers allow this? If not, file bugs
What do browsers do today without SW when it comes to 200 responses to a ranged request

They seems to allow. Saw this many times our CDN, it returns 200 for ranged requests and browsers work fine.
Of course, still makes sense to do a research about it.

@slightlyoff slightlyoff added the decided label Apr 12, 2016

@wanderview

This comment has been minimized.

Copy link
Member

wanderview commented Apr 13, 2016

I wrote a gecko bug to block cache.put() of 206 responses:

https://bugzilla.mozilla.org/show_bug.cgi?id=1264181

It seems we still need research to determine other aspects of this?

@mnot

This comment has been minimized.

Copy link
Member

mnot commented Apr 13, 2016

200 is absolutely allowed as a response to a ranged request; servers aren't required to support them.

http://httpwg.org/specs/rfc7233.html#introduction

Range requests are an OPTIONAL feature of HTTP, designed so that recipients not implementing this feature (or not supporting it for the target resource) can respond as if it is a normal GET request without impacting interoperability. Partial responses are indicated by a distinct status code to not be mistaken for full responses by caches that might not implement the feature.

@code-tree

This comment has been minimized.

Copy link

code-tree commented Jun 3, 2016

200 is absolutely allowed as a response to a ranged request; servers aren't required to support them.

Glad to know, but...

They seems to allow. Saw this many times our CDN, it returns 200 for ranged requests and browsers work fine.

I wish this were true, but it doesn't seem like Chrome is currently doing this with Audio element. I've documented this, and would appreciate confirmation as to whether: 1. my bad, 2. Chrome's bad, 3. SW's bad

@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Jun 3, 2016

I need to do more research on this. The spec says it's ok, so it seems like Chrome is bad. But if all browsers behave like Chrome, we should probably update the fetch spec to recognise this, then make the cache API generate ranges responses where it's safe to.

@wanderview

This comment has been minimized.

Copy link
Member

wanderview commented Jun 3, 2016

@code-tree Do you have a sample site demonstrating the problem? I'd like to test in FF. Thanks.

@code-tree

This comment has been minimized.

Copy link

code-tree commented Jun 4, 2016

Yes, I created one today. I've also included a workaround based on a link @jakearchibald sent, and made it a bit more robust.

I was hoping if I removed the 'Accept-Ranges' header from responses, then Chrome would stop requesting for ranges, but it didn't work. Only workaround seems to be to satisfy Chrome, and give it its ranged content.

@wanderview

This comment has been minimized.

Copy link
Member

wanderview commented Jun 4, 2016

Thanks! Firefox seems to work with both cases on that site even with seeking the audio. On my mobile, though, so not sure if it's trying to use range requests or not.

@code-tree

This comment has been minimized.

Copy link

code-tree commented Jun 5, 2016

I also just realised it is working in Chrome mobile as well for me. The issue is quite hard to reproduce actually. I'm definitely reproducing it on desktop Chrome, and also know that it occurs on mobile too, in an app I'm not able to publish yet.

Chrome dev tools (connected to mobile Chrome) actually shows a request for range 0-1 for both audio players, then when I click play it doesn't show any more requests (but does get the rest of the audio), making me wonder if there is also a bug with dev tools here (update: see below).

@code-tree

This comment has been minimized.

Copy link

code-tree commented Jun 5, 2016

Update: Just realised I was using Chrome 50 for mobile testing, and it does not work (as evident when offline). Chrome pre-52 apparently uses Android's media stack (rather than own) and somehow was bypassing the SW. or something?

In any case, example with no range support does not work on any mobile Chrome version. Example with range support does not work for Chrome mobile < 52, but does work 52+.

See also chrome bug 575357 and an upcoming Google post.

And this quote from 7 Aug 2014 explains it... :/

Because android MediaPlayer lives in a separate process, chrome has no control over how network requests are made inside it. This may cause some issues.

@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Jul 27, 2016

Pre F2F notes: I don't think there's anything to discuss here

@slightlyoff

This comment has been minimized.

Copy link
Contributor

slightlyoff commented Jul 29, 2016

Nothing to discuss because Chrome on Android now uses the Chrome media stack? Or for some other reason?

@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Jul 29, 2016

This ticket was already decided, and there's no objection to the decision. Just need to submit PRs to the HTML spec defining this behaviour (I can't see how it can be done in the fetch or SW spec)

@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Jul 29, 2016

Thought: look at adaptive formats built from many resources (dash) and come up with a safe strategy there

@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Jul 31, 2016

@horo-t

We disallow mixing ranged responses from different origins.

This may not be enough. Imagine a request for <script src="//other-origin/private-data"> which (somehow) was requested in three ranges. I could satisfy them with:

  • //other-origin/data.js which I knew started `window.whatever = ``
  • //other-origin/private-data
  • //other-origin/data.js which I knew ended ``;`

…which would give me access to the private data. I realise script isn't requested with ranges, but there may be similar attacks with media?

If so, we can ensure that for media using opaque responses, each response must have a first item in the url list that matches the request url. I'm hoping this caters for single-resource media, but also things like DASH that are made up of multiple parts.

I use the URL list in attempt to ensure the response has come from the requested location, but allow for redirects.

@horo-t

This comment has been minimized.

Copy link

horo-t commented Aug 1, 2016

Yes. It looks possible to know the private data using media element and the known data in the same origin.

Checking the url list looks good to me.

@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Sep 20, 2016

Action for @jakearchibald: file issues on html spec for handling range requests

@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Aug 3, 2017

I've begun work on this.

HTML PR: whatwg/html#2814
Fetch PR: whatwg/fetch#560

I still need to spec how download-to-disk works, but I'll take that on as part of background fetch.

@jakearchibald

This comment has been minimized.

Copy link
Collaborator

jakearchibald commented Aug 3, 2017

Closing this, because the work needs to happen in other specs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment