Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect handling of multiple headers #19

Open
phiresky opened this issue Mar 6, 2021 · 2 comments
Open

Incorrect handling of multiple headers #19

phiresky opened this issue Mar 6, 2021 · 2 comments

Comments

@phiresky
Copy link

phiresky commented Mar 6, 2021

While implementing firefox support, I noticed the following:

The CDP Network.Headers interface returns headers as a JSON object of keys and values, while the webRequest API returns a list of keys and values.

HTTP Headers can appear multiple times though, so how does CDP handle multiple of the same header?

I tested this with a tiny test server, and when the response contains two values for the same header:

HTTP/1.1 200 OK
X-Powered-By: Express
Set-Cookie: foo=bar; Path=/; Expires=Sat, 06 Mar 2021 14:28:57 GMT
Set-Cookie: baz=hio; Path=/; Expires=Sat, 06 Mar 2021 14:28:57 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 12
ETag: W/"c-Lve95gjOVATpfV8EL5X4nxwjKHE"
Date: Sat, 06 Mar 2021 14:28:57 GMT
Connection: keep-alive
Keep-Alive: timeout=5


The following is stored in the database:

respHeaders:
Connection: "keep-alive"
Content-Length: "12"
Content-Type: "text/html; charset=utf-8"
Date: "Sat, 06 Mar 2021 14:28:57 GMT"
ETag: "W/"c-Lve95gjOVATpfV8EL5X4nxwjKHE""
Keep-Alive: "timeout=5"
Set-Cookie: "foo=bar; Path=/; Expires=Sat, 06 Mar 2021 14:28:57 GMT, baz=hio; Path=/; Expires=Sat, 06 Mar 2021 14:28:57 GMT"
X-Powered-By: "Express"

The headers are concatted simply with comma. This seems to be "correct" for most headers, but not all of them. For set-cookie for example, the cookies are mangled and not parseable. This means that the resulting web archive will set mangled cookies.

I found some more info here:
https://stackoverflow.com/a/38406581/2639190

@ikreymer
Copy link
Member

Thanks for pointing this out! Probably should add a way to split these up when storing the headers.
Cookies are a bit tricky, and actually are not supported being set from a service worker, so the Set-Cookie headers are not replayed.
Instead, what happens is that the actual Cookie from the request is saved in another header, and then wabac.js injects that cookie into the page via document.cookie.

This seems sufficient for replay, and actually to be more accurate it should probably save the cookies from document.cookie itself, as only those cookies that are accessible to the document are needed for replay...

@phiresky
Copy link
Author

actual Cookie from the request is saved in another header

Huh, did not see that. That makes sense. So for replay it doesn't really matter (assuming the Cookie header doesn't have the same problem with concatenation), though it does decrease the fidelity of the resulting web archive somewhat.

ikreymer added a commit to webrecorder/wabac.js that referenced this issue Mar 11, 2021
…ected presetCookie after existing injected cookies from x-wabac-preset-cookie header.

- parse multiple comma separated cookies, as mentioned in webrecorder/archiveweb.page#19
- exclude HttpOnly cookies, as they won't be available in document.cookie
- exclude Secure cookies that are passed to http domain
likely fix for webrecorder/replayweb.page#30
ikreymer added a commit to webrecorder/wabac.js that referenced this issue Mar 11, 2021
…ected presetCookie after existing injected cookies from x-wabac-preset-cookie header.

- parse multiple comma separated cookies, as mentioned in webrecorder/archiveweb.page#19
- exclude HttpOnly cookies, as they won't be available in document.cookie
- exclude Secure cookies that are passed to http domain
likely fix for webrecorder/replayweb.page#30
bump to 2.6.4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants