Add custom metric listing the cookie store #116
Thanks, Rick! This looks nice. However, we seem to be missing httpOnly cookies. In my past measurements, approximately 12% of the cookies were httpOnly. Although httpOnly cookies are inherently inaccessible via JavaScript, could WPT offer a workaround to include these?
@nrllh I've parsed out the
Here's a more complex example using the NY Times site: https://www.webpagetest.org/result/240409_BiDcJS_79A/1/details/
Results
{
"allCookies": [
{
"domain": "nytimes.com",
"expires": 1744235576387.214,
"name": "nyt-a",
"partitioned": false,
"path": "/",
"sameSite": "none",
"secure": true,
"value": "kvLPILWWcZksweQIzek_84"
},
{
"domain": "nytimes.com",
"expires": 1712721175988.096,
"name": "nyt-gdpr",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "0"
},
{
"domain": "nytimes.com",
"expires": 1744235560009.721,
"name": "nyt-purr",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": true,
"value": "cfshcfhshckfhdfshg"
},
{
"domain": "nytimes.com",
"expires": 1712721160009.771,
"name": "nyt-geo",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "US"
},
{
"domain": "nytimes.com",
"expires": null,
"name": "nyt-b3-traceid",
"partitioned": false,
"path": "/",
"sameSite": "none",
"secure": true,
"value": "b80ea2fae43e46d7bf6a5e6173f731f5"
},
{
"domain": "nytimes.com",
"expires": 1744235563215.201,
"name": "nyt-jkidd",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "uid=0&lastRequest=1712699563194&activeDays=%5B0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1%5D&adv=1&a7dv=1&a14dv=1&a21dv=1&lastKnownType=anon&newsStartDate=&entitlements="
},
{
"domain": "nytimes.com",
"expires": 1746395564000,
"name": "__gads",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "ID=33af9a77d42a2448:T=1712699564:RT=1712699564:S=ALNI_MY9bOtga7JgSrtENBENRDaCC0Ofmg"
},
{
"domain": "nytimes.com",
"expires": 1746395564000,
"name": "__gpi",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "UID=00000ddc29532380:T=1712699564:RT=1712699564:S=ALNI_MaMDmuaf76zbnfw8SaKWQfJOUcDfg"
},
{
"domain": "nytimes.com",
"expires": 1728251564000,
"name": "__eoi",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "ID=2c19d8ca5b1def9b:T=1712699564:RT=1712699564:S=AA-AfjbRnKD9hK_1SIu0f71CtK_z"
},
{
"domain": "www.nytimes.com",
"expires": 1744235565280.481,
"name": "datadome",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": true,
"value": "PNzEPCDoc4b4D8LTHJ6zY7_iMlVP0gdQSv1W0EzrOPjnyGzVzWt1V2ddSDVIaJVGdFOuStnuFn6T_3w~26ZXOyn2cKdgolQMrE23wFzFcQfvxdrS_ZzKDgorih1ghW88"
},
{
"domain": "nytimes.com",
"expires": 1746827567000,
"name": "_cb",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": true,
"value": "DcFyVCDKsokJChqbCb"
},
{
"domain": "nytimes.com",
"expires": 1746827567000,
"name": "_chartbeat2",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": true,
"value": ".1712699567412.1712699567412.1.BcyRhCDzBEoEB9oAZHCkOBnXDCekYY.1"
},
{
"domain": "nytimes.com",
"expires": 1712701367000,
"name": "_cb_svref",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": true,
"value": "external"
},
{
"domain": "nytimes.com",
"expires": 1746827567000,
"name": "_v__chartbeat3",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": true,
"value": "fr4anC70D3oCCZb-a"
},
{
"domain": null,
"expires": 1712785968521.009,
"name": "_lr_geo_location_state",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": ""
},
{
"domain": null,
"expires": 1712785968522.1099,
"name": "_lr_geo_location",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "US"
},
{
"domain": "nytimes.com",
"expires": 1720475568000,
"name": "_gcl_au",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "1.1.887181119.1712699569"
},
{
"domain": "nytimes.com",
"expires": 1747259570204.0898,
"name": "iter_id",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhaWQiOiI2NjE1YjhiMmQzYjI0ZDAwMDEwMTk5MzQiLCJjb21wYW55X2lkIjoiNWMwOThiM2QxNjU0YzEwMDAxMmM2OGY5IiwiaWF0IjoxNzEyNjk5NTcwfQ.Xb-H1ZDdnwQ7LWj5UaMMlaXNrt5PPtpvKc-fadgoHto"
},
{
"domain": null,
"expires": 1712700479000,
"name": "_dd_s",
"partitioned": false,
"path": "/",
"sameSite": "none",
"secure": true,
"value": "rum=0&expire=1712700460858"
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"expires": 1744235580443,
"domain": ".openx.net",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": true
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"expires": 1720475580443,
"domain": ".3lift.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": true
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"expires": 2027195580443,
"domain": ".adnxs.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": true
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"expires": 1744235580443,
"domain": "casalemedia.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": true
},
{
"name": "purr-cache",
"value": "<K0<r<C_<G_<S0<a0<ua<T0",
"expires": null,
"domain": "nytimes.com",
"path": "/",
"sameSite": "lax",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "jkidd-s",
"value": "referrer",
"expires": null,
"path": "/",
"httpOnly": true,
"secure": false,
"partitioned": false
},
{
"name": "jkidd-p",
"value": "prevPage",
"expires": 1744235580443,
"path": "/",
"httpOnly": true,
"secure": false,
"partitioned": false
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"expires": 2027195580443,
"domain": ".adnxs.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": true
},
{
"name": "A3",
"value": "d",
"expires": 1744257180443,
"domain": ".yahoo.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "test_cookie",
"value": "CheckForPermission",
"expires": 1712700464000,
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "A3",
"value": "d",
"expires": 1744257180444,
"domain": ".yahoo.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"expires": 2027195580444,
"domain": ".adnxs.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": true
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"expires": 1728251580444,
"domain": "doubleclick.net",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": true
},
{
"name": "A3",
"value": "d",
"expires": 1744257180444,
"domain": ".yahoo.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "_sv3_7",
"value": "1",
"expires": 1712785966000,
"sameSite": "none",
"httpOnly": true,
"secure": false,
"partitioned": false
},
{
"name": "amuid2",
"value": "762b0357-e2e7-44ef-b7d1-d7bc3315212a",
"expires": 1744235566000,
"sameSite": "none",
"httpOnly": true,
"secure": false,
"partitioned": false
},
{
"name": "sd_amuid2",
"value": "762b0357-e2e7-44ef-b7d1-d7bc3315212a",
"expires": 1744235566000,
"sameSite": "none",
"httpOnly": true,
"secure": false,
"partitioned": false
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"expires": 1728251580445,
"domain": "doubleclick.net",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": true
},
{
"name": "test_cookie",
"value": "CheckForPermission",
"expires": 1712700468000,
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"expires": 2027195580445,
"domain": ".adnxs.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": true
},
{
"name": "ktcid",
"value": "fa33aab7-bfd2-0bf3-5c52-0e0c55dc2053",
"expires": 1744235580445,
"domain": "kargo.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "A3",
"value": "d",
"expires": 1744257180445,
"domain": ".yahoo.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "IDSYNC",
"value": "18y3~2hrx",
"expires": 1744257180445,
"domain": ".yahoo.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "test_cookie",
"value": "CheckForPermission",
"expires": 1712700464000,
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "IDE",
"value": "AHWqTUnZe7rTPzhVXrsS2p_XV72YErxlsz-d4FEYkp7vGccHuTHiHRN0ecRfYLfBrsE",
"expires": 1217630755000,
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "viewer_token",
"value": "7be52cd8-e7e9-4b31-816f-7d6b12897698",
"expires": null,
"sameSite": "none",
"httpOnly": true,
"secure": false,
"partitioned": false
},
{
"name": "IDSYNC",
"value": "\"18y3~2hrx:18z8~2hrx\"",
"expires": 1744257180445,
"domain": ".yahoo.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "A3",
"value": "d",
"expires": 1744257180445,
"domain": ".yahoo.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
},
{
"name": "V",
"value": "nC5po6UCCvki",
"expires": 1744235580446,
"domain": ".contextweb.com",
"path": "/",
"sameSite": "none",
"httpOnly": true,
"secure": true,
"partitioned": false
}
]
}
Thank you so much, Rick! Can we also have a flag for non-httpOnly cookies, like "httpOnly": false?
Done
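A minimal sketch of how the explicit flag could be filled in and the httpOnly share measured. `withHttpOnlyFlag` and `httpOnlyShare` are hypothetical helper names, not part of the actual custom metric, and the sample entries are shortened from the NY Times output above. The sketch assumes that entries without an `httpOnly` field were read via JavaScript and therefore cannot be httpOnly.

```javascript
// Hypothetical helpers, not the code used in the actual custom metric.

// Give every cookie an explicit boolean httpOnly flag.
// Assumption: entries without the field were read via JavaScript,
// so they cannot be httpOnly.
function withHttpOnlyFlag(cookies) {
  return cookies.map(cookie => ({ ...cookie, httpOnly: cookie.httpOnly === true }));
}

// Fraction of cookies flagged httpOnly (0 for an empty list).
function httpOnlyShare(cookies) {
  if (cookies.length === 0) return 0;
  const flagged = cookies.filter(cookie => cookie.httpOnly === true);
  return flagged.length / cookies.length;
}

// Shortened sample based on the NY Times output above.
const sampleCookies = [
  { name: 'nyt-a', domain: 'nytimes.com', secure: true },
  { name: 'nyt-geo', domain: 'nytimes.com', secure: false },
  { name: 'purr-cache', domain: 'nytimes.com', httpOnly: true, secure: true },
  { name: 'A3', domain: '.yahoo.com', httpOnly: true, secure: true }
];

const flaggedCookies = withHttpOnlyFlag(sampleCookies);
console.log(flaggedCookies.map(c => `${c.name}: ${c.httpOnly}`));
console.log(httpOnlyShare(flaggedCookies));
```

With an explicit boolean on every entry, a later query can filter on the field directly instead of treating a missing key as false.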
Custom metrics for https://almanac.httparchive.org/en/2022/
WPT test run results: http://webpagetest.httparchive.org/results.php?test=240409_9Z_S
Custom metrics for https://example.com/
WPT test run results: http://webpagetest.httparchive.org/results.php?test=240409_SN_T
{
"_cms": {
"wordpress": {
"block_theme": false,
"has_embed_block": false,
"embed_block_count": {
"total": 0,
"total_by_type": []
},
"scripts": [],
"content_type": {
"template": "unknown",
"post_type": "",
"taxonomy": ""
}
}
},
"_cookies": {
"allCookies": []
}
}
Custom metrics for https://web.dev/
WPT test run results: http://webpagetest.httparchive.org/results.php?test=240409_3D_V
{
"_cms": {
"wordpress": {
"block_theme": false,
"has_embed_block": false,
"embed_block_count": {
"total": 0,
"total_by_type": []
},
"scripts": [],
"content_type": {
"template": "unknown",
"post_type": "",
"taxonomy": ""
}
}
},
"_cookies": {
"allCookies": [
{
"domain": null,
"expires": 1747266788519.529,
"name": "_ga_devsite",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "GA1.2.3297890470.1712706776",
"httpOnly": false
},
{
"domain": null,
"expires": 1746402781000,
"name": "cookies_accepted",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "true",
"httpOnly": false
},
{
"domain": null,
"expires": 1728258781000,
"name": "django_language",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "en",
"httpOnly": false
},
{
"domain": "web.dev",
"expires": 1747266782465.985,
"name": "_ga",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "GA1.1.524183990.1712706782",
"httpOnly": false
},
{
"domain": "web.dev",
"expires": 1747266782530.719,
"name": "_ga_18JR3Q8PJ8",
"partitioned": false,
"path": "/",
"sameSite": "lax",
"secure": false,
"value": "GS1.1.1712706782.1.1.1712706782.0.0.0",
"httpOnly": false
}
]
}
}
Thank you!
Just seeing it; @rviscomi do you know why some domain fields are null?
We're interested in using this metric as part of the privacy 2024 almanac chapter (https://docs.google.com/document/d/1WJT9kfKHxwNl5HNAhIddefWC0vCvH3DUIxajT1gQWao/edit). However, we've noticed that for cookies set through JS, this metric only captures first-party cookies - third-party JS cookies set in an iframe aren't included, because @nrllh We're thinking that to capture all cookies, we would need to inject part of the custom metric code into third-party iframes to read the cookie store there. Do you know if that's possible, or whether we could extend the crawler to do that? |
@pmeenan That would be helpful! Do you have an example of how that could be implemented?
It would be a raw dump of the DevTools output into the HAR. That might blow up the size of the page data, though (raising the cost of all queries), so I'd want @rviscomi to weigh in. I could also do some level of processing to make it look more like the cookies custom metric output if size is a concern. Currently, something like this:
"_storage": {
"cookies": [
{
"name": "thirdparty",
"value": "yes",
"domain": "widgets.outbrain.com",
"path": "/nanoWidget/externals/cookie",
"expires": 1716142633.634125,
"size": 13,
"httpOnly": false,
"secure": true,
"session": false,
"sameSite": "None",
"priority": "Medium",
"sameParty": false,
"sourceScheme": "Secure",
"sourcePort": 443
},
{
"name": "sync",
"value": "CgoIoQEQtvztjvkxCgoI5gEQtvztjvkxCgoIhwIQtvztjvkxCgoItwIQtvztjvkxCgkIOhC2_O2O-TEKCQgbELb87Y75MQoKCIwCELb87Y75MQoKCKwCELb87Y75MQoKCK0CELb87Y75MQoJCF8Qtvztjvkx",
"domain": ".3lift.com",
"path": "/sync",
"expires": 1723915032.393132,
"size": 160,
"httpOnly": false,
"secure": true,
"session": false,
"sameSite": "None",
"priority": "Medium",
"sameParty": false,
"sourceScheme": "Secure",
"sourcePort": 443
},
{
"name": "receive-cookie-deprecation",
"value": "1",
"domain": ".adnxs.com",
"path": "/",
"expires": 1750699073.955507,
"size": 27,
"httpOnly": true,
"secure": true,
"session": false,
"sameSite": "None",
"priority": "Medium",
"sameParty": false,
"sourceScheme": "Secure",
"sourcePort": 443,
"partitionKey": "https://cnn.com"
}
]
},
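The "some level of processing" mentioned above could be sketched roughly as follows. This is an illustration only: `trimCookie` and `KEPT_FIELDS` are hypothetical names, and the choice of which fields to keep is an assumption rather than anything decided in this thread. The field names themselves come from the raw dump above.

```javascript
// Hypothetical trimming step; the selection of fields is an assumption.
// Notably, `value` is dropped, since cookie values can be long.
const KEPT_FIELDS = [
  'name', 'domain', 'path', 'expires',
  'httpOnly', 'secure', 'sameSite', 'partitionKey'
];

// Copy only the kept fields that are actually present on the raw cookie.
function trimCookie(rawCookie) {
  const trimmed = {};
  for (const field of KEPT_FIELDS) {
    if (field in rawCookie) trimmed[field] = rawCookie[field];
  }
  return trimmed;
}

// One entry from the raw dump above.
const rawCookie = {
  name: 'thirdparty',
  value: 'yes',
  domain: 'widgets.outbrain.com',
  path: '/nanoWidget/externals/cookie',
  expires: 1716142633.634125,
  size: 13,
  httpOnly: false,
  secure: true,
  session: false,
  sameSite: 'None',
  priority: 'Medium',
  sameParty: false,
  sourceScheme: 'Secure',
  sourcePort: 443
};

const trimmedCookie = trimCookie(rawCookie);
console.log(JSON.stringify(trimmedCookie, null, 2));
console.log(JSON.stringify(rawCookie).length, JSON.stringify(trimmedCookie).length);
```

Fields absent from a given cookie (such as `partitionKey` on unpartitioned cookies) are simply left out rather than written as null, which keeps the output smaller.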
The other option would be to make the cookie data available to custom metrics in their raw form (say, as |
The current custom metric's output is pretty long already - will that turn into an issue? I think the only data we actually need is a list of top cookie names and domains setting cookies (perhaps split into tracking/non-tracking) - if we had |
Done. It's worth noting that it is only available on the httparchive instance (webpagetest.httparchive.org) and as part of the HTTP Archive crawl and won't work on the public WebPageTest. |
@pmeenan Awesome, thanks! Is there any other way to test the metric implementation, or should I just assume that the output is identical to the |
You can test it with arbitrary custom metrics on https://webpagetest.httparchive.org/ or the PR activity for custom metrics will do it automatically. I tested it manually with a metric that simply was:
Just to make sure it was working.
If I were building a new custom metric, I'd use something like the above to extract the value for a site (like cnn.com) and then write the custom metric in JS, testing locally with the value that came back from that one test. Once it is working the way I want, I'd swap the test data back to $WPT_COOKIES and try it out as a custom metric.
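The workflow described above could look roughly like this. It is a sketch under assumptions: `summarizeCookies` is a hypothetical metric body (counting cookie names and domains, as discussed earlier in the thread), and `capturedCookies` is a shortened, made-up stand-in for a value captured from a real test run.

```javascript
// Hypothetical metric logic: count cookies per name and per domain.
function summarizeCookies(cookies) {
  // Null-prototype objects so unusual cookie names cannot collide
  // with built-in object properties.
  const names = Object.create(null);
  const domains = Object.create(null);
  for (const { name, domain } of cookies ?? []) {
    const domainKey = domain ?? '(none)';
    names[name] = (names[name] || 0) + 1;
    domains[domainKey] = (domains[domainKey] || 0) + 1;
  }
  return { names, domains };
}

// Step 1: develop locally against a value captured from one test run.
// (Shortened stand-in data, not a real capture.)
const capturedCookies = [
  { name: 'receive-cookie-deprecation', domain: '.adnxs.com', httpOnly: true },
  { name: 'A3', domain: '.yahoo.com', httpOnly: true },
  { name: 'IDSYNC', domain: '.yahoo.com', httpOnly: true }
];
const localResult = summarizeCookies(capturedCookies);
console.log(JSON.stringify(localResult, null, 2));

// Step 2: once the logic works, swap the test data for the crawler-provided
// value inside the custom metric, for example:
//   return summarizeCookies($WPT_COOKIES);
```

Keeping the logic in a plain function makes the swap in step 2 a one-line change.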
@pmeenan @rviscomi Is anything relying on the cookies custom metric currently? Given Patrick's earlier comment:
It seems like that would be an issue with the current metric as well, but my understanding is that there hasn't been a large-scale crawl since this was merged to know whether it's a problem. I'm proposing to replace the cookies metric with this:
Which would capture more cookies but have a smaller total size. It would be sufficient for the privacy almanac chapter work, but I don't want to remove anything that other analyses depend on.
What if we omitted |
@rviscomi That's fine with me. So you are OK with doing the above, but with just |
I would like to keep the structure shared in this comment, if the output size is not a significant issue. For many potential further studies, the other fields are essential.
There's not really a hard limit, just side effects, so my ask would be to make sure you REALLY need everything you are collecting, and to consider whether it makes sense to have protections in place to make sure it doesn't explode. We already have a couple of metrics that have that problem (rendered html and some of the CSS metrics - combine those with inline base64-encoded fonts and...). Each page record has a 10MB limit, which includes the summary stats, lighthouse data and custom metrics. It used to be the case that records would be dropped if they exceeded the limit, but we recently put in protection to drop parts of the data to get the record under the needed size (starting by dropping individual custom metrics that are over 100k). Probably more of a concern is that the size of the data directly impacts everyone's query costs for querying any custom metrics, since they are all in the same column. Neither is a hard limit, just things we should take into account when adding metrics that might explode in edge cases.
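One possible protection of the kind mentioned above, sketched under assumptions: `capCookies` and `MAX_METRIC_BYTES` are hypothetical names, the budget simply reuses the 100k per-metric figure from the comment (interpreted here as 100,000 bytes), and the size estimate covers only the serialized cookie array, not the surrounding metric object.

```javascript
// Assumed budget, based on the 100k per-metric figure mentioned above.
const MAX_METRIC_BYTES = 100 * 1000;

// UTF-8 size of a value once serialized as JSON.
function byteLength(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

// Keep whole cookies from the front of the list until the budget is reached,
// and record how many were dropped so the truncation is visible in the data.
function capCookies(cookies, maxBytes = MAX_METRIC_BYTES) {
  const kept = [];
  let used = 2; // the enclosing "[" and "]"
  for (const cookie of cookies) {
    const cost = byteLength(cookie) + 1; // +1 for the separating comma
    if (used + cost > maxBytes) break;
    kept.push(cookie);
    used += cost;
  }
  return { cookies: kept, dropped: cookies.length - kept.length };
}

// Demo with a deliberately tiny budget.
const manyCookies = Array.from({ length: 5 }, () => ({ name: 'a', value: 'x'.repeat(50) }));
const capped = capCookies(manyCookies, 200);
console.log(capped.cookies.length, capped.dropped, byteLength(capped.cookies));
```

Recording the `dropped` count is a design choice: an analysis can then tell a page with few cookies apart from a page whose list was cut short.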
Hello, Thanks for exposing
// [cookies]
// Return each cookie with only the fields listed below; every other property is dropped.
return $WPT_COOKIES?.map(cookie => {
  const {name, domain, path, expires, size, httpOnly, secure, session, sameSite, sameParty, partitionKey, partitionKeyOpaque} = cookie;
  return {name, domain, path, expires, size, httpOnly, secure, session, sameSite, sameParty, partitionKey, partitionKeyOpaque};
});
i.e., without these properties:
Thanks!
Progress on #112
Adds a custom metric to dump the contents of the cookie jar
Test websites: