Finalize assignments: Chapter 8. Security #10

Closed
3 tasks done
rviscomi opened this issue May 21, 2019 · 36 comments

@rviscomi
Member

rviscomi commented May 21, 2019

Section: II. User Experience
Chapter: 8. Security
Authors: @arturjanc @ScottHelme
Reviewers: @paulcalvano @bazzadp @ghedo @ndrnmnn

Due date: To help us stay on schedule, please complete the action items in this issue by June 3.

To do:

  • Assign subject matter experts (coauthors)
  • Assign peer reviewers
  • Finalize metrics

Current list of metrics:

TLS 🔒

  • Protocol Usage
    • SSLv2 / SSLv3 / TLSv1.0 / TLSv1.1 / TLSv1.2 / TLSv1.3
  • Unique CA issuers
  • RSA certificates
  • ECDSA certificates
  • Certificate validation level (DV / OV / EV)
  • Cipher suite usage
    • Suites supporting Forward Secrecy (ECDHE / DHE)
    • Authenticated suites (GCM / CCM)
    • Modern suites (AES GCM, ChaCha20-Poly1305)
    • Legacy suites (AES CBC, 3DES, RC4)
  • OCSP Stapling
  • Session ID/Ticket assignment
  • Sites redirecting to HTTPS
  • Sites with degraded HTTPS UI (mixed-content)

Security Headers 📋

  • Content Security Policy
    • Policies with frame-ancestors
    • Policies with 'nonce-*'
    • Policies with 'hash-*'
    • Policies with 'unsafe-inline'
    • Policies with 'unsafe-eval'
    • Policies with 'strict-dynamic'
    • Policies with 'trusted-types'
    • Policies with 'upgrade-insecure-requests'
  • HTTP Strict Transport Security
    • Variance in max-age
    • Use of includeSubDomains
    • Use of preload token
  • Network Error Logging
  • Report-To
  • Referrer Policy
  • Feature Policy
  • X-Content-Type-Options
  • X-XSS-Protection
  • X-Frame-Options
  • Cross-Origin-Resource-Policy
  • Cross-Origin-Opener-Policy
  • Vary (Sec-Fetch-* values)

Cookies 🍪

  • Use of HttpOnly
  • Use of Secure
  • Use of SameSite
  • Use of prefixes

Other ❓

  • Use of SRI on subresources
  • Vulnerable JS libraries (lighthouse?)
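
To make the header- and cookie-related items above concrete, here is a minimal Python sketch of the kind of classification the analysts' queries would perform. It is illustrative only (the real analysis will be written as BigQuery SQL over HTTP Archive response data), and the helper names and example values are assumptions.

```python
# Illustrative sketch only -- the real analysis will be BigQuery SQL over the
# HTTP Archive response data. Helper names and example values are assumptions.

def classify_security_headers(headers):
    """Flag which of the headers listed above are present on a response."""
    names = {name.lower() for name in headers}
    return {
        "csp": "content-security-policy" in names,
        "hsts": "strict-transport-security" in names,
        "referrer_policy": "referrer-policy" in names,
        "x_content_type_options": "x-content-type-options" in names,
        "x_frame_options": "x-frame-options" in names,
    }

def parse_hsts(value):
    """Extract max-age / includeSubDomains / preload from an HSTS value."""
    directives = [d.strip().lower() for d in value.split(";")]
    max_age = next((int(d.split("=", 1)[1]) for d in directives
                    if d.startswith("max-age=")), None)
    return {
        "max_age": max_age,
        "include_subdomains": "includesubdomains" in directives,
        "preload": "preload" in directives,
    }

def classify_cookie(set_cookie):
    """Check HttpOnly / Secure / SameSite / prefix attributes of one Set-Cookie value."""
    attrs = [part.strip().lower() for part in set_cookie.split(";")]
    name = set_cookie.split("=", 1)[0].strip()
    return {
        "httponly": "httponly" in attrs,
        "secure": "secure" in attrs,
        "samesite": any(a.startswith("samesite=") for a in attrs),
        "prefixed": name.startswith(("__Secure-", "__Host-")),
    }

print(parse_hsts("max-age=31536000; includeSubDomains; preload"))
print(classify_cookie("session=abc123; Path=/; Secure; HttpOnly; SameSite=Lax"))
```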

👉 AI (coauthors): Assign peer reviewers. These are trusted experts who can support you when brainstorming metrics, interpreting results, and writing the report. Ideally this chapter will have 2 or more reviewers who can promote a diversity of perspectives.

👉 AI (coauthors): Finalize which metrics you might like to include in an annual "state of web security" report powered by HTTP Archive. Community contributors have initially sketched out a few ideas to get the ball rolling, but it's up to you, the subject matter experts, to know exactly which metrics we should be looking at. You can use the brainstorming doc to explore ideas.

The metrics should paint a holistic, data-driven picture of the web security landscape. The HTTP Archive does have its limitations and blind spots, so if there are metrics out of scope it's still good to identify them now during the brainstorming phase. We can make a note of them in the final report so readers understand why they're not discussed and the HTTP Archive team can make an effort to improve our telemetry for next year's Almanac.

Next steps: Over the next couple of months analysts will write the queries and generate the results, then hand everything off to you to write up your interpretation of the data.

Additional resources:

@rviscomi rviscomi transferred this issue from HTTPArchive/httparchive.org May 21, 2019
@rviscomi rviscomi added this to the Chapter planning complete milestone May 21, 2019
@rviscomi rviscomi changed the title from "[Web Almanac] Finalize assignments: Chapter 8. Security" to "Finalize assignments: Chapter 8. Security" May 21, 2019
@tunetheweb
Member

Happy to help review this section if you want, as it looks like you are light on reviewers?

@rviscomi
Member Author

That'd be great, thanks @bazzadp!

@ghedo
Member

ghedo commented May 31, 2019

Other metrics that could be interesting (though I don't know how feasible it would be to collect them):

  • Count of sites that support TLS 1.3 0-RTT
  • Distribution of certificate types (ECDSA vs. RSA)
  • Count of sites that only support legacy TLS versions (1.0, 1.1). This could be useful as input to https://datatracker.ietf.org/doc/draft-ietf-tls-oldversions-deprecate/
  • Count of sites that only support legacy signature_algorithms when TLS 1.2 is negotiated (i.e. the algorithms using MD5 and SHA-1). This could be useful as input to https://datatracker.ietf.org/doc/draft-lvelvindron-tls-md5-sha1-deprecate/
  • Count of sites that support modern ciphers (e.g. AES GCM, ChaCha20-Poly1305)
  • Count of sites that only support legacy ciphers (e.g. AES CBC, 3DES, RC4, ...)
  • Count of sites that support OCSP stapling
  • Count of sites that support forward secrecy (i.e. they support (EC)DHE)

I'm also available if more reviewers are needed.

@rviscomi
Member Author

rviscomi commented Jun 1, 2019

Thanks @ghedo! I've added you as a reviewer and sent you a team invite.

@tunetheweb
Member

tunetheweb commented Jun 1, 2019

Those are all great stats to know @ghedo but I would caution the authors to be careful not to make it just about SSL/TLS. That's very important obviously, but we all know there is more to security than just that. Other resources (e.g. https://www.ssllabs.com/ssl-pulse/) measure the nitty-gritty details of SSL/TLS usage well for those of us really interested in the deep-down detail of this.

In another post (#1) @rviscomi suggested about 10 metrics per chapter, and I think if more than 3-4 of those for this chapter were about SSL/TLS then we could be at risk of concentrating on that too much and missing out on other interesting analysis.

IMHO we need to think about finding the stats that will be the most useful to the wider community and not just the security community for this report. And that might mean picking stats that represent the rough state of security (e.g. TLS version) rather than something more specific (specific cipher suites). Finding this balance of representative detail versus too much detail is something I’m also struggling with for my chapter on HTTP/2 (#22) btw.

Anyway just my two cents, and don’t want to put people off suggesting metrics (it’s easier to whittle down a big list than to stretch up a little list!) but something I’m giving a bit of thought to for my chapter so thought I’d mention here too for consideration.

@tunetheweb
Member

tunetheweb commented Jun 1, 2019

Oh, and one other point (and a counter-argument to my points above!): some stats will probably throw out a few surprises, so we should also be careful not to limit ourselves too much on stats because of preconceived ideas of what they will show. We can always exclude stats from the final report if they don't show anything interesting.

@ghedo
Member

ghedo commented Jun 1, 2019

@bazzadp I agree with you, and indeed I tried to list more general metrics (e.g. "sites that support modern/legacy ciphers"), rather than more specific ones (e.g. "distribution of specific ciphers"), though there's probably margin for improvement. Happy to discuss this more to try and get the list more focused (it's also likely that some of the metrics I proposed can't be easily collected anyway).

But it's worth noting that things like TLS versions, ciphers, certificate types, forward secrecy, and specific TLS features (e.g. OCSP stapling and 0-RTT) are things that people who maintain websites generally might have to deal with directly (because they maintain their own webserver) or indirectly (because some CDNs offer some additional configuration options), so it seems like it would be useful to have at least an overview of the status of these things.

@tunetheweb
Member

Agreed. Gonna be a tough call to cull down to a list of just 10 or so metrics!

@rviscomi
Member Author

rviscomi commented Jun 1, 2019

Don't feel limited by the 10 metrics suggestion. If you think 25 is manageable, go for it! That said, I do think you want to dedupe similar metrics so your report is holistic and easy to read.

@rviscomi
Member Author

rviscomi commented Jun 4, 2019

@arturjanc @ScottHelme we're hoping to finalize the metrics for each chapter today. Can you look through #10 (comment) and modify it to include anything we're missing? @ghedo made a bunch of suggestions in #10 (comment) that should be merged if they LGTY.

@paulcalvano @bazzadp @ghedo as reviewers, please also give the list of metrics one last look and shout if you think anything should be changed.

Once the metrics are in a good place, please tick the last TODO checkbox and close this issue.

@ScottHelme
Contributor

Other metrics that could be interesting (though I don't know how feasible it would be to collect them):

  • Count of sites that support TLS 1.3 0-RTT

Will the data be able to provide this given the requirement for a second, resumed connection?

  • Distribution of certificate types (ECDSA vs. RSA)

Similar to above, it'd have to connect twice with a preferred suite at the top for each key type to know if a host was exclusively using ECDSA/RSA or just one key type for auth. Also need to know the client advertised suites on the connection.

Agreed, given the pending deprecation these would be worrying.

Agreed.

  • Count of sites that support modern ciphers (e.g. AES GCM, ChaCha20-Poly1305)
  • Count of sites that only support legacy ciphers (e.g. AES CBC, 3DES, RC4, ...)
  • Count of sites that support forward secrecy (i.e. they support (EC)DHE)

Linked to above again, we'd need multiple connections to determine overall support or we could just use the suite connected with.

  • Count of sites that support OCSP stapling

Yep.

@ScottHelme
Contributor

I'd like to suggest the inclusion of more headers than CSP, HSTS and FP. As a minimum suggestion:

  • Referrer Policy

In addition to that some of the older 'x-based' headers:

  • X-Content-Type-Options
  • X-Xss-Protection
  • X-Frame-Options

Given how new they are I think it'd be interesting to see features around the new Reporting API and other security-related monitoring mechanisms:

  • Report-To
  • NEL (Network Error Logging)
  • Expect-CT

@rviscomi
Member Author

rviscomi commented Jun 4, 2019

@pmeenan could you do a quick sanity check on the metrics suggested here to confirm that the Chrome profile is able to capture them? For example, I think I recall OCSP stapling detection only being available in Firefox agents. If there are any flags/configs that need to be turned on to get any of this info, we should identify those before the July crawl.

Count of sites that support TLS 1.3 0-RTT
Will the data be able to provide this given the requirement for a second, resumed connection?

Is having two requests on the page over the same TLS 1.3 connection sufficient? If so, yes, it's something we can measure. If it requires a second page view then no.

@tunetheweb
Member

tunetheweb commented Jun 4, 2019

What about pages being marked insecure?

  • HTTP (not HTTPS) pages with Credit Card or Password fields.
  • HTTP (not HTTPS) pages with any input fields.
  • HTTPS pages with mixed content.

As mentioned above, I’d also like more non-HTTPS related stats. Here’s ones I can think of:

  • Amount of 3rd party content
  • Ads or Trackers per page (more privacy than security but can cause security issues and since we don’t have a privacy chapter...)
  • A measure of Cookies and what security options they use (HttpOnly, Secure, SameSite... etc.).
  • CSP is now mainstream so I think we need more than just a measure of whether the header exists. Some analysis to try to see if it's a useful policy (e.g. no unsafe-inline for script-src at least?). Maybe measure upgrade-insecure-requests type policies separately (a common one I suspect, that's still useful but not really a CSP as such if it still allows everything and is only being used to migrate to HTTPS).
  • Average (or Total?) number of CSP alerts per page?
  • I like the suggested stat about vulnerable libraries. Not sure how to measure it but something like this would be good. Limit to jQuery as an example? Or jQuery, Bootstrap, Angular, AngularJS, React and Vue? Or all libraries somehow?
  • SRI usage? Though personally I think it’s a bit pointless and better to self host (https://mobile.twitter.com/tunetheweb/status/1134559858353745923).

Oh and one more HTTPS stat:

  • Base domains (e.g. example.com) where the certificate doesn't cover the www variant (e.g. www.example.com) or vice versa (see the sketch below).
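
A rough sketch of that base-domain vs. www coverage check, assuming the certificate's subjectAltName list were available from the crawl data (an assumption, not something HTTP Archive is known to expose); the function names and the single-label wildcard rule shown are illustrative:

```python
# Rough sketch of the "does the certificate cover both example.com and
# www.example.com?" check. The SAN list is assumed to come from certificate
# data collected during the crawl.

def san_matches(san, hostname):
    """Match one subjectAltName entry against a hostname (single-label wildcard rule)."""
    san, hostname = san.lower(), hostname.lower()
    if san.startswith("*."):
        # *.example.com matches www.example.com but NOT example.com itself.
        return hostname.count(".") == san.count(".") and hostname.endswith(san[1:])
    return san == hostname

def covers_both_variants(san_list, base_domain):
    """True if the certificate covers both the apex domain and its www variant."""
    apex_ok = any(san_matches(s, base_domain) for s in san_list)
    www_ok = any(san_matches(s, "www." + base_domain) for s in san_list)
    return apex_ok and www_ok

print(covers_both_variants(["*.example.com"], "example.com"))                  # False: wildcard misses the apex
print(covers_both_variants(["example.com", "*.example.com"], "example.com"))  # True
```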

@rviscomi
Member Author

rviscomi commented Jun 4, 2019

Vulnerable JS libraries is thankfully available as a Lighthouse audit.

As for CSP, +1 to everything. I'm also curious about policy length which might be an indicator of indiscriminate generation by some plugin/tool versus smaller more hand-crafted policies. It could also be a symptom of having too many third parties, so splitting by that dimension could be interesting.

@rviscomi
Member Author

rviscomi commented Jun 4, 2019

Can someone take a stab at coalescing all of the suggested metrics into the top comment?

@tunetheweb
Member

I'm also curious about policy length which might be an indicator of indiscriminate generation by some plugin/tool versus smaller more hand-crafted policies.

I'd say the opposite - the longer the policy, the more likely it's been custom generated. Case in point: https://securityheaders.com/?q=twitter.com&followRedirects=on. But yeah, I think it would be good to see some metrics on length.

Can someone take a stab at coalescing all of the suggested metrics into the top comment?

Will leave @arturjanc and @ScottHelme to do that. Think I saw on Twitter that Scott is travelling at the mo.

@ghedo
Member

ghedo commented Jun 4, 2019

@ScottHelme it was probably worded wrong, but what I meant with the "Count of sites that support modern/legacy ciphers", as well as certificate types and forward secrecy, was to check what the site negotiates by default, so we would just need a single connection and there would be no need to scan the whole configuration (like SSL Labs does).

That is, given, say, a browser with a modern TLS configuration that nevertheless supports legacy algorithms, if the browser connection ends up using a legacy cipher suite then it means the site either prefers that legacy configuration or simply doesn't support a modern one. So we can use that as an indication of what normal web users would end up seeing on that particular site.

Also to be clear, the legacy/modern metrics would aggregate multiple negotiated ciphers into those two categories, so we wouldn't have separate metrics for each cipher, just "modern" vs. "legacy" depending on what is negotiated for the connection.
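
To illustrate the single-connection approach described above, here is a hedged Python sketch that buckets the negotiated suite name into those two categories and flags forward secrecy. It assumes OpenSSL-style suite names, and the keyword lists are an illustrative approximation rather than an authoritative classification.

```python
# Sketch of bucketing the *negotiated* cipher suite (one connection per site)
# into the "modern" vs. "legacy" categories described above. Matching on
# OpenSSL-style suite names is an approximation, not an authoritative mapping.

MODERN_MARKERS = ("GCM", "CHACHA20", "POLY1305")   # AES-GCM, ChaCha20-Poly1305
LEGACY_MARKERS = ("CBC", "3DES", "RC4")            # note: some CBC suites omit "CBC" from their name
FORWARD_SECRECY_MARKERS = ("ECDHE", "DHE")

def classify_negotiated_suite(suite_name):
    """Classify one negotiated suite string, e.g. 'ECDHE-RSA-AES128-GCM-SHA256'."""
    name = suite_name.upper()
    return {
        "modern": any(marker in name for marker in MODERN_MARKERS),
        "legacy": any(marker in name for marker in LEGACY_MARKERS),
        "forward_secrecy": any(marker in name for marker in FORWARD_SECRECY_MARKERS),
    }

print(classify_negotiated_suite("ECDHE-RSA-AES128-GCM-SHA256"))  # modern, forward secrecy
print(classify_negotiated_suite("DES-CBC3-SHA"))                 # legacy (3DES CBC), no forward secrecy
```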

@arturjanc
Contributor

I'm a little late to the party but I wanted to add some more metrics and a few comments. Hopefully we can afterwards integrate all the ideas into a more coherent list as @bazzadp suggested above.

Some more security features, primarily focused on isolation (their use will be low this year):

Trusted Types:

  • The Content-Security-Policy header with a trusted-types directive.

Flavors of Content Security Policy:

  • CSPs which prevent framing, i.e. include frame-ancestors. It may make sense to combine this with the reporting for X-Frame-Options which serves the same purpose; rather than reporting two different values for complementary mechanisms we could have a "this site protects itself from framing" metric that looks at both headers.
  • Policies which use CSP2 nonces/hashes and CSP3 'strict-dynamic'.
  • Policies which try to protect against XSS and don't have 'unsafe-inline'.
  • Policies with 'upgrade-insecure-requests'

I would also propose to remove X-XSS-Protection -- for better or worse it's now a Chrome-specific not-fully-maintained feature and the value is being explicitly set to 0 by many major webapps due to cross-origin information leaks. It might be best to not promote its further adoption.

One caveat about this is that some of these protections only make sense for origins with sensitive authenticated content, so the coverage may be quite low. I think it's perfectly fine (e.g. domains without login don't really need XSS protections), so it would be nice to convey this somehow.

@paulcalvano
Contributor

  • Certificate Transparency compliance might be misleading to report on. In its current state, the % of compliant certificates is largely based on how new the certificates are. This might be a more meaningful metric to track for next year's Almanac.
  • I don't think we can identify 0-RTT support in this data.
  • cipher strength would be interesting to track, especially by 3rd parties.
  • SubResourceIntegrity usage would be interesting

@rviscomi rviscomi added the "ASAP" (this issue is blocking progress) label Jun 6, 2019
@rviscomi
Member Author

rviscomi commented Jun 6, 2019

We need to finalize the metrics for this chapter ASAP. Could someone update #10 (comment) with the agreed list of metrics? If there are any iffy metrics, let's include them anyway with a note for the Data Analyst team.

@rviscomi
Member Author

rviscomi commented Jun 7, 2019

Hate to nag, but we need to get this resolved today to stay on schedule. There's a lot of good discussion in the comments but the final metrics list needs to reflect what the consensus is.

@arturjanc @ScottHelme could you make the call?

@rviscomi
Member Author

🛎 ping to get this closed out as soon as possible, it's one week overdue

@rviscomi
Member Author

@arturjanc @ScottHelme do either of you have time today to update the list of metrics in the top comment with the consensus from the thread? Would love to close this issue today and unblock the analysis.

@ScottHelme
Contributor

ScottHelme commented Jun 13, 2019

Here's my attempt to cover all of the metrics discussed so far that seem to be viable.


  • TLS 🔒
    • Protocol Usage
      • SSLv2 / SSLv3 / TLSv1.0 / TLSv1.1 / TLSv1.2 / TLSv1.3
    • Unique CA issuers
    • RSA certificates
    • ECDSA certificates
    • Certificate validation level (DV / OV / EV)
    • Cipher suite usage
      • Suites supporting Forward Secrecy (ECDHE / DHE)
      • Authenticated suites (GCM / CCM)
      • Modern suites (AES GCM, ChaCha20-Poly1305)
      • Legacy suites (AES CBC, 3DES, RC4)
    • OCSP Stapling
    • Session ID/Ticket assignment
    • Sites redirecting to HTTPS
    • Sites with degraded HTTPS UI (mixed-content)

  • Security Headers 📋
    • Content Security Policy
      • Policies with frame-ancestors
      • Policies with 'nonce-*'
      • Policies with 'hash-*'
      • Policies with 'unsafe-inline'
      • Policies with 'unsafe-eval'
      • Policies with 'strict-dynamic'
      • Policies with 'trusted-types'
      • Policies with 'upgrade-insecure-requests'
    • HTTP Strict Transport Security
      • Variance in max-age
      • Use of includeSubDomains
      • Use of preload token
    • Network Error Logging
    • Report-To
    • Referrer Policy
    • Feature Policy
    • X-Content-Type-Options
    • X-XSS-Protection
    • X-Frame-Options
    • Cross-Origin-Resource-Policy
    • Cross-Origin-Opener-Policy
    • Vary (Sec-Fetch-* values)

  • Cookies 🍪
    • Use of HttpOnly
    • Use of Secure
    • Use of SameSite
    • Use of prefixes

  • Other ❓
    • Use of SRI on subresources
    • Vulnerable JS libraries (lighthouse?)

@ScottHelme ScottHelme reopened this Jun 13, 2019
@tunetheweb
Member

tunetheweb commented Jun 13, 2019

Looks pretty good to me! I say update first comment with those and then close out this issue at some point tomorrow if there's no more comments.

Few other thoughts from me:

  • TLS: What about TLS warnings and errors (e.g. HTTPS sites with mixed content, HTTP sites with Credit Card or password fields or any forms, HTTPS sites that are blocked due to inadequate TLS or being on the Safe Browsing list)?
  • TLS: Sites with EV? Know it's not liked by many in the security industry but may be interesting to see. Especially if it declines over the years.
  • CSP - should we differentiate between unsafe-inline in script-src and style-src? Imagine the latter might be common for some CMSs to allow styled content, and IMHO not quite as dangerous as allowing it in script-src.
  • CSP - measure use of upgrade-insecure-requests? Imagine this is a common one to help sites migrate to HTTPS even if they don't use any other, more difficult to implement, CSP features.
  • Other - Don't forget "Vulnerable JavaScript libraries". Think this is an important one and apparently easy enough to measure from the above discussion.

@arturjanc
Contributor

Thanks a lot for synthesizing this, @ScottHelme, and apologies for not having time earlier this week! A couple of answers / notes:

CSP - should we differentiate between unsafe-inline in script-src and style-src?

tl;dr Yes. More broadly, determining the security of a policy is quite difficult because of CSP's inheritance logic and ignoring some keywords (e.g. 'unsafe-inline' is ignored when the policy has a nonce/hash, and a policy can be safe even without script-src, e.g. default-src 'none'). To get this right it might be helpful to use a library like the CSP Evaluator possibly with some tweaks to align it with what we need here.
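
As a concrete (and deliberately simplified) illustration of that logic, the Python sketch below checks whether 'unsafe-inline' would actually be active, honoring the script-src → default-src fallback and the nonce/hash override. It is not the CSP Evaluator, and real policies have many more corner cases.

```python
# Deliberately simplified sketch of the CSP logic above: script-src falls back
# to default-src, and 'unsafe-inline' is neutralized once a nonce or hash is
# present. This is not the CSP Evaluator; real policies have more corner cases.

def parse_csp(policy):
    """Split a CSP string into {directive: [source expressions]}."""
    out = {}
    for directive in policy.split(";"):
        parts = directive.strip().split()
        if parts:
            out[parts[0].lower()] = [p.lower() for p in parts[1:]]
    return out

def script_sources(policy):
    """script-src governs scripts; otherwise fall back to default-src."""
    csp = parse_csp(policy)
    return csp.get("script-src", csp.get("default-src", []))

def allows_inline_script(policy):
    """True if inline script would actually execute under this policy."""
    sources = script_sources(policy)
    has_nonce_or_hash = any(
        s.startswith(("'nonce-", "'sha256-", "'sha384-", "'sha512-")) for s in sources
    )
    return "'unsafe-inline'" in sources and not has_nonce_or_hash

print(allows_inline_script("default-src 'none'"))                      # False
print(allows_inline_script("script-src 'unsafe-inline' https:"))       # True
print(allows_inline_script("script-src 'nonce-abc' 'unsafe-inline'"))  # False (nonce wins)
```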

CSP - measure use of upgrade-insecure-requests?

+1. This would also tie in nicely with your mixed content idea above.

Vary

I would look specifically for Vary: Sec-Fetch-Site or one of the other Sec-Fetch-* headers.

LGTM otherwise.

@ScottHelme
Contributor

Updated with additional comments.

Agree with @arturjanc on the finer details around CSP and, looking at it more closely, I'm not sure how we'd integrate the evaluator. I will leave that up to greater minds to decide ;-)

@tunetheweb
Member

LGTM. @ScottHelme can you edit @rviscomi's first comment in this issue (you should have edit permissions even though it's his comment) to replace the current metrics with this, and also tick the "Finalise metrics" checkbox in that comment and then Close the issue? Would do it myself but don't want to overstep my role as "reviewer" here ;-)

@rviscomi
Member Author

Updated the metrics in the top comment. Closing now. Thanks everyone!

@ScottHelme
Contributor

Can the data tell us about the presence of a file? I'm thinking the security.txt file like this:

https://scotthelme.co.uk/.well-known/security.txt

@rviscomi
Member Author

Unfortunately we're only aware of what is transferred over the network in the normal course of the page load. The exception is for Lighthouse audits that specifically check for these files, like the SEO audit for robots.txt. I don't think we will get it in time for this year's Almanac, but it might be a good idea to add a new Security audit to Lighthouse that checks this file.
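
Purely as an illustration of what such a stand-alone check could look like outside the crawl (this is not something Lighthouse or HTTP Archive does today), a minimal Python sketch:

```python
# Hypothetical stand-alone check for a security.txt file -- not something the
# HTTP Archive crawl or Lighthouse does today, just an illustration.
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

def has_security_txt(origin):
    """Return True if <origin>/.well-known/security.txt responds with HTTP 200."""
    url = origin.rstrip("/") + "/.well-known/security.txt"
    try:
        with urlopen(url, timeout=10) as response:
            return response.status == 200
    except (HTTPError, URLError):
        return False

print(has_security_txt("https://scotthelme.co.uk"))
```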

@rviscomi rviscomi removed the "ASAP" (this issue is blocking progress) label Jun 20, 2019
@ndrnmnn

ndrnmnn commented Jun 24, 2019

@rviscomi As discussed offline, it might be interesting to track the usage of the WebAuthn API. A good read on general 2FA adoption across sites is Elie's blog post The bleak picture of two-factor authentication adoption in the wild.

@rviscomi
Member Author

rviscomi commented Jun 24, 2019

Thanks @ndrnmnn! Do you know of the corresponding feature counter we could look at? For example, perhaps CredentialManagerGetPublicKeyCredential?

I've also sent you an invitation to be a reviewer of this chapter. Thanks again!

@ndrnmnn

ndrnmnn commented Jun 25, 2019

@rviscomi I checked some of the available metrics in chromestatus.com which seem to represent a funnel from credential creation to storage. We could therefore also include the CredentialManagerStore metric which in my understanding would represent the total number of credentials stored. WDYT?

@rviscomi
Member Author

The chromestatus feature counters are simple indicators that the API was used, not an aggregation of how it's used. And for that feature in particular, we don't have any pages in HTTP Archive that use it (the bottom chart is empty on that page).
