Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add partitioned HSTS storage support. #13062

Merged
merged 5 commits into from
May 26, 2022
Merged

Add partitioned HSTS storage support. #13062

merged 5 commits into from
May 26, 2022

Conversation

goodov
Copy link
Member

@goodov goodov commented Apr 19, 2022

Added partitioned HSTS storage support.

Original Chromium implementation uses SHA256(domain) as a key to store/lookup HSTS data for domain.
Our implementation uses half of the original key and half of SHA256(top_frame_etld1) to partition HSTS data. This approach allows us to have partitioned storage and have the ability to cleanup HSTS by hostname looking only at first half of the hash.

Originally I planned to use full NetworkIsolationKey which includes top_frame_site and frame_site, but initial tests showed pretty noticeable HSTS storage size just after a quick browsing session. I've also found a document that shows Chromium is moving away from triple-keying in favor of top-level site key only. Because of HSTS storage size concerns and upcoming upstream changes I went with top_frame_site as a partition key for HSTS.

Resolves brave/brave-browser#18830

Submitter Checklist:

  • I confirm that no security/privacy review is needed, or that I have requested one
  • There is a ticket for my issue
  • Used Github auto-closing keywords in the PR description above
  • Wrote a good PR/commit description
  • Squashed any review feedback or "fixup" commits before merge, so that history is a record of what happened in the repo, not your PR
  • Added appropriate labels (QA/Yes or QA/No; release-notes/include or release-notes/exclude; OS/...) to the associated issue
  • Checked the PR locally: npm run test -- brave_browser_tests, npm run test -- brave_unit_tests, npm run lint, npm run gn_check, npm run tslint
  • Ran git rebase master (if needed)

Reviewer Checklist:

  • A security review is not needed, or a link to one is included in the PR description
  • New files have MPL-2.0 license header
  • Adequate test coverage exists to prevent regressions
  • Major classes, functions and non-trivial code blocks are well-commented
  • Changes in component dependencies are properly reflected in gn
  • Code follows the style guide
  • Test plan is specified in PR before merging

After-merge Checklist:

Test Plan:

@goodov goodov requested review from a team as code owners April 19, 2022 14:44
@goodov goodov self-assigned this Apr 19, 2022
@goodov goodov force-pushed the issues/18830 branch 5 times, most recently from ca1f866 to fdb6fc5 Compare April 25, 2022 12:32
@goodov goodov force-pushed the issues/18830 branch 2 times, most recently from 4c96f68 to f73174a Compare April 25, 2022 14:49
@goodov
Copy link
Member Author

goodov commented Apr 25, 2022

@iefremov @fmarier @bridiver PTAL. Ready for review. Please read PR description to get the idea of what's going on here.

@fmarier
Copy link
Member

fmarier commented Apr 26, 2022

Based on your description of the approach, it seems like the only time this change is a reduction in security is when dealing with an HTTP site that loads HTTP resources that can be upgraded to HTTPS via HSTS. So for example:

  1. Visit http://blog.example which loads http://facebook.com/sdk.js.
  2. Facebook redirects that request to https://facebook.com/sdk.js and includes an HSTS header.
  3. Visit http://anotherblog.example which also loads http://facebook.com/sdk.js.

Before this change, the second visit (in Step 3) would have gone straight to HTTPS. With this change, the second visit will not know that the Facebook resource should be fetched on HTTPS and will attempt fetch the resource over HTTP first. It's worse from a security point of view, but that said, HTTP pages are already vulnerable to an on-path attacker changing the URLs of sub-resources prior to them being requested, so it's not a big deal.

From a perf/memory point of view, if facebook.com/sdk.js is a popular resource on HTTP blogs, then the HSTS cache could grow quite a bit. Hopefully we won't see too many of these cases.

We could perhaps have an optimization where on an HTTPS page, we ignore any HSTS headers we see on sub-resources since they're not going to be needed on that page anyways, thanks to mixed-content blocking. It's not going to help with the HTTP case above, but at least it will reduce the number of unnecessary entries in the HSTS cache.

@fmarier
Copy link
Member

fmarier commented Apr 26, 2022

What's the story with preloaded HSTS entries? Do they continue to apply (unpartitioned of course)? Can we perhaps add a test for this?

@goodov
Copy link
Member Author

goodov commented Apr 26, 2022

What's the story with preloaded HSTS entries? Do they continue to apply (unpartitioned of course)? Can we perhaps add a test for this?

preloaded HSTS entries lookup should work as usual, I've added tests to ensure it is.

We could perhaps have an optimization where on an HTTPS page, we ignore any HSTS headers we see on sub-resources since they're not going to be needed on that page anyways, thanks to mixed-content blocking. It's not going to help with the HTTP case above, but at least it will reduce the number of unnecessary entries in the HSTS cache.

I've looked into some options, but it's unclear what's the best approach here exactly, because we can break some tests that expect HSTS header is supported in these situations.

@fmarier
Copy link
Member

fmarier commented Apr 26, 2022

I've looked into some options, but it's unclear what's the best approach here exactly, because we can break some tests that expect HSTS header is supported in these situations.

I don't follow the breakage you have in mind here. Unless I'm misunderstanding something, if you're on an HTTPS page, then HSTS for any of the subresources is irrelevant because any HTTP resources would get blocked by the mixed content blocker, no?

@goodov
Copy link
Member Author

goodov commented Apr 27, 2022

I don't follow the breakage you have in mind here. Unless I'm misunderstanding something, if you're on an HTTPS page, then HSTS for any of the subresources is irrelevant because any HTTP resources would get blocked by the mixed content blocker, no?

Yes, that's correct. I think you're right about not storing HSTS state in some situations, but what exact rules should we apply?

For example, we have http://a.com and nested iframe https://b.com.

  1. Should we store HSTS for iframe URL?
  2. Should we store HSTS for a resource fetch inside an iframe?
  3. Should we store HSTS for a resource fetch inside a main frame?

Do we even care for any non-main frame HSTS headers when HSTS is partitioned?
cc @pes10k maybe you have some vision here?

@pes10k
Copy link
Contributor

pes10k commented Apr 27, 2022

@goodov ideally we'd store as much HSTS state as possible. If its the case though that there are perf / memory limits here, then here is my rough preference of options:

  1. store all HSTS state
  2. store all HSTS state except for non-iframe subrequests in 3p iframes
  3. store all HSTS state except for subrequests of the 1p frame
  4. store only HSTS state for 1p

@goodov
Copy link
Member Author

goodov commented Apr 27, 2022

I'm afraid there is not enough data to properly decide if HSTS data is worth storing or not (soon frame_site info will be gone in upstream). One possible option is to filter using this condition for each request: site_for_cookies == top_frame_site. I can apply this if everyone is okay with that or we can keep it as is and store all partitioned HSTS data as Pes prefers.

@pes10k
Copy link
Contributor

pes10k commented Apr 27, 2022

@goodov if the only options then are #1 or #4, what about a 2.5 option, something like:

a. store all site_for_cookies == top_frame_site HSTS instructions (which, will common case just be saving the HSTS instruction for the top level document) in global cache
b. store 3p HSTS instructions in ephemeral storage / site-length cache?

@goodov
Copy link
Member Author

goodov commented Apr 27, 2022

b. store 3p HSTS instructions in ephemeral storage / site-length cache?

I had an in-memory storage for 3p frames initially (without cleanup on site close), but it added a noticeable amount of complexity during lookup and cleanup (reset HSTS if site wants to) phases even without the cleanup step on site close. The complexity can be solved by adding some conflict with upstream, but I decided not to go this way at that moment.

If you think this will be beneficial, then I can look again at this, but if this is not crucial then I'd prefer to not introduce ES for HSTS.

@pes10k
Copy link
Contributor

pes10k commented Apr 27, 2022

Gotcha, understood @goodov. Yes, i am fine with that tradeoff then. Thank you for talking through all the options with me. We are sacrificing some (broad) security and privacy (requests to HTTPS subresources can now be subjected to downgrade attacks) to gain a specific kind of privacy improvement (cross site tracking protection).

I am fine with that tradeoff, as long as the security team is (cc @fmarier @diracdeltas )

@fmarier
Copy link
Member

fmarier commented Apr 27, 2022

If I understand correctly the proposed site_for_cookies == top_frame_site will mean on an HTTP site like http://blog.example which loads http://facebook.com/sdk.js, every page load on blog.example will go to HTTP and then get server-redirected to HTTPS? In the original version of the PR, the HSTS signal would have been saved in the global cache (against blog.example) and then it would have applied to all page loads after the first one. Is that the change being proposed?

@fmarier
Copy link
Member

fmarier commented Apr 27, 2022

My original suggestion was not to drop from the cache entries that could be useful, but rather to remove the ones that aren't useful. If the top-level page is HTTPS, then we don't need to save HSTS signals we receive from the subresources because they're already getting blocked/upgraded by the mixed content blocker. Since HTTPS now covers the majority of page loads, this will likely be the bulk of the entries in the cache.

@pes10k
Copy link
Contributor

pes10k commented Apr 27, 2022

we don't need to save HSTS signals we receive from the subresources because they're already getting blocked/upgraded by the mixed content blocker

I think thats mostly correct, but we would be reducing security in the following case

  1. top level document is HSTS (example.org)
  2. includes a HTTP sub resource for 3p.org
  3. that sub resource servers back a HSTS response
  4. user comes back to example.org in the future
  5. MITM block HTTPS upgrade attack against 3p.org

Im fine with the above trade off to gain useful-in-the-wild protections against cross-site-tracking, but i just wanted to try and explain where i see the trade off

@goodov
Copy link
Member Author

goodov commented May 2, 2022

@iefremov @bridiver the PR is ready and waiting for review.

@@ -21,4 +22,118 @@

#endif

#define BRAVE_TRANSPORT_SECURITY_STATE_DELETE_DYNAMIC_DATA_FOR_HOST \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add unit tests for DeleteDynamicDataForHost to cover this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added.

namespace net {

// Implements partitioning support for structures in TransportSecurityState.
class PartitionedHostStateMapBase {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add unit tests for this class?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added.

Copy link
Member

@fmarier fmarier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read through transport_security_state.cc and I noted three more methods that may need to be partition-aware:

  • TransportSecurityState::NetLogUpgradeToSSLParam()
  • TransportSecurityState::AddHSTS() (used in net-internals)
  • TransportSecurityState::GetDynamicSTSState() (gets called by GetSTSState())

Let me know if you've already looked at them and left them out on purpose.

@fmarier
Copy link
Member

fmarier commented May 10, 2022

I do have one high-level question as well: what do we do with existing HSTS data when partitioning is enabled and disabled?

In other words, once people upgrade an existing profile (with existing HSTS entries in the HSTS cache), does it mean that all existing entries will be ignored (and eventually expire) because the hashes won't match?

That's probably what we want in order to clear any existing HSTS super-cookies, but it does mean that we have an additional (temporary) security downgrade, right?

@goodov
Copy link
Member Author

goodov commented May 10, 2022

I read through transport_security_state.cc and I noted three more methods that may need to be partition-aware:

  • TransportSecurityState::NetLogUpgradeToSSLParam()

Is called directly from TransportSecurityState::ShouldUpgradeToSSL which is now partitioned, i.e. it sets PartitionHash to be used by internal methods such as GetDynamicSTSState, which NetLogUpgradeToSSLParam does:

  dict.SetBoolKey("get_sts_state_result", GetSTSState(host, &sts_state));
  • TransportSecurityState::AddHSTS() (used in net-internals)

This one I decided not to touch, because it was unclear how we can use it without additional UI changes (field for partition). But now I think that we can generate "top frame site" hash by taking eTLD+1 from the host we're trying to add. It makes sense and it should work. Yes, we still wouldn't be able to add partitioned HSTS records, but I think it's fine.

  • TransportSecurityState::GetDynamicSTSState() (gets called by GetSTSState())

GetDynamicSTSState is used publicly only from net-internals. For the real HSTS upgrades, GetSTSState is used which is partitioned and does both Dynamic and Static lookups.
I might add similar logic to GetDynamicSTSState as in AddHSTS to generate partition hash eTLD+1 from the host so our net-internals UI will still work for most cases.

I do have one high-level question as well: what do we do with existing HSTS data when partitioning is enabled and disabled?

the data stays in the database, the lookup/store logic just creates different keys to operate it.

In other words, once people upgrade an existing profile (with existing HSTS entries in the HSTS cache), does it mean that all existing entries will be ignored (and eventually expire) because the hashes won't match?

That's probably what we want in order to clear any existing HSTS super-cookies, but it does mean that we have an additional (temporary) security downgrade, right?

yes, we will have a temporary security downgrade with the current implementation. It might be possible to do a "legacy" lookup for cases when host == top frame site, this will ensure that we use old records, but only in an exact partition. I'll try some options.

@fmarier
Copy link
Member

fmarier commented May 10, 2022

yes, we will have a temporary security downgrade with the current implementation. It might be possible to do a "legacy" lookup for cases when host == top frame site, this will ensure that we use old records, but only in an exact partition. I'll try some options.

I discussed this with the rest of the security team and I don't think the extra complexity is worth it. After all, HSTS is not the only protection we have against downgrades, we also have HTTPS Everywhere.

So we can just accept as a known limitation. It's a one-off event as well.

@goodov goodov force-pushed the issues/18830 branch 5 times, most recently from b2feb03 to 48e726f Compare May 19, 2022 16:25
@goodov
Copy link
Member Author

goodov commented May 20, 2022

@iefremov @bridiver PTAL. it's ready.

const std::string& host,
STSState* result);

// This is used only for manual addiing via net-internals page.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding

@@ -0,0 +1,52 @@
/* Copyright 2021 The Brave Authors. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2022 haha

@goodov
Copy link
Member Author

goodov commented May 26, 2022

For the context: I tried to enable upstream transport_security_state_unittest.cc and extend it with partitioning tests, but currently a lot of upstream tests fail because we alter bundled HSTS/pin database, tests don't expect that. This is how we alter this database:

for (auto& entry : chromium_entries) {
// leave test entries
if (entry->pinset != "test") {
// Google has asked us not to include the pins that ship with Chrome.
entry->pinset = "";
}
if (!entry->force_https && entry->pinset.empty() && !entry->expect_ct)
continue;
entries->push_back(std::move(entry));
}

@goodov goodov merged commit 1100f7b into master May 26, 2022
@goodov goodov deleted the issues/18830 branch May 26, 2022 13:55
@goodov goodov added this to the 1.41.x - Nightly milestone May 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants