Skip to content

Commit

Permalink
Maintain 3 separate HPD caches for the HPRT experiment
Browse files Browse the repository at this point in the history
Hash-prefix real-time checks and URL real-time checks can both fall back
to the hash-prefix database check (HPD). As such, we don't want the
different mechanisms to share the same HPD cache, to avoid one mechanism
benefitting from cached results of another mechanism's lookup.

The two backgrounded mechanisms (hash real-time and hash database
lookups) each have their own HPD cache within V4GetHashProtocolManager.
URL real-time lookups use the primary cache as normal, since the
experiment is performed by intercepting URL real-time lookups.

Within the context of the experiment, reads and writes are performed on
the specific mechanism's HPD cache.
Outside the context of the experiment, reads are performed only on the
primary cache. Writes are performed on all 3 of the caches. This is
because many cache updates are outside the scope of the experiment (e.g.
non-mainframe URLs, checks to non-browsing threat types), but the
backgrounded mechanisms should be allowed to reuse these results if they
happen to match, since that's how it would work if that mechanism were
the default mechanism being used.

We only want to maintain the separate caches if the experiment might run
for the user (they are an ESB user and the feature flag is enabled). As
an approximation to keep the code simple, we start maintaining separate
caches any time there is a lookup performed where these criteria are
matched. The edge cases this misses (e.g. a user enables ESB) are rare
enough that the approximation will have a negligible impact on the
results.

Implementation details:
 - V4GetHashProtocolManager has most of the changes. It holds the 3
caches and owns the read/write logic for them. It also has the property
is_lookup_mechanism_experiment_enabled_ that determines whether it
should maintain 3 caches instead of just one.
 - A MechanismExperimentHashDatabaseCache enum value is threaded through
from consumers to V4GetHashProtocolManager to specify which caches
should be used for reads/writes. Within the experiment, this says to
read/write to the specific mechanism's cache. Outside the experiment,
this specifies to read only from the primary cache and write to all 3
caches.
 - If the experiment should be enabled, at the start of a lookup,
SafeBrowsingUrlCheckerImpl kicks off a chain of
SetLookupMechanismExperimentIsEnabled calls that eventually set
is_lookup_mechanism_experiment_enabled_ to true in
V4GetHashProtocolManager.

Bug: 1392144

Change-Id: Ic7aaa65d990003e6aff2420a03f125496b0d4217
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4203687
Commit-Queue: thefrog <thefrog@chromium.org>
Reviewed-by: Xinghui Lu <xinghuilu@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1100477}
  • Loading branch information
thefrog authored and Chromium LUCI CQ committed Feb 2, 2023
1 parent 2b91d47 commit ee2a607
Show file tree
Hide file tree
Showing 33 changed files with 592 additions and 147 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -383,7 +383,8 @@ class TestSBClient : public base::RefCountedThreadSafe<TestSBClient>,
// safe signal, handle it right away.
bool synchronous_safe_signal =
safe_browsing_service_->database_manager()->CheckBrowseUrl(
url, threat_types, this);
url, threat_types, this,
MechanismExperimentHashDatabaseCache::kNoExperiment);
if (synchronous_safe_signal) {
threat_type_ = SB_THREAT_TYPE_SAFE;
content::GetUIThreadTaskRunner({})->PostTask(
Expand Down
3 changes: 2 additions & 1 deletion components/safe_browsing/android/remote_database_manager.cc
Original file line number Diff line number Diff line change
Expand Up @@ -194,7 +194,8 @@ bool RemoteSafeBrowsingDatabaseManager::ChecksAreAlwaysAsync() const {
bool RemoteSafeBrowsingDatabaseManager::CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) {
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) {
DCHECK_CURRENTLY_ON(BrowserThread::IO);
DCHECK(!threat_types.empty());
DCHECK(SBThreatTypeSetIsValidForCheckBrowseUrl(threat_types));
Expand Down
8 changes: 5 additions & 3 deletions components/safe_browsing/android/remote_database_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -44,9 +44,11 @@ class RemoteSafeBrowsingDatabaseManager : public SafeBrowsingDatabaseManager {
network::mojom::RequestDestination request_destination) const override;
bool CanCheckUrl(const GURL& url) const override;
bool ChecksAreAlwaysAsync() const override;
bool CheckBrowseUrl(const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) override;
bool CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) override;
bool CheckDownloadUrl(const std::vector<GURL>& url_chain,
Client* client) override;
bool CheckExtensionIDs(const std::set<std::string>& extension_ids,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,8 @@ class BrowserURLLoaderThrottle::CheckerOnIO
can_check_high_confidence_allowlist_, url_lookup_service_metric_suffix_,
last_committed_url_, content::GetUIThreadTaskRunner({}),
url_lookup_service_, WebUIInfoSingleton::GetInstance(),
hash_realtime_service_, mechanism_experimenter_);
hash_realtime_service_, mechanism_experimenter_,
is_mechanism_experiment_allowed_);

CheckUrl(url, method);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -181,7 +181,8 @@ void MojoSafeBrowsingImpl::CreateCheckerAndCheck(
/*last_committed_url=*/GURL(), content::GetUIThreadTaskRunner({}),
/*url_lookup_service=*/nullptr, WebUIInfoSingleton::GetInstance(),
/*hash_realtime_service_on_ui=*/nullptr,
/*mechanism_experimenter=*/nullptr);
/*mechanism_experimenter=*/nullptr,
/*is_mechanism_experiment_allowed=*/false);

checker_impl->CheckUrl(
url, method,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,8 @@ class WebApiHandshakeChecker::CheckerOnIO
content::GetUIThreadTaskRunner({}),
/*url_lookup_service=*/nullptr, WebUIInfoSingleton::GetInstance(),
/*hash_realtime_service_on_ui=*/nullptr,
/*mechanism_experimenter=*/nullptr);
/*mechanism_experimenter=*/nullptr,
/*is_mechanism_experiment_allowed=*/false);
url_checker_->CheckUrl(
url, "GET",
base::BindOnce(&WebApiHandshakeChecker::CheckerOnIO::OnCheckUrlResult,
Expand Down
6 changes: 6 additions & 0 deletions components/safe_browsing/core/browser/db/database_manager.cc
Original file line number Diff line number Diff line change
Expand Up @@ -156,6 +156,12 @@ bool SafeBrowsingDatabaseManager::IsDatabaseReady() {
return enabled_;
}

void SafeBrowsingDatabaseManager::SetLookupMechanismExperimentIsEnabled() {
if (v4_get_hash_protocol_manager_) {
v4_get_hash_protocol_manager_->SetLookupMechanismExperimentIsEnabled();
}
}

SafeBrowsingDatabaseManager::SafeBrowsingApiCheck::SafeBrowsingApiCheck(
const GURL& url,
Client* client)
Expand Down
15 changes: 11 additions & 4 deletions components/safe_browsing/core/browser/db/database_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -150,10 +150,14 @@ class SafeBrowsingDatabaseManager
// can synchronously determine that the url is safe, CheckUrl returns true.
// Otherwise it returns false, and |client| is called asynchronously with the
// result when it is ready. The URL will only be checked for the threat types
// in |threat_types|.
virtual bool CheckBrowseUrl(const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) = 0;
// in |threat_types|. |experiment_cache_selection| specifies which cache to
// use. See comments above MechanismExperimentHashDatabaseCache's definition
// for more details.
virtual bool CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) = 0;

// Check if the prefix for |url| is in safebrowsing download add lists.
// Result will be passed to callback in |client|.
Expand Down Expand Up @@ -221,6 +225,9 @@ class SafeBrowsingDatabaseManager
// Returns whether download protection is enabled.
virtual bool IsDownloadProtectionEnabled() const = 0;

// Calls the method with the same name in |v4_get_hash_protocol_manager_|.
virtual void SetLookupMechanismExperimentIsEnabled();

//
// Methods to indicate when to start or suspend the SafeBrowsing operations.
// These functions are always called on the IO thread.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,8 @@ bool FakeSafeBrowsingDatabaseManager::ChecksAreAlwaysAsync() const {
bool FakeSafeBrowsingDatabaseManager::CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) {
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) {
const auto it = dangerous_urls_.find(url);
if (it == dangerous_urls_.end())
return true;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,11 @@ class FakeSafeBrowsingDatabaseManager : public TestSafeBrowsingDatabaseManager {
bool CanCheckRequestDestination(
network::mojom::RequestDestination request_destination) const override;
bool ChecksAreAlwaysAsync() const override;
bool CheckBrowseUrl(const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) override;
bool CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) override;
bool CheckDownloadUrl(const std::vector<GURL>& url_chain,
Client* client) override;
bool CheckExtensionIDs(const std::set<std::string>& extension_ids,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,8 @@ bool TestSafeBrowsingDatabaseManager::ChecksAreAlwaysAsync() const {
bool TestSafeBrowsingDatabaseManager::CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) {
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) {
NOTIMPLEMENTED();
return true;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,9 +32,11 @@ class TestSafeBrowsingDatabaseManager : public SafeBrowsingDatabaseManager {
network::mojom::RequestDestination request_destination) const override;
bool CanCheckUrl(const GURL& url) const override;
bool ChecksAreAlwaysAsync() const override;
bool CheckBrowseUrl(const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) override;
bool CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) override;
AsyncMatch CheckCsdAllowlistUrl(const GURL& url, Client* client) override;
bool CheckDownloadUrl(const std::vector<GURL>& url_chain,
Client* client) override;
Expand Down
130 changes: 109 additions & 21 deletions components/safe_browsing/core/browser/db/v4_get_hash_protocol_manager.cc
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
#include "base/timer/timer.h"
#include "base/trace_event/trace_event.h"
#include "base/trace_event/traced_value.h"
#include "components/safe_browsing/core/browser/db/v4_protocol_manager_util.h"
#include "components/safe_browsing/core/common/features.h"
#include "components/safe_browsing/core/common/utils.h"
#include "net/base/load_flags.h"
Expand Down Expand Up @@ -214,14 +215,17 @@ FullHashCallbackInfo::FullHashCallbackInfo(
const FullHashToStoreAndHashPrefixesMap&
full_hash_to_store_and_hash_prefixes,
FullHashCallback callback,
const base::Time& network_start_time)
const base::Time& network_start_time,
MechanismExperimentHashDatabaseCache mechanism_experiment_cache_selection)
: cached_full_hash_infos(cached_full_hash_infos),
callback(std::move(callback)),
loader(std::move(loader)),
full_hash_to_store_and_hash_prefixes(
full_hash_to_store_and_hash_prefixes),
network_start_time(network_start_time),
prefixes_requested(prefixes_requested) {}
prefixes_requested(prefixes_requested),
mechanism_experiment_cache_selection(
mechanism_experiment_cache_selection) {}

FullHashCallbackInfo::~FullHashCallbackInfo() {}

Expand Down Expand Up @@ -303,14 +307,16 @@ void V4GetHashProtocolManager::GetFullHashes(
const FullHashToStoreAndHashPrefixesMap
full_hash_to_store_and_hash_prefixes,
const std::vector<std::string>& list_client_states,
FullHashCallback callback) {
FullHashCallback callback,
MechanismExperimentHashDatabaseCache mechanism_experiment_cache_selection) {
DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);
DCHECK(!full_hash_to_store_and_hash_prefixes.empty());

std::vector<HashPrefixStr> prefixes_to_request;
std::vector<FullHashInfo> cached_full_hash_infos;
GetFullHashCachedResults(full_hash_to_store_and_hash_prefixes, Time::Now(),
&prefixes_to_request, &cached_full_hash_infos);
&prefixes_to_request, &cached_full_hash_infos,
mechanism_experiment_cache_selection);

if (prefixes_to_request.empty()) {
// 100% cache hits (positive or negative) so we can call the callback right
Expand Down Expand Up @@ -386,7 +392,8 @@ void V4GetHashProtocolManager::GetFullHashes(

pending_hash_requests_[loader] = std::make_unique<FullHashCallbackInfo>(
cached_full_hash_infos, prefixes_to_request, std::move(owned_loader),
full_hash_to_store_and_hash_prefixes, std::move(callback), clock_->Now());
full_hash_to_store_and_hash_prefixes, std::move(callback), clock_->Now(),
mechanism_experiment_cache_selection);
UMA_HISTOGRAM_COUNTS_100("SafeBrowsing.V4GetHash.CountOfPrefixes",
prefixes_to_request.size());
}
Expand Down Expand Up @@ -415,15 +422,17 @@ void V4GetHashProtocolManager::GetFullHashesWithApis(
GetFullHashes(full_hash_to_store_and_hash_prefixes, list_client_states,
base::BindOnce(&V4GetHashProtocolManager::OnFullHashForApi,
base::Unretained(this), std::move(api_callback),
full_hashes));
full_hashes),
MechanismExperimentHashDatabaseCache::kNoExperiment);
}

void V4GetHashProtocolManager::GetFullHashCachedResults(
const FullHashToStoreAndHashPrefixesMap&
full_hash_to_store_and_hash_prefixes,
const Time& now,
std::vector<HashPrefixStr>* prefixes_to_request,
std::vector<FullHashInfo>* cached_full_hash_infos) {
std::vector<FullHashInfo>* cached_full_hash_infos,
MechanismExperimentHashDatabaseCache mechanism_experiment_cache_selection) {
DCHECK(!full_hash_to_store_and_hash_prefixes.empty());
DCHECK(prefixes_to_request->empty());
DCHECK(cached_full_hash_infos->empty());
Expand Down Expand Up @@ -457,15 +466,42 @@ void V4GetHashProtocolManager::GetFullHashCachedResults(
// cache entry if they expire AND their expire time is after the negative
// cache expire time.

// See comments above MechanismExperimentHashDatabaseCache's definition for
// more context.
// - Outside the context of the experiment, always read from the main cache.
// - Inside the context of the experiment, read from the cache corresponding
// to the specific mechanism.
FullHashCache& full_hash_cache = full_hash_cache_;
DCHECK(is_lookup_mechanism_experiment_enabled_ ||
mechanism_experiment_cache_selection ==
MechanismExperimentHashDatabaseCache::kNoExperiment);
switch (mechanism_experiment_cache_selection) {
case MechanismExperimentHashDatabaseCache::kNoExperiment:
case MechanismExperimentHashDatabaseCache::kUrlRealTimeOnly:
full_hash_cache = full_hash_cache_;
break;
case MechanismExperimentHashDatabaseCache::kHashRealTimeOnly:
full_hash_cache =
lookup_mechanism_experiment_hash_realtime_full_hash_cache_;
break;
case MechanismExperimentHashDatabaseCache::kHashDatabaseOnly:
full_hash_cache =
lookup_mechanism_experiment_hash_database_full_hash_cache_;
break;
default:
NOTREACHED();
break;
}

std::unordered_set<HashPrefixStr> unique_prefixes_to_request;
for (const auto& it : full_hash_to_store_and_hash_prefixes) {
const FullHashStr& full_hash = it.first;
const StoreAndHashPrefixes& matched = it.second;
for (const StoreAndHashPrefix& matched_it : matched) {
const ListIdentifier& list_id = matched_it.list_id;
const HashPrefixStr& prefix = matched_it.hash_prefix;
auto prefix_entry = full_hash_cache_.find(prefix);
if (prefix_entry != full_hash_cache_.end()) {
auto prefix_entry = full_hash_cache.find(prefix);
if (prefix_entry != full_hash_cache.end()) {
// Case 1.
const CachedHashPrefixInfo& cached_prefix_info = prefix_entry->second;
bool found_full_hash = false;
Expand Down Expand Up @@ -754,7 +790,8 @@ void V4GetHashProtocolManager::SetClockForTests(base::Clock* clock) {
void V4GetHashProtocolManager::UpdateCache(
const std::vector<HashPrefixStr>& prefixes_requested,
const std::vector<FullHashInfo>& full_hash_infos,
const Time& negative_cache_expire) {
const Time& negative_cache_expire,
MechanismExperimentHashDatabaseCache mechanism_experiment_cache_selection) {
DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);

// If negative_cache_expire is null, don't cache the results since it's not
Expand All @@ -763,18 +800,64 @@ void V4GetHashProtocolManager::UpdateCache(
return;
}

for (const HashPrefixStr& prefix : prefixes_requested) {
// Create or reset the cached result for this prefix.
CachedHashPrefixInfo& chpi = full_hash_cache_[prefix];
chpi.full_hash_infos.clear();
chpi.negative_expiry = negative_cache_expire;
auto update_individual_cache =
[prefixes_requested, negative_cache_expire,
full_hash_infos](FullHashCache& full_hash_cache) {
for (const HashPrefixStr& prefix : prefixes_requested) {
// Create or reset the cached result for this prefix.
CachedHashPrefixInfo& chpi = full_hash_cache[prefix];
chpi.full_hash_infos.clear();
chpi.negative_expiry = negative_cache_expire;

for (const FullHashInfo& full_hash_info : full_hash_infos) {
if (V4ProtocolManagerUtil::FullHashMatchesHashPrefix(
full_hash_info.full_hash, prefix)) {
chpi.full_hash_infos.push_back(full_hash_info);
}
}
}
};

for (const FullHashInfo& full_hash_info : full_hash_infos) {
if (V4ProtocolManagerUtil::FullHashMatchesHashPrefix(
full_hash_info.full_hash, prefix)) {
chpi.full_hash_infos.push_back(full_hash_info);
}
// See comments above MechanismExperimentHashDatabaseCache's definition for
// more context.
if (is_lookup_mechanism_experiment_enabled_) {
switch (mechanism_experiment_cache_selection) {
// If this request is outside the scope of the experiment, update all
// three caches. We still want to update the experiment's backgrounded
// caches because there may be cache entries added outside the experiment
// that end up helping with the experiment's cache hits.
case MechanismExperimentHashDatabaseCache::kNoExperiment:
update_individual_cache(full_hash_cache_);
update_individual_cache(
lookup_mechanism_experiment_hash_realtime_full_hash_cache_);
update_individual_cache(
lookup_mechanism_experiment_hash_database_full_hash_cache_);
break;
// If this request is in the scope of the experiment and it's for the main
// request, we only want to update the main cache.
case MechanismExperimentHashDatabaseCache::kUrlRealTimeOnly:
update_individual_cache(full_hash_cache_);
break;
// If this request is in the scope of the experiment and it's for the
// backgrounded hash real-time request, we only want to update that
// backgrounded cache.
case MechanismExperimentHashDatabaseCache::kHashRealTimeOnly:
update_individual_cache(
lookup_mechanism_experiment_hash_realtime_full_hash_cache_);
break;
// If this request is in the scope of the experiment and it's for the
// backgrounded hash database request, we only want to update that
// backgrounded cache.
case MechanismExperimentHashDatabaseCache::kHashDatabaseOnly:
update_individual_cache(
lookup_mechanism_experiment_hash_database_full_hash_cache_);
break;
default:
NOTREACHED();
break;
}
} else {
update_individual_cache(full_hash_cache_);
}
}

Expand Down Expand Up @@ -868,7 +951,8 @@ void V4GetHashProtocolManager::OnURLLoaderCompleteInternal(
const std::unique_ptr<FullHashCallbackInfo>& fhci = it->second;
UMA_HISTOGRAM_LONG_TIMES("SafeBrowsing.V4GetHash.Network.Time",
clock_->Now() - fhci->network_start_time);
UpdateCache(fhci->prefixes_requested, full_hash_infos, negative_cache_expire);
UpdateCache(fhci->prefixes_requested, full_hash_infos, negative_cache_expire,
fhci->mechanism_experiment_cache_selection);
MergeResults(fhci->full_hash_to_store_and_hash_prefixes, full_hash_infos,
&fhci->cached_full_hash_infos);

Expand Down Expand Up @@ -906,6 +990,10 @@ void V4GetHashProtocolManager::CollectFullHashCacheInfo(
}
}

void V4GetHashProtocolManager::SetLookupMechanismExperimentIsEnabled() {
is_lookup_mechanism_experiment_enabled_ = true;
}

#ifndef DEBUG
std::ostream& operator<<(std::ostream& os, const FullHashInfo& fhi) {
os << "{full_hash: " << fhi.full_hash << "; list_id: " << fhi.list_id
Expand Down

0 comments on commit ee2a607

Please sign in to comment.