Skip to content

Commit

Permalink
Reland "Maintain 3 separate HPD caches for the HPRT experiment"
Browse files Browse the repository at this point in the history
This is a reland of commit ee2a607

The difference between this reland and the original upload is in
Patchsets 3-4, which set a default value for the PendingCheck field
mechanism_experiment_cache_selection because there is a second
constructor that does not pass through a value for it. The change
includes refactoring some common code out of
V4GetHashProtocolManagerTest so that V4LocalDatabaseManagerTest can
reuse it.

Original change's description:
> Maintain 3 separate HPD caches for the HPRT experiment
>
> Hash-prefix real-time checks and URL real-time checks can both fall back
> to the hash-prefix database check (HPD). As such, we don't want the
> different mechanisms to share the same HPD cache, to avoid one mechanism
> benefitting from cached results of another mechanism's lookup.
>
> The two backgrounded mechanisms (hash real-time and hash database
> lookups) each have their own HPD cache within V4GetHashProtocolManager.
> URL real-time lookups use the primary cache as normal, since the
> experiment is performed by intercepting URL real-time lookups.
>
> Within the context of the experiment, reads and writes are performed on
> the specific mechanism's HPD cache.
> Outside the context of the experiment, reads are performed only on the
> primary cache. Writes are performed on all 3 of the caches. This is
> because many cache updates are outside the scope of the experiment (e.g.
> non-mainframe URLs, checks to non-browsing threat types), but the
> backgrounded mechanisms should be allowed to reuse these results if they
> happen to match, since that's how it would work if that mechanism were
> the default mechanism being used.
>
> We only want to maintain the separate caches if the experiment might run
> for the user (they are an ESB user and the feature flag is enabled). As
> an approximation to keep the code simple, we start maintaining separate
> caches any time there is a lookup performed where these criteria are
> matched. The edge cases this misses (e.g. a user enables ESB) are rare
> enough that the approximation will have a negligible impact on the
> results.
>
> Implementation details:
>  - V4GetHashProtocolManager has most of the changes. It holds the 3
> caches and owns the read/write logic for them. It also has the property
> is_lookup_mechanism_experiment_enabled_ that determines whether it
> should maintain 3 caches instead of just one.
>  - A MechanismExperimentHashDatabaseCache enum value is threaded through
> from consumers to V4GetHashProtocolManager to specify which caches
> should be used for reads/writes. Within the experiment, this says to
> read/write to the specific mechanism's cache. Outside the experiment,
> this specifies to read only from the primary cache and write to all 3
> caches.
>  - If the experiment should be enabled, at the start of a lookup,
> SafeBrowsingUrlCheckerImpl kicks off a chain of
> SetLookupMechanismExperimentIsEnabled calls that eventually set
> is_lookup_mechanism_experiment_enabled_ to true in
> V4GetHashProtocolManager.
>
> Bug: 1392144
>
> Change-Id: Ic7aaa65d990003e6aff2420a03f125496b0d4217
> Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4203687
> Commit-Queue: thefrog <thefrog@chromium.org>
> Reviewed-by: Xinghui Lu <xinghuilu@chromium.org>
> Cr-Commit-Position: refs/heads/main@{#1100477}

Bug: 1392144
Change-Id: I911b854c157092630c6c809b4f4fdf3719e6b686
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4220115
Commit-Queue: thefrog <thefrog@chromium.org>
Reviewed-by: Xinghui Lu <xinghuilu@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1101166}
  • Loading branch information
thefrog authored and Chromium LUCI CQ committed Feb 3, 2023
1 parent ce627ae commit 7c7c040
Show file tree
Hide file tree
Showing 35 changed files with 765 additions and 214 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -383,7 +383,8 @@ class TestSBClient : public base::RefCountedThreadSafe<TestSBClient>,
// safe signal, handle it right away.
bool synchronous_safe_signal =
safe_browsing_service_->database_manager()->CheckBrowseUrl(
url, threat_types, this);
url, threat_types, this,
MechanismExperimentHashDatabaseCache::kNoExperiment);
if (synchronous_safe_signal) {
threat_type_ = SB_THREAT_TYPE_SAFE;
content::GetUIThreadTaskRunner({})->PostTask(
Expand Down
3 changes: 2 additions & 1 deletion components/safe_browsing/android/remote_database_manager.cc
Original file line number Diff line number Diff line change
Expand Up @@ -194,7 +194,8 @@ bool RemoteSafeBrowsingDatabaseManager::ChecksAreAlwaysAsync() const {
bool RemoteSafeBrowsingDatabaseManager::CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) {
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) {
DCHECK_CURRENTLY_ON(BrowserThread::IO);
DCHECK(!threat_types.empty());
DCHECK(SBThreatTypeSetIsValidForCheckBrowseUrl(threat_types));
Expand Down
8 changes: 5 additions & 3 deletions components/safe_browsing/android/remote_database_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -44,9 +44,11 @@ class RemoteSafeBrowsingDatabaseManager : public SafeBrowsingDatabaseManager {
network::mojom::RequestDestination request_destination) const override;
bool CanCheckUrl(const GURL& url) const override;
bool ChecksAreAlwaysAsync() const override;
bool CheckBrowseUrl(const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) override;
bool CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) override;
bool CheckDownloadUrl(const std::vector<GURL>& url_chain,
Client* client) override;
bool CheckExtensionIDs(const std::set<std::string>& extension_ids,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,8 @@ class BrowserURLLoaderThrottle::CheckerOnIO
can_check_high_confidence_allowlist_, url_lookup_service_metric_suffix_,
last_committed_url_, content::GetUIThreadTaskRunner({}),
url_lookup_service_, WebUIInfoSingleton::GetInstance(),
hash_realtime_service_, mechanism_experimenter_);
hash_realtime_service_, mechanism_experimenter_,
is_mechanism_experiment_allowed_);

CheckUrl(url, method);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -181,7 +181,8 @@ void MojoSafeBrowsingImpl::CreateCheckerAndCheck(
/*last_committed_url=*/GURL(), content::GetUIThreadTaskRunner({}),
/*url_lookup_service=*/nullptr, WebUIInfoSingleton::GetInstance(),
/*hash_realtime_service_on_ui=*/nullptr,
/*mechanism_experimenter=*/nullptr);
/*mechanism_experimenter=*/nullptr,
/*is_mechanism_experiment_allowed=*/false);

checker_impl->CheckUrl(
url, method,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,8 @@ class WebApiHandshakeChecker::CheckerOnIO
content::GetUIThreadTaskRunner({}),
/*url_lookup_service=*/nullptr, WebUIInfoSingleton::GetInstance(),
/*hash_realtime_service_on_ui=*/nullptr,
/*mechanism_experimenter=*/nullptr);
/*mechanism_experimenter=*/nullptr,
/*is_mechanism_experiment_allowed=*/false);
url_checker_->CheckUrl(
url, "GET",
base::BindOnce(&WebApiHandshakeChecker::CheckerOnIO::OnCheckUrlResult,
Expand Down
6 changes: 6 additions & 0 deletions components/safe_browsing/core/browser/db/database_manager.cc
Original file line number Diff line number Diff line change
Expand Up @@ -156,6 +156,12 @@ bool SafeBrowsingDatabaseManager::IsDatabaseReady() {
return enabled_;
}

void SafeBrowsingDatabaseManager::SetLookupMechanismExperimentIsEnabled() {
if (v4_get_hash_protocol_manager_) {
v4_get_hash_protocol_manager_->SetLookupMechanismExperimentIsEnabled();
}
}

SafeBrowsingDatabaseManager::SafeBrowsingApiCheck::SafeBrowsingApiCheck(
const GURL& url,
Client* client)
Expand Down
15 changes: 11 additions & 4 deletions components/safe_browsing/core/browser/db/database_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -150,10 +150,14 @@ class SafeBrowsingDatabaseManager
// can synchronously determine that the url is safe, CheckUrl returns true.
// Otherwise it returns false, and |client| is called asynchronously with the
// result when it is ready. The URL will only be checked for the threat types
// in |threat_types|.
virtual bool CheckBrowseUrl(const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) = 0;
// in |threat_types|. |experiment_cache_selection| specifies which cache to
// use. See comments above MechanismExperimentHashDatabaseCache's definition
// for more details.
virtual bool CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) = 0;

// Check if the prefix for |url| is in safebrowsing download add lists.
// Result will be passed to callback in |client|.
Expand Down Expand Up @@ -221,6 +225,9 @@ class SafeBrowsingDatabaseManager
// Returns whether download protection is enabled.
virtual bool IsDownloadProtectionEnabled() const = 0;

// Calls the method with the same name in |v4_get_hash_protocol_manager_|.
virtual void SetLookupMechanismExperimentIsEnabled();

//
// Methods to indicate when to start or suspend the SafeBrowsing operations.
// These functions are always called on the IO thread.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,8 @@ bool FakeSafeBrowsingDatabaseManager::ChecksAreAlwaysAsync() const {
bool FakeSafeBrowsingDatabaseManager::CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) {
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) {
const auto it = dangerous_urls_.find(url);
if (it == dangerous_urls_.end())
return true;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,11 @@ class FakeSafeBrowsingDatabaseManager : public TestSafeBrowsingDatabaseManager {
bool CanCheckRequestDestination(
network::mojom::RequestDestination request_destination) const override;
bool ChecksAreAlwaysAsync() const override;
bool CheckBrowseUrl(const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) override;
bool CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) override;
bool CheckDownloadUrl(const std::vector<GURL>& url_chain,
Client* client) override;
bool CheckExtensionIDs(const std::set<std::string>& extension_ids,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,8 @@ bool TestSafeBrowsingDatabaseManager::ChecksAreAlwaysAsync() const {
bool TestSafeBrowsingDatabaseManager::CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) {
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) {
NOTIMPLEMENTED();
return true;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,9 +32,11 @@ class TestSafeBrowsingDatabaseManager : public SafeBrowsingDatabaseManager {
network::mojom::RequestDestination request_destination) const override;
bool CanCheckUrl(const GURL& url) const override;
bool ChecksAreAlwaysAsync() const override;
bool CheckBrowseUrl(const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client) override;
bool CheckBrowseUrl(
const GURL& url,
const SBThreatTypeSet& threat_types,
Client* client,
MechanismExperimentHashDatabaseCache experiment_cache_selection) override;
AsyncMatch CheckCsdAllowlistUrl(const GURL& url, Client* client) override;
bool CheckDownloadUrl(const std::vector<GURL>& url_chain,
Client* client) override;
Expand Down
130 changes: 109 additions & 21 deletions components/safe_browsing/core/browser/db/v4_get_hash_protocol_manager.cc
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
#include "base/timer/timer.h"
#include "base/trace_event/trace_event.h"
#include "base/trace_event/traced_value.h"
#include "components/safe_browsing/core/browser/db/v4_protocol_manager_util.h"
#include "components/safe_browsing/core/common/features.h"
#include "components/safe_browsing/core/common/utils.h"
#include "net/base/load_flags.h"
Expand Down Expand Up @@ -214,14 +215,17 @@ FullHashCallbackInfo::FullHashCallbackInfo(
const FullHashToStoreAndHashPrefixesMap&
full_hash_to_store_and_hash_prefixes,
FullHashCallback callback,
const base::Time& network_start_time)
const base::Time& network_start_time,
MechanismExperimentHashDatabaseCache mechanism_experiment_cache_selection)
: cached_full_hash_infos(cached_full_hash_infos),
callback(std::move(callback)),
loader(std::move(loader)),
full_hash_to_store_and_hash_prefixes(
full_hash_to_store_and_hash_prefixes),
network_start_time(network_start_time),
prefixes_requested(prefixes_requested) {}
prefixes_requested(prefixes_requested),
mechanism_experiment_cache_selection(
mechanism_experiment_cache_selection) {}

FullHashCallbackInfo::~FullHashCallbackInfo() {}

Expand Down Expand Up @@ -303,14 +307,16 @@ void V4GetHashProtocolManager::GetFullHashes(
const FullHashToStoreAndHashPrefixesMap
full_hash_to_store_and_hash_prefixes,
const std::vector<std::string>& list_client_states,
FullHashCallback callback) {
FullHashCallback callback,
MechanismExperimentHashDatabaseCache mechanism_experiment_cache_selection) {
DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);
DCHECK(!full_hash_to_store_and_hash_prefixes.empty());

std::vector<HashPrefixStr> prefixes_to_request;
std::vector<FullHashInfo> cached_full_hash_infos;
GetFullHashCachedResults(full_hash_to_store_and_hash_prefixes, Time::Now(),
&prefixes_to_request, &cached_full_hash_infos);
&prefixes_to_request, &cached_full_hash_infos,
mechanism_experiment_cache_selection);

if (prefixes_to_request.empty()) {
// 100% cache hits (positive or negative) so we can call the callback right
Expand Down Expand Up @@ -386,7 +392,8 @@ void V4GetHashProtocolManager::GetFullHashes(

pending_hash_requests_[loader] = std::make_unique<FullHashCallbackInfo>(
cached_full_hash_infos, prefixes_to_request, std::move(owned_loader),
full_hash_to_store_and_hash_prefixes, std::move(callback), clock_->Now());
full_hash_to_store_and_hash_prefixes, std::move(callback), clock_->Now(),
mechanism_experiment_cache_selection);
UMA_HISTOGRAM_COUNTS_100("SafeBrowsing.V4GetHash.CountOfPrefixes",
prefixes_to_request.size());
}
Expand Down Expand Up @@ -415,15 +422,17 @@ void V4GetHashProtocolManager::GetFullHashesWithApis(
GetFullHashes(full_hash_to_store_and_hash_prefixes, list_client_states,
base::BindOnce(&V4GetHashProtocolManager::OnFullHashForApi,
base::Unretained(this), std::move(api_callback),
full_hashes));
full_hashes),
MechanismExperimentHashDatabaseCache::kNoExperiment);
}

void V4GetHashProtocolManager::GetFullHashCachedResults(
const FullHashToStoreAndHashPrefixesMap&
full_hash_to_store_and_hash_prefixes,
const Time& now,
std::vector<HashPrefixStr>* prefixes_to_request,
std::vector<FullHashInfo>* cached_full_hash_infos) {
std::vector<FullHashInfo>* cached_full_hash_infos,
MechanismExperimentHashDatabaseCache mechanism_experiment_cache_selection) {
DCHECK(!full_hash_to_store_and_hash_prefixes.empty());
DCHECK(prefixes_to_request->empty());
DCHECK(cached_full_hash_infos->empty());
Expand Down Expand Up @@ -457,15 +466,42 @@ void V4GetHashProtocolManager::GetFullHashCachedResults(
// cache entry if they expire AND their expire time is after the negative
// cache expire time.

// See comments above MechanismExperimentHashDatabaseCache's definition for
// more context.
// - Outside the context of the experiment, always read from the main cache.
// - Inside the context of the experiment, read from the cache corresponding
// to the specific mechanism.
FullHashCache& full_hash_cache = full_hash_cache_;
DCHECK(is_lookup_mechanism_experiment_enabled_ ||
mechanism_experiment_cache_selection ==
MechanismExperimentHashDatabaseCache::kNoExperiment);
switch (mechanism_experiment_cache_selection) {
case MechanismExperimentHashDatabaseCache::kNoExperiment:
case MechanismExperimentHashDatabaseCache::kUrlRealTimeOnly:
full_hash_cache = full_hash_cache_;
break;
case MechanismExperimentHashDatabaseCache::kHashRealTimeOnly:
full_hash_cache =
lookup_mechanism_experiment_hash_realtime_full_hash_cache_;
break;
case MechanismExperimentHashDatabaseCache::kHashDatabaseOnly:
full_hash_cache =
lookup_mechanism_experiment_hash_database_full_hash_cache_;
break;
default:
NOTREACHED();
break;
}

std::unordered_set<HashPrefixStr> unique_prefixes_to_request;
for (const auto& it : full_hash_to_store_and_hash_prefixes) {
const FullHashStr& full_hash = it.first;
const StoreAndHashPrefixes& matched = it.second;
for (const StoreAndHashPrefix& matched_it : matched) {
const ListIdentifier& list_id = matched_it.list_id;
const HashPrefixStr& prefix = matched_it.hash_prefix;
auto prefix_entry = full_hash_cache_.find(prefix);
if (prefix_entry != full_hash_cache_.end()) {
auto prefix_entry = full_hash_cache.find(prefix);
if (prefix_entry != full_hash_cache.end()) {
// Case 1.
const CachedHashPrefixInfo& cached_prefix_info = prefix_entry->second;
bool found_full_hash = false;
Expand Down Expand Up @@ -754,7 +790,8 @@ void V4GetHashProtocolManager::SetClockForTests(base::Clock* clock) {
void V4GetHashProtocolManager::UpdateCache(
const std::vector<HashPrefixStr>& prefixes_requested,
const std::vector<FullHashInfo>& full_hash_infos,
const Time& negative_cache_expire) {
const Time& negative_cache_expire,
MechanismExperimentHashDatabaseCache mechanism_experiment_cache_selection) {
DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);

// If negative_cache_expire is null, don't cache the results since it's not
Expand All @@ -763,18 +800,64 @@ void V4GetHashProtocolManager::UpdateCache(
return;
}

for (const HashPrefixStr& prefix : prefixes_requested) {
// Create or reset the cached result for this prefix.
CachedHashPrefixInfo& chpi = full_hash_cache_[prefix];
chpi.full_hash_infos.clear();
chpi.negative_expiry = negative_cache_expire;
auto update_individual_cache =
[prefixes_requested, negative_cache_expire,
full_hash_infos](FullHashCache& full_hash_cache) {
for (const HashPrefixStr& prefix : prefixes_requested) {
// Create or reset the cached result for this prefix.
CachedHashPrefixInfo& chpi = full_hash_cache[prefix];
chpi.full_hash_infos.clear();
chpi.negative_expiry = negative_cache_expire;

for (const FullHashInfo& full_hash_info : full_hash_infos) {
if (V4ProtocolManagerUtil::FullHashMatchesHashPrefix(
full_hash_info.full_hash, prefix)) {
chpi.full_hash_infos.push_back(full_hash_info);
}
}
}
};

for (const FullHashInfo& full_hash_info : full_hash_infos) {
if (V4ProtocolManagerUtil::FullHashMatchesHashPrefix(
full_hash_info.full_hash, prefix)) {
chpi.full_hash_infos.push_back(full_hash_info);
}
// See comments above MechanismExperimentHashDatabaseCache's definition for
// more context.
if (is_lookup_mechanism_experiment_enabled_) {
switch (mechanism_experiment_cache_selection) {
// If this request is outside the scope of the experiment, update all
// three caches. We still want to update the experiment's backgrounded
// caches because there may be cache entries added outside the experiment
// that end up helping with the experiment's cache hits.
case MechanismExperimentHashDatabaseCache::kNoExperiment:
update_individual_cache(full_hash_cache_);
update_individual_cache(
lookup_mechanism_experiment_hash_realtime_full_hash_cache_);
update_individual_cache(
lookup_mechanism_experiment_hash_database_full_hash_cache_);
break;
// If this request is in the scope of the experiment and it's for the main
// request, we only want to update the main cache.
case MechanismExperimentHashDatabaseCache::kUrlRealTimeOnly:
update_individual_cache(full_hash_cache_);
break;
// If this request is in the scope of the experiment and it's for the
// backgrounded hash real-time request, we only want to update that
// backgrounded cache.
case MechanismExperimentHashDatabaseCache::kHashRealTimeOnly:
update_individual_cache(
lookup_mechanism_experiment_hash_realtime_full_hash_cache_);
break;
// If this request is in the scope of the experiment and it's for the
// backgrounded hash database request, we only want to update that
// backgrounded cache.
case MechanismExperimentHashDatabaseCache::kHashDatabaseOnly:
update_individual_cache(
lookup_mechanism_experiment_hash_database_full_hash_cache_);
break;
default:
NOTREACHED();
break;
}
} else {
update_individual_cache(full_hash_cache_);
}
}

Expand Down Expand Up @@ -868,7 +951,8 @@ void V4GetHashProtocolManager::OnURLLoaderCompleteInternal(
const std::unique_ptr<FullHashCallbackInfo>& fhci = it->second;
UMA_HISTOGRAM_LONG_TIMES("SafeBrowsing.V4GetHash.Network.Time",
clock_->Now() - fhci->network_start_time);
UpdateCache(fhci->prefixes_requested, full_hash_infos, negative_cache_expire);
UpdateCache(fhci->prefixes_requested, full_hash_infos, negative_cache_expire,
fhci->mechanism_experiment_cache_selection);
MergeResults(fhci->full_hash_to_store_and_hash_prefixes, full_hash_infos,
&fhci->cached_full_hash_infos);

Expand Down Expand Up @@ -906,6 +990,10 @@ void V4GetHashProtocolManager::CollectFullHashCacheInfo(
}
}

void V4GetHashProtocolManager::SetLookupMechanismExperimentIsEnabled() {
is_lookup_mechanism_experiment_enabled_ = true;
}

#ifndef DEBUG
std::ostream& operator<<(std::ostream& os, const FullHashInfo& fhi) {
os << "{full_hash: " << fhi.full_hash << "; list_id: " << fhi.list_id
Expand Down

0 comments on commit 7c7c040

Please sign in to comment.