Skip to content

Commit

Permalink
[memories] Support async, piecewise construction of visits.
Browse files Browse the repository at this point in the history
Visit construction is driven my multiple events that occur with
nondeterministic order:
1) The history navigation occurs.
2) The history query for previous visits to the same URL is returned.
3) The omnibox URL is copied.
4) UKM begins tracking the navigation.
5) UKM stops tracking the navigation.
6) The |WebContents| is destroyed (i.e. the tab is closed).
7) Another history navigation occurs.

The typical flow for a navigation that ends with the tab closing is:
1) History navigation occurs.
2) UKM begins tracking the navigation.
3) The history query resolves.
4) |WebContents| is destroyed.
5) UKM stops tracking the navigation.

The typical flow for a navigation that ends with another navigation:
1) History navigation occurs.
2) UKM begins tracking the navigation.
3) The history query resolves.
4) Another history navigation occurs.
5) UKM stops tracking the navigation.
6) UKM begins tracking the 2nd navigation.

The typical flow for a navigation that is followed by a same-app
navigation:
1) History navigation occurs.
2) UKM begins tracking the navigation.
3) The history query resolves.
4) N same-app history navigation occurs & N history queries resolve.
5) |WebContents| is destroyed.
6) UKM stops tracking 1st navigation.

The typical flow for a navigation that ends quickly:
1) History navigation occurs.
2) UKM begins tracking the navigation.
3) |WebContents| is destroyed.
4) UKM stops tracking the navigation.

In other words, arbitrary timing, the lifetime of a navigation, the
number of navigations in the same tab, and whether navigating within a
single-page-app can all effect the order of events.

This CL handles these and other possible permutations of the event
timeline. Specifically, this CL:

- Introduces |HistoryClustersTabHelper::ExpectUkmNavigationComplete()|
  invoked when (4) UKM begins tracking a navigation. This allows
  determining whether we expect
  |HistoryClustersTabHelper::OnUkmNavigationComplete()| to be invoked
  later for that navigation; otherwise, we would have to keep a
  navigation_id->visit mapping around forever.

- Commits incomplete visits only after:
  1) The history query resolves; otherwise, we'd have a memories visit
     without a history visit row which isn't useful.
  2) Either the next same-tab navigation occurs or the tab is closed;
     otherwise we wouldn't have page end context signals.
  3) Either UKM isn't tracking the navigation (i.e. same-app
     navigations) or UKM has stopped tracking the navigation; otherwise
     ths visit wouldn't have a |page_end_reason| context signal and UKM
     wouldn't be able to record the visits' context signals.

- Likewise, records navigation end metrics only when both (1) the
history request has resolved and (2) the navigation has ended.

- Queries the history DB for previous 2 visits to the URL to 1) find the
  navigation's URL and visit rows, and 2) compute the
  |duration_since_last_visit_seconds| context signal.

- Uses navigation IDs and URLs instead of |NavigationHandle| and
  |HistoryAddPageArgs| to identify visits in the memories layer and
  history DB.

- No longer records the context signal |omnibox_url_copied| when copying
  to the |kSelection| buffer (i.e. when selecting text on linux).


Bug: 1171352
Change-Id: I92a031c94ae16f9563c15bf2a5ff75581be1e75e
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2762559
Reviewed-by: Nicolás Peña Moreno <npm@chromium.org>
Reviewed-by: Scott Violet <sky@chromium.org>
Reviewed-by: Tommy Li <tommycli@chromium.org>
Commit-Queue: manuk hovanesian <manukh@chromium.org>
Cr-Commit-Position: refs/heads/master@{#868140}
  • Loading branch information
manukh authored and Chromium LUCI CQ committed Mar 31, 2021
1 parent 7276d70 commit 67f6040
Show file tree
Hide file tree
Showing 17 changed files with 1,079 additions and 246 deletions.
4 changes: 2 additions & 2 deletions chrome/browser/history/history_tab_helper.cc
Original file line number Diff line number Diff line change
Expand Up @@ -180,8 +180,8 @@ void HistoryTabHelper::DidFinishNavigation(

if (HistoryClustersTabHelper* clusters_tab_helper =
HistoryClustersTabHelper::FromWebContents(web_contents())) {
clusters_tab_helper->DidUpdateHistoryForNavigation(navigation_handle,
add_page_args);
clusters_tab_helper->OnUpdatedHistoryForNavigation(
navigation_handle->GetNavigationId(), add_page_args.url);
}
}

Expand Down
2 changes: 1 addition & 1 deletion chrome/browser/history/history_tab_helper_unittest.cc
Original file line number Diff line number Diff line change
Expand Up @@ -102,7 +102,7 @@ TEST_F(HistoryTabHelperTest, ShouldLimitTitleUpdatesPerPage) {

ASSERT_EQ("title10", QueryPageTitleFromHistory(page_url_));

// Furhter updates should be ignored.
// Further updates should be ignored.
web_contents()->UpdateTitleForEntry(entry, u"title11");
EXPECT_EQ("title10", QueryPageTitleFromHistory(page_url_));
}
Expand Down
295 changes: 176 additions & 119 deletions chrome/browser/history_clusters/history_clusters_tab_helper.cc
Original file line number Diff line number Diff line change
Expand Up @@ -4,19 +4,24 @@

#include "chrome/browser/history_clusters/history_clusters_tab_helper.h"

#include "base/ranges/algorithm.h"
#include <functional>
#include <memory>
#include <utility>

#include "build/build_config.h"
#include "chrome/browser/bookmarks/bookmark_model_factory.h"
#include "chrome/browser/history/history_service_factory.h"
#include "chrome/browser/history_clusters/memories_service_factory.h"
#include "chrome/browser/profiles/profile.h"
#include "components/bookmarks/browser/bookmark_model.h"
#include "components/history/core/browser/history_service.h"
#include "components/history/core/browser/history_backend.h"
#include "components/history/core/browser/history_database.h"
#include "components/history/core/browser/history_db_task.h"
#include "components/history/core/browser/history_types.h"
#include "components/history/core/browser/url_row.h"
#include "components/history_clusters/core/memories_service.h"
#include "components/keyed_service/core/service_access_type.h"
#include "components/omnibox/browser/omnibox_view.h"
#include "content/public/browser/navigation_handle.h"
#include "content/public/browser/web_contents.h"

#if !defined(OS_ANDROID)
#include "base/containers/contains.h"
Expand All @@ -30,6 +35,38 @@

namespace {

using UrlAndVisitCallback =
base::OnceCallback<void(history::URLRow, history::VisitVector)>;

// Gets the 2 most recent visits to a URL. Used to associate a memories visit
// with its history rows and compute the |duration_since_last_visit_seconds|
// context signal.
class GetMostRecentVisitsToUrl : public history::HistoryDBTask {
public:
GetMostRecentVisitsToUrl(const GURL& url, UrlAndVisitCallback callback)
: url_(url), callback_(std::move(callback)) {}

bool RunOnDBThread(history::HistoryBackend* backend,
history::HistoryDatabase* db) override {
if (backend->GetURL(url_, &url_row_))
backend->GetMostRecentVisitsForURL(url_row_.id(), 2, &visits_);
return true;
}

void DoneRunOnMainThread() override {
if (!visits_.empty()) {
DCHECK_LE(visits_.size(), 2u);
std::move(callback_).Run(url_row_, visits_);
}
}

private:
const GURL url_;
history::URLRow url_row_;
history::VisitVector visits_;
UrlAndVisitCallback callback_;
};

bool IsPageInTabGroup(content::WebContents* contents) {
DCHECK(contents);

Expand Down Expand Up @@ -61,6 +98,14 @@ bool IsPageBookmarked(content::WebContents* contents, const GURL& url) {
return model && model->IsBookmarked(url);
}

base::TimeDelta TimeElapsedBetweenVisits(const history::VisitRow& visit1,
const history::VisitRow& visit2) {
base::TimeDelta delta = visit2.visit_time - visit1.visit_time;
// Clamp to 30 days maximum to match the UKM retention period.
const base::TimeDelta kMaxDurationClamp = base::TimeDelta::FromDays(30);
return delta < kMaxDurationClamp ? delta : kMaxDurationClamp;
}

} // namespace

HistoryClustersTabHelper::HistoryClustersTabHelper(
Expand All @@ -69,106 +114,132 @@ HistoryClustersTabHelper::HistoryClustersTabHelper(

HistoryClustersTabHelper::~HistoryClustersTabHelper() = default;

void HistoryClustersTabHelper::LogUrlCopied() {
if (!visits_.empty()) {
visits_.back().context_signals.omnibox_url_copied = true;
void HistoryClustersTabHelper::OnOmniboxUrlCopied() {
DCHECK(!navigation_ids_.empty());
if (!navigation_ids_.empty()) {
GetMemoriesService()
->GetIncompleteVisit(navigation_ids_.back())
.context_signals.omnibox_url_copied = true;
}
}

void HistoryClustersTabHelper::DidUpdateHistoryForNavigation(
content::NavigationHandle* navigation_handle,
const history::HistoryAddPageArgs& add_page_args) {
if (!visits_.empty()) {
// We are starting a new navigation, so record the page end metrics.
// TODO(tommycli): We don't know the page end reason, but maybe we could
// guess and assume it's a same document navigation. Investigate this.
RecordPageEndMetricsIfNeeded(visits_.back());
}

// Copy over exactly what's stored in History for the URL and timestamp.
//
// TODO(tommycli): HistoryAddPageArgs is further manipulated in the
// HistoryBackend where we cannot access it. We may need to have a more
// advanced approach to make sure we are accurately reflecting History.
auto visit = memories::MemoriesVisit(navigation_handle->GetNavigationId(),
add_page_args.url);
visit.visit_time = add_page_args.time;

// TODO(tommycli): add_page_args also contains the context_id and nav_entry_id
// that we probably need to store to get the database visit_id. Maybe we
// should fetch the visit_id in this location... or maybe save the context_id
// and nav_entry_id for saving before we flush to MemoriesService.

// Record page-start context signals.
void HistoryClustersTabHelper::OnUpdatedHistoryForNavigation(
int64_t navigation_id,
const GURL& url) {
StartNewNavigationIfNeeded(navigation_id);
auto* memories_service = GetMemoriesService();
auto& visit = memories_service->GetOrCreateIncompleteVisit(navigation_id);
visit.context_signals.is_existing_part_of_tab_group =
IsPageInTabGroup(web_contents());
visit.context_signals.is_existing_bookmark =
IsPageBookmarked(web_contents(), visit.url);
visits_.push_back(visit);

Profile* profile =
Profile::FromBrowserContext(web_contents()->GetBrowserContext());
history::HistoryService* history_service =
HistoryServiceFactory::GetForProfileIfExists(
profile, ServiceAccessType::IMPLICIT_ACCESS);
if (history_service) {
// Using |visit.visit_time| as the |end_time| param will exactly exclude the
// current visit, as the range is exclusive at the end. Using
// base::Unretained is fine here, as |task_tracker_| prevents
// use-after-free.
//
// TODO(tommycli): Perhaps we should migrate this call to MemoriesService,
// in case the user closes the tab too quickly after we make this call.
history_service->GetLastVisitToURL(
visit.url, visit.visit_time,
base::BindOnce(&HistoryClustersTabHelper::PreviousVisitToUrlCallback,
base::Unretained(this),
navigation_handle->GetNavigationId()),
IsPageBookmarked(web_contents(), url);

if (auto* history_service = GetHistoryService()) {
// This |GetMostRecentVisitsToUrl| task will find at least 1 visit since
// |HistoryTabHelper::UpdateHistoryForNavigation()|, invoked prior to
// |OnUpdatedHistoryForNavigation()|, will have posted a task to add the
// visit represented by |visit|.
history_service->ScheduleDBTask(
FROM_HERE,
std::make_unique<GetMostRecentVisitsToUrl>(
url,
base::BindOnce(
[](HistoryClustersTabHelper* history_clusters_tab_helper,
memories::MemoriesService* memories_service,
int64_t navigation_id, memories::MemoriesVisit& visit,
history::URLRow url_row, history::VisitVector visits) {
DCHECK(history_clusters_tab_helper);
DCHECK(memories_service);
DCHECK(url_row.id());
DCHECK(visits[0].visit_id);
DCHECK_EQ(url_row.id(), visits[0].url_id);
visit.url_row = url_row;
visit.visit_row = visits[0];
if (visits.size() > 1) {
visit.context_signals.duration_since_last_visit_seconds =
TimeElapsedBetweenVisits(visits[1], visits[0])
.InSeconds();
}
// If the navigation has already ended, record the page end
// metrics.
visit.status.history_rows = true;
if (visit.status.navigation_ended) {
DCHECK(!visit.status.navigation_end_signals);
history_clusters_tab_helper->RecordPageEndMetricsIfNeeded(
navigation_id);
}
},
this, memories_service, navigation_id, std::ref(visit))),
&task_tracker_);
}
}

base::Optional<memories::MemoriesVisit>
HistoryClustersTabHelper::UpdatePageEndReasonAndGetVisitForUkm(
void HistoryClustersTabHelper::TagNavigationAsExpectingUkmNavigationComplete(
int64_t navigation_id) {
GetMemoriesService()
->GetOrCreateIncompleteVisit(navigation_id)
.status.expect_ukm_page_end_signals = true;
StartNewNavigationIfNeeded(navigation_id);
}

memories::VisitContextSignals HistoryClustersTabHelper::OnUkmNavigationComplete(
int64_t navigation_id,
const page_load_metrics::PageEndReason page_end_reason) {
// Find the visit that matches the navigation id.
auto visit_it = base::ranges::find(visits_, navigation_id,
[](auto& v) { return v.navigation_id; });
if (visit_it == visits_.end())
return base::nullopt;

visit_it->context_signals.page_end_reason = page_end_reason;
RecordPageEndMetricsIfNeeded(*visit_it);

// Return a copy to the caller, as we are about to flush away our copy.
base::Optional<memories::MemoriesVisit> copy = *visit_it;

// We are done with this visit. Send it to the service, if available, then
// delete from our own vector.
if (memories::MemoriesService* service =
MemoriesServiceFactory::GetForBrowserContext(
web_contents()->GetBrowserContext())) {
service->AddVisit(*visit_it);
visits_.erase(visit_it);
}
return copy;
auto* memories_service = GetMemoriesService();
auto& visit = memories_service->GetIncompleteVisit(navigation_id);
visit.context_signals.page_end_reason = page_end_reason;
// |RecordPageEndMetricsIfNeeded()| will fail to complete the visit as
// |ukm_page_end_signals| hasn't been set yet, but it will record metrics
// if needed (i.e. not already recorded) and possible (i.e. the history
// request has resolved and |history_rows| have been recorded).
RecordPageEndMetricsIfNeeded(navigation_id);
// Make a copy of the context signals as the referenced visit may be
// destroyed in |CompleteVisitIfReady()|.
auto context_signals_copy = visit.context_signals;
DCHECK(visit.status.expect_ukm_page_end_signals);
visit.status.ukm_page_end_signals = true;
memories_service->CompleteVisitIfReady(navigation_id);
return context_signals_copy;
}

void HistoryClustersTabHelper::RecordPageEndMetricsIfNeeded(
memories::MemoriesVisit& visit) {
if (visit.page_end_metrics_recorded)
void HistoryClustersTabHelper::StartNewNavigationIfNeeded(
int64_t navigation_id) {
if (!navigation_ids_.empty() && navigation_id == navigation_ids_.back())
return;

visit.page_end_metrics_recorded = true;
// We are starting a new navigation, so record the navigation end metrics for
// the previous navigation.
// TODO(tommycli): We don't know the page end reason, but maybe we could guess
// and assume it's a same document navigation. Investigate this.
if (!navigation_ids_.empty())
RecordPageEndMetricsIfNeeded(navigation_ids_.back());
navigation_ids_.push_back(navigation_id);
}

void HistoryClustersTabHelper::RecordPageEndMetricsIfNeeded(
int64_t navigation_id) {
auto* memories_service = GetMemoriesService();
if (!memories_service->HasIncompleteVisit(navigation_id))
return;
auto& visit = memories_service->GetIncompleteVisit(navigation_id);
if (visit.status.navigation_end_signals) {
DCHECK(visit.status.navigation_ended);
return;
}
visit.status.navigation_ended = true;
// Don't record page end metrics if the history rows request hasn't resolved
// because some of the metrics rely on |url_row.url()|. Setting
// |navigation_ended| above will ensure |RecordPageEndMetricsIfNeeded()| is
// re-invoked once the history request resolves.
if (!visit.status.history_rows)
return;

visit.context_signals.is_placed_in_tab_group =
!visit.context_signals.is_existing_part_of_tab_group &&
IsPageInTabGroup(web_contents());
visit.context_signals.is_new_bookmark =
!visit.context_signals.is_existing_bookmark &&
IsPageBookmarked(web_contents(), visit.url);

IsPageBookmarked(web_contents(), visit.url_row.url());
// Android does not have NTP Custom Links.
#if !defined(OS_ANDROID)
// This queries the prefs directly if the visit URL is stored as an NTP
Expand All @@ -178,53 +249,39 @@ void HistoryClustersTabHelper::RecordPageEndMetricsIfNeeded(
->GetPrefs();
ntp_tiles::CustomLinksStore custom_link_store(pref_service);
visit.context_signals.is_ntp_custom_link =
base::Contains(custom_link_store.RetrieveLinks(), visit.url,
base::Contains(custom_link_store.RetrieveLinks(), visit.url_row.url(),
[](const auto& link) { return link.url; });
#endif // !defined(OS_ANDROID)

visit.status.navigation_end_signals = true;
memories_service->CompleteVisitIfReady(navigation_id);
}

void HistoryClustersTabHelper::WebContentsDestroyed() {
memories::MemoriesService* service =
MemoriesServiceFactory::GetForBrowserContext(
web_contents()->GetBrowserContext());
if (!service)
return;
for (auto navigation_id : navigation_ids_)
RecordPageEndMetricsIfNeeded(navigation_id);
}

// Finalize and flush any leftover visits, which may be same-doc navigations.
// TODO(tommycli): We really should flush these faster than
// WebContentsDestroyed(). We should also make a guess of the page end reason.
// TODO(tommycli): When a tab is closed, UKM updates the page end reason AFTER
// the WebContents is destroyed. For those last pages, the page end reason is
// not properly recorded, nor is UKM recorded properly. Fix this.
for (auto& visit : visits_) {
RecordPageEndMetricsIfNeeded(visit);
service->AddVisit(visit);
memories::MemoriesService* HistoryClustersTabHelper::GetMemoriesService() {
if (!web_contents()) {
NOTREACHED();
return nullptr;
}
visits_.clear();
auto* service = MemoriesServiceFactory::GetForBrowserContext(
web_contents()->GetBrowserContext());
DCHECK(service);
return service;
}

void HistoryClustersTabHelper::PreviousVisitToUrlCallback(
int64_t navigation_id,
history::HistoryLastVisitResult result) {
if (!result.success || result.last_visit.is_null())
return;

// Find the visit that matches the navigation id.
auto visit_it = base::ranges::find(visits_, navigation_id,
[](auto& v) { return v.navigation_id; });
if (visit_it == visits_.end())
return;

base::TimeDelta since_last_visit = visit_it->visit_time - result.last_visit;

// Clamp to 30 days maximum to match the UKM retention period.
const base::TimeDelta kMaxDurationClamp = base::TimeDelta::FromDays(30);
if (since_last_visit > kMaxDurationClamp) {
since_last_visit = kMaxDurationClamp;
history::HistoryService* HistoryClustersTabHelper::GetHistoryService() {
if (!web_contents()) {
NOTREACHED();
return nullptr;
}

visit_it->context_signals.duration_since_last_visit_seconds =
since_last_visit.InSeconds();
Profile* profile =
Profile::FromBrowserContext(web_contents()->GetBrowserContext());
return HistoryServiceFactory::GetForProfileIfExists(
profile, ServiceAccessType::IMPLICIT_ACCESS);
}

WEB_CONTENTS_USER_DATA_KEY_IMPL(HistoryClustersTabHelper)

0 comments on commit 67f6040

Please sign in to comment.