Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exclude download, redirects, reload, embed and framed link visit types from the get_top_frecent_site_infos query #3505

Merged
merged 1 commit into from Aug 19, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGES_UNRELEASED.md
Expand Up @@ -23,3 +23,7 @@
and `reason` string values.

[#3308](https://github.com/mozilla/application-services/pull/3308/)

- Exclude download, redirects, reload, embed and framed link visit type from the
`get_top_frecent_site_infos` query.
([#3505](https://github.com/mozilla/application-services/pull/3505))
Expand Up @@ -176,6 +176,13 @@ class PlacesConnectionTest {

@Test
fun testGetTopFrecentSiteInfos() {
db.noteObservation(VisitObservation(url = "https://www.example.com/1", visitType = VisitType.DOWNLOAD))
db.noteObservation(VisitObservation(url = "https://www.example.com/1", visitType = VisitType.EMBED))
db.noteObservation(VisitObservation(url = "https://www.example.com/1", visitType = VisitType.REDIRECT_PERMANENT))
db.noteObservation(VisitObservation(url = "https://www.example.com/1", visitType = VisitType.REDIRECT_TEMPORARY))
db.noteObservation(VisitObservation(url = "https://www.example.com/1", visitType = VisitType.FRAMED_LINK))
db.noteObservation(VisitObservation(url = "https://www.example.com/1", visitType = VisitType.RELOAD))

val toAdd = listOf(
"https://www.example.com/123",
"https://www.example.com/123",
Expand Down
32 changes: 25 additions & 7 deletions components/places/src/storage/history.rs
Expand Up @@ -1195,16 +1195,34 @@ pub fn get_visited_urls(
}

pub fn get_top_frecent_site_infos(db: &PlacesDb, num_items: i32) -> Result<TopFrecentSiteInfos> {
// Get the complement of the visit types that should be excluded.
let allowed_types = VisitTransitionSet::for_specific(&[
VisitTransition::Download,
VisitTransition::Embed,
VisitTransition::RedirectPermanent,
VisitTransition::RedirectTemporary,
VisitTransition::FramedLink,
VisitTransition::Reload,
])
.complement();

let infos = db.query_rows_and_then_named_cached(
"SELECT frecency, title, url
FROM moz_places
WHERE (SUBSTR(url, 1, 6) == 'https:' OR SUBSTR(url, 1, 5) == 'http:')
AND (last_visit_date_local + last_visit_date_remote) != 0 AND
NOT hidden
ORDER BY frecency DESC
LIMIT :limit",
"SELECT h.frecency, h.title, h.url
FROM moz_places h
WHERE EXISTS (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'm wondering about the performance implications for large history sets. We're already doing a full table scan on moz_places, since we don't index on anything except frecency; now we'll need another table scan on moz_historyvisits.

I don't want to block your PR on that, but it's something to think about if we want to push top site storage down into Places. Having a top_site tri-state column in moz_places (always show, never show, or show based on frecency) would let us do that without the second scan—we'd pick the "always show" sites first, and then fill the rest based on frecency up to the limit. "Never show" would be for sites that the user explicitly removes from top sites, as well as ones that have forbidden types—and we can determine those cheaply at insert time! 🎉

SELECT v.visit_type
FROM moz_historyvisits v
WHERE h.id = v.place_id
AND (SUBSTR(h.url, 1, 6) == 'https:' OR SUBSTR(h.url, 1, 5) == 'http:')
AND (h.last_visit_date_local + h.last_visit_date_remote) != 0
AND ((1 << v.visit_type) & :allowed_types) != 0 AND
NOT h.hidden
)
ORDER BY h.frecency DESC
LIMIT :limit",
rusqlite::named_params! {
":limit": num_items,
":allowed_types": allowed_types,
},
TopFrecentSiteInfo::from_row,
)?;
Expand Down