Skip to content
This repository has been archived by the owner. It is now read-only.

feat (tippytop): #3218 implement TippyTopFeed with remote manifest fetching #3787

Merged
merged 1 commit into from Nov 10, 2017

Conversation

@rlr
Copy link
Contributor

@rlr rlr commented Nov 2, 2017

Before:
screen shot 2017-11-02 at 10 47 51 am

After:
screen shot 2017-11-02 at 10 39 50 am

@rlr
Copy link
Contributor Author

@rlr rlr commented Nov 2, 2017

I'm still missing a bunch of tests and have some changes in mind, but I wanted to get some 👀 on this sooner than later. I have some doubts/concerns that I want to address. I'll add some comments inline.

f? @Mardak @ncloudioj @k88hudson

@@ -142,8 +141,8 @@ this.TopSitesFeed = class TopSitesFeed {
* @param {bool} options.broadcast Should the update be broadcasted.
*/
async refresh(options = {}) {
if (!this._tippyTopProvider.initialized) {
await this._tippyTopProvider.init();
if (!this.store.getState().TippyTop.initialized) {

This comment has been minimized.

@rlr

rlr Nov 2, 2017
Author Contributor

TippyTop initialization is going to take longer now on a cold cache (first run, for example). It has to load the default manifest we ship with and then fetch the remote one. So, maybe we want to mark as initialized after the defaults are loaded? The first new tab opened might only have the default set and we might end up requesting screenshots that aren't really needed. But we can show the stuff faster.

@rlr
Copy link
Contributor Author

@rlr rlr commented Nov 2, 2017

Another possible performance issue is that the icon URLs are remote. You can see a delay on first load. I think we should be able to add a persistent cache for them. Basically fetch them and store as data URIs.


this.getDomain = function(url) {
let domain = new URL(url).hostname;
if (domain && domain.startsWith("www.")) {

This comment has been minimized.

@ncloudioj

ncloudioj Nov 2, 2017
Member

Do we want to handle those www2 mirror sites? There are quite a few of them (www1/2/3/4) in my Places.

This comment has been minimized.

@rlr

rlr Nov 3, 2017
Author Contributor

hmm, i guess we could do a regex check for most cases?

@ncloudioj
Copy link
Member

@ncloudioj ncloudioj commented Nov 2, 2017

Another possible performance issue is that the icon URLs are remote. You can see a delay on first load. I think we should be able to add a persistent cache for them. Basically fetch them and store as data URIs.

I wonder if we can simply rely on the browser's image caching for this? The first load could be slow, but it should hit the local cache for the following loads afterwards. IIRC we only ship image URLs for Tiles in the manifest file, although we do host those images on our own server with proper CDN settings.

@ncloudioj
Copy link
Member

@ncloudioj ncloudioj commented Nov 2, 2017

@rlr This looks good to me. Nice feature!

I'd recommend to add a module level document to outline how tippytop works in AS, because there are a few moving parts involved. For example, the default local tippytop file, the remote one that gets downloaded periodically as well as the persistent cache for it.

@piatra
Copy link
Collaborator

@piatra piatra commented Nov 2, 2017

@rlr

Another possible performance issue is that the icon URLs are remote. You can see a delay on first load. I think we should be able to add a persistent cache for them. Basically fetch them and store as data URIs.

There's an ExpirationFilter that will keep those images around if you tell it https://searchfox.org/mozilla-central/rev/423b2522c48e1d654e30ffc337164d677f934ec3/toolkit/modules/NewTabUtils.jsm#1919
It's already setup to work with AS Topsites https://searchfox.org/mozilla-central/rev/423b2522c48e1d654e30ffc337164d677f934ec3/browser/base/content/browser-thumbnails.js#214-218 but if you're not seeing that probably your PR adds new images/URLs that the ExpirationFilter is not aware of?

@Mardak
Copy link
Member

@Mardak Mardak commented Nov 2, 2017

@piatra the expiration filter is only for thumbnails. The purpose of tippy top is to avoid thumbnails to begin with.

If you're suggesting of putting the tippy top icons into the thumbnail cache, that will be a bit of work as thumbnails are currently shown along with a letter fallback in the corner, but tippy top's behavior is to skip that behavior.

@Mardak
Copy link
Member

@Mardak Mardak commented Nov 2, 2017

A different way to get tippy top icons into some cache is to update moz_favicons. @ncloudioj knows more of those details from saving rich icons from pages. But basically, we could save our hosted rich icon as the icon for the page (assuming it's not competing against a page-provided rich icon -- where in the screenshots above seem to show different rich icons for groupon…)

Also, unclear if it's "okay" to inject this tippy top data into moz_favicons to begin with…

@ncloudioj
Copy link
Member

@ncloudioj ncloudioj commented Nov 2, 2017

Also, unclear if it's "okay" to inject this tippy top data into moz_favicons to begin with…

To preload those tippy top icons into moz_favicons, we'll have to create dummy visits and store them into Places first, under the hood, Firefox uses PlacesUtils.favicon.setAndFetchFaviconForPage(aPageURI, aFaviconURI... to associate an icon with that site, I believe that'll be harder to get it done right.

@Mardak
Copy link
Member

@Mardak Mardak commented Nov 2, 2017

Oh sorry, I didn't realize https://s3.amazonaws.com/activitystream-dev-default-resources-s3bucket-1qw8m6s29v3dq/tippytop/icons.json linked to per-site hosted icons. (I haven't really looked at the PR in detail yet.)

What's the purpose of this tippy-top approach in the context of firefox already saving rich icons from the site? For some reason rich icons aren't being saved from the page?

I just visited nike, upwork, nba, airbnb as the "before" screenshot above showed thumbnails. But all of those seem to show rich icons for me… ?

@Mardak
Copy link
Member

@Mardak Mardak commented Nov 2, 2017

@ncloudioj we wouldn't need to preload the icon…? If we were to show a top site (page), we could setAndFetchFaviconForPage of that page url with the icon url from tippy top.

@ncloudioj
Copy link
Member

@ncloudioj ncloudioj commented Nov 2, 2017

@Mardak Oh, you're right. We can setAndFetch at that point, then all the future icon loads could be fetched from moz_favicon.

Weird, just tried nike.com here, got a screenshot. Checking my favicon database, there are a few rich icons for "nike.com". Maybe it's because the link in my topsites was "www.nike.com/ca/en_gb"?

@ncloudioj
Copy link
Member

@ncloudioj ncloudioj commented Nov 2, 2017

Hrmm, "python.org" didn't grab the rich icon either. Can you guys try it out?

@piatra
Copy link
Collaborator

@piatra piatra commented Nov 2, 2017

@ncloudioj screenshot for me for python.org

@ncloudioj
Copy link
Member

@ncloudioj ncloudioj commented Nov 2, 2017

Filed a bug #3788 to track the issue described above.

@rlr
Copy link
Contributor Author

@rlr rlr commented Nov 3, 2017

Sounds like we are not sure how much benefit this will give us over our rich icons. Rich icons seems to be pretty damn good already and if we fix the few issues we are uncovering, they'll be even better.

What's the purpose of this tippy-top approach in the context of firefox already saving rich icons from the site? For some reason rich icons aren't being saved from the page?

It would be good to know how often users are seeing screenshots. Maybe we still want remote tippy top to fix some "critical" cases and we leave out all the sites that we know we are already finding rich icons for. That would keep the payload small.

What should we do until we answer those questions? One options is we can land it with remote fetching disabled. Then we can enable after learning more or run experiments/shield studies in the meantime. And I still like this feed approach over the previous provider approach.

Landing this would also allow me to create my own custom manifest so I can make my top sites cooler 😄

@rlr rlr force-pushed the rlr:tippy-top branch from 59f7769 to 292097d Nov 3, 2017
@Mardak
Copy link
Member

@Mardak Mardak commented Nov 3, 2017

@tspurway clarified that one main benefit of these tippy top icons is to get rich icons that websites only advertise to iOS useragents. Ideally, websites will start putting rich icons on all pages, so Firefox icon parser will just work without needing tippy top, but before then, we can make things look better for users.

@rlr I would think we would want to get some form of caching before this lands to avoid live requests for these site-hosted rich icons each time we show activity stream. The moz_favicons approach seems most natural as it's an existing persistent cache for favicons, and its existing consumers (including activity stream) will just happen to get better icons.

This however, is a pretty different approach from the current PR. In particular, at a high level:

  • TippyTopProvider remains unchanged to provide icons for potentially-unvisited default top sites
  • TippyTopFeed of some sort still needs to exist to fetch/cache manifest
    • instead of being in the critical path of TopSitesFeed, NewTabUtils just happens to get the tippy top icons
    • maybe one approach would be part of the TopSitesFeed._fetchIcon method similar to the async screenshot request but an async tippytop check? and dispatch something similar to SCREENSHOT_UPDATED but with the result of having just added the tippy top icon to moz_favicons
@Mardak
Copy link
Member

@Mardak Mardak commented Nov 3, 2017

Also on a separate note, did none of the tippy top sites have .svg icons? Or that the parser isn't looking for svg? The firefox code treats svg as a preferred icon.

@rlr
Copy link
Contributor Author

@rlr rlr commented Nov 6, 2017

...did none of the tippy top sites have .svg icons? Or that the parser isn't looking for svg? The firefox code treats svg as a preferred icon.

The scraper I used doesn't grab svg but that should be easy to fix.

@rlr
Copy link
Contributor Author

@rlr rlr commented Nov 6, 2017

The moz_favicons approach seems most natural as it's an existing persistent cache for favicons, and its existing consumers (including activity stream) will just happen to get better icons.

That sounds good. Once we have that in place, I think we would want rich icons to take priority over tippy top? So, basically we would show the user icons in this priority: 1) rich icons found by our builtin parser, 2) rich icons found with the help of tippy top manifest, 3) tippy top icons we ship with, 4) screenshots with low res icon, 5) a letter.

I think the only tricky part will be the messaging between TopSitesFeed and TippyTopFeed to try to fetch the icon and send back a response before falling back to try to get a screenshot. But I think it can all be by dispatching actions.

@Mardak
Copy link
Member

@Mardak Mardak commented Nov 6, 2017

I think the only tricky part will be the messaging between TopSitesFeed and TippyTopFeed to try to fetch the icon and send back a response before falling back to try to get a screenshot.

It's not ideal but unclear how bad it would be if we ended up requesting a screenshot and only later deciding that we should inject a tippy top rich icon and use that instead. The user might not even see the screenshot as in the common case:

  1. tippy top icon has already been inserted into moz_favicons
  2. if not, tippy top manifest already exists in the redux state, so any newly added top sites will trigger rich icon insertion behavior (and not screenshot)
  3. a worst case when the manifest hasn't been loaded yet, we end up unnecessarily requesting a screenshot then replacing it before a new tab is opened
  4. worst case would be no manifest, screenshot triggered, user opens new tab seeing letter fallback then screenshot then potentially replaced by injected rich icon (assuming we dispatch on injected icon vs just having it ready for next time)
@rlr rlr force-pushed the rlr:tippy-top branch 3 times, most recently from 6578168 to 8edd725 Nov 8, 2017
@rlr
Copy link
Contributor Author

@rlr rlr commented Nov 8, 2017

@Mardak I think this is ready for another look 👀 .

Right now the term TippyTop might cause some confusion. Maybe TippyTopFeed should be renamed to RichIconFeed and it should handle the other issues you mentioned in #3788? Or maybe TippyTopProvider should be renamed to DefaultSiteIconProvider?

switch (action.type) {
case at.TIPPYTOP_INIT:
case at.TIPPYTOP_UPDATED:
return Object.assign({}, prevState, {initialized: true, sitesByDomain: action.data});

This comment has been minimized.

@Mardak

Mardak Nov 8, 2017
Member

@k88hudson should there be any concern about putting all the tippy top data into redux? Even if we didn't broadcast it on TIPPYTOP_{INIT,UPDATED}, it would make its way to content via rehydration. And it looks like there's no use of that data from content in this PR or near future (?) ?

Instead of putting tippytop data in the store, maybe we just stash it as some property on the feed?

This comment has been minimized.

@rlr

rlr Nov 8, 2017
Author Contributor

Good point. The way this ended up, we don't need this data in the redux store 👍

if (!this.initializing) {
await this.init();
}
return;

This comment has been minimized.

@Mardak

Mardak Nov 9, 2017
Member

All of this initialization/queue stuff seems a bit complex when async fetchIcon can just naturally queue up by awaiting. Maybe something like:

async fetchIcon(url) {
  const sitesByDomain = await this.getSitesByDomain();
  if (domain in sitesByDomain) 
}
getSitesByDomain() {
  // return an already loaded object or a promise for that object
  return this._sitesByDomain || (this._sitesByDomain = new Promise(async resolve => {
    await this.loadCachedData();
    await this.maybeRefresh();
    resolve(this._sitesByDomain); // NB one of the above set an object to this._sitesByDomain
  }));
}

When this._sitesByDomain

  • … is an Object, it's effectively initialized
  • … is a Promise, it's effectively initializing
  • … is undefined, it's basically doing init

And each call to fetchIcon can just await this.getSites to get the current sites and happen to wait for initialization without realizing it's initialized / initializing or not.

And however we end up refreshing, it should assign the new sites object to this._sitesByDomain, and fetchIcon calls will just get the updated data.

This comment has been minimized.

@rlr

rlr Nov 9, 2017
Author Contributor

ooooh, the same trick we used in the PersistentCache. Some day I'll start thinking in promises 😉

@Mardak
Copy link
Member

@Mardak Mardak commented Nov 9, 2017

Right now the term TippyTop might cause some confusion. Maybe TippyTopFeed should be renamed to RichIconFeed and it should handle the other issues you mentioned in #3788?

Actually, that sounds pretty good. Maybe more generically FaviconFeed (as the fix for #3788 could end up saving non-rich icons) and leave TippyTopProvider (to avoid removing an existing file).

@rlr rlr force-pushed the rlr:tippy-top branch from c9641e4 to da645d2 Nov 9, 2017
}],
["tippyTop.service.endpoint", {
title: "Tippy Top service manifest url",
value: "https://s3.amazonaws.com/activitystream-dev-default-resources-s3bucket-1qw8m6s29v3dq/tippytop/icons.json"

This comment has been minimized.

@rlr

rlr Nov 9, 2017
Author Contributor

@Mardak How is this pref managed later on once it lands? We might want separate values for nightly/beta/release. We might want to run experiments with different URLs. etc. Do I need to do anything special for now?

This comment has been minimized.

@Mardak

Mardak Nov 9, 2017
Member

If it's via a shield study, the prefs will be replaced with some other value. Also, we probably need to fix up mozilla-central test prefs to disable this. And what's the plan for production endpoint?

@rlr rlr changed the title [WIP] feat (tippytop): #3218 implement TippyTopFeed with remote manifest fetching feat (tippytop): #3218 implement TippyTopFeed with remote manifest fetching Nov 9, 2017
@rlr
Copy link
Contributor Author

@rlr rlr commented Nov 9, 2017

This cleaned up pretty nicely as it's now pretty much a self contained feed that just listens for actions and does stuff.

@Mardak
Mardak approved these changes Nov 9, 2017
Copy link
Member

@Mardak Mardak left a comment

Seems to be working. We'll need some followups:

  • updating content and maybe links cache?
  • telemetry
    • would need NewTabUtils changes to return isTippy based on url
  • scraping svg
if (data && data._timestamp) {
this._sitesByDomain = data;
this.tippyTopLastUpdated = data._timestamp;
this.etag = data._etag;

This comment has been minimized.

@Mardak

Mardak Nov 9, 2017
Member

This separate this.etag is probably not necessary anymore as it'll just be a sub-property off of _sitesByDomain

const domain = getDomain(url);
if (domain in sitesByDomain) {
let iconUri = Services.io.newURI(sitesByDomain[domain].image_url);
iconUri.ref = "tippytop";

This comment has been minimized.

@Mardak

Mardak Nov 9, 2017
Member

A comment here to identify the url stored in moz_favicons

@rlr rlr force-pushed the rlr:tippy-top branch from e6c5a02 to 80baa63 Nov 10, 2017
@rlr rlr merged commit 3e6d701 into mozilla:master Nov 10, 2017
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@Mardak
Copy link
Member

@Mardak Mardak commented Nov 10, 2017

Other followup bugs:

  • handling failure cases
    • retry / backoff ?
  • being smarter in update interval
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants