feat (tippytop): #3218 implement TippyTopFeed with remote manifest fetching #3787
I'm still missing a bunch of tests and have some changes in mind, but I wanted to get some |
| @@ -142,8 +141,8 @@ this.TopSitesFeed = class TopSitesFeed { | |||
| * @param {bool} options.broadcast Should the update be broadcasted. | |||
| */ | |||
| async refresh(options = {}) { | |||
| if (!this._tippyTopProvider.initialized) { | |||
| await this._tippyTopProvider.init(); | |||
| if (!this.store.getState().TippyTop.initialized) { | |||
rlr
Nov 2, 2017
Author
Contributor
TippyTop initialization is going to take longer now on a cold cache (first run, for example). It has to load the default manifest we ship with and then fetch the remote one. So, maybe we want to mark as initialized after the defaults are loaded? The first new tab opened might only have the default set and we might end up requesting screenshots that aren't really needed. But we can show the stuff faster.
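A minimal sketch of the "mark as initialized after the defaults are loaded" idea, in plain JavaScript. `TwoPhaseManifest`, `loadDefaults`, `fetchRemote`, and `remoteDone` are hypothetical names for illustration, not code from this PR; the two loaders are injected so the ordering is easy to see.

```javascript
// Hypothetical sketch: become usable as soon as the bundled defaults load,
// then merge in the remote manifest whenever it arrives.
class TwoPhaseManifest {
  constructor(loadDefaults, fetchRemote) {
    this._loadDefaults = loadDefaults; // async () => sites shipped with the build
    this._fetchRemote = fetchRemote;   // async () => sites from the remote manifest
    this.initialized = false;
    this.sitesByDomain = {};
    this.remoteDone = null;
  }

  async init() {
    this.sitesByDomain = await this._loadDefaults();
    this.initialized = true; // the first new tab can render from defaults now
    // Start the remote fetch without blocking init(); keep defaults on failure.
    this.remoteDone = this._fetchRemote().then(
      remote => {
        this.sitesByDomain = Object.assign({}, this.sitesByDomain, remote);
      },
      () => {}
    );
  }
}
```

The trade-off is the one described above: a new tab opened before `remoteDone` settles only sees the default set.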
|
Another possible performance issue is that the icon URLs are remote. You can see a delay on first load. I think we should be able to add a persistent cache for them. Basically fetch them and store as data URIs. |
|
|
||
| this.getDomain = function(url) { | ||
| let domain = new URL(url).hostname; | ||
| if (domain && domain.startsWith("www.")) { |
ncloudioj
Nov 2, 2017
Member
Do we want to handle those www2 mirror sites? There are quite a few of them (www1/2/3/4) in my Places.
rlr
Nov 3, 2017
Author
Contributor
hmm, i guess we could do a regex check for most cases?
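For illustration, the regex check could look roughly like this. It is a hypothetical variant, not the `getDomain` from this PR: it only shows stripping `www.` and numbered mirrors such as `www2.`, and it uses the standard `URL` API.

```javascript
// Hypothetical variant of getDomain: strip "www." as well as numbered
// mirrors such as "www1." or "www2." from the front of the hostname.
function getDomain(url) {
  const hostname = new URL(url).hostname;
  return hostname.replace(/^www\d*\./, "");
}
```

Hostnames that merely start with the letters `www` (for example `wwwfoo.example.com`) are left untouched, because the pattern requires only digits between `www` and the dot.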
I wonder if we can simply rely on the browser's image caching for this? The first load could be slow, but it should hit the local cache for the following loads afterwards. IIRC we only ship image URLs for Tiles in the manifest file, although we do host those images on our own server with proper CDN settings. |
|
@rlr This looks good to me. Nice feature! I'd recommend adding module-level documentation to outline how tippytop works in AS, because there are a few moving parts involved: for example, the default local tippytop file, the remote one that gets downloaded periodically, and the persistent cache for it. |
There's an ExpirationFilter that will keep those images around if you tell it https://searchfox.org/mozilla-central/rev/423b2522c48e1d654e30ffc337164d677f934ec3/toolkit/modules/NewTabUtils.jsm#1919 |
|
@piatra the expiration filter is only for thumbnails. The purpose of tippy top is to avoid thumbnails to begin with. If you're suggesting putting the tippy top icons into the thumbnail cache, that will be a bit of work, as thumbnails are currently shown along with a letter fallback in the corner, whereas tippy top skips that fallback. |
|
A different way to get tippy top icons into some cache is to update moz_favicons. @ncloudioj knows more of those details from saving rich icons from pages. But basically, we could save our hosted rich icon as the icon for the page (assuming it's not competing against a page-provided rich icon -- where in the screenshots above seem to show different rich icons for groupon…) Also, unclear if it's "okay" to inject this tippy top data into moz_favicons to begin with… |
To preload those tippy top icons into moz_favicons, we'll have to create dummy visits and store them into Places first, under the hood, Firefox uses |
|
Oh sorry, I didn't realize https://s3.amazonaws.com/activitystream-dev-default-resources-s3bucket-1qw8m6s29v3dq/tippytop/icons.json linked to per-site hosted icons. (I haven't really looked at the PR in detail yet.) What's the purpose of this tippy-top approach in the context of firefox already saving rich icons from the site? For some reason rich icons aren't being saved from the page? I just visited nike, upwork, nba, airbnb as the "before" screenshot above showed thumbnails. But all of those seem to show rich icons for me… ? |
|
@ncloudioj we wouldn't need to preload the icon…? If we were to show a top site (page), we could |
|
@Mardak Oh, you're right. We can setAndFetch at that point, then all the future icon loads could be fetched from moz_favicon. Weird, just tried nike.com here, got a screenshot. Checking my favicon database, there are a few rich icons for "nike.com". Maybe it's because the link in my topsites was "www.nike.com/ca/en_gb"? |
|
Hrmm, "python.org" didn't grab the rich icon either. Can you guys try it out? |
|
@ncloudioj screenshot for me for python.org |
|
Filed a bug #3788 to track the issue described above. |
|
Sounds like we are not sure how much benefit this will give us over our rich icons. Rich icons seem to be pretty damn good already and if we fix the few issues we are uncovering, they'll be even better.
It would be good to know how often users are seeing screenshots. Maybe we still want remote tippy top to fix some "critical" cases and we leave out all the sites that we know we are already finding rich icons for. That would keep the payload small. What should we do until we answer those questions? One option is to land it with remote fetching disabled. Then we can enable it after learning more, or run experiments/shield studies in the meantime. And I still like this feed approach over the previous provider approach. Landing this would also allow me to create my own custom manifest so I can make my top sites cooler |
|
@tspurway clarified that one main benefit of these tippy top icons is to get rich icons that websites only advertise to iOS useragents. Ideally, websites will start putting rich icons on all pages, so Firefox icon parser will just work without needing tippy top, but before then, we can make things look better for users. @rlr I would think we would want to get some form of caching before this lands to avoid live requests for these site-hosted rich icons each time we show activity stream. The This however, is a pretty different approach from the current PR. In particular, at a high level:
|
|
Also on a separate note, did none of the tippy top sites have |
The scraper I used doesn't grab |
That sounds good. Once we have that in place, I think we would want rich icons to take priority over tippy top? So, basically we would show the user icons in this priority: 1) rich icons found by our builtin parser, 2) rich icons found with the help of tippy top manifest, 3) tippy top icons we ship with, 4) screenshots with low res icon, 5) a letter. I think the only tricky part will be the messaging between TopSitesFeed and TippyTopFeed to try to fetch the icon and send back a response before falling back to try to get a screenshot. But I think it can all be by dispatching actions. |
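The priority order above could be sketched as a simple lookup. `ICON_PRIORITY`, `pickIcon`, and the field names on `candidates` are all hypothetical; the sketch ignores the messaging between TopSitesFeed and TippyTopFeed and only shows the fallback order.

```javascript
// Hypothetical sketch of the fallback order listed above. Each field on
// `candidates` is assumed to be an icon URL string or undefined.
const ICON_PRIORITY = [
  "richIcon",            // 1) rich icon found by the built-in parser
  "tippyTopFetchedIcon", // 2) rich icon found with help of the tippy top manifest
  "tippyTopBundledIcon", // 3) tippy top icon shipped with the build
  "screenshot",          // 4) screenshot with low-res icon
];

function pickIcon(candidates, title) {
  for (const source of ICON_PRIORITY) {
    if (candidates[source]) {
      return {source, value: candidates[source]};
    }
  }
  // 5) nothing else available: fall back to a letter
  return {source: "letter", value: (title || "?").charAt(0).toUpperCase()};
}
```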
It's not ideal but unclear how bad it would be if we ended up requesting a screenshot and only later deciding that we should inject a tippy top rich icon and use that instead. The user might not even see the screenshot as in the common case:
|
6578168
to
8edd725
| switch (action.type) { | ||
| case at.TIPPYTOP_INIT: | ||
| case at.TIPPYTOP_UPDATED: | ||
| return Object.assign({}, prevState, {initialized: true, sitesByDomain: action.data}); |
Mardak
Nov 8, 2017
Member
@k88hudson should there be any concern about putting all the tippy top data into redux? Even if we didn't broadcast it on TIPPYTOP_{INIT,UPDATED}, it would make its way to content via rehydration. And it looks like there's no use of that data from content in this PR or near future (?) ?
Instead of putting tippytop data in the store, maybe we just stash it as some property on the feed?
rlr
Nov 8, 2017
Author
Contributor
Good point. The way this ended up, we don't need this data in the redux store 👍
| if (!this.initializing) { | ||
| await this.init(); | ||
| } | ||
| return; |
Mardak
Nov 9, 2017
Member
All of this initialization/queue stuff seems a bit complex when async fetchIcon can just naturally queue up by awaiting. Maybe something like:
  async fetchIcon(url) {
    const sitesByDomain = await this.getSitesByDomain();
    if (domain in sitesByDomain) …
  }
  getSitesByDomain() {
    // return an already loaded object or a promise for that object
    return this._sitesByDomain || (this._sitesByDomain = new Promise(async resolve => {
      await this.loadCachedData();
      await this.maybeRefresh();
      resolve(this._sitesByDomain); // NB one of the above set an object to this._sitesByDomain
    }));
  }
When this._sitesByDomain…
- … is an Object, it's effectively initialized
- … is a Promise, it's effectively initializing
- … is undefined, it's basically doing init
And each call to fetchIcon can just await this.getSitesByDomain() to get the current sites, and happen to wait for initialization without needing to know whether it's initialized / initializing or not.
And however we end up refreshing, it should assign the new sites object to this._sitesByDomain, and fetchIcon calls will just get the updated data.
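A self-contained variant of the same pattern that can be run outside the feed, as a sketch only. `SitesHolder`, `loadSites`, `loadCount`, and `hasIcon` are hypothetical names; `loadSites` stands in for `loadCachedData()` plus `maybeRefresh()`. It chains on the loader's promise instead of wrapping an async function in `new Promise`, which is a swapped-in detail rather than what the comment above proposes.

```javascript
// Hypothetical, self-contained variant of the lazy-initialization pattern.
class SitesHolder {
  constructor(loadSites) {
    this._loadSites = loadSites; // async () => object keyed by domain
    this._sitesByDomain = undefined;
    this.loadCount = 0;          // only here to show the load runs once
  }

  getSitesByDomain() {
    if (!this._sitesByDomain) {
      this.loadCount++;
      // Store the promise first so concurrent callers share one load...
      this._sitesByDomain = this._loadSites().then(sites => {
        // ...then replace it with the plain object once loaded.
        this._sitesByDomain = sites;
        return sites;
      });
    }
    return this._sitesByDomain;
  }

  async hasIcon(domain) {
    // Awaiting works whether the field holds an object or a promise.
    const sitesByDomain = await this.getSitesByDomain();
    return domain in sitesByDomain;
  }
}
```

One caveat of this sketch: if `loadSites` rejects, the rejected promise stays cached, so a real implementation would probably want to reset the field on failure.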
rlr
Nov 9, 2017
Author
Contributor
ooooh, the same trick we used in the PersistentCache. Some day I'll start thinking in promises 😉
Actually, that sounds pretty good. Maybe more generically |
| }], | ||
| ["tippyTop.service.endpoint", { | ||
| title: "Tippy Top service manifest url", | ||
| value: "https://s3.amazonaws.com/activitystream-dev-default-resources-s3bucket-1qw8m6s29v3dq/tippytop/icons.json" |
rlr
Nov 9, 2017
Author
Contributor
@Mardak How is this pref managed later on once it lands? We might want separate values for nightly/beta/release. We might want to run experiments with different URLs. etc. Do I need to do anything special for now?
Mardak
Nov 9, 2017
Member
If it's via a shield study, the prefs will be replaced with some other value. Also, we probably need to fix up mozilla-central test prefs to disable this. And what's the plan for production endpoint?
|
This cleaned up pretty nicely as it's now pretty much a self contained feed that just listens for actions and does stuff. |
|
Seems to be working. We'll need some followups:
|
| if (data && data._timestamp) { | ||
| this._sitesByDomain = data; | ||
| this.tippyTopLastUpdated = data._timestamp; | ||
| this.etag = data._etag; |
Mardak
Nov 9, 2017
Member
This separate this.etag is probably not necessary anymore as it'll just be a sub-property off of _sitesByDomain
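To illustrate keeping `_etag` and `_timestamp` on the cached object itself, here is a rough sketch of a conditional refresh. It assumes the stored etag is sent back as an `If-None-Match` header, which this conversation does not confirm; `refreshManifest` and the injected `fetchFn` (assumed to behave like the standard `fetch()`) are hypothetical.

```javascript
// Hypothetical sketch: refresh a cached manifest with a conditional request.
async function refreshManifest(cached, url, fetchFn) {
  const headers = {};
  if (cached && cached._etag) {
    headers["If-None-Match"] = cached._etag;
  }
  const response = await fetchFn(url, {headers});
  if (response.status === 304) {
    // Unchanged on the server: keep the data, bump the timestamp.
    return Object.assign({}, cached, {_timestamp: Date.now()});
  }
  if (response.status !== 200) {
    return cached; // on errors, keep whatever we already had
  }
  const data = await response.json();
  data._etag = response.headers.get("ETag");
  data._timestamp = Date.now();
  return data;
}
```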
| const domain = getDomain(url); | ||
| if (domain in sitesByDomain) { | ||
| let iconUri = Services.io.newURI(sitesByDomain[domain].image_url); | ||
| iconUri.ref = "tippytop"; |
Mardak
Nov 9, 2017
Member
A comment here to identify the url stored in moz_favicons
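A small illustration of what the ref does, using the standard `URL` API rather than `Services.io.newURI` so it runs outside Firefox. `tagIconUrl` and `isTippyTopIcon` are hypothetical helpers, not part of the PR.

```javascript
// Hypothetical helpers: mark an icon URL with a "#tippytop" fragment so it
// can later be told apart from icons a page provided itself.
function tagIconUrl(imageUrl) {
  const iconUrl = new URL(imageUrl);
  iconUrl.hash = "tippytop";
  return iconUrl.href;
}

function isTippyTopIcon(storedUrl) {
  return new URL(storedUrl).hash === "#tippytop";
}
```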
|
Other followup bugs:
|
Before:

After:
