Fix Bug 1497616 - feat(experiments): Pocket Personalization V2 #4447
* Testing squash * Back to working after rebase. * Making tests run and lint. * Key name change
* First pass at removing logs and lint stuff. * Lines not needed * Lines not needed
| return RemoteSettings(name).get(); | ||
| } | ||
|
|
||
| getRecipeExecutor(nbTaggers, nmfTaggers) { |
ncloudioj
Sep 26, 2018
Member
nit: this function creates a new executor instead of returning an existing one. Better rename it to generateRecipeExecutor() or createRecipeExecutor(). Same applies to getNaiveBayesTextTagger() and getNmfTextTagger().
ScottDowne
Sep 27, 2018
Collaborator
These were just here to make testing easier and consistent with the only way I could stub remote settings. If it's causing issues, it's probably best if I just remove all of these except for getFromRemoteSettings and just stub them in as globals in the tests.
| } | ||
|
|
||
| getRemoteSettings(name) { | ||
| return RemoteSettings(name).get(); |
ncloudioj
Sep 26, 2018
Member
Although RemoteSettings.get() won't throw, it might return an empty array in certain cases (for example, when it can't sync with upstream, or on IndexedDB failures), so some care is needed here. I'll comment separately at those places.
ScottDowne
Sep 27, 2018
Collaborator
Are we sure it'll only ever return an empty array or data? Part of me wants to do something like this just in case.
return RemoteSettings(name).get() || []; just to be sure.
ScottDowne
Sep 27, 2018
Collaborator
Oh wait, this is a promise, so the above won't work quite like that.
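To illustrate the point about promises, here is a minimal sketch. A promise object is always truthy, so an `|| []` fallback only helps once it is applied to the awaited value. `fakeRemoteSettings` is a hypothetical stand-in used so the snippet runs on its own; it is not the real RemoteSettings client.

```javascript
// Hypothetical stand-in for RemoteSettings(name); only the promise-returning
// get() matters for this illustration.
function fakeRemoteSettings(name) {
  return {
    get: () => Promise.resolve(name === "missing" ? undefined : [{ key: name }]),
  };
}

// Without await: the promise itself is truthy, so `|| []` never applies.
function getWithoutAwait(name) {
  return fakeRemoteSettings(name).get() || [];
}

// With await: the fallback applies to the resolved value.
async function getWithAwait(name) {
  return (await fakeRemoteSettings(name).get()) || [];
}
```

With this shape, callers of `getWithAwait` always receive an array, even when the underlying call resolves to something falsy.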
| async generateRecipeExecutor() { | ||
| let nbTaggers = []; | ||
| let nmfTaggers = {}; | ||
| const models = await this.getRemoteSettings("personality-provider-models"); |
ncloudioj
Sep 26, 2018
Member
Again, models could be an empty array. Shall we short-circuit, or create a RecipeExecutor regardless?
ScottDowne
Sep 26, 2018
Collaborator
short-circuit?
ncloudioj
Sep 26, 2018
Member
In case getRemoteSettings returns an empty array (we've seen this happen in prod), shall we just bail out instead of generating a malformed recipe executor?
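A rough sketch of what bailing out could look like, under stated assumptions: `fetchModels` stands in for the `getRemoteSettings("personality-provider-models")` call, and the `model_type` and `parent_tag` field names are assumed for illustration rather than taken from the PR.

```javascript
// Hypothetical sketch: return null early when no models were fetched, so the
// caller can skip this round instead of building an executor from nothing.
async function generateRecipeExecutorOrNull(fetchModels) {
  const models = await fetchModels();
  if (!Array.isArray(models) || models.length === 0) {
    // Nothing to build from; the caller can retry on a later cycle.
    return null;
  }
  const nbTaggers = [];
  const nmfTaggers = {};
  for (const model of models) {
    // `model_type` and `parent_tag` are assumed field names.
    if (model.model_type === "nb") {
      nbTaggers.push(model);
    } else if (model.model_type === "nmf") {
      nmfTaggers[model.parent_tag] = model;
    }
  }
  return { nbTaggers, nmfTaggers };
}
```

Returning `null` keeps the "no data" case distinguishable from a valid but empty executor, which is the trade-off being discussed in this thread.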
ScottDowne
Sep 27, 2018
Collaborator
Ah, I'm not sure. If we bail, I think things are going to be pretty wonky. I think the safest thing to do is return empty arrays, and I can ensure that this causes everything to do nothing, if it doesn't already.
It would end up generating something as if it was a fresh profile with no history. Which is the same as no personalization?
ncloudioj
Sep 27, 2018
Member
Yep, I was wondering whether we could just abort this round completely until Remote Settings is ready. @jonathankoren what do you think?
This also reminds me: shall we introduce some mechanism to trigger this whole personalization? For example, if RS is not ready or Places doesn't return enough records, then just wait for the next cycle.
ScottDowne
Sep 27, 2018
Collaborator
Hm, what do you mean by "RS is not ready"? Does it go through a state change, where it's slow to use while it's not ready and fast once it is ready, or something?
ncloudioj
Sep 27, 2018
Member
In short, RS.get() might run into some trouble in the wild. For example, this is what we've observed in production: "NetworkError when attempting to fetch resource". It printed this error message and returned an empty array, and would perhaps retry later.
ScottDowne
Sep 28, 2018
Collaborator
Yeah I'm seeing this a lot locally with the dev server. It's kinda concerning. How often does it happen in the wild?
ScottDowne
Sep 28, 2018
Collaborator
I thought it was related to the dev server (production might be more reliable), because I'm seeing it a lot locally with the dev server.
| * Grabs a slice of browse history for building a interest vector | ||
| */ | ||
| async fetchHistory(columns, beginTimeSecs, endTimeSecs) { | ||
| let sql = `SELECT * |
ncloudioj
Sep 26, 2018
Member
A couple of questions for this query:
- Perhaps you only want to fetch certain columns rather than SELECT * here
- Columns like description and title could also be NULL, not just ""
- Do you want to set a limit on the result set? It's always a good idea to do so to avoid unexpected behavior
- You can leverage the params of the executePlacesQuery function for goodies such as specifying the column names and setting the limit; here is an example
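The points in this list could be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `buildHistoryQuery` is a hypothetical helper, the `moz_places` table and the column list are assumed, the `{columns, params}` options shape is an assumption about what executePlacesQuery accepts, and the limit is a placeholder value.

```javascript
// Hypothetical helper: builds a Places query with explicit columns, NULL-safe
// filters, and a bound row limit.
function buildHistoryQuery(requiredColumns, beginTimeSecs, endTimeSecs, limit) {
  // Assumed column list; only these may be used as required columns.
  const columns = ["url", "title", "description", "last_visit_date"];
  let sql = `SELECT ${columns.join(", ")}
    FROM moz_places
    WHERE last_visit_date >= :begin
    AND last_visit_date < :end`;
  for (const column of requiredColumns) {
    if (!columns.includes(column)) {
      throw new Error(`Unknown column: ${column}`);
    }
    // IFNULL maps NULL to '' so NULL and empty values are both filtered out.
    sql += ` AND IFNULL(${column}, '') <> ''`;
  }
  sql += " LIMIT :limit";
  return {
    sql,
    options: {
      columns,
      // Seconds are converted to microseconds for the comparison.
      params: { begin: beginTimeSecs * 1e6, end: endTimeSecs * 1e6, limit },
    },
  };
}
```

The returned `sql` and `options` could then be handed to the query function; checking required columns against a fixed list avoids interpolating arbitrary strings into the SQL.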
ScottDowne
Sep 27, 2018
Collaborator
@jonathankoren thoughts on this one? I think you're more familiar with this function currently.
ScottDowne
Sep 27, 2018
Collaborator
@jonathankoren happy to dig into this one after I've finished updating the other things, if it helps.
jonathankoren
Sep 30, 2018
Author
Contributor
Fixing the query to explicitly avoid NULL is definitely something we should do. Good catch.
Internal polling on our own histories at Pocket showed that the WHERE description <> "" has the habit of reducing the total number of URLs in the history by somewhere between 75% and 80%. I guess we can always make the LIMIT very high, so we can do that. I'll have to get back to you on what a good number is.
The reason why we run SELECT * is that we just wanted to make all the columns available. We ran into a problem before we went down the JS route, where not all the columns (specifically description) were available on the browse history API available to extensions. This future-proofs us, and it's not like the table has a large number of columns anyway.
ncloudioj
Oct 1, 2018
Member
Yes, setting a very high LIMIT is definitely better than nothing. I believe you can find a good tradeoff between the model's minimal requirement and the computing limitation.
The reason why we run SELECT * is that we just wanted to make all the columns available.
Do you really need to make all the columns available? Explicitly specifying the needed columns has at least two benefits:
- It significantly reduces the IO for big query results; there is no point in loading columns that aren't used
- It makes the code self-documenting about which Places columns are in use here
| return new NmfTextTagger(model); | ||
| } | ||
|
|
||
| getNewTabUtils() { |
ncloudioj
Sep 26, 2018
Member
Why wrap NewTabUtils here? You can simply do await NewTabUtils.activityStreamProvider.executePlacesQuery
ScottDowne
Sep 26, 2018
Collaborator
I got into the habit of doing this for globals for test purposes, because a lot of things (for example, remote-settings) are globals I cannot replace with something else in unit tests. While I only needed to wrap remote settings to make stubbing possible, the rest was just for convenience, to make it easier to stub.
ncloudioj
Sep 26, 2018
Member
That's a fair point. NewTabUtils.activityStreamProvider already has a default stub defined in test/unit-entry.js; is it possible to reuse that here?
ScottDowne
Sep 27, 2018
Collaborator
Yeah, I'm removing all these get functions except for the needed one, which was renamed. I'm using stubs in the tests instead.
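For readers following the stubbing discussion, a minimal sketch of the idea: replace a method on a shared object for the duration of a test, then restore it, so no wrapper getter is needed in production code. `fakeGlobals`, `stubMethod`, and `fetchTitles` are hypothetical names made up for this illustration; a stubbing library would normally take the place of the hand-rolled helper.

```javascript
// Hypothetical stand-in for a global such as NewTabUtils.
const fakeGlobals = {
  NewTabUtils: {
    activityStreamProvider: {
      executePlacesQuery: async () => {
        throw new Error("no database available in unit tests");
      },
    },
  },
};

// Minimal stub helper: swaps a method, records calls, and can restore it.
function stubMethod(object, name, replacement) {
  const original = object[name];
  const calls = [];
  object[name] = (...args) => {
    calls.push(args);
    return replacement(...args);
  };
  return { calls, restore: () => { object[name] = original; } };
}

// Code under test uses the provider directly, with no wrapper method.
async function fetchTitles(globals) {
  const provider = globals.NewTabUtils.activityStreamProvider;
  const rows = await provider.executePlacesQuery("SELECT title FROM moz_places");
  return rows.map(row => row.title);
}
```

A test would call `stubMethod` on the provider, run `fetchTitles`, inspect `calls`, and then call `restore()` so later tests see the original method.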
| switch (topic) { | ||
| case "idle-daily": | ||
| this.updateDomainAffinityScores(); | ||
| await this.updateDomainAffinityScores(); |
ncloudioj
Sep 26, 2018
Member
With the typical parameter settings, have you done any benchmarks for this feature to get a basic idea of the typical running time? Even better would be to do this on the reference machine. It's hard for me to review this part without those inputs.
The V1 personalization is currently computed in the idle-daily handler, since we know its typical calculation duration is within hundreds of ms. Assuming V2 needs longer, computing it in idle-daily is perhaps no longer optimal, because idle-daily is already crowded with other players such as daily maintenance for Places and the IndexedDB quota manager.
|
Please resolve all the review comments/nits. |
|
@ScottDowne I'm cool with this if @ncloudioj is. |
Indeed. If the collection does not have any client registered, no local dump and nothing in local database, then it's not downloaded. You can double check but we already have collections for Focus and Rocket that already leverage that behavior.
That's a bit extreme, but apart the memory impact when calling |
I forgot to mention that the signature verification is probably going to be laborious on old machines, since we will build a canonical JSON of the whole dataset and compute a cryptographic signature of it. |
@leplatrem Thanks for confirming this! @ScottDowne @jonathankoren Could you take one more look at the size of the persistent cache? Is it possible to reduce its size? |
|
We’re not going to be getting these files smaller before code freeze.
jonathankoren
|
|
At least previous values I've seen have values like |
|
Just a quick sanity check on NB_space.json:
Where small removed the trailing 8 digits: Edit: And for NMF_programming:
19% smaller text, 35% smaller gz |
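As a rough illustration of the size reduction being discussed, the sketch below rounds every number in a parsed model to a fixed number of significant digits before serializing. This is a related approach, not necessarily how the uploaded files were produced; `compactModel`, the digit count, and the sample weights are all made up for the example.

```javascript
// Round every number to a fixed number of significant digits while
// serializing. JSON.stringify calls the replacer for each nested value.
function compactModel(model, significantDigits) {
  return JSON.stringify(model, (key, value) =>
    typeof value === "number"
      ? Number(value.toPrecision(significantDigits))
      : value
  );
}

// Made-up weights, not taken from any real model file.
const sample = { weights: [0.12345678901234567, -1.9876543210987654] };
const full = JSON.stringify(sample);
const small = compactModel(sample, 8);

console.log(full.length, small.length);
```

The actual saving depends on how many digits the original values carry and on how well the result compresses afterwards.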
|
@Mardak I already did the 30% reduction by dropping digits. Those files are the ones we had uploaded to dev RS. ;) GZip is also an option we've considered. Yes, it would work. However Scott and I aren't sure there's time to make and test that change. |
* Adding v2 personalization data events to docs. * Updates * updates
|
Is there cleaning up of Remote Settings when a user finishes the experiment (I'm guessing version changes from 2 to 1)? Will there be continued syncing even when not using v2 anymore? |
|
@Mardak that would probably be wise, but not sure how to do that. |
|
Data review granted https://bugzilla.mozilla.org/show_bug.cgi?id=1497616#c10 |
|
|
||
| this.dispatch(ac.PerfEvent({ | ||
| event: "PERSONALIZATION_V2_TOTAL_DURATION", | ||
| value: Math.round(perfService.absNow() - this.perfStart), |
ncloudioj
Oct 11, 2018
Member
Why initialize this.perfStart in the constructor? It'll be incorrect if another init call gets triggered. I believe a local perfStart should be good enough?
ScottDowne
Oct 11, 2018
Collaborator
Yeah this can be updated. I see no issues with this.
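A small sketch of the suggested change, assuming a simplified class: the start time lives in a local variable inside `init`, so each call measures only its own duration. `TimedInit` is hypothetical, and `now` and `dispatch` are injected here so the snippet runs outside Firefox; in the PR, `perfService.absNow()` and `this.dispatch` play those roles.

```javascript
// Hypothetical feed-like class used only to illustrate local timing.
class TimedInit {
  constructor(now, dispatch) {
    this.now = now;
    this.dispatch = dispatch;
  }

  async init(work) {
    // Local start time: a repeated init() cannot reuse a stale value.
    const perfStart = this.now();
    await work();
    this.dispatch({
      event: "PERSONALIZATION_V2_TOTAL_DURATION",
      value: Math.round(this.now() - perfStart),
    });
  }
}
```

With the start time set in the constructor instead, a second `init` call would report the time elapsed since construction rather than its own duration.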
|
LGTM! Thanks for putting this together! A few more follow-up enhancements for your consideration:
|
https://bugzilla.mozilla.org/show_bug.cgi?id=1497616
Description
This is the third (and hopefully final) major PR for Pocket Personalization v2. This PR provides the code to fetch tagging models and personalization recipes and feed them through the various taggers and the RecipeExecutor. It builds and caches a personal interest vector from the user's browse history on the daily-idle thread, and then uses this to rescore recommended items provided by Pocket.
This code is controlled by an about:config preference that allows us to switch between no personalization, the preëxisting v1 personalization, and this new v2 personalization.
Testing
All code is unit tested. Additionally, we tested this code locally using data pushed to dev remote settings and by reconfiguring about:config to activate v2. Using console logs, we saw that a reasonable interest vector was built, and items received from Pocket appeared in different orders when comparing personalization off against v2.
Note that you need a browse history of sufficient depth in order to build a working interest vector. In testing, this was achieved by simply copying over the places.sqlite from our main profile to a test profile.
To activate v2, set the browser.newtabpage.activity-stream.feeds.section.topstories.options about:config preference to:
{"api_key_pref":"extensions.pocket.oAuthConsumerKey","hidden":false,"provider_icon":"pocket","provider_name":"Pocket","read_more_endpoint":"https://getpocket.com/explore/trending?src=fx_new_tab","stories_endpoint":"https://getpocket.cdn.mozilla.net/v3/firefox/global-recs?count=27&version=3&consumer_key=$apiKey&locale_lang=en-US&feed_variant=default_spocs_on","stories_referrer":"https://getpocket.com/recommendations","topics_endpoint":"https://getpocket.cdn.mozilla.net/v3/firefox/trending-topics?version=2&consumer_key=$apiKey&locale_lang=en-US","model_keys":["nmf_model_animals","nmf_model_business","nmf_model_career","nmf_model_datascience","nmf_model_design","nmf_model_education","nmf_model_entertainment","nmf_model_environment","nmf_model_fashion","nmf_model_finance","nmf_model_food","nmf_model_health","nmf_model_home","nmf_model_life","nmf_model_marketing","nmf_model_politics","nmf_model_programming","nmf_model_science","nmf_model_shopping","nmf_model_sports","nmf_model_tech","nmf_model_travel","nb_model_animals","nb_model_books","nb_model_business","nb_model_career","nb_model_datascience","nb_model_design","nb_model_economics","nb_model_education","nb_model_entertainment","nb_model_environment","nb_model_fashion","nb_model_finance","nb_model_food","nb_model_game","nb_model_health","nb_model_history","nb_model_home","nb_model_life","nb_model_marketing","nb_model_military","nb_model_philosophy","nb_model_photography","nb_model_politics","nb_model_productivity","nb_model_programming","nb_model_psychology","nb_model_science","nb_model_shopping","nb_model_society","nb_model_space","nb_model_sports","nb_model_tech","nb_model_travel","nb_model_writing"],"show_spocs":true,"personalized":true,"version":2}
Open Questions
We need some help checking the performance of this PR on various machines.
Related
#4294
#4398