New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Bug 1497616 - feat(experiments): Pocket Personalization V2 #4447

Merged
merged 25 commits into from Oct 11, 2018

Conversation

Projects
None yet
5 participants
@jonathankoren
Contributor

jonathankoren commented Sep 24, 2018

https://bugzilla.mozilla.org/show_bug.cgi?id=1497616

Description

This is the third (and hopefully final) major PR for Pocket Personalization v2. This code provides that code to fetch tagging models and personalization recipes, feed them through the various taggers and RecipeExecutor. It builds and caches a personal interest vector from the user's browse history on the daily-idle thread, and then uses this to rescore recommended items provided by Pocket.

This code is controlled by an about:config preference that allows us to switch between no personalization, the preëxisting v1 personalization, and this new v2 personalization.

Testing

All code is unit tested. Additionally, we tested this code locally using data pushed to dev remote settings and by reconfiguring about:config to activate v2. Using console logs, we saw that a reasonable interest vector was built, and items received from pocket were appearing in different orders when comparing off with v2.

It's important to note, that you need to have a browse history of sufficient depth, in order to build a working interest vector. In testing, this was achieved by simply copying over the places.sqlite from our main profile to a test profile.

To activate v2, set the browser.newtabpage.activity-stream.feeds.section.topstories.options about:config preference to
{"api_key_pref":"extensions.pocket.oAuthConsumerKey","hidden":false,"provider_icon":"pocket","provider_name":"Pocket","read_more_endpoint":"https://getpocket.com/explore/trending?src=fx_new_tab","stories_endpoint":"https://getpocket.cdn.mozilla.net/v3/firefox/global-recs?count=27&version=3&consumer_key=$apiKey&locale_lang=en-US&feed_variant=default_spocs_on","stories_referrer":"https://getpocket.com/recommendations","topics_endpoint":"https://getpocket.cdn.mozilla.net/v3/firefox/trending-topics?version=2&consumer_key=$apiKey&locale_lang=en-US","model_keys":["nmf_model_animals","nmf_model_business","nmf_model_career","nmf_model_datascience","nmf_model_design","nmf_model_education","nmf_model_entertainment","nmf_model_environment","nmf_model_fashion","nmf_model_finance","nmf_model_food","nmf_model_health","nmf_model_home","nmf_model_life","nmf_model_marketing","nmf_model_politics","nmf_model_programming","nmf_model_science","nmf_model_shopping","nmf_model_sports","nmf_model_tech","nmf_model_travel","nb_model_animals","nb_model_books","nb_model_business","nb_model_career","nb_model_datascience","nb_model_design","nb_model_economics","nb_model_education","nb_model_entertainment","nb_model_environment","nb_model_fashion","nb_model_finance","nb_model_food","nb_model_game","nb_model_health","nb_model_history","nb_model_home","nb_model_life","nb_model_marketing","nb_model_military","nb_model_philosophy","nb_model_photography","nb_model_politics","nb_model_productivity","nb_model_programming","nb_model_psychology","nb_model_science","nb_model_shopping","nb_model_society","nb_model_space","nb_model_sports","nb_model_tech","nb_model_travel","nb_model_writing"],"show_spocs":true,"personalized":true,"version":2}

Open Questions

We need some help checking the performance of this PR on various machines.

Related

#4294
#4398

ScottDowne and others added some commits Sep 20, 2018

P18 1904 squash (#18)
* Testing squash

* Back to working after rebase.

* Making tests run and lint.

* Key name change
First pass at removing logs and lint stuff. (#20)
* First pass at removing logs and lint stuff.

* Lines not needed

* Lines not needed

@ncloudioj ncloudioj changed the title from feat(experiments): Pocket Personalization V2 to feat(experiments): Pocket Personalization V2 - Part 3 Sep 24, 2018

@ncloudioj ncloudioj self-assigned this Sep 24, 2018

@ncloudioj ncloudioj self-requested a review Sep 24, 2018

Show resolved Hide resolved lib/PersonalityProvider.jsm Outdated
return RemoteSettings(name).get();
}
getRecipeExecutor(nbTaggers, nmfTaggers) {

This comment has been minimized.

@ncloudioj

ncloudioj Sep 26, 2018

Contributor

nit: this function creates a new executor instead of returning an existing one. Better rename it to generateRecipeExecutor() or createRecipeExecutor(). Same applies to getNaiveBayesTextTagger() and getNmfTextTagger().

This comment has been minimized.

@ScottDowne

ScottDowne Sep 27, 2018

Collaborator

These were just here to make testing easier and consistent with the only way I could stub remote settings. If it's causing issues, it's probably best if I just remove all of these except for getFromRemoteSettings and just stub them in as globals in the tests.

}
getRemoteSettings(name) {
return RemoteSettings(name).get();

This comment has been minimized.

@ncloudioj

ncloudioj Sep 26, 2018

Contributor

Although RemoteSettings.get() won't throw, it might return an empty array in certain cases (such as can't sync with the upstream, or IndexedDB failures etc), some care is needed here. Will comment separately at those places.

This comment has been minimized.

@ScottDowne

ScottDowne Sep 27, 2018

Collaborator

Are we sure it'll only ever return an empty array or data? Part of me wants to do something like this just in case.

return RemoteSettings(name).get() || []; just to be sure.

This comment has been minimized.

@ScottDowne

ScottDowne Sep 27, 2018

Collaborator

Oh wait, this is a promise that above wont work quite like that.

This comment has been minimized.

async generateRecipeExecutor() {
let nbTaggers = [];
let nmfTaggers = {};
const models = await this.getRemoteSettings("personality-provider-models");

This comment has been minimized.

@ncloudioj

ncloudioj Sep 26, 2018

Contributor

Again, models could be an empty array. Shall we short-circuit, or create a RecipeExecutor regardless?

This comment has been minimized.

@ScottDowne

ScottDowne Sep 26, 2018

Collaborator

short-circuit?

This comment has been minimized.

@ncloudioj

ncloudioj Sep 26, 2018

Contributor

In case of getRemoteSettings returns an empty array (we've seen this happened in prod), shall we just bail out instead of generating a malformed recipe executor?

This comment has been minimized.

@ScottDowne

ScottDowne Sep 27, 2018

Collaborator

Ah, I'm not sure. If we bail, I think things are going to be pretty wonky. I think the safest thing to do is return empty arrays, which if it doesn't already, I can ensure causes everything to do nothing.

It would end up generating something as if it was a fresh profile with no history. Which is the same as no personalization?

This comment has been minimized.

@ncloudioj

ncloudioj Sep 27, 2018

Contributor

Yep, I was wondering that if we could just abort this round completely until the Remote Settings is ready. @jonathankoren what do you think?

This also reminds me that shall we introduce some mechanism to trigger this whole personalization. Like if RS is not ready or Places doesn't return enough records, then just wait for next cycle.

This comment has been minimized.

@ScottDowne

ScottDowne Sep 27, 2018

Collaborator

Hm, what does it mean when you say "RS is not ready"? Does it go through a state change where, it's not ready, and using it is slow, and it is ready and using it is fast, or something?

This comment has been minimized.

@ncloudioj

ncloudioj Sep 27, 2018

Contributor

In short, RS.get() might run into some trouble in the wild. For example, this is what we've observed in production "NetworkError when attempting to fetch resource". It printed this error message and returned an empty array, perhaps would retry it later.

This comment has been minimized.

@ScottDowne

ScottDowne Sep 28, 2018

Collaborator

Yeah I'm seeing this a lot locally with the dev server. It's kinda concerning. How often does it happen in the wild?

This comment has been minimized.

@ScottDowne

ScottDowne Sep 28, 2018

Collaborator

I thought it was related to the dev server, (which production might be more reliable) because locally/dev server I'm seeing it a lot.

* Grabs a slice of browse history for building a interest vector
*/
async fetchHistory(columns, beginTimeSecs, endTimeSecs) {
let sql = `SELECT *

This comment has been minimized.

@ncloudioj

ncloudioj Sep 26, 2018

Contributor

A couple of questions for this query:

  • Perhaps you only want to fetch certain columns other than SELECT * here
  • Columns like description and title could also be NULL other than ""
  • Do you want to set a limit on the return set? It's always a good idea to do so to avoid the unexpected behavior
  • You can leverage the params of executePlaceQuery function for goodies such as specify the column names and set the limit etc, here is an example

This comment has been minimized.

@ScottDowne

ScottDowne Sep 27, 2018

Collaborator

@jonathankoren thoughts on this one? I think you're more familiar with this function currently.

This comment has been minimized.

@ScottDowne

ScottDowne Sep 27, 2018

Collaborator

@jonathankoren happy to dig into this one after I finished updating the other things, if it helps.

This comment has been minimized.

@jonathankoren

jonathankoren Sep 30, 2018

Contributor

Fixing the query to explicitly avoid NULL is definately something we should do. Good catch.

Internal polling on our how histories at Pocket showed that the WHERE description <> "" has the habit of reducing the total number of URLs in the history somewhere between 75% and 80%. I guess we can always make the LIMIT very high, so we can do that. I'll have to get back to what a good number is.

The reason why we run SELECT * is that we just wanted to make all the columns available. We ran into a problem before we went down the JS route, where not all the columns (specifically description) were not available on the browse history API available to extensions. This future proofs us, and it's not like the table has a large number of columns anyway.

This comment has been minimized.

@ncloudioj

ncloudioj Oct 1, 2018

Contributor

Yes, settings a very high LIMIT is definitely better than nothing. I believe you can find a good tradeoff from the model's minimal requirement and the computing limitation.

The reason why we run SELECT * is that we just wanted to make all the columns available.

Do you really need to make all the columns available? Explicitly specifying the needed columns has at least two benefits:

  • It significantly reduces the IO for the big query results, there is no point to load columns without using them
  • It makes the code self-described about what Places columns are in use here
return new NmfTextTagger(model);
}
getNewTabUtils() {

This comment has been minimized.

@ncloudioj

ncloudioj Sep 26, 2018

Contributor

Why wrapping NewTabUtils here? You can simply just do await NewTabUtils.activityStreamProvider.executePlacesQuery

This comment has been minimized.

@ScottDowne

ScottDowne Sep 26, 2018

Collaborator

I got into the habit of doing this for globals for test purposes because a lot of things (example remote-settings) are globals I cannot set as something else in unit tests. While I only needed to wrap remote settings to make stubbing possible, the rest was just for convenience to make it easier to stub.

This comment has been minimized.

@ncloudioj

ncloudioj Sep 26, 2018

Contributor

That's a fair point. NewtabUtils.activityStreamProvider already has a default stub defined in test/unit-entry.js, is it possible to reuse that here?

This comment has been minimized.

@ScottDowne

ScottDowne Sep 27, 2018

Collaborator

Yeah, I'm removing all these get functions cept for the needed one, which was renamed. I'm using stubs in the tests instead.

switch (topic) {
case "idle-daily":
this.updateDomainAffinityScores();
await this.updateDomainAffinityScores();

This comment has been minimized.

@ncloudioj

ncloudioj Sep 26, 2018

Contributor

With the typical parameter settings, have you done any benchmarks for this feature to get some basic idea about the typical required running time? Even better to do this on the reference machine. It's hard for me to review this part without those inputs.

The V1 personalization is currently being computed in the idle-daily handler since we are aware of its typical calculation duration is within hundreds of ms. Assuming V2 needs longer, computing it in daily-idle perhaps is not optimal anymore due to the fact that daily-idle is already crowded with other players such as daily maintenance for Places and IndexedDB quota manager.

@ncloudioj

Please resolve all the review comments/nits.

@jonathankoren

This comment has been minimized.

Contributor

jonathankoren commented Oct 8, 2018

@ScottDowne I'm cool with this if @ncloudioj is.

@leplatrem

This comment has been minimized.

leplatrem commented Oct 8, 2018

If I understand you, if the pref is off, and a get on the collection is never made for that user, they never get the records downloaded?

Indeed. If the collection does not have any client registered, no local dump and nothing in local database, then it's not downloaded.

You can double check but we already have collections for Focus and Rocket that already leverage that behavior.

There are 57 records, which totals around 20MB.

That's a bit extreme, but apart the memory impact when calling .get(), there should not be any major problem. IndexedDB should be able to handle that.

@leplatrem

This comment has been minimized.

leplatrem commented Oct 9, 2018

That's a bit extreme, but apart the memory impact when calling .get(), there should not be any major problem. IndexedDB should be able to handle that.

I forgot to mention that the signature verification is probably going to be laborious on old machines, since we will build a canonical JSON of the whole dataset and compute a cryptographic signature of it.

@ncloudioj

This comment has been minimized.

Contributor

ncloudioj commented Oct 9, 2018

Indeed. If the collection does not have any client registered, no local dump and nothing in local database, then it's not downloaded.

@leplatrem Thanks for confirming this!

@ScottDowne @jonathankoren Could you take one more look at the size of the persistent cache? Is it possible to reduce its size?

@jonathankoren

This comment has been minimized.

Contributor

jonathankoren commented Oct 9, 2018

@Mardak

This comment has been minimized.

Member

Mardak commented Oct 9, 2018

At least previous values I've seen have values like -0.6901305870205201 and -5.994137996346771. Are they all negative and all need that much precision? E.g., potentially transforming those to .69013058 and 5.9941379

@Mardak

This comment has been minimized.

Member

Mardak commented Oct 9, 2018

Just a quick sanity check on NB_space.json:

   24375 orig.json
    8419 orig.json.gz
   16720 small.json
    5705 small.json.gz

Where small removed the trailing 8 digits: s/[0-9]{8}([,\]])/\1/g. Roughly 33% smaller for both text and gz. No clue about accuracy / behavior changes 😉

Edit: And for NMF_programming:

 1905882 orig.json
  513245 orig.json.gz
 1550963 small.json
  334571 small.json.gz

19% smaller text, 35% smaller gz

@jonathankoren

This comment has been minimized.

Contributor

jonathankoren commented Oct 9, 2018

@Mardak I already did the 30% reduction by dropping digits. Those files are the ones we had uploaded to dev RS. ;)

GZip is also an option we've considered. Yes, it would work. However Scott and I aren't sure there's time to make and test that change.

@Mardak Mardak changed the title from feat(experiments): Pocket Personalization V2 - Part 3 to feat(experiments): Bug 1497616 Pocket Personalization V2 - Part 3 Oct 9, 2018

Adding v2 personalization data events to docs. (#29)
* Adding v2 personalization data events to docs.

* Updates

* updates

@ScottDowne ScottDowne changed the title from feat(experiments): Bug 1497616 Pocket Personalization V2 - Part 3 to Bug 1497616 - feat(experiments): Pocket Personalization V2 - Part 3 Oct 9, 2018

@ScottDowne ScottDowne changed the title from Bug 1497616 - feat(experiments): Pocket Personalization V2 - Part 3 to Fix Bug 1497616 - feat(experiments): Pocket Personalization V2 - Part 3 Oct 9, 2018

@ncloudioj ncloudioj changed the title from Fix Bug 1497616 - feat(experiments): Pocket Personalization V2 - Part 3 to Fix Bug 1497616: Pocket Personalization V2 Oct 9, 2018

@jonathankoren jonathankoren changed the title from Fix Bug 1497616: Pocket Personalization V2 to Fix Bug 1497616 - feat(experiments): Pocket Personalization V2 Oct 9, 2018

@jonathankoren

This comment has been minimized.

@Mardak

This comment has been minimized.

Member

Mardak commented Oct 10, 2018

Is there cleaning up of Remote Settings when a user finishes the experiment (I'm guessing version changes from 2 to 1)? Will there be continued syncing even when not using v2 anymore?

@ScottDowne

This comment has been minimized.

Collaborator

ScottDowne commented Oct 10, 2018

@Mardak that would probably be wise, but not sure how to do that.

@ncloudioj

This comment has been minimized.

Contributor

ncloudioj commented Oct 11, 2018

this.dispatch(ac.PerfEvent({
event: "PERSONALIZATION_V2_TOTAL_DURATION",
value: Math.round(perfService.absNow() - this.perfStart),

This comment has been minimized.

@ncloudioj

ncloudioj Oct 11, 2018

Contributor

Why initialize this.prefStart in the constructor? It'll be incorrect if another init call gets triggered. I believe a local perfStart should be good enough?

This comment has been minimized.

@ScottDowne

ScottDowne Oct 11, 2018

Collaborator

Yeah this can be updated. I see no issues with this.

@ncloudioj

LGTM! Thanks for putting this up together! 👍

A few more follow up enhancements for your consideration:

  • Add some cleaning up code for, such as RemoteSettings and persistent cache, when the experiment finishes
  • Keep an eye on the ordering bug, make sure it won't introduce any unexpected result that could affect the experiment performance
  • Keep monitoring the performance metrics once this experiment is alive

@ncloudioj ncloudioj added PR / R+ and removed PR / Needs work labels Oct 11, 2018

@ncloudioj ncloudioj merged commit 87eccde into mozilla:master Oct 11, 2018

2 checks passed

Taskcluster (pull_request) TaskGroup: success
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment