Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a way to sample requests that are sent out from amp-analytics. #2368

Merged
merged 1 commit into from May 6, 2016

Conversation

avimehta
Copy link
Contributor

@avimehta avimehta commented Mar 1, 2016

Still pending in this PR: Documentation for the new feature. Rest of it is ready for review.

Fixes #1550

@Meggin
Copy link
Contributor

Meggin commented Mar 2, 2016

Can you add a documentation label if one isn't there?
On Mar 1, 2016 11:39 AM, "Avi Mehta" notifications@github.com wrote:

Still pending in this PR: Documentation for the new feature. Rest of it is
ready for review.

Fixes #1550 #1550

You can view, comment on, or merge this pull request online at:

#2368
Commit Summary

  • Added a way to sample requests that are sent out from amp-analytics.

File Changes

Patch Links:


Reply to this email directly or view it on GitHub
#2368.

console./*OK*/error(this.getName_(), 'Invalid sampling spec.');
return resolve;
}
const key = this.expandTemplate_(spec['sampleOn'], trigger);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain the sampling algorithm a bit?

Could we instead just once create a random number and then always use that same number to determine whether you are in the sample?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS: with crypto.getRandomValues being in all supported browsers, is that a better way to determine the sample?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aside: I had called the 'sampleOn' attribute the trait in my initial design.

GA wants to consistently sample hits based on clientIds[1]. Others might want to sample based on random numbers, page path or something else. That is the reason I added the attribute 'sampleOn'.

If we were to use getRandomValues, we lose the ability to specify what to sample based on. You suggested that we always use the same number to sample. Any ideas on how that would work? Will the random number be a trigger level thing? or a tag level thing or something else?

[1] Sampling based on clientIds results in user sessions to be consistently in the reports or out of them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we should definitely not hash anything on each actual event. That is what I meant. The user's sample state should only be calculated once. If this was e.g. used on an interval or active-use trigger it would get really expensive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What type of return value would this.expandTemplate_(spec['sampleOn'], trigger) typically give. You are using requestCount in the test. Is that a typical usage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Mon, Apr 11, 2016 at 1:51 PM, Malte Ubl notifications@github.com wrote:

In extensions/amp-analytics/0.1/amp-analytics.js
#2368 (comment):

  • * not based on sampleSpec.
  • * @Private
  • */
  • isSampledIn_(trigger) {
  • const spec = trigger['sampleSpec'];
  • const resolve = Promise.resolve(true);
  • if (!spec) {
  •  return resolve;
    
  • }
  • const threshold = spec['threshold'];
  • if (!spec['sampleOn'] ||
  •    Number.isNaN(parseFloat(threshold)) || !Number.isFinite(threshold)) {
    
  •  console./_OK_/error(this.getName_(), 'Invalid sampling spec.');
    
  •  return resolve;
    
  • }
  • const key = this.expandTemplate_(spec['sampleOn'], trigger);

What type of return value would this.expandTemplate_(spec['sampleOn'],
trigger) typically give. You are using requestCount in the test. Is that
a typical usage?

GA uses client id to sample hits. Some people might want random too.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
https://github.com/ampproject/amphtml/pull/2368/files/ddb55ee8e687cbb11c8fe886147cf218cd3d4e5e#r59278956

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't quite follow: Would it return CLIENT_ID(…)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spec['sampleOn'] can be any string with one or more macros. I used requestCount in test but it could be RANDOM or CLIENT_ID(...) in real world.

this.expandTemplate_(spec['sampleOn'], trigger) uses var definitions from vendors.js and trigger configs and converts them into something that can be used by url-replacements.js class. The call to urlReplacementsFor(this.getWin()).expand() below will then convert that template string into another string with macros substituted for real values.

In the tests, the value of requestCount will always be 0 since no requests is generated at that point. But that actually doesn't matter because I stub the values returned by sha384. I can change the tests to use RANDOM instead if that makes it less confusing.

@avimehta avimehta force-pushed the sample branch 2 times, most recently from 92189b3 to f537420 Compare March 22, 2016 22:08
@avimehta
Copy link
Contributor Author

@ampsauce retry

@avimehta
Copy link
Contributor Author

ptal. Now we do sampling calculations once per trigger.

@dvoytenko
Copy link
Contributor

@avimehta , sorry just noticed this. Is this ready to be reviewed? Could you please rebase to master? There are some conflicts...

@avimehta
Copy link
Contributor Author

avimehta commented Apr 8, 2016

@dvoytenko this is ready for review. Forgot to update the PR.

@dvoytenko dvoytenko assigned cramforce and unassigned dvoytenko Apr 11, 2016
@dvoytenko
Copy link
Contributor

/to @cramforce to follow up on the initial review.

@cramforce
Copy link
Member

@avimehta Sorry, this got neglected. I think the PR is ready from my POV. Could you rebase? Thanks.

@avimehta
Copy link
Contributor Author

avimehta commented May 6, 2016

rebased. Marging soon.

@avimehta avimehta merged commit 558a176 into ampproject:master May 6, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants