-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨Performance task locally served pages & wg handlers #27283
✨Performance task locally served pages & wg handlers #27283
Conversation
@@ -2,6 +2,6 @@ | |||
|
|||
Use `gulp performance` to run this test. It measures performance for the current branch compared to the current release. By default, the command first needs to compile minified runtime and components from the current branch by executing the `gulp dist` task. To skip this step, use the `--nobuild` flag, but the task will throw an error if the dist files are missing. Only the runtime and components used by documents from the configured URLs are compiled. | |||
|
|||
The `config.json` file contains settings. If you set `headless` to `false`, you will be able to watch the test run in Chrome. The default value for `runs` is 100, which takes ~10 minutes to run, and generates results with ~5% margin of error. Differences larger than this should be investigated. The `urls` array of strings must be set before running the test, and each URL will be tested separately. | |||
The `config.json` file contains settings. If you set `headless` to `false`, you will be able to watch the test run in Chrome. The default value for `runs` is 100, which takes ~10 minutes to run, and generates results with ~5% margin of error. Differences larger than this should be investigated. The `urls` array of strings must be set before running the test, and each URL will be tested separately. If you wish to test `urls` hosted locally, use `gulp --cdn` to serve the binaries from cdn. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should ask all to use gulp serve --compiled
when testing locally.
It should be up to user to decide to use gulp serve --compiled
or gulp serve --cdn
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well if you use the --compiled
version the task won't work for locally served pages. It only works with --cdn
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why doesn't it worked with --compiled
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I see, that's interesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I wonder why don't we use the local server proxy /proxy
.
const matchArray = interceptedUrl.match(CDN_ANALYTICS_REGEXP); | ||
if (matchArray) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if this is a good opportunity to setup an extensible pattern here, possibly some mapping of URLs to handlers since it seems likely there would be more cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point! I know the ads team want to collect two requests delay. Not sure if they'll use this specific task. Maybe useful to expand the feature here.
@powerivq Any comment? Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking of this: each team still implements its own handleRequest to respond the payload it desires, and we make another interceptor to record the timing of the request. Does it make sense? @zhouyx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevinkimball good point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@powerivq I tested out multiple request handlers and it seems to work fine. Seems like a good route to go down in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked at the definition of networkidle0
. I don't know if this would satisfy ads team's request. ActiveView requires the ad to be visible for at least 50% for 1 second. It's very possible that the task ends before activeView. @kevinkimball @micajuine-ho any workaround?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe just adding a setTimeout would do the trick, however I'm not sure if we want to always include it. Maybe we could add a flag (--set-timeout=500
) that would set how long to wait for, if any. Thoughts @zhouyx ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A race between timeout, and the analytics request sounds good to me.
My recommendation is to always include the timeout, but set it to a value that's long enough. People will go with the default setting most of the time, and a timeout will serve as a fallback and reduce flakiness.
BTW this is something I'm testing with, please ignore the console log part : )
let timeout_count = 0;
const promise = new Promise(resolve => {
r = resolve;
setTimeout(() => {
timeout_count++;
resolve();
}, TIMEOUT);
});
page.on('console', message => {
if (String(message._text).includes('request delay')) {
const delay = Number(message._text.replace('analytics request delay is ', ''));
all.push(delay);
r();
}
});
.....
await promise;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhouyx Ok that sounds good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add @estherkim for the best way to customize timeout for different test cases.
69df27f
to
2ab806c
Compare
Adding @estherkim to review changes to the |
f9e03f6
to
e15f0ab
Compare
e15f0ab
to
f963c68
Compare
b4ae9fd
to
a1aba4e
Compare
@@ -1,5 +1,15 @@ | |||
{ | |||
"headless": true, | |||
"runs": 100, | |||
"urls": [] | |||
"urls": [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does the top level urls
do? should this be?
"noHandlers": {
"urls": [],
....
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhouyx Yes, that's what it does. Would it be more clear to have noHandlerUrls: []
instead of making it an object, because right now I don't see any other config data that fits in to the noHandlers
category (headless
and runs
applies to all urls)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the declaration a bit confusing because headless
and runs
applies to all urls, but the top level urls
doesn't. The mix and match makes the configuration structure not intuitive imo.
I don't have strong opinion as long as everything agrees with the config setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm what do you guys think of the following:
urls: {
'http://normal.test',
'http://analytics-1.test': {handler: analyticsHandler},
'http://analytics-2.test': {handler: analyticsHandler},
'http://ads-1.test': {handler: adsHandler},
}
I got the inspo from eslint configs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@estherkim I like it, but would it have 'http://normal.test': {},
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah hmm. Could be ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would still prefer categorize by handler, because it is more often that one wants to test all handler related urls locally. But this would also work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevinkimball Any preference?
Edit: @estherkim actually after thinking about this for a I while I think the eslint config isn't optimal here. Since each handler could be applied many urls, we would end up having duplicate configs for a lot of urls. Also I don't think that urls with the same handler
would have any different values (aka timeout: 2000
vs timeout:5000
). Also, having urls in within the handler definition ensures that each url gets treated the same way with less chance for flake or bias.
await cacheDocuments(urls); | ||
await compileScripts(urls); | ||
await rewriteScriptTags(urls); | ||
await getMetrics(urls, {headless, runs}); | ||
await rewriteAnalyticsTags(urls); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think rewriteScriptTags
and rewriteAnalyticsTags
can be merged to one to avoid read and write files multiple times.
Another optimization would be to not invoke rewriteAnalyticsTags
for urls without the analytics handler defined.
Not blocking, but may be a good optimization to have once we start to run the task on travis. cc @estherkim
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another optimization would be to not invoke rewriteAnalyticsTags for urls without the analytics handler defined.
Sounds good to me.
I think rewriteScriptTags and rewriteAnalyticsTags can be merged to one to avoid read and write files multiple times.
Maybe once the ads team checks in their handler, we can see how we can group the rewrites (assuming that they need to rewrite something).
dace9d0
to
5ac22f0
Compare
5ac22f0
to
e234f8e
Compare
Ready for another round of review. Changes since last review:
|
const CDN_URL = 'https://cdn.ampproject.org/'; | ||
const AMP_JS_PATH = '/dist/amp.js'; | ||
const LOCAL_PATH_REGEXP = /dist\/(v0\/amp-[A-Za-z\-0-9\.]+).max(.js)/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain when would it fetch non compiled script? My understanding is that this should never happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally, we thought that gulp --cdn
to run local pages would work, since that changes the script tags to cdn.ampproject....
and we can parse out the filepath.
However, to use gulp --cdn
, we have to run gulp dist --fortesting
, which then causes amp-analytics/config.js
to not work correctly; it requests file://...v0/analytics-vendors/gtag.json
, which is blocked because AMP doesn't allow requests using the file protocol (it uses the file protocol because this.win.location.protocol
is file on puppeteer.
So the fix is to use neither gulp --cdn to serve local pages, nor gulp --fortesting, but just parse out the filepaths manually. Also the change wasn't too bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please verify with @estherkim that running gulp dist
without --fortesting
works on travis.
But this doesn't explain why you need .max.js
and also amp.js
, they should never appear when you run gulp dist
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this doesn't explain why you need .max.js and also amp.js, they should never appear when you run gulp dist.
Turns out I was making mistakes when running the gulp dist --forstesting
+ gulp serve --compiled
.
Verified that we will be running gulp dist --forstesting
+ gulp serve --compiled
on Travis.
@@ -28,11 +29,12 @@ const {printReport} = require('./print-report'); | |||
*/ | |||
async function performance() { | |||
installPackages(__dirname); | |||
const {headless, runs, urls} = new loadConfig(); | |||
const {headless, runs, urls, handlers} = new loadConfig(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so so confusing.
We first flat the config object to an array of urls and the raw handlers config.
Then later in measureDocument
we iterate the handler config and find the corresponding handler type again?
Could you please optimize the design a bit. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops sorry for the confusion!
- Changed the config API:
handlers
(Object) =>handlers
(List) - Changed
loadConfing()
to do a little more preprocessing to avoid extra work in measureDocuments - Moved analytic specific parameters to the config and removed necessary helpers
- Passed in full config to
measureDocuments
(we were basically doing this before) - Split up
setUpRequestHandlers()
}); | ||
return mapping; | ||
}, {}); | ||
if (Object.keys(config.urlToHandlers).length < 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So running gulp performance
without changing the config will end up with this error? Is that expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and yes. Plan is that before we integrate with our CI, we add urls.
"extraUrlParam": { | ||
"analytics": "amp-analytics-performance-param" | ||
}, | ||
"analyticsParam": "analytics=amp-analytics-performance-param" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two general recommendation.
- The config should be simple and only contains essential information.
analyticsParam
should be able to be generated fromextraUrlParam
in code. extraUrlParam
andanalyticsParam
don't apply to all handlers. Specifying here may be confusing, a nestedhandlerConfig
may be helpful.
This is fine with only three handler types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhouyx Agree with removing analyticsParam
from the config.
extraUrlParam and analyticsParam don't apply to all handlers.
Yep, they already live in the analyticsHandler
. Is that good enough?
const CDN_URL = 'https://cdn.ampproject.org/'; | ||
const AMP_JS_PATH = '/dist/amp.js'; | ||
const LOCAL_PATH_REGEXP = /dist\/(v0\/amp-[A-Za-z\-0-9\.]+).max(.js)/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please verify with @estherkim that running gulp dist
without --fortesting
works on travis.
But this doesn't explain why you need .max.js
and also amp.js
, they should never appear when you run gulp dist
.
*/ | ||
function setUpAddionalHandlers(handlersList, handlerOptions) { | ||
// Setting up timing and adding specified handler | ||
if (handlerOptions.handlerName === 'analyticsHandler') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if this wrong.
All urls within the same handlers
config share the same handlerOptions
config. (we're not creating new instances of the handlerOptions
object within load-config.js
file). I'm afraid rewriting the handlerOptions
object would affect other instances that fall within the same handler. Have you tried the test with multiple urls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I have tried it with multiple urls; within measureDocument()
, we create handlerOptionsForUrl
by creating a shallow copy of the original handlerOptions
config. This works fine for storing the startTime
and endTime
, but would cause some bugs if someone were to mutate objects within the handlerOptions
. Instead of shallow copy, should I create a deep copy?
40fa4b7
to
157e12e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good if this runs correctly, please also get infra team's approval.
@@ -19,13 +19,16 @@ const fs = require('fs'); | |||
const path = require('path'); | |||
|
|||
const CDN_URL = 'https://cdn.ampproject.org/'; | |||
const V0_PATH = '/dist/v0.js'; | |||
const LOCAL_PATH_REGEXP = /dist\/(v0\/amp-[A-Za-z\-0-9\.]+.js)/; | |||
const ANALYTICS_VENDORS_PATH = '../../../extensions/amp-analytics/0.1/vendors/'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use the compiled vendor which lives under v0/analytics-vendors/xxx.json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, if we're using that path then we should also add the gulp vendor-configs
command as part of this task...
That command will be run in the Travis build, but I think it's good to have it for local testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we always run that with gulp build
or gulp dist
, so it should be fine?
157e12e
to
57b7840
Compare
@estherkim PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for working on this!
@@ -28,6 +29,7 @@ async function compileScripts(urls) { | |||
if (!argv.nobuild) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--nobuild is only used when you have already built the binary and don't want to build it again.
There's no need to await for vendorConfigs
.
57b7840
to
3b84fde
Compare
Adding functionality to the performance gulp task to track how long it takes for requests (pageview and ini-load) to fire for pages with amp-analytics tags.This PR is now for supporting locally served pages, along with building a structure for wgs easily add in handlers to measure requests (i.e. wg-ads/wg-analytics measuring delay til the
visible
ping is sent out). This is one of the steps in adding the performance task into our CI (since our test pages will be locally served pages).Changes:
rewrite-script-tags
:rewrite-analytics-tags
:measureDocuments
: