Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track impression on amp landing page #5606

Merged
merged 13 commits into from Oct 28, 2016
15 changes: 15 additions & 0 deletions build-system/server.js
Expand Up @@ -404,6 +404,21 @@ app.use('/examples/amp-fresh.amp.(min.|max.)?html', function(req, res, next) {
next();
});


app.use('/impression-proxy/', function(req, res) {
assertCors(req, res, ['GET']);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this. I don't want to put a fake cookie here to get the ad server response. Is there a way to get the correct cookie here in proxy server ?(I don't think there should be due to security reason). If not I think I will stick with the fake ad response for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think you can. You'd need to hardcode one :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢

// Fake response with the following optional fields:
// location: The Url the that server would have sent redirect to w/o ALP
// tracking_url: URL that should be requested to track click
// gclid: The conversion tracking value
const body = {
'location': 'localhost:8000/examples/?gclid=1234&foo=bar&example=123',
'tracking_url': 'tracking_url',
'gclid': '1234',
};
res.send(body);
});

// Proxy with unminified JS.
// Example:
// http://localhost:8000/max/s/www.washingtonpost.com/amphtml/news/post-politics/wp/2016/02/21/bernie-sanders-says-lower-turnout-contributed-to-his-nevada-loss-to-hillary-clinton/
Expand Down
12 changes: 9 additions & 3 deletions src/document-info.js
Expand Up @@ -44,8 +44,6 @@ export let DocumentInfoDef;
export function documentInfoForDoc(nodeOrDoc) {
return /** @type {!DocumentInfoDef} */ (getServiceForDoc(nodeOrDoc,
'documentInfo', ampdoc => {
const url = ampdoc.getUrl();
const sourceUrl = getSourceUrl(url);
const rootNode = ampdoc.getRootNode();
let canonicalUrl = rootNode && rootNode.AMP
&& rootNode.AMP.canonicalUrl;
Expand All @@ -56,11 +54,19 @@ export function documentInfoForDoc(nodeOrDoc) {
canonicalUrl = parseUrl(canonicalTag.href).href;
}
const pageViewId = getPageViewId(ampdoc.win);
return {url, sourceUrl, canonicalUrl, pageViewId};
const res = {
get sourceUrl() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still a getter property. Remove the get part, can it turns into a getter method. Maybe call it getSourceUrl.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you want user to call data.getSourceUrl(), instead of data.sourceUrl? I think this get makes sure they can access sourceUrl just like before and that's what we want.

return getSourceUrl(ampdoc.getUrl());
},
canonicalUrl,
pageViewId,
};
return res;
}));
}



/**
* Returns a relatively low entropy random string.
* This should be called once per window and then cached for subsequent
Expand Down
94 changes: 90 additions & 4 deletions src/impression.js
Expand Up @@ -14,31 +14,69 @@
* limitations under the License.
*/

import {user} from './log';
import {dev, user} from './log';
import {isExperimentOn} from './experiments';
import {viewerForDoc} from './viewer';
import {xhrFor} from './xhr';
import {
isProxyOrigin,
parseUrl,
parseQueryString,
addParamsToUrl,
} from './url';
import {timerFor} from './timer';
import {getMode} from './mode';

const TIMEOUT_VALUE = 8000;

let trackImpressionPromise = null;

/**
* A function to get the trackImpressionPromise;
* @return {!Promise}
*/
export function getTrackImpressionPromise() {
return dev().assert(trackImpressionPromise);
}

/**
* Function that reset the trackImpressionPromise only for testing
* @visibleForTesting
*/
export function resetTrackImpressionPromiseForTesting() {
trackImpressionPromise = null;
}

/**
* Emit a HTTP request to a destination defined on the incoming URL.
* Protected by experiment.
* @param {!Window} win
*/
export function maybeTrackImpression(win) {
let resolveImpression;

trackImpressionPromise = new Promise(resolve => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a ForTesting function that sets it back to null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ForTesting function added. But why can't we just export resolveImpression for testing?

resolveImpression = resolve;
});

if (!isExperimentOn(win, 'alp')) {
resolveImpression();
return;
}

const viewer = viewerForDoc(win.document);
/** @const {string|undefined} */
const clickUrl = viewer.getParam('click');

if (!clickUrl) {
resolveImpression();
return;
}
if (clickUrl.indexOf('https://') != 0) {
user().warn('Impression',
'click fragment param should start with https://. Found ',
clickUrl);
resolveImpression();
return;
}
if (win.location.hash) {
Expand All @@ -47,15 +85,63 @@ export function maybeTrackImpression(win) {
// avoid duplicate tracking.
win.location.hash = '';
}

viewer.whenFirstVisible().then(() => {
invoke(win, clickUrl);
// TODO(@zhouyx) need test with a real response.
const promise = invoke(win, dev().assertString(clickUrl)).then(response => {
applyResponse(win, viewer, response);
});

// Timeout invoke promise after 8s and resolve trackImpressionPromise.
resolveImpression(timerFor(win).timeoutPromise(TIMEOUT_VALUE, promise,
'timeout waiting for ad server response').catch(() => {}));
});
}

/**
* Send the url to ad server and wait for its response
* @param {!Window} win
* @param {string} clickUrl
* @return {!Promise<!JSONType>}
*/
function invoke(win, clickUrl) {
xhrFor(win).fetchJson(clickUrl, {
if (getMode().localDev && !getMode().test) {
clickUrl = 'http://localhost:8000/impression-proxy?url=' + clickUrl;
}
return xhrFor(win).fetchJson(clickUrl, {
credentials: 'include',
requireAmpResponseSourceOrigin: true,
});
// TODO(@cramforce): Do something with the result.
}

/**
* parse the response back from ad server
* Set for analytics purposes
* @param {!Window} win
* @param {!Object} response
*/
function applyResponse(win, viewer, response) {
const adLocation = response['location'];
const adTracking = response['tracking_url'];

// If there is a tracking_url, need to track it
// Otherwise track the location
const trackUrl = adTracking || adLocation;

if (trackUrl && !isProxyOrigin(trackUrl)) {
// To request the provided trackUrl for tracking purposes.
new Image().src = trackUrl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment as to why and what this does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

// Replace the location href params with new location params we get.
if (adLocation) {
if (!win.history.replaceState) {
return;
}
const currentHref = win.location.href;
const url = parseUrl(adLocation);
const params = parseQueryString(url.search);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you just parseQueryString(adLocation)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseQueryString expects the input to be a query string either ?a=1&b=1 or a=1&b=1

const newHref = addParamsToUrl(currentHref, params);
win.history.replaceState(null, '', newHref);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine the URLs are:
location.href = http://a.com/?a=1&b=2
adLocation = http://a.com/?a=1&b=2

The above logic gives:
newHref = http://a.com/?a=1&b=2&a=1&b=2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is probably OK I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we just make it the servers responsibility to come up the right URL, and we just blindly replace it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% sure what you are suggesting, but we want to minimize server changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to make the protocol simple, just completely replace the URL to the location provided in the response if the host matches. That gives server the ability to remove any existing query params if it wants.

Overall, I think client is not the right place to concatenate the URL, because client does not have the full picture as the server does. It will potentially limit future changes at server side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lannka would it be the case that location provided in the response and the original href has different query params? If so what should we do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lannka persuaded me that blindly replacing URL is the right way to do here on my bus home yesterday 😄 . It will help to keep behaviors same with what we have now w/o ALP. Server most likely will keep all query params and append gclid to the end. If server really want to delete any query params that we have, we should still enable them to do that, instead of keeping it.
@cramforce What's your idea about this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. I'd avoid the call to replaceState if old and new are the same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think replaceState is optional for us. Check if it exists and just don't call it if it does not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, add the check for if(!win.history.replaceState). But when would it be null?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not supported by some browsers that otherwise work with AMP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. we just do nothing at all for this case? I mean we can probably use other way to attach the params to location.hash

}
}
40 changes: 32 additions & 8 deletions src/service/url-replacements-impl.js
Expand Up @@ -29,6 +29,7 @@ import {viewportForDoc} from '../viewport';
import {userNotificationManagerFor} from '../user-notification';
import {activityFor} from '../activity';
import {isExperimentOn} from '../experiments';
import {getTrackImpressionPromise} from '../impression.js';


/** @private @const {string} */
Expand Down Expand Up @@ -163,6 +164,14 @@ export class UrlReplacements {
return removeFragment(info.sourceUrl);
}));

this.setAsync_('SOURCE_URL', () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide sync alternatives to these that do not wait for the promise

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return getTrackImpressionPromise().then(() => {
return this.getDocInfoValue_(info => {
return removeFragment(info.sourceUrl);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto duplication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't quite understand. ❓

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! I don't want to change this though. it's a one line code, no need to add an extra method for it.

});
});
});

// Returns the host of the Source URL for this AMP document.
this.set_('SOURCE_HOST', this.getDocInfoValue_.bind(this, info => {
return parseUrl(info.sourceUrl).host;
Expand All @@ -186,15 +195,13 @@ export class UrlReplacements {
}));

this.set_('QUERY_PARAM', (param, defaultValue = '') => {
user().assert(param,
'The first argument to QUERY_PARAM, the query string ' +
'param is required');
const url = parseUrl(this.ampdoc.win.location.href);
const params = parseQueryString(url.search);
return this.getQueryParamData_(param, defaultValue);
});

return (typeof params[param] !== 'undefined') ?
params[param] :
defaultValue;
this.setAsync_('QUERY_PARAM', (param, defaultValue = '') => {
return getTrackImpressionPromise().then(() => {
return this.getQueryParamData_(param, defaultValue);
});
});

/**
Expand Down Expand Up @@ -540,6 +547,23 @@ export class UrlReplacements {
return navigationInfo[attribute];
}

/**
* Return the QUERY_PARAM from the current location href
* @param {*} param
* @param {string} defaultValue
* @return {string}
*/
getQueryParamData_(param, defaultValue) {
user().assert(param,
'The first argument to QUERY_PARAM, the query string ' +
'param is required');
user().assert(typeof param == 'string', 'param should be a string');
const url = parseUrl(this.ampdoc.win.location.href);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't have the same problem as above, because it reads the .href every time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dvoytenko Should this look at the ampdoc?

const params = parseQueryString(url.search);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

return (typeof params[param] !== 'undefined')
? params[param] : defaultValue;
}

/**
*
* Sets a synchronous value resolver for the variable with the specified name.
Expand Down
1 change: 0 additions & 1 deletion src/service/xhr-impl.js
Expand Up @@ -162,7 +162,6 @@ export class Xhr {
const init = opt_init || {};
init.method = normalizeMethod_(init.method);
setupJson_(init);

return this.fetchAmpCors_(input, init).then(response => {
return assertSuccess(response);
}).then(response => response.json());
Expand Down
22 changes: 20 additions & 2 deletions test/functional/test-document-info.js
Expand Up @@ -63,12 +63,30 @@ describe('document-info', () => {
};
win.document.defaultView = win;
installDocService(win, true);
expect(documentInfoForDoc(win.document).url).to.equal(
'https://cdn.ampproject.org/v/www.origin.com/foo/?f=0');
expect(documentInfoForDoc(win.document).sourceUrl).to.equal(
'http://www.origin.com/foo/?f=0');
});

it('should provide the updated sourceUrl', () => {
const win = {
document: {
nodeType: /* document */ 9,
querySelector() { return 'http://www.origin.com/foo/?f=0'; },
},
Math: {random() { return 0.123456789; }},
location: {
href: 'https://cdn.ampproject.org/v/www.origin.com/foo/?f=0',
},
};
win.document.defaultView = win;
installDocService(win, true);
expect(documentInfoForDoc(win.document).sourceUrl).to.equal(
'http://www.origin.com/foo/?f=0');
win.location.href = 'https://cdn.ampproject.org/v/www.origin.com/foo/?f=1';
expect(documentInfoForDoc(win.document).sourceUrl).to.equal(
'http://www.origin.com/foo/?f=1');
});

it('should provide the pageViewId', () => {
return getWin('https://twitter.com/').then(win => {
sandbox.stub(win.Math, 'random', () => 0.123456789);
Expand Down