Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SRA support for Doubleclick Fast Fetch Traffic #9563

Merged
merged 36 commits into from Jun 8, 2017
Merged

Add SRA support for Doubleclick Fast Fetch Traffic #9563

merged 36 commits into from Jun 8, 2017

Conversation

keithwrightbos
Copy link
Contributor

@keithwrightbos keithwrightbos commented May 25, 2017

Still a WIP but contains the following:

  • If page url includes force_sra=true query parameter, doubleclick request will be sent via SRA
  • combines doubleclick amp-ad elements by network ID (sends separate SRA requests for different networks)
  • expect line delimited response format where odd lines are JSON formatted headers and event lines are the creative content
  • uses streaming fetch method so that blocks can be rendered without waiting for entire SRA response (renders each block as their pair of lines are retrieved) where possible (requires native fetch API) otherwise falls back to fetching entire response prior to processing.
  • in the event of error, slots are collapsed and frame get fallback method is disabled
  • resets global SRA fetch executor in unlayoutCallback allowing for resumeCallback initiated chain to refetch via SRA
  • separates parameters into block and page level. Assumes that only block level parameters included in COMBINER global functions are required. TODO: verify and implement exceptions (e.g. block fields within googleBlockParameters such as ifi, adf, and act).

TODO:

  • allow for selecting into SRA by publisher and/or experiment
  • update CSI metrics to properly capture SRA behavior
  • consider if object key iteration is sufficient to guarantee parameter order (current testing suggests order by insertion is enforced)

cc/ @ampproject/a4a

@keithwrightbos
Copy link
Contributor Author

Looking for initial feedback while tests are updated

Fixes #9115

Copy link
Contributor

@jridgewell jridgewell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like two unrelated PRs got merged together? Can you split them up?

return googlePageParameters(a4a.win, a4a.getAmpDoc(), startTime)
.then(pageLevelParameters => {
Object.assign(parameters, blockLevelParameters);
Object.assign(parameters, pageLevelParameters);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be done in a single call:

Object.assign(parameters, blockLevel, pageLevel);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

const blockLevelParameters = googleBlockParameters(a4a, opt_experimentIds);
return googlePageParameters(a4a.win, a4a.getAmpDoc(), startTime)
.then(pageLevelParameters => {
Object.assign(parameters, blockLevelParameters);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird to mutate an object you don't own.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that's fair but given the mutated object is constructed within this function and never exposed, I see no reason not to use it as opposed to creating a copy.

@@ -493,6 +493,9 @@ export class FetchResponse {

/** @type {boolean} */
this.bodyUsed = false;

/** @type {?ReadableStream} */
this.body = this.xhr_.body;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're worried about polyfill not having a body, set to null directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this essentially the same end result?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. You're accessing a property here that doesn't exist and isn't expected to exist. Setting it to null makes that clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* @return {!Promise<!../../../src/service/xhr-impl.FetchResponse>} response,
* note that accessing body will cause error as its consumed in streaming.
*/
export function fetchLineDelimitedChunks(xhr, url, chunkHandler, opt_init) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.then(response => {
if (!response.body || !xhr.win.TextDecoder) {
// TODO(keithwrightbos) - TextDecoder polyfill?
response.text().then(content =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return here, to avoid nesting in the else (else can be dropped).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

handleFetchResponseStream_(
response.body.getReader(), chunkCallback, new TextDecoder('utf-8'));
}
return response;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is return necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to allow access to the SRA request's headers and status code

snippet += chunk.substr(pos, end - pos);
pos = hasDelimiter ? delimIndex + 1 : end;
if (hasDelimiter) {
if (!metaData) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please flip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

const chunkCallback = (chunk, done) => {
let hasDelimiter;
let pos = 0;
do {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const regex = /(.+)(\n)?/g;
while ((match = regex.exec(chunk))) {
  snippet += chunk[1];
  if (match[2]) {
    if (metadata) {
      metadata = parse(snippet);
    } else {
      chunk(snippet, metadata, done && regex.lastIndex === chunk.length);
    }
    snippet = '';
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done though I had to extend the RegExp to handle the case where a line could have no content as this would occur when no creative could be found for a given request.

// the last slash is used to escape a character that we haven't
// received yet. So we skip the last byte in this iteration and
// wait for the next chunk to see what is being escaped.
end--;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't make sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed


/**
* Creates an XHR streaming request with expected response format of repeated
* pairs of JSON object metadata then string HTML delimited by line returns.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot of code to support streaming line delimited responses. If these single lines are very long and there are very few lines, you're likely not seeing any benefit from streaming the lines separately.

Streaming bodies makes sense when the HTML can be parsed incrementally. As it is now, we're not doing that, we're streaming lines of text.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True and I may look into doing sub-line chunking in the future but currently both of our rendering methods (friendly frame srcdoc and safeframe passing the entire creative as the name attribute) require the full creative prior to rendering start. I plan to look into possible service worker support which would allow streaming (will email to discuss outside this PR).

However, note that in testing the chunk size is 32k (matches gzip window) and its often that can account for 1-2 creative responses. Doubleclick SRA requests commonly have many blocks (happy to share data offline) so I do believe this solution to have meaning impact as it will allow blocks to render prior to receiving the entire SRA response.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a sample. Both streaming and all-text paths seem to complete in about ~10ms (using 7 responses, some taking 16k, others taking 8k). I still think this is way over-engineering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The improvement comes from lower bandwidth usage cases. For instance, if you patch this PR and go to http://localhost:8000/proxy/amp.timeinc.net/time/4458554/best-video-games-all-time/?amp_js_v=0.1&force_sra=true#exp=a4a%3A-1 and emulate 3g, you'll find that the delay between the first and last chunk is ~150 ms which is certainly meaningful in the ad space. The hope is to make SRA the default behavior as it allows for better compression across requests, reduced server usage, and possible downstream optimizations. However, it means that the delay imposed on the first slot's time to render must be as minimal as possible. Chunking helps ensure this and I hope in the future to combine SRA with service workers allowing the results to be streamed into the frame as you suggest.

@keithwrightbos
Copy link
Contributor Author

@jridgewell PTAL. Regarding your comment why this appears to be two PRs instead of one, did you review/see the changes to amp-ad-network-doubleclick-impl.js?

};
handleFetchResponseStream_(
response.body.getReader(), chunkCallback, new TextDecoder('utf-8'));
return response;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return an object instead of the response. Since this body is being used, it's just going to lead to problems later on if we anyone tries to use it again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -493,6 +493,9 @@ export class FetchResponse {

/** @type {boolean} */
this.bodyUsed = false;

/** @type {?ReadableStream} */
this.body = this.xhr_.body;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. You're accessing a property here that doesn't exist and isn't expected to exist. Setting it to null makes that clear.

export function fetchLineDelimitedChunks(xhr, url, chunkHandler, opt_init) {
return xhr.fetch(url, opt_init)
.then(response => {
if (!response.body || !xhr.win.TextDecoder) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, can we just send this down the same path? It'll require splitting this method into two:

  1. A check for fast path (body and TextDecoder) or slow path
  2. Extracting lines from the string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


/**
* Creates an XHR streaming request with expected response format of repeated
* pairs of JSON object metadata then string HTML delimited by line returns.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a sample. Both streaming and all-text paths seem to complete in about ~10ms (using 7 responses, some taking 16k, others taking 8k). I still think this is way over-engineering.

'scp': serializeTargeting(
this.jsonTargeting_['targeting'] || null,
this.jsonTargeting_['categoryExclusions'] || null),
}, googleBlockParameters(this));;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove second ; (as you just noticed while looking over my shoulder)

*/
export function groupAmpAdsByType(win, type, groupFn) {
return resourcesForDoc(win.document).getMeasuredResources(win,
r => r.element.tagName == 'AMP-AD' &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you either add {} around the body of the anonymous function or make it all one line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I do that then I need to add a return as well... this is so pretty

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine

return googlePageParameters(a4a.win, a4a.getAmpDoc(), startTime)
.then(pageLevelParameters => {
Object.assign(parameters, blockLevelParameters, pageLevelParameters);
Object.assign(parameters, pageLevelParameters);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the second assign necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

@@ -16,6 +16,7 @@

import {AmpA4A} from '../amp-a4a';
import {base64UrlDecodeToBytes} from '../../../../src/utils/base64';
import '../../../amp-ad/0.1/amp-ad-xorigin-iframe-handler';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this import needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed: Was running into test failures due to assert in amp-a4a

* @private @const
* {!Array<!function(AmpAdNetworkDoubleclickImpl>):?Object<string,string>}
*/
const BLOCK_SRA_COMBINERS_ = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some documentation on this array / what the different functions within it are for

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

const tfcd = jsonParameters['tagForChildDirectedTreatment'];
const adTestOn = isInManualExperiment(this.element);

this.jsonTargeting_ = rawJson ? JSON.parse(rawJson) : {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this throws from bad json on the element, is that handled already somewhere else? Otherwise you should use tryParseJson here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That code previously existed but I fixed!

@keithwrightbos
Copy link
Contributor Author

@jridgewell & @bradfrizzell PTAL. I am working on adding more tests to test-amp-ad-network-doubleclick-impl.js to cover combiner array and SRA XHR flow.

const TFCD_ = 'tagForChildDirectedTreatment';

/** @private {?Promise} */
let SraRequests = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup.

// non-SRA request (benefit is it allows direct cache method).
if (typeInstances.length == 1) {
dev().info(TAG, `single block in network ${networkId}`);
typeInstances[0].sraResponseResolver_(null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and no. It'll work, because Closure doesn't enforce private. But private properties are only for this instance, you can't touch another instances private properties.

});
})
.then(response => lineDelimitedStreamer(this.win, response,
metaJsonCreativeGrouper((creative, headers) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's define this as a function declaration, to avoid this super nesting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// Force safeframe rendering method.
headers[RENDERING_TYPE_HEADER] = XORIGIN_MODE.SAFEFRAME;
// Need to convert to FetchResponse object?
const fetchResponseHeaders =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just construct a FetchResponseHeaders?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem is that the constructor expects !XMLHttpRequest|!XDomainRequest. I could modify the class to take object as an optional object or can I create a mock XMLHttpRequest... thoughts?

has: name => !!headers[name],
});
const fetchResponse =
/** @type {?../../../src/service/xhr-impl.FetchResponse} */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

// explicit is required!
instance.resetAdUrl();
if (!canceled) {
instance.forceCollapse();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#forceCollapse should be a privileged API (causes page jumps). Isn't there one that collapses conditionally like #attemptChangeHeight?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

*/
export function groupAmpAdsByType(win, type, groupFn) {
return resourcesForDoc(win.document).getMeasuredResources(win,
r => r.element.tagName == 'AMP-AD' &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine

callback(
unescapeLineDelimitedHtml_(line),
/** @type {!Object<string, *>} */(tryParseJson(first) || {}),
done);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're calling the callback with 3 arguments, but I'm only seeing callbacks that take 2 arguments getting passed in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True as I don't actually care if a given chunk is the last. However, I have yet to update CSI beacons and I can see the total processing time for an SRA request being important.

if (!this.useSra_) {
return super.sendXhrRequest(adUrl);
}
// Wait for SRA request which will ncall response promise when this block's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: fix typo 'ncall'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@keithwrightbos
Copy link
Contributor Author

@jridgewell & @bradfrizzell PTAL. I added test coverage for amp-ad-network-doubleclick-impl.js changes. Only remaining items as I see it (and would like to cover in a later PR) are:

  • integration test coverage
  • CSI beacon modifications
  • configure experiment to test and/or possible beta publisher integration

let fixture;
let xhrMock;

function createInstance(networkId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createA4aSraInstance instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* networkId:number,
* instances:number,
* xhrFail:boolean|undefined,
* invalidInstances:number}}>} networkIds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc incorrect, should be items here or change items to networkIds below. Please add some additional documentation for this function too, or at least the parameter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

};
createInstances(network.instances);
createInstances(network.invalidInstances, true);
allInvalidNetworks[network.networkId] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this name isn't really right, seeing as all networks are included in this regardless of whether or not they are invalid. Maybe something like networkValidity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

createInstances(network.invalidInstances, true);
allInvalidNetworks[network.networkId] =
network.invalidInstances && !network.instances;
xhrFailingNetworks[network.networkId] = !!network.xhrFail;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same with this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

function executeTest(items) {
const xhrFailingNetworks = {};
const allInvalidNetworks = {};
const instances = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could use a more descriptive name here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@bradfrizzell bradfrizzell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing approval

Copy link
Contributor

@bradfrizzell bradfrizzell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but add a test of assignAdUrlToError

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants