Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SRA support for Doubleclick Fast Fetch Traffic #9563

Merged
merged 36 commits into from Jun 8, 2017
Merged
Show file tree
Hide file tree
Changes from 17 commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
1144b42
Initial implementation of Doubleclick SRA
keithwrightbos May 23, 2017
733f31d
Merge branch 'master' of github.com:ampproject/amphtml into dbclk_sra
keithwrightbos May 23, 2017
457a628
continued development
keithwrightbos May 25, 2017
798585c
Merge branch 'master' of github.com:ampproject/amphtml into dbclk_sra
keithwrightbos May 25, 2017
2ebc596
modify url building to pass parameters as object
keithwrightbos May 25, 2017
49d42ee
fix type error
keithwrightbos May 25, 2017
28f728f
fix type errors
keithwrightbos May 25, 2017
616c2ab
reset SraRequests global in unlayoutCallback
keithwrightbos May 25, 2017
6c8b1d6
test fixes
keithwrightbos May 26, 2017
ae90234
Merge branch 'master' of github.com:ampproject/amphtml into dbclk_sra
keithwrightbos May 26, 2017
911de44
line-delimited-response-handler test coverage
keithwrightbos May 26, 2017
a136616
remove console log
keithwrightbos May 26, 2017
46d0865
PR feedback
keithwrightbos May 31, 2017
4788276
PR feedback
keithwrightbos May 31, 2017
e3a5115
Merge branch 'master' of github.com:ampproject/amphtml into dbclk_sra
keithwrightbos May 31, 2017
9b9cbe6
Merge branch 'master' of github.com:ampproject/amphtml into dbclk_sra
keithwrightbos Jun 1, 2017
a32cf99
PR review
keithwrightbos Jun 1, 2017
154b5a5
PR review, test fixes
keithwrightbos Jun 1, 2017
b75d821
small refactor, test fix
keithwrightbos Jun 1, 2017
c8b813f
make functions private
keithwrightbos Jun 1, 2017
d3de0bb
PR feedback
keithwrightbos Jun 2, 2017
0608e05
Merge branch 'master' of github.com:ampproject/amphtml into dbclk_sra
keithwrightbos Jun 3, 2017
e4d8874
PR Review and additional test coverage
keithwrightbos Jun 6, 2017
ff3a2fe
Merge branch 'master' of github.com:ampproject/amphtml into dbclk_sra
keithwrightbos Jun 7, 2017
1bb1fb8
fix gulp presubmit failures
keithwrightbos Jun 7, 2017
28278a8
Test fixes/additions
keithwrightbos Jun 7, 2017
1b2a657
slight tweak to how tests are called
keithwrightbos Jun 7, 2017
15c8c09
Merge branch 'master' of github.com:ampproject/amphtml into dbclk_sra
keithwrightbos Jun 7, 2017
1b34e91
PR feedback
keithwrightbos Jun 7, 2017
77c1106
lint fix
keithwrightbos Jun 7, 2017
2d1a66e
update sfv parameter
keithwrightbos Jun 7, 2017
cb3fa07
add still current check after xhr response; add comments
keithwrightbos Jun 7, 2017
4c6ed8b
fix issue with error logging
keithwrightbos Jun 7, 2017
fa74e12
Additional test coverage
keithwrightbos Jun 8, 2017
9b45a1d
Merge branch 'master' of github.com:ampproject/amphtml into dbclk_sra
keithwrightbos Jun 8, 2017
e651878
remove only from test runner
keithwrightbos Jun 8, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
116 changes: 116 additions & 0 deletions ads/google/a4a/line-delimited-response-handler.js
@@ -0,0 +1,116 @@
/**
* Copyright 2017 The AMP HTML Authors. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS-IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

import {dev} from '../../../src/log';
import {isObject} from '../../../src/types';

/**
* Creates an XHR streaming request with expected response format of repeated
* pairs of JSON object metadata then string HTML delimited by line returns.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot of code to support streaming line delimited responses. If these single lines are very long and there are very few lines, you're likely not seeing any benefit from streaming the lines separately.

Streaming bodies makes sense when the HTML can be parsed incrementally. As it is now, we're not doing that, we're streaming lines of text.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True and I may look into doing sub-line chunking in the future but currently both of our rendering methods (friendly frame srcdoc and safeframe passing the entire creative as the name attribute) require the full creative prior to rendering start. I plan to look into possible service worker support which would allow streaming (will email to discuss outside this PR).

However, note that in testing the chunk size is 32k (matches gzip window) and its often that can account for 1-2 creative responses. Doubleclick SRA requests commonly have many blocks (happy to share data offline) so I do believe this solution to have meaning impact as it will allow blocks to render prior to receiving the entire SRA response.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a sample. Both streaming and all-text paths seem to complete in about ~10ms (using 7 responses, some taking 16k, others taking 8k). I still think this is way over-engineering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The improvement comes from lower bandwidth usage cases. For instance, if you patch this PR and go to http://localhost:8000/proxy/amp.timeinc.net/time/4458554/best-video-games-all-time/?amp_js_v=0.1&force_sra=true#exp=a4a%3A-1 and emulate 3g, you'll find that the delay between the first and last chunk is ~150 ms which is certainly meaningful in the ad space. The hope is to make SRA the default behavior as it allows for better compression across requests, reduced server usage, and possible downstream optimizations. However, it means that the delay imposed on the first slot's time to render must be as minimal as possible. Chunking helps ensure this and I hope in the future to combine SRA with service workers allowing the results to be streamed into the frame as you suggest.

*
* @param {!../../../src/service/xhr-impl.Xhr} xhr
* @param {string} url
* @param {!function(string, !Object<string, *>, boolean)} slotCallback called
* with creative, metadata, and boolean indicating if completed. Failure to
* JSON parse metadata will cause empty object to be given.
* @param {?../../../src/service/xhr-impl.FetchInitDef=} opt_init
* @return {!Promise<!../../../src/service/xhr-impl.FetchResponse>} response,
* note that accessing body will cause error as its consumed in streaming.
*/
export function fetchLineDelimitedChunks(xhr, url, slotCallback, opt_init) {
return xhr.fetch(url, opt_init)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so this function has 3 concerns:

  1. Fetch
  2. Read line delimited streams
  3. Group two lines together

These should all be separate.

  1. Fetch should be handled by caller.
  2. Reading lines is good
  3. Grouping lines should be handled by slotCallback (bad naming, too)

Instead, I'm looking for something along these lines:

// caller, I guess DoubleClick
const fetch = xhr.fetch(url, input);
fetch.then(lineDelimitedStreamer(grouper(doubleClickHandler)));

// This should probably be an instance method
// This groups two lines together (metadata and creative lines)
function grouper(callback) {
  let first;
  return function(chunk, done) {
    if (first) {
      callback(first, chunk);
      first = null;
    } else {
      first = chunk;
    }
  };
}

// This is a generic line-delimited response streamer
function lineDelimitedStreamer(callback) {
  let snippet = '';
  function streamer(text) {
    // regex stuff here
  }
  return function(response) {
    if (!isStreamable(response)) {
      response.text().then(streamer);
      return;
    }

    const decoder = new TextDecoder('utf-8');
    const reader = response.body.getReader();
    reader.read().then(function chunk(result) {
      // ...
      streamer(chunk);
    });
  };
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

.then(response => {
let metadata;
let snippet = '';
const chunkCallback = (chunk, done) => {
const regex = /([^\n]*)(\n)?/g;
let match;
while ((match = regex.exec(chunk))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some specification on what chunk can be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its fairly straight forward that the chunk is a portion of streamed content that may or may not contain a line delimiter

snippet += match[1];
if (match[2]) {
if (metadata) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that it is intended that we only get the metadata on the 'even' matches, and only call the slotCallback on the 'odd' matches? If so, could this be rewritten in such a way to make that clearer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, after refactor suggested by @jridgewell I believe this is more clear given it now utilizes a function named metaJsonCreativeGrouper

slotCallback(unescapeHtml_(snippet), metadata,
done && regex.lastIndex === chunk.length);
metadata = undefined;
} else {
metadata = safeJsonParse_(snippet);
}
snippet = '';
}
if (regex.lastIndex === chunk.length) {
break;
}
}
};
if (!response.body || !xhr.win.TextDecoder) {
// TODO(keithwrightbos) - TextDecoder polyfill?
response.text().then(content => chunkCallback(content, true));
} else {
handleFetchResponseStream_(
response.body.getReader(), chunkCallback, new TextDecoder('utf-8'));
}
return {status: response.status, headers: response.headers};
});
}

/**
* @param {!ReadableStreamReader} reader The reader of the fetch's
* response stream.
* @param {!function(string, boolean)} chunkHandler
* @param {!TextDecoder} textDecoder
* @private
*/
function handleFetchResponseStream_(reader, chunkHandler, textDecoder) {
reader.read().then(function chunk(result) {
if (result.value) {
chunkHandler(
textDecoder.decode(
/** @type {!ArrayBuffer} */(result.value), {'stream': true}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, you don't want to stream when the result is done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point though it can't hurt and makes the code simpler

result.done);
}
if (!result.done) {
// More chunks to read.
reader.read().then(chunk);
}
});
}

/**
* Unescapes characters that are escaped in line-delimited JSON-HTML.
* @param {string} html An html snippet.
* @return {string}
* @private
*/
function unescapeHtml_(html) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe name can be unescapeLineDelimitedHtml_ instead, would make it clearer for people reading that they aren't seeing an AMP specific version of UriDecode or something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return html ? html.replace(
/\\(n|r|\\)/g,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about \r\n?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This already handles that case (can see test coverage)

(_, match) => match == 'n' ? '\n' : match == 'r' ? '\r' : '\\') : html;
}

/**
* @param {string} rawString to be parsed
* @return {!Object<string, *>} JSON parsed string or empty object if invalid.
* @private
*/
function safeJsonParse_(rawString) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't you use tryParseJson instead of making a new function that does mostly the same thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of tryParseJson... thanks

try {
const result = JSON.parse(rawString);
dev().assert(isObject(result));
return /** @type {!Object<string,*>} */(result);
} catch (err) {
return {};
}
}
201 changes: 201 additions & 0 deletions ads/google/a4a/test/test-line-delimited-response-handler.js
@@ -0,0 +1,201 @@
/**
* Copyright 2017 The AMP HTML Authors. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS-IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

import {fetchLineDelimitedChunks} from '../line-delimited-response-handler';
import * as sinon from 'sinon';

describe('#line-delimited-response-handler', () => {

const url = 'https://someadnetwork.com?who=me&yes=you';
const xhrInit = {
mode: 'cors',
method: 'GET',
credentials: 'include',
};
let fetchStub;
let chunkHandlerStub;
let slotData;
let xhr;

/**
* @param {!Array<!{{creative:string,headers:!Object<string,string>}}>} slotData
* @return {string} slot data written in expected stream response format
*/
function generateResponseFormat() {
let slotDataString = '';
slotData.forEach(slot => {
// TODO: escape creative returns
const creative = slot.creative.replace(/\\/g, '\\\\')
.replace(/\n/g, '\\n').replace(/\r/g, '\\r');
slotDataString += `${JSON.stringify(slot.headers)}\n${creative}\n`;
});
return slotDataString;
}

function executeAndVerifyResponse() {
// Streamed response calls chunk handlers after returning so need to
// wait on chunks.
let chunkResolver;
const chunkPromise = new Promise(resolver => chunkResolver = resolver);
const chunkHandlerWrapper = (creative, metaData) => {
chunkHandlerStub(creative, metaData);
if (chunkHandlerStub.callCount == slotData.length) {
chunkResolver();
}
};
// If no slots then callback will never execute so we need to resolve
// immediately.
if (slotData.length == 0) {
chunkResolver();
}
return Promise.all([
fetchLineDelimitedChunks(xhr, url, chunkHandlerWrapper, xhrInit),
chunkPromise,
])
.then(results => {
// results[0] is the response
expect(results[0]).to.be.ok;
expect(chunkHandlerStub.callCount).to.equal(slotData.length);
// Could have duplicate responses so need to iterate and get counts.
// TODO: can't use objects as keys :(
const calls = {};
slotData.forEach(slot => {
const key = slot.creative + JSON.stringify(slot.headers);
calls[key] ? calls[key]++ : (calls[key] = 1);
});
slotData.forEach(slot => {
expect(chunkHandlerStub.withArgs(slot.creative, slot.headers).callCount)
.to.equal(calls[slot.creative + JSON.stringify(slot.headers)]);
});
});
}

beforeEach(() => {
fetchStub = sinon.stub();
chunkHandlerStub = sinon.stub();
});

describe('stream not supported', () => {
beforeEach(() => {
fetchStub.returns(Promise.resolve({
text: () => Promise.resolve(generateResponseFormat()),
}));
xhr = {
fetch: fetchStub,
win: {
// TextDecoder should exist but not be called
TextDecoder: () => { throw new Error('fail'); },
},
};
});

it('should fallback to text if no stream support', () => {
slotData = [
{headers: {foo: 'bar'}, creative: '<html>baz\n</html>'},
{headers: {hello: 'world'}, creative: '<html>do not\n chunk me</html>'},
];
return executeAndVerifyResponse();
});

it('should fallback to text if no stream support w/ empty response', () => {
slotData = [];
return executeAndVerifyResponse();
});

it('should fallback to text if no TextDecoder', () => {
slotData = [
{headers: {foo: 'bar', hello: 'world'}, creative: '<html>baz\n</html>'},
{headers: {hello: 'world'},
creative: '\r\n<html>\nchunk\b me</html>\r\b\n\r'},
{headers: {foo: 'bar'}, creative: ''},
{headers: {}, creative: '\r\n\r\b\n\r'},
];
return executeAndVerifyResponse();
});
});

describe('streaming', () => {
let readStub;

function setup() {
const responseString = generateResponseFormat();
const textEncoder = new TextEncoder('utf-8');
const CHUNK_SIZE = 5;
let chunk = 0;
do {
const value = textEncoder.encode(responseString.substr(
chunk * CHUNK_SIZE, CHUNK_SIZE), {'stream': true});
const done = chunk * CHUNK_SIZE >= responseString.length - 1;
readStub.onCall(chunk).returns(Promise.resolve({value, done}));
} while (chunk++ * CHUNK_SIZE < responseString.length);
}

beforeEach(() => {
readStub = sinon.stub();
fetchStub.returns(Promise.resolve({
text: () => Promise.resolve(),
body: {
getReader: () => {
return {
read: readStub,
};
},
},
}));
xhr = {
fetch: fetchStub,
win: {
TextDecoder,
},
};
});

it('should handle empty streamed response properly', () => {
slotData = [];
setup();
return executeAndVerifyResponse();
});

it('should handle no fill response properly', () => {
slotData = [{headers: {}, creative: ''}];
setup();
return executeAndVerifyResponse();
});

it('should handle multiple no fill responses properly', () => {
slotData = [
{headers: {}, creative: ''},
{headers: {}, creative: ''},
];
setup();
return executeAndVerifyResponse();
});

it('should stream properly', () => {
slotData = [
{headers: {}, creative: ''},
{headers: {foo: 'bar', hello: 'world'},
creative: '\t\n\r<html>\bbaz\r</html>\n\n'},
{headers: {}, creative: ''},
{headers: {hello: 'world'},
creative: '<html>\nchu\nnk me</h\rtml\n\t>'},
{headers: {}, creative: ''},
];
setup();
return executeAndVerifyResponse();
});
});
});
2 changes: 1 addition & 1 deletion ads/google/a4a/test/test-utils.js
Expand Up @@ -367,7 +367,7 @@ describe('Google A4A utils', () => {
const impl = new MockA4AImpl(elem);
noopMethods(impl, doc, sandbox);
return fixture.addElement(elem).then(() => {
return googleAdUrl(impl, '', 0, [], [], ['789', '098']).then(url1 => {
return googleAdUrl(impl, '', 0, {}, ['789', '098']).then(url1 => {
expect(url1).to.match(/eid=123%2C456%2C789%2C098/);
});
});
Expand Down
15 changes: 9 additions & 6 deletions ads/google/a4a/url-builder.js
Expand Up @@ -21,7 +21,8 @@ export let QueryParameterDef;
* Builds a URL from query parameters, truncating to a maximum length if
* necessary.
* @param {string} baseUrl scheme, domain, and path for the URL.
* @param {!Array<!QueryParameterDef>} queryParams query parameters for the URL.
* @param {!Object<string,!string|number|null>} queryParams query parameters for
* the URL.
* @param {number} maxLength length to truncate the URL to if necessary.
* @param {?QueryParameterDef=} opt_truncationQueryParam query parameter to
* append to the URL iff any query parameters were truncated.
Expand All @@ -41,13 +42,15 @@ export function buildUrl(
if (encodedTruncationParam) {
capacity -= encodedTruncationParam.length + 1;
}
for (let i = 0; i < queryParams.length; i++) {
const param = queryParams[i];
if (param.value == null || param.value === '') {
const keys = Object.keys(queryParams);
for (let i = 0; i < keys.length; i++) {
const key = keys[i];
const value = queryParams[key];
if (value == null || value === '') {
continue;
}
const encodedNameAndSep = encodeURIComponent(param.name) + '=';
const encodedValue = encodeURIComponent(String(param.value));
const encodedNameAndSep = encodeURIComponent(key) + '=';
const encodedValue = encodeURIComponent(String(value));
const fullLength = encodedNameAndSep.length + encodedValue.length + 1;
if (fullLength > capacity) {
const truncatedValue = encodedValue
Expand Down