New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SRA support for Doubleclick Fast Fetch Traffic #9563
Changes from 17 commits
1144b42
733f31d
457a628
798585c
2ebc596
49d42ee
28f728f
616c2ab
6c8b1d6
ae90234
911de44
a136616
46d0865
4788276
e3a5115
9b9cbe6
a32cf99
154b5a5
b75d821
c8b813f
d3de0bb
0608e05
e4d8874
ff3a2fe
1bb1fb8
28278a8
1b2a657
15c8c09
1b34e91
77c1106
2d1a66e
cb3fa07
4c6ed8b
fa74e12
9b45a1d
e651878
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
/** | ||
* Copyright 2017 The AMP HTML Authors. All Rights Reserved. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS-IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
import {dev} from '../../../src/log'; | ||
import {isObject} from '../../../src/types'; | ||
|
||
/** | ||
* Creates an XHR streaming request with expected response format of repeated | ||
* pairs of JSON object metadata then string HTML delimited by line returns. | ||
* | ||
* @param {!../../../src/service/xhr-impl.Xhr} xhr | ||
* @param {string} url | ||
* @param {!function(string, !Object<string, *>, boolean)} slotCallback called | ||
* with creative, metadata, and boolean indicating if completed. Failure to | ||
* JSON parse metadata will cause empty object to be given. | ||
* @param {?../../../src/service/xhr-impl.FetchInitDef=} opt_init | ||
* @return {!Promise<!../../../src/service/xhr-impl.FetchResponse>} response, | ||
* note that accessing body will cause error as its consumed in streaming. | ||
*/ | ||
export function fetchLineDelimitedChunks(xhr, url, slotCallback, opt_init) { | ||
return xhr.fetch(url, opt_init) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, so this function has 3 concerns:
These should all be separate.
Instead, I'm looking for something along these lines: // caller, I guess DoubleClick
const fetch = xhr.fetch(url, input);
fetch.then(lineDelimitedStreamer(grouper(doubleClickHandler)));
// This should probably be an instance method
// This groups two lines together (metadata and creative lines)
function grouper(callback) {
let first;
return function(chunk, done) {
if (first) {
callback(first, chunk);
first = null;
} else {
first = chunk;
}
};
}
// This is a generic line-delimited response streamer
function lineDelimitedStreamer(callback) {
let snippet = '';
function streamer(text) {
// regex stuff here
}
return function(response) {
if (!isStreamable(response)) {
response.text().then(streamer);
return;
}
const decoder = new TextDecoder('utf-8');
const reader = response.body.getReader();
reader.read().then(function chunk(result) {
// ...
streamer(chunk);
});
};
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
.then(response => { | ||
let metadata; | ||
let snippet = ''; | ||
const chunkCallback = (chunk, done) => { | ||
const regex = /([^\n]*)(\n)?/g; | ||
let match; | ||
while ((match = regex.exec(chunk))) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you add some specification on what chunk can be? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think its fairly straight forward that the chunk is a portion of streamed content that may or may not contain a line delimiter |
||
snippet += match[1]; | ||
if (match[2]) { | ||
if (metadata) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm assuming that it is intended that we only get the metadata on the 'even' matches, and only call the slotCallback on the 'odd' matches? If so, could this be rewritten in such a way to make that clearer? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, after refactor suggested by @jridgewell I believe this is more clear given it now utilizes a function named metaJsonCreativeGrouper |
||
slotCallback(unescapeHtml_(snippet), metadata, | ||
done && regex.lastIndex === chunk.length); | ||
metadata = undefined; | ||
} else { | ||
metadata = safeJsonParse_(snippet); | ||
} | ||
snippet = ''; | ||
} | ||
if (regex.lastIndex === chunk.length) { | ||
break; | ||
} | ||
} | ||
}; | ||
if (!response.body || !xhr.win.TextDecoder) { | ||
// TODO(keithwrightbos) - TextDecoder polyfill? | ||
response.text().then(content => chunkCallback(content, true)); | ||
} else { | ||
handleFetchResponseStream_( | ||
response.body.getReader(), chunkCallback, new TextDecoder('utf-8')); | ||
} | ||
return {status: response.status, headers: response.headers}; | ||
}); | ||
} | ||
|
||
/** | ||
* @param {!ReadableStreamReader} reader The reader of the fetch's | ||
* response stream. | ||
* @param {!function(string, boolean)} chunkHandler | ||
* @param {!TextDecoder} textDecoder | ||
* @private | ||
*/ | ||
function handleFetchResponseStream_(reader, chunkHandler, textDecoder) { | ||
reader.read().then(function chunk(result) { | ||
if (result.value) { | ||
chunkHandler( | ||
textDecoder.decode( | ||
/** @type {!ArrayBuffer} */(result.value), {'stream': true}), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Generally, you don't want to stream when the result is done. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fair point though it can't hurt and makes the code simpler |
||
result.done); | ||
} | ||
if (!result.done) { | ||
// More chunks to read. | ||
reader.read().then(chunk); | ||
} | ||
}); | ||
} | ||
|
||
/** | ||
* Unescapes characters that are escaped in line-delimited JSON-HTML. | ||
* @param {string} html An html snippet. | ||
* @return {string} | ||
* @private | ||
*/ | ||
function unescapeHtml_(html) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe name can be unescapeLineDelimitedHtml_ instead, would make it clearer for people reading that they aren't seeing an AMP specific version of UriDecode or something. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
return html ? html.replace( | ||
/\\(n|r|\\)/g, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What about There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This already handles that case (can see test coverage) |
||
(_, match) => match == 'n' ? '\n' : match == 'r' ? '\r' : '\\') : html; | ||
} | ||
|
||
/** | ||
* @param {string} rawString to be parsed | ||
* @return {!Object<string, *>} JSON parsed string or empty object if invalid. | ||
* @private | ||
*/ | ||
function safeJsonParse_(rawString) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why can't you use tryParseJson instead of making a new function that does mostly the same thing? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wasn't aware of tryParseJson... thanks |
||
try { | ||
const result = JSON.parse(rawString); | ||
dev().assert(isObject(result)); | ||
return /** @type {!Object<string,*>} */(result); | ||
} catch (err) { | ||
return {}; | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,201 @@ | ||
/** | ||
* Copyright 2017 The AMP HTML Authors. All Rights Reserved. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS-IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
import {fetchLineDelimitedChunks} from '../line-delimited-response-handler'; | ||
import * as sinon from 'sinon'; | ||
|
||
describe('#line-delimited-response-handler', () => { | ||
|
||
const url = 'https://someadnetwork.com?who=me&yes=you'; | ||
const xhrInit = { | ||
mode: 'cors', | ||
method: 'GET', | ||
credentials: 'include', | ||
}; | ||
let fetchStub; | ||
let chunkHandlerStub; | ||
let slotData; | ||
let xhr; | ||
|
||
/** | ||
* @param {!Array<!{{creative:string,headers:!Object<string,string>}}>} slotData | ||
* @return {string} slot data written in expected stream response format | ||
*/ | ||
function generateResponseFormat() { | ||
let slotDataString = ''; | ||
slotData.forEach(slot => { | ||
// TODO: escape creative returns | ||
const creative = slot.creative.replace(/\\/g, '\\\\') | ||
.replace(/\n/g, '\\n').replace(/\r/g, '\\r'); | ||
slotDataString += `${JSON.stringify(slot.headers)}\n${creative}\n`; | ||
}); | ||
return slotDataString; | ||
} | ||
|
||
function executeAndVerifyResponse() { | ||
// Streamed response calls chunk handlers after returning so need to | ||
// wait on chunks. | ||
let chunkResolver; | ||
const chunkPromise = new Promise(resolver => chunkResolver = resolver); | ||
const chunkHandlerWrapper = (creative, metaData) => { | ||
chunkHandlerStub(creative, metaData); | ||
if (chunkHandlerStub.callCount == slotData.length) { | ||
chunkResolver(); | ||
} | ||
}; | ||
// If no slots then callback will never execute so we need to resolve | ||
// immediately. | ||
if (slotData.length == 0) { | ||
chunkResolver(); | ||
} | ||
return Promise.all([ | ||
fetchLineDelimitedChunks(xhr, url, chunkHandlerWrapper, xhrInit), | ||
chunkPromise, | ||
]) | ||
.then(results => { | ||
// results[0] is the response | ||
expect(results[0]).to.be.ok; | ||
expect(chunkHandlerStub.callCount).to.equal(slotData.length); | ||
// Could have duplicate responses so need to iterate and get counts. | ||
// TODO: can't use objects as keys :( | ||
const calls = {}; | ||
slotData.forEach(slot => { | ||
const key = slot.creative + JSON.stringify(slot.headers); | ||
calls[key] ? calls[key]++ : (calls[key] = 1); | ||
}); | ||
slotData.forEach(slot => { | ||
expect(chunkHandlerStub.withArgs(slot.creative, slot.headers).callCount) | ||
.to.equal(calls[slot.creative + JSON.stringify(slot.headers)]); | ||
}); | ||
}); | ||
} | ||
|
||
beforeEach(() => { | ||
fetchStub = sinon.stub(); | ||
chunkHandlerStub = sinon.stub(); | ||
}); | ||
|
||
describe('stream not supported', () => { | ||
beforeEach(() => { | ||
fetchStub.returns(Promise.resolve({ | ||
text: () => Promise.resolve(generateResponseFormat()), | ||
})); | ||
xhr = { | ||
fetch: fetchStub, | ||
win: { | ||
// TextDecoder should exist but not be called | ||
TextDecoder: () => { throw new Error('fail'); }, | ||
}, | ||
}; | ||
}); | ||
|
||
it('should fallback to text if no stream support', () => { | ||
slotData = [ | ||
{headers: {foo: 'bar'}, creative: '<html>baz\n</html>'}, | ||
{headers: {hello: 'world'}, creative: '<html>do not\n chunk me</html>'}, | ||
]; | ||
return executeAndVerifyResponse(); | ||
}); | ||
|
||
it('should fallback to text if no stream support w/ empty response', () => { | ||
slotData = []; | ||
return executeAndVerifyResponse(); | ||
}); | ||
|
||
it('should fallback to text if no TextDecoder', () => { | ||
slotData = [ | ||
{headers: {foo: 'bar', hello: 'world'}, creative: '<html>baz\n</html>'}, | ||
{headers: {hello: 'world'}, | ||
creative: '\r\n<html>\nchunk\b me</html>\r\b\n\r'}, | ||
{headers: {foo: 'bar'}, creative: ''}, | ||
{headers: {}, creative: '\r\n\r\b\n\r'}, | ||
]; | ||
return executeAndVerifyResponse(); | ||
}); | ||
}); | ||
|
||
describe('streaming', () => { | ||
let readStub; | ||
|
||
function setup() { | ||
const responseString = generateResponseFormat(); | ||
const textEncoder = new TextEncoder('utf-8'); | ||
const CHUNK_SIZE = 5; | ||
let chunk = 0; | ||
do { | ||
const value = textEncoder.encode(responseString.substr( | ||
chunk * CHUNK_SIZE, CHUNK_SIZE), {'stream': true}); | ||
const done = chunk * CHUNK_SIZE >= responseString.length - 1; | ||
readStub.onCall(chunk).returns(Promise.resolve({value, done})); | ||
} while (chunk++ * CHUNK_SIZE < responseString.length); | ||
} | ||
|
||
beforeEach(() => { | ||
readStub = sinon.stub(); | ||
fetchStub.returns(Promise.resolve({ | ||
text: () => Promise.resolve(), | ||
body: { | ||
getReader: () => { | ||
return { | ||
read: readStub, | ||
}; | ||
}, | ||
}, | ||
})); | ||
xhr = { | ||
fetch: fetchStub, | ||
win: { | ||
TextDecoder, | ||
}, | ||
}; | ||
}); | ||
|
||
it('should handle empty streamed response properly', () => { | ||
slotData = []; | ||
setup(); | ||
return executeAndVerifyResponse(); | ||
}); | ||
|
||
it('should handle no fill response properly', () => { | ||
slotData = [{headers: {}, creative: ''}]; | ||
setup(); | ||
return executeAndVerifyResponse(); | ||
}); | ||
|
||
it('should handle multiple no fill responses properly', () => { | ||
slotData = [ | ||
{headers: {}, creative: ''}, | ||
{headers: {}, creative: ''}, | ||
]; | ||
setup(); | ||
return executeAndVerifyResponse(); | ||
}); | ||
|
||
it('should stream properly', () => { | ||
slotData = [ | ||
{headers: {}, creative: ''}, | ||
{headers: {foo: 'bar', hello: 'world'}, | ||
creative: '\t\n\r<html>\bbaz\r</html>\n\n'}, | ||
{headers: {}, creative: ''}, | ||
{headers: {hello: 'world'}, | ||
creative: '<html>\nchu\nnk me</h\rtml\n\t>'}, | ||
{headers: {}, creative: ''}, | ||
]; | ||
setup(); | ||
return executeAndVerifyResponse(); | ||
}); | ||
}); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot of code to support streaming line delimited responses. If these single lines are very long and there are very few lines, you're likely not seeing any benefit from streaming the lines separately.
Streaming bodies makes sense when the HTML can be parsed incrementally. As it is now, we're not doing that, we're streaming lines of text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True and I may look into doing sub-line chunking in the future but currently both of our rendering methods (friendly frame srcdoc and safeframe passing the entire creative as the name attribute) require the full creative prior to rendering start. I plan to look into possible service worker support which would allow streaming (will email to discuss outside this PR).
However, note that in testing the chunk size is 32k (matches gzip window) and its often that can account for 1-2 creative responses. Doubleclick SRA requests commonly have many blocks (happy to share data offline) so I do believe this solution to have meaning impact as it will allow blocks to render prior to receiving the entire SRA response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a sample. Both streaming and all-text paths seem to complete in about ~10ms (using 7 responses, some taking 16k, others taking 8k). I still think this is way over-engineering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The improvement comes from lower bandwidth usage cases. For instance, if you patch this PR and go to http://localhost:8000/proxy/amp.timeinc.net/time/4458554/best-video-games-all-time/?amp_js_v=0.1&force_sra=true#exp=a4a%3A-1 and emulate 3g, you'll find that the delay between the first and last chunk is ~150 ms which is certainly meaningful in the ad space. The hope is to make SRA the default behavior as it allows for better compression across requests, reduced server usage, and possible downstream optimizations. However, it means that the delay imposed on the first slot's time to render must be as minimal as possible. Chunking helps ensure this and I hope in the future to combine SRA with service workers allowing the results to be streamed into the frame as you suggest.