Page size (sz) and Estimated Arrival (ea) #5

(fingers crossed I've managed to get this right - this is my first time playing with git)

Page size is a mod to BW that reports the size of the current HTML page using document.documentElement.innerHTML.length. The code here differs from what I had previously posted on the discussion board - I've now added a new method that subscribes to the before_beacon event (I had problems using page_ready within BW). Using Dokuwiki to generate some test pages, I saw a very consistent relationship between sz and the %b figure in the logs - more specifically,

%b = sz - 1016 +/- 70
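As a sketch, the before_beacon hook described above looks roughly like this (illustrative, not the actual patch; the BOOMR and document objects are minimal stand-ins so the snippet runs outside a browser - in a real page boomerang and the browser provide them):

```javascript
// Minimal stubs so the sketch is self-contained; real pages have the real thing.
var BOOMR = {
  vars: {},
  handlers: {},
  subscribe: function (event, fn) { this.handlers[event] = fn; },
  addVar: function (name, value) { this.vars[name] = value; },
  fire: function (event) { this.handlers[event](); }
};
var document = { documentElement: { innerHTML: "<head></head><body>hello</body>" } };

// The BW mod: report the HTML size just before the beacon is sent.
BOOMR.subscribe("before_beacon", function () {
  BOOMR.addVar("sz", document.documentElement.innerHTML.length);
});

BOOMR.fire("before_beacon");
console.log(BOOMR.vars.sz); // 31 (length of the stub markup)
```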

As per my comments on the discussion board, embedding the arrival time snapshot into the boomerang code is not ideal - but it makes for a cleaner implementation - and it's overridden if webtiming is available.
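The fallback described above can be sketched as follows (illustrative only; BOOMR_ea and arrivalTime are my names, not from the patch):

```javascript
// Snapshot a timestamp as early in the page as possible (e.g. at the top of
// the boomerang script), then prefer Navigation Timing ("webtiming") when
// the browser provides it.
var BOOMR_ea = (new Date()).getTime();

function arrivalTime() {
  // Navigation Timing wins when available...
  if (typeof window !== "undefined" && window.performance &&
      window.performance.timing && window.performance.timing.responseStart) {
    return window.performance.timing.responseStart;
  }
  // ...otherwise fall back to the embedded snapshot.
  return BOOMR_ea;
}

console.log(arrivalTime() === BOOMR_ea); // true outside a browser
```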


I'm not sure how useful the page size is. It doesn't consider the size of additional assets, whether content was gzipped over the network, or whether it was read from cache, and it also counts dynamically added content that wasn't necessarily downloaded but is still in the document.

The idea is interesting, but I'm not sure this is the right number to capture. What do you think?


Ideally I want to know how long it took the server to generate the content, independently of the additional files that go to make up the page. I don't think I can measure this directly on the client (see also my post on the first byte metric on the discussion board).

Knowing the document size, bandwidth, HTTP latency and the time between the request start and the current request completing (rather than the full page) should allow me to calculate the server time. So despite 2 changes across 2 plugins to get 2 new metrics, really I'm trying to get a single value out of the system.
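As a sketch of that calculation (variable names and figures are mine, purely illustrative):

```javascript
// total = latency + server generation + transfer; solve for server generation.
// All times in ms, bandwidth in bytes/ms.
function estimateServerTime(totalMs, latencyMs, sizeBytes, bandwidthBytesPerMs) {
  return totalMs - latencyMs - sizeBytes / bandwidthBytesPerMs;
}

// e.g. 800 ms total, 50 ms latency, 100000 bytes at 500 bytes/ms (~4 Mbit/s):
// transfer ~ 200 ms, so server time ~ 550 ms.
console.log(estimateServerTime(800, 50, 100000, 500)); // 550
```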

There are caveats on this:
1) I am seeing variations in the relationship between 'sz' and '%b' which seem to be related to the structure of the page (JavaScript, CSS). So far I've only tested using Dokuwiki and some plain HTML and PHP pages, however the relationship seems to be highly linear. So for a single site (leaving aside other considerations - see below) it should be possible to calculate the size of the document accurately.

2) Agreed, there are a lot of things that could cause 'sz' not to reflect what was sent over the network. I work predominantly with SaaS apps, where the HTML is exclusively dynamically generated and nearly all other content is static - so the HTML is never cached. Regarding compression: yes, this could be a problem, however I'd expect a fairly predictable relationship between raw and compressed content. But I admit I don't currently have stats to back this up.
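Point 1 suggests a per-site calibration: fit the linear relationship between logged sz and %b values once, then use it to recover the on-the-wire size. A minimal sketch, using synthetic data consistent with the %b = sz - 1016 observation above:

```javascript
// Ordinary least-squares fit of y ~ slope*x + intercept.
function fitLine(xs, ys) {
  var n = xs.length, sx = 0, sy = 0, sxx = 0, sxy = 0;
  for (var i = 0; i < n; i++) {
    sx += xs[i]; sy += ys[i];
    sxx += xs[i] * xs[i]; sxy += xs[i] * ys[i];
  }
  var slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
  return { slope: slope, intercept: (sy - slope * sx) / n };
}

// Synthetic (sz, %b) pairs matching the observed relationship exactly:
var sz = [5000, 12000, 20000];
var bytes = [3984, 10984, 18984];
var fit = fitLine(sz, bytes);
console.log(fit.slope, fit.intercept); // 1 -1016
```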

So yes, there are some assumptions here - ultimately the value (at least from my point of view) is whether the resulting metrics are reasonably accurate - or at least allow an accurate metric to be derived. And that needs to be determined experimentally. This is still a work in progress.

If you can suggest a better way of measuring server time on the client, I'm keen to hear more!


So while I think it might be a useful metric, I don't think it's generic enough in its current form to be part of the core plugins. Furthermore, to get a more accurate measure, I'd get my server side scripting language to do more work (we do this a lot at Yahoo!). For example, in PHP, I'd do this:
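The snippet itself isn't captured in this thread; a typical version of the technique - timing page generation on the server and exposing the result to the client - might look like this (entirely illustrative, not the author's actual code):

```php
<?php
// First line of the entry script: record when generation starts.
$t_start = microtime(true);

// ... generate the page ...

// Compute elapsed generation time in ms and hand it to the client,
// e.g. for boomerang to pick up as a beacon variable.
$t_gen_ms = round((microtime(true) - $t_start) * 1000);
echo "<script>var BOOMR_server_time = $t_gen_ms;</script>";
```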
