## A/B testing performance metrics
During this test we experimented with several variables, randomizing each one independently of the others:

- Server-side rendering (SSR)
- Serving assets directly from a worker versus from S3
- Streaming the head versus preparing the whole response
- Pushing the assets using server push
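
To illustrate what independent randomization means here, the following is a minimal simulation sketch in R. It is not the assignment code used in the test (that ran in the worker); the 50/50 split and the function name `assign_variants` are assumptions made for the example.

```r
# Simulate independent assignment of each variable.
# The 50/50 split is an assumption; the original split is not stated.
set.seed(1)  # fixed seed so the sketch is reproducible

assign_variants <- function(n) {
  data.frame(
    SSR               = rbinom(n, size = 1, prob = 0.5),
    serverPush        = rbinom(n, size = 1, prob = 0.5),
    streamingResponse = rbinom(n, size = 1, prob = 0.5),
    workerAssets      = rbinom(n, size = 1, prob = 0.5)
  )
}

variants <- assign_variants(1500)

# With independent draws, pairwise correlations should be close to zero
# and each of the 16 combinations should get a similar share of visitors.
round(cor(variants), 2)
table(interaction(variants, sep = "/"))
```

Because each flag is drawn separately, any one variable can be compared across all settings of the others, and combinations can still be examined individually.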

### Conclusions
This should be considered early exploration.

We'd love to see more A/B tests run in the wild with different setups. We tested a number of variables, but many more could affect a page's performance, and there are even more possible interactions between them. It's interesting to look across the quantiles of these results. We believe that A/B testing for performance offers a lot of promise, and we'd like to run these tests for longer to look for correlations between performance differences and user behavior.

There is a ton of work to be done here, and much of it is probably very dependent on the specific site being optimized.

If you're interested in this work, feel free to reach out to us at tti@digitaloptgroup.com.

### About the test
Pages were served out of Cloudflare Worker scripts. Data was collected using the browser's Navigation Timing and Resource Timing APIs. TTI was collected with this polyfill: https://github.com/GoogleChromeLabs/tti-polyfill

Traffic came from Google Ads and represented a sample of visitors from around the world. The test included around 1,500 visitors.

In [85]:
data <- read.csv("bq.csv")
colnames(data)

head(data)
nrow(data)

VID,RID,epoch,deviceType,osName,isWorker,SSR,serverPush,streamingResponse,workerAssets,fromGoogle,dnsLookupTime,timeToFirstByte,cssBundleLoadTime,mainJsBundleLoadTime,firstPaint,firstContentfulPaint,tti
5211c15c-a6e6-43f8-99ff-c9416b5ebeda,5211c15c-a6e6-43f8-99ff-c9416b5ebeda,1552201000000.0,mobile,Android,1,1,1,1,0,1,0.0,,,,,,
ef4d62b5-e2f6-408b-9f35-f89f3b7b435b,ef4d62b5-e2f6-408b-9f35-f89f3b7b435b,1552202000000.0,mobile,Android,1,1,1,0,0,1,8.0,499.0,,,,,
96581843-3cc1-43db-b34b-612b0a90cbd1,96581843-3cc1-43db-b34b-612b0a90cbd1,1552343000000.0,mobile,iOS,1,0,1,0,1,1,,,,,,,
4b6978dc-c9b7-4e01-8f76-8e37017355b9,4b6978dc-c9b7-4e01-8f76-8e37017355b9,1552231000000.0,mobile,Android,1,1,1,0,1,1,34.0,517.0,,,,,
96eddb16-997f-44b0-a0f7-33e03a0303dc,96eddb16-997f-44b0-a0f7-33e03a0303dc,1552291000000.0,mobile,Android,1,0,0,0,0,1,102.0,513.0,114.0,217.0,800.0,1081.0,
35ce14f6-5533-4a30-85cd-aaea2303aff1,35ce14f6-5533-4a30-85cd-aaea2303aff1,1552346000000.0,mobile,Android,1,1,0,1,1,1,,,,,,,
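
Many of the metric columns in the rows above are empty, which is why the summaries below report large `NA's` counts. As a rough sketch, the share of missing values per metric can be checked like this. The small data frame reuses four metric columns from the six rows printed above; on the full dataset the same call would be applied to `data`.

```r
# Four metric columns copied from the six rows shown above (NA = empty cell).
sample_rows <- data.frame(
  dnsLookupTime        = c(0, 8, NA, 34, 102, NA),
  timeToFirstByte      = c(NA, 499, NA, 517, 513, NA),
  firstContentfulPaint = c(NA, NA, NA, NA, 1081, NA),
  tti                  = c(NA, NA, NA, NA, NA, NA)
)

# Fraction of rows with no value recorded, per metric.
missing_share <- colMeans(is.na(sample_rows))
round(missing_share, 2)
```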


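## Biggest observed differences in TTI
The biggest differences were observed when looking at combinations of variables. For example, median TTI was 1213.5 ms with no SSR, server push, a streaming response, and worker assets, versus 2170 ms with SSR, no server push, no streaming, and S3 assets.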

In [96]:
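print("No SSR - Server Push - Streaming Response - Worker Assets")
summary(data$tti[data$SSR==0 & data$isWorker==1 & data$serverPush ==1 & data$streamingResponse==1 & data$workerAssets ==1])

print("No SSR - Without Server Push - Streaming Response - Worker Assets")
summary(data$tti[data$SSR==0 & data$isWorker==1 & data$serverPush ==0 & data$streamingResponse==1 & data$workerAssets ==1])

print("SSR - No Server Push - No Streaming - S3 Assets")
summary(data$tti[data$SSR==1 & data$serverPush ==0 & data$streamingResponse==0 & data$workerAssets ==0])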

[1] "No SSR - Server Push - Streaming Response - Worker Assets"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  510.0   867.2  1213.5  2286.2  2537.2  9558.0      86 

[1] "No SSR - Without Server Push - Streaming Response - Worker Assets"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    642    1284    1536    2853    2788   12036      89 

[1] "SSR - No Server Push - No Streaming - S3 Assets"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    728    1061    2170    5893    5807   41534      96 
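
Filtering each combination by hand gets tedious with four variables. One possible alternative, sketched here on a toy data frame with made-up TTI values (only the column names follow the dataset), is to let `aggregate` compute the median for every combination that occurs.

```r
# Toy data with made-up values; only the column names match the dataset.
toy <- data.frame(
  SSR               = c(0, 0, 1, 1, 0, 1),
  serverPush        = c(1, 1, 0, 0, 1, 0),
  streamingResponse = c(1, 1, 0, 0, 1, 0),
  workerAssets      = c(1, 1, 0, 0, 1, 0),
  tti               = c(900, 1200, 2100, NA, 1500, 2300)
)

# Median TTI per observed combination; rows with a missing TTI are
# dropped by the formula interface's default na.action.
medians <- aggregate(
  tti ~ SSR + serverPush + streamingResponse + workerAssets,
  data = toy,
  FUN  = median
)
medians
```

Sorting the result by `tti` would put the fastest and slowest combinations at either end, which is one way to find pairs worth comparing in detail.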

## Streaming response from worker
In this test we had two versions. In one version the head was immediately streamed from the worker, with the body then being server side rendered and streamed when ready. In the second version the entire page was sent together as a single response.

Below we look at time to first byte (TTFB) and time to interactive (TTI). It's interesting to note the third quartile of TTI with and without streaming (3424 ms versus 4777 ms). This only looks at streaming combined with SSR.

In [87]:
print("Time to first byte with Streaming")
summary(data$timeToFirstByte[data$streamingResponse==1 & data$SSR ==1])

print("Time to first byte without Streaming")
summary(data$timeToFirstByte[data$streamingResponse==0 & data$SSR ==1])

print("TTI with streaming")
summary(data$tti[data$streamingResponse==1 & data$SSR ==1])

print("TTI without streaming")
summary(data$tti[data$streamingResponse==0 & data$SSR ==1])

[1] "Time to first byte with Streaming"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  -31.0   469.5   645.0   878.1   939.5 13337.0     182 

[1] "Time to first byte without Streaming"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  -22.0   464.0   693.0   994.3  1034.5 26576.0     185 

[1] "TTI with streaming"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    605    1116    1815    4271    3424   37483     334 

[1] "TTI without streaming"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    491    1132    1878    4814    4777   57268     339 
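
Since the interesting differences show up away from the median, it can help to put several quantiles for both groups side by side. The following is a small sketch on made-up TTI values (the numbers are illustrative, not taken from the dataset); the chosen probabilities are also just an example.

```r
# Made-up TTI values in milliseconds, grouped by the streaming flag.
toy <- data.frame(
  streamingResponse = c(1, 1, 1, 1, 0, 0, 0, 0),
  tti               = c(600, 1100, 1800, NA, 500, 1100, 1900, 4800)
)

probs <- c(0.50, 0.75, 0.95)

# One column per group ("0" = no streaming, "1" = streaming),
# one row per quantile; missing values are ignored.
by_group <- sapply(
  split(toy$tti, toy$streamingResponse),
  quantile, probs = probs, na.rm = TRUE
)
by_group
```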

## Server side rendering
This was a Preact application with two versions tested. The first version rendered the full page on the server and the second version used client-side rendering.

In [88]:
print("First Contentful Paint with SSR")
summary(data$firstContentfulPaint[data$SSR==1 & data$isWorker==1 & data$serverPush ==1 & data$workerAssets ==1])

print("First Contentful Paint without SSR")
summary(data$firstContentfulPaint[data$SSR==0 & data$isWorker==1 & data$serverPush ==1 & data$workerAssets ==1])

print("TTI with SSR")
summary(data$tti[data$SSR==1 & data$isWorker==1 & data$serverPush ==1 & data$workerAssets ==1])

print("TTI without SSR")
summary(data$tti[data$SSR==0 & data$isWorker==1 & data$serverPush ==1 & data$workerAssets ==1])

print("TTI with SSR - without server push - with streaming response")
summary(data$tti[data$SSR==1 & data$isWorker==1 & data$serverPush ==0 & data$streamingResponse==1 & data$workerAssets ==1])

print("TTI without SSR - without server push - with streaming response")
summary(data$tti[data$SSR==0 & data$isWorker==1 & data$serverPush ==0 & data$streamingResponse==1 & data$workerAssets ==1])

[1] "First Contentful Paint with SSR"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  439.0   782.2  1083.5  1803.4  1471.2 32334.0     140 

[1] "First Contentful Paint without SSR"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  486.0   881.5  1188.0  1564.6  1986.0  5762.0     164 

[1] "TTI with SSR"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    605    1032    1628    4095    3688   24511     146 

[1] "TTI without SSR"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    476     871    1394    3711    3878   58027     175 

[1] "TTI with SSR - without server push - with streaming response"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    710    1103    1720    3432    2774   20856      89 

[1] "TTI without SSR - without server push - with streaming response"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    642    1284    1536    2853    2788   12036      89 

## Embedding main.js and main.css into the worker vs S3 origin
In version one we experimented with embedding our main JS/CSS bundles into the worker and serving them directly from there. In the second version we served the assets from S3.

It's interesting to note the difference at the higher percentiles.

In [89]:
print("Assets in worker")
fromWorker <- data$mainJsBundleLoadTime[data$workerAssets==1 & data$isWorker==1]
length(fromWorker)
quantile(fromWorker, c(.25, .50,  .75, .85, .95, .98, .99), na.rm=TRUE) 

print("Assets on s3")
froms3 <- data$mainJsBundleLoadTime[data$workerAssets==0 & data$isWorker==1]
length(froms3)
quantile(froms3, c(.25, .50, .75, .85, .95, .98, .99), na.rm=TRUE) 

[1] "Assets in worker"


[1] "Assets on s3"


## Server Push
Server push is very interesting. Across all variations tested, it shows a slight improvement in TTI (median 1742 ms with server push versus 1802 ms without). If we break it down by other combinations there are some interesting results, as seen below.

In [90]:
print("TTI with server push")
result <- (data$tti[data$isWorker==1 & data$serverPush == 1])
length(result)
summary(result)

print("TTI - without server push")
result <-(data$tti[data$isWorker==1  & data$serverPush ==0])
length(result)
summary(result)

[1] "TTI with server push"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    476    1083    1742    4355    4283  100809     606 

[1] "TTI - without server push"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    467    1157    1802    4378    4317   57268     597 
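
A difference this small could easily be noise. One way to get a rough sense of that, sketched here under stated assumptions, is a percentile bootstrap of the difference in medians. The two samples are simulated from a log-normal distribution and are not the real data; the helper `boot_median_diff` and the number of resamples are illustrative choices. On the real data, missing values would need to be dropped first (for example with `na.omit`).

```r
set.seed(7)  # reproducible resampling

# Simulated, right-skewed TTI samples in milliseconds (not the real dataset).
with_push    <- rlnorm(300, meanlog = log(1750), sdlog = 0.8)
without_push <- rlnorm(300, meanlog = log(1800), sdlog = 0.8)

# Resample each group with replacement and record the difference in medians.
boot_median_diff <- function(a, b, reps = 2000) {
  replicate(
    reps,
    median(sample(a, replace = TRUE)) - median(sample(b, replace = TRUE))
  )
}

diffs <- boot_median_diff(with_push, without_push)

# 95% percentile interval for (median with push - median without push).
ci <- quantile(diffs, c(0.025, 0.975))
ci
```

If the interval comfortably spans zero, the observed gap between the two medians is not distinguishable from sampling variation at this sample size.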

## Server push interactions with streaming response
In this analysis we look at the interaction of streaming responses and server push, given that the assets are served from the worker. In all cases we also include server-side rendering.

It's interesting to look at the third quartiles, where (at least in this dataset) streaming response without server push shows the lowest time (2774 ms). This could make sense, as the browser can get started requesting assets as soon as the streamed head arrives.

In [91]:
print("TTI - with server push - with streaming response")
result <- (data$tti[data$serverPush == 1 & data$SSR==1 & data$streamingResponse==1 & data$workerAssets ==1])
length(result)
summary(result)

print("TTI - without server push - with streaming response")
result <- (data$tti[data$serverPush == 0 & data$SSR==1 & data$streamingResponse==1 & data$workerAssets ==1])
length(result)
summary(result)

print("TTI - with server push - without streaming response")
result <- (data$tti[data$serverPush == 1 & data$SSR==1 & data$streamingResponse==0 & data$workerAssets ==1])
length(result)
summary(result)

print("TTI - without server push - without streaming response")
result <- (data$tti[data$serverPush == 0 & data$SSR==1 & data$streamingResponse==0 & data$workerAssets ==1])
length(result)
summary(result)

[1] "TTI - with server push - with streaming response"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  605.0   975.5  1650.0  4675.8  3968.2 22488.0      78 

[1] "TTI - without server push - with streaming response"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    710    1103    1720    3432    2774   20856      94 

[1] "TTI - with server push - without streaming response"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    689    1193    1560    3685    3620   24511      88 

[1] "TTI - without server push - without streaming response"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    837    1070    1859    5331    4429   57268      74 
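
The four summaries above can also be arranged as a two-by-two table, which makes the interaction easier to read at a glance. This is a sketch on made-up TTI values (illustrative only); the column names follow the dataset.

```r
# Made-up TTI values in milliseconds for the four push/streaming cells.
toy <- data.frame(
  serverPush        = c(1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0),
  streamingResponse = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),
  tti               = c(1000, 1600, 4000, 1100, 1700, 2800,
                        1200, 1500, 3600, 1000, 1900, NA)
)

# Third quartile of TTI for each cell: rows are serverPush (0/1),
# columns are streamingResponse (0/1). Missing values are ignored.
q75 <- tapply(
  toy$tti,
  list(serverPush = toy$serverPush, streamingResponse = toy$streamingResponse),
  quantile, probs = 0.75, na.rm = TRUE
)
q75
```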

## Server push interactions with S3 vs Worker Assets 


In [92]:
print("TTI with server push - Assets on S3")
result <- (data$tti[data$isWorker==1 & data$serverPush == 1 & data$workerAssets ==0 & data$SSR==1])
length(result)
summary(result)

print("TTI - without server push - Assets on S3")
result <-(data$tti[data$isWorker==1  & data$serverPush ==0 & data$workerAssets ==0 & data$SSR==1])
length(result)
summary(result)

print("TTI with server push - Assets Embedded in Worker")
result <-(data$tti[data$isWorker==1 & data$serverPush == 1 & data$workerAssets ==1 & data$SSR==1])
length(result)
summary(result)

print("TTI - without server push - Assets Embedded in Worker")
result <-(data$tti[data$isWorker==1  & data$serverPush ==0 & data$workerAssets ==1 & data$SSR==1])
length(result)
summary(result)

[1] "TTI with server push - Assets on S3"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    491    1176    2112    4471    4720   25443     131 

[1] "TTI - without server push - Assets on S3"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    728    1168    2071    5103    4914   41534     127 

[1] "TTI with server push - Assets Embedded in Worker"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    605    1032    1628    4095    3688   24511     146 

[1] "TTI - without server push - Assets Embedded in Worker"


   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    710    1078    1790    4565    3950   57268     148 