Notes taken at Velocity CA 2014
Velocity Conf CA 2014

Tuesday

Patrick Meenan / WebPageTest power users

  • http://www.slideshare.net/patrickmeenan
  • http://www.youtube.com/user/patrickmeenan
  • http://twitter.com/patmeenan
  • lots of undocumented stuff about to be described, check code on github to see even more undocumented stuff
  • always run an odd number of runs and pick the median (9 is the limit) - the 1st run warms up dns caches and the web server, so it can have a long ttfb
  • medianMetric param in query string lets you pick the median metric that is used
    • eg ?medianMetric=speedIndex
    • WORKS ON ANY PAGE!!!!! - ALSO WORKS IN API!!!!!!
    • medianMetric defaults to onload ('loadTime')
    • valid metric names for medianMetric are the identifiers used in the API
    • any metric name from the API can be used (even those not visible on the site/page) - see the sketch at the end of this section
  • visual comparisons - best if you test specific configurations
    • log in to the webpagetest website before using - it will track your test history via login, cross browser/device
    • label your tests, makes it much easier to find later and when doing visual comparisons
    • visual comparison lets you compare old tests
    • test history tab shows all your old tests, check the ones you're interested in, then click compare
    • visual comparison needs video capture on (obviously)
    • visual comparison then shows frame-by-frame comparison of two runs
    • best if you compare things like ads vs no ads, usa vs uk, i.e. isolate one specific change and see the difference
    • test id drives everything in visual comparison url
    • after test id in url, you can set other flags
    • -c:0 / -c:1 in url chooses first view or repeat view
    • -r:n in url sets which run number
    • -l:foo in url sets label
    • -e:doc/full/visual/all in url sets the test end time
    • test labels in visual comparison are a link back to test details (useful if you forget what the label means)
    • visual comparison can show interesting details - eg content disappears for a number of seconds when custom font is applied
    • you can then run side by side video of visual comparison to see the frames in real time
    • really useful to get a sense of the actual user comparison
    • gold border around frames in visual comparison shows when frames differ from previous frame, handy if you quickly want to scroll to the interesting frames
    • if it's not clear, open images up in different tabs and flick between the two to show up the difference
    • in waterfall charts, the width of an event bar shows the difference between event start and event end, ie how long the event takes to run
    • in visual comparison, there is a waterfall chart underneath the horizontal scrollbar - the scrollbar is synchronised to the waterfall so that when it reaches, say, the load event, those frames are shown in the comparison view
    • mobile tests are on actual devices, screen captures are taken from 8 mbit video feed from device
    • on desktop browsers, screenshot is taken every 100 milliseconds
    • pixel diffs used to detect changes
    • visual comparison / film strip view uses steve souders' waterfall ui convention for resource type colours
    • main waterfall view will change to do the same soon
  • query param: ?pngss=1 enables full resolution png screenshots, lets you identify exactly which pixel changes because there's no loss
  • query param: ?iq=n sets JPEG image quality, 95 is upper limit (affects both video frames and screenshots), useful for presentations
    • both can be used in the main page (there's a hidden form field), don't overuse them as it taxes the server
    • images are unlicensed, free to do what you want with
    • for private agents, default image qualities can be set in settings.ini
  • advanced settings > chrome
    • checkboxes to include devtools timeline and call stacks (adds a small amount of overhead on mobile, set it on for specific cases rather than all the time)
    • when enabled, the processing breakdown link under the waterfall links on the results page lets you see how the main thread is spending its time on different activities
    • processing breakdown link is also at top of results page
    • when looking at the chrome timeline, expand the js bars - they show a breakdown of exactly what each script triggered, eg whether it caused the browser to paint something, not just the cost of executing the js itself
    • chrome timeline events also show up at bottom of waterfall chart (main thread and background thread)
    • chrome timeline events also show up at bottom of film strip view
  • advantage of using actual mobile devices is that the cpu impact of different actions (js, images, whatever) varies hugely compared to desktop and between different devices
    • emulators not a good representation of what actually happens
    • amazon, google search are excellent real-life examples of this - some mobile devices really struggle with them
    • resource fetching is not the only thing to worry about on mobile, execution is also very significant
    • film strip / visual comparison very instructive when testing different devices with same page, network conditions, location etc
    • different test locations have different mobile devices available
    • every mobile test in webpagetest goes through a proxy connection (via spdy, encrypted) that adds 1 second plus, bear this in mind
  • advanced settings > advanced
    • save response bodies
      • save the actual resource that was served to that browser (fixed to that request), useful for debugging
      • doesn't work on mobile devices yet, requires emulation (should be fixed soon)
      • clicking on any row in waterfall shows request and response details, if response bodies are saved link to them will appear there (best loaded in a separate tab then copied and pasted elsewhere)
      • for images, the object tab on the resource detail box will not always show the same image when response bodies are saved, so beware (for storage reasons it links to the resource)
    • capture network packet trace (tcpdump)
      • linked under waterfall link on left hand side of results page
      • can be opened in wireshark or cloudshark
      • useful for verifying/debugging actual server/network problems (if browser definitely not at fault)
      • cloudshark much better for sharing than saying eg "packet 35 in this dump" or whatever
      • in cloudshark, filter by http, find your resource request, go to analysis tool, follow stream / show only the stream, lets you see everything that happens on the network for that resource
      • dns retransmits at 2 seconds, so if dns takes just over 2 seconds there was packet loss and it was retransmitted, if that happens regularly you have a dns issue
      • socket connect times that are a multiple of 3 seconds mean the syn-ack was dropped, ie packet loss
  • advanced settings > chrome
    • chrome trace
      • captures detailed trace of chrome activity for that test, good for reporting chrome bugs
      • chrome trace link shows up under waterfall link on left hand side of results
      • trace files are pretty big, circa 20-30 megabytes
      • when viewing chrome trace, use wasd keys to scroll up, left, down, right
      • breaks everything down per process, per thread
      • lets you eg, see why a bit of js is taking so long to parse
      • may have some overhead, lower overhead than timeline, but still test with and without
  • at bottom of waterfall view is a 'view all images' link
    • shows all images on page (linked images, not saved, metadata is accurate to the time of the actual request, image is not)
    • view all images is a quick way to eyeball all the images, can flag up that eg image sprites should be used
    • tracking beacons that are fetched as images will show up on 'show all images' as blank images
    • the 'compress images' report in result detail compares against recompression at quality 85 - 85 should always be safe - that info is also available on the show all images page
    • 'analyze jpeg' link on show all images page will show you the original image and the recommended image, so you can compare quality to see whether recommendation is safe
  • advanced settings > advanced
    • preserve original user agent string
      • use this if site breaks when UA string is changed to add webpagetest (eg AOL doesn't serve ads to webpagetest)
  • advanced settings > block
    • add a file extension, all matching requests will be blocked and you can see the performance of the page without those resources
    • resource blocking fast fails
    • does a substring match, not just file extension
  • advanced settings > spof
    • add a domain, all matching requests will fail (eg what if cdn goes down or is not available in china)
    • spof = single point of failure
    • spof test automatically does with and without tests, takes you to visual comparison page
  • scripting lets you do multi-step requests, eg log in, go here etc, can capture whole sequence or just the bit you're interested in
  • scripting has form commands and exec, which runs a bit of js - eg set some form data then submit the form
  • custom browser settings available for chrome and firefox
  • user timing - performance.mark and performance.measure, can timestamp specific actions, they'll show up in waterfall views etc
  • javascript start render time is very unreliable, light green bar as opposed to dark green bar for start render in waterfall view (if they're far apart, don't trust js metric)
  • advanced settings > custom
    • lets you insert custom metrics, explained in the documentation
    • inserts arbitrary js, lets you collect anything you want
    • also useful for validating page content, did the test succeed?
  • o'reilly book - 'using webpagetest' coming out soon
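
Not from the talk, but a minimal sketch of driving the API with the medianMetric parameter, assuming Node 18+ (for global fetch), an API key in WPT_API_KEY and the standard public runtest.php / jsonResult.php endpoints:

```js
// Sketch: submit a test, poll for the result, and pick the median run by SpeedIndex.
const WPT = 'https://www.webpagetest.org';
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runTest(url) {
  // f=json gives a JSON response containing the test id
  const submit = await fetch(
    `${WPT}/runtest.php?url=${encodeURIComponent(url)}&runs=9&f=json&k=${process.env.WPT_API_KEY}`
  );
  const { data: { testId } } = await submit.json();

  // Poll until the test completes (statusCode 200), asking for the median run
  // to be chosen by SpeedIndex rather than load time
  for (;;) {
    await sleep(10000);
    const result = await fetch(`${WPT}/jsonResult.php?test=${testId}&medianMetric=SpeedIndex`);
    const json = await result.json();
    if (json.statusCode === 200) {
      return json.data.median.firstView;
    }
  }
}

runTest('https://www.example.com/').then((median) => {
  console.log('median first view by SpeedIndex:', median.SpeedIndex, 'loadTime:', median.loadTime);
});
```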

Tobias Baldauf / Advanced image compression techniques

  • http://twitter.com/tbaldauf
  • 3 levels of tools
    • binaries: libjpeg, pngquant, cwebp...
    • wrappers: jpegoptim, jpegrescan...
    • guis: imageoptim, pnggauntlet, pngcrush...
    • guis rarely make a difference, the underlying binaries & wrappers do the legwork
  • in results, dssim score rates similarity of before and after images, dssim=0 means perfect match, dssim=1 means no similarity
    • dssim runs as command line tool
    • dssim can only compare PNGs, use imagemagick to convert jpegs to pngs, use dwebp to convert webps to pngs
    • dssim score of below 0.01 should be acceptable (i.e. below 1% difference)
  • jpegoptim - on github, available in most linux distros, well-maintained, recommended
    • decent compression on jpegs
    • jpegoptim --strip-all --all-progressive --stdout in.jpg > out.jpg
      • strip EXIF data
      • progressive JPEG
      • dssim will be 0
    • jpegoptim --max=75 --strip-all --all-progressive --stdout in.jpg > out-lossy.jpg
      • as before, but lossy, 75 is a good in-between threshold, will give a much greater size reduction, dssim will climb up a bit
  • jpegtran - optimise huffman tables ONCE
    • jpegtran -copy none -optimize -progressive -outfile out.jpg in.jpg
    • no gain from jpegoptim, as jpegoptim has already done that work
  • jpegrescan - runs jpegtran many times and finds optimal huffman tables
    • jpegrescan -q -s -i in.jpg out.jpg
      • small reduction in size from jpegoptim/jpegtran
      • dssim is still 0
    • about 1% to 2% more performant than jpegtran based on sample of 700,000 pseudorandom JPEGs from http archive
  • jpegoptim+jpegrescan produces median image size reduction of 12% on 12,000 pseudorandom JPEGs
    • 0.02% median dssim
  • adaptive compression based on rtt - akamai doing this!
    • create 10 lower quality versions of image on first request
    • client served with most appropriate image quality based on rtt
    • works 'automatically'!
  • alternative approach - adaptive compression based on ux
    • experimental script by tobias, not open-sourced yet
    • uses jpegoptim+jpegrescan, could work with any tools
    • use dssim threshold of 0.002 to decide whether more compression is required, binary search used to repeatedly optimise compression ratio
    • wasteful but a good proof of concept (see the sketch at the end of this section)
  • webp = png8 + png24 + jpeg + gif + better compression
    • i.e. not perfect for high definition images, it's a generic solution for the web
    • chrome, opera and android browser are the only implementations
    • webp has won because of jpegxr licensing issues
    • cwebp is compress binary name, dwebp is decompress binary name
    • cwebp in.jpg -o out.webp
      • big reduction in size, compared to jpegoptim+jpegrescan image
      • dssim circa 1.5%, compared to jpegoptim+jpegrescan image
    • you need dwebp to work with webps: dwebp out.webp -o out.png
    • akamai will automatically serve webp to conformant browsers
    • median size reduction of 32.9% on 12,000 pseudorandom JPEGs, 0.3% median dssim score
  • adept
    • https://github.com/technopagan/adept-jpg-compressor
    • runs on command line, will be renamed jpegadept soon
    • adaptive compression to avoid artifacts in lossy compression
    • compression artifacts most noticeable in high contrast areas
    • adept looks for edges / contrast
      • applies high compression where there are no edges, lower compression where edges are found
    • also uses "maximum symmetric surround saliency"
    • doesn't work well on large, complex images like photos
    • median size reduction of 29.9% on 12,000 pseudorandom JPEGs, 0.7% median dssim score
      • close to webp performance for a JPEG
      • dssim score still needs work (it's only a side-project at the moment)
        • high compression areas use 69% compression
        • plan is to use smoothing to improve perception
    • imageman takes a similar approach
    • partly motivated by jpegmini, which is closed source
      • they do saliency detection
      • doesn't inherit quality from input jpeg
      • median size reduction of 17.2% on 12,000 pseudorandom JPEGs, 0.4% median dssim score
        • compression is worse than adept, but dssim score is better
  • final rank: webp beats adept, beats jpegmini, beats jpegoptim+jpegrescan
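
Tobias's script isn't open-sourced, so this is only a guess at its shape: a binary search over jpegoptim's --max quality against a dssim threshold of 0.002, shelling out to jpegoptim, ImageMagick's convert and the dssim binary (all assumed to be on the PATH):

```js
// Sketch of adaptive lossy compression driven by a dssim threshold (binary search on quality).
const { execSync } = require('child_process');
const fs = require('fs');

const THRESHOLD = 0.002; // dssim score we're willing to tolerate

function dssim(a, b) {
  // dssim can only compare PNGs, so convert the JPEGs first
  execSync(`convert ${a} a.png && convert ${b} b.png`);
  return parseFloat(execSync('dssim a.png b.png').toString());
}

function compress(input, quality, output) {
  execSync(`jpegoptim --max=${quality} --strip-all --all-progressive --stdout ${input} > ${output}`);
}

function adaptiveCompress(input, output) {
  let lo = 40, hi = 95, best = hi;
  while (lo <= hi) {
    const quality = Math.floor((lo + hi) / 2);
    compress(input, quality, output);
    if (dssim(input, output) <= THRESHOLD) {
      best = quality;   // still visually close enough, try compressing harder
      hi = quality - 1;
    } else {
      lo = quality + 1; // too much visible damage, back off
    }
  }
  compress(input, best, output);
  console.log(`kept quality ${best} for ${input} -> ${fs.statSync(output).size} bytes`);
}

adaptiveCompress('in.jpg', 'out.jpg');
```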

Buddy Brewer & Philip Tellis / RUM: Beyond page-level metrics

  • code for talk - https://github.com/lognormal/velocity-sc2014-beyond-page-metrics
  • browserscope - runs regular diagnostic tests on browsers, publishes results
  • resource timing doesn't distinguish between 304 and 200, doesn't show resources that error
  • use document.getElementsByTagName to identify resources that are not reported by resource timing
  • Timing-Allow-Origin header required for every step in the redirect chain
  • window.performance.getEntries() returns an array of nav timing, res timing and user timing objects
  • user timing done with performance.mark() and performance.measure(), cleared with clearMarks/clearMeasures methods
  • .mark(name) lets you set marks
  • .measure(name) records the time from navigation start until now
  • .measure(name, startMark, endMark) records the time between two marks
  • .now() returns the time from navigation start until now
  • user timing has sub-millisecond (microsecond) precision
  • see the sketch at the end of this section
  • lots of work on measuring performance and on how to improve performance, not much on how fast is fast enough
  • important to collect behaviour metrics like goal conversions and try to correlate their change against performance improvements
  • then use the correlation to put a value on patience, e.g. 1 second = $54m or x number of petition signings
  • boomerang also fires on beforeunload to try and measure the time when a user aborts because things were too slow
    • those metrics obviously won't contain a load event
    • need to handle this in boomcatch #todo #phil
  • be aware of margin for error before extrapolating from averages
  • latency measured by downloading a 32 byte gif 10 times (32 bytes fits in one tcp packet)
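
A minimal sketch (mine, not from the talk) of the user timing and resource timing calls above:

```js
// User timing: timestamp specific actions so they show up in waterfall views etc
performance.mark('search-start');
// ...the work being measured would happen here...
performance.mark('search-end');
performance.measure('search', 'search-start', 'search-end');

// navigation, resource and user timing entries all come back from getEntries()
performance.getEntries().forEach((entry) => {
  console.log(entry.entryType, entry.name, entry.startTime, entry.duration);
});

// resource timing doesn't report resources that errored (and doesn't distinguish
// 304 from 200), so cross-check the DOM for anything with no entry
const reported = new Set(
  performance.getEntriesByType('resource').map((entry) => entry.name)
);
Array.from(document.getElementsByTagName('img'))
  .filter((img) => img.src && !reported.has(img.src))
  .forEach((img) => console.log('no resource timing entry for', img.src));

// clear user timing entries when done
performance.clearMarks();
performance.clearMeasures();
```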

Paul Irish / Chrome performance tools

  • cmd+d - record shortcut on timeline
  • frames mode - shows 60fps and 30fps guidelines, identifies which parts of the processing exceed those targets
  • cpu profile - select chart from dropdown, shows flame chart
  • on timeline, select tracing mode, it lets you zoom in and get a similar visualisation as the flame chart, but with timeline data
  • 'capture stacks' checkbox on timeline, adds more info to the flame chart
    • height / y-axis is call stack
    • width / x-axis is time spent
  • on timeline, rendering sub tab, show paint rectangles and show composite layers
  • prefer translate and opacity for transitions
  • from timeline:
    • script-bound stuff, dig into js profiler or combined flame chart
    • layout stuff, dig into triggering js
    • paint stuff, turn on layers and rects, promote layers out of refresh storms

Nick Fitzgerald / Firefox devtools

  • coredumps, heap snapshots coming soon
  • prioritise above-the-fold content:
    • move content to top of document
    • inline above-the-fold images and css
    • delay other resources
  • csscoverage - command you can use via developer toolbar, has a command line interface
    • csscoverage oneshot - css usage now
    • csscoverage start / csscoverage stop - css usage between two points
    • while scroll is at top of the page, run oneshot and put all those styles inline
  • network monitor - use it to minimise round trips, maximise caching
    • reloads page twice, with and without cache
  • network > timings - waterfall chart
  • console.time() - use for benchmarking (less precise than performance.now())
  • console.profile() / console.profileEnd(), console.time() / console.timeEnd()
    • benchmark specific tasks
    • profiles can be saved and exported
  • performance tab
    • improved profiler
    • analyses entire platform rather than just js
    • uses a sampling profiler
  • 'highlight painted area' option, like chrome's show paint rectangles
  • enable reflow logging in console
    • get reflow messages, either sync or async
    • sync reflows are the ones that cause layout thrashing (eg use of clientHeight) - see the sketch at the end of this section
  • canvas debugger
  • devtools documentation on mdn
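
A small sketch of the sync reflow / layout thrashing pattern mentioned above - reading clientHeight after a style write forces layout on every iteration; batching the reads avoids it:

```js
const items = document.querySelectorAll('.item');

// Bad: alternating reads and writes forces a synchronous reflow on every iteration
items.forEach((el) => {
  el.style.height = (el.clientHeight * 2) + 'px'; // read (clientHeight) then write (style)
});

// Better: do all the reads first, then all the writes, so layout is only invalidated once
const heights = Array.from(items, (el) => el.clientHeight);
items.forEach((el, i) => {
  el.style.height = (heights[i] * 2) + 'px';
});
```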

Tobin Titus / IE developer tools

  • F12 developer tools
    • mark source files as library
      • *.min.js is library by default
      • step in will step over those modules
      • callstack differentiates library code from your code
      • can also right click on tab to mark file as library
    • first chance exceptions
      • break on handled errors
      • can also ignore library code (or not)
    • source maps
      • debug inside source language
      • can toggle to js compiled view at any point
      • call stack changes when you toggle
      • syntax highlighting for coffee, typescript and script#
    • changebars
      • in dom tree, it shows the elements that have changed due to style changes in devtools
      • yellow = change, red = delete, green = add
      • changes tab shows a checklist of what's changed, you can copy that and when you paste it will paste css with comments showing where the change should go in the source
    • checkboxes to force :hover and :visited pseudo-classes
    • memory tab
      • profiles memory usage, take snapshots for analysis
      • view details - reports elements that may be issues
      • dominators sub-tab lets you dig down further to see what the problem is
      • later snapshots can be compared to show if issues are growing
      • source view will then have comments suggesting fixes for memory leaks
    • IE dev channel now available
  • system-level tools
    • windows performance recorder - record system information
    • windows performance analyzer - dig into all kinds of process and os information

Ignite Velocity

  • tombstones
    1. find function that you suspect is cruft
    2. add call with unique id and current date, eg: tombstone(id, '2014-06-26') - see the sketch after this list
    3. whenever you notice a tombstone that is n weeks/months old, check logs for unique id
    4. delete code
    5. periodically have tombstone clean-up days
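
A minimal sketch of what the tombstone call could look like - the function name comes from the talk, the beacon endpoint is made up:

```js
// tombstone(id, date): logs a beacon whenever suspected-dead code actually runs.
// If the id never shows up in the logs after a few weeks/months, the code can be deleted.
function tombstone(id, buriedOn) {
  // an image beacon keeps this cheap and fire-and-forget; the endpoint is hypothetical
  new Image().src = '/tombstone?id=' + encodeURIComponent(id) + '&buried=' + buriedOn;
}

function possiblyDeadFunction() {
  tombstone('checkout-v1-coupon-path', '2014-06-26');
  // ...existing code suspected to be cruft...
}
```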

Wednesday

Mark Zeman / SpeedCurve

Guy Podjarny / 3rd-party performance: don't let others get you down

  • http://twitter.com/guypod
  • missed the sodding start because i was getting stickers :(
  • spof-o-matic
    • chrome plugin by patrick meenan
    • identify single points of failure on the page
  • images are async, don't block rendering
  • scripts are sync, block rendering
  • add async attribute to 3rd-party script elements in the page to stop them blocking
    • don't inject async script elements (see ilya grigorik explanation for why)
  • document.write() in an async script may wipe your page
    • assign a nop function to document.write to prevent 3rd-party scripts fucking with your content
    • modern browsers do this for you now
  • individual beacons are cheap, lots of them can stack up to cause performance problems
  • although images don't block rendering, they do block load event, so beacons delay load time
  • if load time is delayed, browser doesn't indicate to user page is complete
  • javascript beacons preferred to image beacons because of the load delay
    • image beacon included in a noscript element as a fallback
  • if google analytics or quantcast are down, they will delay your load event by ages
  • async is preferred to defer to avoid losing users
  • required: async without delaying the load event
    • use invisible iframe
    • iframe.contentWindow.document.open().write('<body onload="var js = document.createElement(\'script\'); js.src = \'foo.js\'; document.body.appendChild(js);"></body>');
    • initial write postpones parent load, but by using an onload handler in the child frame we get async behaviour
    • problem: no access to parent document from child frame
  • W3C beacon API
    • lets you send a beacon that doesn't block load event
    • works with the unload event, so data can be sent after the user has left the page (see the sketch at the end of this section)
  • W3C resource priorities API
    • mark low priorities
    • lazyload attribute
    • doesn't delay load event
    • supports link elements and works for a whole bunch of different elements
  • wait for load event before injecting third party scripts, then you don't defer load event
  • don't create dependencies between scripts, because it requires blocking
    • there is async="false" but it's a bad idea
    • async="false" makes scripts async, but guarantees execution order
    • causes problems, didn't catch what they were though
  • merge 3rd-party libraries with 1st party content
    • Cache-Control: public, max-age= header
    • makes them cachable AND lets you host them on your own server
  • http://jsmanners.com/
    • site for rating 3rd-party scripts
    • aims to be a knowledge base for 3rd-party stuff
    • not taken off yet, needs more community contribution
  • beware tag managers
    • too easy to add lots of 3rd-party tags
    • introduce complexity
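
A small sketch of the beacon API usage described above (navigator.sendBeacon), with an image beacon fallback; the /beacon endpoint is hypothetical:

```js
// Send analytics without delaying the load event or blocking unload
function sendAnalytics(data) {
  const payload = JSON.stringify(data);
  if (navigator.sendBeacon) {
    // queued by the browser and sent asynchronously, even after the page is gone
    navigator.sendBeacon('/beacon', payload);
  } else {
    // fallback: an image beacon (doesn't block rendering, but does delay the
    // load event if fired before onload)
    new Image().src = '/beacon?d=' + encodeURIComponent(payload);
  }
}

// send data as the user leaves the page
window.addEventListener('unload', () => {
  sendAnalytics({ event: 'page-exit', t: performance.now() });
});
```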

Dan Slimmon / Smoke alarms, car alarms and monitoring

  • http://twitter.com/danslimmon
  • learn to do some stats and visualisation
  • signal-to-noise ratio
  • sensitivity & specificity
    • concepts from medicine
  • sensitivity
    • % of actual positives identified as positives
    • high = sensitive to problems
  • specificity
    • % of actual negatives identified as negatives
    • high = few false positives (rarely flags a problem when there isn't one)
  • prevalence
    • probability of problem occurring
  • uptime = 100% - prevalence
  • positive predictive value
    • probability that a positive is truly positive
  • if service has 99.9% uptime and a probe has 99% sensitivity and 99% specificity
    • probability of true positive = (probability of service failure) * (sensitivity) = 0.1% * 99% = 0.099% = 1 in 1,000 times
    • probability of false positive = (probability of working) * (100% - specificity) = 99.9% * 1% = 0.99% = 1 in 100 times (uh oh)
    • positive predictive value = PTP / (PTP + PFP) = 0.099% / (0.099% + 0.99%) = 9.1%
    • in this example, false positive is 10 times more likely than true positive
    • in other words, for this alert there is only a 1 in 10 chance that something is actually wrong
      • 9 out of 10 alerts are false (see the sketch at the end of this section)
  • car alarms are high sensitivity but low specificity
    • people ignore car alarms
  • smoke alarms are high sensitivity and high specificity
    • people listen to smoke alarms
  • you want probes like smoke alarms, not car alarms
  • why are alerts noisy?
    • don't want customers or boss finding problems
    • we focus on sensitivity as a result
  • aim for more degrees of freedom to gain both sensitivity and specificity
  • add degrees of freedom using hysteresis
    • hysteresis = dependence on past behaviour
    • eg state machines, time series analysis
    • more variables to tweak
    • can change one of sensitivity or specificity without the other as a result
    • if you only have a single threshold, you can't independently change sensitivity or specificity
  • pagerrhea
  • as uptime increases, specificity must also increase to compensate
    • otherwise, false positives will increase
  • separate problem detection from problem identification
  • alerting should tell you whether work is getting done
  • don't alert on things like apache process count or swap usage
  • alert on whether the system is working, eg response time or requests per second (or both combined is even better)
  • once they indicate the service is down, then check apache process count or swap usage
  • alert on problem detection, include problem identification when alert is triggered
  • track actionability and investigability of alerts
  • any alerts that are not actionable or not investigable, lose the alert
  • https://github.com/etsy/opsweekly
    • categorisation manager for alerts
  • ideally nagios would separate detection and identification
  • bischeck is an interesting monitoring tool that supports hysteresis
  • medicine is an interesting area for ops people, involves understanding systems that we can't see inside
  • blog post
  • maths in this talk related to base-rate fallacy
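
The positive predictive value arithmetic above as a tiny function (a sketch, not from the talk):

```js
// probability that an alert firing actually means something is wrong
function positivePredictiveValue(uptime, sensitivity, specificity) {
  const prevalence = 1 - uptime;                      // how often the service is actually down
  const truePositive = prevalence * sensitivity;      // down AND detected
  const falsePositive = uptime * (1 - specificity);   // up BUT alerted anyway
  return truePositive / (truePositive + falsePositive);
}

// 99.9% uptime, 99% sensitivity, 99% specificity => ~0.09, ie roughly 9 in 10 alerts are false
console.log(positivePredictiveValue(0.999, 0.99, 0.99));
```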

J. Paul Reed / A look at looking in the mirror

  • http://twitter.com/soberbuildeng
  • 5-whys
    • root cause analysis
    • assumes linear sequence of events, like dominoes
    • assumes easy solutions
    • often mistaken for comprehensive causality
    • bounds the system to the questions, can skew actual causes
    • often boundaries are too constrained, essentially arbitrary
  • epidemiological model
    • acknowledges different subsystems
    • identifies different types of failure: latent, active
    • accounts for various system actors
    • aka the swiss cheese model
    • assumes we can break down systems and quantify boundaries
    • assumes we can fix latent failure
    • still a linear model
  • biases
    • hindsight bias
      • "we should have known..."
      • "why didn't you notice..."
      • acknowledge a reality that doesn't exist
    • ????? bias
      • ignores non-fatal problems
      • assumes repeated attempts can improve things
    • correspondence bias
      • fundamental attribution error
      • different wants between ops and devs
  • systemic model
    • models accidents as emerging from interactions between
      • system components and processes
      • and actors
      • and tiers of the org
      • across time
    • in a complex organisation it is too chaotic to model in these terms
    • assumes simple systems
    • assumes our systems are inherently safe
    • assumes safety can be baked in
    • assumes priorities and goals are static and coherent
    • assumes there are specific, singular fixes to problems
  • accountability is not quite the same as responsibility
    • everyone is accountable
    • might not be held responsible
  • SMART recommendations
    • Specific
    • Measurable
    • Agreed/Agreeable
    • Realistic
    • Time-bound

Doug Sillars & Andy Davies / What makes mobile websites tick? How do we make them faster? Insights from WebPageTest and HTTPArchive

  • focus on speedIndex
  • avoid needless redirects
  • avoid blocking scripts
  • generally a low number of requests delivers fast visual performance
  • any css and script in the head, inline it
  • browser can't paint the screen until it has fully built the render tree
  • best performing sites make no external requests in the head
  • fonts can block rendering
    • ie doesn't wait
    • firefox and chrome will wait three seconds before displaying content
    • guardian lazy load the font and store it in local storage (see the sketch at the end of this section)
      • if everyone did that people would run out of local storage, not responsible
    • web font loader js
    • font load events api
    • consider subsetting fonts, why download the whole set if you're only using alphanumeric
  • using webpagetest book is in early release, buy it now
  • inlining stuff brings tradeoff because it means not using the cache
    • consider how important first view is
    • only inline the minimum for above the fold, then cache the rest
    • use a/b testing for performance, see which works best, inline or cached
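
A rough sketch of the guardian-style approach above - fetch the @font-face css (with data-URI fonts) asynchronously and cache it in localStorage for repeat views; the URL and cache key are made up:

```js
// Lazy-load web font CSS and cache it in localStorage for repeat views.
// 'fonts.css' would contain @font-face rules with data-URI encoded fonts.
(function loadFonts() {
  const KEY = 'font-css-v1';        // bump the version to invalidate the cache
  const URL = '/fonts.css';         // hypothetical stylesheet of data-URI @font-face rules

  function apply(css) {
    const style = document.createElement('style');
    style.textContent = css;
    document.head.appendChild(style);
  }

  try {
    const cached = localStorage.getItem(KEY);
    if (cached) return apply(cached);     // repeat view: no network request, no FOIT
  } catch (e) { /* localStorage unavailable - fall through to the network */ }

  // first view: fetch the CSS after render has started, then cache it
  const xhr = new XMLHttpRequest();
  xhr.open('GET', URL);
  xhr.onload = function () {
    apply(xhr.responseText);
    try { localStorage.setItem(KEY, xhr.responseText); } catch (e) { /* ignore quota errors */ }
  };
  xhr.send();
})();
```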

Parashuram Narashimhan / Making front-end performance testing a part of continuous integration: PerfJankie

  • browser-perf
    • npm package that is a port of a python port of chrome's performance tools
  • PerfJankie
    • built on browser-perf, tracks performance over time
    • drive it with selenium tests

Toufic Boubez / Some simple math to get some signal out of your ops data noise

  • there is no simple math for anomaly detection
  • all following data is real data
  • unlabelled y axis deliberately, specific metric is not important
  • metrics often picked fairly arbitrarily
  • alert fatigue
  • people assume data is gaussian / normally distributed, that the mean and standard deviation don't change
  • 3 sigma rule: 68-95-99.7 percent of values fall within 1, 2 and 3 standard deviations
  • data center data is usually different to processing data, doesn't have gaussian distribution
  • 3 sigma rule doesn't apply in those cases
  • plot frequency distribution / histogram to see the distribution
    • gaussian = bell-shaped curve
  • non-gaussian distributions require a moving average instead of a static mean
    • allows the 3 sigma rule to apply when adding new values
  • simple moving averages are skewed by spikes in data
  • weighted moving averages assign a linearly/arithmetically decreasing weight to the average, also not ideal
  • exponential / smoothing moving average, also not ideal
  • all smoothing predictive methods work better with normally distributed data
  • non gaussian techniques are needed instead
  • first thing to do with your data is plot a histogram
    • you can't tell the distribution until that is done
    • if it is gaussian you can use one of the above techniques
  • kolmogorov-smirnov test
    • non-parametric test
    • doesn't assume a distribution
    • measures maximum distance between cumulative distributions
    • used to compare periodic/seasonal metric periods
    • use r / octave / matlab
    • pick three windows from similar timeframes, eg three consecutive mondays
    • kolmogorov-smirnov can compare those windows, looking for maximum distance between the probability distribution
    • very good indicator of the difference in probabilistic behaviour
    • take two windows, slide them in time, perform constant kolmogorov-smirnov test as data streams in, computing score on the fly
    • use that data for alerting, works much better, far fewer false positives
  • box plots
    • also non-parametric
    • median is better than mean for non-gaussian data
    • quartiles are better than standard deviation for non-gaussian data
    • Q1, Q2, Q3 = 25%, 50%, 75%
    • draw fences at Q1 - 1.5 * iqr and Q3 + 1.5 * iqr; those chunks give you an excellent equivalent to standard deviation (see the sketch at the end of this section)
    • iqr works well for some non-gaussian data
    • need to re-compute quartiles, iqr and fences as data streams in
    • automatically smoothes thresholds on the fly, no need to eyeball it, let the machine do the work
  • diffing / derivatives
    • often when the data is not stationary / gaussian, the derivatives do tend to be stationary
    • most frequently, first difference is sufficient
    • can perform analytics on first derivative
    • the diffs often form a gaussian distribution
    • so you're predicting / alerting on the derivative of the data, not the data itself
    • derivative is calculated by subtracting s(t) from s(t+1)
  • neural networks
    • not enough time - BOOOOOOOOOOOOOOOOOO
  • use r / octave / matlab
  • understand statistical data
  • don't assume gaussian
  • use appropriate techniques
  • use kale / skyline + oculus for anomaly detection and matching
    • or apply kolmogorov-smirnov test in r or octave, look for data sets with a low distance
    • kale is the only open-source tool to do it
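
A small sketch of the box plot thresholds described above - quartiles and 1.5 * IQR fences recomputed over a sliding window:

```js
// Non-parametric anomaly thresholds: quartiles + 1.5 * IQR fences over a sliding window
function quantile(sorted, q) {
  // simple linear-interpolation quantile; assumes the array is sorted ascending
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

function fences(window) {
  const sorted = [...window].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  return { lower: q1 - 1.5 * iqr, upper: q3 + 1.5 * iqr };
}

// re-compute the fences as data streams in; no static threshold to eyeball
function isAnomalous(window, value) {
  const { lower, upper } = fences(window);
  return value < lower || value > upper;
}

const recent = [12, 11, 13, 12, 14, 11, 12, 13, 12, 11];
console.log(isAnomalous(recent, 25)); // true - outside the upper fence
console.log(isAnomalous(recent, 13)); // false
```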

Thursday

Keynote

Jonah Stiennon / Appium

  • mobile automation made awesome
  • like selenium for iOS / android / firefox os
    • windows phone support coming, still a bit of work to be done
  • open-source
  • uses exact native automation libraries that operating systems provide, not emulated
  • runs on node
  • supports multi-touch
  • exposes a multi-language api for scripting actions (js, ruby, all major languages)

Josh Marantz / Lessons learned building PageSpeed and how to make the web fast

  • PageSpeed Insights
    • analysis & diagnosis
    • also diagnoses UX issues
  • PageSpeed Optimisation
    • web server implementation

Kent Alstad / WebPageTest & continuous integration

  • radware value dashboard
  • node wrapper around webpagetest
  • spins up instances around the world and runs tests on them
  • compare 30 accelerated tests vs 30 unaccelerated tests
  • 30 - 100 runs in webpagetest required to get to 95% confidence

Peter Hedenskog / sitespeed.io

  • open-source
  • command line app
  • sitespeed.io -u http://www.example.com -d 2 -c chrome,firefox -z 3
    • -u is test root
    • -d is crawl depths
    • -c is browser
    • -z is number of tests
  • generates an html report
    • different tabs for different views
    • summary tab
    • pages tab
      • sort by column
      • step into page
    • assets tab
      • shows time since last modification, helpful for setting cache headers
  • being rewritten in node
    • will accept data from pagespeed insights, webpagetest and har files

Eric Lawrence / Fiddler

  • http://twitter.com/ericlaw
  • telerik acquired fiddler to make it free to the community
  • runs on mac/linux now
  • can write fiddler script in C#, will run on mac/linux with mono
  • can proxy devices to connect via fiddler and collect that traffic too
  • can import pcap files from tcpdump, wireshark etc
    • file > import
  • has transcoders
    • can import pcap, convert to har
  • is a pluggable platform now, not just a proxy
  • handles all caching quirks
  • zopfli compression makes jquery 18% smaller than gzip
    • fiddler exposes the zopfli comparison in ??? tab

Patrick Lightbody / Software analytics for performance nerds

  • http://twitter.com/plightbo
  • what if long tail of slow performers are customers who spend most money
  • john rauser "look at your data"
    • old velocity talk, worth watching
  • new relic insights
    • captures every interaction in browser and on back-end
    • can query with a query language
    • marries business requirements and performance

Seth Walker / Performance and maintainability with continuous experimentation

  • http://twitter.com/sethwalker
  • feature flags
    • branch code based on visitor state
    • also useful for gradually introducing new features based on performance analysis (see the sketch at the end of this section)
    • use unique ids to match up logs with feature flag test cases
    • about to implement feature flagging in the asset pipeline
    • also plan to include tombstones in feature flagging
  • analytics
  • continuous experimentation
    • see dan mckinley post
    • small, measurable changes
    • invalidate hypotheses
    • use feature flagging aggressively
    • test all performance changes
      • progressive jpegs
      • prefetch
    • might find that positive performance improvements can negatively affect engagement
      • eg rendering performance might be worse, jank
  • cruft
    • introduces cognitive overhead
    • can affect performance
    • eg, lots of different feature flags can cause a whole bunch of different link/script elements
    • need to ensure you clean up old feature flags
      • don't let fear prevent that happening
      • design for throwawayability - see bill scott post
      • instrument the code, run instrumented code, see what is being used or not
  • adapt tools and processes to promote experimentation
  • if you tell people they are in an experiment, they will behave differently
  • have tombstone days where everyone cleans stuff up
  • document experiments and when they were run
  • it's good to repeat experiments
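
A minimal sketch of percentage-rollout feature flagging as described above (not Etsy's actual implementation; the flag names and hash are made up):

```js
// Minimal percentage-rollout feature flag: deterministic per-visitor bucketing so the
// same visitor always sees the same variant, and the variant can be logged alongside
// performance metrics to correlate the two.
const flags = {
  'progressive-jpegs': { enabled: true, rollout: 10 }, // 10% of visitors
  'prefetch-next-page': { enabled: false, rollout: 0 },
};

function bucket(visitorId, flagName) {
  // cheap deterministic hash -> 0..99
  let hash = 0;
  for (const char of visitorId + ':' + flagName) {
    hash = (hash * 31 + char.charCodeAt(0)) >>> 0;
  }
  return hash % 100;
}

function isEnabled(flagName, visitorId) {
  const flag = flags[flagName];
  return !!flag && flag.enabled && bucket(visitorId, flagName) < flag.rollout;
}

// branch code based on visitor state, and log the variant with the visitor id so
// RUM beacons can be matched up with the experiment afterwards
if (isEnabled('progressive-jpegs', 'visitor-1234')) {
  // serve/request the experimental asset pipeline output
}
```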

Jan-Willem Maessen / Making the web POSH

  • in-place resource optimisation (IPRO)
  • automatically rewriting pages, especially changing URLs, can break JS
  • doing a perfect job of web performance optimisation is impossible, unless the pages themselves don't change
  • IPRO doesn't change the page
  • how rewriting breaks js
    • introspective javascript
      • looks up src attribute of its script element
      • uses that url then inserts new nodes
    • URL mangling in response to user action
      • img hover adds -hover to the src attribute to load a special hover image
    • locating dom nodes by matching the URL in a src or href attribute
      • instead of using data attributes, class or id
  • don't change the html at all, instead:
    • re-compress images
    • minify js and css
    • don't change any URLs
  • pagespeed IPRO does all this:
    • ModPagespeedRewriteLevel OptimizeForBandwidth
  • rewriting URLs will provide greater savings
    • resize images according to their size on the page, update src attribute to resized image
    • cache-extend resources while preserving site updates by adding a content hash
      • when the resource changes, the content hash changes
    • combine resources
      • concatenate css resources
      • concatenate js resources
      • image spriting
      • reduces number of connections used
      • resource data comes in over warm connections
    • inline resources
      • inlining is the only way to achieve 1000ms time to glass
      • concentrate on inlining the css that is actually used by the page, no need to inline unused css
      • limit image inlining to above-the-fold images
      • inline low-quality versions of large above-the-fold images and lazyload the full resolution
      • inline css imports into the file in which they occur
      • inline images referenced in css
        • will slow page down unless they're small, because of the way browser prioritises and handles images in css
  • pagespeed settings:
    • PageSpeedRewriteLevel core
    • PageSpeedEnableFilters prioritize_critical_css
  • if pages are ok with having URLs changed, do you still need IPRO?
    • probably
    • web page optimisation can only optimise what it can see
    • IPRO can optimise eg data_src attribute dynamically/lazy loaded image
      • ModPagespeedUrlValuedAttribute img data_src image
    • dynamically loaded document fragments
      • can't see them on page
      • can see them as they come off the server
      • modify them in-place
  • measuring performance
    • 0% pagespeed experiments
    • not delivered to any users without cookie PageSpeedExperiment=1
    • ModPageSpeedRunExperiment on
    • ModPageSpeedExperimentSpec id=1;percent=0;options=...;enabled=in_place_optimize_for_browser
    • then test in WebPageTest, with cookie set in advanced settings
    • use visual comparison film strips to visualise the improvement
  • don't confuse bandwidth with speed
  • ipro improves bandwidth
  • rewriting urls improves speed
  • very small images are smaller as PNGs than as WEBPs, so if image recompression makes no difference, the original image will still be served
  • another pagespeed setting for google fonts

Eddie Canales / Speed kills: when faster pages mean less revenue

Ilya Grigorik / Is TLS fast yet?

  • http://twitter.com/igrigorik
  • short answer: yes
  • transport layer security
  • authentication, data integrity, encryption
  • overheads
    • can be optimised, can even yield faster loading pages
    • CPU cost
      • asymmetric crypto
        • 1ms ish
      • symmetric crypto
        • 100 MB/s+ per core with sha256 and 1024 byte blocks
      • tls accounts for less than 1% cpu load on large, optimised sites
      • optimise for keepalive and session resumption
        • can re-use pre-negotiated parameters, avoid handshake
        • skips asymmetric crypto entirely
        • enable 1-rtt
        • use either session identifiers (shared state on server) or session tickets (shared state on client, preferred, easier to deploy)
    • memory cost
      • tls accounts for less than 10k memory on large, optimised sites
      • disable tls compression
        • rely on gzip compression at http level
        • costs ~1mb per connection vs 100kb per connection
        • not secure
  • boringSSL
    • fork of openSSL
    • maintained by google
    • internal cleanup patches
    • reduced resource usage
    • will be used in chrome soon and android later
  • perfect forward secrecy (pfs) requires key rotation
    • separate layer of logic, not done automatically by web server
  • tls handshake
    • textbook tls handshake takes 2 rtts, 1 rtt is possible with optimisation
    • use cdn - edge termination can significantly reduce tcp and tls handshake costs
      • still use tls between edge and origin, uses persistent connection, little overhead
  • online certificate status protocol (ocsp)
    • checks whether certificate has been revoked
    • stops the negotiation, performs dns lookup, tcp connection, waits for response
    • not done over tls itself! can be hijacked!
      • see adam langley presentation
    • chrome doesn't block on ocsp
    • firefox blocks on ocsp
    • generally has bad latency
    • use ocsp stapling instead
      • server is responsible for fetching status, stapling it to certificate
      • client verifies stapled response
      • stapled status is signed, so safe
      • means no blocking in any browser
  • if a certificate requires 2 rtts, tls handshake will require 3 rtts
    • many certificate chains overflow the old tcp (4 packet) cwnd
  • some servers pause on "large certificates" until they get an ack for first 4 bytes
    • eyeball waterfall chart to see
    • eg older nginx
  • tls false start
    • client sends application data immediately after "finished"
    • eliminates 1 rtt
    • no protocol changes, only timing is affected
    • breaks some servers
    • opt-in
    • deploying false start
      • chrome and firefox require npn/alpn advertisement, eg "http/1.1", and forward secrecy ciphersuite, eg ecdhe
      • safari requires forward secrecy
      • ie requires blacklist and timeout, retry without false start if it fails
  • eyeball waterfall chart, if tls negotiation is not 1 rtt, optimise it
  • same with ttfb
    • large records are split across tcp packets, bumps up rtts
      • tls allows up to 16kb of application data per record
      • google servers implement dynamic record sizing to mitigate this
        • start with small record size (1400 bytes)
        • after ~1mb is sent, switch to 16kb records
        • after ~1s of inactivity, revert to 1400 bytes
        • no perfect record size, adjust dynamically
  • out-of-the-box tls performance in servers is poor across the board
    • nginx is least bad
    • need to enable things in all servers
  • cdn performance also poor in many cases
    • akamai are good
    • cloudfront, edgecast, heroku not so good

Sarah Novotny / 5 things you didn't know NGINX could do

  • 146,000,000 websites run nginx
    • 23% of top 1m
  • a/b testing
    • split_clients module
      • sends traffic to different assets, based on a hash and a weighting mechanism that you specify
      • no measurement, monitoring or analysis
  • rewrite content
    • don't need common content to be in every page, can be injected by nginx
    • directives can be used in http, server and location contexts
    • sub_filter_once, sub_filter_types, sub_filter
    • use for tracking code, automatically updating copyright statements, whatever
  • online upgrades
    • update either the config or the binary, without dropping connections
    • config
      • nginx -s reload
    • binary
      • upgrade the binary
      • kill -USR2 <pid>
      • kill -WINCH <pid>
      • verify things are working
      • kill -QUIT <pid>
      • to revert if verification fails: kill -HUP <newpid>
  • thread exhaustion
    • protect apache from thread exhaustion by using nginx in front of apache
    • handles all keepalive traffic with evented model
    • mitigates 'slow loris', 'keep dead' and 'front page of hn' attacks
  • asset compression
    • gzip on;
    • gzip_types <content type> <content type> ...;
    • gzip_proxied ;
    • image_filter size;
    • image_filter resize <width> <height>;
    • image_filter rotate <degrees>;
  • form spamming
    • stop brute force password attacks, or form spamming
    • allows granular control of request processing rate
    • directives can be used in http, server and location contexts
    • limit_req_zone $binary_remote_addr zone=foo:10m rate=1r/s;
    • limit_req zone=foo burst=5
  • manipulate proxy headers
    • mask content source (like assets in s3)
    • manage proxy behaviour
    • inject your own headers
    • allows perception management of content delivery
    • directives can be used in http, server and location contexts
    • proxy_hide_header
    • proxy_set_header
    • proxy_ignore_headers
  • configure flags
    • nginx -V
    • returns a nearly complete configure argument list
  • include directive
    • includes files
    • can be used in any context
    • promotes modularity of configuration
  • going to be an nginx user conference, announcement on monday

Mark Holland & Mike McCall / Chasing waterfalls: resource timing data in the wild

  • use a tree to remove redundancy from URLs in resource data (see the sketch at the end of this section)
    • { 'http://' : { 'example.com/': { ... #todo #boomerang
  • generate waterfall chart for different percentiles
    • what does a fast waterfall chart look like
    • what does a slow one look like
    • etc
  • lots of noise in data
  • common patterns:
    • base page slow
      • not commonly the main cause on slow pages
    • 1st- or 3rd-party content slow
      • most common big impact on slow pages
      • web fonts big cause of this slowness
      • also advertising often at fault
    • slow css or js
    • resource queuing / blocked resources
    • periods of no network activity / gaps
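
A small sketch of the URL tree idea at the top of this section - nest resource timing entries under shared URL prefixes so the beacon doesn't repeat the scheme/host/path over and over:

```js
// Collapse a list of resource URLs into a prefix tree so shared prefixes
// (scheme, host, directories) are only sent once in the beacon payload.
function buildUrlTree(entries) {
  const tree = {};
  for (const entry of entries) {
    // split into scheme + host + path segments, keeping the separators
    const parts = entry.name.match(/^(https?:\/\/)([^/]+\/?)(.*)$/);
    if (!parts) continue;
    const keys = [parts[1], parts[2], ...parts[3].split('/').filter(Boolean)];
    let node = tree;
    for (const key of keys.slice(0, -1)) {
      node = node[key] = node[key] || {};
    }
    // leaf holds a timing value for that resource (simplified: just the duration)
    node[keys[keys.length - 1]] = Math.round(entry.duration);
  }
  return tree;
}

console.log(JSON.stringify(buildUrlTree(performance.getEntriesByType('resource')), null, 2));
// e.g. { "http://": { "example.com/": { "assets": { "app.js": 120, "app.css": 45 } } } }
```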