The most important factor in web application design is responsiveness. And the first step toward responsiveness is speed. But speed within a web application is complicated.
Our strategy for keeping GitHub fast begins with powerful internal tools that expose and explain performance metrics. With this data, we can more easily understand a complex production environment and remove bottlenecks to keep GitHub fast and responsive.
Response time as a simple average isn’t very useful in a complex application. But what number is useful? The performance dashboard attempts to give an answer to this question. Powered by data from Graphite, it displays an overview of response times throughout github.com.
We split response times by the kind of request we’re serving. For the ambiguous items:
- Browser - A page loaded in a browser by a logged-in user.
- Public - A page loaded in a browser by a logged-out user.
Clicking one of the rows allows you to dive in and see the mean, 98th percentile, and 99.9th percentile response times.
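To make the difference between those three numbers concrete, here is a minimal sketch in Python. It is not the dashboard's actual code: the `percentile` and `summarize` helpers and the sample data are hypothetical, and the nearest-rank method is just one common way to define a percentile.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """Return the three numbers named above for a list of response times."""
    return {
        "mean": mean(samples),
        "p98": percentile(samples, 98),
        "p99_9": percentile(samples, 99.9),
    }


# Hypothetical response times in milliseconds:
# 990 fast requests and 10 very slow ones.
response_times_ms = [50] * 990 + [2000] * 10
summary = summarize(response_times_ms)
print(summary)  # {'mean': 69.5, 'p98': 50, 'p99_9': 2000}
```

In this made-up sample the mean (69.5 ms) matches no request that was actually served, the 98th percentile looks perfectly healthy, and only the 99.9th percentile reveals the two-second responses.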
The performance dashboard shows performance information, but it doesn't explain it. We needed something more fine-grained and detailed.
Mission control bar
GitHub staff can browse the site in staff mode. This mode is activated via a keyboard shortcut and provides access to staff-only features, including our Mission control bar. When it’s showing, we see staff-only features and have the ability to moderate the site. When it’s hidden, we’re just regular users.
Spoiler alert: you might notice a few things in this screenshot that haven’t fully shipped yet.
The left-hand side shows which branch is currently deployed and the total time it took to serve and render the page. For some browsers (like Chrome), we show a detailed breakdown of the various time periods that make up a rendered page. This is massively useful in understanding where slowness comes from: the network, the browser, or the application.
- render – How long did it take to render this page on the server?
- cache – memcached calls.
- sql – MySQL calls.
- git – Grit calls.
- jobs – The current background job queue.
And many more…
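Per-request numbers like the ones listed above can be collected by wrapping each kind of call in a timer and adding up the results for the current request. The following Python sketch shows the general idea only. The `RequestTimings` class, its `measure` method, and the category names passed to it are assumptions made for illustration, not the instrumentation GitHub actually runs.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class RequestTimings:
    """Accumulates call counts and elapsed time per category for one request.

    Hypothetical helper for illustration; the clock is injectable so the
    class can be exercised deterministically.
    """

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.totals_ms = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def measure(self, category):
        """Time the enclosed block and add it to the category's total."""
        start = self._clock()
        try:
            yield
        finally:
            # Record the time even if the wrapped call raises.
            self.totals_ms[category] += (self._clock() - start) * 1000.0
            self.calls[category] += 1

    def report(self):
        """Map each category to (number of calls, total milliseconds)."""
        return {
            name: (self.calls[name], round(self.totals_ms[name], 1))
            for name in sorted(self.totals_ms)
        }


# Usage: the sleeps stand in for a database query and a cache lookup.
timings = RequestTimings()
with timings.measure("sql"):
    time.sleep(0.01)
with timings.measure("cache"):
    time.sleep(0.002)
print(timings.report())
```

Keeping one accumulator per request means the totals can be rendered alongside the page that produced them, which is what makes a bar like this useful for spotting the slow data source on a specific page.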
A lot of the numbers in this post are much slower than I’d like them to be, but we’re hoping with better transparency we’ll be able to deliver the fastest web application that’s ever existed.
As @tnm says: it’s not fully shipped until it’s fast.