How Deflect fits together
NB: this flow should not be a list... watch this space for nice process diagram tbd
- user enters http://www.somesite.com/page into user agent
- UA (or proxy server) asks DNS
- DNS returns an edge server IP
- UA requests www.somesite.com/page from edge
GET /page HTTP/1.1
- edge checks for a local copy.
  If found and fresh, the edge returns it to the client: end. If no local copy is found, the edge requests the page from the origin:

    GET /page HTTP/1.1
    Host: www.somesite.com

  If a copy is found in cache but is not fresh, the edge attempts to revalidate it with an If-Modified-Since (IMS) request to the origin:

    GET /page HTTP/1.1
    If-Modified-Since: Sat, 26 Oct 1985 01:24:00 GMT
- origin webserver gets the request, and builds the page if it is dynamic. If we sent If-Modified-Since, and the page has not been modified since the moment Marty arrived back in 1985, the origin may return "304 Not Modified" - in that case the edge updates the time on the cached object and serves from cache. Otherwise, IMS or not, the origin responds with the appropriate HTTP code and data: probably 200 OK (though it could be any code). If the origin cannot be reached within the origin timeout, we choose to serve from cache no matter how old the copy is.
- edge returns response to UA
- UA optionally caches the response locally
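The caching decision described above can be sketched as a tiny shell function (purely illustrative - the flag names are invented, not ATS internals):

```shell
# Sketch of the edge's per-request decision. Inputs are illustrative
# yes/no flags, not real ATS state.
cache_decision() {
    local in_cache=$1 fresh=$2
    if [ "$in_cache" != "yes" ]; then
        echo "fetch-from-origin"       # no local copy: forward the GET to origin
    elif [ "$fresh" = "yes" ]; then
        echo "serve-from-cache"        # found and fresh: answer locally, end
    else
        echo "revalidate-with-ims"     # found but stale: send If-Modified-Since
    fi
}
```

A 304 from the origin turns the revalidate case into a cache hit; any other response replaces the cached object.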
How to build Deflect from nothing
host names for edges (optional)
edge pool - edge.deflect.ca (multiple A records)
note: do not reconfigure www.site.com at this point
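In DNS terms, the edge pool might look like the following hypothetical BIND-style fragment (IPs are RFC 5737 documentation addresses, not real edges):

```
; edge pool - multiple A records, so clients spread across the edges
edge.deflect.ca.    300   IN  A      192.0.2.10
edge.deflect.ca.    300   IN  A      192.0.2.11
edge.deflect.ca.    300   IN  A      192.0.2.12

; at go-live (NOT yet, per the note above) the protected site
; would be pointed at the pool, e.g.:
; www.site.com.     300   IN  CNAME  edge.deflect.ca.
```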
Configure a master and secondary master node
apt-get install prereqs
download trafficserver source
unpack in /usr/local/src/
./configure && make && make install
cp -rp /usr/local/trafficserver /usr/local/deflect/trafficserver-app
cp -rp trafficserver-app/conf trafficserver-conf
copy custom scripts
build ats and place in repository
configure nagios server
secondary master instructions: tbd
Configure some edge nodes
automated node build - :ref:`Daily tasks - create an edge node <daily_tasks_new_edge>`
add to monitoring - :ref:`Daily tasks - Add a host to monitoring <daily_tasks_add_host_monitoring>`
Follow the instructions for editing a config file in :ref:`Daily tasks -
Apache Traffic Server configuration <daily_tasks_ats_configuration>`
- it's not necessary to push each file separately. It's a trade-off
between validating the configuration as you go along, and the reality
that some changes cannot be tested piecemeal. At the time of writing (15
Feb 2011) there are five files we run that differ from stock.
Look at the files on disk, under
/usr/local/deflect/trafficserver-conf/ for specifics, but here's an
overview of each:
remap.config - this is the file that controls which HTTP Host header maps to which origin server.
records.config - the main configuration of the ATS application behaviour; for example, this file determines whether to check remap.config at all! Some caching behaviour is configured here - for example, whether to over-ride the Cache-Control: headers we receive from origin servers.
cache.config - a lot less config here than you might expect; mostly just the TTL in cache per domain, and the catch-all default TTL.
plugin.config - we use the conf_remap and stats_over_http plugins.
logs_xml.config - our custom log format is defined here. It is basically Netscape "common" format, with some additional caching information - it logs a code like TCP_HIT, TCP_MISS or TCP_IMS_HIT (that last means ATS sent an If-Modified-Since request and received a 304 Not Modified response).
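As a sketch, a common-style format with a cache-result field could be declared in ATS's logs_xml.config along these lines (the field codes - %<chi>, %<pssc>, %<crc> and friends - are standard ATS log fields, but the real Deflect format file may differ):

```
<LogFormat>
  <Name = "deflect_common"/>
  <Format = "%<chi> - %<caun> [%<cqtn>] \"%<cqtx>\" %<pssc> %<pscl> %<crc>"/>
</LogFormat>

<LogObject>
  <Format = "deflect_common"/>
  <Filename = "deflect"/>
</LogObject>
```

%<crc> is the cache result code - TCP_HIT, TCP_MISS, TCP_IMS_HIT and so on.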
- Most important is functional testing.
- master is backed up
- dns is accurate and complete
- logs are being collected from all edges
- monitoring is active and alerting for all nodes (master and edges)
- functional testing passed for all edges
- performance has been tested and is adequate
point the public URLs (eg site.com, www.site.com) at the pool of edge servers
Notes on the sysadmin choices
- make it easy for yourself and others
Sometimes you're not the one fixing a problem - if standards are adhered to, it's a lot easier to troubleshoot. Other times, you might not remember what you did last time. Therefore: standardise, document, take copious but orderly backups, and broadcast the work you're doing, both for information and for peer review. This means fewer headaches for everyone. Consider documenting work before you start, or work from existing documentation - and be very wary of any deviation from the document. If you have to deviate, then capture it!
- infrastructure grade
Deflect is an infrastructure service- downtime = bad.
- that's all
Here are some of the ways we try to implement the principles of stress-free administration.
The bvi wrapper
Take a look at:
Capturing your session with screen
A simple .screenrc file exists at
~root/.screenrc - if you run
screen as root, screen will do two important things:
- keep a log of your session to aid review, documentation, troubleshooting and collaboration later
- allow you to reconnect to a session if your login session is interrupted for any reason.
Even better practice is to copy .screenrc to your own homedir, create
$HOME/screenlogs and run screen as yourself before
su-ing to an administrative user account - this makes it even easier
to enjoy the benefits of a logged session after the fact.
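A minimal .screenrc giving the two behaviours above might look like this (a sketch - the actual ~root/.screenrc may differ):

```
# log every window to a file, for later review
deflog on
logfile $HOME/screenlogs/screenlog.%n   # %n = window number

# detach (rather than die) if the connection drops, so you can
# reattach later with `screen -r`
autodetach on
```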
The makefile is a kludge, but it's doing its job pretty well at the moment. Have a look inside the file - or run make in that directory with no arguments for useful help text. This is the main interface for controlling the edges, and any work that needs to be done on edges (plural) should be captured in the makefile.
In time this may be rolled into a shell or perl script, or the makefile cleaned up.
It currently calls some of the scripts in
/usr/local/deflect/scripts/ - not a bad place to have a poke around
to see what does what.
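To give a feel for the shape of that interface, here is a hypothetical excerpt (target names, hosts and paths are invented, not the real file):

```
# hypothetical sketch - not the real makefile
EDGES = provider1.deflect.ca provider2.deflect.ca

help:
	@echo "usage: make push-conf"

push-conf:
	for e in $(EDGES); do \
	    rsync -a /usr/local/deflect/trafficserver-conf/ "$$e":/usr/local/trafficserver/conf/ ; \
	done
```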
I like to keep everything where I can see it, so I build to /usr/local/. I think it makes upgrades easier, to name but one advantage.
SSH and passwords
As described in https://xkcd.com/936/ - the "search space" for two different passwords, that is, how many guesses you might have to make before you'd be certain of brute-forcing the password:
= 5.75 x 10^21, or 5,748,511,570,879,116,626,495
= 1.99 x 10^67, or 19,943,457,888,530,122,458,259,355,763,514,562,458,206,830,362,647,183,073,840,370,234,460
That's six sextillion for a bitch-to-remember password that you have to write down or store (bad), versus twenty unvigintillion (I looked it up :) for a really easy to remember password - a factor of over 100 tredecillion! SSH keys are like the PGP of login security. The properly guarded private key is never ever exposed - the public key is used to generate a challenge that can only be met by the private key. The response is then verified using the public key; if it's a match, and the target host's sshd is configured to allow the owner of the private partner of the public key it has to log on as the requested user, access is granted. The future is SSH keys encrypted with strong passphrases - much better. I will make this section better in future, maybe using the number unvigintillion again.
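The arithmetic behind those figures is just alphabet size raised to the password length. A quick shell sketch (parameters here are small and illustrative - the real figures above overflow 64-bit shell arithmetic):

```shell
# search space = number of symbols raised to the password length
search_space() {
    local symbols=$1 length=$2 n=1 i
    for ((i = 0; i < length; i++)); do
        n=$((n * symbols))
    done
    echo "$n"
}

search_space 95 8    # 8 chars drawn from the 95 printable ASCII symbols
search_space 26 13   # a 13-char lowercase passphrase is already bigger
```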
The master and its online backup are named for their roles. The edge hosts are named for the VPS provider of each, in the form providerN.deflect.ca, where N indicates whether this is the first, second, or Nth VPS at that provider, eg