webSAT (Web Server Analytics and Telemetry) is a library for performing real-time analytics on your webserver's access log data using PHP, ZeroMQ, and Memcached. A producer script uses ZeroMQ to transmit access log data over TCP or IPC, and a consumer script collects that data, then aggregates and stores it in Memcached. You can use the aggregated data in Memcached to produce graphs for monitoring your webservers. This design also lets you collect access log data from multiple webservers concurrently and aggregate it in one or more places.
Important things you can monitor in real time using this library:
- TTLB (Time To Last Byte - otherwise known as request time)
- TTFB (Time To First Byte - using mod_log_firstbyte)
- Number of 2XX, 3XX, 4XX, 5XX response codes
- Daily active unique visitors (using IP or cookie data from the access log)
- Slowest performing URLs
- Most popular URLs
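As a rough illustration of the kind of aggregation behind the metrics listed above, the sketch below groups status codes into classes and counts distinct client IPs. The function names (`statusClass`, `summarize`) and the record layout (`ip`, `status`) are assumptions made for this example, not part of webSAT.

```php
<?php
// Hypothetical helper: maps an HTTP status code to its class bucket (404 -> "4XX").
function statusClass(int $status): string
{
    if ($status < 100 || $status > 599) {
        return 'other';
    }
    return intdiv($status, 100) . 'XX';
}

// Hypothetical helper: tallies status classes and distinct client IPs
// from already-parsed access log records.
function summarize(array $records): array
{
    $summary = ['statuses' => [], 'uniqueVisitors' => 0];
    $seenIps = [];
    foreach ($records as $record) {
        $class = statusClass($record['status']);
        $summary['statuses'][$class] = ($summary['statuses'][$class] ?? 0) + 1;
        $seenIps[$record['ip']] = true; // array keys act as a set
    }
    $summary['uniqueVisitors'] = count($seenIps);
    return $summary;
}
```

Parsing the raw log line into such records is left out here, since the fields available depend on your log format.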
How It Works
We equip your webserver with a beacon that sends out signals about what your webserver is doing. These signals are captured by a satellite, which transmits them to a station that has one or more microwaves. If a microwave tuned to the beacon's signal knows what to do with that signal, it processes it; otherwise the signal is discarded.
First, to clear up some things, let's explain what all these terms mean...
|Term|Description|
|---|---|
|Satellite|A Satellite is the PULL mechanism of the consumer in this PUSH/PULL topology|
|Beacon|A Beacon is the PUSH mechanism of the producer in this PUSH/PULL topology|
|Signal|A Signal is what a Beacon transmits to a Satellite; it consists of two parts, a designation and a payload|
|Microwave|A Microwave is the worker mechanism of the consumer, which does the actual work when consuming a given Signal|
|Station|A Station is the collection of consumer mechanisms, including the Satellite and one or more Microwaves (workers)|
A Station can only have a single Satellite (this is the thing pulling the actual messages off the queue), but multiple Microwaves (the things doing the actual work). Each Microwave can have one or more designations. When the Station receives a Signal from the Satellite, it asks all of its Microwaves whether they have the designation of the incoming Signal. If a Microwave has that designation, it is handed the Signal for processing. More than one Microwave may have the same designation(s), so the same Signal may be consumed by more than one worker in a given Station. Once a message is consumed by the Satellite, however, it cannot be consumed again by the same Satellite. It's the Station's job to hand off Signals to its Microwaves after consumption from a Satellite.
A Microwave implements a Worker interface. You can extend the Microwave abstract class to create your worker, or implement the Worker interface yourself. Each Microwave (worker) then has to implement four basic methods, among them hasDesignation. The hasDesignation method accepts a Signal argument and must always return a boolean (true or false). If it returns true, the Signal is handed to that Microwave's onReceive callback method for processing.
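The dispatch behaviour described above can be sketched roughly as follows. This is a simplified stand-in, not webSAT's actual code: `SketchSignal`, `SketchWorker`, `CountingWorker`, and `dispatch` are hypothetical names, and the interface is reduced to the two methods named in the text (`hasDesignation` and `onReceive`); the library's real classes and signatures may differ.

```php
<?php
// Hypothetical stand-in for a Signal: a designation plus a payload.
final class SketchSignal
{
    public $designation;
    public $payload;

    public function __construct(string $designation, string $payload)
    {
        $this->designation = $designation;
        $this->payload = $payload;
    }
}

// Hypothetical, reduced worker contract (only two of the four methods).
interface SketchWorker
{
    public function hasDesignation(SketchSignal $signal): bool;
    public function onReceive(SketchSignal $signal): void;
}

// Example worker that simply counts the Signals it is handed.
final class CountingWorker implements SketchWorker
{
    private $designations;
    public $received = 0;

    public function __construct(array $designations)
    {
        $this->designations = $designations;
    }

    public function hasDesignation(SketchSignal $signal): bool
    {
        return in_array($signal->designation, $this->designations, true);
    }

    public function onReceive(SketchSignal $signal): void
    {
        $this->received++;
    }
}

// Offers one Signal to every worker and returns how many accepted it.
// A Signal that no worker claims is simply dropped (return value 0).
function dispatch(SketchSignal $signal, array $workers): int
{
    $handled = 0;
    foreach ($workers as $worker) {
        if ($worker->hasDesignation($signal)) {
            $worker->onReceive($signal);
            $handled++;
        }
    }
    return $handled;
}
```

Because every worker is asked in turn, two workers sharing a designation both receive the same Signal, which matches the fan-out described above.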
It is possible to connect multiple Satellites together by having one Microwave hand messages off to another Satellite via another Beacon. Multiple Beacons may also all talk to the same Satellite. This allows you to build distributed topologies on top of the existing primitive topology.
Since webSAT is completely isolated from your application layer, it does not introduce any meaningful overhead on either the client or the server side. While Google Analytics is a great tool, it cannot give you certain insights into how your individual webservers are performing (such as TTFB). webSAT uses ZeroMQ over TCP/IP, which makes it a portable and versatile solution that can be used in a distributed fashion across multiple servers.
Since nginx does not support a piped logging facility like that of Apache httpd, you will need to write an additional script to tail and pipe your access log. With Apache httpd you get this for free (no additional moving parts). For now the library does not offer this facility directly, but there are plenty of tools that support it; see mkfifo(1), for example.
Quick Start Guide
The easiest way to use webSAT is through Apache httpd's piped logging facility. Simply clone the library's git repository onto your server, then place the following line into your vhost file or httpd.conf file:
CustomLog "|/path/to/webSat/producer.php" combined
Be sure to change this to your local path. The producer script then takes the access log data that Apache httpd writes to its STDIN and pipes it into the queue. You can then run the consumer script under supervisord and have the data aggregated in Memcached.
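To make the producer's role more concrete, here is a minimal sketch of a loop that reads log lines from a stream and forwards each one. It is not the contents of webSAT's producer.php; `pumpLines` is a hypothetical helper, and the ZeroMQ wiring in the trailing comment assumes the php-zmq extension and an endpoint chosen purely for illustration.

```php
<?php
// Hypothetical helper: reads lines from $input until EOF and passes each
// non-empty line to $send. Returns the number of lines forwarded.
function pumpLines($input, callable $send): int
{
    $count = 0;
    while (($line = fgets($input)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            continue; // skip blank lines
        }
        $send($line);
        $count++;
    }
    return $count;
}

// With the php-zmq extension installed, wiring this to a PUSH socket might
// look like the following (the endpoint is an assumption):
//
//   $context = new ZMQContext();
//   $socket  = $context->getSocket(ZMQ::SOCKET_PUSH);
//   $socket->connect('tcp://127.0.0.1:5555');
//   pumpLines(STDIN, function (string $line) use ($socket) {
//       $socket->send($line);
//   });
```

Passing the sender in as a callable keeps the read loop independent of the transport, so it can be exercised without a running queue.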
The aggregate data has a granularity of one minute. You can tweak this in the consumer's worker script if you'd like, but one-minute intervals are usually sufficient for high-load environments, and they are more memory, I/O, and CPU efficient than finer intervals.
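One common way to get fixed-width intervals like this is to round each timestamp down to the start of its bucket and use that as part of the counter key. The sketch below shows the idea with a plain PHP array standing in for Memcached; `bucketKey`, `countHit`, and the `metric:timestamp` key format are assumptions for illustration, not webSAT's actual scheme.

```php
<?php
// Hypothetical helper: builds a counter key for the interval containing
// $timestamp. $granularity is the interval width in seconds.
function bucketKey(string $metric, int $timestamp, int $granularity = 60): string
{
    $bucketStart = $timestamp - ($timestamp % $granularity);
    return $metric . ':' . $bucketStart;
}

// Hypothetical in-memory stand-in for counters that would live in Memcached.
function countHit(array &$counters, string $metric, int $timestamp, int $granularity = 60): void
{
    $key = bucketKey($metric, $timestamp, $granularity);
    $counters[$key] = ($counters[$key] ?? 0) + 1;
}
```

A larger `$granularity` means fewer keys and fewer writes per unit of time, at the cost of coarser graphs.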