This project is a command line utility that can be used to monitor a file, '/tmp/access.log' by default, and report general statistics and high traffic events periodically.
Every 10 seconds, general statistics are reported (stdout) for all data appended to the file during those 10 seconds.
Every 120 seconds the associated data are analysed:
- whenever the total traffic exceeds a certain threshold, on average, a message 'High traffic' is reported.
- whenever the total traffic drops again below the threshold, on average, another message 'Back to normal traffic' is reported.
- both messages give details about the time when the threshold was crossed. The default value for the threshold is 10.0 requests per second.
Notice that two minutes worth of log data are kept in memory. Let's do some calculations:
-
If average log line size in bytes: 200.
-
When average number of requests per second: 1e6
200 * 1e6 * 120 = 24e9 bytes ~ 22.35 GB
-
When average number of requests per second: 10
200 * 10 * 120 = 24e3 bytes ~ 23.4 MB
Consider partitioning long period schedules into smaller schedules that aggregate.
The contents of the file are expected to be lines in CLF format.
To change defaults, please provide a list of arguments containing the keys to override, each followed by the value. For example:
./logpulse -gsp 5 -f logpulse-store/testLogs.log
will work on file 'logpulse-store/testLogs.log' and will report
general statistics every 5 seconds, instead of 10.
Available parameter keys:
-h | -help: Shows this text. This key takes not value.
-f | -file | -in: Path (absolute, or relative to the launch script's location)
of the file being processed.
Value type: text, default: /tmp/access.log
-gsp | -generalStatsPeriod: Sets the period (seconds) for general statistics
reporting.
Value type: int, default: 10
-tgp | -trafficGaugePeriod: Sets the period (seconds) for high traffic gauge's
reporting.
Value type: int, default: 120
-tgt | -trafficGaugeThreshold: Sets the threshold (requests per second on avg.
for the considered period) for high traffic reporting.
Value type: double, default: 10.0
Requirements: JDK17, Gradle (preferably 7.x.x, we use 7.4.2) are required.
To create the documentation: ./gradlew javadoc
To build the project: ./gradlew clean build
The main artefact resulting from the build can be found under: build/libs/logpulse-1.0-SNAPSHOT-all.jar, an uberjar containing all dependencies so that you may run the application with a command like:
java -Xmx1G -Dfile.encoding=UTF-8 -jar build/libs/logpulse-1.0-SNAPSHOT-all.jar -help
However, use the more convenient launch command:
./logpulse
The Main class contains:
public static void main(String[] args) {
Parameters parameters = Parameters.parseArgs(args);
Scheduler<CLF> scheduler = new Scheduler<>(new CLFReadoutHandler(parameters.file), false);
scheduler.setPeriodicSchedule(new GeneralStatsView(System.out));
scheduler.setPeriodicSchedule(new HighTrafficGauge(System.out));
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
try {
scheduler.stop();
} catch (IllegalStateException ignore) {
// it means it is already stopped, likely because
// the parent folder has been deleted
}
}, "logpulse-shutdown-hook"));
scheduler.start();
}
which hints at the use of this project as a framework that gives you efficient readout of CLF log files from an ever growing file, and a mechanism to set periodic schedules to act on them.