Performs normalised Levenshtein distance calculations on log entries to reduce repeated data...

I was inspired by a blog post that mentioned a tool for reducing log file data sets using Levenshtein distance. The resulting output is much smaller, which makes anomalies easier to spot. The tool was only supplied in binary format, so I wrote my own version.
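As a rough illustration of the idea (not necessarily how this tool implements it), the sketch below keeps a log line only if its normalised Levenshtein distance to every line already kept is above a threshold. The function names, the greedy keep-first strategy, and the 0.3 threshold are all assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming, keeping only two rows."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, index the shorter
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def normalised_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length: 0.0 identical, 1.0 disjoint."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest


def reduce_lines(lines, threshold=0.3):
    """Keep a line only if it is not 'close' to any line kept so far.

    threshold is a hypothetical default; tune it per log format.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if all(normalised_distance(line, k) > threshold for k in kept):
            kept.append(line)
    return kept
```

For example, `reduce_lines(["session opened for user alice", "session opened for user bob"])` keeps only the first line, because the two differ only in the user name. Comparing each line against every kept line is quadratic in the worst case, which is acceptable for a sketch but may be slow on very large logs.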

Example Reduction

Test File: syslog
Lines Before: 698
Lines After: 49

Test File: dpkg.log
Lines Before: 26602
Lines After: 34