build your own log parser
Python
Latest commit c85ee18 Jan 10, 2013 @athoune PEP8 validation.


Logator

Build your own log parser.

Installing it

python setup.py build
sudo python setup.py install

Using it

You need a source: anything that iterates over log lines. The simplest sources are STDIN and files, but you can also use the syslogd protocol or a more complex source.
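
As a rough illustration of what "a source" can look like, the sketch below treats any iterable of text lines as one. The helper name `iter_log_lines` is hypothetical and not part of Logator, and it assumes that trailing newlines can be stripped and blank lines skipped.

```python
import io


def iter_log_lines(stream):
    """Yield non-empty log lines from any iterable of strings.

    Hypothetical helper for illustration, not part of Logator.
    """
    for raw in stream:
        line = raw.rstrip("\r\n")
        if line:
            yield line


# Any iterable of strings works: an open file, sys.stdin, or an in-memory buffer.
sample = io.StringIO("first line\n\nsecond line\n")
lines = list(iter_log_lines(sample))
print(lines)
```

The same generator could wrap `sys.stdin` or an open file object, since both iterate line by line.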

To read log lines, you need a reader. A reader is basically a regex with some simple string manipulation. You can add dynamic getters for costly queries (IP to country, for example). Dynamic attributes are lazily loaded and memoized.
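
To make the idea of "regex plus lazy, memoized dynamic attributes" concrete, here is a minimal standalone sketch. It is not Logator's implementation: the regex is a simplified Common Log Format pattern, and `Line`, `country` and `FAKE_GEO` are hypothetical names with made-up data.

```python
import re

# Simplified pattern for a Common Log Format line (assumption, not Logator's regex).
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<command>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<code>\d{3}) (?P<size>\d+|-)'
)

# Stand-in for a costly lookup such as IP to country (hypothetical data).
FAKE_GEO = {"203.0.113.7": "FR"}
lookups = []  # records each costly call so the memoization is visible


def country(line):
    lookups.append(line.ip)
    return FAKE_GEO.get(line.ip, "unknown")


class Line(object):
    """Parsed log line with lazily computed, memoized dynamic attributes."""

    def __init__(self, raw, **dynamic):
        match = LINE_RE.match(raw)
        if match is None:
            raise ValueError("unparsable line: %r" % raw)
        self.__dict__.update(match.groupdict())
        self.code = int(self.code)  # simple post-processing of a captured field
        self._dynamic = dynamic

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. the value is not cached yet.
        dynamic = self.__dict__.get("_dynamic", {})
        if name not in dynamic:
            raise AttributeError(name)
        value = dynamic[name](self)
        self.__dict__[name] = value  # memoize: the next access skips __getattr__
        return value


raw = '203.0.113.7 - - [10/Jan/2013:12:00:00 +0100] "GET /index.html HTTP/1.1" 200 512'
line = Line(raw, country=country)
print(line.command, line.code, line.country, line.country, len(lookups))
```

The costly function runs only on first access; the second `line.country` is served from the instance dictionary.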

Queries are done with filters, which can be piped together.
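
One way such piping could work is by overloading the `|` operator. The sketch below is an assumption about the general technique, not Logator's code: the classes `Filter`, `Pipe` and `ByKey` are hypothetical, and it assumes `a | b` means "feed the output of `a` into `b`", by analogy with shell pipes.

```python
class Filter(object):
    """Base class: a filter keeps the items for which keep() is true."""

    def keep(self, item):
        raise NotImplementedError

    def filter(self, items):
        # Lazy: items are tested one at a time as the caller iterates.
        return (item for item in items if self.keep(item))

    def __or__(self, other):
        # `a | b` builds a pipeline: the output of a feeds b.
        return Pipe(self, other)


class Pipe(Filter):
    """Two filters chained together; itself a filter, so it can be piped again."""

    def __init__(self, first, second):
        self.first = first
        self.second = second

    def filter(self, items):
        return self.second.filter(self.first.filter(items))


class ByKey(Filter):
    """Keep dicts whose `key` equals `value` (hypothetical example filter)."""

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def keep(self, item):
        return item.get(self.key) == self.value


records = [
    {"code": 200, "command": "GET"},
    {"code": 404, "command": "GET"},
    {"code": 200, "command": "POST"},
]
piped = ByKey("code", 200) | ByKey("command", "GET")
result = list(piped.filter(records))
print(result)
```

Because `Pipe` inherits `__or__`, longer chains such as `a | b | c` compose without extra code.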

Results can be returned as dicts, which are easy to serialize if you want to index or store them.

from logator.log import log
from logator.weblog import Common, UserAgent, HostByName, Filter_by_code, Filter_by_attribute

# The filter
filtr = Filter_by_code(200) | Filter_by_attribute('command', 'GET')
# The source: any iterable of log lines
logs = open('/var/log/apache2/access.log', 'r')
# Common is the reader, with two dynamic attribute readers: UserAgent and HostByName
for line in filtr.filter(logs, Common, UserAgent, HostByName):
    print(line.as_dict())
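
Since each result is a plain dict, serializing it for storage or indexing can be as simple as emitting one JSON document per line. The sketch below uses a stand-in dict in place of the value returned by `line.as_dict()`; the keys shown are illustrative, not a guaranteed schema.

```python
import json

# Stand-in for the dict returned by line.as_dict(); the keys are illustrative.
record = {"ip": "203.0.113.7", "command": "GET", "code": 200}

# One JSON document per line is easy to append to a file or bulk-load later.
encoded = json.dumps(record, sort_keys=True)
print(encoded)

# Round-trip to check that nothing was lost.
decoded = json.loads(encoded)
```

Dicts that contain non-JSON types (dates, for example) would need a `default=` function passed to `json.dumps`.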

User agent parsing is borrowed from the ua-parser project on Google Code: http://code.google.com/p/ua-parser/.

The future

  • √ Filter
  • √ Dynamic attributes
  • √ Parsing http server log
  • _ Parsing mail log (postfix + amavis)
  • _ Reading stdin
  • _ Reading syslog protocol
  • _ Reading "à la" tail -f
  • _ Filling a mongo database
  • √ IP to country
  • _ Querying
  • _ Nice graph from stored data

Licence

MIT, 2012 © Mathieu Lecarme