A scriptable, programmable query infrastructure for web log files. The initial idea is to import interesting records into an SQLite table for further analysis.
$ php ./cli_import.php 'example-dataset' data/access-log.2 data/access-log.3.gz
$ zcat /etc/apache2/logs/access.logs.1.gz | php cli_import.php 'example-dataset'
$ cat ./data/access.log.3 | php ./cli_import.php 'example-dataset'
$logfile = '/etc/apache2/logs/access';
$logexam = new LogExaminer('example_dataset');
$logexam->import(
    $logfile,
    array(
        'date_from'  => '2010-07-01T06:00',
        'date_to'    => '2010-07-01T09:00',
        'ignore_ext' => array('css', 'js', 'gif', 'png', 'jpg', 'swf')
    )
);
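
One way the import options above could be applied to a single log record is sketched below. This is only an illustration, not the actual LogExaminer code: the function name record_matches() and its arguments are hypothetical, and treating date_from and date_to as inclusive bounds is an assumption.

```php
<?php
// Hypothetical helper, not part of LogExaminer: decides whether one parsed
// log record passes the date_from / date_to / ignore_ext options.
function record_matches(string $timestamp, string $url, array $options): bool
{
    $time = strtotime($timestamp);
    if ($time === false) {
        return false; // unparseable timestamp: skip the record
    }
    // Assumption: both date bounds are inclusive.
    if (isset($options['date_from']) && $time < strtotime($options['date_from'])) {
        return false;
    }
    if (isset($options['date_to']) && $time > strtotime($options['date_to'])) {
        return false;
    }
    // Compare the extension of the URL path only, ignoring any query string.
    $path = (string) parse_url($url, PHP_URL_PATH);
    $ext  = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if ($ext !== '' && in_array($ext, $options['ignore_ext'] ?? array(), true)) {
        return false;
    }
    return true;
}
```

With the options from the example above, a request for /index.php at 2010-07-01T07:30 would be kept, while a request for /style/main.css?v=2 at the same time would be dropped because of its extension.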
- Populate session data
- Split each URL into path and query string fields: the path goes in the urls table, the query string in the log_entry table
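
The URL split in the last item could be sketched with PHP's built-in parse_url(), as below. The function name split_url() and the shape of the returned array are assumptions made for illustration; how the two values are then written to the urls and log_entry tables is left open.

```php
<?php
// Hypothetical helper: splits a request URL into its path and query string.
function split_url(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false) {
        // Seriously malformed URL: keep it whole as the path, with no query.
        return array('path' => $url, 'query' => null);
    }
    return array(
        'path'  => $parts['path'] ?? '',
        'query' => $parts['query'] ?? null,
    );
}

$fields = split_url('/search?q=php&page=2');
// $fields['path']  is '/search'
// $fields['query'] is 'q=php&page=2'
```

Returning null rather than an empty string for a missing query string keeps "no query" distinguishable from "empty query" once the value is stored.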