![weblog](http://halobates.de/weblog)
Simple web log analysis tools
Originally written long ago.
Written for Apache, but should work with any web server that generates standard logs. This is a lightweight alternative to more complex database-based setups. The output is simple ASCII that can easily be processed further in shell pipelines. The tools can also be combined for more complex tasks.
One big advantage is simplicity, so the tools can easily be adapted for specific purposes.
It makes multiple passes through the logs and will likely not scale to large logs. However, it works quite well for moderately sized logs.
searchterms may need occasional updates for the latest URLs generated by search engines.
To use the tools, add the directory to your $PATH (export PATH=$PATH:dir/webtools), or call weball with an absolute path to let it set the path itself.
If you don't know how to use it, just run weball.
Dependencies: Perl-TimeDate (yum install Perl-TimeDate or similar)
Tools:
run all analyses on a log
somewhat slow for larger logs, because it makes many passes. To print the NUM top entries, run: N=NUM weball log
extract search engine search terms from an HTTP log. Referer logging needs to be enabled.
Options:

    -n  list numeric IPs in front
    -c  add search engine domain name
    -u  print target
    -U  print URLs typed into search engine
    -p  print position in search engine results (or 0)
print referer from a httpd access log
remove search engine crawls from an HTTP log file
display search engines
print hits for pages
Identify individual visitors
print user agents from a log
print total bytes transferred
accumulate on the field given by fieldnum (default: field 1)
For example, to track the countries of search engines: searchterms -c log | accumulate
print errors from a weblog
To get the top errors: weberrors log | accumulate 2 | head
resolve hosts in a weblog
print time range in a log
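Two of the steps above, summing transferred bytes and counting the values of one field, can be approximated with standard tools. The following is only a rough sketch and not the implementation of the tools listed here. The function names count_field and total_bytes are hypothetical, the field positions assume an access log in Common or Combined Log Format split on whitespace, and whether accumulate works this way internally is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are NOT part of webtools.

# count_field [N]: read lines on stdin, count how often each value of the
# whitespace-separated field N occurs (default: field 1), and print
# "count value" lines sorted by descending count.
count_field() {
    awk -v n="${1:-1}" '{ c[$n]++ } END { for (k in c) print c[k], k }' |
        sort -k1,1nr -k2,2
}

# total_bytes: sum the response size of every request read from stdin.
# Assumption: Common/Combined Log Format, where whitespace splitting puts
# the size in field 10. A "-" (no body sent) is skipped, so it counts as 0.
total_bytes() {
    awk '$10 ~ /^[0-9]+$/ { sum += $10 } END { printf "%.0f\n", sum }'
}

# Usage (access.log is a placeholder file name):
#   total_bytes < access.log
#   count_field 7 < access.log | head    # field 7 is the requested path
```

Because both helpers read plain text on stdin and write plain text on stdout, they compose with head, grep, and the other tools in the same way as the pipelines shown above.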
Andi Kleen