Small Perl script that aims to replace and outperform pgFouine


    pgBadger - a fast PostgreSQL log analysis report

    pgbadger [options] logfile [...]

            PostgreSQL log analyzer with fully detailed reports and charts.


        logfile can be a single log file, a list of files, or a shell command
        returning a list of files. If you want to pass log content from stdin
        use - as filename. Note that input from stdin will not work with csvlog.


        -a | --average minutes : number of minutes to build the average graphs of
                                 queries and connections.
        -b | --begin datetime  : start date/time for the data to be parsed in log.
        -c | --dbclient host   : only report what concerns the given client host.
        -C | --nocomment       : remove comments like /* ... */ from queries.
        -d | --dbname database : only report what concerns the given database.
        -e | --end datetime    : end date/time for the data to be parsed in log.
        -f | --format logtype  : possible values: syslog,stderr,csv. Default: stderr.
                                 There's also syslog-ng for traces with [ID *]
        -G | --nograph         : disable graphs on HTML output. Graphs are enabled
                                 by default.
        -h | --help            : show this message and exit.
        -i | --ident name      : programname used as syslog ident. Default: postgres
        -l | --last-parsed file: allow incremental log parsing by registering the
                                 last datetime and line parsed. Useful if you want
                                 to watch errors since last run or if you want one
                                 report per day with a log rotated each week.
        -m | --maxlength size  : maximum length of a query; longer queries are
                                 truncated to the given size. Default: no truncation.
        -n | --nohighlight     : disable SQL code highlighting.
        -N | --appname name    : only report what concerns the given application name.
        -o | --outfile filename: define the filename for the output. Default depends
                                 on the output format: out.html, out.txt or out.tsung.
                                 To dump output to stdout use - as filename.
        -p | --prefix string   : the value of your custom log_line_prefix as defined
                                 in your postgresql.conf. Use it only if your prefix
                                 is not one of the default allowed formats or if you
                                 want to use other custom variables like client ip or
                                 application name. See examples below.
        -P | --no-prettify     : disable SQL queries prettify formatter.
        -q | --quiet           : don't print anything to stdout, not even a progress bar.
        -s | --sample number   : number of query samples to store/display. Default: 3
        -S | --select-only     : use it if you want reports on select queries only.
        -t | --top number      : number of queries to store/display. Default: 20
        -T | --title string    : change title of the HTML page report.
        -u | --dbuser username : only report what concerns the given user.
        -U | --exclude-user username : exclude the specified user from the report.
        -v | --verbose         : enable verbose or debug mode. Disabled by default.
        -V | --version         : show pgBadger version and exit.
        -w | --watch-mode      : only report errors just like logwatch could do.
        -x | --extension       : output format. Values: text, html or tsung. Default: html
        -z | --zcat exec_path  : set the full path to the zcat program. Use it if
                                 zcat or bzcat or unzip is not on your path.
        --pie-limit num        : pie data lower than num% will show a sum instead.
        --exclude-query regex  : any query matching the given regex will be excluded
                                 from the report. For example: "^(VACUUM|COMMIT)"
                                 You can use this option multiple times.
        --exclude-file filename: path of the file which contains all the regexes to use
                                 to exclude queries from the report. One regex per line.
        --include-query regex  : any query that does not match the given regex will be
                                 excluded from the report. For example: "(table_1|table_2)"
                                 You can use this option multiple times.
        --include-file filename: path of the file which contains all the regexes of the
                                 queries to include in the report. One regex per line.
        --disable-error        : do not generate error report.
        --disable-hourly       : do not generate hourly reports.
        --disable-type         : do not generate query type report.
        --disable-query        : do not generate queries reports (slowest, most
                                 frequent, ...).
        --disable-session      : do not generate session report.
        --disable-connection   : do not generate connection report.
        --disable-lock         : do not generate lock report.
        --disable-temporary    : do not generate temporary report.
        --disable-checkpoint   : do not generate checkpoint report.
        --enable-log_duration  : force pgbadger to use log_duration even if
                                 log_min_duration_statement format is autodetected.
        --enable-log_min_duration: force pgbadger to use log_min_duration even if
                                 log_duration format is autodetected.
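
    The --exclude-file option described above takes one regex per line. As
    a rough way to preview which statements such a file would drop, the
    sketch below uses grep -E as a stand-in for pgBadger's own matching.
    This is an assumption: pgBadger is a Perl script, so its regex dialect
    and case handling may differ from grep's. The function name
    preview_excluded and both file arguments are hypothetical.

```shell
# preview_excluded REGEX_FILE QUERY_FILE
# Print every line of QUERY_FILE that matches at least one regex in
# REGEX_FILE (one regex per line, the layout --exclude-file expects).
# Assumption: grep -E only approximates pgBadger's Perl regex matching.
# Note: an empty line in REGEX_FILE matches every query.
preview_excluded() {
    grep -E -f "$1" "$2"
}
```

    For instance, with a regex file containing ^(VACUUM|COMMIT) and a file
    of statements, one per line, preview_excluded exclude.txt queries.txt
    prints only the statements that start with VACUUM or COMMIT.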


            pgbadger /var/log/postgresql.log
            pgbadger /var/log/postgres.log.2.gz /var/log/postgres.log.1.gz /var/log/postgres.log
            pgbadger /var/log/postgresql/postgresql-2012-05-*
            pgbadger --exclude-query="^(COPY|COMMIT)" /var/log/postgresql.log
            pgbadger -b "2012-06-25 10:56:11" -e "2012-06-25 10:59:11" /var/log/postgresql.log
            cat /var/log/postgres.log | pgbadger -
            # log prefix with stderr log output
            perl pgbadger --prefix '%t [%p]: [%l-1] user=%u,db=%d,client=%h' \
            perl pgbadger --prefix '%m %u@%d %p %r %a : ' /pglog/postgresql.log
            # Log line prefix with syslog log output
            perl pgbadger --prefix 'user=%u,db=%d,client=%h,appname=%a' \

    Generate Tsung sessions XML file with select queries only:

        perl pgbadger -S -o sessions.tsung --prefix '%t [%p]: [%l-1] user=%u,db=%d ' /pglog/postgresql-9.1.log

    Reporting errors every week by cron job:

        30 23 * * 1 /usr/bin/pgbadger -q -w /var/log/postgresql.log -o /var/reports/pg_errors.html

    Generate report every week using incremental behavior:

        0 4 * * 1 /usr/bin/pgbadger -q `find /var/log/ -mtime -7 -name "postgresql.log*"` \
            -o /var/reports/pg_errors-`date +%F`.html -l /var/reports/pgbadger_incremental_file.dat

    This supposes that your log file and HTML report are also rotated every
    week.

    pgBadger is a PostgreSQL log analyzer built for speed with fully
    detailed reports from your PostgreSQL log file. It is a single, small
    Perl script that aims to replace and outperform the old PHP script
    pgFouine.

    By the way, we would like to thank Guillaume Smet for all the work he
    has done on this really nice tool. We've been using it for a long time;
    it is a really great tool!

    pgBadger is written in pure Perl. It uses a JavaScript library to draw
    graphs, so you don't need to install additional Perl modules or any
    other package. Furthermore, this library gives us additional features,
    such as zooming.

    pgBadger is able to autodetect your log file format (syslog, stderr or
    csvlog). It is designed to parse huge log files, as well as gzip, zip or
    bzip2 compressed files. See a complete list of features below.

    pgBadger reports everything about your SQL queries:

            Overall statistics.
            The slowest queries.
            Queries that took up the most time.
            The most frequent queries.
            The most frequent errors.

    The following reports are also available with hourly charts:

            Hourly queries statistics.
            Hourly temporary file statistics.
            Hourly checkpoints statistics.
            Locks statistics.
            Queries by type (select/insert/update/delete).
            Sessions per database/user/client.
            Connections per database/user/client.

    All charts are zoomable and can be saved as PNG images. SQL queries
    reported are highlighted and beautified automatically.

    pgBadger comes as a single Perl script: you do not need anything other
    than a modern Perl distribution. Charts are rendered using a JavaScript
    library, so nothing else needs to be installed. Your browser will do all
    the work.

    If you plan to parse PostgreSQL CSV log files you need one additional
    Perl module:

            Text::CSV_XS - to parse PostgreSQL CSV log files.

    This module is optional. If your PostgreSQL logs are not in the CSV
    format you don't need to install it.

    The compressed log file format is autodetected from the file extension.
    If pgbadger finds a gz extension it will use the zcat utility, with a bz2
    extension it will use bzcat, and if the file extension is zip then the
    unzip utility will be used.

    If those utilities are not found in the PATH environment variable then
    use the --zcat command line option to change this path. For example:

            --zcat="/usr/local/bin/gunzip -c" or --zcat="/usr/local/bin/bzip2 -dc"
            --zcat="C:\tools\unzip -p"
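
    The extension-based autodetection described above can be pictured as a
    simple lookup. The sketch below is not pgBadger's code, only a shell
    illustration of the mapping stated in the text; the function name
    pick_decompressor and the "cat" fallback for uncompressed files are
    assumptions made for the example.

```shell
# pick_decompressor FILE
# Print the utility that the text above associates with FILE's
# extension. The "cat" fallback for plain files is our own placeholder,
# not something pgBadger documents.
pick_decompressor() {
    case "$1" in
        *.gz)  echo "zcat" ;;
        *.bz2) echo "bzcat" ;;
        *.zip) echo "unzip" ;;
        *)     echo "cat" ;;
    esac
}
```

    Calling pick_decompressor on each log file before a run is a quick way
    to see which of the three utilities must be reachable through PATH.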

    By default pgBadger will use the zcat, bzcat and unzip utilities
    following the file extension. If you use the default autodetection of
    the compression format you can mix gz, bz2 and zip files. Specifying a
    custom value for the --zcat option disables this support for mixed
    compressed formats.

    You must enable some configuration directives in your postgresql.conf
    before starting.

    You must first enable SQL query logging to have something to parse:

            log_min_duration_statement = 0

    Note that pgBadger is not compatible with the statement logs produced by
    log_statement and log_duration. See the next chapter for more
    information.

    With 'stderr' log format, log_line_prefix must be at least:

            log_line_prefix = '%t [%p]: [%l-1] '

    Log line prefix could add user and database name as follows:

            log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '

    or for syslog log file format:

            log_line_prefix = 'user=%u,db=%d '

    Log line prefix for stderr output could also be:

            log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u '

    or for syslog output:

            log_line_prefix = 'db=%d,user=%u '
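
    To check that an existing stderr log already carries the minimal prefix
    '%t [%p]: [%l-1] ', something like the following sketch may help. It
    assumes that %t is rendered as "YYYY-MM-DD HH:MM:SS ZONE", %p as a
    numeric process ID and %l as a numeric session line number; the
    function name has_minimal_prefix is hypothetical.

```shell
# has_minimal_prefix LOGFILE
# Succeed if at least one line of LOGFILE starts with the minimal
# stderr prefix '%t [%p]: [%l-1] ', for example:
#   2012-06-25 10:56:11 CEST [4242]: [3-1] ...
# Assumption: the timestamp layout above; adjust the regex otherwise.
has_minimal_prefix() {
    grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [A-Za-z0-9+-]+ \[[0-9]+\]: \[[0-9]+-1\] ' "$1"
}
```

    Because the longer prefixes shown above start with the same fields, the
    check also accepts logs written with the user and database variants.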

    You need to enable other parameters in postgresql.conf to get more
    information from your log files:

            log_checkpoints = on
            log_connections = on
            log_disconnections = on
            log_lock_waits = on
            log_temp_files = 0

    Do not enable log_statement and log_duration, their log format will not
    be parsed by pgBadger.
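
    A small script can verify that a postgresql.conf contains the settings
    listed above. The following is only a sketch: the function name
    check_logging_conf is hypothetical, and the match is deliberately
    simple (it expects the "name = value" spelling and does not recognise
    equivalent values such as "true" for "on", nor settings applied
    through include files).

```shell
# check_logging_conf CONF_FILE
# Print one "missing: name = value" line for each recommended logging
# setting that has no uncommented "name = value" line in CONF_FILE.
# Returns 0 when everything is present, 1 otherwise.
check_logging_conf() {
    conf=$1
    status=0
    for setting in \
        "log_min_duration_statement=0" \
        "log_checkpoints=on" \
        "log_connections=on" \
        "log_disconnections=on" \
        "log_lock_waits=on" \
        "log_temp_files=0"
    do
        name=${setting%%=*}
        value=${setting#*=}
        # Tolerate spaces around "=", optional single quotes around
        # the value, and a trailing comment.
        if ! grep -Eq "^[[:space:]]*${name}[[:space:]]*=[[:space:]]*'?${value}'?([[:space:]]|#|\$)" "$conf"; then
            echo "missing: ${name} = ${value}"
            status=1
        fi
    done
    return $status
}
```

    Running check_logging_conf against your configuration file before
    collecting logs shows which reports would otherwise come out empty.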

    Of course your log messages should be in English without locale support:


    but this is not only recommended by pgbadger.

log_min_duration_statement versus log_duration
    If you want full statistics reports from your log file you must set
    log_min_duration_statement = 0. If you just want to report the duration
    and number of queries without all the detail about each query, set
    log_min_duration_statement to -1 and enable log_duration. If you want to
    report only queries that took more than 5 seconds, for example, but
    still want to report the duration and number of all queries, you will
    need to generate two reports: one using log_min_duration_statement and
    the other using log_duration. Proceed as follows:

            pgbadger --enable-log_duration /var/log/postgresql.log

    to generate hourly statistics about the number of queries and duration
    stats. To generate detailed reports about queries use the following:

            pgbadger --enable-log_min_duration /var/log/postgresql.log

    Note that enabling log_statement will not help at all, and passing both
    --enable-log_duration and --enable-log_min_duration in the same command
    will report an error.

    Use those options if you don't want to log every query, to preserve your
    I/O, but still want to know the slowest queries above a certain duration
    and still have a report on the number of queries and their duration.
    Otherwise, if logging with log_min_duration_statement set to 0 does not
    cost you too much performance, do not enable log_duration in your
    postgresql.conf file.
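
    The two-report workflow described in this chapter can be wrapped in a
    small shell function. This is a sketch under assumptions: it presumes
    the server logs with both log_duration enabled and a threshold such as
    log_min_duration_statement = 5000 (milliseconds), and the function
    name two_reports, the PGBADGER override variable and the output file
    names are hypothetical. Only options listed in this document are used.

```shell
# Path of the pgbadger script; overridable for testing. PGBADGER is a
# convenience variable of this sketch, not a pgBadger feature.
PGBADGER=${PGBADGER:-pgbadger}

# two_reports LOGFILE OUTDIR
# Run pgbadger twice on the same log, once per duration format.
two_reports() {
    logfile=$1
    outdir=$2
    # Hourly query counts and duration statistics (log_duration lines).
    "$PGBADGER" -q --enable-log_duration \
        -o "$outdir/durations.html" "$logfile" || return 1
    # Detailed reports on the slow queries
    # (log_min_duration_statement lines).
    "$PGBADGER" -q --enable-log_min_duration \
        -o "$outdir/slow_queries.html" "$logfile"
}
```

    Keeping the two runs separate respects the restriction that both
    options cannot be given in the same command.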

    Download the tarball from GitHub and unpack the archive as follows:

            tar xzf pgbadger-1.x.tar.gz
            cd pgbadger-1.x/
            perl Makefile.PL
            make && sudo make install

    This will copy the Perl script pgbadger to /usr/local/bin/pgbadger by
    default and the man page to /usr/local/share/man/man1/pgbadger.1. Those
    are the default installation directories for a 'site' install.

    If you want to install everything under /usr/, pass INSTALLDIRS='perl'
    as an argument to Makefile.PL. The script will be installed into
    /usr/bin/pgbadger and the manpage into /usr/share/man/man1/pgbadger.1.

    For example, to install everything just like Debian does, proceed as
    follows:

            perl Makefile.PL INSTALLDIRS=vendor

    By default INSTALLDIRS is set to site.

    pgBadger is an original work from Gilles Darold. It is maintained by the
    good folks at Dalibo and everyone who wants to contribute.

    pgBadger is free software distributed under the PostgreSQL Licence.