Commits on Jul 1, 2012
  1. Handle all OSError when spawning a collector.

    Recently ran into `OSError: [Errno 26] Text file busy' while
    live-editing a collector, which made the main thread die.  :(
    tsuna committed with tsuna Jul 1, 2012
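
For illustration, the kind of guard this commit describes might look like the sketch below. It is not tcollector's actual code: `spawn_collector` and the logger name are hypothetical, and the only assumption carried over from the message is that any `OSError` raised while spawning (including `Text file busy`) should be reported rather than left to kill the main thread.

```python
import logging
import subprocess

LOG = logging.getLogger("spawn_sketch")  # hypothetical logger name


def spawn_collector(path):
    """Start a collector script; return the Popen object, or None on failure.

    Hypothetical helper, not tcollector's real function.
    """
    try:
        return subprocess.Popen(
            [path],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except OSError as e:
        # Covers ENOENT, EACCES, ETXTBSY ("Text file busy"), and so on.
        LOG.error("failed to spawn collector %s: %s", path, e)
        return None
```

A caller would skip the collector when `None` comes back and retry it on a later pass.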
Commits on Jun 22, 2012
  1. Merging ZFS collectors for tcollector

    Manuel Amador (Rudd-O) committed Jun 22, 2012
Commits on Apr 24, 2012
  1. Handle errors in tcollector related to failure to spawn collectors.

    Manuel Amador (Rudd-O) committed with tsuna Apr 24, 2012
Commits on Mar 30, 2012
  1. Better check logging flags. Increase default log size.

    tsuna committed Mar 30, 2012
  2. Limit the length of a line read from collectors.

    TSD won't accept any data point that doesn't fit in 1024 bytes anyway,
    so we may as well drop them early while in tcollector.
    tsuna committed with tsuna Mar 27, 2012
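
A minimal sketch of the early drop described in item 2 above, assuming the limit applies to the encoded length of the line as emitted by the collector. `MAX_LINE_LEN` and `keep_short_lines` are hypothetical names; whether the real limit also counts a command prefix or the trailing newline is not stated in the commit message.

```python
MAX_LINE_LEN = 1024  # assumed cap, taken from the 1024-byte figure above


def keep_short_lines(lines, max_len=MAX_LINE_LEN):
    """Yield only the lines that fit within max_len bytes; drop the rest."""
    for line in lines:
        if len(line.encode("utf-8")) > max_len:
            # Too long for the TSD to accept, so discard it here.
            continue
        yield line
```

Dropping oversized lines in the reader keeps them out of any downstream queue.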
Commits on Mar 29, 2012
  1. Add RotatingFileHandler handler options to tcollector.

    This adds three new flags --max-bytes, --backup-count, and --logfile.
    Chad Rhyner committed with tsuna Mar 15, 2012
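
The three flags map naturally onto Python's `logging.handlers.RotatingFileHandler`. The sketch below shows one way to wire them up; it uses `argparse` for brevity, and the default values, logger name, and function names are illustrative assumptions rather than tcollector's own.

```python
import argparse
import logging
import logging.handlers


def build_parser():
    """Parser with the three flags named in the commit message."""
    parser = argparse.ArgumentParser()
    # Defaults below are illustrative, not tcollector's real defaults.
    parser.add_argument("--logfile", default="tcollector.log")
    parser.add_argument("--max-bytes", type=int, default=64 * 1024 * 1024)
    parser.add_argument("--backup-count", type=int, default=1)
    return parser


def setup_logging(options):
    """Attach a size-based rotating file handler to a logger."""
    handler = logging.handlers.RotatingFileHandler(
        options.logfile,
        maxBytes=options.max_bytes,
        backupCount=options.backup_count,
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s: %(message)s"))
    logger = logging.getLogger("tcollector_sketch")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

With `--backup-count 3`, the handler keeps up to three rotated files alongside the current log file.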
Commits on Mar 27, 2012
  1. Properly handle JMX TabularData.

    tsuna committed Mar 27, 2012
  2. Don't double-print the value when dealing with arrays.

    tsuna committed Mar 27, 2012
Commits on Mar 9, 2012
  1. Close stdin if we don't need it.

    tsuna committed Mar 9, 2012
  2. Rename a service introduced in HBase 0.92.1.

    tsuna committed Mar 9, 2012
  3. Catch some more invalid lines and report them instead of dying.

    tsuna committed with tsuna Mar 9, 2012
Commits on Feb 8, 2012
  1. Fix jmx path

    code left in during testing
    davebarr committed Feb 8, 2012
  2. Add hadoop datanode collector

    davebarr committed with tsuna Feb 8, 2012
Commits on Oct 17, 2011
  1. Handle uncaught exceptions in the SenderThread.

    Allow up to 100 uncaught exceptions in a row for common kinds of
    exceptions that aren't too bad.  Other exceptions, or an excessive
    number of uncaught exceptions, will cause tcollector to shut down.
    All uncaught exceptions in the SenderThread are now logged.
    tsuna committed Oct 17, 2011
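
The policy described above can be sketched as a loop that counts consecutive failures. This is only an illustration: `run_sender`, `step`, and `should_stop` are hypothetical names, and the `TOLERATED` tuple is an assumption, since the commit message does not say which exception types count as "not too bad".

```python
import logging
import socket

LOG = logging.getLogger("sender_sketch")  # hypothetical logger name
MAX_UNCAUGHT = 100  # limit named in the commit message
TOLERATED = (socket.error, ValueError)  # assumption: the "common" kinds


def run_sender(step, should_stop):
    """Call step() until should_stop() returns true.

    Tolerated exceptions are logged and retried up to MAX_UNCAUGHT times
    in a row; anything else, or one failure too many, is re-raised.
    """
    errors = 0
    while not should_stop():
        try:
            step()
            errors = 0  # a successful pass resets the streak
        except TOLERATED:
            errors += 1
            LOG.exception("uncaught exception in sender (%d in a row)", errors)
            if errors > MAX_UNCAUGHT:
                raise
        except Exception:
            LOG.exception("unexpected exception in sender, giving up")
            raise
```

Resetting the counter after each successful pass is what makes the limit apply to failures "in a row" and not to the lifetime total.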
Commits on Oct 13, 2011
  1. Add a Redis collector.

    This collector gathers data from local Redis servers.  This requires
    the Redis module for Python.  We use netstat to look for 'redis-server'
    processes running on the local machine, since many people run multiple
    Redis servers per box.
    It is also suggested you put a hint in your Redis configuration file to
    tell this collector a logical 'cluster' name.  This helps if you have
    several Redis instances on different hosts and you want to be able to
    aggregate the data.
    zorkian committed with tsuna Oct 10, 2011
  2. Add a Riak collector.

    This collector is for the Riak distributed database.  It uses the stats
    JSON object to parse out data and create some timeseries.
    This expects /usr/lib/riak to exist (it does by default) and it uses the
    default ports.  This expects you to only be running one Riak instance on
    a machine.  This also only collects stats from the local machine -- you
    will need to run collectors on every machine you use as a Riak node.
    zorkian committed with tsuna Oct 10, 2011
  3. Only print slave status if we have slaves in our setup.

    This closes #24.
    Alex Newman committed with tsuna Oct 7, 2011
Commits on Sep 17, 2011
  1. MySQL collectors: ignore search directories that don't exist.

    tsuna committed Sep 17, 2011
  2. Fix a couple of variable names broken in the last change.

    Yay for unsafe languages that blow up at the last minute.
    tsuna committed Sep 17, 2011
Commits on Sep 16, 2011
  1. IPv6 support: Use `getaddrinfo' to resolve the TSD's host.

    This way it works with IPv6 hosts and can work with DNS entries that
    have multiple A or AAAA records.
    spark404 committed with tsuna Sep 16, 2011
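
As a rough sketch of the approach, `socket.getaddrinfo` with `AF_UNSPEC` returns every IPv4 and IPv6 candidate for a host, and the caller tries each in turn. The function names below are hypothetical, and the port used in the usage note is only an example value.

```python
import socket


def candidate_addresses(host, port):
    """Return (family, socktype, proto, sockaddr) tuples for host:port."""
    results = socket.getaddrinfo(
        host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    return [(family, socktype, proto, sockaddr)
            for family, socktype, proto, _canonname, sockaddr in results]


def connect_to_tsd(host, port, timeout=5.0):
    """Try each resolved address in order; return the first connected socket."""
    last_err = None
    for family, socktype, proto, sockaddr in candidate_addresses(host, port):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as e:
            # Remember the failure and fall through to the next address.
            last_err = e
            if sock is not None:
                sock.close()
    raise last_err or OSError("no addresses found for %s" % host)
```

For example, `connect_to_tsd("localhost", 4242)` would try `::1` and `127.0.0.1` in whatever order the resolver returns them.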
Commits on Aug 21, 2011
  1. Collect more internal metrics from InnoDB.

    Run `SHOW ENGINE INNODB STATUS' and parse the output to extract some
    of the metrics.  Some InnoDB metrics are exposed in SHOW GLOBAL STATUS,
    but many are not.
    This adds 14 new metrics for InnoDB.  We can add more in the future.
    tsuna committed with tsuna Aug 13, 2011
  2. Refresh the timestamp more frequently.

    This is in case a command takes a significant amount of time.
    tsuna committed with tsuna Aug 13, 2011
  3. Detect when the InnoDB engine is used.

    tsuna committed with tsuna Aug 12, 2011
  4. Fix the metric name used for InnoDB mutex locks.

    The metric name ought to be `mysql.innodb.locks'.
    tsuna committed with tsuna Aug 12, 2011
Commits on Aug 17, 2011
  1. Handle the output of "SHOW PROCESSLIST" from MySQL 5.5.

    New columns have been added in 5.5, just ignore them.
    tsuna committed with tsuna Aug 17, 2011
Commits on Aug 16, 2011
  1. Make sure there are no spaces in the `state' tag.

    tsuna committed with tsuna Aug 16, 2011
Commits on Aug 12, 2011
  1. Add a basic collector for ElasticSearch.

    The collector comes with 38 metrics about ElasticSearch server instances
    as well as 8 additional cluster-wide metrics collected from the master
    node.  Most metrics are system-level metrics, because right now ES
    doesn't have many serving statistics.  We don't collect per-index
    metrics at this time, because many indices are named dynamically and
    we would need a way of canonicalizing index names.
    tsuna committed with tsuna Aug 11, 2011
  2. Add a collector for MySQL.

    The collector includes about 300 metrics about MySQL (when InnoDB
    is used).  Most metrics are collected through `SHOW GLOBAL STATUS'.
    The collector has a configuration file, `', in which
    the user / password to use to connect to MySQL must be specified.
    The collector has limited support for MySQL 5.0, because in that
    version of MySQL running the command `SHOW GLOBAL STATUS' has a
    big impact on the performance of the database.  Hopefully almost
    everyone uses at least MySQL 5.1 these days.
    tsuna committed Aug 12, 2011
Commits on Jul 14, 2011
  1. Add the ability to pass additional tags with `startstop'.

    It may be necessary to pass additional tags when running tcollector.
    In our case we are monitoring host level OpenStack systems, and
    want to roll up into availability zones and hypervisor type:
        ./startstop start -t az=paloalto0 hv=kvm
    retr0h committed with tsuna Jul 14, 2011
Commits on Jun 21, 2011
  1. Fix issue #2 using ALIVE flag for graceful termination of threads.

    Nikolay Botev committed with tsuna May 18, 2011
Commits on Jun 17, 2011
  1. Use '/' in tag values for mount points.

    Implements OpenTSDB feature request #14.
    Jari Takkala committed with tsuna Mar 31, 2011
  2. Allow `/' in metric names, tag keys and values.

    Jari Takkala committed with tsuna Apr 4, 2011
  3. Evict old keys from the de-dup cache.

    For every combination of (metric, tags), collectors remember what was
    the last value they saw so that they can remove duplicate values (or,
    in the future, perform run-length encoding).  If there are loads of different
    combinations of (metric, tags) changing over time, this can lead to
    excessive memory consumption because old values are never evicted from
    this "de-dup cache".
    This change adds a flag, --evict-interval (set to 6000 seconds = 1h40m
    by default), to put an upper bound on how long a collector will remember
    the last value seen for a specific combination of (metric, tags).
    This change is based on a contribution of Kai Ren <kair at>.
    tsuna committed Jun 16, 2011
  4. Simplify a call to pgrep.

    tsuna committed with tsuna Jun 15, 2011
  5. Fix restart in case pidfile is stale

    If the pidfile is stale, tcollector won't get restarted
    if the pid got reused.
    Don't rely on a simplistic check of whether the pid is running;
    always use the pgrep check instead.
    Also stop blindly trusting the pidfile in stop().
    Should just get rid of using the pidfile, but currently
    it's required as part of  The remaining
    use case is the forcerestart logic, but we could get
    the same functionality from looking at start time from
    davebarr committed with tsuna Jun 15, 2011
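
The de-dup cache eviction described in item 3 above can be illustrated with a small sketch. `DedupCache` and its methods are hypothetical names, not tcollector's own. The 6000-second default comes from the commit message, and the explicit `now` argument is only there to make the behaviour easy to exercise.

```python
import time


class DedupCache(object):
    """Remember the last value per (metric, tags) and forget stale entries."""

    def __init__(self, evict_interval=6000):
        self.evict_interval = evict_interval  # seconds
        self._last = {}  # (metric, tags) -> (value, time last seen)

    def is_duplicate(self, metric, tags, value, now=None):
        """Record the value; return True if it repeats the previous one."""
        now = time.time() if now is None else now
        key = (metric, tags)  # tags must be hashable, e.g. a string
        prev = self._last.get(key)
        self._last[key] = (value, now)
        return prev is not None and prev[0] == value

    def evict_old_keys(self, now=None):
        """Drop keys not seen within evict_interval; return how many."""
        now = time.time() if now is None else now
        cutoff = now - self.evict_interval
        stale = [k for k, (_v, seen) in self._last.items() if seen < cutoff]
        for k in stale:
            del self._last[k]
        return len(stale)
```

Calling `evict_old_keys()` periodically bounds memory use by the number of (metric, tags) combinations seen within the interval instead of the number ever seen.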