Distributed, Fault-tolerant Metrics Processing
Blueflood is an open-source distributed system designed to ingest and process time series data generated by other systems. It was created by the Rackspace Monitoring team to manage raw metrics generated by the Rackspace Monitoring system.
Data from Blueflood can be used to build dashboards, generate reports and graphs, or serve any other use involving time series data.
Written in Java, Blueflood exists as a cluster of distributed services. Data is stored using Cassandra. Blueflood is fault-tolerant and highly available.
What does Blueflood do?
Blueflood has three primary responsibilities: metric ingestion, rollup consolidation and data query.
Data arrives in the system and moves through a series of transforms until it is written to the database.
Rollups process full-resolution data into coarser granularities of 5 minutes, 20 minutes, 60 minutes, 4 hours and 24 hours.
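To make the rollup idea concrete, here is a minimal sketch in Python of how full-resolution points might be consolidated into fixed windows. It is not Blueflood's implementation: the function name `rollup`, the `GRANULARITIES` labels, and the choice of summary statistics (count, min, max, average) are assumptions made for illustration. Only the window sizes come from the text above.

```python
from collections import defaultdict

# Window sizes named in the text, in seconds. The labels are hypothetical.
GRANULARITIES = {
    "5m": 5 * 60,
    "20m": 20 * 60,
    "60m": 60 * 60,
    "240m": 4 * 60 * 60,
    "1440m": 24 * 60 * 60,
}


def rollup(points, window_seconds):
    """Group (timestamp_seconds, value) pairs into fixed windows.

    Returns a dict mapping each window's start time to a summary of the
    values that fell inside it. The summary fields are an assumption.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align the timestamp to the start of its window.
        bucket_start = ts - (ts % window_seconds)
        buckets[bucket_start].append(value)

    return {
        start: {
            "count": len(vals),
            "min": min(vals),
            "max": max(vals),
            "average": sum(vals) / len(vals),
        }
        for start, vals in sorted(buckets.items())
    }


# Five raw points spanning two 5-minute windows.
raw = [(0, 1.0), (30, 3.0), (299, 5.0), (300, 10.0), (330, 20.0)]
five_minute = rollup(raw, GRANULARITIES["5m"])
```

In this sketch the first three points land in the window starting at 0 and the last two in the window starting at 300. The same function can be called with any of the coarser window sizes.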
Blueflood offers a flexible query model so that you can retrieve the data you need for your specific use case.
See Data Query for details.
What does Blueflood not do?
Blueflood is not a data warehouse or ETL system
Data warehouses are designed to hold data for a long time and perform high latency batch processes against the data. Blueflood, on the other hand, keeps data for a fixed period of time before purging it automatically. This allows Blueflood to operate in near real time. The length of time Blueflood keeps data is configurable and is longer for coarser granularities.
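The retention behavior described above can be sketched as a per-granularity time-to-live check. This is an illustration only: the retention values in `RETENTION_DAYS` and the helper `is_expired` are hypothetical, and the actual periods and purge mechanism depend on how Blueflood is configured. The one property taken from the text is that coarser granularities are kept longer.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical retention periods in days, growing with coarser granularity.
# Real values come from Blueflood's configuration, not from this table.
RETENTION_DAYS = {
    "full": 7,
    "5m": 30,
    "20m": 60,
    "60m": 90,
    "240m": 180,
    "1440m": 365,
}


def is_expired(granularity, written_at, now):
    """Return True if data written at `written_at` has outlived its retention.

    Both timestamps are in seconds.
    """
    ttl_seconds = RETENTION_DAYS[granularity] * SECONDS_PER_DAY
    return now - written_at >= ttl_seconds


# Ten days after a write, full-resolution data is past its (assumed) 7-day
# retention, while the 5-minute rollup of the same data is still kept.
now = 10 * SECONDS_PER_DAY
full_expired = is_expired("full", 0, now)
five_minute_expired = is_expired("5m", 0, now)
```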
Blueflood does not have a UI component
Blueflood is a time-series datastore with APIs. It currently does not have a UI component.
Instead, Blueflood integrates with existing metrics UIs, notably Grafana and Graphite (see Blueflood plugin for Grafana).
Blueflood is not a logging service
Blueflood supports annotations and numeric values. It is not a service to which you can send log messages or status codes.