
Single server VictoriaMetrics

Aliaksandr Valialkin edited this page Mar 18, 2019 · 28 revisions

Free single-node VictoriaMetrics

VictoriaMetrics is a long-term remote storage for Prometheus. It is available in binary releases and in docker images.

The single-node version supports all the SaaS features except:

  • Clustering - it runs on a single server
  • Multi-tenancy - it stores all the data in a single datastore with the given -retentionPeriod

Prominent features

  • Supports the Prometheus querying API, so it can be used as a Prometheus datasource in Grafana. Additionally, VictoriaMetrics extends PromQL with useful opt-in features.
  • High performance and good scalability for both inserts and selects. Outperforms InfluxDB and TimescaleDB.
  • Uses 10x less RAM than InfluxDB when working with millions of unique time series (aka high cardinality).
  • High data compression, so more metric values fit into limited storage compared to TimescaleDB and InfluxDB.
  • Optimized for storage with high-latency IO and low iops (HDD and network storage in AWS, Google Cloud, Microsoft Azure, etc). See graphs from these benchmarks.
  • Easy operation:
    • VictoriaMetrics consists of a single executable without external dependencies.
    • All the configuration is done via explicit command-line flags.
    • All the data is stored in a single directory pointed to by the -storageDataPath flag.
    • Easy backups from instant snapshots.
  • Storage is protected from corruption on unclean shutdown (i.e. hardware reset or kill -9) thanks to the architecture.
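  • Supports metrics ingestion via the protocols described in the sections below: Prometheus remote write, InfluxDB line protocol (e.g. from Telegraf) and Graphite plaintext protocol (e.g. from StatsD).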

Operation


How to start VictoriaMetrics

Just start the VictoriaMetrics executable or docker image with the desired command-line flags.

The following command line flags are used the most:

  • -storageDataPath - path to data directory. VictoriaMetrics stores all the data in this directory.
  • -retentionPeriod - retention period in months for the data. Older data is automatically deleted.
  • -httpListenAddr - TCP address to listen on for http requests. By default it listens on port 8428 on all the network interfaces.
  • -graphiteListenAddr - TCP and UDP address to listen on for Graphite data. By default it is disabled.

Pass -help to see all the available flags with their descriptions and default values.

Prometheus setup

Add the following lines to Prometheus config file (it is usually located at /etc/prometheus/prometheus.yml):

remote_write:
  - url: http://<victoriametrics-addr>:8428/api/v1/write
    queue_config:
      max_samples_per_send: 10000

Substitute <victoriametrics-addr> with the hostname or IP address of VictoriaMetrics. Then apply the new config via the following command:

kill -HUP `pidof prometheus`

Prometheus writes incoming data to local storage and to remote storage in parallel. This means the data remains available in local storage for --storage.tsdb.retention.time duration if remote storage stops working.

If you plan to send data to VictoriaMetrics from multiple Prometheus instances, then add the following lines to the global section of the Prometheus config:

global:
  external_labels:
    datacenter: dc-123

This instructs Prometheus to add datacenter=dc-123 label to each time series sent to remote storage. The label name may be arbitrary - datacenter is just an example. The label value must be unique across Prometheus instances, so time series may be filtered and grouped by this label.

Grafana setup

Create a Prometheus datasource in Grafana with the following URL:

http://<victoriametrics-addr>:8428

Substitute <victoriametrics-addr> with the hostname or IP address of VictoriaMetrics.

Then build graphs with the created datasource using Prometheus query language. VictoriaMetrics supports native PromQL and extends it with useful features.

How to send data from InfluxDB-compatible agents such as Telegraf?

Just use the http://<victoriametrics-addr>:8428 url instead of the InfluxDB url in agents' configs. For instance, put the following lines into the Telegraf config, so it sends data to VictoriaMetrics instead of InfluxDB:

[[outputs.influxdb]]
  urls = ["http://<victoriametrics-addr>:8428"]

Do not forget to substitute <victoriametrics-addr> with the real address where VictoriaMetrics runs.
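For agents that are not Telegraf, the same idea can be sketched by hand. The example below is a minimal illustration, assuming VictoriaMetrics accepts InfluxDB line protocol on the /write path that InfluxDB clients normally post to. The helper names influx_line and send_lines are hypothetical, and the formatter does no escaping of spaces or commas in names and values.

```python
import urllib.request


def influx_line(measurement, tags, fields, timestamp_ns=None):
    """Format one data point in InfluxDB line protocol (no escaping)."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    line = f"{measurement}{tag_part} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


def send_lines(base_url, lines):
    """POST newline-separated points to <base_url>/write (assumed path)."""
    body = "\n".join(lines).encode("utf-8")
    req = urllib.request.Request(
        base_url.rstrip("/") + "/write", data=body, method="POST"
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

Calling send_lines("http://<victoriametrics-addr>:8428", [influx_line("cpu", {"host": "web1"}, {"usage": 0.5})]) would send a single point once the placeholder is replaced with a real address.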

How to send data from Graphite-compatible agents such as StatsD?

  1. Enable Graphite receiver in VictoriaMetrics by setting -graphiteListenAddr command line flag. For instance, the following command will enable Graphite receiver in VictoriaMetrics on TCP and UDP port 2003:
/path/to/victoria-metrics-prod ... -graphiteListenAddr=:2003
  2. Use the configured address in Graphite-compatible agents. For instance, set graphiteHost to the VictoriaMetrics host in StatsD configs.
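To check the Graphite receiver without StatsD, a data point can be sent by hand. This is a rough sketch, assuming the receiver understands the standard Graphite plaintext format ("path value timestamp" per line) over TCP. The function names and the metric path in the usage note are made up for illustration.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one Graphite plaintext line; timestamp is in Unix seconds."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_graphite(host, port, lines):
    """Send already formatted lines to the Graphite receiver over TCP."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall("".join(lines).encode("ascii"))
```

For instance, send_graphite("<victoriametrics-addr>", 2003, [graphite_line("servers.web1.load", 0.42)]) would push one sample to the port configured via -graphiteListenAddr=:2003.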

How to apply new config / upgrade VictoriaMetrics?

VictoriaMetrics must be restarted in order to upgrade or apply new config:

  1. Send SIGINT signal to VictoriaMetrics process in order to gracefully stop it.
  2. Wait until the process stops. This can take a few seconds.
  3. Start the upgraded VictoriaMetrics with new config.
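The three steps above could be scripted roughly as follows. This is only a sketch, assuming a POSIX system and a VictoriaMetrics process that was started as a child of the script via subprocess.Popen. The restart helper and the command line passed to it are hypothetical; a service manager would normally handle this instead.

```python
import signal
import subprocess


def restart(proc, new_cmd, stop_timeout=60.0):
    """Gracefully stop `proc`, then start `new_cmd` and return the new process."""
    # Step 1: ask the running process to stop gracefully.
    proc.send_signal(signal.SIGINT)
    # Step 2: wait until it exits; raises subprocess.TimeoutExpired otherwise.
    proc.wait(timeout=stop_timeout)
    # Step 3: start the upgraded binary with the new flags.
    return subprocess.Popen(new_cmd)
```

Here new_cmd would be a list such as ["/path/to/victoria-metrics-prod", "-storageDataPath=...", ...] with the real flags filled in.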

How to work with snapshots?

Navigate to http://<victoriametrics-addr>:8428/snapshot/create in order to create an instant snapshot. The page will return the following JSON response:

{"status":"ok","snapshot":"<snapshot-name>"}

Snapshots are created under <-storageDataPath>/snapshots directory, where <-storageDataPath> is the command-line flag value. Snapshots can be archived to backup storage via rsync -L, scp -r or any similar tool that follows symlinks during copying.

The http://<victoriametrics-addr>:8428/snapshot/list page contains the list of available snapshots.

Navigate to http://<victoriametrics-addr>:8428/snapshot/delete?snapshot=<snapshot-name> in order to delete <snapshot-name> snapshot.

Navigate to http://<victoriametrics-addr>:8428/snapshot/delete_all in order to delete all the snapshots.
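Snapshot creation is easy to automate from a backup script. The sketch below assumes the /snapshot/create response has exactly the JSON shape shown above and treats any status other than "ok" as a failure, which is an assumption. The function names are illustrative.

```python
import json
import urllib.request


def parse_snapshot_response(body):
    """Return the snapshot name from a /snapshot/create response body."""
    reply = json.loads(body)
    if reply.get("status") != "ok":
        raise RuntimeError(f"snapshot creation failed: {reply}")
    return reply["snapshot"]


def create_snapshot(base_url):
    """Request a new snapshot and return its name."""
    url = base_url.rstrip("/") + "/snapshot/create"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_snapshot_response(resp.read())
```

The returned name identifies the directory under <-storageDataPath>/snapshots that can then be copied with rsync -L or a similar tool, and it is the value to pass as snapshot=<snapshot-name> when deleting the snapshot afterwards.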

How to delete time series?

Send a request to http://<victoriametrics-addr>:8428/api/v1/admin/tsdb/delete_series?match[]=<timeseries_selector_for_delete>, where <timeseries_selector_for_delete> may contain any time series selector for metrics to delete. After that all the time series matching the given selector are deleted. Storage space for the deleted time series isn't freed instantly - it is freed during subsequent merges of data files.
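Selectors usually contain characters such as {, } and " that must be percent-encoded in the query string. A small sketch of building the delete URL with the standard library follows; delete_series_url is a hypothetical helper, and it only builds the URL without sending anything.

```python
from urllib.parse import urlencode


def delete_series_url(base_url, selector):
    """Build the delete_series URL with a percent-encoded match[] selector."""
    query = urlencode({"match[]": selector})
    return f"{base_url.rstrip('/')}/api/v1/admin/tsdb/delete_series?{query}"
```

For example, the selector up{job="node_exporter"} becomes match%5B%5D=up%7Bjob%3D%22node_exporter%22%7D in the resulting URL.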

How to export time series?

Send a request to http://<victoriametrics-addr>:8428/api/v1/export?match[]=<timeseries_selector_for_export>, where <timeseries_selector_for_export> may contain any time series selector for metrics to export. The response contains all the data for the selected time series in JSON streaming format. Each JSON line contains the data for a single time series. An example output:

{"metric":{"__name__":"up","job":"node_exporter","instance":"localhost:9100"},"values":[0,0,0],"timestamps":[1549891472010,1549891487724,1549891503438]}
{"metric":{"__name__":"up","job":"prometheus","instance":"localhost:9090"},"values":[1,1,1],"timestamps":[1549891461511,1549891476511,1549891491511]}

Optional start and end args may be added to the request in order to limit the time frame for the exported data. Each of these args may contain either a unix timestamp in seconds or an RFC3339 value.
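Because each line of the export is an independent JSON object, the output can be processed as a stream. The following sketch assumes the line format shown in the example output above, with timestamps in milliseconds as they appear there. parse_export is an illustrative name.

```python
import json


def parse_export(lines):
    """Yield (labels, [(timestamp_ms, value), ...]) for each exported series."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between series
        series = json.loads(line)
        points = list(zip(series["timestamps"], series["values"]))
        yield series["metric"], points
```

The function accepts any iterable of text lines, so it works on an open file with a saved export as well as on a decoded HTTP response read line by line.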

Federation

VictoriaMetrics exports Prometheus-compatible federation data at http://<victoriametrics-addr>:8428/federate?match[]=<timeseries_selector_for_federation>.

Optional start and end args may be added to the request in order to scrape the last point for each selected time series on the [start ... end] interval. start and end may contain either a unix timestamp in seconds or an RFC3339 value. By default the last point on the interval [now - max_lookback ... now] is scraped for each time series. The default value for max_lookback is 5m (5 minutes), but it can be overridden. For instance, /federate?match[]=up&max_lookback=1h would return the last points on the [now - 1h ... now] interval. This may be useful for time series federation with scrape intervals exceeding 5m.
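A federation URL with the optional args can be assembled as in the sketch below. Only the args named above (match[], start, end, max_lookback) are used; federate_url itself is a hypothetical helper that builds the URL without issuing a request.

```python
from urllib.parse import urlencode


def federate_url(base_url, selector, max_lookback=None, start=None, end=None):
    """Build a /federate URL, adding only the optional args that are set."""
    params = [("match[]", selector)]
    optional = (("max_lookback", max_lookback), ("start", start), ("end", end))
    for name, value in optional:
        if value is not None:
            params.append((name, value))
    return f"{base_url.rstrip('/')}/federate?{urlencode(params)}"
```

With selector "up" and max_lookback="1h" this produces the percent-encoded equivalent of the /federate?match[]=up&max_lookback=1h example.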

Capacity planning

Rough estimation of the required resources:

  • RAM size: less than 1KB per active time series. So, ~1GB of RAM is required for 1M active time series. A time series is considered active if new data points have been added to it recently or if it has been recently queried. VictoriaMetrics stores various caches in RAM. Memory size for these caches may be limited with the -memory.allowedPercent flag.
  • CPU cores: a CPU core per 300K inserted data points per second. So, ~4 CPU cores are required for processing the insert stream of 1M data points per second.
  • Storage size: less than a byte per data point on average. So, ~260GB is required for storing a month-long insert stream of 100K data points per second. The actual storage size heavily depends on data randomness (entropy). Higher randomness means higher storage size requirements.
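The rules of thumb above translate into simple arithmetic. The sketch below uses the upper bounds from the list (1KB of RAM per active series, 300K inserted points per second per core, 1 byte per stored point); the function name and the returned keys are illustrative, and real usage depends on the data.

```python
import math


def estimate_resources(active_series, points_per_second, retention_days):
    """Rough upper-bound estimate based on the rules of thumb above."""
    # RAM: less than 1KB per active time series.
    ram_gb = active_series * 1_000 / 1e9
    # CPU: one core per 300K inserted data points per second, rounded up.
    cpu_cores = math.ceil(points_per_second / 300_000)
    # Storage: less than one byte per data point over the whole retention.
    storage_gb = points_per_second * 86_400 * retention_days / 1e9
    return {"ram_gb": ram_gb, "cpu_cores": cpu_cores, "storage_gb": storage_gb}
```

For 1M active series, 100K points per second and 30 days of retention this gives about 1GB of RAM, 1 core and 259.2GB of storage, in line with the ~260GB figure quoted above.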

High availability

  1. Install multiple VictoriaMetrics instances in distinct datacenters.
  2. Add addresses of these instances to remote_write section in Prometheus config:
remote_write:
  - url: http://<victoriametrics-addr-1>:8428/api/v1/write
    queue_config:
      max_samples_per_send: 10000
  # ...
  - url: http://<victoriametrics-addr-N>:8428/api/v1/write
    queue_config:
      max_samples_per_send: 10000
  3. Apply the updated config:
kill -HUP `pidof prometheus`
  4. Now Prometheus should write data into all the configured remote_write urls in parallel.
  5. Set up Promxy in front of all the VictoriaMetrics replicas.
  6. Set up Prometheus datasource in Grafana that points to Promxy.

Multiple retentions

Just start multiple VictoriaMetrics instances with distinct values for the following flags:

  • -retentionPeriod
  • -storageDataPath, so the data for each retention period is saved in a separate directory
  • -httpListenAddr, so clients may reach VictoriaMetrics instance with proper retention
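As an illustration, the command lines for several instances could be generated as below. This is a sketch only: the directory layout under data_root and the port numbers in the usage note are arbitrary choices, and instance_args is a hypothetical helper. The flags are the three listed above, written in the -flag=value form.

```python
def instance_args(binary, retention_months, data_root, port):
    """Build the command line for one instance with its own retention."""
    return [
        binary,
        f"-retentionPeriod={retention_months}",
        # A separate data directory per retention (naming is arbitrary).
        f"-storageDataPath={data_root}/retention-{retention_months}m",
        # A separate port so clients can pick the instance they need.
        f"-httpListenAddr=:{port}",
    ]
```

For example, instance_args("/path/to/victoria-metrics-prod", 1, "/var/lib/vm", 8428) and instance_args("/path/to/victoria-metrics-prod", 12, "/var/lib/vm", 8429) describe one instance keeping a month of data and another keeping a year.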

Scalability and cluster version

Though single-node VictoriaMetrics cannot scale to multiple nodes, it is optimized for resource usage - storage size / bandwidth / IOPS, RAM, CPU. This means that a single-node VictoriaMetrics may scale vertically and substitute for a moderately sized cluster built with competing solutions such as Thanos, Uber M3, InfluxDB or TimescaleDB.

So try single-node VictoriaMetrics first and then contact us for the cluster version of VictoriaMetrics if you still need horizontally scalable long-term remote storage for really large Prometheus deployments.
