Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add VictoriaMetrics support #96

Merged
merged 15 commits into from
May 11, 2020
Merged

Conversation

hagen1778
Copy link
Contributor

This PR adds support for opensource timeseries database VictoriaMetrics.

VM supports InfluxDB ingestion protocol, so tsbs_load_victoriametrics is very similar to tsbs_load_influx.

VictoriaMetrics supports PromQL which has some limitations comparing to SQL. Because of this VictoriaMetrics query generator lacks for implementation of query types groupby-orderby-limit and lastpoint for devops use-case and all queries for iot use-case. But for other cases it still may be comparable with other databases.

We also believe this PR may help with adding of Prometheus to the list of supported databases.

* add VictoriaMetrics generator

* add VictoriaMetrics loader

* add VictoriaMetrics query generator

* support multiple URLs for loading data into VM

* fix loader workers support

* add VictoriaMetrics query runner

* fix query generator

* use default HTTP client

* mention VictoriaMetrics in README

* add load script and extend documentation

* address review comments
@codecov-io
Copy link

codecov-io commented Oct 4, 2019

Codecov Report

Merging #96 into master will increase coverage by 0.17%.
The diff coverage is 74.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #96      +/-   ##
==========================================
+ Coverage   69.55%   69.72%   +0.17%     
==========================================
  Files         104      110       +6     
  Lines        5810     5982     +172     
==========================================
+ Hits         4041     4171     +130     
- Misses       1690     1722      +32     
- Partials       79       89      +10
Impacted Files Coverage Δ
internal/inputs/utils.go 78.26% <ø> (ø) ⬆️
cmd/tsbs_load_victoriametrics/creator.go 0% <0%> (ø)
internal/inputs/generator_data.go 88.96% <100%> (+0.14%) ⬆️
cmd/tsbs_load_victoriametrics/main.go 29.03% <29.03%> (ø)
internal/inputs/generator_queries.go 72.45% <33.33%> (-0.72%) ⬇️
cmd/tsbs_load_victoriametrics/processor.go 73.91% <73.91%> (ø)
cmd/tsbs_load_victoriametrics/scan.go 75% <75%> (ø)
...nerate_queries/databases/victoriametrics/common.go 91.3% <91.3%> (ø)
...nerate_queries/databases/victoriametrics/devops.go 96.66% <96.66%> (ø)
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5c4f0c1...f3e0002. Read the comment docs.

@RobAtticus
Copy link
Member

Hi, thanks for this, we'll put it on our radar to review!

@tenmozes
Copy link

@RobAtticus could you please take a look?

@RobAtticus
Copy link
Member

I've been out at conferences, so I might be able to next week. @blagojts if you have some spare cycles, make take an initial pass?

@syepes
Copy link

syepes commented Oct 29, 2019

+1

� Conflicts:
�	internal/inputs/utils.go
@hagen1778
Copy link
Contributor Author

Hi @RobAtticus! Any chance to get it reviewed?

@tenmozes
Copy link

tenmozes commented Dec 5, 2019

Hi folks, any updates on this PR?

� Conflicts:
�	README.md
�	internal/inputs/generator_queries.go
�	internal/inputs/utils.go
@hagen1778
Copy link
Contributor Author

Hello! Any updates on this?

@freeseacher
Copy link

Hi! any news ?

// {__name__=~'cpu_.*', hostname=~"hostname1|hostname2...|hostnameN"}[12h]
// )
// ) by (hostname) > 90
func (d *Devops) HighCPUForHosts(qq query.Query, nHosts int) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be returning very different data than infuxdb -- the 12h grouping here isn't present in the equivalent influx or timescale query. Thus those systems return several hundred MB of JSON whereas Victoria is returning only two data points per host, making it way faster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing! I removed this case since it is unclear how to support it via MetricsQL/PromQL. The query model assumes step (grouping by time) to be always set. So to satisfy this case we need to pass log-interval flag value used for generating data which we don't know when generating queries. I hope other cases are consistent and comparable with other databases. Thank you again!

…atabase since it can't be supported via MetricsQL query language due to data model specifics
Copy link
Member

@RobAtticus RobAtticus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, let me apologize for the delay in reviewing this. I've been focused on a different effort at the company so this stuff unfortunately did not make it to the top of my queue, but I should have found someone to do it. Thanks for dealing with the other comments people had.

I just have a few minor suggestions, mostly around documentation.

README.md Outdated

¹ Does not support the `groupby-orderby-limit` query
² Does not support the `groupby-orderby-limit` and `lastpoint` queries
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to include that it does not support high-cpu-1 and high-cpu-all

@@ -0,0 +1,127 @@
// tsbs_run_queries_clickhouse speed tests ClickHouse using requests from stdin or file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment needs to be updated

types for `devops` use-case:
* `groupby-orderby-limit` - results are always ordered by time and can't be limited;
* `lastpoint` - can't be queried if datapoint is older than 5 minutes;
* `high-cpu` - can't be queried without grouping by step.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high-cpu-1/high-cpu-all

@hagen1778
Copy link
Contributor Author

Thx for review @RobAtticus! Addressed the comments. Pls let me know if there is anything else I can do.

@hagen1778
Copy link
Contributor Author

@RobAtticus ping

@AlekseyKanaev
Copy link
Contributor

Any updates on the PR?

@RobAtticus RobAtticus merged commit 09e0e8a into timescale:master May 11, 2020
@RobAtticus
Copy link
Member

Merged! Thanks again for your patience

@valyala valyala deleted the master2 branch May 12, 2020 19:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants