The contains function may not be optimized #20931
Labels
area/flux
Issues related to the Flux query engine
area/performance
area/2.x
OSS 2.0 related issues and PRs
Environment info:
InfluxDB version: 2.0.3
System info: running in Docker
Debian, x86_64, 8-core Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz, 16GB RAM
Data description:
BucketName: 15day_profile_bucket
MeasurementName: function_info
Tags: Function, Pid, Tid, ProcessName, UUID, State
Fields: Internal, cumulative
There are roughly 24,000 records and 200 series per minute.
Problem:
Queries that use the contains function are very slow; it seems that the group key filter is not used. The following Flux query took 0.63s:
this is image:
But the following Flux query took 37.88s:
this is image:
Expected behavior:
The two queries should not differ this much in execution time.
Because UUID is a tag, the first Flux query (filter first, then limit) and the second query (limit first, then filter) should show no big difference.
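The original queries were posted as screenshots and are not reproduced here. As a rough sketch only, a query that filters on the UUID tag with contains and then limits might look like the following. The UUID values, the time range, and the limit are assumptions made for illustration, not values from the issue.

```flux
// Hypothetical UUID values; the real ones are not given in the issue.
uuids = ["uuid-1", "uuid-2", "uuid-3"]

from(bucket: "15day_profile_bucket")
    |> range(start: -1m)  // assumed time range
    |> filter(fn: (r) => r._measurement == "function_info")
    // Keep only rows whose UUID tag is a member of the set.
    |> filter(fn: (r) => contains(value: r.UUID, set: uuids))
    |> limit(n: 10)  // assumed limit
```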
So I guess the contains function does not use the group key for filtering, but scans all the data.
Use Case:
Our team uses InfluxDB v2, but this is now the bottleneck of our project.
I have tried replacing the contains function with multiple or operations, but when the number of filters is large (70+), the or operation is slower.
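For reference, the or-based workaround described above could be sketched roughly as follows. This is an illustration under assumptions, not the exact query that was run: the UUID values and the time range are hypothetical, and only three comparisons are shown where the real case has 70+.

```flux
from(bucket: "15day_profile_bucket")
    |> range(start: -1m)  // assumed time range
    |> filter(fn: (r) => r._measurement == "function_info")
    // One equality comparison per UUID, chained with `or`.
    // Hypothetical values; the real query would list 70+ of them.
    |> filter(
        fn: (r) => r.UUID == "uuid-1" or r.UUID == "uuid-2" or r.UUID == "uuid-3",
    )
```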