vmui showing nonexistent datapoints #5516

Open
victoramsantos opened this issue Dec 21, 2023 · 3 comments
Labels
bug Something isn't working vmui

Comments

@victoramsantos
Contributor

Describe the bug

Hi, I'm seeing strange behavior in vmui. When I look at a large time window (such as the past 7d) I can see some continuous lines, but when I zoom in, the lines go away.

Looking through the past 7d
image

Zooming into some of the lines
image

BUT, if I go back (clicking the <- button of my Google Chrome browser), this is the response for almost the same period
image

BUT, if I select disable cache
image

I started a discussion about this in this Slack thread.

To Reproduce

Use the data:

{"metric":{"__name__":"aws_lambda_duration_maximum","environment":"staging","job":"aws_lambda","account":"ist","aws_region":"us-east-1","country":"ist","function_name":"custodian-service-quota-with-usage-metrics","kubernetes_namespace":"monitoring","kubernetes_pod_name":"cloudwatch-exporter-beta-8496b7c89c-c6bc7","monitor":"staging-global-green-prometheus","prototype":"global","stack_id":"green"},"values":[446003.51,446003.51,446003.51,446003.51,435151.56,435151.56,437618.85,437618.85,437618.85,437618.85],"timestamps":[1702720560000,1702720620000,1702720680000,1702720800000,1702807080000,1702807200000,1702893360000,1702893420000,1702893540000,1702893600000]}
{"metric":{"__name__":"aws_lambda_duration_maximum","environment":"staging","job":"aws_lambda","account":"ist","aws_region":"us-east-1","country":"ist","function_name":"custodian-service-quota-with-usage-metrics","kubernetes_namespace":"monitoring","kubernetes_pod_name":"cloudwatch-exporter-beta-8496b7c89c-27p57","monitor":"staging-global-green-prometheus","prototype":"global","stack_id":"green"},"values":[451812.04,451812.04],"timestamps":[1702979940000,1702980000000]}
{"metric":{"__name__":"aws_lambda_duration_maximum","environment":"staging","service":"cloudwatch-exporter","job":"aws_lambda","account":"ist","aws_region":"us-east-1","business_unit":"shared","country":"ist","function_name":"custodian-service-quota-with-usage-metrics","kubernetes_namespace":"monitoring","kubernetes_pod_name":"staging-global-green-cloudwatch-exporter-deployment-67c87dffx8r","monitor":"staging-global-green-prometheus","prototype":"global","squad":"reliability-metrics-tracing","stack_id":"green","tier":"useful"},"values":[433216.7,433216.7,433216.7],"timestamps":[1702375080000,1702375140000,1702375200000]}
{"metric":{"__name__":"aws_lambda_duration_maximum","environment":"staging","service":"cloudwatch-exporter","job":"aws_lambda","account":"ist","aws_region":"us-east-1","business_unit":"shared","country":"ist","function_name":"custodian-service-quota-with-usage-metrics","kubernetes_namespace":"monitoring","kubernetes_pod_name":"staging-global-green-cloudwatch-exporter-deployment-7d9b447cbpx","monitor":"staging-global-green-prometheus","prototype":"global","squad":"reliability-metrics-tracing","stack_id":"green","tier":"useful"},"values":[448435.47,448435.47,448435.47,448435.47,448435.47],"timestamps":[1702461360000,1702461420000,1702461480000,1702461540000,1702461600000]}
{"metric":{"__name__":"aws_lambda_duration_maximum","environment":"staging","service":"cloudwatch-exporter","job":"aws_lambda","account":"ist","aws_region":"us-east-1","business_unit":"shared","country":"ist","function_name":"custodian-service-quota-with-usage-metrics","kubernetes_namespace":"monitoring","kubernetes_pod_name":"staging-global-green-cloudwatch-exporter-deployment-585bbdcbw8v","monitor":"staging-global-green-prometheus","prototype":"global","squad":"reliability-metrics-tracing","stack_id":"green","tier":"useful"},"values":[480344.34,480344.34],"timestamps":[1702547880000,1702548000000]}
{"metric":{"__name__":"aws_lambda_duration_maximum","environment":"staging","service":"cloudwatch-exporter","job":"aws_lambda","account":"ist","aws_region":"us-east-1","business_unit":"shared","country":"ist","function_name":"custodian-service-quota-with-usage-metrics","kubernetes_namespace":"monitoring","kubernetes_pod_name":"staging-global-green-cloudwatch-exporter-deployment-585bbdvg59z","monitor":"staging-global-green-prometheus","prototype":"global","squad":"reliability-metrics-tracing","stack_id":"green","tier":"useful"},"values":[436206.7,436206.7,446003.51,446003.51,435151.56,435151.56,437618.85,437618.85,451812.04,451812.04],"timestamps":[1702634220000,1702634340000,1702720620000,1702720740000,1702807020000,1702807140000,1702893420000,1702893540000,1702979820000,1702979940000]}
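To see how sparse this data is, the gaps between consecutive samples can be computed directly from one exported line. The snippet below is only an illustrative sketch: it embeds the first series from the data above with its labels trimmed to `__name__` for brevity, and the helper name `sample_gaps_seconds` is made up for this example.

```python
import json

# First series from the report; labels trimmed to keep the example short.
line = (
    '{"metric":{"__name__":"aws_lambda_duration_maximum"},'
    '"values":[446003.51,446003.51,446003.51,446003.51,435151.56,'
    '435151.56,437618.85,437618.85,437618.85,437618.85],'
    '"timestamps":[1702720560000,1702720620000,1702720680000,1702720800000,'
    '1702807080000,1702807200000,1702893360000,1702893420000,'
    '1702893540000,1702893600000]}'
)


def sample_gaps_seconds(json_line):
    """Return the gaps between consecutive samples, in seconds."""
    series = json.loads(json_line)
    ts = series["timestamps"]  # timestamps are in milliseconds
    return [(b - a) // 1000 for a, b in zip(ts, ts[1:])]


gaps = sample_gaps_seconds(line)
print(gaps)
```

For this series the samples come in short bursts one or two minutes apart, separated by gaps of roughly a day.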

Version

v1.95.1-cluster

Logs

No response

Screenshots

No response

Used command-line flags

No response

Additional information

No response

@victoramsantos victoramsantos added the bug Something isn't working label Dec 21, 2023
@hagen1778
Collaborator

hagen1778 commented Dec 22, 2023

Hello @victoramsantos! Thanks for the detailed report!

I think there are two things in this issue.

  1. The cache. VictoriaMetrics uses rollupCache to cache intermediate datapoints for range queries. The cache key is the MetricsQL expression plus the step, and the datapoints returned for that expression and step are stored under it. The next time a user requests the same expression with the same step, VM checks whether the rollup cache already has something for the requested time range and, if so, uses it.

  2. The vmui.

In the first screenshot you see a long line for the metric. The step here is 23m, and because of this step the graph shows more data than actually exists. See why here.

In the second screenshot, you zoom in and the step changes to 3m. VM can't use the rollup cache here because the step is different, so it executes the request without the cache and shows a single datapoint on the range.

In the third screenshot, you pressed the Back button in the browser. For some reason, vmui changed its step back to 23m but didn't change the time range. Since the step and expression are now the same as in the first screenshot, VM uses the rollup cache to fill the results with pre-cached data.
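The three screenshots can be mimicked with a toy cache keyed by expression and step. This is only a rough sketch of the idea described above, not the actual rollupCache implementation: the class `RollupCacheSketch`, its methods, and the timestamps are all made up for illustration.

```python
class RollupCacheSketch:
    """Toy cache keyed by (expression, step); hypothetical, for illustration only."""

    def __init__(self):
        self._store = {}  # (expr, step) -> {timestamp: value}

    def put(self, expr, step, points):
        # Remember the datapoints computed for this expression and step.
        self._store.setdefault((expr, step), {}).update(points)

    def get(self, expr, step, start, end):
        # Only entries stored under the same expression AND step can be reused.
        cached = self._store.get((expr, step), {})
        return {t: v for t, v in cached.items() if start <= t <= end}


cache = RollupCacheSketch()
expr = "aws_lambda_duration_maximum"

# Screenshot 1: wide range, step of 23m (1380s). Results get cached.
cache.put(expr, 1380, {1380: 446003.51, 2760: 446003.51, 4140: 446003.51})

# Screenshot 2: zoomed in, step of 3m (180s). Different key, so nothing is reused.
zoomed = cache.get(expr, 180, 1000, 3000)

# Screenshot 3: same narrow range, but the step is 23m again. Cached points come back.
after_back = cache.get(expr, 1380, 1000, 3000)

print(zoomed)
print(after_back)
```

In this sketch the zoomed query finds nothing to reuse, while the query issued after pressing Back is served from the points cached for the wide view.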


In summary.

  1. It is expected to see such long lines for sparse data like the data you provided. See more about this here
  2. It is weird that the browser's Back button doesn't restore the time range in vmui. This is probably something for @Loori-R to look into

@hagen1778 hagen1778 added the vmui label Dec 22, 2023
hagen1778 pushed a commit that referenced this issue May 20, 2024
This PR fixes the handling of URL parameters to ensure correct browser
navigation using the back and forward buttons.

#6126

#5516 (comment)
hagen1778 pushed a commit that referenced this issue May 20, 2024
This PR fixes the handling of URL parameters to ensure correct browser
navigation using the back and forward buttons.

#6126

#5516 (comment)
(cherry picked from commit f14497f)
@valyala
Collaborator

valyala commented Jun 7, 2024

@Loori-R, is this issue fixed in v1.102.0-rc1?

@Loori-R
Contributor

Loori-R commented Jun 10, 2024

Yes, the browser navigation has been fixed.
