
very slow query of segments #9899

Open
FrankChen021 opened this issue May 20, 2020 · 5 comments
FrankChen021 (Member) commented May 20, 2020

This issue is related to #9755. At first I thought it was because there were too many rows in the druid_segments table, so we killed unused segments through the API to reduce the records in that table. After that, the row count dropped from about 360,000 to about 50,000.
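For reference, a quick way to verify the effect of the kill is to group the metadata rows by the used flag (a hypothetical check, assuming the stock MySQL metadata schema):

```sql
-- Rows with used = false should disappear after the kill tasks finish;
-- only used = true rows remain to be scanned by the segment poll query.
SELECT used, COUNT(*) AS num_rows
FROM druid_segments
GROUP BY used;
```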

But our DBA still reported slow queries, which pointed to the statement SELECT payload FROM druid_segments WHERE used=true. So I ran this SQL from a local MySQL client, and it took more than 90 seconds to return the result.

I ran another SQL, SELECT id FROM druid_segments WHERE used=true, from the same client, and it took more than 12 seconds to complete.
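To rule out a bad server-side plan before blaming the transfer, an EXPLAIN is worth a look (a sketch; assuming the stock Druid schema's index on the used column exists, a full table scan here would be suspicious):

```sql
-- Shows whether the used = true predicate hits an index or scans the table.
EXPLAIN SELECT payload FROM druid_segments WHERE used = true;
```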

[screenshot: MySQL client output showing the query timings]

Since the payload field in the database is a LONGBLOB, I guess the query is slow because the result set is large in bytes, so it takes a long time to transmit the data from the MySQL server to the client over the network.
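One way to test that hypothesis is to measure how many bytes the query has to ship (a sketch; OCTET_LENGTH counts bytes, so the sum approximates the transfer volume per poll):

```sql
-- Approximate size of the result set that gets pulled on every segment poll.
SELECT COUNT(*)                   AS num_segments,
       SUM(OCTET_LENGTH(payload)) AS total_payload_bytes,
       MAX(OCTET_LENGTH(payload)) AS max_payload_bytes
FROM druid_segments
WHERE used = true;
```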

Affected Version

The version we currently use is 0.16.0. Since this issue is related to the metadata database, I think the latest version is still affected.

github-actions bot commented Aug 31, 2023

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

github-actions bot added the stale label Aug 31, 2023
sivasai-ucmo commented

Is this issue resolved? I'm still facing the same issue with v25. Does anyone know which service fires this query (the Overlord or the Coordinator), and why? Each query takes almost 60 seconds and returns empty results, which blocks ingestion overall and has made the platform unstable (no response in the UI).

github-actions bot removed the stale label Sep 17, 2023
teyeheimans commented

We have exactly the same issue. We see that the maximum length of a payload is around 6,000 characters.

Maybe it would be a good improvement to change the type of the payload column from LONGBLOB to VARCHAR(10000)? As I understand from the MySQL manual, BLOB columns are not kept in memory.
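A sketch of that change (untested; Druid creates and manages this table itself, so an out-of-band ALTER could be incompatible with some Druid versions or reverted on upgrade, and the size must exceed the largest serialized payload):

```sql
-- Hypothetical schema change: per the reasoning above, VARCHAR data can be
-- kept in memory by MySQL, unlike BLOBs; 10000 leaves headroom over the
-- ~6000-character payloads observed in this deployment.
ALTER TABLE druid_segments
  MODIFY COLUMN payload VARCHAR(10000) NOT NULL;
```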

We have also begun to re-index our segments, changing the segment granularity from hour to day or even month, which decreased the number of segments. For other people who find this topic: that is a good solution, together with killing unused segments as the topic starter suggested.
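To see which datasources would benefit most from coarser granularity, one option is Druid's sys.segments system table (a sketch in Druid SQL, not MySQL; the sys schema is available in Druid 0.14 and later):

```sql
-- Count published segments per datasource; datasources with the most
-- segments dominate the metadata poll and are good reindexing candidates.
SELECT "datasource", COUNT(*) AS num_segments
FROM sys.segments
WHERE is_published = 1
GROUP BY "datasource"
ORDER BY num_segments DESC;
```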


lukoou3 commented May 6, 2024

We have exactly the same issue when the segment count exceeds 100,000: the SQL SELECT id FROM druid_segments WHERE used=true is very slow.


doufenghu commented May 6, 2024

Druid version 26.0.0. We have the same issue: with about 200K segments, the query SELECT payload FROM druid_segments WHERE used=true takes about 3 minutes.
