Merge to analytics stable: KQP CPU scheduler, fix use-after-free #20153

Conversation

GrigoriyPA
Collaborator

Changelog entry

Fixed a use-after-free in the CPU scheduler and a verify failure in the CS CPU limiter: #20116

Changelog category

  • Bugfix

Description for reviewers

Also fixed how ResourceWeight is enabled.

@GrigoriyPA GrigoriyPA requested a review from a team as a code owner June 25, 2025 10:18
@GrigoriyPA
Collaborator Author

#20116

@GrigoriyPA GrigoriyPA requested review from ssmike and Copilot June 25, 2025 10:18
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes a use-after-free issue in the CPU scheduler, addresses a verify failure in the CPU limiter, and updates handling of resource weight in the node service.

  • Fix CPU limiter verification by normalizing AmountCPULimit near zero.
  • Refactor resource weight handling in the compute scheduler and node service for safety and clarity.
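
The near-zero normalization in the first bullet can be sketched as follows. This is a minimal illustration, not the code from `service.cpp`: the class `TCpuLimiterSketch`, the tolerance `CpuLimitEpsilon`, and the assumption that the failing verify checks for a non-negative amount are all hypothetical. Only the name `AmountCPULimit` comes from the review summary.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical tolerance; the threshold actually used in service.cpp is not shown here.
constexpr double CpuLimitEpsilon = 1e-9;

// Hypothetical stand-in for the limiter state that tracks the CPU amount in use.
class TCpuLimiterSketch {
public:
    void Acquire(double cpu) {
        AmountCPULimit += cpu;
        Normalize();
    }

    void Release(double cpu) {
        AmountCPULimit -= cpu;
        Normalize();
        // Stand-in for a verify macro (assumption): the amount must never be negative.
        assert(AmountCPULimit >= 0.0);
    }

    double GetAmountCPULimit() const {
        return AmountCPULimit;
    }

private:
    // Snap tiny floating-point residue (positive or negative) to exactly zero.
    void Normalize() {
        if (std::fabs(AmountCPULimit) < CpuLimitEpsilon) {
            AmountCPULimit = 0.0;
        }
    }

    double AmountCPULimit = 0.0;
};
```

Without the normalization step, acquiring 0.3 and then releasing 0.1 and 0.2 leaves a residue of about -2.8e-17 in IEEE 754 double arithmetic, which would trip the non-negativity check even though the accounting is logically balanced.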

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| ydb/core/tx/conveyor/service/service.cpp | Adds a check to reset AmountCPULimit to zero when near zero to prevent verify failures. |
| ydb/core/kqp/runtime/kqp_compute_scheduler.cpp | Refactors resource weight check by caching the Enabled() call and loops through Params until fully purged. |
| ydb/core/kqp/node_service/kqp_node_service.cpp | Adds an extra check with HasResourceWeight() to safely assign resource weight from the message. |
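
The `HasResourceWeight()` guard described for `kqp_node_service.cpp` follows a common pattern for optional message fields: read the field only when it was explicitly set, so an absent field does not silently become a default value. A rough sketch, assuming hand-written stand-ins for the generated message and the scheduler settings (`TSchedulerMsgSketch`, `TGroupSettingsSketch`, and `MakeSettings` are hypothetical names, not YDB types):

```cpp
// Hypothetical stand-in for a generated message with an optional ResourceWeight field.
class TSchedulerMsgSketch {
public:
    void SetResourceWeight(double weight) {
        Weight = weight;
        WeightSet = true;
    }
    bool HasResourceWeight() const { return WeightSet; }
    // Like generated getters, returns the default (0.0) when the field is unset.
    double GetResourceWeight() const { return Weight; }

private:
    bool WeightSet = false;
    double Weight = 0.0;
};

// Hypothetical settings object: the weight is used only if it was explicitly provided.
struct TGroupSettingsSketch {
    bool ResourceWeightEnabled = false;
    double ResourceWeight = 0.0;
};

TGroupSettingsSketch MakeSettings(const TSchedulerMsgSketch& msg) {
    TGroupSettingsSketch settings;
    // Assign only when the sender set the field; otherwise leave the weight disabled.
    if (msg.HasResourceWeight()) {
        settings.ResourceWeightEnabled = true;
        settings.ResourceWeight = msg.GetResourceWeight();
    }
    return settings;
}
```

The point of the guard is that an unset field and a field explicitly set to zero are distinguishable, which an unconditional `GetResourceWeight()` call would hide.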
Comments suppressed due to low confidence (1)

ydb/core/kqp/runtime/kqp_compute_scheduler.cpp:190

  • [nitpick] Consider renaming 'toerase' to 'keysToErase' for improved clarity.
        std::vector<TParameterKey> toerase;
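
The `toerase` vector in the quoted line suggests a collect-then-erase pattern: gather the keys to drop during iteration, then erase them afterwards, so no iterator into the container is used after its element is destroyed. The sketch below illustrates that general pattern under stated assumptions. It is not the code from `kqp_compute_scheduler.cpp`, and `TParameterKeySketch`, `TParamSketch`, the `Expired` flag, and `PurgeExpired` are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical key and value types standing in for TParameterKey and its mapped value.
using TParameterKeySketch = std::string;

struct TParamSketch {
    bool Expired = false;
};

// Removes every expired entry and returns how many were erased.
std::size_t PurgeExpired(std::map<TParameterKeySketch, TParamSketch>& params) {
    std::size_t removed = 0;
    for (;;) {
        // Pass 1: only read the container and remember which keys must go.
        std::vector<TParameterKeySketch> keysToErase;
        for (const auto& entry : params) {
            if (entry.second.Expired) {
                keysToErase.push_back(entry.first);
            }
        }
        if (keysToErase.empty()) {
            break; // fully purged
        }
        // Pass 2: erase by key, with no live iterators into the map.
        for (const auto& key : keysToErase) {
            removed += params.erase(key);
        }
        // Repeat in case erasing made further entries eligible; in this simple
        // sketch the second scan just finds nothing and exits.
    }
    return removed;
}
```

Erasing by key after the scan trades one extra lookup per removed entry for not having to reason about iterator validity inside the loop.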

github-actions bot commented Jun 25, 2025

2025-06-25 10:21:42 UTC Pre-commit check linux-x86_64-release-asan for ec84951 has started.
2025-06-25 10:21:54 UTC Artifacts will be uploaded here
2025-06-25 10:24:47 UTC ya make is running...
🟡 2025-06-25 11:23:57 UTC Some tests failed, follow the links below. This failure is not in the blocking policy yet. Going to retry failed tests...

Test history | Ya make output | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 12098 | 11991 | 0 | 56 | 23 | 28 |

2025-06-25 11:25:05 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-06-25 11:43:45 UTC Some tests failed, follow the links below. This failure is not in the blocking policy yet. Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 145 (only retried tests) | 113 | 0 | 2 | 5 | 25 |

2025-06-25 11:43:55 UTC ya make is running... (failed tests rerun, try 3)
🟡 2025-06-25 11:55:22 UTC Some tests failed, follow the links below. This failure is not in the blocking policy yet.

Test history | Ya make output | Test bloat | Test bloat | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 55 (only retried tests) | 24 | 0 | 1 | 4 | 26 |

🟢 2025-06-25 11:55:29 UTC Build successful.
🟢 2025-06-25 11:55:55 UTC ydbd size 3.7 GiB changed* by +88 Bytes, which is < 100.0 KiB vs stable-25-1-analytics: OK

| ydbd size dash | stable-25-1-analytics: 40c517e | merge: ec84951 | diff | diff % |
| --- | --- | --- | --- | --- |
| ydbd size | 3 989 250 960 Bytes | 3 989 251 048 Bytes | +88 Bytes | +0.000% |
| ydbd stripped size | 1 393 108 360 Bytes | 1 393 108 232 Bytes | -128 Bytes | -0.000% |

*please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison

github-actions bot commented Jun 25, 2025

2025-06-25 10:21:54 UTC Pre-commit check linux-x86_64-relwithdebinfo for ec84951 has started.
2025-06-25 10:22:04 UTC Artifacts will be uploaded here
2025-06-25 10:24:52 UTC ya make is running...
🟡 2025-06-25 11:11:46 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 19153 | 17859 | 0 | 6 | 1235 | 53 |

2025-06-25 11:13:25 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-06-25 11:25:57 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 119 (only retried tests) | 76 | 0 | 1 | 5 | 37 |

2025-06-25 11:26:05 UTC ya make is running... (failed tests rerun, try 3)
🟢 2025-06-25 11:43:59 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 74 (only retried tests) | 36 | 0 | 0 | 4 | 34 |

🟢 2025-06-25 11:44:06 UTC Build successful.
🟢 2025-06-25 11:44:27 UTC ydbd size 2.1 GiB changed* by +16 Bytes, which is < 100.0 KiB vs stable-25-1-analytics: OK

| ydbd size dash | stable-25-1-analytics: 40c517e | merge: ec84951 | diff | diff % |
| --- | --- | --- | --- | --- |
| ydbd size | 2 291 333 704 Bytes | 2 291 333 720 Bytes | +16 Bytes | +0.000% |
| ydbd stripped size | 482 026 880 Bytes | 482 026 944 Bytes | +64 Bytes | +0.000% |

*please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison

@GrigoriyPA GrigoriyPA merged commit 9e31902 into ydb-platform:stable-25-1-analytics Jun 25, 2025
12 checks passed
@GrigoriyPA GrigoriyPA deleted the merge-to-aydb-stable-Kqp-CPU-Scheduler-fix-use-after-free branch June 25, 2025 11:56
2 participants