-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: DROP ... CASCADE can be slow #26373
Labels
C-bug
Category: something is broken
Comments
Looked at stats for current blue/green use cases and the drop can be as slow as 7s, which should be improved, but is not as severe as #26361. |
5 tasks
ParkMyCar
added a commit
that referenced
this issue
May 21, 2024
) This PR refactors `mz_catalog::Op::DropObject` to take a `Vec<ObjectId>` instead of a single `ObjectId`. Previously a `DROP ... CASCADE` would result in a `Op::DropObject` for every object we needed to drop. This had poor performance because removing a batch of items from a durable catalog transaction is `O(n)`, where `n` is the number of objects in the catalog. But because we would issue an `Op::DropObject` for every item we're dropping, we would have a batch size of 1 and the runtime of `DROP ... CASCADE` became `O(m * n)`, where `m` is the number of objects we're dropping. With this refactoring the runtime of `DROP ... CASCADE` is `O(n)`. Concretely I tested creating a schema with 1 table and 1,000 views then running `DROP SCHEMA a CASCADE`: * `main`: ~28 seconds * This PR: ~700ms ### Motivation Fixes #26373 ### Tips for reviewer Hiding white space helps, a large code block got surrounded by `for id in object_ids { ... }` and indented ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] This PR includes the following [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note): - Improves the performance of dropping large amount of items at once.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
What version of Materialize are you using?
v0.94.2
What is the issue?
Running a
DROP ... CASCADE
can be relatively slow because we collect all of the items to drop at the planning stage and then is subject to a similar issue as seen in #26361The text was updated successfully, but these errors were encountered: