"nodetool stop RESHAPE" stops individual task, but doesn't abort the whole operation #15058

Closed
raphaelsc opened this issue Aug 15, 2023 · 10 comments

@raphaelsc
Member

During reshape on boot, "nodetool stop" should abort the whole reshape operation rather than only stopping the individual task. Otherwise, it doesn't help to bypass reshape on boot.

[shard 13] compaction_manager - Stopping 1 tasks for 1 ongoing compactions for table ... and type=Reshape due to user request
[shard 13] compaction - [Reshape ...] Reshaping of 8 sstables interrupted due to: sstables::compaction_stopped_exception (Compaction for ... was stopped due to: user request)
[shard 13] compaction_manager - Reshape compaction task 0x6040045ce150 for table ... [0x604003b32f90]: stopped, reason: Compaction for ... was stopped due to: user request
compaction - [Reshape ...] Reshaping [/var/lib/scylla/data/...]
@raphaelsc
Member Author

That's a bad interaction between compaction_manager and shard_reshaping_compaction_task_impl::run(). The latter expects the former to propagate compaction_stopped_exception to it so that it can abort the whole reshape operation, not only the individual task. However, the exception is being swallowed by compaction_manager::perform_task().
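
To make the interaction above concrete, here is a minimal, synchronous sketch of it. This is not Scylla's actual code: the real implementation is asynchronous, and `perform_task_swallowing`, `run_reshape`, and the local `compaction_stopped_exception` type are hypothetical stand-ins introduced only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for sstables::compaction_stopped_exception.
struct compaction_stopped_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Simplified model of the problematic behaviour: the task runner catches the
// stop exception itself, so its caller never learns that the job was stopped.
void perform_task_swallowing(const std::function<void()>& job) {
    assert(job && "job must be callable");
    try {
        job();
    } catch (const compaction_stopped_exception&) {
        // Swallowed: only this one task ends.
    }
}

// Simplified model of the reshape loop: it runs one job after another and
// breaks out only if the stop exception reaches it.
// stop_at is the index of the job that gets stopped by the user.
// Returns the number of jobs that were started.
int run_reshape(int total_jobs, int stop_at) {
    int started = 0;
    try {
        for (int i = 0; i < total_jobs; ++i) {
            ++started;
            perform_task_swallowing([i, stop_at] {
                if (i == stop_at) {
                    throw compaction_stopped_exception("user request");
                }
            });
        }
    } catch (const compaction_stopped_exception&) {
        // Never reached in this model, because the exception is swallowed.
    }
    return started;
}
```

In this model, stopping the second of five jobs still lets the loop start all five, which matches the log above where a new "Reshaping" line appears right after the stop.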

This is a regression introduced in 5.1.

@mykaul
Contributor

mykaul commented Aug 15, 2023

@roydahan do we have a test for this API?

@raphaelsc
Member Author

This new dtest reproduces the issue: https://github.com/scylladb/scylla-dtest/pull/3510

@Deexie FYI

@roydahan

> @roydahan do we have a test for this API?

Hopefully now we will have...

Deexie added a commit to Deexie/scylla that referenced this issue Aug 16, 2023
Loop in shard_reshaping_compaction_task_impl::run relies on whether
sstables::compaction_stopped_exception is thrown from run_custom_job.
The exception is swallowed for each type of compaction
in compaction_manager::perform_task.

Rethrow the exception in perform_task for reshape compaction.

Fixes: scylladb#15058.
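
The idea in the commit message can be sketched as follows. This is a simplified, synchronous model under stated assumptions, not the actual patch: the `compaction_type` enum, the `perform_task` signature, and `run_reshape` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for sstables::compaction_stopped_exception.
struct compaction_stopped_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical, reduced set of compaction types.
enum class compaction_type { Compaction, Reshape };

// Simplified model of the fix: the stop exception is still swallowed for
// other compaction types, but it is rethrown for reshape.
void perform_task(compaction_type type, const std::function<void()>& job) {
    assert(job && "job must be callable");
    try {
        job();
    } catch (const compaction_stopped_exception&) {
        if (type == compaction_type::Reshape) {
            throw;  // Let the reshape loop see the stop request.
        }
    }
}

// Simplified model of the reshape loop. stop_at is the index of the job
// that gets stopped. Returns the number of jobs that were started.
int run_reshape(int total_jobs, int stop_at) {
    int started = 0;
    try {
        for (int i = 0; i < total_jobs; ++i) {
            ++started;
            perform_task(compaction_type::Reshape, [i, stop_at] {
                if (i == stop_at) {
                    throw compaction_stopped_exception("user request");
                }
            });
        }
    } catch (const compaction_stopped_exception&) {
        // The whole reshape operation is aborted here.
    }
    return started;
}
```

With the rethrow in place, stopping the second of five jobs ends the loop after two jobs have been started, while a stopped non-reshape task still ends quietly.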
Deexie added a commit to Deexie/scylla that referenced this issue Aug 17, 2023
@DoronArazii

@scylladb/scylla-maint please consider backport

@DoronArazii added the backport/5.2 and Requires-Backport-to-5.1 labels Aug 22, 2023
@DoronArazii added this to the 5.4 milestone Aug 22, 2023
@denesb
Contributor

denesb commented Aug 22, 2023

What are the affected branches? I doubt 5.1 is affected. @Deexie ?

@mykaul
Contributor

mykaul commented Aug 22, 2023

> What are the affected branches? I doubt 5.1 is affected. @Deexie ?

An earlier comment on this issue (#15058) mentions 5.1.

@denesb
Contributor

denesb commented Aug 22, 2023

Cherry-pick is not clean, not even to 5.2. @Deexie please open backport pull requests for 5.2 and 5.1.

Deexie added a commit to Deexie/scylla that referenced this issue Aug 22, 2023
Deexie added a commit to Deexie/scylla that referenced this issue Aug 22, 2023
@Deexie
Contributor

Deexie commented Aug 22, 2023

@denesb Here are PRs for 5.2: #15122 and 5.1: #15123

Deexie added a commit to Deexie/scylla that referenced this issue Aug 22, 2023
Loop in shard_reshaping_compaction_task_impl::run relies on whether
sstables::compaction_stopped_exception is thrown from run_custom_job.
The exception is swallowed for each type of compaction
in compaction_manager::perform_task.

Rethrow the exception in perform_task for reshape compaction.

Fixes: scylladb#15058.

(cherry picked from commit e0ce711)
Deexie added a commit to Deexie/scylla that referenced this issue Aug 22, 2023
raphaelsc pushed a commit to raphaelsc/scylla that referenced this issue Aug 22, 2023
Loop in shard_reshaping_compaction_task_impl::run relies on whether
sstables::compaction_stopped_exception is thrown from run_custom_job.
The exception is swallowed for each type of compaction
in compaction_manager::perform_task.

Rethrow the exception in perform_task for reshape compaction.

Fixes: scylladb#15058.

Closes scylladb#15067
denesb pushed a commit that referenced this issue Aug 23, 2023
Loop in shard_reshaping_compaction_task_impl::run relies on whether
sstables::compaction_stopped_exception is thrown from run_custom_job.
The exception is swallowed for each type of compaction
in compaction_manager::perform_task.

Rethrow the exception in perform_task for reshape compaction.

Fixes: #15058.

(cherry picked from commit e0ce711)

Closes #15122
denesb pushed a commit that referenced this issue Aug 23, 2023
Loop in shard_reshaping_compaction_task_impl::run relies on whether
sstables::compaction_stopped_exception is thrown from run_custom_job.
The exception is swallowed for each type of compaction
in compaction_manager::perform_task.

Rethrow the exception in perform_task for reshape compaction.

Fixes: #15058.

(cherry picked from commit e0ce711)

Closes #15123
@denesb
Contributor

denesb commented Aug 23, 2023

> @denesb Here are PRs for 5.2: #15122 and 5.1: #15123

Both are queued. Removing labels.

@denesb removed the Backport candidate, backport/5.2, and Requires-Backport-to-5.1 labels Aug 23, 2023
raphaelsc pushed a commit to raphaelsc/scylla that referenced this issue Aug 29, 2023