Pageserver lifecycle : replace task_mgr spawn & shutdown with handle-based abstraction #4175

Open
problame opened this issue May 8, 2023 · 0 comments
Labels
c/storage/pageserver Component: storage: pageserver

Comments

problame commented May 8, 2023

Motivation

DoD

  • task_mgr::shutdown_tasks, task_mgr::associate_with, and task_mgr::is_shutdown_requested are gone
  • task_mgr::spawn callers get back a handle to the task they spawned
  • the task handle allows requesting task shutdown & waiting for the task to exit
  • there are mechanisms to ensure that all tasks associated with a tenant have shut down
    • e.g., the ability to assert at the end of detach that all tasks associated with the tenant have stopped
      • this is critical for relocation correctness; task_mgr does only a 90%-good job at it
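
To make the DoD items above concrete, here is a minimal sketch of what a handle-based spawn could look like. It uses plain `std::thread` rather than the pageserver's async runtime, and every name in it (`TaskHandle`, `TenantTasks`, `request_shutdown`, `wait`, `shutdown_all`) is hypothetical and chosen for illustration only; none of it is the actual or planned pageserver API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical handle returned to the caller of `spawn`.
pub struct TaskHandle {
    cancel: Arc<AtomicBool>,
    join: JoinHandle<()>,
}

impl TaskHandle {
    /// Ask the task to stop. Returns immediately.
    pub fn request_shutdown(&self) {
        self.cancel.store(true, Ordering::SeqCst);
    }

    /// Block until the task has exited.
    pub fn wait(self) {
        self.join.join().expect("task panicked");
    }
}

/// Spawn a task. The body receives the flag it must poll for shutdown.
pub fn spawn<F>(body: F) -> TaskHandle
where
    F: FnOnce(Arc<AtomicBool>) + Send + 'static,
{
    let cancel = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&cancel);
    let join = thread::spawn(move || body(flag));
    TaskHandle { cancel, join }
}

/// Hypothetical per-tenant owner of all task handles.
#[derive(Default)]
pub struct TenantTasks {
    handles: Vec<TaskHandle>,
}

impl TenantTasks {
    pub fn spawn<F>(&mut self, body: F)
    where
        F: FnOnce(Arc<AtomicBool>) + Send + 'static,
    {
        self.handles.push(spawn(body));
    }

    /// Request shutdown of every task, then wait for each one to exit.
    /// Returns the number of tasks that were joined.
    pub fn shutdown_all(&mut self) -> usize {
        for handle in &self.handles {
            handle.request_shutdown();
        }
        let joined = self.handles.len();
        for handle in self.handles.drain(..) {
            handle.wait();
        }
        joined
    }

    /// True once no task handles remain, e.g. to assert at the end of detach.
    pub fn is_empty(&self) -> bool {
        self.handles.is_empty()
    }
}
```

Because the tenant owns every handle, detach can call `shutdown_all()` and then assert `is_empty()`, instead of relying on a global registry lookup.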

bonus points / follow-up:

  • there is a mechanism to prevent new tasks from being started for a tenant (tombstone)
    • detach arms this mechanism
    • attach disables it
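
A tombstone along these lines could be sketched as follows. This is an illustration under assumptions, not a proposal for the real code: the `Tombstones` type, its `arm`, `disarm`, and `try_spawn` methods, and the use of string tenant ids are all hypothetical.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Error returned when a spawn is refused because the tenant is tombstoned.
#[derive(Debug, PartialEq, Eq)]
pub struct Tombstoned;

/// Hypothetical registry of tenants for which no new tasks may start.
#[derive(Default)]
pub struct Tombstones {
    armed: Mutex<HashSet<String>>,
}

impl Tombstones {
    /// Called by detach: refuse all future spawns for this tenant.
    pub fn arm(&self, tenant: &str) {
        self.armed.lock().unwrap().insert(tenant.to_string());
    }

    /// Called by attach: allow spawns for this tenant again.
    pub fn disarm(&self, tenant: &str) {
        self.armed.lock().unwrap().remove(tenant);
    }

    /// Run `start` only if the tenant is not tombstoned. The lock is held
    /// across the check and the call to `start`, so a concurrent `arm`
    /// cannot slip in between the two.
    pub fn try_spawn<T>(
        &self,
        tenant: &str,
        start: impl FnOnce() -> T,
    ) -> Result<T, Tombstoned> {
        let armed = self.armed.lock().unwrap();
        if armed.contains(tenant) {
            return Err(Tombstoned);
        }
        Ok(start())
    }
}
```

Unlike a one-shot shutdown call, the armed state persists, so a spawn attempted after detach is refused rather than silently started.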

shanyp added the t/Epic (Issue type: Epic) and c/storage/pageserver (Component: storage: pageserver) labels and removed the t/Epic (Issue type: Epic) label on May 8, 2023
shanyp changed the title from "Epic: replace task_mgr spawn & shutdown with handle-based abstraction" to "Pageserver lifecycle : replace task_mgr spawn & shutdown with handle-based abstraction" on May 10, 2023
koivunej added a commit that referenced this issue Mar 9, 2024
## Problem

Before this PR, it was possible that on-demand downloads were started
after `Timeline::shutdown()`.

For example, we have observed a walreceiver-connection-handler-initiated
on-demand download that was started after `Timeline::shutdown()`'s final
`task_mgr::shutdown_tasks()` call.

The underlying issue is that `task_mgr::shutdown_tasks()` isn't sticky,
i.e., new tasks can be spawned during or after
`task_mgr::shutdown_tasks()`.

Cc: #4175 in lieu of a more
specific issue for task_mgr. We already decided we want to get rid of it
anyways.

Original investigation:
https://neondb.slack.com/archives/C033RQ5SPDH/p1709824952465949

## Changes

- enter gate while downloading
- use timeline cancellation token for cancelling download

Together, these changes fix #7054.

Entering the gate might also eliminate the recent "kept the gate from closing"
occurrences seen in staging.
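
The two changes listed above (enter the gate while downloading, and honor the timeline's cancellation) can be illustrated with a small synchronous sketch. It is a stand-in written with `std` primitives only: the `Gate`, `GateGuard`, `DownloadError`, and `download_layer` names are hypothetical, an `AtomicBool` plays the role of the cancellation token, and the real code is async and structured differently.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Condvar, Mutex};

#[derive(Debug, PartialEq, Eq)]
pub enum DownloadError {
    ShuttingDown,
    Cancelled,
}

struct GateState {
    closed: bool,
    entered: usize,
}

/// Simplified gate: once closed it stays closed, and `close` waits for
/// every guard that entered earlier to be dropped.
pub struct Gate {
    state: Mutex<GateState>,
    all_left: Condvar,
}

/// Held for the duration of the guarded operation.
pub struct GateGuard {
    gate: Arc<Gate>,
}

impl Gate {
    pub fn new() -> Arc<Gate> {
        Arc::new(Gate {
            state: Mutex::new(GateState { closed: false, entered: 0 }),
            all_left: Condvar::new(),
        })
    }

    /// Returns `None` if the gate has already been closed.
    pub fn enter(self: &Arc<Self>) -> Option<GateGuard> {
        let mut st = self.state.lock().unwrap();
        if st.closed {
            return None;
        }
        st.entered += 1;
        Some(GateGuard { gate: Arc::clone(self) })
    }

    /// Refuse new entries, then block until all guards are dropped.
    pub fn close(&self) {
        let mut st = self.state.lock().unwrap();
        st.closed = true;
        while st.entered > 0 {
            st = self.all_left.wait(st).unwrap();
        }
    }
}

impl Drop for GateGuard {
    fn drop(&mut self) {
        let mut st = self.gate.state.lock().unwrap();
        st.entered -= 1;
        if st.entered == 0 {
            self.gate.all_left.notify_all();
        }
    }
}

/// Hypothetical download: holds the gate for its whole duration and
/// checks the cancellation flag before each chunk.
pub fn download_layer(
    gate: &Arc<Gate>,
    cancel: &AtomicBool,
    chunks: usize,
) -> Result<usize, DownloadError> {
    let _guard = gate.enter().ok_or(DownloadError::ShuttingDown)?;
    for _ in 0..chunks {
        if cancel.load(Ordering::SeqCst) {
            return Err(DownloadError::Cancelled);
        }
        // A real implementation would fetch one chunk here.
    }
    Ok(chunks)
}
```

The point of the design is that closing the gate is sticky: a download attempted after shutdown fails at `enter()`, which is exactly the property `task_mgr::shutdown_tasks()` lacks.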