layer: unimplemented support for evicting wanted deleted layers #6928

Closed
koivunej opened this issue Feb 27, 2024 · 3 comments · Fixed by #6931 or #6131
Labels
c/storage/pageserver Component: storage: pageserver t/bug Issue Type: Bug

Comments

@koivunej
Contributor

koivunej commented Feb 27, 2024

On 2024-02-27, evictions started hanging, which caused deletions to hang as well. The disk-usage-based eviction task got stuck, and manual intervention was required.

The initial suspicion was wrong; it only led to discovering one obviously missing metric update, see #6931.

Later I realized that the hangs were caused by wanted deleted layers being unable to communicate anything to Layer::evict_and_wait.

Slack channel: #temp-2024-02-27-stuck-disk-usage-based-eviction

@koivunej
Contributor Author

The problem is simpler: evicting a wanted_deleted layer always hangs.
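
To illustrate the failure mode, here is a minimal toy model, not the pageserver code. `ToyLayer`, `EvictOutcome`, `try_evict` and the `evicts_wanted_deleted` flag are hypothetical names made up for this sketch; only the idea comes from this issue: a waiter registers for an eviction result, and the wanted deleted path never reports anything back. The sketch adds a timeout so that the hang is observable instead of blocking forever.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Sender};
use std::sync::Mutex;
use std::time::Duration;

/// What a caller of the toy `evict_and_wait` observes.
#[derive(Debug, PartialEq)]
enum EvictOutcome {
    Evicted,
    /// No status arrived in time; without a timeout this would be a hang.
    TimedOut,
}

/// Hypothetical stand-in for a layer; NOT the pageserver's `Layer` type.
struct ToyLayer {
    wanted_deleted: AtomicBool,
    /// Senders for callers currently waiting on an eviction result.
    waiters: Mutex<Vec<Sender<()>>>,
    /// Assumption for illustration: `false` models the missing support,
    /// `true` models the behaviour after the fix.
    evicts_wanted_deleted: bool,
}

impl ToyLayer {
    fn new(wanted_deleted: bool, evicts_wanted_deleted: bool) -> Self {
        ToyLayer {
            wanted_deleted: AtomicBool::new(wanted_deleted),
            waiters: Mutex::new(Vec::new()),
            evicts_wanted_deleted,
        }
    }

    /// Register as a waiter, request eviction, then wait for a status.
    fn evict_and_wait(&self, timeout: Duration) -> EvictOutcome {
        let (tx, rx) = mpsc::channel();
        self.waiters.lock().unwrap().push(tx);
        self.try_evict();
        match rx.recv_timeout(timeout) {
            Ok(()) => EvictOutcome::Evicted,
            // The sender is still held by the layer, so nothing ever arrives.
            Err(_) => EvictOutcome::TimedOut,
        }
    }

    fn try_evict(&self) {
        let wanted_deleted = self.wanted_deleted.load(Ordering::Acquire);
        if wanted_deleted && !self.evicts_wanted_deleted {
            // Modeled bug: bail out without telling the waiters anything.
            return;
        }
        // Otherwise evict (elided here) and notify every waiter.
        for tx in self.waiters.lock().unwrap().drain(..) {
            let _ = tx.send(());
        }
    }
}
```

In this model, `ToyLayer::new(true, false).evict_and_wait(..)` times out, while the same call with `evicts_wanted_deleted` set to `true` completes.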

@koivunej koivunej changed the title from "layer: wrong check order causes hanging evictions" to "layer: unimplemented support for evicting wanted deleted layers" Feb 27, 2024
koivunej added a commit that referenced this issue Feb 27, 2024
Evicting wanted deleted layers is something I forgot to implement in
#5645. This PR makes it possible to evict such layers, which should
reduce the number of hanging evictions.

Fixes: #6928

Co-authored-by: Christian Schwarz <christian@neon.tech>
koivunej added a commit that referenced this issue Feb 29, 2024
…rics (#6131)

Because of bugs, evictions could hang and stall the disk usage eviction
task. One such bug is known and fixed (#6928). Guard each layer eviction
with a modest timeout and, to be conservative, treat timed-out evictions
as failures.

In addition, add logging and metrics recording on each eviction
iteration:
- log collection completion with its duration and the number of layers
    - per tenant collection time is observed in a new histogram
    - per tenant layer count is observed in a new histogram
- record metric for collected, selected and evicted layer counts
- log if eviction takes more than 10s
- log eviction completion with eviction duration

Additionally, remove dead code for which no dead-code warnings appeared
in the earlier PR.

Follow-up to: #6060.
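
The timeout guard described in the commit message above can be sketched roughly as follows. This is a simplified, thread-based illustration under stated assumptions, not the code from #6131: `Eviction`, `EvictionStats` and `evict_all_with_timeout` are hypothetical names, and the timeout value is left to the caller. The point it shows is the conservative accounting: an eviction that does not finish in time is counted as a failure, and the loop moves on instead of stalling.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical unit of work: one layer eviction that may fail or hang.
type Eviction = Box<dyn FnOnce() -> Result<(), String> + Send>;

/// Counters for one eviction iteration (illustrative names).
#[derive(Debug, Default, PartialEq)]
struct EvictionStats {
    evicted: usize,
    failed: usize,
}

/// Run each eviction with a per-eviction timeout. Evictions that time out
/// are counted as failures, to be conservative.
fn evict_all_with_timeout(evictions: Vec<Eviction>, timeout: Duration) -> EvictionStats {
    let mut stats = EvictionStats::default();
    for evict in evictions {
        let (tx, rx) = mpsc::channel();
        // Run the eviction on its own thread so a hang cannot block the loop.
        thread::spawn(move || {
            // The receiver may already be gone after a timeout; ignore that.
            let _ = tx.send(evict());
        });
        match rx.recv_timeout(timeout) {
            Ok(Ok(())) => stats.evicted += 1,
            // Explicit error, timeout, or a panicked worker: all failures.
            Ok(Err(_)) | Err(_) => stats.failed += 1,
        }
    }
    stats
}
```

Note that in this sketch a timed-out eviction keeps running in its detached thread; the guard only stops the caller from waiting on it.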