This issue tracks the 10-minute hang from #5331, first observed in staging. It is related to #6028, which tracks the need for `Layer::keep_resident` to repair the state.
A fix is in #7030: use a separate watch channel to signal the intent to begin a download, thus canceling the semaphore wait that a later-started `Layer::keep_resident` would be doing.
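The idea of canceling a permit wait through a separate signal can be sketched as follows. This is only a std-only analogue built on `Mutex` and `Condvar`, not the code from #7030, which uses a watch channel. `Gate`, `WaitOutcome`, `wait_for_permit`, `signal_download_intent`, and `add_permit` are hypothetical names introduced for illustration.

```rust
use std::sync::{Condvar, Mutex};

/// Result of waiting at the gate (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum WaitOutcome {
    /// A permit was available and has been taken.
    Acquired,
    /// Someone signaled the intent to download, so the wait was abandoned.
    Cancelled,
}

#[derive(Default)]
struct State {
    permits: usize,
    download_intent: bool,
}

/// A tiny semaphore whose waiters can be woken by a separate intent signal.
#[derive(Default)]
struct Gate {
    state: Mutex<State>,
    cond: Condvar,
}

impl Gate {
    /// Blocks until a permit is available or a download intent is signaled.
    /// The intent signal takes priority over an available permit.
    fn wait_for_permit(&self) -> WaitOutcome {
        let mut s = self.state.lock().unwrap();
        loop {
            if s.download_intent {
                return WaitOutcome::Cancelled;
            }
            if s.permits > 0 {
                s.permits -= 1;
                return WaitOutcome::Acquired;
            }
            // Releases the lock while sleeping; re-checks both conditions on wakeup.
            s = self.cond.wait(s).unwrap();
        }
    }

    /// Signals the intent to begin a download and wakes every waiter.
    fn signal_download_intent(&self) {
        self.state.lock().unwrap().download_intent = true;
        self.cond.notify_all();
    }

    /// Returns one permit to the gate and wakes one waiter.
    fn add_permit(&self) {
        self.state.lock().unwrap().permits += 1;
        self.cond.notify_one();
    }
}
```

In this sketch a waiter that started late does not have to sit out the full permit wait: as soon as the intent is signaled, it returns `Cancelled` and can decide what to do next.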
The need to wait for downloads caused a disk-full situation on 2024-03-15 00:32:00 on `pageserver-0.eu-west-1.aws.neon.build` because a download was awaited that could not be completed. The disk was full, and because we currently do not preallocate space for downloads, the download always failed, not at the beginning but somewhere along the way. A failed download, for example due to disk full or not found (see footnote 1), will also cause `Layer::keep_resident` to wait for the exponential backoff.
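To illustrate how quickly such a backoff wait grows, here is a minimal sketch of a capped exponential backoff. The helper `backoff_delay` and the base and cap values used with it are assumptions for illustration; the pageserver's actual backoff parameters are not stated in this issue.

```rust
use std::time::Duration;

/// Hypothetical helper: delay before retry number `attempt` (0-based),
/// doubling from `base` on every attempt and capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // 2^attempt, saturating so that large attempt counts cannot overflow.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // If the multiplication itself overflows, fall back to the cap.
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}
```

With an assumed base of 100 ms and a cap of 3 s, attempt 3 waits 800 ms and every attempt from 5 onward waits the full 3 s, so a download that keeps failing holds anything waiting on it for the cap on each retry.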
Footnotes
1. In past staging issues where generation numbers had rolled back, S3 "not found" responses were retried. They should no longer be retried after (PR, which moved timeouts to remote_storage).