librbd/cache/pwl/ssd: fix first_free_entry and m_first_free_entry corruption #41490
Conversation
cc @MahatiC
cc @CongMinYin
jenkins test api
@idryomov Hi Ilya, I'm trying to reproduce this error locally and will apply this PR on top of it. Will leave a review after that. Thanks.
lgtm, thanks!
In append_ops(), new_first_free_entry is assigned to after aio_submit() is called. This can result in accessing uninitialized or freed memory because all I/Os may complete and the append_ctx callback may run before the assignment is executed. A garbage value gets written to first_free_entry and we eventually crash, most likely in bufferlist manipulation code. Worse, the corrupted first_free_entry makes it to media almost every time. The result is a corrupted cache: dirty user data is lost.

Fixes: https://tracker.ceph.com/issues/50832
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
In append_op_log_entries(), new_first_free_entry is read after append_ops() returns. This can result in accessing freed memory because all I/Os may complete and the append_ctx callback may run by the time new_first_free_entry is read. A garbage value gets written to m_first_free_entry and, depending on the circumstances, it may allow AbstractWriteLog code to accept more dirty user data than there is space for. Luckily, we usually crash before then.

Fixes: https://tracker.ceph.com/issues/50832
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Ensure first_{valid,free}_entry are inside the expected range when scheduling root updates and decoding the root on recovery.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Force-pushed e1e6f46 to f8fb760
Sorry, there was a bug in
Thanks Ilya for the changes. Could you let us know the specific test that triggered this bug? (with the first revision of this PR)
There wasn't any specific test failure. I'm working through
Thanks for the change, looks good to me.
@trociny This isn't ready for full qa because of another issue. I'm holding off on it for now.
The crashes that appeared to be triggered by this PR turned out to be an existing use-after-free bug, fixed in #42145.
This manifested as sporadic segfaults during a 4K randwrite workload with QD=2 and higher, but it turned out to be much more serious: by the time it segfaulted, the cache's root was already corrupted on media, resulting in data loss.