New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os/bluestore/BlueFS: do not hold internal lock while waiting for IO #9898
Conversation
assert(!h->file->dirty); | ||
uint64_t old_dirty_seq = h->file->dirty_seq; | ||
lock.unlock(); | ||
wait_for_aio(h); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem here, Thread T1 will finish _flush(updates log file) and waits for the IO to complete. Thread T2 can grab the lock, and do _flush(update the log file), releases the lock. T1 will grab and flushes the log file contains T1 contents + T2 contents. If T2 IO fails now, we will have log containing T2 updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Updated this to only allow one concurrent log flush at a time. It now passes your first test, but I'm unsure if we want to do something more complex here or not...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now we have the following test cases,
Test case 1:
- One writer for single(keeps appending to the file).
- Couple of threads, to generate more files(new).
Both threads keep doing fsync from both.
Test case 2:
Have multiple threads to generate new files. will keep doing fsync.
Will add the following
Test case 3:
Have couple of threads to do append only or create new files and have a sync_metadata thread.
This will generate log compaction once the threshould is hit.
If you have something in mind i can add them. This covers most of the uses cases coming to bluefs.
f62f56a
to
7512a56
Compare
Be accurate. No functional change. Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
We only need to dirty the log file if the allocate changes. Replay is smart enough to learn the file size as it goes. Signed-off-by: Sage Weil <sage@redhat.com>
Note when we dirty a file, and clean it only if that seq has been committed. Currently this is always the case because we don't drop the lock, but that will change shortly. Signed-off-by: Sage Weil <sage@redhat.com>
_flush_wait is safe to call without a lock, as long as our reference is stable. Rename it wait_for_aio() to be more clear about what it does and the fact that it doesn't require a lock. Signed-off-by: Sage Weil <sage@redhat.com>
It is safe to call without a lock. Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
We need to be careful with IOContext destruction racing with the bdev io completion thread. Signed-off-by: Sage Weil <sage@redhat.com>
Handle cases where we have multiple racing threads trying to flush the log by only allowing one concurrent log flush to be in progress at a time, and behave if, after flushing, there are no more dirty records to flush. Signed-off-by: Sage Weil <sage@redhat.com>
If we know what event we need to wait for, only wait long enough for it to flush. This helps the situation where another thread flushed what we needed, and more dirty stuff was added to log_t, but we don't need to wait for that too for our caller to be happy. Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Varada Kari <varada.kari@sandisk.com>
Signed-off-by: Varada Kari <varada.kari@sandisk.com>
…r block Signed-off-by: Varada Kari <varada.kari@sandisk.com>
a3631f9
to
6a696a6
Compare
I don't think we should merge this until unittest_bluefs gets some tests that
do parallel threads writing to the same and different files and doing fsync.