librbd/cache/pwl/ssd/WriteLog: decrement m_bytes_allocated when retiring #41068

idryomov · 2021-04-28T13:51:31Z

Currently, if the ssd cache is filled to capacity, all future I/O hangs
indefinitely. Even though the cache eventually becomes clean and retires
enough entries to get back under RETIRE_HIGH_WATER, m_bytes_allocated is
never decremented, so the freed space isn't communicated to
AbstractWriteLog::check_allocation().

Fixes: https://tracker.ceph.com/issues/50560
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
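
For readers unfamiliar with the accounting involved, the hang can be illustrated with a toy model. This is only a sketch under stated assumptions: `WriteLogModel`, its members and `write_succeeds_after_retire()` are hypothetical names that loosely echo the identifiers in this PR, and the admission check is reduced to a single comparison. It is not the librbd implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the write log's space accounting.
// Field names echo the PR, but this is not the librbd code.
struct WriteLogModel {
  uint64_t bytes_allocated_cap;  // total bytes the cache may hand out
  uint64_t bytes_allocated = 0;  // bytes currently charged to log entries
  bool decrement_on_retire;      // true models the behaviour after this PR

  WriteLogModel(uint64_t cap, bool fixed)
    : bytes_allocated_cap(cap), decrement_on_retire(fixed) {}

  // Stand-in for the admission check: a write is admitted only if it fits.
  bool check_allocation(uint64_t bytes) {
    if (bytes_allocated + bytes > bytes_allocated_cap) {
      return false;  // the caller would have to wait for space
    }
    bytes_allocated += bytes;
    return true;
  }

  // Stand-in for retiring flushed entries that occupied `allocated_bytes`.
  void retire_entries(uint64_t allocated_bytes) {
    if (decrement_on_retire) {
      assert(bytes_allocated >= allocated_bytes);
      bytes_allocated -= allocated_bytes;
    }
    // Without the decrement, the freed space is never reported back.
  }
};

// Fill the cache, retire everything, then try one more write.
inline bool write_succeeds_after_retire(bool fixed) {
  WriteLogModel log(4096, fixed);
  bool filled = log.check_allocation(4096);
  assert(filled);
  (void)filled;  // avoid an unused-variable warning when NDEBUG is set
  log.retire_entries(4096);
  return log.check_allocation(512);
}
```

In this model `write_succeeds_after_retire(false)` keeps returning `false` no matter how much is retired, which corresponds to the indefinite wait described above, while `write_succeeds_after_retire(true)` admits the write once the retired bytes are subtracted.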

idryomov · 2021-04-28T15:45:30Z

cc @MahatiC

idryomov · 2021-04-28T15:47:08Z

cc @CongMinYin

CongMinYin · 2021-04-29T02:02:12Z

src/librbd/cache/pwl/ssd/WriteLog.cc

@@ -708,6 +708,8 @@ bool WriteLog<I>::retire_entries(const unsigned long int frees_per_tx) {
          m_first_valid_entry = first_valid_entry;
          ceph_assert(m_first_valid_entry % MIN_WRITE_ALLOC_SSD_SIZE == 0);
          this->m_free_log_entries += retiring_entries.size();
+          ceph_assert(this->m_bytes_allocated >= allocated_bytes);
+          this->m_bytes_allocated -= allocated_bytes;


This logic does exist in RWL, but it is omitted here.

idryomov · Apr 29, 2021

I'm not sure I understand the comment. Do you mean that it is omitted intentionally?

MahatiC · Apr 29, 2021

No, it's not omitted intentionally. This PR looks correct. I actually noticed this issue and fixed it locally along with a few more fixes and haven't upstreamed them yet. And I am currently running some tests as well. I will review this change once again by tomorrow. Thank you for this PR!

idryomov · Apr 29, 2021

FWIW I have another PR for the ssd mode almost ready -- fixing power cycle issues. Should be able to submit by tomorrow. I'll CC you, thanks for taking a look!


Follow rwl mode and use AbstractWriteLog::m_bytes_allocated_cap instead of m_log_pool_ring_buffer_size specific to ssd. This fixes "bytes available" calculation in STATS output. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>


idryomov · 2021-04-29T20:11:40Z

Added a fix for an existing free()/delete mismatch.
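
The offending code is not shown in this conversation, so the following is only a general illustration of the class of bug, not the change made in the commit. Memory obtained from the `malloc()` family must be released with `free()`, and memory obtained from `new`/`new[]` with `delete`/`delete[]`; mixing the two is undefined behaviour. One common way to make the pairing hard to get wrong is a `std::unique_ptr` with a matching deleter. `FreeDeleter`, `MallocBuffer` and `make_malloc_buffer()` are hypothetical names introduced for this sketch.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Deleter that releases malloc()-family memory with free().
struct FreeDeleter {
  void operator()(void* p) const noexcept { std::free(p); }
};

// Owning handle for a buffer that must be released with free().
using MallocBuffer = std::unique_ptr<char[], FreeDeleter>;

// Hypothetical helper: allocate a zero-filled buffer with calloc().
// The returned handle calls free() automatically, so a stray
// `delete` or `delete[]` on this memory cannot be written by accident.
inline MallocBuffer make_malloc_buffer(std::size_t len) {
  return MallocBuffer(static_cast<char*>(std::calloc(len, 1)));
}

// By contrast, memory from new[] pairs with delete[], which is what
// the default deleter of std::unique_ptr<char[]> already does.
inline std::unique_ptr<char[]> make_new_buffer(std::size_t len) {
  return std::unique_ptr<char[]>(new char[len]());
}
```

Either handle can be used the same way at the call site (`buf[i]`, `buf.get()`); the difference is only in which deallocation function runs when the handle goes out of scope.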

MahatiC · 2021-04-30T11:34:51Z

Thanks! Looks good to me.

idryomov added bug-fix rbd labels Apr 28, 2021

CongMinYin reviewed Apr 29, 2021


idryomov added 6 commits April 29, 2021 22:09

librbd/cache/pwl/ssd/WriteLog: fix indentation

a40676c

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

librbd/cache/pwl/ssd/WriteLog: fix free()/delete mismatch

5b89c47

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

librbd/cache/pwl: bump "Waiting for allocation" and "Retiring" dout level

626a995

Bump "Waiting for allocation" to 5. "Retiring" is at 20 for rwl and 1 for ssd. Bump the latter to 20 as well. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

librbd/cache/pwl: include head and tail pointers in STATS

2a974fd

While at it, reduce the number of calls to operator<< and drop the trailing comma. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

idryomov force-pushed the wip-rbd-pwl-ssd-capacity branch from b5b8150 to 2a974fd Compare April 29, 2021 20:09

idryomov merged commit 7f17f21 into ceph:master May 3, 2021

idryomov mentioned this pull request May 3, 2021

librbd/cache/pwl/ssd/WriteLog: don't crash on split log entries #41093

Merged

idryomov deleted the wip-rbd-pwl-ssd-capacity branch May 3, 2021 12:15

ideepika mentioned this pull request Nov 2, 2021

pacific: librbd/cache/pwl: persistant cache backports #43772

Merged
