
qa: add tests for persistent writeback cache #38921

Merged: 2 commits merged into ceph:master on Feb 24, 2021

Conversation

lixiaoy1 (Contributor):

Signed-off-by: Li, Xiaoyan xiaoyan.li@intel.com

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

lixiaoy1 (Contributor, Author):

@dillaman I am still testing these cases in my local teuthology environment; please review the PR in the meantime.

dillaman:

@lixiaoy1 We also need a PR to tweak the default ceph.spec.in and debian/rules to enable RWL/SSD caches during the build.

lixiaoy1 (Contributor, Author):

> @lixiaoy1 We also need a PR to tweak the default ceph.spec.in and debian/rules to enable RWL/SSD caches during the build.

Opened #39049 for this.

lixiaoy1 changed the title from "WIP qa: add tests for persistent writeback cache" to "qa: add tests for persistent writeback cache" on Feb 4, 2021
lixiaoy1 (Contributor, Author) commented Feb 4, 2021:

@dillaman The test cases now run successfully. Please review.

- ceph:
conf:
client:
rbd_persistent_cache_mode: rwl

Nit: should probably be in a different directory from pool (i.e. cache_mode) and we would want a file to also exercise the SSD mode.
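
For illustration only, a parallel SSD-mode conf fragment could look like the sketch below (assuming the same override structure as the rwl file above; the exact file layout is up to the suite):

    - ceph:
        conf:
          client:
            rbd_persistent_cache_mode: ssd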

lixiaoy1 (Contributor, Author):

I created a new folder for rwl. The SSD file will be added in a later PR, since it needs to be tested first.

dillaman left a comment:

It would be good to run the xfstest workload as well.

conf:
client:
rbd_persistent_cache_mode: rwl
rbd_rwl_path: /tmp

I suspect that this path will be a tmpfs and will fail since it doesn't support direct IO. Might need to use a directory under ~ubuntu/cephtest and make sure you delete it at the end of the test run.
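
As a sketch of that suggestion (the directory name below is hypothetical), the conf could point the cache at a directory inside the per-run test dir instead of /tmp:

    conf:
      client:
        rbd_persistent_cache_mode: rwl
        rbd_rwl_path: /home/ubuntu/cephtest/rbd_pwl_cache

The test would then need to create that directory up front and remove it during cleanup.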

lixiaoy1 (Contributor, Author):

DONE.

"bench",
"--io-type",
"write"
]

Nit: maybe add --io-pattern rand to avoid the simple sequential case.
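
That is, something along the lines of the following (the image name is illustrative):

    rbd bench --io-type write --io-pattern rand testimage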

lixiaoy1 (Contributor, Author):

DONE

@@ -0,0 +1,76 @@
"""

Nit: not sure this really needs to be a new task. You can probably just write a small script (or use the built-in exec task) to start timeout <random>s rbd bench --io-type write ...

lixiaoy1 (Contributor, Author):

Deleted the new task and used timeout instead.

lixiaoy1 force-pushed the pwl_teuthology branch 3 times, most recently from 1b2cbdb to a2b39bc, on February 7, 2021
lixiaoy1 (Contributor, Author) commented Feb 7, 2021:

> It would be good to run the xfstest workload as well.

xfstest has been added.

image_format: 2
- exec:
client.0:
- "timeout 10s rbd bench --io-type write testimage || true"

Nit: --io-pattern rand

dillaman commented Feb 7, 2021:

@lixiaoy1 Looks like lots of failures

lixiaoy1 (Contributor, Author) commented Feb 8, 2021:

> @lixiaoy1 Looks like lots of failures

From the log, it failed to load the pwl_cache library.

2021-02-07T20:09:03.577 INFO:teuthology.orchestra.run.smithi078.stderr:2021-02-07T20:09:03.576+0000 7fbdc1008700 -1 load failed dlopen(): "/usr/lib64/ceph/librbd/libceph_librbd_pwl_cache.so: cannot open shared object file: No such file or directory" or "/usr/lib64/ceph/libceph_librbd_pwl_cache.so: cannot open shared object file: No such file or directory"
2021-02-07T20:09:03.578 INFO:teuthology.orchestra.run.smithi078.stderr:2021-02-07T20:09:03.576+0000 7fbdc1008700 -1 librbd::PluginRegistry: 0x5559861330c0 init: failed to load plugin: pwl_cache

lixiaoy1 (Contributor, Author) commented Feb 8, 2021:

@dillaman I found that removing the cache folder in '7-cleanup/cleanup.yaml' is not correct, because that cleanup runs before the rbd images are deleted. As a result, 'rbd remove' fails since it needs to open the cache during removal.
Do you know of a way to delete the cache folder only after the images have been deleted?

dillaman commented Feb 8, 2021:

> From the log, it failed to load the pwl_cache library.

Sounds like an issue with the RPM spec file, i.e. the library wasn't built. Running rpm -q https://4.chacra.ceph.com/r/ceph/wip-jd-testing/6a2c2d96bec1b7dd97e8c2b7114de81ce9ab38f9/centos/8/flavors/default/x86_64/librbd1-17.0.0-511.g6a2c2d96.el8.x86_64.rpm --list I can see that the plugin indeed isn't included, which means it was disabled in the spec.
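
The same check can be done against a locally downloaded package; a sketch (the package filename pattern is illustrative):

    # list the files packaged in the librbd1 rpm and look for the PWL plugin
    rpm -qpl librbd1-*.rpm | grep pwl_cache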

dillaman commented Feb 8, 2021:

> I found that removing the cache folder in '7-cleanup/cleanup.yaml' is not correct, because that cleanup runs before the rbd images are deleted.

What image is automatically deleted at the end of the test? I would expect any image created by the QEMU task to be deleted by the end of that task and not after the full test runs. Worst case, you could always disable the cache on the image before removing it. Also, realistically speaking, if you are removing the image, why would the cache care that it failed to open the cache? It's not like you are going to care about replaying the log.
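
One possible workaround along those lines, sketched with a hypothetical image name (per-image config normally overrides the client-level setting):

    # turn the persistent cache off for this image so removal doesn't try to open it
    rbd config image set testimage rbd_persistent_cache_mode disabled
    rbd remove testimage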

lixiaoy1 (Contributor, Author) commented Feb 9, 2021:

> From the log, it failed to load the pwl_cache library.
>
> Sounds like an issue with the RPM spec file, i.e. the library wasn't built. Running rpm -q https://4.chacra.ceph.com/r/ceph/wip-jd-testing/6a2c2d96bec1b7dd97e8c2b7114de81ce9ab38f9/centos/8/flavors/default/x86_64/librbd1-17.0.0-511.g6a2c2d96.el8.x86_64.rpm --list I can see that the plugin indeed isn't included, which means it was disabled in the spec.

Per https://github.com/ceph/ceph/blob/master/ceph.spec.in#L1276, rbd_rwl_cache needs to be specified for rpm packages to enable RWL.
I used debian packages, where it is enabled by default.
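
For reference, that kind of off-by-default conditional is normally switched on at build time with a --with flag; a sketch, assuming the bcond is named rbd_rwl_cache as in the linked spec line:

    # enable the RWL cache conditional when building the RPMs locally
    rpmbuild -ba ceph.spec --with rbd_rwl_cache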

lixiaoy1 (Contributor, Author) commented Feb 9, 2021:

> Also, realistically speaking, if you are removing the image, why would the cache care that it failed to open the cache? It's not like you are going to care about replaying the log.

From the code, removing an image acquires the exclusive lock in pre_remove_image(). Once the exclusive lock is acquired, the write-back cache is initialized, and because the cache folder has already been removed, that initialization fails.
https://github.com/ceph/ceph/blob/master/src/librbd/api/Image.cc#L800
https://github.com/ceph/ceph/blob/master/src/librbd/image/PreRemoveRequest.cc#L65

lixiaoy1 force-pushed the pwl_teuthology branch 3 times, most recently from e508922 to 86ae486, on February 9, 2021
lixiaoy1 (Contributor, Author) commented Feb 9, 2021:

> What image is automatically deleted at the end of the test? I would expect any image created by the QEMU task to be deleted by the end of that task and not after the full test runs. Worst case, you could always disable the cache on the image before removing it.

The issue is fixed by using the exec and exec_on_cleanup tasks.
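
Roughly, the resulting structure looks like the sketch below (the cache directory name is hypothetical):

    tasks:
    - exec:
        client.0:
          - "mkdir -p /home/ubuntu/cephtest/rbd_pwl_cache"
    # ... workload tasks run here ...
    - exec_on_cleanup:
        client.0:
          - "rm -rf /home/ubuntu/cephtest/rbd_pwl_cache"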

dillaman commented Feb 9, 2021:

> What image is automatically deleted at the end of the test? I would expect any image created by the QEMU task to be deleted by the end of that task and not after the full test runs. Worst case, you could always disable the cache on the image before removing it.
>
> The issue is fixed by using the exec and exec_on_cleanup tasks.

I'd prefer to see "remove" function correctly and as expected should the cache be corrupted / missing. In a perfect world the ExclusiveLock::acquire_request could take a bitset of flags and you could add a new exclusive_lock::ACQUIRE_FLAG_REMOVING flag that could be passed to the cache::pwl::InitRequest to ignore missing or corrupt caches. Alternatively, there could be another plugin hook to handle the removal case. Worst-case, add a bool to ImageCtx to indicate the image is being removed.

dillaman commented Feb 9, 2021:

> The rbd_rwl_cache needs to be specified for rpm packages to enable RWL.

The spec will need to default it to on -- our upstream builders will not be changed to flip the switch.

Signed-off-by: Li, Xiaoyan <xiaoyan.li@intel.com>
Signed-off-by: Li, Xiaoyan <xiaoyan.li@intel.com>
lixiaoy1 (Contributor, Author):

> The spec will need to default it to on -- our upstream builders will not be changed to flip the switch.

Opened #39539 to enable the write-back cache by default.

lixiaoy1 (Contributor, Author):

> I'd prefer to see "remove" function correctly and as expected should the cache be corrupted / missing. In a perfect world the ExclusiveLock::acquire_request could take a bitset of flags and you could add a new exclusive_lock::ACQUIRE_FLAG_REMOVING flag that could be passed to the cache::pwl::InitRequest to ignore missing or corrupt caches. Alternatively, there could be another plugin hook to handle the removal case. Worst-case, add a bool to ImageCtx to indicate the image is being removed.

@dillaman Thank you for your advice. This item will be implemented in a standalone PR.

lixiaoy1 (Contributor, Author):

Found another issue: #39567

dillaman commented Feb 21, 2021:

@lixiaoy1 It looks like PWL isn't passing the XFS tests:

2021-02-21T15:47:46.131+0000 7f557d5cd700  1 heartbeat_map reset_timeout 'librbd::cache::pwl::AbstractWriteLog::thread_pool thread 0x7f557d5cd700' had timed out after 60.000003815s

and

2021-02-21T15:50:35.722 INFO:tasks.qemu.client.0.smithi111.stdout:[ 2760.252211] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T15:50:35.774 INFO:tasks.qemu.client.0.smithi111.stdout:[ 2760.255681] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-02-21T15:52:35.728 INFO:tasks.qemu.client.0.smithi111.stdout:[ 2880.256148] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T15:52:35.729 INFO:tasks.qemu.client.0.smithi111.stdout:[ 2880.258483] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-02-21T15:54:35.730 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3000.260152] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T15:54:35.730 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3000.263734] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-02-21T15:56:35.736 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3120.268147] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T15:56:35.737 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3120.270529] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-02-21T15:58:35.738 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3240.272144] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T15:58:35.739 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3240.274668] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-02-21T16:00:35.757 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3360.276148] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T16:00:35.758 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3360.279784] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-02-21T16:02:35.747 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3480.284187] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T16:02:35.747 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3480.286522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-02-21T16:04:35.747 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3600.288234] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T16:04:35.748 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3600.291732] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-02-21T16:06:35.761 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3720.296221] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T16:06:35.762 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3720.298856] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2021-02-21T16:08:35.755 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3840.300114] INFO: task mount:17498 blocked for more than 120 seconds.
2021-02-21T16:08:35.755 INFO:tasks.qemu.client.0.smithi111.stdout:[ 3840.302441] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

dillaman commented Feb 21, 2021:

@lixiaoy1 Looks like this is where it hangs (note the 137 second timestamp delta between log entries after the PWL cache complained about the large IO):

2021-02-21T18:10:57.135+0000 7f2fff7fe700  5 librbd::cache::pwl::rwl::WriteLog: 0x7f2fe801b270 reserve_cache: can't allocate all data buffers: extending the pool by appending parts with headers is not supported!. image_extents=[[100806656,4923392]], image_extents_summary=[total_bytes=4923392, first_image_byte=100806656, last_image_byte=105730048], bl=buffer::list(len=4923392,
...snip...
), user_req=0x5650f8fd86b0, m_user_req_completed=0, m_deferred=1, detained=0, waited_lanes=0, waited_entries=1, waited_buffers=0
2021-02-21T18:13:11.700+0000 7f306e8bbf00 20 librbd::api::Io aio_write: ictx=0x5650f8682490, completion=0x5650f9dd6bf0, off=554766848, len=12288, flags=0

lixiaoy1 (Contributor, Author):

> 2021-02-21T18:13:11.700+0000 7f306e8bbf00 20 librbd::api::Io aio_write: ictx=0x5650f8682490, completion=0x5650f9dd6bf0, off=554766848, len=12288, flags=0

Thank you for the info. A fix is in progress: #39603

dillaman:

Going to merge this just so the existing bug doesn't get lost.

dillaman merged commit 5c991fe into ceph:master on Feb 24, 2021
CongMinYin (Contributor):

> What image is automatically deleted at the end of the test? I would expect any image created by the QEMU task to be deleted by the end of that task and not after the full test runs. Worst case, you could always disable the cache on the image before removing it.
>
> The issue is fixed by using the exec and exec_on_cleanup tasks.
>
> I'd prefer to see "remove" function correctly and as expected should the cache be corrupted / missing. In a perfect world the ExclusiveLock::acquire_request could take a bitset of flags and you could add a new exclusive_lock::ACQUIRE_FLAG_REMOVING flag that could be passed to the cache::pwl::InitRequest to ignore missing or corrupt caches. Alternatively, there could be another plugin hook to handle the removal case. Worst-case, add a bool to ImageCtx to indicate the image is being removed.

Hi @dillaman, I'm going to take on this work, but I don't quite understand the scenario. Can you give me an example? Why pass a flag, and what issue does it solve?

dillaman:

@CongMinYin In general, when removing an image, we don't care about the state or availability of data; we just want to ensure everything possible has been removed. Therefore, when removing, there is no need to attempt to open the cache, write back the cache, etc. Since the remove state machine acquires the exclusive lock (via librbd/image/PreRemoveRequest), if we could pass a bitmask flag to the acquire_lock method, we could have the PWL cache skip any unnecessary initialization when removing.

3 participants