pacific: os/bluestore: fix spillover alert #50932

Merged
merged 3 commits into ceph:pacific from wip-ifed-fix-spillover-alert-pac on May 16, 2023

@ifed01 ifed01 commented Apr 7, 2023

backport of #49987

backport tracker: https://tracker.ceph.com/issues/59340
parent tracker: https://tracker.ceph.com/issues/58440

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>


Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 431ca85)
Fixes: https://tracker.ceph.com/issues/58440

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 4f20521)

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 326eabe)

ljflores commented May 10, 2023

Hey @ifed01, I'm seeing a bluestore test fail consistently in the QA runs. Can you take a look?

http://pulpito.front.sepia.ceph.com/?branch=wip-yuri11-testing-2023-04-25-1605-pacific

/a/yuriw-2023-04-26_01:16:19-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7253708
/a/yuriw-2023-05-01_16:31:04-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7259813
/a/lflores-2023-05-03_21:43:07-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7261653

2023-04-29T06:18:33.927 INFO:teuthology.orchestra.run.smithi155.stdout:==> rm -r bluestore.test_temp_dir
2023-04-29T06:18:33.931 INFO:teuthology.orchestra.run.smithi155.stdout:[       OK ] ObjectStore/StoreTest.MultipoolListTest/2 (3440 ms)
2023-04-29T06:18:33.931 INFO:teuthology.orchestra.run.smithi155.stdout:[ RUN      ] ObjectStore/StoreTest.SimpleCloneTest/0
2023-04-29T06:18:34.015 INFO:teuthology.orchestra.run.smithi155.stderr:Creating collection meta
2023-04-29T06:18:34.015 INFO:teuthology.orchestra.run.smithi155.stderr:Creating object and set attr #-1:de000000::key:Object 1:head#
2023-04-29T06:18:34.016 INFO:teuthology.orchestra.run.smithi155.stderr:Clone object and rm attr
2023-04-29T06:18:34.016 INFO:teuthology.orchestra.run.smithi155.stderr:Invalid rm coll
2023-04-29T06:18:34.055 INFO:teuthology.orchestra.run.smithi155.stderr:Invalid rm coll again
2023-04-29T06:18:34.095 INFO:teuthology.orchestra.run.smithi155.stderr:Cleaning
2023-04-29T06:18:34.095 INFO:teuthology.orchestra.run.smithi155.stdout:==> rm -r memstore.test_temp_dir
2023-04-29T06:18:34.097 INFO:teuthology.orchestra.run.smithi155.stdout:[       OK ] ObjectStore/StoreTest.SimpleCloneTest/0 (167 ms)
2023-04-29T06:18:34.097 INFO:teuthology.orchestra.run.smithi155.stdout:[ RUN      ] ObjectStore/StoreTest.SimpleCloneTest/1
2023-04-29T06:18:34.856 INFO:teuthology.orchestra.run.smithi155.stderr:Creating collection meta
2023-04-29T06:18:34.856 INFO:teuthology.orchestra.run.smithi155.stderr:Creating object and set attr #-1:de000000::key:Object 1:head#
2023-04-29T06:18:34.856 INFO:teuthology.orchestra.run.smithi155.stderr:Clone object and rm attr
2023-04-29T06:18:35.976 INFO:teuthology.orchestra.run.smithi155.stderr:Invalid rm coll
2023-04-29T06:18:36.013 INFO:teuthology.orchestra.run.smithi155.stderr:Invalid rm coll again
2023-04-29T06:18:36.558 INFO:teuthology.orchestra.run.smithi155.stderr:Cleaning
2023-04-29T06:18:36.837 INFO:teuthology.orchestra.run.smithi155.stdout:==> rm -r filestore.test_temp_dir
2023-04-29T06:18:36.840 INFO:teuthology.orchestra.run.smithi155.stdout:[       OK ] ObjectStore/StoreTest.SimpleCloneTest/1 (2743 ms)
2023-04-29T06:18:36.841 INFO:teuthology.orchestra.run.smithi155.stdout:[ RUN      ] ObjectStore/StoreTest.SimpleCloneTest/2
2023-04-29T06:18:40.072 INFO:teuthology.orchestra.run.smithi155.stderr:Creating collection meta
2023-04-29T06:18:40.073 INFO:teuthology.orchestra.run.smithi155.stderr:Creating object and set attr #-1:de000000::key:Object 1:head#
2023-04-29T06:18:40.073 INFO:teuthology.orchestra.run.smithi155.stderr:Clone object and rm attr
2023-04-29T06:18:41.668 INFO:teuthology.orchestra.run.smithi155.stderr:Invalid rm coll
2023-04-29T18:06:12.144 DEBUG:teuthology.exit:Got signal 15; running 1 handler...
2023-04-29T18:06:12.253 DEBUG:teuthology.task.console_log:Killing console logger for smithi155
2023-04-29T18:06:12.254 DEBUG:teuthology.exit:Finished running handlers
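
For long teuthology logs, a small script can pick out the test that never finished. The following is a minimal sketch, not part of the Ceph tooling: the function name `unfinished_tests` and the regular expression are illustrative assumptions, and it only relies on the googletest `[ RUN ]` / `[ OK ]` status lines visible in the excerpt above.

```python
import re

# Matches googletest status lines such as "[ RUN      ] Suite.Test/0"
# and "[       OK ] Suite.Test/0 (167 ms)".
STATUS_RE = re.compile(r"\[\s*(RUN|OK|FAILED)\s*\]\s+(\S+)")


def unfinished_tests(log_lines):
    """Return test names that have a RUN line but no later OK/FAILED line.

    Hypothetical helper for illustration only.
    """
    running = []
    for line in log_lines:
        match = STATUS_RE.search(line)
        if not match:
            continue
        status, name = match.groups()
        if status == "RUN":
            running.append(name)
        elif name in running:
            # The test reported a result, so it is no longer pending.
            running.remove(name)
    return running


# Status lines copied from the log excerpt above.
sample = [
    "2023-04-29T06:18:33.931 INFO:teuthology.orchestra.run.smithi155.stdout:[       OK ] ObjectStore/StoreTest.MultipoolListTest/2 (3440 ms)",
    "2023-04-29T06:18:33.931 INFO:teuthology.orchestra.run.smithi155.stdout:[ RUN      ] ObjectStore/StoreTest.SimpleCloneTest/0",
    "2023-04-29T06:18:34.097 INFO:teuthology.orchestra.run.smithi155.stdout:[       OK ] ObjectStore/StoreTest.SimpleCloneTest/0 (167 ms)",
    "2023-04-29T06:18:34.097 INFO:teuthology.orchestra.run.smithi155.stdout:[ RUN      ] ObjectStore/StoreTest.SimpleCloneTest/1",
    "2023-04-29T06:18:36.840 INFO:teuthology.orchestra.run.smithi155.stdout:[       OK ] ObjectStore/StoreTest.SimpleCloneTest/1 (2743 ms)",
    "2023-04-29T06:18:36.841 INFO:teuthology.orchestra.run.smithi155.stdout:[ RUN      ] ObjectStore/StoreTest.SimpleCloneTest/2",
    "2023-04-29T18:06:12.144 DEBUG:teuthology.exit:Got signal 15; running 1 handler...",
]

print(unfinished_tests(sample))
```

Run against the excerpt, this reports `ObjectStore/StoreTest.SimpleCloneTest/2` as the only test that started without reporting a result before the job received signal 15.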

/a/yuriw-2023-04-26_01:16:19-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7253708/remote/smithi155/ceph_test_objectstore.log.gz

2023-04-29T06:18:34.015+0000 7f1bbeb84540  0 memstore(memstore.test_temp_dir) dump:{
    "collections": [
        {
            "name": "meta",
            "xattrs": [],
            "objects": [
                {
                    "name": "#-1:de000000::key:Object 1:head#",
                    "data_len": 65536,
                    "omap_header_len": 0,
                    "xattrs": [],
                    "omap": []
                },
                {
                    "name": "#-1:de000000::key:Object 2:head#",
                    "data_len": 65536,
                    "omap_header_len": 0,
                    "xattrs": [],
                    "omap": []
                }
            ]
        }
    ]
}

2023-04-29T06:18:34.023+0000 7f1bbeb84540 -1 *** Caught signal (Aborted) **
 in thread 7f1bbeb84540 thread_name:ceph_test_objec

 ceph version 16.2.12-89-g8d175760 (8d17576050e846ecd4a9899bc7d8ebbf771b4de8) pacific (stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f1bc8546420]
 2: gsignal()
 3: abort()
 4: ceph_test_objectstore(+0x80f89b) [0x5609e1eaf89b]
 5: ceph_test_objectstore(+0xc06176) [0x5609e22a6176]
 6: clone()


ifed01 commented May 11, 2023

> Hey @ifed01, I'm seeing a bluestore test fail consistently in the QA runs. Can you take a look?

Hey @ljflores - this looks very similar to the error you shared in #50506 two weeks ago.
As in that case, it looks unrelated, since it's memstore that is failing, while the PRs in question don't touch that object store. So I'm curious whether any PRs/commits were included in both QA runs. If not, I would say the issue is caused by something completely different, e.g. prior commits.
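
To check which object store backend each parameterized test instance ran against, the cleanup lines in the teuthology log can be paired with the test results. This is a rough sketch under one assumption taken from the excerpt quoted earlier in this thread: the `==> rm -r <store>.test_temp_dir` line appears directly before the `[ OK ]` line of the test it belongs to. The helper name `backend_by_test` is hypothetical.

```python
import re

# "==> rm -r memstore.test_temp_dir" names the store that was just cleaned up.
CLEANUP_RE = re.compile(r"rm -r (\w+)\.test_temp_dir")
# "[       OK ] Suite.Test/0 (167 ms)" names the test that just passed.
OK_RE = re.compile(r"\[\s*OK\s*\]\s+(\S+)")


def backend_by_test(log_lines):
    """Map each passed test to the store named in the preceding cleanup line.

    Hypothetical helper; assumes cleanup output directly precedes the OK line.
    """
    backends = {}
    last_store = None
    for line in log_lines:
        cleanup = CLEANUP_RE.search(line)
        if cleanup:
            last_store = cleanup.group(1)
            continue
        ok = OK_RE.search(line)
        if ok and last_store is not None:
            backends[ok.group(1)] = last_store
            last_store = None  # Each cleanup line is used for one test only.
    return backends


# Lines copied from the teuthology log quoted earlier in this thread.
sample = [
    "2023-04-29T06:18:33.927 INFO:teuthology.orchestra.run.smithi155.stdout:==> rm -r bluestore.test_temp_dir",
    "2023-04-29T06:18:33.931 INFO:teuthology.orchestra.run.smithi155.stdout:[       OK ] ObjectStore/StoreTest.MultipoolListTest/2 (3440 ms)",
    "2023-04-29T06:18:34.095 INFO:teuthology.orchestra.run.smithi155.stdout:==> rm -r memstore.test_temp_dir",
    "2023-04-29T06:18:34.097 INFO:teuthology.orchestra.run.smithi155.stdout:[       OK ] ObjectStore/StoreTest.SimpleCloneTest/0 (167 ms)",
    "2023-04-29T06:18:36.837 INFO:teuthology.orchestra.run.smithi155.stdout:==> rm -r filestore.test_temp_dir",
    "2023-04-29T06:18:36.840 INFO:teuthology.orchestra.run.smithi155.stdout:[       OK ] ObjectStore/StoreTest.SimpleCloneTest/1 (2743 ms)",
]

for test, store in backend_by_test(sample).items():
    print(f"{test}: {store}")
```

The mapping only covers tests that completed, so it should be read alongside the store-specific log file rather than as proof of which backend a hung or aborted test was using.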


ljflores commented May 15, 2023

> > Hey @ifed01, I'm seeing a bluestore test fail consistently in the QA runs. Can you take a look?
>
> Hey @ljflores - this looks very similar to the error you shared in #50506 two weeks ago. As in that case, it looks unrelated, since it's memstore that is failing, while the PRs in question don't touch that object store. So I'm curious whether any PRs/commits were included in both QA runs. If not, I would say the issue is caused by something completely different, e.g. prior commits.

@ifed01 Here was the Trello card this was tested on for reference: https://trello.com/c/RFvB8Ugn/1741-wip-yuri11-testing-2023-04-25-1605-pacific

See #50506 (comment) for how I think we should handle this.

@yuriw yuriw merged commit 7ec3217 into ceph:pacific May 16, 2023
8 checks passed
@ifed01 ifed01 deleted the wip-ifed-fix-spillover-alert-pac branch May 16, 2023 15:55