Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify to use lvm block device. #1

Closed
wants to merge 3 commits into from
Closed

Conversation

kfyharukz
Copy link

No description provided.

@kfyharukz kfyharukz self-assigned this Oct 24, 2019
@@ -302,6 +303,14 @@ def _get_strategy(self):

self.strategy = strategy.with_auto_devices(self.args, unused_devices)

def _to_mapper_devs(unused_devices):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ここはクラスの中のメソッドなので、第一引数に self が必須となります。
(プログラムを書く人が self を明示する必要があります。)
また、最後は return で返しましょう。

  def _to_mapper_devs(self, unused_devices):
    ...
    return updated

@@ -290,6 +290,7 @@ def execute(self):
def _get_strategy(self):
strategy = get_strategy(self.args, self.args.devices)
unused_devices, self.filtered_devices = filter_devices(self.args)
unused_devices = _to_mapper_devs(unused_devices)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.<MethodName>() となります。

unused_devices = self._to_mapper_devs(unused_devices)

satoru-takeuchi pushed a commit that referenced this pull request Apr 2, 2020
* no need to discard_result(). as `output_stream::close()` returns an
  empty future<> already
* free the connected socket after the background task finishes, because:

we should not free the connected socket before the promise referencing it is fulfilled.

otherwise we have error messages from ASan, like

==287182==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000019aa0 at pc 0x55e2ae2de882 bp 0x7fff7e2bf080 sp 0x7fff7e2bf078
READ of size 8 at 0x611000019aa0 thread T0
    #0 0x55e2ae2de881 in seastar::reactor_backend_aio::await_events(int, __sigset_t const*) ../src/seastar/src/core/reactor_backend.cc:396
    #1 0x55e2ae2dfb59 in seastar::reactor_backend_aio::reap_kernel_completions() ../src/seastar/src/core/reactor_backend.cc:428
    #2 0x55e2adbea397 in seastar::reactor::reap_kernel_completions_pollfn::poll() (/var/ssd/ceph/build/bin/crimson-osd+0x155e9397)
    #3 0x55e2adaec6d0 in seastar::reactor::poll_once() ../src/seastar/src/core/reactor.cc:2789
    #4 0x55e2adae7cf7 in operator() ../src/seastar/src/core/reactor.cc:2687
    #5 0x55e2adb7c595 in __invoke_impl<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:60
    #6 0x55e2adb699b0 in __invoke_r<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:113
    #7 0x55e2adb50222 in _M_invoke /usr/include/c++/10/bits/std_function.h:291
    #8 0x55e2adc2ba00 in std::function<bool ()>::operator()() const /usr/include/c++/10/bits/std_function.h:622
    #9 0x55e2adaea491 in seastar::reactor::run() ../src/seastar/src/core/reactor.cc:2713
    #10 0x55e2ad98f1c7 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) ../src/seastar/src/core/app-template.cc:199
    #11 0x55e2a9e57538 in main ../src/crimson/osd/main.cc:148
    #12 0x7fae7f20de0a in __libc_start_main ../csu/libc-start.c:308
    #13 0x55e2a9d431e9 in _start (/var/ssd/ceph/build/bin/crimson-osd+0x117421e9)

0x611000019aa0 is located 96 bytes inside of 240-byte region [0x611000019a40,0x611000019b30)
freed by thread T0 here:
    #0 0x7fae80a4e487 in operator delete(void*, unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.6+0xac487)
    #1 0x55e2ae302a0a in seastar::aio_pollable_fd_state::~aio_pollable_fd_state() ../src/seastar/src/core/reactor_backend.cc:458
    #2 0x55e2ae2e1059 in seastar::reactor_backend_aio::forget(seastar::pollable_fd_state&) ../src/seastar/src/core/reactor_backend.cc:524
    #3 0x55e2adab9b9a in seastar::pollable_fd_state::forget() ../src/seastar/src/core/reactor.cc:1396
    #4 0x55e2adab9d05 in seastar::intrusive_ptr_release(seastar::pollable_fd_state*) ../src/seastar/src/core/reactor.cc:1401
    #5 0x55e2ace1b72b in boost::intrusive_ptr<seastar::pollable_fd_state>::~intrusive_ptr() /opt/ceph/include/boost/smart_ptr/intrusive_ptr.hpp:98
    #6 0x55e2ace115a5 in seastar::pollable_fd::~pollable_fd() ../src/seastar/include/seastar/core/internal/pollable_fd.hh:109
    #7 0x55e2ae0ed35c in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
    #8 0x55e2ae0ed3cf in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
    #9 0x55e2ae0ed943 in std::default_delete<seastar::net::api_v2::server_socket_impl>::operator()(seastar::net::api_v2::server_socket_impl*) const /usr/include/c++/10/bits/unique_ptr.h:81
    #10 0x55e2ae0db357 in std::unique_ptr<seastar::net::api_v2::server_socket_impl, std::default_delete<seastar::net::api_v2::server_socket_impl> >::~unique_ptr()
	/usr/include/c++/10/bits/unique_ptr.h:357    #11 0x55e2ae1438b7 in seastar::api_v2::server_socket::~server_socket() ../src/seastar/src/net/stack.cc:195
    #12 0x55e2aa1c7656 in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_destroy() /usr/include/c++/10/optional:260
    #13 0x55e2aa16c84b in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_reset() /usr/include/c++/10/optional:280
    #14 0x55e2ac24b2b7 in std::_Optional_base_impl<seastar::api_v2::server_socket, std::_Optional_base<seastar::api_v2::server_socket, false, false> >::_M_reset() /usr/include/c++/10/optional:432
    #15 0x55e2ac23f37b in std::optional<seastar::api_v2::server_socket>::reset() /usr/include/c++/10/optional:975
    #16 0x55e2ac21a2e7 in crimson::admin::AdminSocket::stop() ../src/crimson/admin/admin_socket.cc:265
    #17 0x55e2aa099825 in operator() ../src/crimson/osd/osd.cc:450
    #18 0x55e2aa0d4e3e in apply ../src/seastar/include/seastar/core/apply.hh:36

Signed-off-by: Kefu Chai <kchai@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Apr 24, 2020
This fixes the the following selinux error when using ceph-iscsi's
rbd-target-api daemon (rbd-target-gw has the same issue). They are
a result of the a python library, rtslib, which the daemons use.

Additional Information:
Source Context                system_u:system_r:ceph_t:s0
Target Context                system_u:object_r:configfs_t:s0
Target Objects
/sys/kernel/config/target/iscsi/iqn.2003-01.com.re
                              dhat:ceph-iscsi/tpgt_1/attrib/authentication
[
                              file ]
Source                        rbd-target-api
Source Path                   /usr/libexec/platform-python3.6
Port                          <Unknown>
Host                          ans8
Source RPM Packages           platform-python-3.6.8-15.1.el8.x86_64
Target RPM Packages
Policy RPM                    selinux-policy-3.14.3-20.el8.noarch
Selinux Enabled               True
Policy Type                   targeted
Enforcing Mode                Enforcing
Host Name                     ans8
Platform                      Linux ans8 4.18.0-147.el8.x86_64 #1 SMP
Thu Sep 26
                              15:52:44 UTC 2019 x86_64 x86_64
Alert Count                   1
First Seen                    2020-01-08 18:39:47 EST
Last Seen                     2020-01-08 18:39:47 EST
Local ID                      6f8c3415-7a50-4dc8-b3d2-2621e1d00ca3

Raw Audit Messages
type=AVC msg=audit(1578526787.577:68): avc:  denied  { ioctl } for
pid=995 comm="rbd-target-api"
path="/sys/kernel/config/target/iscsi/iqn.2003-01.com.redhat:ceph-iscsi/tpgt_1/attrib/authentication"
dev="configfs" ino=25703 ioctlcmd=0x5401
scontext=system_u:system_r:ceph_t:s0
tcontext=system_u:object_r:configfs_t:s0 tclass=file permissive=1

type=SYSCALL msg=audit(1578526787.577:68): arch=x86_64 syscall=ioctl
success=no exit=ENOTTY a0=34 a1=5401 a2=7ffd4f8f1f60 a3=3052cd2d95839b96
items=0 ppid=1 pid=995 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm=rbd-target-api
exe=/usr/libexec/platform-python3.6 subj=system_u:system_r:ceph_t:s0
key=(null)

Hash: rbd-target-api,ceph_t,configfs_t,file,ioctl

Signed-off-by: Mike Christie <mchristi@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Apr 24, 2020
This fixes the selinux errors like this for /etc/target

-----------------------------------
Additional Information:
Source Context                system_u:system_r:ceph_t:s0
Target Context                system_u:object_r:targetd_etc_rw_t:s0
Target Objects                target [ dir ]
Source                        rbd-target-api
Source Path                   rbd-target-api
Port                          <Unknown>
Host                          ans8
Source RPM Packages
Target RPM Packages
Policy RPM                    selinux-policy-3.14.3-20.el8.noarch
Selinux Enabled               True
Policy Type                   targeted
Enforcing Mode                Enforcing
Host Name                     ans8
Platform                      Linux ans8 4.18.0-147.el8.x86_64 #1 SMP
Thu Sep 26
                              15:52:44 UTC 2019 x86_64 x86_64
Alert Count                   1
First Seen                    2020-01-08 18:39:48 EST
Last Seen                     2020-01-08 18:39:48 EST
Local ID                      9a13ee18-eaf2-4f2a-872f-2809ee4928f6

Raw Audit Messages
type=AVC msg=audit(1578526788.148:69): avc:  denied  { search } for
pid=995 comm="rbd-target-api" name="target" dev="sda1" ino=52198
scontext=system_u:system_r:ceph_t:s0
tcontext=system_u:object_r:targetd_etc_rw_t:s0 tclass=dir permissive=1

Hash: rbd-target-api,ceph_t,targetd_etc_rw_t,dir,search

which are a result of the rtslib library the ceph-iscsi daemons use
accessing /etc/target to read/write a file which stores meta data the
target uses.

Signed-off-by: Mike Christie <mchristi@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Aug 28, 2020
Changes addressing comments in PR - commit to be
squashed prior to merge

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Jul 27, 2021
Following crash occured at Sepia [1]:

```
INFO  2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] ProtocolV2::start_accept(): targ
et_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] TRIGGER ACCEPTING, was NONE
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] SEND(26) banner: len_payload=16,
 supported=1, required=0, banner="ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(10) banner: "ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT banner: payload_len=16
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(16) banner features: supported=1 required=0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] WRITE HelloFrame: my_type=osd, peer_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT HelloFrame: my_type=client peer_addr=v2:172.21.15.119:6803/31733
INFO  2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] UPDATE: peer_type=client, policy(lossy=true server=true standby=false resetcheck=false)
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] GOT AuthRequestFrame: method=2, preferred_modes={1, 2}, payload_len=174
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4622-gaa1dc559/rpm/el8/BUILD/ceph-17.0.0-4622-gaa1dc559/src/crimson/mon/MonClient.cc:399:10: runtime error: member access within null pointer of type 'struct Connection'
Segmentation fault on shard 0.
Backtrace:
 0# 0x000055E84CF44C1F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F2BC88C0B20 in /lib64/libpthread.so.0
 4# crimson::mon::Connection::get_conn() in ceph-osd
 5# crimson::mon::Client::handle_auth_request(seastar::shared_ptr<crimson::net::Connection>, seastar::lw_shared_ptr<AuthConnectionMeta>, bool, unsigned int, ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list*) in ceph-osd
 6# crimson::net::ProtocolV2::_handle_auth_request(ceph::buffer::v15_2_0::list&, bool) in ceph-osd
 7# 0x000055E84DF67669 in ceph-osd
 8# 0x000055E84DF68775 in ceph-osd
 9# 0x000055E846F47F60 in ceph-osd
10# 0x000055E85296770F in ceph-osd
11# 0x000055E85296CC50 in ceph-osd
12# 0x000055E852B1ECBB in ceph-osd
13# 0x000055E85267C73A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
Fault at location: 0x98
```

[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136907

When the `handle_auth_request()` happens, there is no guarantee
`active_con` is being available. This is reflected in the classical
implementation:

```cpp
int MonClient::handle_auth_request(
  Connection *con,
  // ...
  ceph::buffer::list *reply)
{
  // ...
  bool isvalid = ah->verify_authorizer(
    cct,
    *rotating_secrets,
    payload,
    auth_meta->get_connection_secret_length(),
    reply,
    &con->peer_name,
    &con->peer_global_id,
    &con->peer_caps_info,
    &auth_meta->session_key,
    &auth_meta->connection_secret,
    ac);
```

The patch transplate the same logic to crimson.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Jul 27, 2021
The `FuturizedStore` interface imposes the `get_attr()`
takes the `name` parameter as `std::string_view`, and
thus burdens implementations with extending the life-
time of the data the instance refers to.

Unfortunately, `AlienStore` is unaware that prolonging
the life of a `std::string_view` instance doesn't prolong
the data memory it points to. This problem has manifested
in the following use-after-free detected at Sepia:

```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136929$ less ./remote/smithi194/log/ceph-osd.7.log.gz
...
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - do_osd_ops_execute: object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head - handling op
call
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - handling op call on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - calling method lock.lock, num_read=0, num_write=0
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - handling op getxattr on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - getxattr on obj=14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head for attr=_lock.TestLockPP1
DEBUG 2021-05-26 20:24:54,078 [shard 0] bluestore - get_attr
=================================================================
==34068==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030001851d0 at pc 0x7f824d6a5b27 bp 0x7f822b4201c0 sp 0x7f822b41f968
READ of size 17 at 0x6030001851d0 thread T28 (alien-store-tp)
...
    #0 0x7f824d6a5b26  (/lib64/libasan.so.5+0x40b26)
    #1 0x55e2cbb2e00b  (/usr/bin/ceph-osd+0x2b6dc00b)
    #2 0x55e2d31f086e  (/usr/bin/ceph-osd+0x32d9e86e)
    #3 0x55e2d3467607 in crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) (/usr/bin/ceph-osd+0x33015607)
    #4 0x55e2d346b14a  (/usr/bin/ceph-osd+0x3301914a)
    #5 0x7f8249d32ba2  (/lib64/libstdc++.so.6+0xc2ba2)
    #6 0x7f824a00d149 in start_thread (/lib64/libpthread.so.0+0x8149)
    #7 0x7f82486edf22 in clone (/lib64/libc.so.6+0xfcf22)

0x6030001851d0 is located 0 bytes inside of 31-byte region [0x6030001851d0,0x6030001851ef)
freed by thread T0 here:
    #0 0x7f824d757688 in operator delete(void*) (/lib64/libasan.so.5+0xf2688)

previously allocated by thread T0 here:
    #0 0x7f824d7567b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0)

Thread T28 (alien-store-tp) created by T0 here:
    #0 0x7f824d6b7ea3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52ea3)

SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.5+0x40b26)
Shadow bytes around the buggy address:
  0x0c06800289e0: fd fd fd fa fa fa fd fd fd fa fa fa 00 00 00 fa
  0x0c06800289f0: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
  0x0c0680028a00: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
  0x0c0680028a10: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
  0x0c0680028a20: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
=>0x0c0680028a30: fd fd fa fa fd fd fd fd fa fa[fd]fd fd fd fa fa
  0x0c0680028a40: fd fd fd fd fa fa fd fd fd fd fa fa 00 00 00 07
  0x0c0680028a50: fa fa 00 00 00 fa fa fa 00 00 00 fa fa fa fd fd
  0x0c0680028a60: fd fd fa fa fd fd fd fd fa fa fd fd fd fd fa fa
  0x0c0680028a70: 00 00 00 00 fa fa fd fd fd fd fa fa fd fd fd fd
  0x0c0680028a80: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==34068==ABORTING
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Jul 27, 2021
Commit b5efdc6 has unified
the interruption handling among `InternalClientRequest` and
`ClientRequest`. Unfortunately, a call to `maybe_reset()` of
`OpSequencer` has been overlooked and dropped leading to the
`assert(prev_op > last_unblocked)` assertion failure in
`start_op()`.

This was the root cause of the following problem at Sepia:

```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136929$ less ./remote/smithi194/log/ceph-osd.6.log.gz
...
DEBUG 2021-05-26 20:24:53,988 [shard 0] ms - [osd.6(client) v2:172.21.15.194:6804/34047 >> client.4453 172.21.15.67:0/3814935464@37042] <== #1
 === osd_op(client.4453.0:5 12.6 12.7fc1f406 (undecoded) ondisk+write+known_if_redirected e52) v8 (42)
DEBUG 2021-05-26 20:24:53,988 [shard 0] osd - client_request(id=4, detail=osd_op(client.4453.0:5 12.6 12.7fc1f406 (undecoded) ondisk+write+known_if_redirected e52) v8): start
DEBUG 2021-05-26 20:24:53,988 [shard 0] osd - client_request(id=4, detail=osd_op(client.4453.0:5 12.6 12.7fc1f406 (undecoded) ondisk+write+known_if_redirected e52) v8): in repeat
DEBUG 2021-05-26 20:24:53,988 [shard 0] osd - client_request(id=4, detail=osd_op(client.4453.0:5 12.6 12.7fc1f406 (undecoded) ondisk+write+known_if_redirected e52) v8) same_interval_since: 19
DEBUG 2021-05-26 20:24:53,988 [shard 0] osd - do_recover_missing check for recovery, 12:602f83fe:::foo:head
DEBUG 2021-05-26 20:24:53,988 [shard 0] osd - client_request(id=4, detail=osd_op(client.4453.0:5 12.6 12:602f83fe:::foo:head {write 0~128 in=128b} snapc 0={} ondisk+write+known_if_redirected e52) v8): got obc lock
...
DEBUG 2021-05-26 20:25:21,810 [shard 0] osd - client_request(id=4, detail=osd_op(client.4453.0:5 12.6 12:602f83fe:::foo:head {write 0~128 in=1
28b} snapc 0={} ondisk+write+known_if_redirected e52) v8): in repeat
...
DEBUG 2021-05-26 20:25:21,809 [shard 0] osd - should_abort_request operation restart, acting set changed
...
DEBUG 2021-05-26 20:25:21,813 [shard 0] osd - client_request(id=4, detail=osd_op(client.4453.0:5 12.6 12:602f83fe:::foo:head {write 0~128 in=128b} snapc 0={} ondisk+write+known_if_redirected e52) v8) same_interval_since: 55
...
DEBUG 2021-05-26 20:25:21,813 [shard 0] osd - client_request(id=4, detail=osd_op(client.4453.0:5 12.6 12:602f83fe:::foo:head {write 0~128 in=128b} snapc 0={} ondisk+write+known_if_redirected e52) v8) same_interval_since: 55
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4622-gaa1dc559/rpm/el8/BUILD/ceph-17.0.0-4622-gaa1dc559/src/crimson/osd/osd_operation_sequencer.h:52: seastar::futurize_t<Result> crimson::osd::OpSequencer::start_op(HandleT&, uint64_t, uint64_t, FuncT&&) [with HandleT = crimson::PipelineHandle; FuncT = crimson::interruptible::interruptor<InterruptCond>::wrap_function(Func&&) [with Func = crimson::osd::ClientRequest::start()::<lambda()> mutable::<lambda(Ref<crimson::osd::PG>)> mutable::<lambda()> mutable::<lambda()>; InterruptCond = crimson::osd::IOInterruptCondition]::<lambda()>; Result = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; seastar::futurize_t<Result> = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; uint64_t = long unsigned int]: Assertion `prev_op > last_unblocked' failed.
Aborting on shard 0.
Backtrace:
 0# 0x000055F440239C1F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F4788336B20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007F4786931B09 in /lib64/libc.so.6
 7# 0x00007F478693FDE6 in /lib64/libc.so.6
 8# 0x000055F43BA3DD17 in ceph-osd
 9# 0x000055F43BA419A8 in ceph-osd
10# seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<boost::intrusive_ptr<crimson::osd::PG> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >(seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&, seastar::future_state<boost::intrusive_ptr<crimson::osd::PG> >&&)#1}, boost::intrusive_ptr<crimson::osd::PG> >::run_and_dispose() in ceph-osd
11# 0x000055F445C5C70F in ceph-osd
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Jul 27, 2021
This is an AlienStore's counterpart of 9fbd601
which becomes necessary as, since 6b66c30,
we call `open_collection()` to check the existence of OSD's superblock.

Without this fix the following crash happens:

```
[rzarzynski@o06 build]$ MDS=0 MGR=0 OSD=1 ../src/vstart.sh -l -n --without-dashboard --crimson
...
crimson-osd: boost/include/boost/smart_ptr/intrusive_ptr.hpp:199: T* boost::intrusive_ptr<T>::operator->() const [with T = ObjectStore::CollectionImpl]: Assertion `px != 0' failed.
Aborting on shard 0.
Backtrace:
 0# print_backtrace(std::basic_string_view<char, std::char_traits<char> >) at /home/rzarzynski/ceph1/build/boost/include/boost/stacktrace/stacktrace.hpp:129
 1# FatalSignal::signaled(int, siginfo_t const*) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:80
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::operator()(int, siginfo_t*, void*) const at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:41
 3# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:36
 4# 0x00007F8BF534DB30 in /lib64/libpthread.so.0
 5# gsignal in /lib64/libc.so.6
 6# abort in /lib64/libc.so.6
 7# 0x00007F8BF3948B19 in /lib64/libc.so.6
 8# 0x00007F8BF3956DF6 in /lib64/libc.so.6
 9# boost::intrusive_ptr<ObjectStore::CollectionImpl>::operator->() const at /home/rzarzynski/ceph1/build/boost/include/boost/smart_ptr/intrusive_ptr.hpp:200
10# crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}::operator()(boost::intrusive_ptr<ObjectStore::CollectionImpl>) const at /home/rzarzynski/ceph1/build/../src/crimson/os/alienstore/alien_store.cc:214
11# seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > seastar::futurize<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >::invoke<crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:2135
12# auto seastar::futurize_invoke<crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:2167
13# _ZZN7seastar6futureIN5boost13intrusive_ptrIN11ObjectStore14CollectionImplEEEE4thenIZN7crimson2os10AlienStore15open_collectionERK6coll_tEUlS5_E0_NS0_INS2_INS9_19FuturizedCollectionEEEEEEET0_OT_ENUlDpOT_E_clIJS5_EEEDaSN_ at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:1526
14# _ZN7seastar20noncopyable_functionIFNS_6futureIN5boost13intrusive_ptrIN7crimson2os19FuturizedCollectionEEEEEONS3_IN11ObjectStore14CollectionImplEEEEE17direct_vtable_forIZNS1_ISB_E4thenIZNS5_10AlienStore15open_collectionERK6coll_tEUlSB_E0_S8_EET0_OT_EUlDpOT_E_E4callEPKSE_SC_ at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/util/noncopyable_function.hh:125
15# seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>::operator()(boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) const at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/util/noncopyable_function.hh:210
16# seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > std::__invoke_impl<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> >, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(std::__invoke_other, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/bits/invoke.h:60
17# std::__invoke_result<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >::type std::__invoke<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/bits/invoke.h:97
18# std::invoke_result<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >::type std::invoke<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/functional:83
19# auto seastar::internal::future_invoke<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:1213
20# seastar::future<boost::intrusive_ptr<ObjectStore::CollectionImpl> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&)#1}::operator()(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&) const::{lambda()#1}::operator()() const at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:1575
21# void seastar::futurize<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >::satisfy_with_result_of<seastar::future<boost::intrusive_ptr<ObjectStore::CollectionImpl> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&)#1}::operator()(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&) const::{lambda()#1}>(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&) at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:2120
22# seastar::future<boost::intrusive_ptr<ObjectStore::CollectionImpl> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&)#1}::operator()(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&) const at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:1571
23# seastar::continuation<seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<ObjectStore::CollectionImpl> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&)#1}, boost::intrusive_ptr<ObjectStore::CollectionImpl> >::run_and_dispose() at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:771
24# seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /home/rzarzynski/ceph1/build/../src/seastar/src/core/reactor.cc:2237
25# seastar::reactor::run_some_tasks() at /home/rzarzynski/ceph1/build/../src/seastar/src/core/reactor.cc:2646 (discriminator 1)
26# seastar::reactor::run() at /home/rzarzynski/ceph1/build/../src/seastar/src/core/reactor.cc:2805
27# seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at /home/rzarzynski/ceph1/build/../src/seastar/src/core/app-template.cc:207 (discriminator 7)
28# seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at /home/rzarzynski/ceph1/build/../src/seastar/src/core/app-template.cc:115 (discriminator 2)
29# main at /home/rzarzynski/ceph1/build/../src/crimson/osd/main.cc:206 (discriminator 1)
30# __libc_start_main in /lib64/libc.so.6
31# _start in /home/rzarzynski/ceph1/build/bin/crimson-osd
Reactor stalled for 33157 ms on shard 0. Backtrace: 0xb14ab 0xa6cd418 0xa6a496d 0xa54fd22 0xa566e3d 0xa56383c 0xa5639fc 0xa566808 0x12b2f 0x3780e 0x21c44 0x21b18 0x2fdf5 0x7cfdf93 0x7b78622 0x7c3fcb7 0x7c1bcb2 0x7c1bdf1 0x7c4004a 0x7d913c7 0x7d875ed 0x7d777d7 0x7d5f626 0x7d47ca4 0x7d47b5a 0x7d5f770 0x7d47931 0x7d9bb74 0xa5998b5 0xa59e1ff 0xa5a3a95 0xa43546c 0xa433331 0x3f01992 0x237c2 0x3c7f9bd
../src/vstart.sh: line 28: 665993 Aborted                 (core dumped) PATH=$CEPH_BIN:$PATH "$@"
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Jul 27, 2021
…uence.

The `active_con` can get invalidated every single time when there is
a preemption point. This includes even the middle connection open
sequence as it's spread across multiple continuations!

Unfortunately, we don't check for `active_con` in the lambdas inside
the `on_session_opened()` method. That was the reason of the following
crash at Sepia [1]:

```
INFO  2021-06-08 09:36:23,992 [shard 0] monc - do_auth_single: connection closed
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4967-g96cdf983/rpm/el8/BUILD/ceph-17.0.0-4967-g96cdf983/src/crimson/mon/MonClient.cc:399:10: runtime error: member access within null pointer of type 'struct Connection'
Segmentation fault on shard 0.
Backtrace:
 0# 0x000055C3C1CA860F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007FAAE1713B20 in /lib64/libpthread.so.0
 4# crimson::mon::Connection::get_conn() in ceph-osd
 5# 0x000055C3C2532DA8 in ceph-osd
 6# 0x000055C3C2535CB5 in ceph-osd
 7# 0x000055C3BBC9FC70 in ceph-osd
 8# 0x000055C3C76FAE5F in ceph-osd
 9# 0x000055C3C77003A0 in ceph-osd
10# 0x000055C3C78B240B in ceph-osd
11# 0x000055C3C740FE8A in ceph-osd
12# 0x000055C3C7419FAE in ceph-osd
13# main in ceph-osd
14# __libc_start_main in /lib64/libc.so.6
15# _start in ceph-osd
Fault at location: 0x98
```

[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-06-08_09:11:05-rados-master-distro-basic-smithi/6159602

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Jul 27, 2021
It's yet another racing issue which happens when auth request
handling is performed during the `active_con` reset sequence.
It caused the following `nullptr` dereference at Sepia:

```
DEBUG 2021-06-09 10:27:24,059 [shard 0] ms - [osd.6(client) v2:172.21.15.170:6809/33397 >> client.? -@39840] GOT AuthRequestFrame: method=2, p
referred_modes={2, 1}, payload_len=174
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4977-g65cb255e/rpm/el8/BUILD/ceph-17.0.0-4977-g65cb255e/src/crimson/mon/MonClient.cc:595:26: runtime error: member call on null pointer of type 'struct Connection'
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4977-g65cb255e/rpm/el8/BUILD/ceph-17.0.0-4977-g65cb255e/src/crimson/mon/MonClient.cc:178:11: runtime error: member access within null pointer of type 'struct Connection'
Segmentation fault on shard 0.
Backtrace:
 0# 0x0000563F9C00395F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F4A064D0B20 in /lib64/libpthread.so.0
 4# crimson::mon::Connection::get_keys() in ceph-osd
 5# crimson::mon::Client::handle_auth_request(seastar::shared_ptr<crimson::net::Connection>, seastar::lw_shared_ptr<AuthConnectionMeta>, bool, unsigned int, ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list*) in ceph-osd
 6# crimson::net::ProtocolV2::_handle_auth_request(ceph::buffer::v15_2_0::list&, bool) in ceph-osd
 7# 0x0000563F9D007B39 in ceph-osd
 8# 0x0000563F9D008C45 in ceph-osd
 9# 0x0000563F95FF8D70 in ceph-osd
10# 0x0000563FA1A560BF in ceph-osd
11# 0x0000563FA1A5B600 in ceph-osd
12# 0x0000563FA1C0D66B in ceph-osd
13# 0x0000563FA176B0EA in ceph-osd
14# 0x0000563FA177520E in ceph-osd
15# main in ceph-osd
16# __libc_start_main in /lib64/libc.so.6
17# _start in ceph-osd
Fault at location: 0xb0
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Jul 27, 2021
Some time ago we replaced the single, `boost::lockfree`-based queue
in `ThreadPool` with the in-house, lockish `ShardedWorkQueue` vector.
Unfortunately, pushing into such queue isn't synchronized with
consuming from it -- the former happens without locking the `mutex`.
As the underlying primitive behind `ShardedWorkQueue::pending` is
plain `std::deque`, it's unsafe to operate that way in multi-thread
environment. Indeed, weirdly looking crashes have been spotted at Sepia:

```
(virtualenv) rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-06-21_14:49:36-rados-master-distro-basic-smithi/6182668$ less ./remote/smithi196/log/ceph-osd.7.log.gz
...
 0# 0x000055862FD67ADF in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007FB22CF36B20 in /lib64/libpthread.so.0
 4# 0x00005586357540E4 in ceph-osd
 5# 0x00007FB22CF36B20 in /lib64/libpthread.so.0
 6# pthread_cond_timedwait in /lib64/libpthread.so.0
 7# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
 8# 0x00005586313E303B in ceph-osd
 9# 0x00007FB22CC51BA3 in /lib64/libstdc++.so.6
10# 0x00007FB22CF2C14A in /lib64/libpthread.so.0
11# clone in /lib64/libc.so.6
Fault at location: 0x18
daemon-helper: command crashed with signal 11
```

This fix introduces the synchronization to the `push_back()` method of
`ShardedWorkQueue`. The side effect is that it may stall the reactor.
Therefore, a follow-up change that switches to e.g. `boost::lockfree`
is expected.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Jul 27, 2021
The assert in the ctor of `InternalClientRequest` actually operates on
the ctor's argument we `std::moved` from, not on the class' member.
When a debug build is used, this translates into failures like the one
below:

```
2021-06-16T22:53:03.410 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:02 smithi170 conmon[43770]: ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-
build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4987-gec8844b6/rpm/el8/BUILD/ceph-17.0.0-4987-gec8844b6
/src/crimson/osd/osd_operations/internal_client_request.cc:19: crimson::osd::InternalClientRequest::InternalClientRequest(Ref<crimson::osd::PG>): Assertion `bool(pg)' f
ailed.
2021-06-16T22:53:05.363 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  0# 0x0000558BE7BBF68F in /usr/bin/ceph-osd
2021-06-16T22:53:05.363 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  1# FatalSignal::signaled(int, siginfo_t const*) in /usr/bi
n/ceph-osd
2021-06-16T22:53:05.363 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  2# FatalSignal::install_oneshot_signal_handler<6>()::{lamb
da(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in /usr/bin/ceph-osd
2021-06-16T22:53:05.364 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  3# 0x00007F8AD7535B20 in /lib64/libpthread.so.0
2021-06-16T22:53:05.364 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  4# gsignal in /lib64/libc.so.6
2021-06-16T22:53:05.364 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  5# abort in /lib64/libc.so.6
2021-06-16T22:53:05.364 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  6# 0x00007F8AD5B2EC89 in /lib64/libc.so.6
2021-06-16T22:53:05.365 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  7# 0x00007F8AD5B3CA76 in /lib64/libc.so.6
2021-06-16T22:53:05.365 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  8# crimson::osd::InternalClientRequest::InternalClientRequ
est(boost::intrusive_ptr<crimson::osd::PG>) in /usr/bin/ceph-osd
2021-06-16T22:53:05.365 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]:  9# crimson::osd::Watch::do_watch_timeout(boost::intrusive_ptr<crimson::osd::PG>) in /usr/bin/ceph-osd
2021-06-16T22:53:05.365 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 10# seastar::noncopyable_function<void ()>::direct_vtable_for<crimson::osd::Watch::Watch(crimson::osd::Watch::private_ctag_t, boost::intrusive_ptr<crimson::osd::ObjectContext>, watch_info_t const&, entity_name_t const&, boost::intrusive_ptr<crimson::osd::PG>)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) in /usr/bin/ceph-osd
2021-06-16T22:53:05.366 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 11# 0x0000558BED653759 in /usr/bin/ceph-osd
2021-06-16T22:53:05.366 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 12# 0x0000558BED61B148 in /usr/bin/ceph-osd
2021-06-16T22:53:05.366 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 13# 0x0000558BED61B576 in /usr/bin/ceph-osd
2021-06-16T22:53:05.366 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 14# 0x0000558BED7C93C9 in /usr/bin/ceph-osd
2021-06-16T22:53:05.367 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 15# 0x0000558BED326D5A in /usr/bin/ceph-osd
2021-06-16T22:53:05.367 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 16# 0x0000558BED330E7E in /usr/bin/ceph-osd
2021-06-16T22:53:05.367 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 17# main in /usr/bin/ceph-osd
2021-06-16T22:53:05.367 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 18# __libc_start_main in /lib64/libc.so.6
2021-06-16T22:53:05.368 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 19# _start in /usr/bin/ceph-osd
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Jul 27, 2021
`FuturizedStore` and `ObjectStore` use different memory layout for
conveying object attributes: map of `bufferlists` and map of `bptrs`
respectively. Unfortunately, `AlienStore` was trying to solve this
mismatch with just a `reinterpret_cast`.

Very likely this problem was the root cause behind the observed
crashes in `PGBackend::load_matadata` like the following one:

```
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: DEBUG 2021-06-15 09:24:19,199 [shard 0] osd - peering_event(id=412, detail=PeeringEvent(from=7 pgid=5.14 sent=49 requested=49 evt=epoch_sent: 49 epoch_requested: 49 MInfoRec from 7 info: 5.14( v 45'2 (0'0,45'2] local-lis/les=48/49 n=0 ec=44/44 lis/c=48/44 les/c/f=49/45/0 sis=48) pg_lease_ack(ruub 19.176788330s))): complete
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: Segmentation fault on shard 0.
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: Backtrace:
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  0# 0x000055C99757FFBF in /usr/bin/ceph-osd
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  1# FatalSignal::signaled(int, siginfo_t const*) in /usr/bin/ceph-osd
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in /usr/bin/ceph-osd
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  3# 0x00007F34BB632B20 in /lib64/libpthread.so.0
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  4# 0x000055C99263D4D2 in /usr/bin/ceph-osd
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  5# 0x000055C992740E47 in /usr/bin/ceph-osd
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  6# seastar::continuation<seastar::internal::promise_base_with_type<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > >, seastar::noncopyable_function<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > > (seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >&&)>, seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >::then_wrapped_nrvo<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > >, seastar::noncopyable_function<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > > (seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >&&)> >(seastar::noncopyable_function<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > > (seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >&&)>&&)::{lambda(seastar::internal::promise_base_with_type<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > >&&, seastar::noncopyable_function<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > > (seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >&&)>&, seastar::future_state<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >&&)#1}, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >::run_and_dispose() in /usr/bin/ceph-osd
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  7# 0x000055C99CFD195F in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  8# 0x000055C99CFD6EA0 in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]:  9# 0x000055C99D188F0B in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 10# 0x000055C99CCE698A in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 11# 0x000055C99CCF0AAE in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 12# main in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 13# __libc_start_main in /lib64/libc.so.6
2021-06-15T09:25:07.514 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 14# _start in /usr/bin/ceph-osd
2021-06-15T09:25:07.514 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: Fault at location: 0x31dfff8000
2021-06-15T09:25:07.514 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:20 smithi100 podman[55356]: 2021-06-15 09:24:20.230341885 +0000 UTC m=+0.072958807 container died a3ea2a1d0a176286b93b8f5b94458982b9038e70d09128fb55f53b92976f0c42 (image=quay.ceph.io/ceph-ci/ceph@sha256:13ae953e3f83ee011d784d6eb9126fdc692f5bb688fe7d918be61ca7a7282b3c, name=ceph-43579b90-cdba-11eb-8c13-001a4aab830c-osd.3)
```

The fix deals with the issue by wrapping the `bptrs` in `bufferlists`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Nov 12, 2021
It's perfectly legal for a client to reconnect to particular `Watch`
using different socket / `Connection` than original one. This shall
include proper handling of the watch timer which is currently broken
as, when reconnecting, we don't cancel the timer. This leaded to the
following crash at Sepia:

```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-09-02_07:44:51-rados-master-distro-basic-smithi/6372357$ less ./remote/smithi183/log/ceph-osd.4.log.gz
...
DEBUG 2021-09-02 08:10:45,462 [shard 0] osd - client_request(id=12, detail=m=[osd_op(client.5087.0:93 7.1e 7:7c7084bd:::repobj:head {watch reconnect cookie 94478891024832 gen 1} snapc 0={} ondisk+write+know
n_if_redirected e40) v8]): got obc lock
...
DEBUG 2021-09-02 08:10:45,462 [shard 0] osd - do_op_watch
INFO  2021-09-02 08:10:45,462 [shard 0] osd - found existing watch by client.5087
DEBUG 2021-09-02 08:10:45,462 [shard 0] osd - do_op_watch_subop_watch
INFO  2021-09-02 08:10:45,462 [shard 0] osd - found existing watch watch(cookie 94478891024832 30s 172.21.15.150:0/3544196211) by client.5087
...
INFO  2021-09-02 08:10:45,462 [shard 0] osd - op_effect: found existing watcher: 94478891024832,client.5087
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-7406-g9d30203c/rpm/el8/BUILD/ceph-
17.0.0-7406-g9d30203c/src/seastar/include/seastar/core/timer.hh:95: void seastar::timer<Clock>::arm_state(seastar::timer<Clock>::time_point, std::optional<typename Clock::duration>) [with Clock = seastar::l
owres_clock; seastar::timer<Clock>::time_point = std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long int, std::ratio<1, 1000> > >; typename Clock::duration = std::chrono::duration<long
 int, std::ratio<1, 1000> >]: Assertion `!_armed' failed.
Aborting on shard 0.
Backtrace:
 0# 0x000055CC052CF0B6 in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const&) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007FA58349FB20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007FA581A98C89 in /lib64/libc.so.6
 7# 0x00007FA581AA6A76 in /lib64/libc.so.6
 8# 0x000055CC0BEEE9DD in ceph-osd
 9# crimson::osd::Watch::connect(seastar::shared_ptr<crimson::net::Connection>, bool) in ceph-osd
10# 0x000055CC00B1D246 in ceph-osd
11# 0x000055CBFFEF01AE in ceph-osd
...
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Nov 12, 2021
`seastar::engine()` is available only for Seastar's threads;
it shouldn't be called outside of a reactor thread.
Unfortunately, this assumption is violated in `AlienStore`
where `OnCommit::finish()`, executed from a finisher thread
of `BlueStore`, calls `alien()` on `seastar::engine()`.
The net effect are crashes like the following one:

```
INFO  2021-09-22 14:26:33,214 [shard 0] osd - operator() writing superblock cluster_fsid 1d8f7908-2ebf-4a91-ae70-f445668c126b osd_fsid 4da9fe9a-1da5-4ea9-aa79-a1178165ede5         [381/1839]
Segmentation fault.
Backtrace:
 0# print_backtrace(std::basic_string_view<char, std::char_traits<char> >) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:80
 1# FatalSignal::signaled(int, siginfo_t const&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/ostream:570
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:
62
 3# 0x00007F16BBA13B30 in /lib64/libpthread.so.0
 4# (anonymous namespace)::OnCommit::finish(int) at /home/rzarzynski/ceph1/build/../src/crimson/os/alienstore/alien_store.cc:53
 5# Context::complete(int) at /home/rzarzynski/ceph1/build/../src/include/Context.h:100
 6# Finisher::finisher_thread_entry() at /home/rzarzynski/ceph1/build/../src/common/Finisher.cc:65
 7# 0x00007F16BBA0915A in /lib64/libpthread.so.0
 8# clone in /lib64/libc.so.6
Dump of siginfo:
  ...
  si_addr: 0x10
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Nov 12, 2021
It's all about these items:

```
 0# print_backtrace(std::basic_string_view<char, std::char_traits<char> >) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:80
 1# FatalSignal::signaled(int, siginfo_t const&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/ostream:570
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:
62
 3# 0x00007F16BBA13B30 in /lib64/libpthread.so.0
```

They are part of our backtrace handling and typically developers
are not interested in them. Let's be consistent with the classical
OSD and hide them.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Nov 12, 2021
`PG::request_{local,remote}_recovery_reservation()` dynamically allocates
up to 2 instances of `LambdaContext<T>` and transfers their ownership to
the `AsyncReserver<T, F>`. This is expressed in raw pointers (`new` and
`delete`) notion. Further analysis shows the only place where `delete`
for these objects is called is the `AsyncReserver::cancel_reservation()`.
In contrast to the classical OSD, crimson doesn't invoke the method when
stopping a PG during the shutdown sequence. This would explain the
following ASan issue observed at Sepia:

```
Direct leak of 576 byte(s) in 24 object(s) allocated from:
    #0 0x7fa108fc57b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0)
    #1 0x55723d8b0b56 in non-virtual thunk to crimson::osd::PG::request_local_background_io_reservation(unsigned int, std::unique_ptr<PGPeeringEvent, std::default_delete<PGPeeringEvent> >, std::unique_ptr<PGPeeringEvent, std::default_delete<PGPeeringEvent> >) (/usr/bin/ceph-osd+0x24d95b56)
    #2 0x55723f1f66ef in PeeringState::WaitDeleteReserved::WaitDeleteReserved(boost::statechart::state<PeeringState::WaitDeleteReserved, PeeringState::ToDelete, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::my_context) (/usr/bin/ceph-osd+0x266db6ef)
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Nov 12, 2021
exit() will call pthread_cond_destroy attempting to destroy dpdk::eal::cond
upon which other threads are currently blocked results in undefine
behavior. Link different libc version test, libc-2.17 can exit,
libc-2.27 will deadlock, the call stack is as follows:

Thread 3 (Thread 0xffff7e5749f0 (LWP 62213)):
 #0  0x0000ffff7f3c422c in futex_wait_cancelable (private=<optimized out>, expected=0,
    futex_word=0xaaaadc0e30f4 <dpdk::eal::cond+44>) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 #1  __pthread_cond_wait_common (abstime=0x0, mutex=0xaaaadc0e30f8 <dpdk::eal::lock>, cond=0xaaaadc0e30c8 <dpdk::eal::cond>)
    at pthread_cond_wait.c:502
 #2  __pthread_cond_wait (cond=0xaaaadc0e30c8 <dpdk::eal::cond>, mutex=0xaaaadc0e30f8 <dpdk::eal::lock>)
    at pthread_cond_wait.c:655
 #3  0x0000ffff7f1f1f80 in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
   from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
 #4  0x0000aaaad37f5078 in dpdk::eal::<lambda()>::operator()(void) const (__closure=<optimized out>, __closure=<optimized out>)
    at ./src/msg/async/dpdk/dpdk_rte.cc:136
 #5  0x0000ffff7f1f7ed4 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
 #6  0x0000ffff7f3be088 in start_thread (arg=0xffffe73e197f) at pthread_create.c:463
 #7  0x0000ffff7efc74ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 1 (Thread 0xffff7ee3b010 (LWP 62200)):
 #0  0x0000ffff7f3c3c38 in futex_wait (private=<optimized out>, expected=12, futex_word=0xaaaadc0e30ec <dpdk::eal::cond+36>)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:61
 #1  futex_wait_simple (private=<optimized out>, expected=12, futex_word=0xaaaadc0e30ec <dpdk::eal::cond+36>)
    at ../sysdeps/nptl/futex-internal.h:135
 #2  __pthread_cond_destroy (cond=0xaaaadc0e30c8 <dpdk::eal::cond>) at pthread_cond_destroy.c:54
 #3  0x0000ffff7ef2be34 in __run_exit_handlers (status=-6, listp=0xffff7f04a5a0 <__exit_funcs>, run_list_atexit=255,
    run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
 #4  0x0000ffff7ef2bf6c in __GI_exit (status=<optimized out>) at exit.c:139
 #5  0x0000ffff7ef176e4 in __libc_start_main (main=0x0, argc=0, argv=0x0, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:344
 #6  0x0000aaaad2939db0 in _start () at ./src/include/buffer.h:642

Fixes: https://tracker.ceph.com/issues/42890
Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Signed-off-by: luo rixin <luorixin@huawei.com>
satoru-takeuchi pushed a commit that referenced this pull request Dec 27, 2021
The patch is supposed to fix the following problems (extra
debugs onboard):

```
NFO  2021-11-16 01:18:38,713 [shard 0] osd - ~OSD: OSD dtor called
INFO  2021-11-16 01:18:38,713 [shard 0] osd - Heartbeat::Peer: osd.6 removed
INFO  2021-11-16 01:18:38,714 [shard 0] osd - Heartbeat::Peer: osd.5 removed
INFO  2021-11-16 01:18:38,714 [shard 0] osd - Heartbeat::Peer: osd.2 removed
INFO  2021-11-16 01:18:38,714 [shard 0] osd - ~ShardServices: ShardServices dtor called
INFO  2021-11-16 01:18:38,714 [shard 0] osd - ~ObjectContextRegistry: ShardServices dtor called; unref_size=3, size=3
INFO  2021-11-16 01:18:38,714 [shard 0] osd - ~ObjectContextRegistry: unreferenced p=0x619000115380
INFO  2021-11-16 01:18:38,714 [shard 0] osd - ~ObjectContextRegistry: unreferenced p=0x619000114980
INFO  2021-11-16 01:18:38,714 [shard 0] osd - ~ObjectContextRegistry: unreferenced p=0x619000112680
INFO  2021-11-16 01:18:38,714 [shard 0] osd - ~ObjectContextRegistry: set p=0x619000114980
INFO  2021-11-16 01:18:38,714 [shard 0] osd - ~ObjectContextRegistry: set p=0x619000115380
INFO  2021-11-16 01:18:38,714 [shard 0] osd - ~ObjectContextRegistry: set p=0x619000112680
INFO  2021-11-16 01:18:38,738 [shard 0] osd - crimson shutdown complete

=================================================================
==33351==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 2808 byte(s) in 3 object(s) allocated from:
    #0 0x7fe10c0327b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0)
    #1 0x55accbe8ffc4 in ceph::common::intrusive_lru<ceph::common::intrusive_lru_config<hobject_t, crimson::osd::ObjectContext, crimson::osd::obc_to_hoid<crimson::osd::ObjectContext> > >::get_or_create(hobject_t const&) (/usr/bin/ceph-osd+0x3b000fc4)

Objects leaked above:
0x619000112680 (936 bytes)
0x619000114980 (936 bytes)
0x619000115380 (936 bytes)
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
satoru-takeuchi pushed a commit that referenced this pull request Dec 27, 2021
Initially, we were assuming that no pointer obtained from SharedLRU
can outlive the lru itself. However, since going with the interruption
concept for handling shutdowns, this is no longer valid.

The patch is supposed to deal with crashes like the following one:

```
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8898-ge57ad63c/rpm/el8/BUILD/ceph-17.0.
0-8898-ge57ad63c/src/crimson/common/shared_lru.h:46: SharedLRU<K, V>::~SharedLRU() [with K = unsigned int; V = OSDMap]: Assertion `weak_refs.empty()' failed.
Aborting on shard 0.
Backtrace:
Reactor stalled for 1162 ms on shard 0. Backtrace: 0xb14ab 0x46e57428 0x46bc450d 0x46be03bd 0x46be0782 0x46be0946 0x46be0bf6 0x12b1f 0xc8e3b 0x3fdd77e2 0x3fddccdb 0x3fdde1ee 0x3fdde8b3 0x3fdd3f2b 0x3fdd4442 0x3f
dd4c3a 0x12b1f 0x3737e 0x21db4 0x21c88 0x2fa75 0x3a5ae1b9 0x3a38c5e2 0x3a0c823d 0x3a1771f1 0x3a1796f5 0x46ff92c9 0x46ff9525 0x46ff9e93 0x46ff8eae 0x46ff8bd9 0x3a160e67 0x39f50c83 0x39f51cd0 0x46b96271 0x46bde51a
 0x46d6891b 0x46d6a8f0 0x4681a7d2 0x4681f03b 0x39fd50f2 0x23492 0x39b7a7dd
 0# gsignal in /lib64/libc.so.6
 1# abort in /lib64/libc.so.6
 2# 0x00007F9535E04C89 in /lib64/libc.so.6
 3# 0x00007F9535E12A76 in /lib64/libc.so.6
 4# crimson::osd::OSD::~OSD() in ceph-osd
 5# seastar::shared_ptr_count_for<crimson::osd::OSD>::~shared_ptr_count_for() in ceph-osd
 6# seastar::shared_ptr<crimson::osd::OSD>::~shared_ptr() in ceph-osd
 7# seastar::futurize<std::result_of<seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) co
nst::{lambda()#1} ()>::type>::type seastar::smp::submit_to<seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::opera
tor()(unsigned int) const::{lambda()#1}>(unsigned int, seastar::smp_submit_to_options, seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{la
mbda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&&) in ceph-osd
 8# std::_Function_handler<seastar::future<void> (unsigned int), seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}>
::_M_invoke(std::_Any_data const&, unsigned int&&) in ceph-osd
 9# 0x0000562DA18162CA in ceph-osd
10# 0x0000562DA1816526 in ceph-osd
11# 0x0000562DA1816E94 in ceph-osd
12# 0x0000562DA1815EAF in ceph-osd
13# 0x0000562DA1815BDA in ceph-osd
14# seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>::direct_vtable_for<seastar::future<void>::then_wrapped_maybe_erase<true, seastar::future<void>, seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}>(seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}&&)::{lambda(seastar::future<void>&&)#1}>::call(seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)> const*, seastar::future<void>&&) in ceph-osd
15# 0x0000562D9476DC84 in ceph-osd
16# 0x0000562D9476ECD1 in ceph-osd
17# 0x0000562DA13B3272 in ceph-osd
18# 0x0000562DA13FB51B in ceph-osd
19# 0x0000562DA158591C in ceph-osd
20# 0x0000562DA15878F1 in ceph-osd
21# 0x0000562DA10377D3 in ceph-osd
22# 0x0000562DA103C03C in ceph-osd
23# main in ceph-osd
24# __libc_start_main in /lib64/libc.so.6
25# _start in ceph-osd
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@github-actions github-actions bot removed the stale label Jan 31, 2024
peng225 pushed a commit that referenced this pull request Mar 28, 2024
before this change, we increment the refcount when constructing
`cct` instrusive_ptr, but nobody owns this smart pointer. also,
`CephContext` 's constructor set its refcount to 1. so, when the
test finishes, the refcount is 1, and this leads to a leakage of
the `CephContext` instance, this not only annoys ASan, and defeats
the purpose of 14d878c.
```
Indirect leak of 10880000 byte(s) in 1 object(s) allocated from:
    #0 0x5564d173537d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_ipaddr+0x19b37d) (BuildId: 45c0c7f28b253c04fcb7bb1a43aed52a5526d734)
    #1 0x7fe7f2ccd189 in __gnu_cxx::new_allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x7fe7f2ccc563 in std::allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    #3 0x7fe7f2ccc563 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::allocate(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2396:39
    #4 0x7fe7f2ccc2c0 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::initialize_buffer(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2494:18
    #5 0x7fe7f2cc6192 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::circular_buffer(unsigned long, std::allocator<ceph::logging::ConcreteEntry> const&) /opt/ceph/include/boost/circular_buffer/base.hpp:1039:9
    #6 0x7fe7f2cb91e4 in ceph::logging::Log::Log(ceph::logging::SubsystemMap const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/log/Log.cc:53:5
    #7 0x7fe7f1f8f96d in ceph::common::CephContext::CephContext(unsigned int, ceph::common::CephContext::create_options const&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:729:16
    #8 0x7fe7f1f8e93b in ceph::common::CephContext::CephContext(unsigned int, code_environment_t, int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:697:5
    #9 0x5564d1752eb9 in pick_address_find_ip_in_subnet_list_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/test_ipaddr.cc:706:47
    #10 0x5564d18694d6 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #11 0x5564d1820fc2 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #12 0x5564d17d19dc in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
    #13 0x5564d17d3a12 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
    #14 0x5564d17d504b in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #15 0x5564d17f24d8 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #16 0x5564d1871d06 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #17 0x5564d1827932 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #18 0x5564d17f1862 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #19 0x5564d1775d80 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #20 0x5564d1775d11 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googlemock/src/gmock_main.cc:70:10
```

so, in this change, we do not increase the refcount when
creating cct.

the same applies to `test/common/test_fault_injector.cc`.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
peng225 pushed a commit that referenced this pull request Mar 28, 2024
before this change, in test_util.cc, we increment the refcount of
when constructing it. but at that moment, nobody really owns it.
also, `CephContext` 's refcount is set to 1 in its constructor.
so, we should not do this. otherwise, the created `CephContext`
is leaked as LeakSanitizer rightly points out:
```
Indirect leak of 10880000 byte(s) in 1 object(s) allocated from:
    #0 0x5632320d27ed in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_util+0x1917ed) (BuildId: ff1df1455bd07b651ad580584a17ea204afeb36e)
    #1 0x7ff9d535b189 in __gnu_cxx::new_allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x7ff9d535a563 in std::allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    #3 0x7ff9d535a563 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::allocate(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2396:39
    #4 0x7ff9d535a2c0 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::initialize_buffer(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2494:18
    #5 0x7ff9d5354192 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::circular_buffer(unsigned long, std::allocator<ceph::logging::ConcreteEntry> const&) /opt/ceph/include/boost/circular_buffer/base.hpp:1039:9
    #6 0x7ff9d53471e4 in ceph::logging::Log::Log(ceph::logging::SubsystemMap const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/log/Log.cc:53:5
    #7 0x7ff9d461d96d in ceph::common::CephContext::CephContext(unsigned int, ceph::common::CephContext::create_options const&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:729:16
    #8 0x7ff9d461c93b in ceph::common::CephContext::CephContext(unsigned int, code_environment_t, int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:697:5
    #9 0x5632320d52e0 in util_collect_sys_info_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/common/test_util.cc:34:27
    #10 0x563232205c16 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #11 0x5632321c2742 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #12 0x5632321736dc in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
```
in this change, instead of using a raw pointer, let's
use `boost::intrusive_ptr<CephContext>` to manage the lifecyle
of `CephContext`, this also address the leakage reported by
LeakSanitizer.

the same applies to common/test_context.cc

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
peng225 pushed a commit that referenced this pull request Mar 28, 2024
before this change, we allocate memory chunks with specified
size using `new []`, but we never free them. when testing with
LeakSanitizer enabled, it rightly points identifies the leakage:

```
Direct leak of 8754 byte(s) in 184 object(s) allocated from:
    #0 0x55c0b2470f0d in operator new[](unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_memory+0x196f0d) (BuildId: d3267dd8819427b804c4729e0467dbe7601fb321)
    #1 0x55c0b247456c in MemoryIsZeroSmallTest_MemoryIsZeroTestSmall_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/common/test_memory.cc:33:18
    #2 0x55c0b2598ee6 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #3 0x55c0b2553b92 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #4 0x55c0b25049dc in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
    #5 0x55c0b2506a12 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
    #6 0x55c0b250804b in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #7 0x55c0b25254d8 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #8 0x55c0b25a16f6 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #9 0x55c0b255a502 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #10 0x55c0b2524862 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #11 0x55c0b24ab4c0 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #12 0x55c0b24ab451 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googlemock/src/gmock_main.cc:70:10
    #13 0x7f45e065ad8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```

in this change, we free the allocate memory.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
peng225 pushed a commit that referenced this pull request Mar 28, 2024
…sive_ptr<CephContext>

before this change, we increment the refcount when constructing
`cct` instrusive_ptr, but nobody owns this smart pointer. also,
`CephContext` 's constructor set its refcount to 1. so, when the
test finishes, the refcount is 1, and this leads to a leakage of
the `CephContext` instance. and LeakSanitizer points this out:

```
Indirect leak of 10880000 byte(s) in 1 object(s) allocated from:
    #0 0xaaaac359c7c8 in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/bin/unittest_rgw_iam_policy+0x211c7c8) (BuildId: 060fadb10da261b52fd5757c7b1e9812d34542f1)
    #1 0xffff96f764e4 in __gnu_cxx::new_allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0xffff96f757cc in std::allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    #3 0xffff96f757cc in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::allocate(unsigned long) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/boost/include/boost/circular_buffer/base.hpp:2396:39
    #4 0xffff96f75500 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::initialize_buffer(unsigned long) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/boost/include/boost/circular_buffer/base.hpp:2494:18
    #5 0xffff96f6ec4c in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::circular_buffer(unsigned long, std::allocator<ceph::logging::ConcreteEntry> const&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/boost/include/boost/circular_buffer/base.hpp:1039:9
    #6 0xffff96f63528 in ceph::logging::Log::Log(ceph::logging::SubsystemMap const*) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/log/Log.cc:53:5
    #7 0xffff96045300 in ceph::common::CephContext::CephContext(unsigned int, ceph::common::CephContext::create_options const&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/common/ceph_context.cc:729:16
    #8 0xffff960446ec in ceph::common::CephContext::CephContext(unsigned int, code_environment_t, int) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/common/ceph_context.cc:697:5
    #9 0xaaaac3629238 in IPPolicyTest::IPPolicyTest() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/test/rgw/test_rgw_iam_policy.cc:864:15
    #10 0xaaaac3628da0 in IPPolicyTest_MaskedIPOperations_Test::IPPolicyTest_MaskedIPOperations_Test() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/test/rgw/test_rgw_iam_policy.cc:869:1
    #11 0xaaaac3628d3c in testing::internal::TestFactoryImpl<IPPolicyTest_MaskedIPOperations_Test>::CreateTest() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/googletest/googletest/include/gtest/internal/gtest-internal.h:472:44
```

so, in this change, we do not increase the refcount when creating cct.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
peng225 pushed a commit that referenced this pull request Mar 28, 2024
before this change, we create a new cct instance with `new`, but
we never free this instance after done with it. and LeakSanitizer
points this out:

```
Indirect leak of 10880000 byte(s) in 1 object(s) allocated from:
    #0 0x561afe148fed in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_config_map+0x1c2fed) (BuildId: 3ce9eeed38cee335628fa74fdd08cd215b15019e)
    #1 0x7f37dc9ac189 in __gnu_cxx::new_allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x7f37dc9ab563 in std::allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    #3 0x7f37dc9ab563 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::allocate(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2396:39
    #4 0x7f37dc9ab2c0 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::initialize_buffer(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2494:18
    #5 0x7f37dc9a5192 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::circular_buffer(unsigned long, std::allocator<ceph::logging::ConcreteEntry> const&) /opt/ceph/include/boost/circular_buffer/base.hpp:1039:9
    #6 0x7f37dc9981e4 in ceph::logging::Log::Log(ceph::logging::SubsystemMap const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/log/Log.cc:53:5
    #7 0x7f37dbc6e96d in ceph::common::CephContext::CephContext(unsigned int, ceph::common::CephContext::create_options const&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:729:16
    #8 0x7f37dbc6d93b in ceph::common::CephContext::CephContext(unsigned int, code_environment_t, int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:697:5
    #9 0x561afe14e983 in ConfigMap_add_option_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/mon/test_config_map.cc:58:18
    #10 0x561afe2689b6 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #11 0x561afe221262 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #12 0x561afe1d1f7c in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
    #13 0x561afe1d3fb2 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
    #14 0x561afe1d55eb in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #15 0x561afe1f2a78 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #16 0x561afe2711e6 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #17 0x561afe227bd2 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #18 0x561afe1f1e02 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #19 0x561afe176ec0 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #20 0x561afe176e51 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googlemock/src/gmock_main.cc:70:10
    #21 0x7f37d9397d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```

so in this change, we manage the `CephContext` pointer with a smart
pointer. because the size of CephContext could be large, we don't create
it on stack.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
peng225 pushed a commit that referenced this pull request Mar 28, 2024
before this change, we create a new CrushWrapper instance with `new`, but
we never free this instance after done with it. and LeakSanitizer
points this out:

```
Direct leak of 544 byte(s) in 1 object(s) allocated from:
    #0 0x561afe148fed in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_config_map+0x1c2fed) (BuildId: 3ce9eeed38cee335628fa74fdd08cd215b15019e)
    #1 0x561afe151cbd in ConfigMap_result_sections_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/mon/test_config_map.cc:93:16
    #2 0x561afe2689b6 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #3 0x561afe221262 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #4 0x561afe1d1f7c in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
    #5 0x561afe1d3fb2 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
    #6 0x561afe1d55eb in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #7 0x561afe1f2a78 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #8 0x561afe2711e6 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #9 0x561afe227bd2 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #10 0x561afe1f1e02 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #11 0x561afe176ec0 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #12 0x561afe176e51 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googlemock/src/gmock_main.cc:70:10
    #13 0x7f37d9397d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```

so in this change, we manage the `CrushWrapper` pointer with a smart
pointer. because the size of `CrushWrapper` is relatively large, we
don't create it on stack.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
peng225 pushed a commit that referenced this pull request Mar 28, 2024
LeakSanitizer reports
```
==688591==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 45 byte(s) in 1 object(s) allocated from:
    #0 0x55f8dd9969dd in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_fastbmap_allocator+0x1f89dd) (BuildId: cac39eac8ef1e8774f9dd48e6e3f677fdd864776)
    #1 0x55f8dd99c730 in __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x55f8dd99c690 in std::allocator<char>::allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    #3 0x55f8dd99c690 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
    #4 0x55f8dd99c393 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:153:14
    #5 0x55f8dda96a6c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:307:21
    #6 0x55f8dda96852 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:395:8
    #7 0x7f4a751ab6f0 in
    MallocExtension::Initialize() (/lib/x86_64-linux-gnu/libtcmalloc.so.4+0x2a6f0) (BuildId:
    eeef3d1257388a806e122398dbce3157ee568ef4)
```

this is a global object allocated by the allocator, so we can suppress this report.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Mar 31, 2024
toshipp pushed a commit that referenced this pull request Apr 18, 2024
before this change, we increment the refcount when constructing
`cct` instrusive_ptr, but nobody owns this smart pointer. also,
`CephContext` 's constructor set its refcount to 1. so, when the
test finishes, the refcount is 1, and this leads to a leakage of
the `CephContext` instance. and LeakSanitizer points this out:
```
Indirect leak of 10880000 byte(s) in 1 object(s) allocated from:
    #0 0x558d341d837d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_ipaddr+0x19b37d) (BuildId: 1b7e7e5abfc2b58ce2334712e4c00b2441c25870)
    #1 0x7fd74c957559 in __gnu_cxx::new_allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x7fd74c956933 in std::allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    #3 0x7fd74c956933 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::allocate(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2396:39
    #4 0x7fd74c956690 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::initialize_buffer(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2494:18
    #5 0x7fd74c950562 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::circular_buffer(unsigned long, std::allocator<ceph::logging::ConcreteEntry> const&) /opt/ceph/include/boost/circ
ular_buffer/base.hpp:1039:9
    #6 0x7fd74c9435b4 in ceph::logging::Log::Log(ceph::logging::SubsystemMap const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/log/Log.cc:53:5
    #7 0x7fd74bc1891d in ceph::common::CephContext::CephContext(unsigned int, ceph::common::CephContext::create_options const&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:729:16
    #8 0x7fd74bc178eb in ceph::common::CephContext::CephContext(unsigned int, code_environment_t, int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:697:5
    #9 0x558d341f97e9 in pick_address_filtering_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/test_ipaddr.cc:774:47
    #10 0x558d3430c4f6 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #11 0x558d342c3fc2 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #12 0x558d342749dc in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
    #13 0x558d34276a12 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
    #14 0x558d3427804b in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #15 0x558d342954d8 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #16 0x558d34314d26 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #17 0x558d342ca932 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #18 0x558d34294862 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #19 0x558d34218d80 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #20 0x558d34218d11 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googlemock/src/gmock_main.cc:70:10
    #21 0x7fd749331d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```

so, in this change, we do not increase the refcount when creating cct.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
toshipp pushed a commit that referenced this pull request Apr 18, 2024
in BlueFS.test_shared_alloc, we keep the return value of
`fs.get_perf_counters()`, and deference it after umounting the fs,
but the `PerfCounters*` pointer returned from `fs.get_perf_counters()`
is destroyed in `BlueFS::_shutdown_logger()` which is in turn called
by `BlueFS::umount()`. so ASan points this out:
```
==1662613==ERROR: AddressSanitizer: heap-use-after-free on address 0x6110000b2d80 at pc 0x7f0eefc30644 bp 0x7ffcdbab6430 sp 0x7ffcdbab6428
READ of size 8 at 0x6110000b2d80 thread T0
    #0 0x7f0eefc30643 in ceph::common::PerfCounters::get(int) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/perf_counters.cc:246:8
    #1 0x557595ddfc15 in BlueFS_test_shared_alloc_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluefs.cc:1182:3
    #2 0x557595eeef66 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #3 0x557595ea8b22 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #4 0x557595e5974c in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
    #5 0x557595e5b782 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
    #6 0x557595e5cdbb in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #7 0x557595e7a248 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #8 0x557595ef7816 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #9 0x557595eaf5c2 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #10 0x557595e795d2 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #11 0x557595e05370 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #12 0x557595dfc1f5 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluefs.cc:1603:10
    #13 0x7f0eed083d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #14 0x7f0eed083e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #15 0x557595cd46a4 in _start (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_bluefs+0x2856a4) (BuildId: 5439261504ca3d7549fe9bcda1d17ef6d4d9b644)

0x6110000b2d80 is located 0 bytes inside of 208-byte region [0x6110000b2d80,0x6110000b2e50)
freed by thread T0 here:
    #0 0x557595d92b1d in operator delete(void*) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_bluefs+0x343b1d) (BuildId: 5439261504ca3d7549fe9bcda1d17ef6d4d9b644)
    #1 0x557595f31c43 in BlueFS::_shutdown_logger() /home/jenkins-build/build/workspace/ceph-pull-requests/src/os/bluestore/BlueFS.cc:462:3
    #2 0x557595f54ab5 in BlueFS::umount(bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/os/bluestore/BlueFS.cc:1076:3
    #3 0x557595ddfbd7 in BlueFS_test_shared_alloc_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluefs.cc:1180:6
    #4 0x557595eeef66 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #5 0x557595ea8b22 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #6 0x557595e5974c in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
    #7 0x557595e5b782 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
    #8 0x557595e5cdbb in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #9 0x557595e7a248 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #10 0x557595ef7816 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #11 0x557595eaf5c2 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #12 0x557595e795d2 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #13 0x557595e05370 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #14 0x557595dfc1f5 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluefs.cc:1603:10
    #15 0x7f0eed083d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

previously allocated by thread T0 here:
    #0 0x557595d922bd in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_bluefs+0x3432bd) (BuildId: 5439261504ca3d7549fe9bcda1d17ef6d4d9b644)
    #1 0x7f0eefc33180 in ceph::common::PerfCountersBuilder::PerfCountersBuilder(ceph::common::CephContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/perf_counters.cc:537:21
    #2 0x557595f30ac9 in BlueFS::_init_logger() /home/jenkins-build/build/workspace/ceph-pull-requests/src/os/bluestore/BlueFS.cc:221:23
    #3 0x557595f42bc6 in BlueFS::mount() /home/jenkins-build/build/workspace/ceph-pull-requests/src/os/bluestore/BlueFS.cc:977:3
    #4 0x557595ddd339 in BlueFS_test_shared_alloc_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluefs.cc:1139:3
    #5 0x557595eeef66 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #6 0x557595ea8b22 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #7 0x557595e5974c in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
    #8 0x557595e5b782 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
    #9 0x557595e5cdbb in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #10 0x557595e7a248 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #11 0x557595ef7816 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #12 0x557595eaf5c2 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #13 0x557595e795d2 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #14 0x557595e05370 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #15 0x557595dfc1f5 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluefs.cc:1603:10
    #16 0x7f0eed083d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```

in this change, instead of keeping `logger` across the `umount()` and
`mount()` calls, we get another instance of `logger`, query it for
the perf counter that we are interested, and compare the value
to see if it is unchanged.

this should address the ASan warning above.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
toshipp pushed a commit that referenced this pull request Apr 18, 2024
before this change, we allocate an instance of `RocksDBStore` with
`new`, but we never free it. and LeanSanitizer points this out:

```
Direct leak of 952 byte(s) in 1 object(s) allocated from:
    #0 0x55f31440bc2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_rocksdb_option+0xaebc2d) (BuildId: 81b849dbc41cbc6b05d5e603d9ba8a002dab2d24)
    #1 0x55f3144132fd in RocksDBOption_simple_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/TestRocksdbOptionParse.cc:17:22
    #2 0x55f3144ecf26 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #3 0x55f3144a4312 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #4 0x55f314453ccc in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
    #5 0x55f314455d02 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
    #6 0x55f31445733b in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #7 0x55f3144747c8 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #8 0x55f3144f5576 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #9 0x55f3144ab1a2 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #10 0x55f314473b52 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #11 0x55f31440f690 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #12 0x55f31440e4c3 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/unit.cc:45:10
    #13 0x7f0d32551d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```

in this change, we manage the life cycle of `RocksDBStore` using
a smart pointer. this should address the leak.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
toshipp pushed a commit that referenced this pull request Apr 18, 2024
before this change, we allocate memory chunks using malloc(), but
we never free them. and LeakSanitizer points this out

```
Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x5588bfe532de in malloc (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_on_exit+0xa52de) (BuildId: 7c7a92bf5719592938c5307214bcd9b080bd847f)
    #1 0x5588bfe911d7 in func_scope() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/on_exit.cc:33:22
    #2 0x5588bfe90804 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/on_exit.cc:64:3
    #3 0x7f23081c1d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x5588bfe532de in malloc (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_on_exit+0xa52de) (BuildId: 7c7a92bf5719592938c5307214bcd9b080bd847f)
    #1 0x5588bfe91160 in func_scope() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/on_exit.cc:29:22
    #2 0x5588bfe90804 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/on_exit.cc:64:3
    #3 0x7f23081c1d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```

in this change, instead of allocating the variables using `malloc()`,
we keep them in static variables, so that they can be accessed by
`OnExitManager` even if it is a static variable.
with this change, the memory leak reports for this source file go away.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
toshipp pushed a commit that referenced this pull request Apr 18, 2024
we allocate a hitset without freeing it in this test, and LeakSanitizer
points this out
```
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x557a0841a9dd in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_hitset+0x1ae9dd) (BuildId: ad9be2b52b3d6fb1a567b262c3becaab6373e88d)
    #1 0x557a0843b98e in ExplicitHashHitSetTest::ExplicitHashHitSetTest() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/hitset.cc:128:46
    #2 0x557a0843b918 in ExplicitHashHitSetTest_Construct_Test::ExplicitHashHitSetTest_Construct_Test() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/hitset.cc:133:1
    #3 0x557a0843b8cb in testing::internal::TestFactoryImpl<ExplicitHashHitSetTest_Construct_Test>::CreateTest() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/internal/gtest-internal.h:472:44
    #4 0x557a08532406 in testing::Test* testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*>(testing::internal::TestFactoryBase*, testing::Test* (testing::internal::TestFactoryBase::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #5 0x557a084eb892 in testing::Test* testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*>(testing::internal::TestFactoryBase*, testing::Test* (testing::internal::TestFactoryBase::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #6 0x557a0849dd55 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2848:22
    #7 0x557a0849f3bb in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
    #8 0x557a084bc848 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
    #9 0x557a0853a6d6 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
    #10 0x557a084f1222 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
    #11 0x557a084bbbd2 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
    #12 0x557a08441c10 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #13 0x557a08441ba1 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googlemock/src/gmock_main.cc:70:10
    #14 0x7faa493d6d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```

in this change, we just free it in the dtor. this should address
the warning from the sanitizer.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
toshipp pushed a commit that referenced this pull request Apr 18, 2024
When sanitizer is enabled, unittest_mds_quiesce_agent fails as following

```
[==========] Running 5 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 5 tests from QuiesceAgentTest
[ RUN      ] QuiesceAgentTest.ThreadManagement
[       OK ] QuiesceAgentTest.ThreadManagement (3 ms)
[ RUN      ] QuiesceAgentTest.DbUpdates
[       OK ] QuiesceAgentTest.DbUpdates (1 ms)
[ RUN      ] QuiesceAgentTest.QuiesceProtocol
[       OK ] QuiesceAgentTest.QuiesceProtocol (3 ms)
[ RUN      ] QuiesceAgentTest.DuplicateQuiesceRequest
[       OK ] QuiesceAgentTest.DuplicateQuiesceRequest (2 ms)
[ RUN      ] QuiesceAgentTest.TimeoutBeforeComplete
[       OK ] QuiesceAgentTest.TimeoutBeforeComplete (2 ms)
[----------] 5 tests from QuiesceAgentTest (11 ms total)

[----------] Global test environment tear-down
[==========] 5 tests from 1 test suite ran. (11 ms total)
[  PASSED  ] 5 tests.

=================================================================
==3975692==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0xaaaadd81c7c8 in operator new(unsigned long) (/root/ceph/build/bin/unittest_mds_quiesce_agent+0x1fc7c8) (BuildId: 7d45344ba1e43661d9de484f0a5d129377c4d4ae)
    #1 0xaaaadd8878d8 in QuiesceAgent::agent_thread_main() /root/ceph/src/mds/QuiesceAgent.cc:136:68
    #2 0xaaaadd86de38 in QuiesceAgent::AgentThread::entry() /root/ceph/src/mds/QuiesceAgent.h:244:24
    #3 0xffff83d6b554 in Thread::entry_wrapper() /root/ceph/src/common/Thread.cc:87:10
    #4 0xffff83d6b314 in Thread::_entry_func(void*) /root/ceph/src/common/Thread.cc:74:29
    #5 0xffff8154d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    #6 0xffff815b5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79

Indirect leak of 120 byte(s) in 1 object(s) allocated from:
    #0 0xaaaadd81c7c8 in operator new(unsigned long) (/root/ceph/build/bin/unittest_mds_quiesce_agent+0x1fc7c8) (BuildId: 7d45344ba1e43661d9de484f0a5d129377c4d4ae)
    #1 0xaaaadd8af4f4 in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<QuiesceAgent::TrackedRoot, std::allocator<QuiesceAgent::TrackedRoot>, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0xaaaadd8af3d8 in std::allocator<std::_Sp_counted_ptr_inplace<QuiesceAgent::TrackedRoot, std::allocator<QuiesceAgent::TrackedRoot>, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    #3 0xaaaadd8af3d8 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<QuiesceAgent::TrackedRoot, std::allocator<QuiesceAgent::TrackedRoot>, (__gnu_cxx::_Lock_policy)2> > >::allocate(std::allocator<std::_Sp_counted_ptr_inplace<QuiesceAgent::TrackedRoot, std::allocator<QuiesceAgent::TrackedRoot>, (__gnu_cxx::_Lock_policy)2> >&, unsigned long) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
    #4 0xaaaadd8aef00 in std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<QuiesceAgent::TrackedRoot, std::allocator<QuiesceAgent::TrackedRoot>, (__gnu_cxx::_Lock_policy)2> > > std::__allocate_guarded<std::allocator<std::_Sp_counted_ptr_inplace<QuiesceAgent::TrackedRoot, std::allocator<QuiesceAgent::TrackedRoot>, (__gnu_cxx::_Lock_policy)2> > >(std::allocator<std::_Sp_counted_ptr_inplace<QuiesceAgent::TrackedRoot, std::allocator<QuiesceAgent::TrackedRoot>, (__gnu_cxx::_Lock_policy)2> >&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/allocated_ptr.h:98:21
    #5 0xaaaadd8aec14 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<QuiesceAgent::TrackedRoot, std::allocator<QuiesceAgent::TrackedRoot>, QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&>(QuiesceAgent::TrackedRoot*&, std::_Sp_alloc_shared_tag<std::allocator<QuiesceAgent::TrackedRoot> >, QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:648:19
    #6 0xaaaadd8ae988 in std::__shared_ptr<QuiesceAgent::TrackedRoot, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<QuiesceAgent::TrackedRoot>, QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&>(std::_Sp_alloc_shared_tag<std::allocator<QuiesceAgent::TrackedRoot> >, QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
    #7 0xaaaadd8ae70c in std::shared_ptr<QuiesceAgent::TrackedRoot>::shared_ptr<std::allocator<QuiesceAgent::TrackedRoot>, QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&>(std::_Sp_alloc_shared_tag<std::allocator<QuiesceAgent::TrackedRoot> >, QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
    #8 0xaaaadd8ae484 in std::shared_ptr<QuiesceAgent::TrackedRoot> std::allocate_shared<QuiesceAgent::TrackedRoot, std::allocator<QuiesceAgent::TrackedRoot>, QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&>(std::allocator<QuiesceAgent::TrackedRoot> const&, QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
    #9 0xaaaadd88ff0c in std::shared_ptr<QuiesceAgent::TrackedRoot> std::make_shared<QuiesceAgent::TrackedRoot, QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&>(QuiesceState&, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
    #10 0xaaaadd884a6c in QuiesceAgent::db_update(QuiesceMap&) /root/ceph/src/mds/QuiesceAgent.cc:60:26
    #11 0xaaaadd84a840 in QuiesceAgentTest::update(QuiesceDbVersion, std::initializer_list<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, QuiesceMap::RootInfo> >) /root/ceph/src/test/mds/TestQuiesceAgent.cc:156:18
    #12 0xaaaadd84985c in QuiesceAgentTest::update(unsigned long, std::initializer_list<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, QuiesceMap::RootInfo> >) /root/ceph/src/test/mds/TestQuiesceAgent.cc:165:14
    #13 0xaaaadd8288a8 in QuiesceAgentTest_DbUpdates_Test::TestBody() /root/ceph/src/test/mds/TestQuiesceAgent.cc:213:16
    #14 0xaaaadd977230 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #15 0xaaaadd924590 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #16 0xaaaadd8d4a40 in testing::Test::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2680:5
    #17 0xaaaadd8d6984 in testing::TestInfo::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2858:11
    #18 0xaaaadd8d7f84 in testing::TestSuite::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:3012:28
    #19 0xaaaadd8f3d48 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph/src/googletest/googletest/src/gtest.cc:5723:44
    #20 0xaaaadd981130 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #21 0xaaaadd92bb64 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #22 0xaaaadd8f31c0 in testing::UnitTest::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:5306:10
    #23 0xaaaadd820710 in RUN_ALL_TESTS() /root/ceph/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #24 0xaaaadd81ed3c in main /root/ceph/src/test/unit.cc:45:10
    #25 0xffff814f73f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #26 0xffff814f74c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    #27 0xaaaadd76e6ac in _start (/root/ceph/build/bin/unittest_mds_quiesce_agent+0x14e6ac) (BuildId: 7d45344ba1e43661d9de484f0a5d129377c4d4ae)

SUMMARY: AddressSanitizer: 184 byte(s) leaked in 2 allocation(s).
```

quiesce_requests Context should be freed.

Signed-off-by: Rongqi Sun <sunrongqi@huawei.com>
toshipp pushed a commit that referenced this pull request Apr 18, 2024
When sanitizer is enabled, unittest__rgw_crypto shows

```
=================================================================
==136464==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 75023 byte(s) in 22 object(s) allocated from:
    #0 0xaaaabf7fb86c in operator new[](unsigned long) (/root/ceph/build/bin/unittest_rgw_crypto+0x48b86c) (BuildId: 8023dc30820215da92d6d4883620bedd8ac1190d)
    #1 0xaaaabf81db48 in TestRGWCrypto_verify_Encrypt_Decrypt_Test::TestBody() /root/ceph/src/test/rgw/test_rgw_crypto.cc:780:24
    #2 0xaaaabf9018ac in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #3 0xaaaabf8b08a4 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #4 0xaaaabf861f88 in testing::Test::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2680:5
    #5 0xaaaabf863ecc in testing::TestInfo::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2858:11
    #6 0xaaaabf8654cc in testing::TestSuite::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:3012:28
    #7 0xaaaabf881290 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph/src/googletest/googletest/src/gtest.cc:5723:44
    #8 0xaaaabf90b7ac in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #9 0xaaaabf8b7ac0 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #10 0xaaaabf880708 in testing::UnitTest::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:5306:10
    #11 0xaaaabf823d70 in RUN_ALL_TESTS() /root/ceph/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #12 0xaaaabf81f390 in main /root/ceph/src/test/rgw/test_rgw_crypto.cc:822:10
    #13 0xffff878673f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #14 0xffff878674c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    #15 0xaaaabf74d62c in _start (/root/ceph/build/bin/unittest_rgw_crypto+0x3dd62c) (BuildId: 8023dc30820215da92d6d4883620bedd8ac1190d)

SUMMARY: AddressSanitizer: 75023 byte(s) leaked in 22 allocation(s).
```

test_in should be freed to address the warning.

Signed-off-by: Rongqi Sun <sunrongqi@huawei.com>
toshipp pushed a commit that referenced this pull request Apr 18, 2024
When sanitizer is enabled, unittest_bluestore_types fails as following
```
[ RUN      ] sb_info_space_efficient_map_t.basic
=================================================================
==143714==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xffff99f8b7f4 at pc 0xaaaab50bde18 bp 0xffffebefcdb0 sp 0xffffebefcda8
READ of size 8 at 0xffff99f8b7f4 thread T0
    #0 0xaaaab50bde14 in sb_info_t::get_sbid() const /root/ceph/src/os/bluestore/bluestore_types.h:1337:30
    #1 0xaaaab50a5908 in sb_info_space_efficient_map_t::find(unsigned long) /root/ceph/src/os/bluestore/bluestore_types.h:1385:10
    #2 0xaaaab50bd638 in sb_info_space_efficient_map_t::_add(long) /root/ceph/src/os/bluestore/bluestore_types.h:1424:15
    #3 0xaaaab50a52bc in sb_info_space_efficient_map_t::add_maybe_stray(unsigned long) /root/ceph/src/os/bluestore/bluestore_types.h:1358:12
    #4 0xaaaab4fec03c in sb_info_space_efficient_map_t_basic_Test::TestBody() /root/ceph/src/test/objectstore/test_bluestore_types.cc:113:11
    #5 0xaaaab51e9a40 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #6 0xaaaab5197040 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #7 0xaaaab51488a4 in testing::Test::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2680:5
    #8 0xaaaab514a7e8 in testing::TestInfo::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2858:11
    #9 0xaaaab514bde8 in testing::TestSuite::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:3012:28
    #10 0xaaaab5167bac in testing::internal::UnitTestImpl::RunAllTests() /root/ceph/src/googletest/googletest/src/gtest.cc:5723:44
    #11 0xaaaab51f3940 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #12 0xaaaab519e5d8 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #13 0xaaaab5167024 in testing::UnitTest::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:5306:10
    #14 0xaaaab50b4d6c in RUN_ALL_TESTS() /root/ceph/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #15 0xaaaab50a1080 in main /root/ceph/src/test/objectstore/test_bluestore_types.cc:2847:10
    #16 0xffff9d6c73f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #17 0xffff9d6c74c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    #18 0xaaaab4f3812c in _start (/root/ceph/build/bin/unittest_bluestore_types+0xe4812c) (BuildId: cb75399658026f83a4e89012de8fb02f08f6d239)

0xffff99f8b7f4 is located 0 bytes to the right of 20-byte region [0xffff99f8b7e0,0xffff99f8b7f4)
allocated by thread T0 here:
    #0 0xaaaab4fe636c in operator new[](unsigned long) (/root/ceph/build/bin/unittest_bluestore_types+0xef636c) (BuildId: cb75399658026f83a4e89012de8fb02f08f6d239)
    #1 0xaaaab50c0d2c in mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t>::allocate(unsigned long, void*) /root/ceph/src/include/mempool.h:375:33
    #2 0xaaaab50c0c0c in std::allocator_traits<mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> >::allocate(mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t>&, unsigned long) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:318:20
    #3 0xaaaab50c044c in std::_Vector_base<sb_info_t, mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> >::_M_allocate(unsigned long) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
    #4 0xaaaab50bf954 in void std::vector<sb_info_t, mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> >::_M_realloc_insert<long&>(__gnu_cxx::__normal_iterator<sb_info_t*, std::vector<sb_info_t, mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> > >, long&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:440:33
    #5 0xaaaab50be0d8 in sb_info_t& std::vector<sb_info_t, mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> >::emplace_back<long&>(long&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:121:4
    #6 0xaaaab50bd760 in sb_info_space_efficient_map_t::_add(long) /root/ceph/src/os/bluestore/bluestore_types.h:1429:24
    #7 0xaaaab50a5e78 in sb_info_space_efficient_map_t::add_or_adopt(unsigned long) /root/ceph/src/os/bluestore/bluestore_types.h:1361:15
    #8 0xaaaab4feb07c in sb_info_space_efficient_map_t_basic_Test::TestBody() /root/ceph/src/test/objectstore/test_bluestore_types.cc:103:11
    #9 0xaaaab51e9a40 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #10 0xaaaab5197040 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #11 0xaaaab51488a4 in testing::Test::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2680:5
    #12 0xaaaab514a7e8 in testing::TestInfo::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2858:11
    #13 0xaaaab514bde8 in testing::TestSuite::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:3012:28
    #14 0xaaaab5167bac in testing::internal::UnitTestImpl::RunAllTests() /root/ceph/src/googletest/googletest/src/gtest.cc:5723:44
    #15 0xaaaab51f3940 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #16 0xaaaab519e5d8 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #17 0xaaaab5167024 in testing::UnitTest::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:5306:10
    #18 0xaaaab50b4d6c in RUN_ALL_TESTS() /root/ceph/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #19 0xaaaab50a1080 in main /root/ceph/src/test/objectstore/test_bluestore_types.cc:2847:10
    #20 0xffff9d6c73f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #21 0xffff9d6c74c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    #22 0xaaaab4f3812c in _start (/root/ceph/build/bin/unittest_bluestore_types+0xe4812c) (BuildId: cb75399658026f83a4e89012de8fb02f08f6d239)

SUMMARY: AddressSanitizer: heap-buffer-overflow /root/ceph/src/os/bluestore/bluestore_types.h:1337:30 in sb_info_t::get_sbid() const
Shadow bytes around the buggy address:
  0x200ff33f16a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff33f16b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff33f16c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff33f16d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff33f16e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x200ff33f16f0: fa fa fa fa fa fa fa fa fa fa fa fa 00 00[04]fa
  0x200ff33f1700: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff33f1710: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff33f1720: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff33f1730: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff33f1740: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==143714==ABORTING
```

'it' might be invalid, so before using 'it', need to figure validity out

Signed-off-by: Rongqi Sun <sunrongqi@huawei.com>
toshipp pushed a commit that referenced this pull request Apr 18, 2024
When sanitizer is enabled, unittest_osdscrub shows

```
=================================================================
==1633952==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 28 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab4e108e0 in malloc (/root/ceph/build/bin/unittest_osdscrub+0x1ed08e0) (BuildId: b3cfa2137be96d75535beecf0f2500cec10c7550)
    #1 0xffffa8cac2f8 in __res_context_send resolv/./resolv/res_send.c:334:9
    #2 0xffffa8ca9c54 in __res_context_query resolv/./resolv/res_query.c:216:6
    #3 0xffffa8caa4a8 in __res_context_querydomain resolv/./resolv/res_query.c:625:9
    #4 0xffffa8caa4a8 in __res_context_search resolv/./resolv/res_query.c:381:9
    #5 0xffffa8caaa20 in context_search_common resolv/./resolv/res_query.c:550:16
    #6 0xffffa8caaa20 in res_nsearch resolv/./resolv/res_query.c:563:10
    #7 0xffffabbf1f64 in ceph::ResolvHWrapper::res_nsearch(__res_state*, char const*, int, int, unsigned char*, int) /root/ceph/src/common/dns_resolve.cc:37:10
    #8 0xffffabbf6574 in ceph::DNSResolver::resolve_srv_hosts(ceph::common::CephContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::DNSResolver::SRV_Protocol, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::DNSResolver::Record, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::DNSResolver::Record> > >*) /root/ceph/src/common/dns_resolve.cc:295:19
    #9 0xffffac8edaf0 in MonMap::init_with_dns_srv(ceph::common::CephContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, std::ostream&) /root/ceph/src/mon/MonMap.cc:935:36
    #10 0xffffac8eeec8 in MonMap::build_initial(ceph::common::CephContext*, bool, std::ostream&) /root/ceph/src/mon/MonMap.cc:1014:20
    #11 0xffffac85beb0 in MonClient::build_initial_monmap() /root/ceph/src/mon/MonClient.cc:93:18
    #12 0xaaaab4e50d98 in TestOSDScrub_scrub_time_permit_Test::TestBody() /root/ceph/src/test/osd/TestOSDScrub.cc:73:6
    #13 0xaaaab4f655b0 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #14 0xaaaab4f16264 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #15 0xaaaab4ec6ca8 in testing::Test::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2680:5
    #16 0xaaaab4ec8bec in testing::TestInfo::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2858:11
    #17 0xaaaab4eca1ec in testing::TestSuite::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:3012:28
    #18 0xaaaab4ee5fb0 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph/src/googletest/googletest/src/gtest.cc:5723:44
    #19 0xaaaab4f6f4c4 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
    #20 0xaaaab4f1d4bc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
    #21 0xaaaab4ee5428 in testing::UnitTest::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:5306:10
    #22 0xaaaab4e4b790 in RUN_ALL_TESTS() /root/ceph/src/googletest/googletest/include/gtest/gtest.h:2486:46
    #23 0xaaaab4e49dbc in main /root/ceph/src/test/unit.cc:45:10
    #24 0xffffa8bc73f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #25 0xffffa8bc74c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    #26 0xaaaab4d9972c in _start (/root/ceph/build/bin/unittest_osdscrub+0x1e5972c) (BuildId: b3cfa2137be96d75535beecf0f2500cec10c7550)

-----------------------------------------------------
Suppressions used:
  count      bytes template
      1         45 ^MallocExtension::Initialize
-----------------------------------------------------

SUMMARY: AddressSanitizer: 28 byte(s) leaked in 1 allocation(s).
```

1. 'res_ninit/res_nquery' memory should be freed.

Signed-off-by: Rongqi Sun <sunrongqi@huawei.com>
Copy link

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Apr 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants