Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors when killing nvmf_tgt #337

Closed
pawelkax opened this issue Jun 20, 2018 · 25 comments
Closed

Errors when killing nvmf_tgt #337

pawelkax opened this issue Jun 20, 2018 · 25 comments
Assignees
Labels
bug High High Priority Bug

Comments

@pawelkax
Copy link
Contributor

pawelkax commented Jun 20, 2018

Expected Behavior

Nvmf_tgt is killed without error

Current Behavior

I got following output:
^C=================================================================
==144090==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600000e620 at pc 0x00000058f40c bp 0x7ffc42648710 sp 0x7ffc42648700
WRITE of size 8 at 0x60600000e620 thread T0 (reactor_0)
#0 0x58f40b in spdk_nvmf_rdma_qpair_destroy /home/pawelkax/newest_spdk/spdk/lib/nvmf/rdma.c:356
#1 0x5a9117 in spdk_nvmf_rdma_close_qpair /home/pawelkax/newest_spdk/spdk/lib/nvmf/rdma.c:1924
#2 0x58e5d3 in spdk_nvmf_transport_qpair_fini /home/pawelkax/newest_spdk/spdk/lib/nvmf/transport.c:172
#3 0x5aee98 in spdk_nvmf_ctrlr_destruct /home/pawelkax/newest_spdk/spdk/lib/nvmf/ctrlr.c:224
#4 0x579515 in spdk_nvmf_subsystem_destroy /home/pawelkax/newest_spdk/spdk/lib/nvmf/subsystem.c:334
#5 0x58350b in spdk_nvmf_tgt_destroy_cb /home/pawelkax/newest_spdk/spdk/lib/nvmf/nvmf.c:224
#6 0x5e45fd in _finish_unregister /home/pawelkax/newest_spdk/spdk/lib/thread/thread.c:358
#7 0x5d0edc in _spdk_reactor_msg_passed /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:215
#8 0x5d0d53 in _spdk_event_queue_run_batch /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:202
#9 0x5d39f5 in _spdk_reactor_run /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:462
#10 0x5d5dcd in spdk_reactors_start /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:633
#11 0x5cdf9f in spdk_app_start /home/pawelkax/newest_spdk/spdk/lib/event/app.c:570
#12 0x4070cd in main /home/pawelkax/newest_spdk/spdk/app/nvmf_tgt/nvmf_main.c:79
#13 0x7ff9729d782f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#14 0x406d48 in _start (/home/pawelkax/newest_spdk/spdk/app/nvmf_tgt/nvmf_tgt+0x406d48)

0x60600000e620 is located 32 bytes inside of 56-byte region [0x60600000e600,0x60600000e638)
freed by thread T0 (reactor_0) here:
#0 0x7ff974fcf2ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca)
#1 0x5a7a7d in spdk_nvmf_rdma_poll_group_destroy /home/pawelkax/newest_spdk/spdk/lib/nvmf/rdma.c:1798
#2 0x58dbb1 in spdk_nvmf_transport_poll_group_destroy /home/pawelkax/newest_spdk/spdk/lib/nvmf/transport.c:131
#3 0x581ced in spdk_nvmf_tgt_destroy_poll_group /home/pawelkax/newest_spdk/spdk/lib/nvmf/nvmf.c:138
#4 0x5e632f in _spdk_put_io_channel /home/pawelkax/newest_spdk/spdk/lib/thread/thread.c:494
#5 0x5d0edc in _spdk_reactor_msg_passed /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:215
#6 0x5d0d53 in _spdk_event_queue_run_batch /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:202
#7 0x5d39f5 in _spdk_reactor_run /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:462
#8 0x5d5dcd in spdk_reactors_start /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:633
#9 0x5cdf9f in spdk_app_start /home/pawelkax/newest_spdk/spdk/lib/event/app.c:570
#10 0x4070cd in main /home/pawelkax/newest_spdk/spdk/app/nvmf_tgt/nvmf_main.c:79
#11 0x7ff9729d782f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

previously allocated by thread T0 (reactor_0) here:
#0 0x7ff974fcf79a in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x9879a)
#1 0x5a6e64 in spdk_nvmf_rdma_poll_group_create /home/pawelkax/newest_spdk/spdk/lib/nvmf/rdma.c:1750
#2 0x58d9f4 in spdk_nvmf_transport_poll_group_create /home/pawelkax/newest_spdk/spdk/lib/nvmf/transport.c:122
#3 0x587aa5 in spdk_nvmf_poll_group_add_transport /home/pawelkax/newest_spdk/spdk/lib/nvmf/nvmf.c:696
#4 0x58501e in spdk_nvmf_tgt_listen_add_transport /home/pawelkax/newest_spdk/spdk/lib/nvmf/nvmf.c:416
#5 0x5e73d2 in _call_channel /home/pawelkax/newest_spdk/spdk/lib/thread/thread.c:605
#6 0x5d0edc in _spdk_reactor_msg_passed /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:215
#7 0x5d0d53 in _spdk_event_queue_run_batch /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:202
#8 0x5d39f5 in _spdk_reactor_run /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:462
#9 0x5d5dcd in spdk_reactors_start /home/pawelkax/newest_spdk/spdk/lib/event/reactor.c:633
#10 0x5cdf9f in spdk_app_start /home/pawelkax/newest_spdk/spdk/lib/event/app.c:570
#11 0x4070cd in main /home/pawelkax/newest_spdk/spdk/app/nvmf_tgt/nvmf_main.c:79
#12 0x7ff9729d782f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

SUMMARY: AddressSanitizer: heap-use-after-free /home/pawelkax/newest_spdk/spdk/lib/nvmf/rdma.c:356 spdk_nvmf_rdma_qpair_destroy
Shadow bytes around the buggy address:
0x0c0c7fff9c70: 00 00 00 00 fa fa fa fa 00 00 00 00 00 00 00 00
0x0c0c7fff9c80: fa fa fa fa 00 00 00 00 00 00 00 00 fa fa fa fa
0x0c0c7fff9c90: 00 00 00 00 00 00 00 00 fa fa fa fa 00 00 00 00
0x0c0c7fff9ca0: 00 00 00 00 fa fa fa fa 00 00 00 00 00 00 00 00
0x0c0c7fff9cb0: fa fa fa fa 00 00 00 00 00 00 00 00 fa fa fa fa
=>0x0c0c7fff9cc0: fd fd fd fd[fd]fd fd fa fa fa fa fa fd fd fd fd
0x0c0c7fff9cd0: fd fd fd fd fa fa fa fa 00 00 00 00 00 00 00 00
0x0c0c7fff9ce0: fa fa fa fa fd fd fd fd fd fd fd fa fa fa fa fa
0x0c0c7fff9cf0: fd fd fd fd fd fd fd fa fa fa fa fa fd fd fd fd
0x0c0c7fff9d00: fd fd fd fa fa fa fa fa fd fd fd fd fd fd fd fa
0x0c0c7fff9d10: fa fa fa fa fd fd fd fd fd fd fd fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
==144090==ABORTING

Possible Solution

None

Steps to Reproduce

  1. ./app/nvmf_tgt/nvmf_tgt
  2. ./scripts/rpc.py construct_malloc_bdev 2048 512 -b Malloc
  3. ./scripts/rpc.py construct_nvmf_subsystem -s N37SXV509SRW nqn.2014-08.org.spdk:cnode1 'trtype:RDMA traddr:192.168.100.10 trsvcid:4260' '' -a -n "Malloc"
  4. nvme connect -t rdma -n nqn.2014-08.org.spdk:cnode1 -a 192.168.100.10 -s 4260 -Q 32
  5. dd if=/dev/zero of=/dev/nvme0n1 bs=1M
    and during this command press Ctrl-C.
  6. Kill nvmf_tgt

Context (Environment including OS version, SPDK version, etc.)

Ubuntu 16.04

@WenqianZong
Copy link

WenqianZong commented Jun 21, 2018

@pawelkax
I can not reproduce this issue with your steps . Can you tell me where your error log was found?core dump trace log?

@pawelkax
Copy link
Contributor Author

I found my logs after killing nvmf_tgt application.
Before compiling spdk sources I use:
./configure --enable-debug --enable-werror --enable-coverage --enable-ubsan --enable-asan --with-rdma --with-pmdk --with-rbd --with-iscsi-initiator

@WenqianZong
Copy link

@pawelkax
Unfortunately, I also can not reproduce this issue with compiling spdk command "./configure --enable-debug --enable-werror --enable-coverage --enable-ubsan --enable-asan --with-rdma --with-pmdk --with-rbd --with-iscsi-initiator".

@pawelkax
Copy link
Contributor Author

@WenqianZong
Ok. I will try to reproduce this again and try to clearly define steps to get error message.

@wanqunintel
Copy link

@pawelkax do you have the commit #?

@pawelkax
Copy link
Contributor Author

@wanqunintel No, I don't have any commit.

@pawelkax
Copy link
Contributor Author

pawelkax commented Jun 26, 2018

@wanqunintel @WenqianZong
When I use command:
nvme connect -t rdma -n nqn.2014-08.org.spdk:cnode1 -a 192.168.100.10 -s 4260 -Q 32
and during this command I press Ctrl-C, then error output occurs.
This is similar but not the same error as in this issue.

@pawelkax
Copy link
Contributor Author

pawelkax commented Jun 26, 2018

@wanqunintel @WenqianZong
Similar output I get when I use command:
nvme connect -t rdma -n nqn.2014-08.org.spdk:cnode1 -a 192.168.100.10 -s 4260 -Q 32
wait until it finishes, then use command:
dd if=/dev/zero of=/dev/nvme0n1 bs=1M
and press Ctrl-C. After that I kill nvmf_tgt.
This is similar but not the same error as in this issue.

@ghost ghost added the bug label Jun 27, 2018
@wanqunintel
Copy link

@pawelkax is there any patch for it

@pawelkax
Copy link
Contributor Author

@wanqunintel No there is no patch for it, This error occurs on master.

@WenqianZong
Copy link

@pawelkax
I still can not reproduce this issue.Could you provide detail test env(kernel ,os...), thanks.

@pawelkax
Copy link
Contributor Author

pawelkax commented Jul 6, 2018

@WenqianZong I will check again if this error exists

@WenqianZong
Copy link

I use ubuntu 17.10 and test this issue , it can be reproduced.But I test 10 times , only happened once. Following is log:

==271060==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000000dc0 at pc 0x558586791094 bp 0x7ffc652815c0 sp 0x7ffc652815b0
WRITE of size 8 at 0x606000000dc0 thread T0 (reactor_0)
    #0 0x558586791093 in spdk_nvmf_rdma_qpair_destroy /home/zong/spdk/lib/nvmf/rdma.c:296
    #1 0x55858679a227 in spdk_nvmf_ctrlr_destruct /home/zong/spdk/lib/nvmf/ctrlr.c:210
    #2 0x55858678d7a6 in spdk_nvmf_subsystem_destroy /home/zong/spdk/lib/nvmf/subsystem.c:347
    #3 0x55858678ef5e in spdk_nvmf_tgt_destroy /home/zong/spdk/lib/nvmf/nvmf.c:203
    #4 0x558586784dae in nvmf_tgt_advance_state /home/zong/spdk/lib/event/subsystems/nvmf/nvmf_tgt.c:306
    #5 0x5585867a5d70 in _spdk_event_queue_run_batch /home/zong/spdk/lib/event/reactor.c:202
    #6 0x5585867a5d70 in _spdk_reactor_run /home/zong/spdk/lib/event/reactor.c:446
    #7 0x5585867a68c7 in spdk_reactors_start /home/zong/spdk/lib/event/reactor.c:624
    #8 0x5585867a3ee6 in spdk_app_start /home/zong/spdk/lib/event/app.c:484
    #9 0x55858670a226 in main /home/zong/spdk/app/nvmf_tgt/nvmf_main.c:79
    #10 0x7f23fc0ea1c0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x211c0)
    #11 0x55858670c3b9 in _start (/home/zong/spdk/app/nvmf_tgt/nvmf_tgt+0x463b9)

0x606000000dc0 is located 32 bytes inside of 56-byte region [0x606000000da0,0x606000000dd8)
freed by thread T0 (reactor_0) here:
    #0 0x7f23fd5ec7b8 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xde7b8)
    #1 0x55858679123e in spdk_nvmf_rdma_poll_group_destroy /home/zong/spdk/lib/nvmf/rdma.c:1569

previously allocated by thread T0 (reactor_0) here:
    #0 0x7f23fd5ecd38 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded38)
    #1 0x558586791336 in spdk_nvmf_rdma_poll_group_create /home/zong/spdk/lib/nvmf/rdma.c:1521

SUMMARY: AddressSanitizer: heap-use-after-free /home/zong/spdk/lib/nvmf/rdma.c:296 in spdk_nvmf_rdma_qpair_destroy
Shadow bytes around the buggy address:
  0x0c0c7fff8160: fd fd fd fd fd fd fd fa fa fa fa fa fd fd fd fd
  0x0c0c7fff8170: fd fd fd fa fa fa fa fa fd fd fd fd fd fd fd fa
  0x0c0c7fff8180: fa fa fa fa fd fd fd fd fd fd fd fa fa fa fa fa
  0x0c0c7fff8190: fd fd fd fd fd fd fd fa fa fa fa fa fd fd fd fd
  0x0c0c7fff81a0: fd fd fd fa fa fa fa fa fd fd fd fd fd fd fd fd
=>0x0c0c7fff81b0: fa fa fa fa fd fd fd fd[fd]fd fd fa fa fa fa fa
  0x0c0c7fff81c0: fd fd fd fd fd fd fd fd fa fa fa fa fd fd fd fd
  0x0c0c7fff81d0: fd fd fd fa fa fa fa fa fd fd fd fd fd fd fd fa
  0x0c0c7fff81e0: fa fa fa fa fd fd fd fd fd fd fd fa fa fa fa fa
  0x0c0c7fff81f0: fd fd fd fd fd fd fd fa fa fa fa fa fd fd fd fd
  0x0c0c7fff8200: fd fd fd fd fa fa fa fa fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==271060==ABORTING

@pawelkax
Copy link
Contributor Author

@WenqianZong Thank you very match for testing this issue. I can reproduce this problem on ubuntu 16.04 every time I try.

@pawelkax
Copy link
Contributor Author

pawelkax commented Jul 25, 2018

@tomzawadzki @Seth5141 @WenqianZong
I confirm that this error still exists. I don't think that this issue can be merged with #347 . We will be able to check that these two issues are the same, only if any bug from #337 and #347 will be solved. Steps to reproduce these two bugs are listed in description.

@pawelkax
Copy link
Contributor Author

pawelkax commented Aug 7, 2018

@tomzawadzki @Seth5141 @WenqianZong
I tested this bug on master and I can still reproduce it

@pawelkax
Copy link
Contributor Author

pawelkax commented Aug 7, 2018

New output when killing nvmf_tgt:

=================================================================
==7241==ERROR: AddressSanitizer: heap-use-after-free on address 0x61700000c128 at pc 0x0000005aad3e bp 0x7ffccae47a90 sp 0x7ffccae47a80
READ of size 8 at 0x61700000c128 thread T0 (reactor_0)
#0 0x5aad3d in _spdk_nvmf_qpair_destroy /root/spdk_nvmf/lib/nvmf/nvmf.c:697
#1 0x5ab914 in _spdk_nvmf_qpair_deactivate /root/spdk_nvmf/lib/nvmf/nvmf.c:751
#2 0x5f9609 in _spdk_reactor_msg_passed /root/spdk_nvmf/lib/event/reactor.c:220
#3 0x5f9480 in _spdk_event_queue_run_batch /root/spdk_nvmf/lib/event/reactor.c:207
#4 0x5fc7cf in _spdk_reactor_run /root/spdk_nvmf/lib/event/reactor.c:506
#5 0x5fecd5 in spdk_reactors_start /root/spdk_nvmf/lib/event/reactor.c:692
#6 0x5f6614 in spdk_app_start /root/spdk_nvmf/lib/event/app.c:576
#7 0x4079cd in main /root/spdk_nvmf/app/nvmf_tgt/nvmf_main.c:79
#8 0x7f374fda982f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#9 0x407648 in _start (/root/spdk_nvmf/app/nvmf_tgt/nvmf_tgt+0x407648)

0x61700000c128 is located 40 bytes inside of 680-byte region [0x61700000c100,0x61700000c3a8)
freed by thread T0 (reactor_0) here:
#0 0x7f3752bd42ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca)
#1 0x5b704c in spdk_nvmf_rdma_qpair_destroy /root/spdk_nvmf/lib/nvmf/rdma.c:520
#2 0x5d2635 in spdk_nvmf_rdma_close_qpair /root/spdk_nvmf/lib/nvmf/rdma.c:2392
#3 0x5b3f6d in spdk_nvmf_transport_qpair_fini /root/spdk_nvmf/lib/nvmf/transport.c:178
#4 0x5ab0db in _spdk_nvmf_qpair_destroy /root/spdk_nvmf/lib/nvmf/nvmf.c:707
#5 0x5af7f2 in spdk_nvmf_qpair_request_cleanup /root/spdk_nvmf/lib/nvmf/request.c:55
#6 0x5afd54 in spdk_nvmf_request_free /root/spdk_nvmf/lib/nvmf/request.c:72
#7 0x5ebacc in spdk_nvmf_qpair_free_aer /root/spdk_nvmf/lib/nvmf/ctrlr.c:1726
#8 0x5ab8dc in _spdk_nvmf_qpair_deactivate /root/spdk_nvmf/lib/nvmf/nvmf.c:747
#9 0x5f9609 in _spdk_reactor_msg_passed /root/spdk_nvmf/lib/event/reactor.c:220
#10 0x5f9480 in _spdk_event_queue_run_batch /root/spdk_nvmf/lib/event/reactor.c:207
#11 0x5fc7cf in _spdk_reactor_run /root/spdk_nvmf/lib/event/reactor.c:506
#12 0x5fecd5 in spdk_reactors_start /root/spdk_nvmf/lib/event/reactor.c:692
#13 0x5f6614 in spdk_app_start /root/spdk_nvmf/lib/event/app.c:576
#14 0x4079cd in main /root/spdk_nvmf/app/nvmf_tgt/nvmf_main.c:79
#15 0x7f374fda982f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

previously allocated by thread T0 (reactor_0) here:
#0 0x7f3752bd479a in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x9879a)
#1 0x5bf480 in nvmf_rdma_connect /root/spdk_nvmf/lib/nvmf/rdma.c:888
#2 0x5cc829 in spdk_nvmf_process_cm_event /root/spdk_nvmf/lib/nvmf/rdma.c:1792
#3 0x5cf18d in spdk_nvmf_rdma_accept /root/spdk_nvmf/lib/nvmf/rdma.c:2103
#4 0x5b2fd1 in spdk_nvmf_transport_accept /root/spdk_nvmf/lib/nvmf/transport.c:106
#5 0x5a975a in spdk_nvmf_tgt_accept /root/spdk_nvmf/lib/nvmf/nvmf.c:584
#6 0x58b2e8 in acceptor_poll /root/spdk_nvmf/lib/event/subsystems/nvmf/nvmf_tgt.c:144
#7 0x5fd88e in _spdk_reactor_run /root/spdk_nvmf/lib/event/reactor.c:541
#8 0x5fecd5 in spdk_reactors_start /root/spdk_nvmf/lib/event/reactor.c:692
#9 0x5f6614 in spdk_app_start /root/spdk_nvmf/lib/event/app.c:576
#10 0x4079cd in main /root/spdk_nvmf/app/nvmf_tgt/nvmf_main.c:79
#11 0x7f374fda982f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

SUMMARY: AddressSanitizer: heap-use-after-free /root/spdk_nvmf/lib/nvmf/nvmf.c:697 _spdk_nvmf_qpair_destroy
Shadow bytes around the buggy address:
0x0c2e7fff97d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c2e7fff97e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c2e7fff97f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c2e7fff9800: 00 00 00 00 00 fa fa fa fa fa fa fa fa fa fa fa
0x0c2e7fff9810: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c2e7fff9820: fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd
0x0c2e7fff9830: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c2e7fff9840: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c2e7fff9850: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c2e7fff9860: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c2e7fff9870: fd fd fd fd fd fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
==7241==ABORTING

@benlwalker
Copy link
Member

This patch series fixes the problem, it just isn't fully merged yet:

https://review.gerrithub.io/#/c/spdk/spdk/+/421167/

@WenqianZong
Copy link

@benlwalker I pull patch https://review.gerrithub.io/#/c/spdk/spdk/+/421167/ and verified issue, this issue also can be reproduced.Following is log:

^C=================================================================
==34567==ERROR: AddressSanitizer: heap-use-after-free on address 0x6170000023a8 at pc 0x0000005f856b bp 0x7ffe7632a490 sp 0x7ffe7632a480
READ of size 8 at 0x6170000023a8 thread T0 (reactor_0)
    #0 0x5f856a in _spdk_nvmf_qpair_destroy /home/zong/spdk/lib/nvmf/nvmf.c:697
    #1 0x5f9319 in _spdk_nvmf_qpair_deactivate /home/zong/spdk/lib/nvmf/nvmf.c:751
    #2 0x658d17 in _spdk_reactor_msg_passed /home/zong/spdk/lib/event/reactor.c:220
    #3 0x658b80 in _spdk_event_queue_run_batch /home/zong/spdk/lib/event/reactor.c:207
    #4 0x65c936 in _spdk_reactor_run /home/zong/spdk/lib/event/reactor.c:506
    #5 0x65f48a in spdk_reactors_start /home/zong/spdk/lib/event/reactor.c:692
    #6 0x6558f2 in spdk_app_start /home/zong/spdk/lib/event/app.c:576
    #7 0x407c48 in main /home/zong/spdk/app/nvmf_tgt/nvmf_main.c:79
    #8 0x7fa3646ae18a in __libc_start_main (/lib64/libc.so.6+0x2318a)
    #9 0x407859 in _start (/home/zong/spdk/app/nvmf_tgt/nvmf_tgt+0x407859)

0x6170000023a8 is located 40 bytes inside of 688-byte region [0x617000002380,0x617000002630)
freed by thread T0 (reactor_0) here:
    #0 0x7fa367a5c880 in __interceptor_free (/lib64/libasan.so.5+0xee880)
    #1 0x6071cb in spdk_nvmf_rdma_qpair_destroy /home/zong/spdk/lib/nvmf/rdma.c:522
    #2 0x629ecb in spdk_nvmf_rdma_close_qpair /home/zong/spdk/lib/nvmf/rdma.c:2422
    #3 0x603257 in spdk_nvmf_transport_qpair_fini /home/zong/spdk/lib/nvmf/transport.c:178
    #4 0x5f898e in _spdk_nvmf_qpair_destroy /home/zong/spdk/lib/nvmf/nvmf.c:707
    #5 0x5fdd88 in spdk_nvmf_qpair_request_cleanup /home/zong/spdk/lib/nvmf/request.c:55
    #6 0x5fe3c0 in spdk_nvmf_request_free /home/zong/spdk/lib/nvmf/request.c:72
    #7 0x64887c in spdk_nvmf_qpair_free_aer /home/zong/spdk/lib/nvmf/ctrlr.c:1726
    #8 0x5f92f6 in _spdk_nvmf_qpair_deactivate /home/zong/spdk/lib/nvmf/nvmf.c:747
    #9 0x658d17 in _spdk_reactor_msg_passed /home/zong/spdk/lib/event/reactor.c:220
    #10 0x658b80 in _spdk_event_queue_run_batch /home/zong/spdk/lib/event/reactor.c:207
    #11 0x65c936 in _spdk_reactor_run /home/zong/spdk/lib/event/reactor.c:506
    #12 0x65f48a in spdk_reactors_start /home/zong/spdk/lib/event/reactor.c:692
    #13 0x6558f2 in spdk_app_start /home/zong/spdk/lib/event/app.c:576
    #14 0x407c48 in main /home/zong/spdk/app/nvmf_tgt/nvmf_main.c:79
    #15 0x7fa3646ae18a in __libc_start_main (/lib64/libc.so.6+0x2318a)

previously allocated by thread T0 (reactor_0) here:
    #0 0x7fa367a5ce50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x611ee6 in nvmf_rdma_connect /home/zong/spdk/lib/nvmf/rdma.c:890
    #2 0x62330b in spdk_nvmf_process_cm_event /home/zong/spdk/lib/nvmf/rdma.c:1803
    #3 0x625f04 in spdk_nvmf_rdma_accept /home/zong/spdk/lib/nvmf/rdma.c:2136
    #4 0x6020ed in spdk_nvmf_transport_accept /home/zong/spdk/lib/nvmf/transport.c:106
    #5 0x5f6aad in spdk_nvmf_tgt_accept /home/zong/spdk/lib/nvmf/nvmf.c:584
    #6 0x5d3cc9 in acceptor_poll /home/zong/spdk/lib/event/subsystems/nvmf/nvmf_tgt.c:144
    #7 0x65ddbb in _spdk_reactor_run /home/zong/spdk/lib/event/reactor.c:541
    #8 0x65f48a in spdk_reactors_start /home/zong/spdk/lib/event/reactor.c:692
    #9 0x6558f2 in spdk_app_start /home/zong/spdk/lib/event/app.c:576
    #10 0x407c48 in main /home/zong/spdk/app/nvmf_tgt/nvmf_main.c:79
    #11 0x7fa3646ae18a in __libc_start_main (/lib64/libc.so.6+0x2318a)

SUMMARY: AddressSanitizer: heap-use-after-free /home/zong/spdk/lib/nvmf/nvmf.c:697 in _spdk_nvmf_qpair_destroy
Shadow bytes around the buggy address:
  0x0c2e7fff8420: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2e7fff8430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2e7fff8440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2e7fff8450: 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa
  0x0c2e7fff8460: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c2e7fff8470: fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd
  0x0c2e7fff8480: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2e7fff8490: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2e7fff84a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2e7fff84b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2e7fff84c0: fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==34567==ABORTING

@benlwalker
Copy link
Member

I just took a look at what happened. The above bug is actually different than the one originally reported and the patch I linked originally did fix the first bug. An additional bug was introduced as part of another fix and when the patches I linked were rebased this new failure appeared.

I've added an additional patch to my series that fixes this new bug. You can get to the code here:

https://review.gerrithub.io/#/c/spdk/spdk/+/422134/

We really need to get an automated test into the pool that kills the nvmf target while I/O is outstanding so that we stop introducing new problems.

@benlwalker
Copy link
Member

benlwalker commented Aug 13, 2018

@liang2yan @wanqunintel @WenqianZong can you all please write a patch that adds the following test to ensure these issues do not get reintroduced:

  1. ./app/nvmf_tgt/nvmf_tgt -s 2048 -m 0x1 &
  2. ./scripts/rpc.py construct_malloc_bdev 1024 512 -b Malloc
  3. ./scripts/rpc.py construct_nvmf_subsystem -s SPDK0 nqn.2014-08.org.spdk:cnode1 'trtype:RDMA traddr:ip trsvcid:port' '' -a -n "Malloc"
  4. ./examples/nvme/perf/perf -r 'trtype:RDMA adrfam:IPv4 traddr:ip trsvcid:port' -d 1024 -q 128 -s 4096 -w randread -t 20 -c 0x2 &
  5. sleep 5
  6. kill nvmf_tgt
  7. kill perf

@wanqunintel
Copy link

@WenqianZong will add these case in the autotest. thanks.

@WenqianZong
Copy link

@benlwalker I use latest spdk to test this issue and find new asan log:

app.c: 574:spdk_app_start: *NOTICE*: Total cores available: 1
reactor.c: 718:spdk_reactors_init: *NOTICE*: Occupied cpu socket mask is 0x1
reactor.c: 492:_spdk_reactor_run: *NOTICE*: Reactor started on core 0 on socket 0
rdma.c:2203:spdk_nvmf_rdma_poll_group_create: *NOTICE*: Skipping unused RDMA device when creating poll group.
rdma.c:2203:spdk_nvmf_rdma_poll_group_create: *NOTICE*: Skipping unused RDMA device when creating poll group.
rdma.c:2203:spdk_nvmf_rdma_poll_group_create: *NOTICE*: Skipping unused RDMA device when creating poll group.
rdma.c:2203:spdk_nvmf_rdma_poll_group_create: *NOTICE*: Skipping unused RDMA device when creating poll group.
rdma.c:2203:spdk_nvmf_rdma_poll_group_create: *NOTICE*: Skipping unused RDMA device when creating poll group.
rdma.c:2203:spdk_nvmf_rdma_poll_group_create: *NOTICE*: Skipping unused RDMA device when creating poll group.
profiling:/home/zong/spdk/lib/virtio/virtio_user/vhost_user.gcda:Version mismatch - expected A81R got 603R

=================================================================
==17654==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 480 byte(s) in 1 object(s) allocated from:
    #0 0x7f3fdbe30e50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x7f3fbc92df74  (/usr/lib64/libibverbs/libmlx5-rdmav16.so+0x12f74)

Indirect leak of 524288 byte(s) in 1 object(s) allocated from:
    #0 0x7f3fdbe31a60 in posix_memalign (/lib64/libasan.so.5+0xefa60)
    #1 0x7f3fbc91e4eb  (/usr/lib64/libibverbs/libmlx5-rdmav16.so+0x34eb)

SUMMARY: AddressSanitizer: 524768 byte(s) leaked in 2 allocation(s).

test step :
./app/nvmf_tgt/nvmf_tgt -s 2048 -m 0x1 &
./scripts/rpc.py construct_malloc_bdev 1024 512 -b Malloc
./scripts/rpc.py construct_nvmf_subsystem -s SPDK0 nqn.2014-08.org.spdk:cnode1 'trtype:RDMA traddr:ip trsvcid:port' '' -a -n "Malloc"
./examples/nvme/perf/perf -r 'trtype:RDMA adrfam:IPv4 traddr:ip trsvcid:port' -d 1024 -q 128 -s 4096 -w randread -t 20 -c 0x2 &
sleep 5
kill nvmf_tgt
kill perf

@jimharris
Copy link
Member

Hi @WenqianZong - should we file a different issue for the ASAN error? It seems unrelated to the original failure reported by @pawelkax.

@wanqunintel
Copy link

close this one and will raise a new issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug High High Priority Bug
Projects
None yet
Development

No branches or pull requests

6 participants