QDMA_Linux: bug in qdma_cdev_destroy??? #13

Closed
hmaarrfk opened this issue May 14, 2019 · 2 comments
Comments

@hmaarrfk

I'm not sure exactly how I triggered this, but here is the basic command history that I think can help recreate it.

All of this is being run as my own user (with the fixed-up udev rules).

I think there are a few timing bugs in the dmautils app, which is why, even though qdma01000-MM-0 was created, it claims it can't find the device node and just shuts down.
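If the "cannot find" error really is a race between the queue being added and the device node becoming visible (an assumption, the log does not confirm it), a tool could poll for the node for a short, bounded time before giving up. A minimal sketch, where `wait_for_node` and its parameters are hypothetical and not part of dmautils:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper (not part of dmautils): poll until `path` exists or
 * the retry budget is used up. Returns true if the node showed up.
 */
bool wait_for_node(const char *path, int max_tries, long delay_ms)
{
    struct stat st;
    struct timespec delay = {
        .tv_sec = delay_ms / 1000,
        .tv_nsec = (delay_ms % 1000) * 1000000L,
    };

    assert(path != NULL && max_tries > 0 && delay_ms >= 0);

    for (int i = 0; i < max_tries; i++) {
        if (stat(path, &st) == 0)
            return true;          /* node is visible now */
        if (errno != ENOENT)
            return false;         /* e.g. EACCES: retrying will not help */
        nanosleep(&delay, NULL);  /* give the node time to appear */
    }
    return false;
}
```

A caller could use something like `wait_for_node("/dev/qdma01000-MM-0", 50, 10)` before opening the device, and only report the error once the budget is exhausted.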

$ ./dmautils -c config/dmautils_config/mm-bi/mm_1_1/bi_mm_1_1_64                                                                                      
dmactl qdma01000 q add idx 0 mode mm dir h2c

qdma01000-MM-0 H2C added.
Added 1 Queues.

dmactl qdma01000 q start idx 0 dir h2c idx_ringsz 5

1 Queues started, idx 0 ~ 0.

Error: Cannot find /dev/qdma01000-MM-0
mark@mark-MS-7758 ~/g/d/Q/l/tools human_readable_error_message|+2⚑ 1                                                                                  
$ dmactl qdma01000 q del idx 0 mode mm dir h2c                                                                                                        

queue qdma01000-MM-0, id 0 cannot be deleted. Invalid q state

dmactl: Warn: Ignoring attr: mode
mark@mark-MS-7758 ~/g/d/Q/l/tools human_readable_error_message|+2⚑ 1                                                                                  
$ dmactl qdma01000 q stop idx 0 mode mm dir h2c                                                                                                       

Stopped Queues 0 -> 0.

dmactl: Warn: Ignoring attr: mode
mark@mark-MS-7758 ~/g/d/Q/l/tools human_readable_error_message|+2⚑ 1                                                                                  
$ dmactl qdma01000 q del idx 0 mode mm dir h2c                                                                                                        
dmactl: Warn: Ignoring attr: mode
Segmentation fault (core dumped)
mark@mark-MS-7758 ~/g/d/Q/l/tools human_readable_error_message|+2⚑ 1                                                                                  
$ dmactl qdma01000 q del idx 0 mode mm dir h2c                                                                                                        
dmactl: Warn: Ignoring attr: mode
^C^C^C^C^C^C^C^C^C^C^C
[   56.515057] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[   58.630003] qdma:qdma_queue_reconfig: qdma01000-MM-0 invalid state, q_state 2.
[   58.630006] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[   59.211169] qdma:qdma_queue_reconfig: qdma01000-MM-0 invalid state, q_state 2.
[   59.211172] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[   59.638692] qdma:qdma_queue_reconfig: qdma01000-MM-0 invalid state, q_state 2.
[   59.638695] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[ 1133.315622] qdma:xnl_q_del: xpdev_queue_delete() failed: -8
[ 1139.733050] ------------[ cut here ]------------
[ 1139.733054] kernel BUG at /build/linux-fkZVDM/linux-4.15.0/mm/slub.c:296!
[ 1139.733057] invalid opcode: 0000 [#1] SMP PTI
[ 1139.733059] Modules linked in: nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_intel snd_hda_codec snd_hda_core irqbypass snd_hwdep crct10dif_pclmul snd_pcm crc32_pclmul ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event aesni_intel joydev input_leds aes_x86_64 crypto_simd glue_helper snd_rawmidi cryptd intel_cstate i915 intel_rapl_perf snd_seq snd_seq_device qdma(OE) snd_timer drm_kms_helper qdma_vf(OE) wmi snd drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect soundcore sysimgblt lpc_ich mei_me shpchp mac_hid mei video ie31200_edac intel_smartconnect sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid r8169 ahci libahci mii
[ 1139.733113] CPU: 3 PID: 3413 Comm: dmactl Tainted: G           OE    4.15.0-48-generic #51-Ubuntu
[ 1139.733115] Hardware name: MSI MS-7758/Z77A-G43 (MS-7758), BIOS V2.13 03/07/2014
[ 1139.733123] RIP: 0010:__slab_free+0x17a/0x2c0
[ 1139.733126] RSP: 0018:ffffbbbc8a35b890 EFLAGS: 00010246
[ 1139.733129] RAX: ffff990f29366100 RBX: ffff990f29366100 RCX: 0000000180100002
[ 1139.733131] RDX: ffff990f29366100 RSI: fffff11e1ca4d980 RDI: ffff990ffec03200
[ 1139.733133] RBP: ffffbbbc8a35b930 R08: 0000000000000001 R09: ffffffffc0464fbe
[ 1139.733136] R10: ffffbbbc8a35b950 R11: 0000000000000100 R12: ffff990ffec03200
[ 1139.733138] R13: fffff11e1ca4d980 R14: ffff990f29366100 R15: 0000000000000100
[ 1139.733141] FS:  00007ff92a6aa500(0000) GS:ffff99101f380000(0000) knlGS:0000000000000000
[ 1139.733144] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1139.733146] CR2: 0000564938c8a000 CR3: 0000000794dbe002 CR4: 00000000001606e0
[ 1139.733149] Call Trace:
[ 1139.733156]  ? invalid_op+0x1b/0x40
[ 1139.733169]  ? qdma_cdev_destroy+0x3e/0x80 [qdma]
[ 1139.733173]  kfree+0x165/0x180
[ 1139.733177]  ? kfree+0x165/0x180
[ 1139.733184]  qdma_cdev_destroy+0x3e/0x80 [qdma]
[ 1139.733192]  xpdev_queue_delete+0xef/0x130 [qdma]
[ 1139.733199]  xnl_q_del.part.12+0x158/0x250 [qdma]
[ 1139.733205]  ? skb_queue_tail+0x43/0x50
[ 1139.733210]  ? __netlink_sendskb+0x44/0x70
[ 1139.733214]  ? netlink_unicast+0x20c/0x240
[ 1139.733221]  ? xnl_dev_info+0x22e/0x300 [qdma]
[ 1139.733229]  xnl_q_del+0x16/0x20 [qdma]
[ 1139.733233]  genl_family_rcv_msg+0x1fe/0x3f0
[ 1139.733238]  ? get_page_from_freelist+0xf16/0x1400
[ 1139.733242]  ? enqueue_task_fair+0xa1/0x7f0
[ 1139.733247]  genl_rcv_msg+0x4c/0x90
[ 1139.733250]  ? genl_family_rcv_msg+0x3f0/0x3f0
[ 1139.733254]  netlink_rcv_skb+0x54/0x130
[ 1139.733258]  genl_rcv+0x28/0x40
[ 1139.733261]  netlink_unicast+0x19e/0x240
[ 1139.733265]  netlink_sendmsg+0x2d1/0x3d0
[ 1139.733271]  sock_sendmsg+0x3e/0x50
[ 1139.733275]  SYSC_sendto+0x13f/0x180
[ 1139.733282]  ? __do_page_fault+0x270/0x4d0
[ 1139.733286]  SyS_sendto+0xe/0x10
[ 1139.733292]  do_syscall_64+0x73/0x130
[ 1139.733295]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 1139.733299] RIP: 0033:0x7ff92a1cada7
[ 1139.733301] RSP: 002b:00007ffc3878b638 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 1139.733304] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff92a1cada7
[ 1139.733307] RDX: 0000000000000034 RSI: 0000564938c88380 RDI: 0000000000000003
[ 1139.733309] RBP: 00007ffc3878b670 R08: 00007ffc3878b65c R09: 000000000000000c
[ 1139.733311] R10: 0000000000000000 R11: 0000000000000246 R12: 00005649384ab320
[ 1139.733313] R13: 00007ffc3878b950 R14: 0000000000000000 R15: 0000000000000000
[ 1139.733316] Code: 0f 84 ee fe ff ff 44 0f b6 7d 8b 80 7d ab 00 79 05 45 84 ff 74 61 48 83 c4 70 5b 41 5a 41 5c 41 5d 41 5e 41 5f 5d 49 8d 62 f8 c3 <0f> 0b 4c 89 d0 4c 89 d7 45 89 fa 48 85 c0 44 0f b6 7d 8b 74 cb 
[ 1139.733363] RIP: __slab_free+0x17a/0x2c0 RSP: ffffbbbc8a35b890
[ 1139.733366] ---[ end trace e8a14e9bc0d62286 ]---
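
The trace above shows the BUG being raised from `kfree` called inside `qdma_cdev_destroy`, after an earlier `q del` had already failed. One common cause of that pattern is freeing the same object twice, but the log alone does not confirm it, so treat that as an assumption. A user-space sketch of the usual guard, with hypothetical types and names that are not taken from the driver:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-queue character-device record. */
struct cdev_record {
    char name[32];
};

/* Hypothetical owner: the queue slot holds the only pointer to the record. */
struct queue_slot {
    struct cdev_record *cdev;
};

int slot_create(struct queue_slot *slot, const char *name)
{
    assert(slot != NULL && name != NULL);
    if (slot->cdev != NULL)
        return -1;                     /* already created: refuse, keep state */
    slot->cdev = calloc(1, sizeof(*slot->cdev));
    if (slot->cdev == NULL)
        return -1;                     /* allocation failed */
    /* calloc zeroed the buffer, so the copy stays NUL-terminated */
    strncpy(slot->cdev->name, name, sizeof(slot->cdev->name) - 1);
    return 0;
}

/* Safe to call repeatedly: the second call sees NULL and does nothing. */
void slot_destroy(struct queue_slot *slot)
{
    assert(slot != NULL);
    free(slot->cdev);   /* free(NULL) is a no-op */
    slot->cdev = NULL;  /* drop the stale pointer so it cannot be freed twice */
}
```

The same idea carries over to kernel code, where `kfree(NULL)` is also a no-op: clearing the owning pointer right after the free makes a repeated destroy harmless.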

@hmaarrfk
Author

So I'm not sure exactly how to recreate it, but basically you can try to mess with the driver a little bit by repeatedly trying to add a queue that has already been added.

Running

dmactl qdma01000 q add idx 0 dir h2c idx_ringsz 5
dmactl qdma01000 q start idx 0 dir h2c idx_ringsz 5
dmactl qdma01000 q add idx 0 dir h2c idx_ringsz 5
dmactl qdma01000 q start idx 0 dir h2c idx_ringsz 5
dmactl qdma01000 q stop idx 0 dir h2c idx_ringsz 5
dmactl qdma01000 q add idx 0 dir h2c idx_ringsz 5
dmactl qdma01000 q start idx 0 dir h2c idx_ringsz 5
dmactl qdma01000 q stop idx 0 dir h2c idx_ringsz 5
dmactl qdma01000 q del idx 0 dir h2c idx_ringsz 5

Run this a few times and you get

[  114.054856] qdma:qdma_queue_add: descq idx 0 already added.
[  114.054859] qdma:xnl_q_add: xpdev_queue_add() failed: -12
[  114.056097] qdma:qdma_queue_reconfig: qdma01000-MM-0 invalid state, q_state 2.
[  114.056099] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[  114.058385] qdma:qdma_queue_add: descq idx 0 already added.
[  114.058389] qdma:xnl_q_add: xpdev_queue_add() failed: -12
[  123.732637] qdma:qdma_queue_add: descq idx 0 already added.
[  123.732641] qdma:xnl_q_add: xpdev_queue_add() failed: -12
[  123.734470] qdma:qdma_queue_reconfig: qdma01000-MM-0 invalid state, q_state 2.
[  123.734473] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[  123.737047] qdma:qdma_queue_add: descq idx 0 already added.
[  123.737049] qdma:xnl_q_add: xpdev_queue_add() failed: -12
[  131.492421] qdma:qdma_queue_add: descq idx 0 already added.
[  131.492424] qdma:xnl_q_add: xpdev_queue_add() failed: -12
[  131.494234] qdma:qdma_queue_reconfig: qdma01000-MM-0 invalid state, q_state 2.
[  131.494237] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[  131.496782] qdma:qdma_queue_add: descq idx 0 already added.
[  131.496785] qdma:xnl_q_add: xpdev_queue_add() failed: -12
[  150.450714] qdma:qdma_queue_add: descq idx 0 already added.
[  150.450716] qdma:xnl_q_add: xpdev_queue_add() failed: -12
[  150.456352] qdma:qdma_queue_reconfig: qdma01000-MM-0 invalid state, q_state 2.
[  150.456353] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[  150.463805] qdma:qdma_queue_add: descq idx 0 already added.
[  150.463808] qdma:xnl_q_add: xpdev_queue_add() failed: -12
[  150.469872] qdma:qdma_queue_reconfig: qdma01000-MM-0 invalid state, q_state 2.
[  150.469874] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[  150.483322] qdma:qdma_queue_add: descq idx 0 already added.
[  150.483326] qdma:xnl_q_add: xpdev_queue_add() failed: -12

Then dmautils will hang

$ sudo ./dmautils -c config/dmautils_config/mm-bi/mm_1_1/bi_mm_1_1_64                                
[sudo] password for mark: 
dmactl qdma01000 q add idx 0 mode mm dir h2c

qdma01000-MM-0 H2C added.
Added 1 Queues.

dmactl qdma01000 q start idx 0 dir h2c idx_ringsz 5

1 Queues started, idx 0 ~ 0.

dmactl qdma01000 q add idx 0 mode mm dir c2h

q idx 0 already added.

dmactl qdma01000 q start idx 0 dir c2h idx_ringsz 5

qdma01000-MM-0 invalid state, q_state 2.

dmautils(16) threads
dmactl qdma01000 q stop idx 0 dir h2c
dmactl qdma01000 q stop idx 0 dir h2c

Stopped Queues 0 -> 0.


queue qdma01000-MM-0, idx 0 stop failed.

qdma01000, 01:00.00, bar#2, reg 0x8 -> 0x240000, read back 0x104.
qdma01000, 01:00.00, bar#2, reg 0x8 -> 0x240000, read back 0x104.

and the following messages have been added to the kernel log (dmesg)

[  156.876911] qdma:qdma_queue_add: descq idx 0 already added.
[  156.876914] qdma:xnl_q_add: xpdev_queue_add() failed: -12
[  156.879239] qdma:qdma_queue_reconfig: qdma01000-MM-0 invalid state, q_state 2.
[  156.879241] qdma:xnl_q_start: qdma_queue_reconfig() failed: -8
[  186.977445] qdma:qdma_queue_stop: qdma01000-MM-0 invalid state, q_state 1.

It seems logical that the driver should tell the user program that it asked for something unreasonable and keep going with the status quo, instead of putting itself into an invalid state (and hanging the QDMA driver).
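
The behaviour asked for here can be sketched as a small state machine that checks each request against the current state, returns an error for anything invalid, and only changes state on success. The states, operations and error codes below are hypothetical and are not the driver's actual `q_state` values:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical queue states; not the driver's real enum. */
enum q_state { Q_FREE, Q_ADDED, Q_STARTED };

/* Hypothetical requests, mirroring the add/start/stop/del commands. */
enum q_op { OP_ADD, OP_START, OP_STOP, OP_DEL };

/*
 * Apply `op` to `*state`. On an invalid request, return a negative errno
 * and leave `*state` untouched, so the caller can simply try something else.
 */
int queue_apply(enum q_state *state, enum q_op op)
{
    assert(state != NULL);

    switch (op) {
    case OP_ADD:
        if (*state != Q_FREE)
            return -EEXIST;   /* already added: report it, change nothing */
        *state = Q_ADDED;
        return 0;
    case OP_START:
        if (*state != Q_ADDED)
            return -EINVAL;
        *state = Q_STARTED;
        return 0;
    case OP_STOP:
        if (*state != Q_STARTED)
            return -EINVAL;
        *state = Q_ADDED;
        return 0;
    case OP_DEL:
        if (*state != Q_ADDED)
            return -EINVAL;   /* must be stopped before deletion */
        *state = Q_FREE;
        return 0;
    }
    return -EINVAL;           /* unknown operation */
}
```

With this shape, a repeated add or a del on a running queue fails cleanly, and a later stop followed by del still succeeds because the rejected requests never modified the state.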

@sujathabanoth-xlnx changed the title from "bug in qdma_cdev_destroy???" to "QDMA_Linux: bug in qdma_cdev_destroy???" on Jan 29, 2021
@sujathabanoth-xlnx
Collaborator

The issue is addressed in the latest driver, hence closing it.
