Netmap app crashing FreeBSD #229
Comments
What FreeBSD version are you running on? |
I'm using 10.3-STABLE-r305972M Thanks. |
I think the issue is that na->tx_rings is NULL (and so tx_ring points to page 0). Can you print it in the debugger to confirm that? |
Thanks for your help, @vmaffione. Here is my output; I'm not sure if it's what you are looking for. Let me know if I should run another command. (kgdb) up 8 |
Here is more output info: (kgdb) print *ifp |
I think here is what you are looking for: (kgdb) print ((struct netmap_adapter)ifp)->tx_rings |
Nope, you have to print na->tx_rings.
Vincenzo Maffione |
Isn't printing "((struct netmap_adapter)ifp)->tx_rings" the same as printing "na->tx_rings"? |
No, the NA() macro is not a cast. |
Take a look here please: (kgdb) print *((struct netmap_adapter *)ifp->if_pspare) So it looks NULL, right? |
I'm not sure what you are printing. Can you please
? |
Here it is: (kgdb) print na->tx_rings |
Ok, you were right to look at ifp->if_pspare. So yes, na->tx_rings is NULL, as I suspected. Thanks. |
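To illustrate why casting ifp itself is not the same as going through NA(), here is a minimal user-space sketch. It assumes, as the exchange above suggests, that the adapter pointer is stored inside the ifnet in if_pspare; the slot index 0 and all the toy_/TOY_ names are hypothetical stand-ins, not the real kernel definitions.

```c
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT the real kernel structs. */
struct toy_adapter {
    void *tx_rings;
};

struct toy_ifnet {
    char  name[16];
    void *if_pspare[4];   /* spare slots; slot 0 is assumed to hold the adapter */
};

/* Lookup in the style of NA(): read a pointer stored inside the ifnet,
 * rather than reinterpreting the ifnet's own address as an adapter. */
#define TOY_NA(ifp) ((struct toy_adapter *)(ifp)->if_pspare[0])

/* Returns 1 when the adapter reachable from ifp is missing or has no TX rings. */
static int toy_tx_rings_missing(const struct toy_ifnet *ifp)
{
    const struct toy_adapter *na = TOY_NA(ifp);

    return na == NULL || na->tx_rings == NULL;
}
```

A cast such as (struct toy_adapter *)ifp would point at the start of the ifnet itself, so the tx_rings field read through it would just be the first bytes of the interface name, which is why that debugger output was not meaningful.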
Ok, let me know if you need something else. Thanks. |
@vmaffione I've been trying to fix this issue with no success. I'd like to help fix it. Can you please point me in a direction? Thanks. |
We know that na->tx_rings is NULL where it should not be. Then I would run your test to see what happens in the log. Was tx_rings created when netmap_transmit is called? Was tx_rings destroyed after netmap_transmit is called? |
I think I found the issue: it's on line 3025 (netmap.c): "if (tx_kring->nr_mode == NKR_NETMAP_OFF) {" If I comment it out, the crash stops. I added some log printing as you suggested: In function netmap_create(): In function netmap_transmit(): Before the "if" I added: I didn't get any logs from netmap_destroy() because the system crashed before it was called. So I tried a simple test, printing tx_kring->nr_mode. This simple log print crashes the system as soon as tx_kring->nr_mode is accessed. So it looks like an issue with tx_kring->nr_mode. Do you have any idea? Thanks. |
In the previous traces it was clear that the problem was a tx_kring NULL pointer. Here I don't see any NULL pointer. However, there is something wrong with the address of tx_kring, which is not related to na->tx_rings. In the netmap_krings_create()
In the netmap_krings_delete()
Before the "if" in netmap_transmit(), and so right before the crash
|
I've added log prints as you suggested and got the following:
netmap_krings_create: na: 0xfffff800054bac00: na->tx_rings: 0xfffffe0006989000
netmap_transmit: na: 0xfffff800054bac00: na->tx_rings: 0xfffffe0006989000 txr: -1009587095 tx_kring: 0xfffffd87ac716200
netmap_krings_delete: na: 0xfffff800054bac00: na->tx_rings: 0xfffffe0006989000
As you can see, "txr" is receiving a negative value, so I added a check that changes it to zero when it is negative: txr = txr < 0 ? 0 : txr; It fixed the issue, but I don't know if it's the best approach. What do you think? Thanks. |
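The logged values are consistent with plain array indexing using a negative index. The sketch below reproduces the arithmetic in user space; element_address is a hypothetical helper, and the 512-byte element size is only inferred from the three logged numbers, not taken from the netmap sources.

```c
#include <stdint.h>

/* Address of element `index` in an array that starts at `base` and whose
 * elements are `elem_size` bytes each. The index is sign-extended first and
 * unsigned wraparound then mimics 64-bit pointer arithmetic. */
static uint64_t element_address(uint64_t base, int32_t index, uint64_t elem_size)
{
    return base + (uint64_t)(int64_t)index * elem_size;
}
```

With base 0xfffffe0006989000, index -1009587095 and an element size of 512, this yields 0xfffffd87ac716200, the tx_kring address in the log. So na->tx_rings itself was valid, and the bogus address came entirely from the out-of-range index.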
Ah, good catch. So the problem is that we cannot assume FreeBSD flowid is set correctly. No, your proposal is not enough, the upper bound is not checked. |
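As a rough sketch of what checking both bounds can look like (this is not the patch from this thread, which is not shown here), one option is to treat the flow id as unsigned and reduce it modulo the number of rings. The names pick_tx_ring and num_tx_rings are assumptions made for this example.

```c
#include <stdint.h>

/* Map an arbitrary, untrusted flow id to a valid ring index.
 * Treating the id as unsigned removes the negative case, and the modulo
 * enforces the upper bound that a "clamp negatives to zero" check misses. */
static uint32_t pick_tx_ring(uint32_t flowid, uint32_t num_tx_rings)
{
    if (num_tx_rings == 0)
        return 0;   /* no rings at all: the caller must not index the array */
    return flowid % num_tx_rings;
}
```

With 4 rings, a flow id of 2 stays on ring 2, 7 maps to ring 3, and the value logged above (-1009587095 reinterpreted as unsigned) maps to ring 1. Ids that are already in range keep their ring, so a correctly set flow id is unaffected.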
I've just tested and it worked. Thanks for your help to fix this issue. :) |
ok! |
Hi,
I have a netmap routing application (it takes my BGP routes, like netmap-fwd), but when I start it, netmap crashes the system. I turned on kernel debugging on FreeBSD and I can see the following message:
Sep 28 15:58:00 rt1 kernel: Fatal trap 12: page fault while in kernel mode
Sep 28 15:58:00 rt1 kernel: cpuid = 2; apic id = 04
Sep 28 15:58:00 rt1 kernel: fault virtual address = 0xfffffd78f2a32e20
Sep 28 15:58:00 rt1 kernel: fault code = supervisor read data, page not present
Sep 28 15:58:00 rt1 kernel: instruction pointer = 0x20:0xffffffff80693f99
Sep 28 15:58:00 rt1 kernel: stack pointer = 0x28:0xfffffe0237e773b0
Sep 28 15:58:00 rt1 kernel: frame pointer = 0x28:0xfffffe0237e77480
Sep 28 15:58:00 rt1 kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
(kgdb) list *0xffffffff80693f99
0xffffffff80693f99 is in netmap_transmit (/usr/src/sys/dev/netmap/netmap.c:3025).
3020 }
3021
3022 txr = MBUF_TXQ(m);
3023 tx_kring = &NMR(na, NR_TX)[txr];
3024
3025 if (tx_kring->nr_mode == NKR_NETMAP_OFF) {
3026 return MBUF_TRANSMIT(na, ifp, m);
3027 }
3028
3029 q = &kring->rx_queue;
Current language: auto; currently minimal
(kgdb) bt
#0 doadump (textdump=) at pcpu.h:219
#1 0xffffffff8097f642 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#2 0xffffffff8097fa25 in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:889
#3 0xffffffff8097f8b3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:818
#4 0xffffffff80daa69b in trap_fatal (frame=, eva=) at /usr/src/sys/amd64/amd64/trap.c:858
#5 0xffffffff80daa99d in trap_pfault (frame=0xfffffe0237e77300, usermode=) at /usr/src/sys/amd64/amd64/trap.c:681
#6 0xffffffff80da9fea in trap (frame=0xfffffe0237e77300) at /usr/src/sys/amd64/amd64/trap.c:447
#7 0xffffffff80d9007c in calltrap () at /usr/src/sys/amd64/amd64/exception.S:238
#8 0xffffffff80693f99 in netmap_transmit (ifp=0xfffff800054c1800, m=0xfffff801ac443400) at /usr/src/sys/dev/netmap/netmap.c:3025
#9 0xffffffff80a4decd in ether_output (ifp=0xfffff800054c1800, m=0xfffff801ac443400, dst=, ro=) at /usr/src/sys/net/if_ethersubr.c:438
#10 0xffffffff80abb6fb in ip_output (m=0xfffff801ac443400, opt=, flags=0, imo=, inp=)
#11 0xffffffff80b2f3d2 in tcp_output (tp=0xfffff801ac7b7810) at /usr/src/sys/netinet/tcp_output.c:1407
#12 0xffffffff80b3b549 in tcp_usr_send (so=, flags=, m=, nam=, control=,
#13 0xffffffff809fe796 in sosend_generic (so=0xfffff8001eaea2b8, addr=0x0, uio=0xfffffe0237e779c0, top=, control=,
#14 0xffffffff80a04b15 in kern_sendit (td=0xfffff8001e053960, s=7, mp=0xfffffe0237e77a88, flags=0, control=0x0, segflg=)
#15 0xffffffff80a04e39 in sendit (td=0xfffff8001e053960, s=, mp=0xfffffe0237e77a88, flags=1223614464) at /usr/src/sys/kern/uipc_syscalls.c:871
#16 0xffffffff80a04ec1 in sys_sendmsg (td=0xfffff8001e053960, uap=0xfffffe0237e77b80) at /usr/src/sys/kern/uipc_syscalls.c:1073
#17 0xffffffff80dab07e in amd64_syscall (td=0xfffff8001e053960, traced=0) at subr_syscall.c:141
#18 0xffffffff80d9036b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:398
#19 0x000000080098adca in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
I tried another test, running examples/bridge on the same NIC that my BGP routes come in on, and the same crash happened. Any ideas what's going on?
Any help is welcome. Thanks.