Netmap app crashing FreeBSD #229

Closed
alexhebra opened this issue Sep 28, 2016 · 22 comments

Comments

@alexhebra

Hi,

I have a netmap routing application (it takes my BGP routes, like netmap-fwd), but when I start it netmap crashes the system. I turned on kernel debugging on FreeBSD and I can see the following message:

Sep 28 15:58:00 rt1 kernel: Fatal trap 12: page fault while in kernel mode
Sep 28 15:58:00 rt1 kernel: cpuid = 2; apic id = 04
Sep 28 15:58:00 rt1 kernel: fault virtual address = 0xfffffd78f2a32e20
Sep 28 15:58:00 rt1 kernel: fault code = supervisor read data, page not present
Sep 28 15:58:00 rt1 kernel: instruction pointer = 0x20:0xffffffff80693f99
Sep 28 15:58:00 rt1 kernel: stack pointer = 0x28:0xfffffe0237e773b0
Sep 28 15:58:00 rt1 kernel: frame pointer = 0x28:0xfffffe0237e77480
Sep 28 15:58:00 rt1 kernel: code segment = base 0x0, limit 0xfffff, type 0x1b

(kgdb) list *0xffffffff80693f99
0xffffffff80693f99 is in netmap_transmit (/usr/src/sys/dev/netmap/netmap.c:3025).
3020 }
3021
3022 txr = MBUF_TXQ(m);
3023 tx_kring = &NMR(na, NR_TX)[txr];
3024
3025 if (tx_kring->nr_mode == NKR_NETMAP_OFF) {
3026 return MBUF_TRANSMIT(na, ifp, m);
3027 }
3028
3029 q = &kring->rx_queue;
Current language: auto; currently minimal
(kgdb) bt
#0 doadump (textdump=) at pcpu.h:219
#1 0xffffffff8097f642 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#2 0xffffffff8097fa25 in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:889
#3 0xffffffff8097f8b3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:818
#4 0xffffffff80daa69b in trap_fatal (frame=, eva=) at /usr/src/sys/amd64/amd64/trap.c:858
#5 0xffffffff80daa99d in trap_pfault (frame=0xfffffe0237e77300, usermode=) at /usr/src/sys/amd64/amd64/trap.c:681
#6 0xffffffff80da9fea in trap (frame=0xfffffe0237e77300) at /usr/src/sys/amd64/amd64/trap.c:447
#7 0xffffffff80d9007c in calltrap () at /usr/src/sys/amd64/amd64/exception.S:238
#8 0xffffffff80693f99 in netmap_transmit (ifp=0xfffff800054c1800, m=0xfffff801ac443400) at /usr/src/sys/dev/netmap/netmap.c:3025
#9 0xffffffff80a4decd in ether_output (ifp=0xfffff800054c1800, m=0xfffff801ac443400, dst=, ro=) at /usr/src/sys/net/if_ethersubr.c:438
#10 0xffffffff80abb6fb in ip_output (m=0xfffff801ac443400, opt=, flags=0, imo=, inp=) at /usr/src/sys/netinet/ip_output.c:638
#11 0xffffffff80b2f3d2 in tcp_output (tp=0xfffff801ac7b7810) at /usr/src/sys/netinet/tcp_output.c:1407
#12 0xffffffff80b3b549 in tcp_usr_send (so=, flags=, m=, nam=, control=, td=<value optimized out>) at /usr/src/sys/netinet/tcp_usrreq.c:911
#13 0xffffffff809fe796 in sosend_generic (so=0xfffff8001eaea2b8, addr=0x0, uio=0xfffffe0237e779c0, top=, control=, flags=<value optimized out>, td=0x3000000010) at /usr/src/sys/kern/uipc_socket.c:1287
#14 0xffffffff80a04b15 in kern_sendit (td=0xfffff8001e053960, s=7, mp=0xfffffe0237e77a88, flags=0, control=0x0, segflg=) at /usr/src/sys/kern/uipc_syscalls.c:944

#15 0xffffffff80a04e39 in sendit (td=0xfffff8001e053960, s=, mp=0xfffffe0237e77a88, flags=1223614464) at /usr/src/sys/kern/uipc_syscalls.c:871
#16 0xffffffff80a04ec1 in sys_sendmsg (td=0xfffff8001e053960, uap=0xfffffe0237e77b80) at /usr/src/sys/kern/uipc_syscalls.c:1073
#17 0xffffffff80dab07e in amd64_syscall (td=0xfffff8001e053960, traced=0) at subr_syscall.c:141
#18 0xffffffff80d9036b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:398
#19 0x000000080098adca in ?? ()

Previous frame inner to this frame (corrupt stack?)
(kgdb)

I tried another test, running examples/bridge on the same NIC that my BGP routes come in on, and the same crash happened. Any idea what's going on?

Any help is welcome. Thanks.

@vmaffione
Collaborator

What FreeBSD version are you running on?
What netmap ports are opened?
How are they opened (native or emulated?)

@alexhebra
Author

I'm using 10.3-STABLE-r305972M
I'm opening ix0, ix1 and igb2.
All of them are opened in native mode.

Thanks.

@vmaffione
Collaborator

I think the issue is that na->tx_rings is NULL (and so tx_kring points to page 0). Can you print it in the debugger to confirm that?

@alexhebra
Author

Thanks for your help @vmaffione.

Here is my output; I'm not sure if it's what you are looking for. Let me know if I should run another command.

(kgdb) up 8
#8 0xffffffff80693f99 in netmap_transmit (ifp=0xfffff800054c1800, m=0xfffff8006bf83100) at /usr/src/sys/dev/netmap/netmap.c:3025
3025 if (tx_kring->nr_mode == NKR_NETMAP_OFF) {
(kgdb) print ifp
$1 = (struct ifnet *) 0xfffff800054c1800
(kgdb) list
3020 }
3021
3022 txr = MBUF_TXQ(m);
3023 tx_kring = &NMR(na, NR_TX)[txr];
3024
3025 if (tx_kring->nr_mode == NKR_NETMAP_OFF) {
3026 return MBUF_TRANSMIT(na, ifp, m);
3027 }
3028
3029 q = &kring->rx_queue;
(kgdb) print m
$2 = (struct mbuf *) 0xfffff8006bf83100
(kgdb) list
3030
3031 // XXX reconsider long packets if we handle fragments
3032 if (len > NETMAP_BUF_SIZE(na)) { /* too long for us */
3033 D("%s from_host, drop packet size %d > %d", na->name,
3034 len, NETMAP_BUF_SIZE(na));
3035 goto done;
3036 }
3037
3038 if (nm_os_mbuf_has_offld(m)) {
3039 D("%s drop mbuf requiring offloadings", na->name);
(kgdb) print na
No symbol "na" in current context.
(kgdb)

@alexhebra
Author

Here is more output info:

(kgdb) print *ifp
$1 = {if_softc = 0xfffffe0000b71000, if_l2com = 0xfffff80005484290, if_vnet = 0x0, if_link = {tqe_next = 0xfffff800054c0800, tqe_prev = 0xfffff8000546c818},
if_xname = "igb2", '\0' <repeats 11 times>, if_dname = 0xfffff800052a5958 "igb", if_dunit = 2, if_refcount = 2, if_addrhead = {tqh_first = 0xfffff800054b1000,
tqh_last = 0xfffff8001e72b2c0}, if_pcount = 0, if_carp = 0x0, if_bpf = 0xfffff800054b8700, if_index = 5, if_index_reserved = 0, if_vlantrunk = 0x0, if_flags = 34819,
if_capabilities = 6621115, if_capenable = 7603131, if_linkmib = 0x0, if_linkmiblen = 0, if_data = {ifi_type = 6 '\006', ifi_physical = 0 '\0', ifi_addrlen = 6 '\006',
ifi_hdrlen = 18 '\022', ifi_link_state = 2 '\002', ifi_vhid = 0 '\0', ifi_baudrate_pf = 0 '\0', ifi_datalen = 152 '\230', ifi_mtu = 1500, ifi_metric = 0,
ifi_baudrate = 1000000000, ifi_ipackets = 11165, ifi_ierrors = 0, ifi_opackets = 5332, ifi_oerrors = 0, ifi_collisions = 0, ifi_ibytes = 10722673, ifi_obytes = 352351,
ifi_imcasts = 2527, ifi_omcasts = 0, ifi_iqdrops = 0, ifi_noproto = 0, ifi_hwassist = 7710, ifi_epoch = 1, ifi_lastchange = {tv_sec = 1475089187, tv_usec = 415168}},
if_multiaddrs = {tqh_first = 0xfffff8001bf435c0, tqh_last = 0xfffff8001e68d580}, if_amcount = 0, if_output = 0xffffffff80a4d940 <ether_output>,
if_input = 0xffffffff80a4e2a0 <ether_input>, if_start = 0, if_ioctl = 0xffffffff80507540 <igb_ioctl>, if_init = 0xffffffff805074b0 <igb_init>,
if_resolvemulti = 0xffffffff80a4e2b0 <ether_resolvemulti>, if_qflush = 0xffffffff80507d40 <igb_qflush>, if_transmit = 0xffffffff80693f50 <netmap_transmit>, if_reassign = 0,
if_home_vnet = 0x0, if_addr = 0xfffff800054b1000, if_llsoftc = 0x0, if_drv_flags = 64, if_snd = {ifq_head = 0x0, ifq_tail = 0x0, ifq_len = 0, ifq_maxlen = 50, ifq_drops = 0,
ifq_mtx = {lock_object = {lo_name = 0xfffff800054c1828 "igb2", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, ifq_drv_head = 0x0, ifq_drv_tail = 0x0,
ifq_drv_len = 0, ifq_drv_maxlen = 0, altq_type = 0, altq_flags = 0, altq_disc = 0x0, altq_ifp = 0xfffff800054c1800, altq_enqueue = 0, altq_dequeue = 0, altq_request = 0,
altq_clfier = 0x0, altq_classify = 0, altq_tbr = 0x0, altq_cdnr = 0x0}, if_broadcastaddr = 0xffffffff8104b2d0 "??????", if_bridge = 0x0, if_label = 0x0, if_unused = {0x0,
0x0}, if_afdata = {0x0, 0x0, 0xfffff8001bec2120, 0x0 <repeats 25 times>, 0xfffff8001be8dc80, 0x0 <repeats 13 times>}, if_afdata_initialized = 2, if_afdata_lock = {
lock_object = {lo_name = 0xffffffff8104ada0 "if_afdata", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, if_linktask = {ta_link = {stqe_next = 0x0},
ta_pending = 0, ta_priority = 0, ta_func = 0xffffffff80a460a0 <do_link_state_change>, ta_context = 0xfffff800054c1800}, if_addr_lock = {lock_object = {
lo_name = 0xffffffff8104ad93 "if_addr_lock", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, if_clones = {le_next = 0x0, le_prev = 0x0}, if_groups = {
tqh_first = 0xfffff800054a9600, tqh_last = 0xfffff800054a9608}, if_pf_kif = 0x0, if_lagg = 0x0, if_description = 0x0, if_fib = 0, if_alloctype = 6 '\006',
if_hw_tsomax = 65535, if_cspare = "\000\000", if_ispare = {0, 0}, if_hw_tsomaxsegcount = 40, if_hw_tsomaxsegsize = 4096, if_pspare = {0xfffff800054bac00, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0}}
(kgdb) print *m
$2 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xfffff8006bf8315a "", mh_len = 85, mh_type = 1, mh_flags = 258}, M_dat = {MH = {MH_pkthdr = {rcvif = 0x0, tags = {
slh_first = 0x0}, len = 85, flowid = 860493604, csum_flags = 4, fibnum = 0, cosqos = 0 '\0', rsstype = 255 '?', l2hlen = 0 '\0', l3hlen = 0 '\0', l4hlen = 0 '\0',
l5hlen = 0 '\0', PH_per = {eigth = "\000\000\000\000\020\000\000", sixteen = {0, 0, 16, 0}, thirtytwo = {0, 16}, sixtyfour = {68719476736}, unintptr = {68719476736},
ptr = 0x1000000000}, PH_loc = {eigth = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}},
MH_dat = {MH_ext = {ref_cnt = 0x6356370b90000000, ext_buf = 0x89e95490b9000 <Address 0x89e95490b9000 out of bounds>, ext_size = 1191231557, ext_type = 24, ext_flags = 16638,
ext_free = 0xf29c0ab1eeb40610, ext_arg1 = 0xb3002d91fd9c0ab1, ext_arg2 = 0xccf13ce4e3884797},
MH_databuf = "\000\000\000\220\v7Vc\000\220\vI\225\236\b\000E?\000G\030?@\000\020\006??\n\234?\n\234?\221-\000?\227G\210??<??\200\030\004\020\234>\000\000\001\001\b\n\000\031I?k\022?A", '?' <repeats 16 times>, "\000\023\004B\017?\025??@\025?a ", '\0' <repeats 69 times>}},
M_databuf = '\0' <repeats 16 times>, "U\000\000\000$\027J3\004\000\000\000\000\000\000\000\000\000\000?\000\000\000\000\000\000\000\000\020", '\0' <repeats 14 times>, "\220\v7Vc\000\220\vI\225\236\b\000E?\000G\030?@\000\020\006??\n\234?\n\234?\221-\000?\227G\210??<??\200\030\004\020\234>\000\000\001\001\b\n\000\031I?k\022?A", '?' <repeats 16 times>, "\000\023\004B\017?\025??@\025?a ", '\0' <repeats 69 times>}}

@alexhebra
Author

I think this is what you are looking for:

(kgdb) print ((struct netmap_adapter *)ifp)->tx_rings
$4 = (struct netmap_kring *) 0xfffffe0237e774aa
(kgdb) print *((struct netmap_adapter *)ifp)->tx_rings
$5 = {ring = 0x74f06356370b9000, nr_hwcur = 4261558247, nr_hwtail = 822083592, rhead = 4160777208, rcur = 822149119, rtail = 4160777208, nr_kflags = 402718719,
nr_mode = 4160750924, nr_pending_mode = 828964863, nkr_num_slots = 4160777208, nkr_hwofs = -1308557313, nkr_slot_flags = 7794, last_reclaim = 12826533213694474215, si = {si = {
si_tdlist = {tqh_first = 0x75f0fffff8001e72, tqh_last = 0xb6fbfffffe0237e7}, si_note = {kl_list = {slh_first = 0x7540ffffffff80ab}, kl_lock = 0x607bfffffe0237e7,
kl_unlock = 0x3168ffffffff8099, kl_assert_locked = 0x64b0fffff8006bf8, kl_assert_unlocked = 0x69f2000000401e99, kl_lockarg = 0x5dcfffff801d876}, si_mtx = 0x0}, m = {
lock_object = {lo_name = 0xb80000000000000 <Address 0xb80000000000000 out of bounds>, lo_flags = 4686194, lo_data = 2922381312, lo_witness = 0x1fffff8001b09},
mtx_lock = 1729382256911581184}}, q_lock = {lock_object = {lo_name = 0xfffff800054c <Address 0xfffff800054c out of bounds>, lo_flags = 0, lo_data = 1993867264,
lo_witness = 0x76f4fffffe0237e7}, mtx_lock = 7579839647807846375}, nr_busy = -134096778, na = 0x75f0ffffffff8096, nkr_ft = 0xebc3fffffe0237e7,
nkr_leases = 0x7788ffffffff80c1, nkr_hwlease = 4261558247, nkr_lease_idx = 65535, nkr_stopped = 4096, tx_pool = 0x7788fffff8027ffa, tx_event = 0xfffffe0237e7, tx_event_lock = {
lock_object = {lo_name = 0x3100000000000000 <Address 0x3100000000000000 out of bounds>, lo_flags = 4160777208, lo_data = 2088173567, lo_witness = 0x3100651894f01311},
mtx_lock = 10957539368234937336}, rx_queue = {head = 0x13fffff801d876, tail = 0x47000000000000, count = 0, lock = {lock_object = {
lo_name = 0x7810fffff8006bf8 <Address 0x7810fffff8006bf8 out of bounds>, lo_flags = 4261558247, lo_data = 4090691583, lo_witness = 0xffbbffffffff80b2},
mtx_lock = 281474943352831}}, users = 0, ring_id = 131072, tx = NR_RX,
name = "\000\000?m\231\036\000???`\231v?\001????\000\000\000\001\000\000\000\210TOQ\000???@toq\000???\000\000\000\000\000\000\000\000h1?k\000???\000\000\000\000\000",
nm_sync = 0, nm_notify = 0x13000000280000, pipe = 0x9810000000000000, save_notify = 0xfffff801d876, monitors = 0x64b0000000000000, max_monitors = 4160757401,
n_monitors = 1767964671, mon_sync = 0xc1e99, mon_notify = 0x52b8000000000000, mon_tail = 4160770383, mon_pos = 1419837439}

@vmaffione
Collaborator

Nope, you have to print na->tx_rings

Vincenzo Maffione

@alexhebra
Author

Isn't printing "((struct netmap_adapter *)ifp)->tx_rings" the same as printing "na->tx_rings"?
It looks like 'na' is a cast from 'ifp'.

@vmaffione
Collaborator

No, the NA() macro is not a cast.

@alexhebra
Author

alexhebra commented Sep 28, 2016

Take a look here please:

(kgdb) print *((struct netmap_adapter *)ifp->if_pspare)
$10 = {magic = 88845312, na_flags = 4294965248, active_fds = 0, num_rx_rings = 0, num_tx_rings = 0, num_tx_desc = 0, num_rx_desc = 0, tx_rings = 0x0, rx_rings = 0x0,
tailroom = 0x0, si = {{si = {si_tdlist = {tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {slh_first = 0x0}, kl_lock = 0, kl_unlock = 0, kl_assert_locked = 0,
kl_assert_unlocked = 0, kl_lockarg = 0x0}, si_mtx = 0x0}, m = {lock_object = {lo_name = 0x0, lo_flags = 0, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}}, {si = {
si_tdlist = {tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {slh_first = 0x0}, kl_lock = 0, kl_unlock = 0, kl_assert_locked = 0, kl_assert_unlocked = 0,
kl_lockarg = 0x0}, si_mtx = 0x0}, m = {lock_object = {lo_name = 0x0, lo_flags = 0, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}}}, si_users = {0, 0}, pdev = 0x0,
if_transmit = 0, if_input = 0, ifp = 0x0, nm_dtor = 0, nm_register = 0, nm_intr = 0, nm_txsync = 0, nm_rxsync = 0, nm_notify = 0, nm_config = 0, nm_krings_create = 0,
nm_krings_delete = 0, nm_bdg_attach = 0, nm_bdg_ctl = 0, na_vp = 0x0, na_hostvp = 0x0, na_refcount = 0, nm_mem = 0x0, na_lut = {lut = 0x0, objtotal = 0, objsize = 0},
na_private = 0x0, na_pipes = 0x0, na_next_pipe = 0, na_max_pipes = 0, virt_hdr_len = 0, name = '\0' <repeats 63 times>}
(kgdb) print ((struct netmap_adapter *)ifp->if_pspare)
$11 = (struct netmap_adapter *) 0xfffff800054c1c98
(kgdb) print ((struct netmap_adapter *)ifp->if_pspare)->tx_rings
$12 = (struct netmap_kring *) 0x0
(kgdb) print *((struct netmap_adapter *)ifp->if_pspare)->tx_rings
Cannot access memory at address 0x0

So it looks NULL, right?

@vmaffione
Collaborator

I'm not sure what you are printing. Can you please run

(kgdb) print na->tx_rings

?

@alexhebra
Author

Here it is:

(kgdb) print na->tx_rings
No symbol "na" in current context.

@vmaffione
Collaborator

Ok, you were right to look at ifp->if_pspare. So yes, na->tx_rings is NULL, as I suspected. Thanks.

@alexhebra
Author

Ok, let me know if you need something else. Thanks.

@alexhebra
Author

@vmaffione I've been trying to fix this issue with no success. I'd like to help fix it. Can you please point me in the right direction?

Thanks.

@vmaffione
Collaborator

We know that na->tx_rings is NULL where it should not be.
na->tx_rings is set in netmap_krings_create() and zeroed in netmap_krings_delete(). I would start by putting a log print (using the D() macro) at line 809 and line 863 of netmap.c, to track creation and deletion.

Then I would run your test to see what happens in the log. Was tx_rings created when netmap_transmit is called? Was tx_rings destroyed after netmap_transmit is called?

@alexhebra
Author

@vmaffione

I think I found the issue; it's on line 3025 of netmap.c:

"if (tx_kring->nr_mode == NKR_NETMAP_OFF) {"

If I comment it out, the crash stops. I added some log prints as you suggested:

In function netmap_krings_create():
D("%s: na->tx_rings: %p\n", __func__, (void*) na->tx_rings);
I got the log below:
netmap_krings_create netmap_krings_create: na->tx_rings: 0xfffffe00046f6000

In function netmap_transmit():
D("%s: na->tx_rings: %p\n", __func__, (void*) na->tx_rings);
I got the following in the log:
netmap_transmit netmap_transmit: na->tx_rings: 0xfffffe00046f6000

Before the "if" I added:
D("%s: tx_kring: %p\n", __func__, (void*) tx_kring);
And I got:
netmap_transmit netmap_transmit: tx_kring: 0xfffffe617ccd5a00

I didn't get any logs from netmap_krings_delete() because the system crashed before the rings were destroyed.

So I tried a simple test, printing tx_kring->nr_mode:
D("%s: tx_kring->nr_mode: %d\n", __func__, tx_kring->nr_mode);

This simple log print crashes the system as soon as tx_kring->nr_mode is accessed.

So it looks like an issue with tx_kring->nr_mode. Do you have any idea?

Thanks.

@vmaffione
Collaborator

vmaffione commented Oct 6, 2016

In the previous traces it was clear that the problem was a tx_kring NULL pointer. Here I don't see any NULL pointer. However, there is something wrong with the address of tx_kring, which is not related to na->tx_rings.
To avoid confusion, please repeat the tests with the following log statements (removing the ones you have).

In the netmap_krings_create()

D("%p: na->tx_rings: %p", na, na->tx_rings);

In the netmap_krings_delete()

D("%p: na->tx_rings: %p", na, na->tx_rings);

Before the "if" in netmap_transmit(), and so right before the crash

D("%p: na->tx_rings: %p txr: %d tx_kring: %p", na, na->tx_rings, txr, tx_kring);

@alexhebra
Author

@vmaffione

I've added the log prints as you suggested and I got the following:

netmap_krings_create: na: 0xfffff800054bac00: na->tx_rings: 0xfffffe0006989000

netmap_transmit: na: 0xfffff800054bac00: na->tx_rings: 0xfffffe0006989000 txr: -1009587095 tx_kring: 0xfffffd87ac716200

netmap_krings_delete: na: 0xfffff800054bac00: na->tx_rings: 0xfffffe0006989000

As you can see, "txr" is receiving a negative value, so I added a check that changes it to zero when it is negative:

txr = txr < 0 ? 0 : txr;

It fixed the issue, but I don't know if it's the best approach. What do you think?

Thanks.

@vmaffione
Collaborator

Ah, good catch. So the problem is that we cannot assume FreeBSD flowid is set correctly.

No, your proposal is not enough: the upper bound is not checked.
I pushed a fix for the bug to the repository. Can you try again with the current code?

@alexhebra
Author

I've just tested it and it works. Thanks for your help fixing this issue. :)

@vmaffione
Collaborator

ok!
