Kernel Panic w/ 6.0.3 on CentOS 7.1 #47

Closed
dcode opened this issue Oct 21, 2015 · 4 comments

@dcode

dcode commented Oct 21, 2015

I thought this might be a duplicate of #28, but this is on 6.0.3 and happens even if I remove the pf_ring kernel module. Whether I use bro (which I compiled against the RPM-provided libpcap) or tcpdump (from the CentOS repos), the dynamic linker resolves libpcap to /usr/local/lib/libpcap.so.1.6.2. When I rename that shared lib, tcpdump uses the system libpcap and works just fine, whether the module is loaded or not.
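
For anyone reproducing this, a minimal sketch of checking which libpcap the dynamic linker picks for a given binary. It parses `ldd` output; the helper names and the sample text are illustrative assumptions, not output captured from the affected host.

```python
import re
import subprocess
from typing import List


def libpcap_paths(ldd_output: str) -> List[str]:
    """Return the resolved path of every libpcap entry found in `ldd` output."""
    paths = []
    for line in ldd_output.splitlines():
        # ldd prints lines such as: "\tlibpcap.so.1 => /some/dir/libpcap.so.1 (0x...)"
        # Requiring a leading "/" skips unresolved "=> not found" entries.
        match = re.match(r"\s*libpcap\.so\S*\s+=>\s+(/\S+)", line)
        if match:
            paths.append(match.group(1))
    return paths


def resolved_libpcap(binary: str) -> List[str]:
    """Run `ldd` on a binary and report which libpcap it would load."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return libpcap_paths(result.stdout)


# Hypothetical sample shaped like typical ldd output (addresses are made up).
SAMPLE = (
    "\tlinux-vdso.so.1 =>  (0x00007ffc0a5f2000)\n"
    "\tlibpcap.so.1 => /usr/local/lib/libpcap.so.1 (0x00007f1c2a3d0000)\n"
    "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f1c2a00f000)\n"
)
```

Calling `resolved_libpcap("/usr/sbin/tcpdump")` (the path is an assumption about where tcpdump is installed) before and after renaming the library in /usr/local/lib should show the resolved path switching to the system copy.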

I've tried running 6.1.1 on this kernel. It doesn't crash, but I get terrible packet loss when going through libpcap. That is a different issue, which I'll file once I've had a chance to run it to ground.

So the culprit seems to be libpcap 1.6.2. I'd like to stick to PF_RING 6.0.3 if I can for stability and performance. I'm happy to provide more info or dig into crash dumps more if needed.

System Info (from crash tool)

This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: /usr/lib/debug/lib/modules/3.10.0-229.el7.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2015.10.20-22:22:58/vmcore  [PARTIAL DUMP]
        CPUS: 24
        DATE: Tue Oct 20 22:22:51 2015
      UPTIME: 00:16:49
LOAD AVERAGE: 0.00, 0.01, 0.04
       TASKS: 355
    NODENAME: reciever2
     RELEASE: 3.10.0-229.el7.x86_64
     VERSION: #1 SMP Fri Mar 6 11:36:42 UTC 2015
     MACHINE: x86_64  (2300 Mhz)
      MEMORY: 128 GB
       PANIC: ""
         PID: 35
     COMMAND: "rcuos/0"
        TASK: ffff8810297a8b60  [THREAD_INFO: ffff8810297b4000]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)

dmesg

[  691.218347] [PF_RING] Module unloaded
[  843.251621] [PF_RING] Welcome to PF_RING 6.0.3 ($Revision: 6.0.3-stable:8994076d9761315040ed29a0d5825cb74c20c078$)
               (C) 2004-14 ntop.org
[  843.251654] [PF_RING] registered /proc/net/pf_ring/
[  843.251656] NET: Registered protocol family 27
[  843.251666] [PF_RING] Min # ring slots 4096
[  843.251667] [PF_RING] Slot version     16
[  843.251668] [PF_RING] Capture TX       Yes [RX+TX]
[  843.251669] [PF_RING] IP Defragment    No
[  843.251670] [PF_RING] Initialized correctly
[ 1008.709269] device eno3 entered promiscuous mode
[ 1008.727969] general protection fault: 0000 [#1] SMP
[ 1008.727998] Modules linked in: pf_ring(OF) xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables nf_nat nf_conntrack bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop intel_powerclamp coretemp iTCO_wdt iTCO_vendor_support intel_rapl kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sb_edac ses edac_core serio_raw pcspkr mei_me enclosure mei lpc_ich i2c_i801 mfd_core ipmi_si ipmi_msghandler ioatdma ntb shpchp wmi xfs libcrc32c sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper ttm isci igb drm ixgbe(OF) vxlan libsas ahci ip_tunnel ptp libahci scsi_transport_sas
[ 1008.728336]  pps_core dca libata i2c_algo_bit aacraid i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: pf_ring]
[ 1008.728386] CPU: 1 PID: 35 Comm: rcuos/0 Tainted: GF          O--------------   3.10.0-229.el7.x86_64 #1
[ 1008.728421] Hardware name: Supermicro X9DRW-3LN4F+/X9DRW-3TF+/X9DRW-3LN4F+/X9DRW-3TF+, BIOS 3.0a 02/06/2014
[ 1008.728458] task: ffff8810297a8b60 ti: ffff8810297b4000 task.ti: ffff8810297b4000
[ 1008.728485] RIP: 0010:[<ffffffff8106a802>]  [<ffffffff8106a802>] bpf_jit_free+0x22/0x50
[ 1008.728520] RSP: 0018:ffff8810297b7df0  EFLAGS: 00010203
[ 1008.728541] RAX: 0000000fffffffe0 RBX: ffff882014f90280 RCX: dead000000200200
[ 1008.728567] RDX: 636e753a755f6465 RSI: 0000000000000283 RDI: 0000000000001400
[ 1008.728594] RBP: ffff8810297b7e08 R08: ffff8810297b7e80 R09: 0000000000000008
[ 1008.728620] R10: 0000000000000004 R11: 0000000000000005 R12: 0000000000000000
[ 1008.728646] R13: 0000000000000000 R14: 0000000000000000 R15: ffff882014f90290
[ 1008.728673] FS:  0000000000000000(0000) GS:ffff88103fc20000(0000) knlGS:0000000000000000
[ 1008.728703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1008.728724] CR2: 00007fa4edc8a000 CR3: 000000000190a000 CR4: 00000000000407e0
[ 1008.728751] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1008.728777] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1008.728803] Stack:
[ 1008.728813]  ffff8810297b7e08 ffffffff81510406 ffff882014f90290 ffff8810297b7ec0
[ 1008.728847]  ffffffff81113239 ffff8810297a8b60 ffff8810297a8b60 ffff88103fc0df50
[ 1008.728880]  ffff8810297b7e80 ffff8810297a8b60 ffff88103fc0df28 ffff88103fc0df38
[ 1008.728913] Call Trace:
[ 1008.728928]  [<ffffffff81510406>] ? sk_filter_release_rcu+0x16/0x30
[ 1008.728955]  [<ffffffff81113239>] rcu_nocb_kthread+0x229/0x370
[ 1008.728979]  [<ffffffff81098350>] ? wake_up_bit+0x30/0x30
[ 1008.729002]  [<ffffffff81113010>] ? rcu_start_gp+0x40/0x40
[ 1008.729023]  [<ffffffff8109739f>] kthread+0xcf/0xe0
[ 1008.729044]  [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
[ 1008.729071]  [<ffffffff8161497c>] ret_from_fork+0x7c/0xb0
[ 1008.729092]  [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
[ 1008.729117] Code: 84 e9 94 fa ff ff 0f 1f 00 66 66 66 66 90 48 8b 57 08 48 81 fa b0 07 51 81 74 37 48 b8 e0 ff ff ff 0f 00 00 00 55 bf 00 14 00 00 <48> 89 02 48 8d 42 08 48 8b 35 58 74 9b 00 48 89 e5 48 c7 42 18
[ 1008.729246] RIP  [<ffffffff8106a802>] bpf_jit_free+0x22/0x50
[ 1008.729270]  RSP <ffff8810297b7df0>
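
A side note on reading the register dump above: a rough sketch that checks each value for two common corruption signatures, printable ASCII and the kernel's list poison. The `LIST_POISON2` constant is an assumption based on the usual x86_64 definition (0x00200200 plus a 0xdead000000000000 pointer delta), and the helper names are hypothetical. This is an interpretation aid only; nothing in the thread confirms the root cause.

```python
from typing import Optional

# Assumed x86_64 value of LIST_POISON2 (0x00200200 + POISON_POINTER_DELTA).
LIST_POISON2 = 0xDEAD000000000000 + 0x200200

# Register values copied from the dmesg output above.
REGS = {
    "RAX": "0000000fffffffe0",
    "RCX": "dead000000200200",
    "RDX": "636e753a755f6465",
}


def as_ascii(reg_hex: str) -> Optional[str]:
    """Read a 64-bit value as 8 little-endian bytes; return text if all are printable."""
    raw = int(reg_hex, 16).to_bytes(8, "little")
    if all(0x20 <= b < 0x7F for b in raw):
        return raw.decode("ascii")
    return None


def classify(reg_hex: str) -> str:
    """Label a register value with the first matching signature."""
    if int(reg_hex, 16) == LIST_POISON2:
        return "list poison (LIST_POISON2)"
    text = as_ascii(reg_hex)
    if text is not None:
        return "ASCII text %r" % text
    return "no obvious pattern"


for name, value in REGS.items():
    print(name, value, "->", classify(value))
```

RDX decodes to the string `ed_u:unc`, which suggests the pointer being dereferenced in bpf_jit_free had been overwritten with string data, and RCX matches the poison value. Both would be consistent with the filter memory having been freed or reused before the RCU callback ran, though that remains a guess from the dump alone.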

Backtrace

PID: 35     TASK: ffff8810297a8b60  CPU: 1   COMMAND: "rcuos/0"
 #0 [ffff8810297b7b90] machine_kexec at ffffffff8104c681
 #1 [ffff8810297b7be8] crash_kexec at ffffffff810e2222
 #2 [ffff8810297b7cb8] oops_end at ffffffff8160d188
 #3 [ffff8810297b7ce0] die at ffffffff810173eb
 #4 [ffff8810297b7d10] do_general_protection at ffffffff8160ca8e
 #5 [ffff8810297b7d40] general_protection at ffffffff8160c3a8
    [exception RIP: bpf_jit_free+34]
    RIP: ffffffff8106a802  RSP: ffff8810297b7df0  RFLAGS: 00010203
    RAX: 0000000fffffffe0  RBX: ffff882014f90280  RCX: dead000000200200
    RDX: 636e753a755f6465  RSI: 0000000000000283  RDI: 0000000000001400
    RBP: ffff8810297b7e08   R8: ffff8810297b7e80   R9: 0000000000000008
    R10: 0000000000000004  R11: 0000000000000005  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: ffff882014f90290
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffff8810297b7df8] sk_filter_release_rcu at ffffffff81510406
 #7 [ffff8810297b7e10] rcu_nocb_kthread at ffffffff81113239
 #8 [ffff8810297b7ec8] kthread at ffffffff8109739f
 #9 [ffff8810297b7f50] ret_from_fork at ffffffff8161497c

@cardigliano
Member

Applied the patch from #28 to 6.0.3; please check that it fixes the issue. Rebuilding new 6.0.3 packages.

@dcode
Author

dcode commented Oct 21, 2015

@cardigliano are you building packages to push to packages.ntop.org?

@cardigliano
Member

Yes

@dcode
Author

dcode commented Oct 21, 2015

I rebuilt the packages from the current commit. It's working beautifully. No more crashes and so far no significant packet loss.

190948f on the 6.0.3-stable branch fixed this.

#win!

dcode closed this as completed Oct 21, 2015