-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
trc_pkt_lister -decode hangs on Snowball sample snapshot #4
Comments
Thanks for the feedback. Reproduced this and made fixes which will appear in the v0.4.1 library release. |
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
Panic occurs when issuing "cat /proc/net/route" whilst populating FIB with > 1M routes. Use of cached node pointer in fib_route_get_idx is unsafe. BUG: unable to handle kernel paging request at ffffc90001630024 IP: [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0 PGD 11b08d067 PUD 11b08e067 PMD dac4b067 PTE 0 Oops: 0000 [#1] SMP Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscac snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep virti acpi_cpufreq button parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd tio_ring virtio floppy uhci_hcd ehci_hcd usbcore usb_common libata scsi_mod CPU: 1 PID: 785 Comm: cat Not tainted 4.2.0-rc8+ #4 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff8800da1c0bc0 ti: ffff88011a05c000 task.ti: ffff88011a05c000 RIP: 0010:[<ffffffff814cf6a0>] [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0 RSP: 0018:ffff88011a05fda0 EFLAGS: 00010202 RAX: ffff8800d8a40c00 RBX: ffff8800da4af940 RCX: ffff88011a05ff20 RDX: ffffc90001630020 RSI: 0000000001013531 RDI: ffff8800da4af950 RBP: 0000000000000000 R08: ffff8800da1f9a00 R09: 0000000000000000 R10: ffff8800db45b7e4 R11: 0000000000000246 R12: ffff8800da4af950 R13: ffff8800d97a74c0 R14: 0000000000000000 R15: ffff8800d97a7480 FS: 00007fd3970e0700(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffc90001630024 CR3: 000000011a7e4000 CR4: 00000000000006e0 Stack: ffffffff814d00d3 0000000000000000 ffff88011a05ff20 ffff8800da1f9a00 ffffffff811dd8b9 0000000000000800 0000000000020000 00007fd396f35000 ffffffff811f8714 0000000000003431 ffffffff8138dce0 0000000000000f80 Call Trace: [<ffffffff814d00d3>] ? fib_route_seq_start+0x93/0xc0 [<ffffffff811dd8b9>] ? seq_read+0x149/0x380 [<ffffffff811f8714>] ? fsnotify+0x3b4/0x500 [<ffffffff8138dce0>] ? process_echoes+0x70/0x70 [<ffffffff8121cfa7>] ? proc_reg_read+0x47/0x70 [<ffffffff811bb823>] ? __vfs_read+0x23/0xd0 [<ffffffff811bbd42>] ? rw_verify_area+0x52/0xf0 [<ffffffff811bbe61>] ? vfs_read+0x81/0x120 [<ffffffff811bcbc2>] ? SyS_read+0x42/0xa0 [<ffffffff81549ab2>] ? entry_SYSCALL_64_fastpath+0x16/0x75 Code: 48 85 c0 75 d8 f3 c3 31 c0 c3 f3 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 a 04 89 f0 33 02 44 89 c9 48 d3 e8 0f b6 4a 05 49 89 RIP [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0 RSP <ffff88011a05fda0> CR2: ffffc90001630024 Signed-off-by: Dave Forster <dforster@brocade.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
drm_connector_register_all requires a few too many locks because our connector_list locking is busted. Add another FIXME+hack to work around this. This should address the below lockdep splat: ====================================================== [ INFO: possible circular locking dependency detected ] 4.7.0-rc5+ #524 Tainted: G O ------------------------------------------------------- kworker/u8:0/6 is trying to acquire lock: (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffff815afde0>] drm_modeset_lock_all+0x40/0x120 but task is already holding lock: ((fb_notifier_list).rwsem){++++.+}, at: [<ffffffff810ac195>] __blocking_notifier_call_chain+0x35/0x70 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 ((fb_notifier_list).rwsem){++++.+}: [<ffffffff810df611>] lock_acquire+0xb1/0x200 [<ffffffff819a55b4>] down_write+0x44/0x80 [<ffffffff810abf91>] blocking_notifier_chain_register+0x21/0xb0 [<ffffffff814c7448>] fb_register_client+0x18/0x20 [<ffffffff814c6c86>] backlight_device_register+0x136/0x260 [<ffffffffa0127eb2>] intel_backlight_device_register+0xa2/0x160 [i915] [<ffffffffa00f46be>] intel_connector_register+0xe/0x10 [i915] [<ffffffffa0112bfb>] intel_dp_connector_register+0x1b/0x80 [i915] [<ffffffff8159dfea>] drm_connector_register+0x4a/0x80 [<ffffffff8159fe44>] drm_connector_register_all+0x64/0xf0 [<ffffffff815a2a64>] drm_modeset_register_all+0x174/0x1c0 [<ffffffff81599b72>] drm_dev_register+0xc2/0xd0 [<ffffffffa00621d7>] i915_driver_load+0x1547/0x2200 [i915] [<ffffffffa006d80f>] i915_pci_probe+0x4f/0x70 [i915] [<ffffffff814a2135>] local_pci_probe+0x45/0xa0 [<ffffffff814a349b>] pci_device_probe+0xdb/0x130 [<ffffffff815c07e3>] driver_probe_device+0x223/0x440 [<ffffffff815c0ad5>] __driver_attach+0xd5/0x100 [<ffffffff815be386>] bus_for_each_dev+0x66/0xa0 [<ffffffff815c002e>] driver_attach+0x1e/0x20 [<ffffffff815bf9be>] bus_add_driver+0x1ee/0x280 [<ffffffff815c1810>] driver_register+0x60/0xe0 [<ffffffff814a1a10>] __pci_register_driver+0x60/0x70 [<ffffffffa01a905b>] i915_init+0x5b/0x62 [i915] [<ffffffff8100042d>] do_one_initcall+0x3d/0x150 [<ffffffff811a935b>] do_init_module+0x5f/0x1d9 [<ffffffff81124416>] load_module+0x20e6/0x27e0 [<ffffffff81124d63>] SYSC_finit_module+0xc3/0xf0 [<ffffffff81124dae>] SyS_finit_module+0xe/0x10 [<ffffffff819a83a9>] entry_SYSCALL_64_fastpath+0x1c/0xac -> #0 (&dev->mode_config.mutex){+.+.+.}: [<ffffffff810df0ac>] __lock_acquire+0x10fc/0x1260 [<ffffffff810df611>] lock_acquire+0xb1/0x200 [<ffffffff819a3097>] mutex_lock_nested+0x67/0x3c0 [<ffffffff815afde0>] drm_modeset_lock_all+0x40/0x120 [<ffffffff8158f79b>] drm_fb_helper_restore_fbdev_mode_unlocked+0x2b/0x80 [<ffffffff8158f81d>] drm_fb_helper_set_par+0x2d/0x50 [<ffffffffa0105f7a>] intel_fbdev_set_par+0x1a/0x60 [i915] [<ffffffff814c13c6>] fbcon_init+0x586/0x610 [<ffffffff8154d16a>] visual_init+0xca/0x130 [<ffffffff8154e611>] do_bind_con_driver+0x1c1/0x3a0 [<ffffffff8154eaf6>] do_take_over_console+0x116/0x180 [<ffffffff814bd3a7>] do_fbcon_takeover+0x57/0xb0 [<ffffffff814c1e48>] fbcon_event_notify+0x658/0x750 [<ffffffff810abcae>] notifier_call_chain+0x3e/0xb0 [<ffffffff810ac1ad>] __blocking_notifier_call_chain+0x4d/0x70 [<ffffffff810ac1e6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff814c748b>] fb_notifier_call_chain+0x1b/0x20 [<ffffffff814c86b1>] register_framebuffer+0x251/0x330 [<ffffffff8158fa9f>] drm_fb_helper_initial_config+0x25f/0x3f0 [<ffffffffa0106b48>] intel_fbdev_initial_config+0x18/0x30 [i915] [<ffffffff810adfd8>] async_run_entry_fn+0x48/0x150 [<ffffffff810a3947>] process_one_work+0x1e7/0x750 [<ffffffff810a3efb>] worker_thread+0x4b/0x4f0 [<ffffffff810aad4f>] kthread+0xef/0x110 [<ffffffff819a85ef>] ret_from_fork+0x1f/0x40 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock((fb_notifier_list).rwsem); lock(&dev->mode_config.mutex); lock((fb_notifier_list).rwsem); lock(&dev->mode_config.mutex); *** DEADLOCK *** 6 locks held by kworker/u8:0/6: #0: ("events_unbound"){.+.+.+}, at: [<ffffffff810a38c9>] process_one_work+0x169/0x750 #1: ((&entry->work)){+.+.+.}, at: [<ffffffff810a38c9>] process_one_work+0x169/0x750 #2: (registration_lock){+.+.+.}, at: [<ffffffff814c8487>] register_framebuffer+0x27/0x330 #3: (console_lock){+.+.+.}, at: [<ffffffff814c86ce>] register_framebuffer+0x26e/0x330 #4: (&fb_info->lock){+.+.+.}, at: [<ffffffff814c78dd>] lock_fb_info+0x1d/0x40 #5: ((fb_notifier_list).rwsem){++++.+}, at: [<ffffffff810ac195>] __blocking_notifier_call_chain+0x35/0x70 stack backtrace: CPU: 2 PID: 6 Comm: kworker/u8:0 Tainted: G O 4.7.0-rc5+ #524 Hardware name: Intel Corp. Broxton P/NOTEBOOK, BIOS APLKRVPA.X64.0138.B33.1606250842 06/25/2016 Workqueue: events_unbound async_run_entry_fn 0000000000000000 ffff8800758577f0 ffffffff814507a5 ffffffff828b9900 ffffffff828b9900 ffff880075857830 ffffffff810dc6fa ffff880075857880 ffff88007584d688 0000000000000005 0000000000000006 ffff88007584d6b0 Call Trace: [<ffffffff814507a5>] dump_stack+0x67/0x92 [<ffffffff810dc6fa>] print_circular_bug+0x1aa/0x200 [<ffffffff810df0ac>] __lock_acquire+0x10fc/0x1260 [<ffffffff810df611>] lock_acquire+0xb1/0x200 [<ffffffff815afde0>] ? drm_modeset_lock_all+0x40/0x120 [<ffffffff815afde0>] ? drm_modeset_lock_all+0x40/0x120 [<ffffffff819a3097>] mutex_lock_nested+0x67/0x3c0 [<ffffffff815afde0>] ? drm_modeset_lock_all+0x40/0x120 [<ffffffff810fa85f>] ? rcu_read_lock_sched_held+0x7f/0x90 [<ffffffff81208218>] ? kmem_cache_alloc_trace+0x248/0x2b0 [<ffffffff815afdc5>] ? drm_modeset_lock_all+0x25/0x120 [<ffffffff815afde0>] drm_modeset_lock_all+0x40/0x120 [<ffffffff8158f79b>] drm_fb_helper_restore_fbdev_mode_unlocked+0x2b/0x80 [<ffffffff8158f81d>] drm_fb_helper_set_par+0x2d/0x50 [<ffffffffa0105f7a>] intel_fbdev_set_par+0x1a/0x60 [i915] [<ffffffff814c13c6>] fbcon_init+0x586/0x610 [<ffffffff8154d16a>] visual_init+0xca/0x130 [<ffffffff8154e611>] do_bind_con_driver+0x1c1/0x3a0 [<ffffffff8154eaf6>] do_take_over_console+0x116/0x180 [<ffffffff814bd3a7>] do_fbcon_takeover+0x57/0xb0 [<ffffffff814c1e48>] fbcon_event_notify+0x658/0x750 [<ffffffff810abcae>] notifier_call_chain+0x3e/0xb0 [<ffffffff810ac1ad>] __blocking_notifier_call_chain+0x4d/0x70 [<ffffffff810ac1e6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff814c748b>] fb_notifier_call_chain+0x1b/0x20 [<ffffffff814c86b1>] register_framebuffer+0x251/0x330 [<ffffffff815b7e8d>] ? vga_switcheroo_client_fb_set+0x5d/0x70 [<ffffffff8158fa9f>] drm_fb_helper_initial_config+0x25f/0x3f0 [<ffffffffa0106b48>] intel_fbdev_initial_config+0x18/0x30 [i915] [<ffffffff810adfd8>] async_run_entry_fn+0x48/0x150 [<ffffffff810a3947>] process_one_work+0x1e7/0x750 [<ffffffff810a38c9>] ? process_one_work+0x169/0x750 [<ffffffff810a3efb>] worker_thread+0x4b/0x4f0 [<ffffffff810a3eb0>] ? process_one_work+0x750/0x750 [<ffffffff810aad4f>] kthread+0xef/0x110 [<ffffffff819a85ef>] ret_from_fork+0x1f/0x40 [<ffffffff810aac60>] ? kthread_stop+0x2e0/0x2e0 v2: Rebase onto the right branch (hand-editing patches ftw) and add more reporters. Reported-by: Imre Deak <imre.deak@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-by: Jiri Kosina <jikos@kernel.org> Cc: Jiri Kosina <jikos@kernel.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
Memory leak and unbalanced reference count: If the hub gets disconnected while the core is still activating it, this can result in leaking memory of few USB structures. This will happen if we have done a kref_get() from hub_activate() and scheduled a delayed work item for HUB_INIT2/3. Now if hub_disconnect() gets called before the delayed work expires, then we will cancel the work from hub_quiesce(), but wouldn't do a kref_put(). And so the unbalance. kmemleak reports this as (with the commit e50293e backported to 3.10 kernel with other changes, though the same is true for mainline as well): unreferenced object 0xffffffc08af5b800 (size 1024): comm "khubd", pid 73, jiffies 4295051211 (age 6482.350s) hex dump (first 32 bytes): 30 68 f3 8c c0 ff ff ff 00 a0 b2 2e c0 ff ff ff 0h.............. 01 00 00 00 00 00 00 00 00 94 7d 40 c0 ff ff ff ..........}@.... backtrace: [<ffffffc0003079ec>] create_object+0x148/0x2a0 [<ffffffc000cc150c>] kmemleak_alloc+0x80/0xbc [<ffffffc000303a7c>] kmem_cache_alloc_trace+0x120/0x1ac [<ffffffc0006fa610>] hub_probe+0x120/0xb84 [<ffffffc000702b20>] usb_probe_interface+0x1ec/0x298 [<ffffffc0005d50cc>] driver_probe_device+0x160/0x374 [<ffffffc0005d5308>] __device_attach+0x28/0x4c [<ffffffc0005d3164>] bus_for_each_drv+0x78/0xac [<ffffffc0005d4ee0>] device_attach+0x6c/0x9c [<ffffffc0005d42b8>] bus_probe_device+0x28/0xa0 [<ffffffc0005d23a4>] device_add+0x324/0x604 [<ffffffc000700fcc>] usb_set_configuration+0x660/0x6cc [<ffffffc00070a350>] generic_probe+0x44/0x84 [<ffffffc000702914>] usb_probe_device+0x54/0x74 [<ffffffc0005d50cc>] driver_probe_device+0x160/0x374 [<ffffffc0005d5308>] __device_attach+0x28/0x4c Deadlocks: If the hub gets disconnected early enough (i.e. before INIT2/INIT3 are finished and the init_work is still queued), the core may call hub_quiesce() after acquiring interface device locks and it will wait for the work to be cancelled synchronously. But if the work handler is already running in parallel, it may try to acquire the same interface device lock and this may result in deadlock. Fix both the issues by removing the call to cancel_delayed_work_sync(). CC: <stable@vger.kernel.org> #4.4+ Fixes: e50293e ("USB: fix invalid memory access in hub_activate()") Reported-by: Manu Gautam <mgautam@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
The early-exit pathway in hub_activate, added by commit e50293e ("USB: fix invalid memory access in hub_activate()") needs improvement. It duplicates code that is already present at the end of the subroutine, and it neglects to undo the effect of a usb_autopm_get_interface_no_resume() call. This patch fixes both problems by making the early-exit pathway jump directly to the end of the subroutine. It simplifies the code at the end by merging two conditionals that actually test the same condition although they appear different: If type < HUB_INIT3 then type must be either HUB_INIT2 or HUB_INIT, and it can't be HUB_INIT because in that case the subroutine would have exited earlier. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: <stable@vger.kernel.org> #4.4+ Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
The locking in hub_activate() is not adequate to provide full mutual exclusion with hub_quiesce(). The subroutine locks the hub's usb_interface, but the callers of hub_quiesce() (such as hub_pre_reset() and hub_event()) hold the lock to the hub's usb_device. This patch changes hub_activate() to make it acquire the same lock as those other routines. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: <stable@vger.kernel.org> #4.4+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
Both set_memory_ro() and set_memory_rw() will modify the page attributes of at least one page, even if the numpages parameter is zero. The author expected that calling these functions with numpages == zero would never happen. However with the new 444d13f ("modules: add ro_after_init support") feature this happens frequently. Therefore do the right thing and make these two functions return gracefully if nothing should be done. Fixes crashes on module load like this one: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 000003ff80008000 TEID: 000003ff80008407 Fault in home space mode while using kernel ASCE. AS:0000000000d18007 R3:00000001e6aa4007 S:00000001e6a10800 P:00000001e34ee21d Oops: 0004 ilc:3 [#1] SMP Modules linked in: x_tables CPU: 10 PID: 1 Comm: systemd Not tainted 4.7.0-11895-g3fa9045 #4 Hardware name: IBM 2964 N96 703 (LPAR) task: 00000001e9118000 task.stack: 00000001e9120000 Krnl PSW : 0704e00180000000 00000000005677f8 (rb_erase+0xf0/0x4d0) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 000003ff80008b20 000003ff80008b20 000003ff80008b70 0000000000b9d608 000003ff80008b20 0000000000000000 00000001e9123e88 000003ff80008950 00000001e485ab40 000003ff00000000 000003ff80008b00 00000001e4858480 0000000100000000 000003ff80008b68 00000000001d5998 00000001e9123c28 Krnl Code: 00000000005677e8: ec1801c3007c cgij %r1,0,8,567b6e 00000000005677ee: e32010100020 cg %r2,16(%r1) #00000000005677f4: a78401c2 brc 8,567b78 >00000000005677f8: e35010080024 stg %r5,8(%r1) 00000000005677fe: ec5801af007c cgij %r5,0,8,567b5c 0000000000567804: e30050000024 stg %r0,0(%r5) 000000000056780a: ebacf0680004 lmg %r10,%r12,104(%r15) 0000000000567810: 07fe bcr 15,%r14 Call Trace: ([<000003ff80008900>] __this_module+0x0/0xffffffffffffd700 [x_tables]) ([<0000000000264fd4>] do_init_module+0x12c/0x220) ([<00000000001da14a>] load_module+0x24e2/0x2b10) ([<00000000001da976>] SyS_finit_module+0xbe/0xd8) ([<0000000000803b26>] system_call+0xd6/0x264) Last Breaking-Event-Address: [<000000000056771a>] rb_erase+0x12/0x4d0 Kernel panic - not syncing: Fatal exception: panic_on_oops Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Fixes: e8a97e4 ("s390/pageattr: allow kernel page table splitting") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
With debugobjects enabled and using SLAB_DESTROY_BY_RCU, when a kmem_cache_node is destroyed the call_rcu() may trigger a slab allocation to fill the debug object pool (__debug_object_init:fill_pool). Everywhere but during kmem_cache_destroy(), discard_slab() is performed outside of the kmem_cache_node->list_lock and avoids a lockdep warning about potential recursion: ============================================= [ INFO: possible recursive locking detected ] 4.8.0-rc1-gfxbench+ #1 Tainted: G U --------------------------------------------- rmmod/8895 is trying to acquire lock: (&(&n->list_lock)->rlock){-.-...}, at: [<ffffffff811c80d7>] get_partial_node.isra.63+0x47/0x430 but task is already holding lock: (&(&n->list_lock)->rlock){-.-...}, at: [<ffffffff811cbda4>] __kmem_cache_shutdown+0x54/0x320 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&n->list_lock)->rlock); lock(&(&n->list_lock)->rlock); *** DEADLOCK *** May be due to missing lock nesting notation 5 locks held by rmmod/8895: #0: (&dev->mutex){......}, at: driver_detach+0x42/0xc0 #1: (&dev->mutex){......}, at: driver_detach+0x50/0xc0 #2: (cpu_hotplug.dep_map){++++++}, at: get_online_cpus+0x2d/0x80 #3: (slab_mutex){+.+.+.}, at: kmem_cache_destroy+0x3c/0x220 #4: (&(&n->list_lock)->rlock){-.-...}, at: __kmem_cache_shutdown+0x54/0x320 stack backtrace: CPU: 6 PID: 8895 Comm: rmmod Tainted: G U 4.8.0-rc1-gfxbench+ #1 Hardware name: Gigabyte Technology Co., Ltd. H87M-D3H/H87M-D3H, BIOS F11 08/18/2015 Call Trace: __lock_acquire+0x1646/0x1ad0 lock_acquire+0xb2/0x200 _raw_spin_lock+0x36/0x50 get_partial_node.isra.63+0x47/0x430 ___slab_alloc.constprop.67+0x1a7/0x3b0 __slab_alloc.isra.64.constprop.66+0x43/0x80 kmem_cache_alloc+0x236/0x2d0 __debug_object_init+0x2de/0x400 debug_object_activate+0x109/0x1e0 __call_rcu.constprop.63+0x32/0x2f0 call_rcu+0x12/0x20 discard_slab+0x3d/0x40 __kmem_cache_shutdown+0xdb/0x320 shutdown_cache+0x19/0x60 kmem_cache_destroy+0x1ae/0x220 i915_gem_load_cleanup+0x14/0x40 [i915] i915_driver_unload+0x151/0x180 [i915] i915_pci_remove+0x14/0x20 [i915] pci_device_remove+0x34/0xb0 __device_release_driver+0x95/0x140 driver_detach+0xb6/0xc0 bus_remove_driver+0x53/0xd0 driver_unregister+0x27/0x50 pci_unregister_driver+0x25/0x70 i915_exit+0x1a/0x1e2 [i915] SyS_delete_module+0x193/0x1f0 entry_SYSCALL_64_fastpath+0x1c/0xac Fixes: 52b4b95 ("mm: slab: free kmem_cache_node after destroy sysfs file") Link: http://lkml.kernel.org/r/1470759070-18743-1-git-send-email-chris@chris-wilson.co.uk Reported-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
| CC mm/memory.o | In file included from ../mm/memory.c:53:0: | ../include/linux/pfn_t.h: In function ‘pfn_t_pte’: | ../include/linux/pfn_t.h:78:2: error: conversion to non-scalar type requested | return pfn_pte(pfn_t_to_pfn(pfn), pgprot); With STRICT_MM_TYPECHECKS pte_t is a struct and the offending code forces a cast which ends up shifting a struct and hence the gcc warning. Note that in recent past some of the arches (aarch64, s390) made STRICT_MM_TYPECHECKS default, but we don't for ARC as this leads to slightly worse generated code, given ARC ABI definition of returning structs (which pte_t would become) Quoting from ARC ABI... "Results of type struct are returned in a caller-supplied temporary variable whose address is passed in r0. For such functions, the arguments are shifted so that they are passed in r1 and up." So - struct to be returned would be allocated on stack requiring extra code at call sites - callee updates stack memory to facilitate the return (vs. simple MOV into return reg r0) Hence STRICT_MM_TYPECHECKS is not enabled by default for ARC Cc: <stable@vger.kernel.org> #4.4+ Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
Sunrise Point PCH with SPS Firmware doesn't expose working MEI interface, we need to quirk it out. The SPS Firmware is identifiable only on the first PCI function of the device. Cc: <stable@vger.kernel.org> #4.6+ Tested-by: Sujith Pandel <sujith_pandel@dell.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mathieupoirier
pushed a commit
that referenced
this issue
Sep 22, 2016
Problems with the signal integrity of the high speed USB data lines or noise on reference ground lines can cause the i.MX6 USB controller to violate USB specs and exhibit unexpected behavior. It was observed that USBi_UI interrupts were triggered first and when isr_setup_status_phase was called, ci->status was NULL, which lead to a NULL pointer dereference kernel panic. This patch fixes the kernel panic, emits a warning once and returns -EPIPE to halt the device and let the host get stalled. It also adds a comment to point people, who are experiencing this issue, to their USB hardware design. Cc: <stable@vger.kernel.org> #4.1+ Signed-off-by: Clemens Gruber <clemens.gruber@pqgruber.com> Signed-off-by: Peter Chen <peter.chen@nxp.com>
mathieupoirier
pushed a commit
that referenced
this issue
Oct 6, 2016
Since commit 4d4c474 ("perf/x86/intel/bts: Fix BTS PMI detection") my box goes boom on boot: | .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 | BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 | IP: [<ffffffff8100c463>] intel_bts_interrupt+0x43/0x130 | Call Trace: | <NMI> d [<ffffffff8100b341>] intel_pmu_handle_irq+0x51/0x4b0 | [<ffffffff81004d47>] perf_event_nmi_handler+0x27/0x40 This happens because the code introduced in this commit dereferences the debug store pointer unconditionally. The debug store is not guaranteed to be available, so a NULL pointer check as on other places is required. Fixes: 4d4c474 ("perf/x86/intel/bts: Fix BTS PMI detection") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: vince@deater.net Cc: eranian@google.com Link: http://lkml.kernel.org/r/20160920131220.xg5pbdjtznszuyzb@breakpoint.cc Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mathieupoirier
pushed a commit
that referenced
this issue
Feb 20, 2017
commit d65283f added mod->arch.secstr under CONFIG_ARC_DW2_UNWIND, but used it unconditionally which broke builds when the option was disabled. Fix that by adjusting the #ifdef guard. And while at it add a missing guard (for unwinder) in module.c as well Reported-by: Waldemar Brodkorb <wbx@openadk.org> Cc: stable@vger.kernel.org #4.9 Fixes: d65283f ("ARC: module: elide loop to save reference to .eh_frame") Tested-by: Anton Kolesov <akolesov@synopsys.com> Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com> [abrodkin: provided fixlet to Kconfig per failure in allnoconfig build] Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
mathieupoirier
pushed a commit
that referenced
this issue
Feb 20, 2017
We cannot do printk() from tk_debug_account_sleep_time(), because tk_debug_account_sleep_time() is called under tk_core seq lock. The reason why printk() is unsafe there is that console_sem may invoke scheduler (up()->wake_up_process()->activate_task()), which, in turn, can return back to timekeeping code, for instance, via get_time()->ktime_get(), deadlocking the system on tk_core seq lock. [ 48.950592] ====================================================== [ 48.950622] [ INFO: possible circular locking dependency detected ] [ 48.950622] 4.10.0-rc7-next-20170213+ #101 Not tainted [ 48.950622] ------------------------------------------------------- [ 48.950622] kworker/0:0/3 is trying to acquire lock: [ 48.950653] (tk_core){----..}, at: [<c01cc624>] retrigger_next_event+0x4c/0x90 [ 48.950683] but task is already holding lock: [ 48.950683] (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90 [ 48.950714] which lock already depends on the new lock. [ 48.950714] the existing dependency chain (in reverse order) is: [ 48.950714] -> #5 (hrtimer_bases.lock){-.-...}: [ 48.950744] _raw_spin_lock_irqsave+0x50/0x64 [ 48.950775] lock_hrtimer_base+0x28/0x58 [ 48.950775] hrtimer_start_range_ns+0x20/0x5c8 [ 48.950775] __enqueue_rt_entity+0x320/0x360 [ 48.950805] enqueue_rt_entity+0x2c/0x44 [ 48.950805] enqueue_task_rt+0x24/0x94 [ 48.950836] ttwu_do_activate+0x54/0xc0 [ 48.950836] try_to_wake_up+0x248/0x5c8 [ 48.950836] __setup_irq+0x420/0x5f0 [ 48.950836] request_threaded_irq+0xdc/0x184 [ 48.950866] devm_request_threaded_irq+0x58/0xa4 [ 48.950866] omap_i2c_probe+0x530/0x6a0 [ 48.950897] platform_drv_probe+0x50/0xb0 [ 48.950897] driver_probe_device+0x1f8/0x2cc [ 48.950897] __driver_attach+0xc0/0xc4 [ 48.950927] bus_for_each_dev+0x6c/0xa0 [ 48.950927] bus_add_driver+0x100/0x210 [ 48.950927] driver_register+0x78/0xf4 [ 48.950958] do_one_initcall+0x3c/0x16c [ 48.950958] kernel_init_freeable+0x20c/0x2d8 [ 48.950958] kernel_init+0x8/0x110 [ 48.950988] ret_from_fork+0x14/0x24 [ 48.950988] -> #4 (&rt_b->rt_runtime_lock){-.-...}: [ 48.951019] _raw_spin_lock+0x40/0x50 [ 48.951019] rq_offline_rt+0x9c/0x2bc [ 48.951019] set_rq_offline.part.2+0x2c/0x58 [ 48.951049] rq_attach_root+0x134/0x144 [ 48.951049] cpu_attach_domain+0x18c/0x6f4 [ 48.951049] build_sched_domains+0xba4/0xd80 [ 48.951080] sched_init_smp+0x68/0x10c [ 48.951080] kernel_init_freeable+0x160/0x2d8 [ 48.951080] kernel_init+0x8/0x110 [ 48.951080] ret_from_fork+0x14/0x24 [ 48.951110] -> #3 (&rq->lock){-.-.-.}: [ 48.951110] _raw_spin_lock+0x40/0x50 [ 48.951141] task_fork_fair+0x30/0x124 [ 48.951141] sched_fork+0x194/0x2e0 [ 48.951141] copy_process.part.5+0x448/0x1a20 [ 48.951171] _do_fork+0x98/0x7e8 [ 48.951171] kernel_thread+0x2c/0x34 [ 48.951171] rest_init+0x1c/0x18c [ 48.951202] start_kernel+0x35c/0x3d4 [ 48.951202] 0x8000807c [ 48.951202] -> #2 (&p->pi_lock){-.-.-.}: [ 48.951232] _raw_spin_lock_irqsave+0x50/0x64 [ 48.951232] try_to_wake_up+0x30/0x5c8 [ 48.951232] up+0x4c/0x60 [ 48.951263] __up_console_sem+0x2c/0x58 [ 48.951263] console_unlock+0x3b4/0x650 [ 48.951263] vprintk_emit+0x270/0x474 [ 48.951293] vprintk_default+0x20/0x28 [ 48.951293] printk+0x20/0x30 [ 48.951324] kauditd_hold_skb+0x94/0xb8 [ 48.951324] kauditd_thread+0x1a4/0x56c [ 48.951324] kthread+0x104/0x148 [ 48.951354] ret_from_fork+0x14/0x24 [ 48.951354] -> #1 ((console_sem).lock){-.....}: [ 48.951385] _raw_spin_lock_irqsave+0x50/0x64 [ 48.951385] down_trylock+0xc/0x2c [ 48.951385] __down_trylock_console_sem+0x24/0x80 [ 48.951385] console_trylock+0x10/0x8c [ 48.951416] vprintk_emit+0x264/0x474 [ 48.951416] vprintk_default+0x20/0x28 [ 48.951416] printk+0x20/0x30 [ 48.951446] tk_debug_account_sleep_time+0x5c/0x70 [ 48.951446] __timekeeping_inject_sleeptime.constprop.3+0x170/0x1a0 [ 48.951446] timekeeping_resume+0x218/0x23c [ 48.951477] syscore_resume+0x94/0x42c [ 48.951477] suspend_enter+0x554/0x9b4 [ 48.951477] suspend_devices_and_enter+0xd8/0x4b4 [ 48.951507] enter_state+0x934/0xbd4 [ 48.951507] pm_suspend+0x14/0x70 [ 48.951507] state_store+0x68/0xc8 [ 48.951538] kernfs_fop_write+0xf4/0x1f8 [ 48.951538] __vfs_write+0x1c/0x114 [ 48.951538] vfs_write+0xa0/0x168 [ 48.951568] SyS_write+0x3c/0x90 [ 48.951568] __sys_trace_return+0x0/0x10 [ 48.951568] -> #0 (tk_core){----..}: [ 48.951599] lock_acquire+0xe0/0x294 [ 48.951599] ktime_get_update_offsets_now+0x5c/0x1d4 [ 48.951629] retrigger_next_event+0x4c/0x90 [ 48.951629] on_each_cpu+0x40/0x7c [ 48.951629] clock_was_set_work+0x14/0x20 [ 48.951660] process_one_work+0x2b4/0x808 [ 48.951660] worker_thread+0x3c/0x550 [ 48.951660] kthread+0x104/0x148 [ 48.951690] ret_from_fork+0x14/0x24 [ 48.951690] other info that might help us debug this: [ 48.951690] Chain exists of: tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock [ 48.951721] Possible unsafe locking scenario: [ 48.951721] CPU0 CPU1 [ 48.951721] ---- ---- [ 48.951721] lock(hrtimer_bases.lock); [ 48.951751] lock(&rt_b->rt_runtime_lock); [ 48.951751] lock(hrtimer_bases.lock); [ 48.951751] lock(tk_core); [ 48.951782] *** DEADLOCK *** [ 48.951782] 3 locks held by kworker/0:0/3: [ 48.951782] #0: ("events"){.+.+.+}, at: [<c0156590>] process_one_work+0x1f8/0x808 [ 48.951812] #1: (hrtimer_work){+.+...}, at: [<c0156590>] process_one_work+0x1f8/0x808 [ 48.951843] #2: (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90 [ 48.951843] stack backtrace: [ 48.951873] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.10.0-rc7-next-20170213+ [ 48.951904] Workqueue: events clock_was_set_work [ 48.951904] [<c0110208>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14) [ 48.951934] [<c010c224>] (show_stack) from [<c04ca6c0>] (dump_stack+0xac/0xe0) [ 48.951934] [<c04ca6c0>] (dump_stack) from [<c019b5cc>] (print_circular_bug+0x1d0/0x308) [ 48.951965] [<c019b5cc>] (print_circular_bug) from [<c019d2a8>] (validate_chain+0xf50/0x1324) [ 48.951965] [<c019d2a8>] (validate_chain) from [<c019ec18>] (__lock_acquire+0x468/0x7e8) [ 48.951995] [<c019ec18>] (__lock_acquire) from [<c019f634>] (lock_acquire+0xe0/0x294) [ 48.951995] [<c019f634>] (lock_acquire) from [<c01d0ea0>] (ktime_get_update_offsets_now+0x5c/0x1d4) [ 48.952026] [<c01d0ea0>] (ktime_get_update_offsets_now) from [<c01cc624>] (retrigger_next_event+0x4c/0x90) [ 48.952026] [<c01cc624>] (retrigger_next_event) from [<c01e4e24>] (on_each_cpu+0x40/0x7c) [ 48.952056] [<c01e4e24>] (on_each_cpu) from [<c01cafc4>] (clock_was_set_work+0x14/0x20) [ 48.952056] [<c01cafc4>] (clock_was_set_work) from [<c015664c>] (process_one_work+0x2b4/0x808) [ 48.952087] [<c015664c>] (process_one_work) from [<c0157774>] (worker_thread+0x3c/0x550) [ 48.952087] [<c0157774>] (worker_thread) from [<c015d644>] (kthread+0x104/0x148) [ 48.952087] [<c015d644>] (kthread) from [<c0107830>] (ret_from_fork+0x14/0x24) Replace printk() with printk_deferred(), which does not call into the scheduler. Fixes: 0bf43f1 ("timekeeping: Prints the amounts of time spent during suspend") Reported-and-tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: John Stultz <john.stultz@linaro.org> Cc: "[4.9+]" <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
trc_pkt_lister appears to hang when run on the Snowball snapshot with the -decode option.
The code was built with
LINUX64=1 DEBUG=1
.At this point it hangs for several minutes with 100% CPU usage.
The text was updated successfully, but these errors were encountered: