error compiling #1

Open
peterjohnhartman opened this Issue Mar 14, 2012 · 1 comment

Comments

Projects
None yet
2 participants

Hi folks,

I tried to compile the kernel inside a chroot environment on my transformer prime. I did a make clean to be sure. It compiles fine but bails towards the very end with this as the last 5 lines:

LD init/built-in.o
LD .tmp_vmlinux1
kernel/built-in.o: In function kernel restart': elfcore.c(.text+0x14fa8): undefined reference todisable_auto_hotplug'
kernel/built-in.o: In function kernel_power_off': elfcore.c(.text+0x1504c): undefined reference todisable_auto_hotplug'
make: *** [.tmp_vmlinux1] Error 1

Thanks in advance for any tips. It could well be my chroot (a standard gentoo chroot done via:

mount -t proc none /data/gentoo/proc
mount -o bind /dev /data/gentoo/dev
mount -o bind /dev/pts /data/gentoo/dev/pts
mount -o bind /dev/shm /data/gentoo/dev/shm
chroot /data/gentoo /bin/bash

Just thought I would note that this symbol appears to be defined in arch/arm/mach-tegra/cpu-tegra3.c:

~/TF201$ find . -name *.c | xargs grep disable_auto_hotplug
./kernel/sys.c: extern void disable_auto_hotplug(void);
./kernel/sys.c: disable_auto_hotplug();
./kernel/sys.c: disable_auto_hotplug();
./arch/arm/mach-tegra/cpu-tegra3.c:void disable_auto_hotplug(void)

Did you use "tegra3_android_defconfig" ? I am trying this, but at least one file needs hand editing from #include <devices.h> to #include "devices.h"

EnJens pushed a commit that referenced this issue Oct 2, 2012

Char: nozomi, remove useless tty_sem
tty_sem used to protect tty open count. This was removed in 33dd474
but the lock remained in place.

So remove it completely as it protects nothing now.

Also this solves Mac's problem with inatomic operation called from
atomic context (ppp):
BUG: scheduling while atomic: firefox-bin/1992/0x10000800
Modules linked in: ...
Pid: 1992, comm: firefox-bin Not tainted 2.6.38 #1
Call Trace:
...
 [] ? mutex_lock+0xe/0x21
 [] ? ntty_write+0x5d/0x192 [nozomi]
 [] ? __mod_timer.clone.30+0xbe/0xcc
 [] ? check_preempt_curr+0x60/0x6d
 [] ? __nf_ct_refresh_acct+0x75/0xbe
 [] ? ppp_async_push+0xa9/0x3bd [ppp_async]
 [] ? ppp_async_send+0x34/0x40 [ppp_async]
 [] ? ppp_push+0x6c/0x4f9 [ppp_generic]
...

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: Mac <kmac@poczta.fm>
Tested-by: Gerald Pfeifer <gerald@pfeifer.com>
Reviewed-by: Jack Stone <jwjstone@fastmail.fm>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

efivars: prevent oops on unload when efi is not enabled
efivars_exit() should check for efi_enabled and not undo
allocations when efi is not enabled.  Otherwise there is an Oops
during module unload:

calling  efivars_init+0x0/0x1000 [efivars] @ 2810
EFI Variables Facility v0.08 2004-May-17
initcall efivars_init+0x0/0x1000 [efivars] returned 0 after 5120 usecs
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/module/firmware_class/initstate
CPU 1
Modules linked in: efivars(-) af_packet tun nfsd lockd nfs_acl auth_rpcgss sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq freq_table mperf binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep mousedev snd_seq joydev snd_seq_device mac_hid evdev snd_pcm usbkbd usbmouse usbhid snd_timer hid tg3 snd sr_mod pcspkr rtc_cmos soundcore cdrom iTCO_wdt processor sg dcdbas i2c_i801 rtc_core iTCO_vendor_support intel_agp snd_page_alloc thermal_sys rtc_lib intel_gtt 8250_pnp button hwmon unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore [last unloaded: dell_rbu]

Pid: 2812, comm: rmmod Not tainted 2.6.39-rc6 #1 Dell Inc.                 OptiPlex 745                 /0TY565
RIP: 0010:[<ffffffffa06a17f6>]  [<ffffffffa06a17f6>] unregister_efivars+0x28/0x12c [efivars]
RSP: 0018:ffff88005eedde98  EFLAGS: 00010283
RAX: ffffffffa06a23fc RBX: ffffffffa06a44c0 RCX: ffff88007c227a50
RDX: 0000000000000000 RSI: 00000055ac13db78 RDI: ffffffffa06a44c0
RBP: ffff88005eeddec8 R08: 0000000000000000 R09: ffff88005eeddd78
R10: ffffffffa06a4220 R11: ffff88005eeddd78 R12: fffffffffffff7d0
R13: 00007fff5a3aaec0 R14: 0000000000000000 R15: ffffffffa06a4508
FS:  00007fa8dcc4a6f0(0000) GS:ffff88007c200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000005d148000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 2812, threadinfo ffff88005eedc000, task ffff88006754b000)
Stack:
 ffff88005eeddec8 ffffffffa06a4220 0000000000000000 00007fff5a3aaec0
 0000000000000000 0000000000000001 ffff88005eedded8 ffffffffa06a2418
 ffff88005eeddf78 ffffffff810d3598 ffffffffa06a4220 0000000000000880
Call Trace:
 [<ffffffffa06a2418>] efivars_exit+0x1c/0xc04 [efivars]
 [<ffffffff810d3598>] sys_delete_module+0x2d6/0x368
 [<ffffffff812d1db9>] ? lockdep_sys_exit_thunk+0x35/0x67
 [<ffffffff810fcba1>] ? audit_syscall_entry+0x172/0x1a5
 [<ffffffff81575082>] system_call_fastpath+0x16/0x1b
Code: 5c c9 c3 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 0f 1f 44 00 00 4c 8b 67 48 48 89 fb 4c 8d 7f 48 49 81 ec 30 08 00 00 <4d> 8b ac 24 30 08 00 00 49 81 ed 30 08 00 00 eb 59 48 89 df 48
RIP  [<ffffffffa06a17f6>] unregister_efivars+0x28/0x12c [efivars]
 RSP <ffff88005eedde98>
CR2: 0000000000000000
 ---[ end trace aa99b99090f70baa ]---

Matt apparently removed such a check in 2004 (with no reason given):
 *  17 May 2004 - Matt Domsch <Matt_Domsch@dell.com>
 *   remove check for efi_enabled in exit
but there have been several changes since then.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Mike Waychison <mikew@google.com>
Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: <matthew.e.tolentino@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

ptrace: fix signal->wait_chldexit usage in task_clear_group_stop_trap…
…ping()

GROUP_STOP_TRAPPING waiting mechanism piggybacks on
signal->wait_chldexit which is primarily used to implement waiting for
wait(2) and friends.  When do_wait() waits on signal->wait_chldexit,
it uses a custom wake up callback, child_wait_callback(), which
expects the child task which is waking up the parent to be passed in
as @key to filter out spurious wakeups.

task_clear_group_stop_trapping() used __wake_up_sync() which uses NULL
@key causing the following oops if the parent was doing do_wait().

  BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
  IP: [<ffffffff810499f9>] child_wait_callback+0x29/0x80
  PGD 1d899067 PUD 1e418067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP
  last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/local_cpus
  CPU 2
  Modules linked in:

  Pid: 4498, comm: test-continued Not tainted 2.6.39-rc6-work+ #32 Bochs Bochs
  RIP: 0010:[<ffffffff810499f9>]  [<ffffffff810499f9>] child_wait_callback+0x29/0x80
  RSP: 0000:ffff88001b889bf8  EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff88001fab3af8 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff88001d91df20
  RBP: ffff88001b889c08 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
  R13: ffff88001fb70550 R14: 0000000000000000 R15: 0000000000000001
  FS:  00007f26ccae4700(0000) GS:ffff88001fd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00000000000002d8 CR3: 000000001b8ac000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process test-continued (pid: 4498, threadinfo ffff88001b888000, task ffff88001fb88000)
  Stack:
   ffff88001b889c18 ffff88001fb70538 ffff88001b889c58 ffffffff810312f9
   0000000000000001 0000000200000001 ffff88001b889c58 ffff88001fb70518
   0000000000000002 0000000000000082 0000000000000001 0000000000000000
  Call Trace:
   [<ffffffff810312f9>] __wake_up_common+0x59/0x90
   [<ffffffff81035263>] __wake_up_sync_key+0x53/0x80
   [<ffffffff810352a0>] __wake_up_sync+0x10/0x20
   [<ffffffff8105a984>] task_clear_jobctl_trapping+0x44/0x50
   [<ffffffff8105bcbc>] ptrace_stop+0x7c/0x290
   [<ffffffff8105c20a>] do_signal_stop+0x28a/0x2d0
   [<ffffffff8105d27f>] get_signal_to_deliver+0x14f/0x5a0
   [<ffffffff81002175>] do_signal+0x75/0x7b0
   [<ffffffff8100292d>] do_notify_resume+0x5d/0x70
   [<ffffffff8182e36a>] retint_signal+0x46/0x8c
  Code: 00 00 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 8b 47 d8 83 f8 03 74 3a 85 c0 49 89 c8 75 23 89 c0 48 8b 5f e0 4c 8d 0c 40 31 c0 <4b> 39 9c c8 d8 02 00 00 74 1d 48 83 c4 08 5b c9 c3 66 0f 1f 44

Fix it by using __wake_up_sync_key() and passing in the child as @key.

I still think it's a mistake to piggyback on wait_chldexit for this.
Given the relative low frequency of ptrace use, we would be much
better off leaving already complex wait_chldexit alone and using bit
waitqueue.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

xen mmu: fix a race window causing leave_mm BUG()
There's a race window in xen_drop_mm_ref, where remote cpu may exit
dirty bitmap between the check on this cpu and the point where remote
cpu handles drop request. So in drop_other_mm_ref we need check
whether TLB state is still lazy before calling into leave_mm. This
bug is rarely observed in earlier kernel, but exaggerated by the
commit 831d52b
("x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm")
which clears bitmap after changing the TLB state. the call trace is as below:

---------------------------------
kernel BUG at arch/x86/mm/tlb.c:61!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
CPU 1
Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcs pkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
RSP: e02b:ffff88002805be48  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
FS:  00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000000 00
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
Stack:
 ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
<0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
<0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
Call Trace:
 <IRQ>
 [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
 [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
 [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
 [<ffffffff81013efe>] xen_do_hyper visor_callback+0x1e/0x30
 <EOI>
 [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
 [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
 [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
 [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
 [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
 [<ffffffff 8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
 [<ffffffff81013daa>] ? child_rip+0xa/0x20
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
 [<ffffffff81013da0>] ? child_rip+0x0/0x20
Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
 RSP <ffff88002805be48>
---[ end trace ce9cee6832a9c503 ]---

Tested-by: Maoxiaoyun<tinnycloud@hotmail.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
[v1: Fleshed out the git description a bit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_confl…
…ict()

During the sctp_close() call, we do not use rcu primitives to
destroy the address list attached to the endpoint.  At the same
time, we do the removal of addresses from this list before
attempting to remove the socket from the port hash

As a result, it is possible for another process to find the socket
in the port hash that is in the process of being closed.  It then
proceeds to traverse the address list to find the conflict, only
to have that address list suddenly disappear without rcu() critical
section.

Fix issue by closing address list removal inside RCU critical
section.

Race can result in a kernel crash with general protection fault or
kernel NULL pointer dereference:

kernel: general protection fault: 0000 [#1] SMP
kernel: RIP: 0010:[<ffffffffa02f3dde>]  [<ffffffffa02f3dde>] sctp_bind_addr_conflict+0x64/0x82 [sctp]
kernel: Call Trace:
kernel:  [<ffffffffa02f415f>] ? sctp_get_port_local+0x17b/0x2a3 [sctp]
kernel:  [<ffffffffa02f3d45>] ? sctp_bind_addr_match+0x33/0x68 [sctp]
kernel:  [<ffffffffa02f4416>] ? sctp_do_bind+0xd3/0x141 [sctp]
kernel:  [<ffffffffa02f5030>] ? sctp_bindx_add+0x4d/0x8e [sctp]
kernel:  [<ffffffffa02f5183>] ? sctp_setsockopt_bindx+0x112/0x4a4 [sctp]
kernel:  [<ffffffff81089e82>] ? generic_file_aio_write+0x7f/0x9b
kernel:  [<ffffffffa02f763e>] ? sctp_setsockopt+0x14f/0xfee [sctp]
kernel:  [<ffffffff810c11fb>] ? do_sync_write+0xab/0xeb
kernel:  [<ffffffff810e82ab>] ? fsnotify+0x239/0x282
kernel:  [<ffffffff810c2462>] ? alloc_file+0x18/0xb1
kernel:  [<ffffffff8134a0b1>] ? compat_sys_setsockopt+0x1a5/0x1d9
kernel:  [<ffffffff8134aaf1>] ? compat_sys_socketcall+0x143/0x1a4
kernel:  [<ffffffff810467dc>] ? sysenter_dispatch+0x7/0x32

Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

EnJens pushed a commit that referenced this issue Oct 2, 2012

[media] rc: add locking to fix register/show race
When device_add is called in rc_register_device, the rc sysfs nodes show
up, and there's a window in which ir-keytable can be launched via udev
and trigger a show_protocols call, which runs without various rc_dev
fields filled in yet. Add some locking around registration and
store/show_protocols to prevent that from happening.

The problem manifests thusly:

[64692.957872] BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
[64692.957878] IP: [<ffffffffa036a4c1>] show_protocols+0x47/0xf1 [rc_core]
[64692.957890] PGD 19cfc7067 PUD 19cfc6067 PMD 0
[64692.957894] Oops: 0000 [#1] SMP
[64692.957897] last sysfs file: /sys/devices/pci0000:00/0000:00:03.1/usb3/3-1/3-1:1.0/rc/rc2/protocols
[64692.957902] CPU 3
[64692.957903] Modules linked in: redrat3(+) ir_lirc_codec lirc_dev ir_sony_decoder ir_jvc_decoder ir_rc6_decoder ir_rc5_decoder rc_hauppauge ir_nec
_decoder rc_core ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_mi
di_event snd_seq_midi_emul snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem pcsp
kr tg3 snd_hwdep emu10k1_gp snd amd64_edac_mod gameport edac_core soundcore edac_mce_amd k8temp shpchp i2c_piix4 lm63 e100 mii uinput ipv6 raid0 rai
d1 ata_generic firewire_ohci pata_acpi firewire_core crc_itu_t sata_svw pata_serverworks floppy radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core
[last unloaded: redrat3]
[64692.957949] [64692.957952] Pid: 12265, comm: ir-keytable Tainted: G   M    W   2.6.39-rc6+ #2 empty empty/TYAN Thunder K8HM S3892
[64692.957957] RIP: 0010:[<ffffffffa036a4c1>]  [<ffffffffa036a4c1>] show_protocols+0x47/0xf1 [rc_core]
[64692.957962] RSP: 0018:ffff880194509e38  EFLAGS: 00010202
[64692.957964] RAX: 0000000000000000 RBX: ffffffffa036d1e0 RCX: ffffffffa036a47a
[64692.957966] RDX: ffff88019a84d000 RSI: ffffffffa036d1e0 RDI: ffff88019cf2f3f0
[64692.957969] RBP: ffff880194509e68 R08: 0000000000000002 R09: 0000000000000000
[64692.957971] R10: 0000000000000002 R11: 0000000000001617 R12: ffff88019a84d000
[64692.957973] R13: 0000000000001000 R14: ffff8801944d2e38 R15: ffff88019ce5f190
[64692.957976] FS:  00007f0a30c9a720(0000) GS:ffff88019fc00000(0000) knlGS:0000000000000000
[64692.957979] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[64692.957981] CR2: 0000000000000090 CR3: 000000019a8e0000 CR4: 00000000000006e0
[64692.957983] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[64692.957986] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[64692.957989] Process ir-keytable (pid: 12265, threadinfo ffff880194508000, task ffff88019a9fc720)
[64692.957991] Stack:
[64692.957992]  0000000000000002 ffffffffa036d1e0 ffff880194509f58 0000000000001000
[64692.957997]  ffff8801944d2e38 ffff88019ce5f190 ffff880194509e98 ffffffff8131484b
[64692.958001]  ffffffff8118e923 ffffffff810e9b2f ffff880194509e98 ffff8801944d2e18
[64692.958005] Call Trace:
[64692.958014]  [<ffffffff8131484b>] dev_attr_show+0x27/0x4e
[64692.958014]  [<ffffffff8118e923>] ? sysfs_read_file+0x94/0x172
[64692.958014]  [<ffffffff810e9b2f>] ? __get_free_pages+0x16/0x52
[64692.958014]  [<ffffffff8118e94c>] sysfs_read_file+0xbd/0x172
[64692.958014]  [<ffffffff8113205e>] vfs_read+0xac/0xf3
[64692.958014]  [<ffffffff8113347b>] ? fget_light+0x3a/0xa1
[64692.958014]  [<ffffffff811320f2>] sys_read+0x4d/0x74
[64692.958014]  [<ffffffff814c19c2>] system_call_fastpath+0x16/0x1b

Its a bit difficult to reproduce, but I'm fairly confident this has
fixed the problem.

Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

KVM: Fix kvm mmu_notifier initialization order
Like the following, mmu_notifier can be called after registering
immediately. So, kvm have to initialize kvm->mmu_lock before it.

BUG: spinlock bad magic on CPU#0, kswapd0/342
 lock: ffff8800af8c4000, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Pid: 342, comm: kswapd0 Not tainted 2.6.39-rc5+ #1
Call Trace:
 [<ffffffff8118ce61>] spin_bug+0x9c/0xa3
 [<ffffffff8118ce91>] do_raw_spin_lock+0x29/0x13c
 [<ffffffff81024923>] ? flush_tlb_others_ipi+0xaf/0xfd
 [<ffffffff812e22f3>] _raw_spin_lock+0x9/0xb
 [<ffffffffa0582325>] kvm_mmu_notifier_clear_flush_young+0x2c/0x66 [kvm]
 [<ffffffff810d3ff3>] __mmu_notifier_clear_flush_young+0x2b/0x57
 [<ffffffff810c8761>] page_referenced_one+0x88/0xea
 [<ffffffff810c89bf>] page_referenced+0x1fc/0x256
 [<ffffffff810b2771>] shrink_page_list+0x187/0x53a
 [<ffffffff810b2ed7>] shrink_inactive_list+0x1e0/0x33d
 [<ffffffff810acf95>] ? determine_dirtyable_memory+0x15/0x27
 [<ffffffff812e90ee>] ? call_function_single_interrupt+0xe/0x20
 [<ffffffff810b3356>] shrink_zone+0x322/0x3de
 [<ffffffff810a9587>] ? zone_watermark_ok_safe+0xe2/0xf1
 [<ffffffff810b3928>] kswapd+0x516/0x818
 [<ffffffff810b3412>] ? shrink_zone+0x3de/0x3de
 [<ffffffff81053d17>] kthread+0x7d/0x85
 [<ffffffff812e9394>] kernel_thread_helper+0x4/0x10
 [<ffffffff81053c9a>] ? __init_kthread_worker+0x37/0x37
 [<ffffffff812e9390>] ? gs_change+0xb/0xb

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

export_report: do collectcfiles work in perl itself
Avoid spawning a shell pipeline doing cat, grep, sed, and do it all
inside perl.  The <*.c> globbing construct works at least as far back
as 5.8.9

Note that this is not just an optimization; the sed command
in the pipeline was unterminated, due to lack of escape on the
end-of-line (\$) in the regex, resulting in this:

    $ perl ../linux-2.6/scripts/export_report.pl  > /dev/null
    sed: -e expression #1, char 5: unterminated `s' command
    sh: .mod.c/: not found

Comments on an earlier patch sought an all-perl implementation.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
cc: Michal Marek <mmarek@suse.cz>,
cc: linux-kbuild@vger.kernel.org
cc: Arnaud Lacombe lacombar@gmail.com
cc: Stephen Hemminger shemminger@vyatta.com
Signed-off-by: Michal Marek <mmarek@suse.cz>

EnJens pushed a commit that referenced this issue Oct 2, 2012

loop: limit 'max_part' module param to DISK_MAX_PARTS
The 'max_part' parameter controls the number of maximum partition
a loop block device can have. However if a user specifies very
large value it would exceed the limitation of device minor number
and can cause a kernel panic (or, at least, produce invalid
device nodes in some cases).

On my desktop system, following command kills the kernel. On qemu,
it triggers similar oops but the kernel was alive:

$ sudo modprobe loop max_part0000
 ------------[ cut here ]------------
 kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
 invalid opcode: 0000 [#1] SMP
 last sysfs file:
 CPU 0
 Modules linked in: loop(+)

 Pid: 43, comm: insmod Tainted: G        W   2.6.39-qemu+ #155 Bochs Bochs
 RIP: 0010:[<ffffffff8113ce61>]  [<ffffffff8113ce61>] internal_create_group=
+0x2a/0x170
 RSP: 0018:ffff880007b3fde8  EFLAGS: 00000246
 RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4
 RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878
 RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000
 R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50
 R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800
 FS:  0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:00000000000000=
00
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
 Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c=
0)
 Stack:
  ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b
  0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868
  0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8
 Call Trace:
  [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af
  [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e
  [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12
  [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16
  [<ffffffff8116a090>] blk_register_queue+0x47/0xf7
  [<ffffffff8116f527>] add_disk+0xdf/0x290
  [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop]
  [<ffffffffa0006000>] ? 0xffffffffa0005fff
  [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
  [<ffffffff81096804>] sys_init_module+0x9c/0x1e0
  [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b
 Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb=
 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe =
48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49
 RIP  [<ffffffff8113ce61>] internal_create_group+0x2a/0x170
  RSP <ffff880007b3fde8>
 ---[ end trace a123eb592043acad ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

xen: netfront: hold RTNL when updating features.
Konrad reports:
[    0.930811] RTNL: assertion failed at /home/konrad/ssd/linux/net/core/dev.c (5258)
[    0.930821] Pid: 22, comm: xenwatch Not tainted 2.6.39-05193-gd762f43 #1
[    0.930825] Call Trace:
[    0.930834]  [<ffffffff8143bd0e>] __netdev_update_features+0xae/0xe0
[    0.930840]  [<ffffffff8143dd41>] netdev_update_features+0x11/0x30
[    0.930847]  [<ffffffffa0037105>] netback_changed+0x4e5/0x800 [xen_netfront]
[    0.930854]  [<ffffffff8132a838>] xenbus_otherend_changed+0xa8/0xb0
[    0.930860]  [<ffffffff8157ca99>] ? _raw_spin_unlock_irqrestore+0x19/0x20
[    0.930866]  [<ffffffff8132adfe>] backend_changed+0xe/0x10
[    0.930871]  [<ffffffff8132875a>] xenwatch_thread+0xba/0x180
[    0.930876]  [<ffffffff810a8ba0>] ? wake_up_bit+0x40/0x40
[    0.930881]  [<ffffffff813286a0>] ? split+0xf0/0xf0
[    0.930886]  [<ffffffff810a8646>] kthread+0x96/0xa0
[    0.930891]  [<ffffffff815855a4>] kernel_thread_helper+0x4/0x10
[    0.930896]  [<ffffffff815846b3>] ? int_ret_from_sys_call+0x7/0x1b
[    0.930901]  [<ffffffff8157cf61>] ? retint_restore_args+0x5/0x6
[    0.930906]  [<ffffffff815855a0>] ? gs_change+0x13/0x13

This update happens in xenbus watch callback context and hence does not already
hold the rtnl. Take the lock as necessary.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

EnJens pushed a commit that referenced this issue Oct 2, 2012

bonding: prevent deadlock on slave store with alb mode (v3)
This soft lockup was recently reported:

[root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
[root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.
bonding bond5: master_dev is not up in bond_enslave
[root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.

BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
CPU 12:
Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
be2d
Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
RIP: 0010:[<ffffffff80064bf0>]  [<ffffffff80064bf0>]
.text.lock.spinlock+0x26/00
RSP: 0018:ffff810113167da8  EFLAGS: 00000286
RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0

Call Trace:
 [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
 [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
 [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
 [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
 [<ffffffff8006457b>] __down_write_nested+0x12/0x92
 [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
 [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
 [<ffffffff80016b87>] vfs_write+0xce/0x174
 [<ffffffff80017450>] sys_write+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

It occurs because we are able to change the slave configuarion of a bond while
the bond interface is down.  The bonding driver initializes some data structures
only after its ndo_open routine is called.  Among them is the initalization of
the alb tx and rx hash locks.  So if we add or remove a slave without first
opening the bond master device, we run the risk of trying to lock/unlock a
spinlock that has garbage for data in it, which results in our above softlock.

Note that sometimes this works, because in many cases an unlocked spinlock has
the raw_lock parameter initialized to zero (meaning that the kzalloc of the
net_device private data is equivalent to calling spin_lock_init), but thats not
true in all cases, and we aren't guaranteed that condition, so we need to pass
the relevant spinlocks through the spin_lock_init function.

Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
the ndo_init path, so they are ready for use by the bond_store_slaves path.

Change notes:
v2) Based on conversation with Jay and Nicolas it seems that the ability to
enslave devices while the bond master is down should be safe to do.  As such
this is an outlier bug, and so instead we'll just initalize the errant spinlocks
in the init path rather than the open path, solving the problem.  We'll also
remove the warnings about the bond being down during enslave operations, since
it should be safe

v3) Fix spelling error

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: jtluka@redhat.com
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: nicolas.2p.debian@gmail.com
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

EnJens pushed a commit that referenced this issue Oct 2, 2012

brd: limit 'max_part' module param to DISK_MAX_PARTS
The 'max_part' parameter controls the number of maximum partition
a brd device can have. However if a user specifies very large
value it would exceed the limitation of device minor number and
can cause a kernel panic (or, at least, produce invalid device
nodes in some cases).

On my desktop system, following command kills the kernel. On qemu,
it triggers similar oops but the kernel was alive:

$ sudo modprobe brd max_part=100000
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
 IP: [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
 PGD 7af1067 PUD 7b19067 PMD 0
 Oops: 0000 [#1] SMP
 last sysfs file:
 CPU 0
 Modules linked in: brd(+)

 Pid: 44, comm: insmod Tainted: G        W   2.6.39-qemu+ #158 Bochs Bochs
 RIP: 0010:[<ffffffff81110a9a>]  [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
 RSP: 0018:ffff880007b15d78  EFLAGS: 00000286
 RAX: ffff880007b05478 RBX: ffff880007a52760 RCX: ffff880007b15dc8
 RDX: ffff880007a4f900 RSI: ffff880007b15e48 RDI: ffff880007a52760
 RBP: ffff880007b15da8 R08: 0000000000000002 R09: 0000000000000000
 R10: ffff880007b15e48 R11: ffff880007b05478 R12: 0000000000000000
 R13: ffff880007b05478 R14: 0000000000400920 R15: 0000000000000063
 FS:  0000000002160880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000058 CR3: 0000000007b1c000 CR4: 00000000000006b0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
 Process insmod (pid: 44, threadinfo ffff880007b14000, task ffff880007acb980)
 Stack:
  ffff880007b15dc8 ffff880007b05478 ffff880007b15da8 00000000fffffffe
  ffff880007a52760 ffff880007b05478 ffff880007b15de8 ffffffff81143c0a
  0000000000400920 ffff880007a52760 ffff880007b05478 0000000000000000
 Call Trace:
  [<ffffffff81143c0a>] kobject_add_internal+0xdf/0x1a0
  [<ffffffff81143da1>] kobject_add_varg+0x41/0x50
  [<ffffffff81143e6b>] kobject_add+0x64/0x66
  [<ffffffff8113bbe7>] blk_register_queue+0x5f/0xb8
  [<ffffffff81140f72>] add_disk+0xdf/0x289
  [<ffffffffa00040df>] brd_init+0xdf/0x1aa [brd]
  [<ffffffffa0004000>] ? 0xffffffffa0003fff
  [<ffffffffa0004000>] ? 0xffffffffa0003fff
  [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
  [<ffffffff8108516c>] sys_init_module+0x9c/0x1dc
  [<ffffffff812ff4bb>] system_call_fastpath+0x16/0x1b
 Code: 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 18 48 85 ff 75 04 0f 0b eb fe 48 8b 47 18 49 c7 c4 70 1e 4d 81 48 85 c0 74 04 4c 8b 60 30
  8b 44 24 58 45 31 ed 0f b6 c4 85 c0 74 0d 48 8b 43 28 48 89
 RIP  [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
  RSP <ffff880007b15d78>
 CR2: 0000000000000058
 ---[ end trace aebb1175ce1f6739 ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

nbd: limit module parameters to a sane value
The 'max_part' parameter controls the number of maximum partition
a nbd device can have. However if a user specifies very large
value it would exceed the limitation of device minor number and
can cause a kernel oops (or, at least, produce invalid device
nodes in some cases).

In addition, specifying large 'nbds_max' value causes same
problem for the same reason.

On my desktop, following command results to the kernel bug:

$ sudo modprobe nbd max_part=100000
 kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
 invalid opcode: 0000 [#1] SMP
 last sysfs file: /sys/devices/virtual/block/nbd4/range
 CPU 1
 Modules linked in: nbd(+) bridge stp llc kvm_intel kvm asus_atk0110 sg sr_mod cdrom

 Pid: 2522, comm: modprobe Tainted: G        W   2.6.39-leonard+ #159 System manufacturer System Product Name/P5G41TD-M PRO
 RIP: 0010:[<ffffffff8115aa08>]  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
 RSP: 0018:ffff8801009f1de8  EFLAGS: 00010246
 RAX: 00000000ffffffef RBX: ffff880103920478 RCX: 00000000000a7bd3
 RDX: ffffffff81a2dbe0 RSI: 0000000000000000 RDI: ffff880103920478
 RBP: ffff8801009f1e38 R08: ffff880103920468 R09: ffff880103920478
 R10: ffff8801009f1de8 R11: ffff88011eccbb68 R12: ffffffff81a2dbe0
 R13: ffff880103920468 R14: 0000000000000000 R15: ffff880103920400
 FS:  00007f3c49de9700(0000) GS:ffff88011f800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 00007f3b7fe7c000 CR3: 00000000cd58d000 CR4: 00000000000406e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 2522, threadinfo ffff8801009f0000, task ffff8801009a93a0)
 Stack:
  ffff8801009f1e58 ffffffff812e8f6e ffff8801009f1e58 ffffffff812e7a80
  ffff880000000010 ffff880103920400 ffff8801002fd0c0 ffff880103920468
  0000000000000011 ffff880103920400 ffff8801009f1e48 ffffffff8115ab6a
 Call Trace:
  [<ffffffff812e8f6e>] ? device_add+0x4f1/0x5e4
  [<ffffffff812e7a80>] ? dev_set_name+0x41/0x43
  [<ffffffff8115ab6a>] sysfs_create_group+0x13/0x15
  [<ffffffff810b857e>] blk_trace_init_sysfs+0x14/0x16
  [<ffffffff811ee58b>] blk_register_queue+0x4c/0xfd
  [<ffffffff811f3bdf>] add_disk+0xe4/0x29c
  [<ffffffffa007e2ab>] nbd_init+0x2ab/0x30d [nbd]
  [<ffffffffa007e000>] ? 0xffffffffa007dfff
  [<ffffffff8100020f>] do_one_initcall+0x7f/0x13e
  [<ffffffff8107ab0a>] sys_init_module+0xa1/0x1e3
  [<ffffffff814f3542>] system_call_fastpath+0x16/0x1b
 Code: 41 57 41 56 41 55 41 54 53 48 83 ec 28 0f 1f 44 00 00 48 89 fb 41 89 f6 49 89 d4 48 85 ff 74 0b 85 f6 75 0b 48 83
  7f 30 00 75 14 <0f> 0b eb fe b9 ea ff ff ff 48 83 7f 30 00 0f 84 09 01 00 00 49
 RIP  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
  RSP <ffff8801009f1de8>
 ---[ end trace 753285ffbf72c57c ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

rcu: Avoid acquiring rcu_node locks in timer functions
This commit switches manipulations of the rcu_node ->wakemask field
to atomic operations, which allows rcu_cpu_kthread_timer() to avoid
acquiring the rcu_node lock.  This should avoid the following lockdep
splat reported by Valdis Kletnieks:

[   12.872150] usb 1-4: new high speed USB device number 3 using ehci_hcd
[   12.986667] usb 1-4: New USB device found, idVendor=413c, idProduct=2513
[   12.986679] usb 1-4: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[   12.987691] hub 1-4:1.0: USB hub found
[   12.987877] hub 1-4:1.0: 3 ports detected
[   12.996372] input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input10
[   13.071471] udevadm used greatest stack depth: 3984 bytes left
[   13.172129]
[   13.172130] =======================================================
[   13.172425] [ INFO: possible circular locking dependency detected ]
[   13.172650] 2.6.39-rc6-mmotm0506 #1
[   13.172773] -------------------------------------------------------
[   13.172997] blkid/267 is trying to acquire lock:
[   13.173009]  (&p->pi_lock){-.-.-.}, at: [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa
[   13.173009]
[   13.173009] but task is already holding lock:
[   13.173009]  (rcu_node_level_0){..-...}, at: [<ffffffff810901cc>] rcu_cpu_kthread_timer+0x27/0x58
[   13.173009]
[   13.173009] which lock already depends on the new lock.
[   13.173009]
[   13.173009]
[   13.173009] the existing dependency chain (in reverse order) is:
[   13.173009]
[   13.173009] -> #2 (rcu_node_level_0){..-...}:
[   13.173009]        [<ffffffff810679b9>] check_prevs_add+0x8b/0x104
[   13.173009]        [<ffffffff81067da1>] validate_chain+0x36f/0x3ab
[   13.173009]        [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2
[   13.173009]        [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c
[   13.173009]        [<ffffffff815697f1>] _raw_spin_lock+0x36/0x45
[   13.173009]        [<ffffffff81090794>] rcu_read_unlock_special+0x8c/0x1d5
[   13.173009]        [<ffffffff8109092c>] __rcu_read_unlock+0x4f/0xd7
[   13.173009]        [<ffffffff81027bd3>] rcu_read_unlock+0x21/0x23
[   13.173009]        [<ffffffff8102cc34>] cpuacct_charge+0x6c/0x75
[   13.173009]        [<ffffffff81030cc6>] update_curr+0x101/0x12e
[   13.173009]        [<ffffffff810311d0>] check_preempt_wakeup+0xf7/0x23b
[   13.173009]        [<ffffffff8102acb3>] check_preempt_curr+0x2b/0x68
[   13.173009]        [<ffffffff81031d40>] ttwu_do_wakeup+0x76/0x128
[   13.173009]        [<ffffffff81031e49>] ttwu_do_activate.constprop.63+0x57/0x5c
[   13.173009]        [<ffffffff81031e96>] scheduler_ipi+0x48/0x5d
[   13.173009]        [<ffffffff810177d5>] smp_reschedule_interrupt+0x16/0x18
[   13.173009]        [<ffffffff815710f3>] reschedule_interrupt+0x13/0x20
[   13.173009]        [<ffffffff810b66d1>] rcu_read_unlock+0x21/0x23
[   13.173009]        [<ffffffff810b739c>] find_get_page+0xa9/0xb9
[   13.173009]        [<ffffffff810b8b48>] filemap_fault+0x6a/0x34d
[   13.173009]        [<ffffffff810d1a25>] __do_fault+0x54/0x3e6
[   13.173009]        [<ffffffff810d447a>] handle_pte_fault+0x12c/0x1ed
[   13.173009]        [<ffffffff810d48f7>] handle_mm_fault+0x1cd/0x1e0
[   13.173009]        [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de
[   13.173009]        [<ffffffff8156a75f>] page_fault+0x1f/0x30
[   13.173009]
[   13.173009] -> #1 (&rq->lock){-.-.-.}:
[   13.173009]        [<ffffffff810679b9>] check_prevs_add+0x8b/0x104
[   13.173009]        [<ffffffff81067da1>] validate_chain+0x36f/0x3ab
[   13.173009]        [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2
[   13.173009]        [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c
[   13.173009]        [<ffffffff815697f1>] _raw_spin_lock+0x36/0x45
[   13.173009]        [<ffffffff81027e19>] __task_rq_lock+0x8b/0xd3
[   13.173009]        [<ffffffff81032f7f>] wake_up_new_task+0x41/0x108
[   13.173009]        [<ffffffff810376c3>] do_fork+0x265/0x33f
[   13.173009]        [<ffffffff81007d02>] kernel_thread+0x6b/0x6d
[   13.173009]        [<ffffffff8153a9dd>] rest_init+0x21/0xd2
[   13.173009]        [<ffffffff81b1db4f>] start_kernel+0x3bb/0x3c6
[   13.173009]        [<ffffffff81b1d29f>] x86_64_start_reservations+0xaf/0xb3
[   13.173009]        [<ffffffff81b1d393>] x86_64_start_kernel+0xf0/0xf7
[   13.173009]
[   13.173009] -> #0 (&p->pi_lock){-.-.-.}:
[   13.173009]        [<ffffffff81067788>] check_prev_add+0x68/0x20e
[   13.173009]        [<ffffffff810679b9>] check_prevs_add+0x8b/0x104
[   13.173009]        [<ffffffff81067da1>] validate_chain+0x36f/0x3ab
[   13.173009]        [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2
[   13.173009]        [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c
[   13.173009]        [<ffffffff815698ea>] _raw_spin_lock_irqsave+0x44/0x57
[   13.173009]        [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa
[   13.173009]        [<ffffffff81032f3c>] wake_up_process+0x10/0x12
[   13.173009]        [<ffffffff810901e9>] rcu_cpu_kthread_timer+0x44/0x58
[   13.173009]        [<ffffffff81045286>] call_timer_fn+0xac/0x1e9
[   13.173009]        [<ffffffff8104556d>] run_timer_softirq+0x1aa/0x1f2
[   13.173009]        [<ffffffff8103e487>] __do_softirq+0x109/0x26a
[   13.173009]        [<ffffffff8157144c>] call_softirq+0x1c/0x30
[   13.173009]        [<ffffffff81003207>] do_softirq+0x44/0xf1
[   13.173009]        [<ffffffff8103e8b9>] irq_exit+0x58/0xc8
[   13.173009]        [<ffffffff81017f5a>] smp_apic_timer_interrupt+0x79/0x87
[   13.173009]        [<ffffffff81570fd3>] apic_timer_interrupt+0x13/0x20
[   13.173009]        [<ffffffff810bd51a>] get_page_from_freelist+0x2aa/0x310
[   13.173009]        [<ffffffff810bdf03>] __alloc_pages_nodemask+0x178/0x243
[   13.173009]        [<ffffffff8101fe2f>] pte_alloc_one+0x1e/0x3a
[   13.173009]        [<ffffffff810d27fe>] __pte_alloc+0x22/0x14b
[   13.173009]        [<ffffffff810d48a8>] handle_mm_fault+0x17e/0x1e0
[   13.173009]        [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de
[   13.173009]        [<ffffffff8156a75f>] page_fault+0x1f/0x30
[   13.173009]
[   13.173009] other info that might help us debug this:
[   13.173009]
[   13.173009] Chain exists of:
[   13.173009]   &p->pi_lock --> &rq->lock --> rcu_node_level_0
[   13.173009]
[   13.173009]  Possible unsafe locking scenario:
[   13.173009]
[   13.173009]        CPU0                    CPU1
[   13.173009]        ----                    ----
[   13.173009]   lock(rcu_node_level_0);
[   13.173009]                                lock(&rq->lock);
[   13.173009]                                lock(rcu_node_level_0);
[   13.173009]   lock(&p->pi_lock);
[   13.173009]
[   13.173009]  *** DEADLOCK ***
[   13.173009]
[   13.173009] 3 locks held by blkid/267:
[   13.173009]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8156cdb4>] do_page_fault+0x1f3/0x5de
[   13.173009]  #1:  (&yield_timer){+.-...}, at: [<ffffffff810451da>] call_timer_fn+0x0/0x1e9
[   13.173009]  #2:  (rcu_node_level_0){..-...}, at: [<ffffffff810901cc>] rcu_cpu_kthread_timer+0x27/0x58
[   13.173009]
[   13.173009] stack backtrace:
[   13.173009] Pid: 267, comm: blkid Not tainted 2.6.39-rc6-mmotm0506 #1
[   13.173009] Call Trace:
[   13.173009]  <IRQ>  [<ffffffff8154a529>] print_circular_bug+0xc8/0xd9
[   13.173009]  [<ffffffff81067788>] check_prev_add+0x68/0x20e
[   13.173009]  [<ffffffff8100c861>] ? save_stack_trace+0x28/0x46
[   13.173009]  [<ffffffff810679b9>] check_prevs_add+0x8b/0x104
[   13.173009]  [<ffffffff81067da1>] validate_chain+0x36f/0x3ab
[   13.173009]  [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2
[   13.173009]  [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa
[   13.173009]  [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c
[   13.173009]  [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa
[   13.173009]  [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82
[   13.173009]  [<ffffffff815698ea>] _raw_spin_lock_irqsave+0x44/0x57
[   13.173009]  [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa
[   13.173009]  [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa
[   13.173009]  [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82
[   13.173009]  [<ffffffff81032f3c>] wake_up_process+0x10/0x12
[   13.173009]  [<ffffffff810901e9>] rcu_cpu_kthread_timer+0x44/0x58
[   13.173009]  [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82
[   13.173009]  [<ffffffff81045286>] call_timer_fn+0xac/0x1e9
[   13.173009]  [<ffffffff810451da>] ? del_timer+0x75/0x75
[   13.173009]  [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82
[   13.173009]  [<ffffffff8104556d>] run_timer_softirq+0x1aa/0x1f2
[   13.173009]  [<ffffffff8103e487>] __do_softirq+0x109/0x26a
[   13.173009]  [<ffffffff8106365f>] ? tick_dev_program_event+0x37/0xf6
[   13.173009]  [<ffffffff810a0e4a>] ? time_hardirqs_off+0x1b/0x2f
[   13.173009]  [<ffffffff8157144c>] call_softirq+0x1c/0x30
[   13.173009]  [<ffffffff81003207>] do_softirq+0x44/0xf1
[   13.173009]  [<ffffffff8103e8b9>] irq_exit+0x58/0xc8
[   13.173009]  [<ffffffff81017f5a>] smp_apic_timer_interrupt+0x79/0x87
[   13.173009]  [<ffffffff81570fd3>] apic_timer_interrupt+0x13/0x20
[   13.173009]  <EOI>  [<ffffffff810bd384>] ? get_page_from_freelist+0x114/0x310
[   13.173009]  [<ffffffff810bd51a>] ? get_page_from_freelist+0x2aa/0x310
[   13.173009]  [<ffffffff812220e7>] ? clear_page_c+0x7/0x10
[   13.173009]  [<ffffffff810bd1ef>] ? prep_new_page+0x14c/0x1cd
[   13.173009]  [<ffffffff810bd51a>] get_page_from_freelist+0x2aa/0x310
[   13.173009]  [<ffffffff810bdf03>] __alloc_pages_nodemask+0x178/0x243
[   13.173009]  [<ffffffff810d46b9>] ? __pmd_alloc+0x87/0x99
[   13.173009]  [<ffffffff8101fe2f>] pte_alloc_one+0x1e/0x3a
[   13.173009]  [<ffffffff810d46b9>] ? __pmd_alloc+0x87/0x99
[   13.173009]  [<ffffffff810d27fe>] __pte_alloc+0x22/0x14b
[   13.173009]  [<ffffffff810d48a8>] handle_mm_fault+0x17e/0x1e0
[   13.173009]  [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de
[   13.173009]  [<ffffffff810d915f>] ? sys_brk+0x32/0x10c
[   13.173009]  [<ffffffff810a0e4a>] ? time_hardirqs_off+0x1b/0x2f
[   13.173009]  [<ffffffff81065c4f>] ? trace_hardirqs_off_caller+0x3f/0x9c
[   13.173009]  [<ffffffff812235dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[   13.173009]  [<ffffffff8156a75f>] page_fault+0x1f/0x30
[   14.010075] usb 5-1: new full speed USB device number 2 using uhci_hcd

Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

EnJens pushed a commit that referenced this issue Oct 2, 2012

[S390] mm: fix storage key handling
page_get_storage_key() and page_set_storage_key() expect a page address
and not its page frame number. This got inconsistent with 2d42552
"[S390] merge page_test_dirty and page_clear_dirty".

Result is that we read/write storage keys from random pages and do not
have a working dirty bit tracking at all.
E.g. SetPageUpdate() doesn't clear the dirty bit of requested pages, which
for example ext4 doesn't like very much and panics after a while.

Unable to handle kernel paging request at virtual user address (null)
Oops: 0004 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
CPU: 1 Not tainted 2.6.39-07551-g139f37f-dirty #152
Process flush-94:0 (pid: 1576, task: 000000003eb34538, ksp: 000000003c287b70)
Krnl PSW : 0704c00180000000 0000000000316b12 (jbd2_journal_file_inode+0x10e/0x138)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000 0700000000000000
           0000000000316a62 000000003eb34cd0 0000000000000025 000000003c287b88
           0000000000000001 000000003c287a70 000000003f1ec678 000000003f1ec000
           0000000000000000 000000003e66ec00 0000000000316a62 000000003c287988
Krnl Code: 0000000000316b04: f0a0000407f4       srp     4(11,%r0),2036,0
           0000000000316b0a: b9020022           ltgr    %r2,%r2
           0000000000316b0e: a7740015           brc     7,316b38
          >0000000000316b12: e3d0c0000024       stg     %r13,0(%r12)
           0000000000316b18: 4120c010           la      %r2,16(%r12)
           0000000000316b1c: 4130d060           la      %r3,96(%r13)
           0000000000316b20: e340d0600004       lg      %r4,96(%r13)
           0000000000316b26: c0e50002b567       brasl   %r14,36d5f4
Call Trace:
([<0000000000316a62>] jbd2_journal_file_inode+0x5e/0x138)
 [<00000000002da13c>] mpage_da_map_and_submit+0x2e8/0x42c
 [<00000000002daac2>] ext4_da_writepages+0x2da/0x504
 [<00000000002597e8>] writeback_single_inode+0xf8/0x268
 [<0000000000259f06>] writeback_sb_inodes+0xd2/0x18c
 [<000000000025a700>] writeback_inodes_wb+0x80/0x168
 [<000000000025aa92>] wb_writeback+0x2aa/0x324
 [<000000000025abde>] wb_do_writeback+0xd2/0x274
 [<000000000025ae3a>] bdi_writeback_thread+0xba/0x1c4
 [<00000000001737be>] kthread+0xa6/0xb0
 [<000000000056c1da>] kernel_thread_starter+0x6/0xc
 [<000000000056c1d4>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
Last Breaking-Event-Address:
 [<0000000000316a8a>] jbd2_journal_file_inode+0x86/0x138

Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

dmaengine: shdma: Fix up fallout from runtime PM changes.
The runtime PM changes introduce sh_dmae_rst() wrapping via the
runtime_resume helper, depending on dev_get_drvdata() to fetch the
platform data needed for the DMAOR initialization default at a time
where drvdata hasn't yet been established by the probe path, resulting
in general probe misery:

        Unable to handle kernel NULL pointer dereference at virtual address 000000c4
        pc = 8025adee
        *pde = 00000000
        Oops: 0000 [#1]
        Modules linked in:

        Pid : 1, Comm:           swapper
        CPU : 0                  Not tainted  (3.0.0-rc1-00012-g9436b4a-dirty #1456)

        PC is at sh_dmae_rst+0x28/0x86
        PR is at sh_dmae_rst+0x22/0x86
        PC  : 8025adee SP  : 9e803d10 SR  : 400080f1 TEA : 000000c4
        R0  : 000000c4 R1  : 0000fff8 R2  : 00000000 R3  : 00000040
        R4  : 000000f0 R5  : 00000000 R6  : 00000000 R7  : 804f184c
        R8  : 00000000 R9  : 804dd0e8 R10 : 80283204 R11 : ffffffda
        R12 : 000000a0 R13 : 804dd18c R14 : 9e803d10
        MACH: 00000000 MACL: 00008f20 GBR : 00000000 PR  : 8025ade8

        Call trace:
        [<8025ae70>] sh_dmae_runtime_resume+0x24/0x34
        [<80283238>] pm_generic_runtime_resume+0x34/0x3c
        [<80283370>] rpm_callback+0x4a/0x7e
        [<80283efc>] rpm_resume+0x240/0x384
        [<80283f54>] rpm_resume+0x298/0x384
        [<8028428c>] __pm_runtime_resume+0x44/0x7c
        [<8038a358>] __ioremap_caller+0x0/0xec
        [<80284296>] __pm_runtime_resume+0x4e/0x7c
        [<8038a358>] __ioremap_caller+0x0/0xec
        [<80666254>] sh_dmae_probe+0x180/0x6a0
        [<802803ae>] platform_drv_probe+0x26/0x2e

Fix up the ordering accordingly.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

oprofile: Fix locking dependency in sync_start()
This fixes the A->B/B->A locking dependency, see the warning below.

The function task_exit_notify() is called with (task_exit_notifier)
.rwsem set and then calls sync_buffer() which locks buffer_mutex. In
sync_start() the buffer_mutex was set to prevent notifier functions to
be started before sync_start() is finished. But when registering the
notifier, (task_exit_notifier).rwsem is locked too, but now in
different order than in sync_buffer(). In theory this causes a locking
dependency, what does not occur in practice since task_exit_notify()
is always called after the notifier is registered which means the lock
is already released.

However, after checking the notifier functions it turned out the
buffer_mutex in sync_start() is unnecessary. This is because
sync_buffer() may be called from the notifiers even if sync_start()
did not finish yet, the buffers are already allocated but empty. No
need to protect this with the mutex.

So we fix this theoretical locking dependency by removing buffer_mutex
in sync_start(). This is similar to the implementation before commit:

 750d857 oprofile: fix crash when accessing freed task structs

which introduced the locking dependency.

Lockdep warning:

oprofiled/4447 is trying to acquire lock:
 (buffer_mutex){+.+...}, at: [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]

but task is already holding lock:
 ((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 ((task_exit_notifier).rwsem){++++..}:
       [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
       [<ffffffff81463a2b>] down_write+0x44/0x67
       [<ffffffff810581c0>] blocking_notifier_chain_register+0x52/0x8b
       [<ffffffff8105a6ac>] profile_event_register+0x2d/0x2f
       [<ffffffffa00013c1>] sync_start+0x47/0xc6 [oprofile]
       [<ffffffffa00001bb>] oprofile_setup+0x60/0xa5 [oprofile]
       [<ffffffffa00014e3>] event_buffer_open+0x59/0x8c [oprofile]
       [<ffffffff810cd3b9>] __dentry_open+0x1eb/0x308
       [<ffffffff810cd59d>] nameidata_to_filp+0x60/0x67
       [<ffffffff810daad6>] do_last+0x5be/0x6b2
       [<ffffffff810dbc33>] path_openat+0xc7/0x360
       [<ffffffff810dbfc5>] do_filp_open+0x3d/0x8c
       [<ffffffff810ccfd2>] do_sys_open+0x110/0x1a9
       [<ffffffff810cd09e>] sys_open+0x20/0x22
       [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

-> #0 (buffer_mutex){+.+...}:
       [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
       [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
       [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
       [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
       [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
       [<ffffffff81467b96>] notifier_call_chain+0x37/0x63
       [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
       [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
       [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
       [<ffffffff81039e8f>] do_exit+0x2a/0x6fc
       [<ffffffff8103a5e4>] do_group_exit+0x83/0xae
       [<ffffffff8103a626>] sys_exit_group+0x17/0x1b
       [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

1 lock held by oprofiled/4447:
 #0:  ((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67

stack backtrace:
Pid: 4447, comm: oprofiled Not tainted 2.6.39-00007-gcf4d8d4 #10
Call Trace:
 [<ffffffff81063193>] print_circular_bug+0xae/0xbc
 [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
 [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
 [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
 [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
 [<ffffffff81062627>] ? mark_lock+0x42f/0x552
 [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
 [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
 [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
 [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
 [<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67
 [<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67
 [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
 [<ffffffff81467b96>] notifier_call_chain+0x37/0x63
 [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
 [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
 [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
 [<ffffffff81039e8f>] do_exit+0x2a/0x6fc
 [<ffffffff81465031>] ? retint_swapgs+0xe/0x13
 [<ffffffff8103a5e4>] do_group_exit+0x83/0xae
 [<ffffffff8103a626>] sys_exit_group+0x17/0x1b
 [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Carl Love <carll@us.ibm.com>
Cc: <stable@kernel.org> # .36+
Signed-off-by: Robert Richter <robert.richter@amd.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

oprofile, dcookies: Fix possible circular locking dependency
The lockdep warning below detects a possible A->B/B->A locking
dependency of mm->mmap_sem and dcookie_mutex. The order in
sync_buffer() is mm->mmap_sem/dcookie_mutex, while in
sys_lookup_dcookie() it is vice versa.

Fixing it in sys_lookup_dcookie() by unlocking dcookie_mutex before
copy_to_user().

oprofiled/4432 is trying to acquire lock:
 (&mm->mmap_sem){++++++}, at: [<ffffffff810b444b>] might_fault+0x53/0xa3

but task is already holding lock:
 (dcookie_mutex){+.+.+.}, at: [<ffffffff81124d28>] sys_lookup_dcookie+0x45/0x149

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (dcookie_mutex){+.+.+.}:
       [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
       [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
       [<ffffffff81124e5c>] get_dcookie+0x30/0x144
       [<ffffffffa0000fba>] sync_buffer+0x196/0x3ec [oprofile]
       [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
       [<ffffffff81467b96>] notifier_call_chain+0x37/0x63
       [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
       [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
       [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
       [<ffffffff81039e8f>] do_exit+0x2a/0x6fc
       [<ffffffff8103a5e4>] do_group_exit+0x83/0xae
       [<ffffffff8103a626>] sys_exit_group+0x17/0x1b
       [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

-> #0 (&mm->mmap_sem){++++++}:
       [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
       [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
       [<ffffffff810b4478>] might_fault+0x80/0xa3
       [<ffffffff81124de7>] sys_lookup_dcookie+0x104/0x149
       [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

1 lock held by oprofiled/4432:
 #0:  (dcookie_mutex){+.+.+.}, at: [<ffffffff81124d28>] sys_lookup_dcookie+0x45/0x149

stack backtrace:
Pid: 4432, comm: oprofiled Not tainted 2.6.39-00008-ge5a450d #9
Call Trace:
 [<ffffffff81063193>] print_circular_bug+0xae/0xbc
 [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
 [<ffffffff8102ef13>] ? get_parent_ip+0x11/0x42
 [<ffffffff810b444b>] ? might_fault+0x53/0xa3
 [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
 [<ffffffff810b444b>] ? might_fault+0x53/0xa3
 [<ffffffff810d7d54>] ? path_put+0x22/0x27
 [<ffffffff810b4478>] might_fault+0x80/0xa3
 [<ffffffff810b444b>] ? might_fault+0x53/0xa3
 [<ffffffff81124de7>] sys_lookup_dcookie+0x104/0x149
 [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

References: https://bugzilla.kernel.org/show_bug.cgi?id=13809
Cc: <stable@kernel.org> # .27+
Signed-off-by: Robert Richter <robert.richter@amd.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

ssb: fix PCI(e) driver regression causing oops on PCI cards
We were incorrectly executing PCIe specific workarounds on PCI cards.
This resulted in:
Machine check in kernel mode.
Caused by (from SRR1=149030): Transfer error ack signal
Oops: Machine check, sig: 7 [#1]

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

st_kim: Handle case of no device found for ID 0
Running ktest.pl, I hit this bug:

[   19.780654] BUG: unable to handle kernel NULL pointer dereference at 0000000c
[   19.780660] IP: [<c112efcd>] dev_get_drvdata+0xc/0x46
[   19.780669] *pdpt = 0000000031daf001 *pde = 0000000000000000
[   19.780673] Oops: 0000 [#1] SMP
[   19.780680] Dumping ftrace buffer:^M
[   19.780685]    (ftrace buffer empty)
[   19.780687] Modules linked in: ide_pci_generic firewire_ohci firewire_core evbug crc_itu_t e1000 ide_core i2c_i801 iTCO_wdt
[   19.780697]
[   19.780700] Pid: 346, comm: v4l_id Not tainted 2.6.39-test-02740-gcaebc16-dirty #4                  /DG965MQ
[   19.780706] EIP: 0060:[<c112efcd>] EFLAGS: 00010202 CPU: 0
[   19.780709] EIP is at dev_get_drvdata+0xc/0x46
[   19.780712] EAX: 00000008 EBX: f1e37da4 ECX: 00000000 EDX: 00000000
[   19.780715] ESI: f1c3f200 EDI: c33ec95c EBP: f1e37d80 ESP: f1e37d80
[   19.780718]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   19.780721] Process v4l_id (pid: 346, ti=f1e36000 task=f2bc2a60 task.ti=f1e36000)
[   19.780723] Stack:
[   19.780725]  f1e37d8c c117d395 c33ec93c f1e37db4 c117a0f9 00000002 00000000 c1725e54
[   19.780732]  00000001 00000007 f2918c90 f1c3f200 c33ec95c f1e37dd4 c1789d3d 22222222
[   19.780740]  22222222 22222222 f2918c90 f1c3f200 f29194f4 f1e37de8 c178d5c4 c1725e54
[   19.780747] Call Trace:
[   19.780752]  [<c117d395>] st_kim_ref+0x28/0x41
[   19.780756]  [<c117a0f9>] st_register+0x29/0x562
[   19.780761]  [<c1725e54>] ? v4l2_open+0x111/0x1e3
[   19.780766]  [<c1789d3d>] fmc_prepare+0x97/0x424
[   19.780770]  [<c178d5c4>] fm_v4l2_fops_open+0x70/0x106
[   19.780773]  [<c1725e54>] ? v4l2_open+0x111/0x1e3
[   19.780777]  [<c1725e9b>] v4l2_open+0x158/0x1e3
[   19.780782]  [<c065173b>] chrdev_open+0x22c/0x276
[   19.780787]  [<c0647c4e>] __dentry_open+0x35c/0x581
[   19.780792]  [<c06498f9>] nameidata_to_filp+0x7c/0x96
[   19.780795]  [<c065150f>] ? cdev_put+0x57/0x57
[   19.780800]  [<c0660cad>] do_last+0x743/0x9d4
[   19.780804]  [<c065d5fc>] ? path_init+0x1ee/0x596
[   19.780808]  [<c0661481>] path_openat+0x10c/0x597
[   19.780813]  [<c05204a1>] ? trace_hardirqs_off+0x27/0x37
[   19.780817]  [<c0509651>] ? local_clock+0x78/0xc7
[   19.780821]  [<c0661945>] do_filp_open+0x39/0xc2
[   19.780827]  [<c1cabc76>] ? _raw_spin_unlock+0x4c/0x5d^M
[   19.780831]  [<c0674ccd>] ? alloc_fd+0x19e/0x1b7
[   19.780836]  [<c06499ca>] do_sys_open+0xb7/0x1bd
[   19.780840]  [<c0608eea>] ? sys_munmap+0x78/0x8d
[   19.780844]  [<c0649b06>] sys_open+0x36/0x58
[   19.780849]  [<c1cb809f>] sysenter_do_call+0x12/0x38
[   19.780852] Code: d8 2f 20 c3 01 83 15 dc 2f 20 c3 00 f0 ff 00 83 05 e0 2f 20 c3 01 83 15 e4 2f 20 c3 00 5d c3 55 89 e5 3e 8d 74 26 00 85 c0 74 28 <8b> 40 04 83 05 e8 2f 20 c3 01 83 15 ec 2f 20 c3 00 85 c0 74 13 ^M
[   19.780889] EIP: [<c112efcd>] dev_get_drvdata+0xc/0x46 SS:ESP 0068:f1e37d80
[   19.780894] CR2: 000000000000000c
[   19.780898] ---[ end trace e7d1d0f6a2d1d390 ]---

The id of 0 passed to st_kim_ref() found no device, keeping pdev null,
and causing pdev->dev cause a NULL pointer dereference. After having
st_kim_ref() check for NULL, the st_unregister() function needed to be
updated to handle the case that st_gdata was not set by the
st_kim_ref().

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

slob/lockdep: Fix gfp flags passed to lockdep
Doing a ktest.pl randconfig, I stumbled across the following bug
on boot up:

------------[ cut here ]------------
WARNING: at /home/rostedt/work/autotest/nobackup/linux-test.git/kernel/lockdep.c:2649 lockdep_trace_alloc+0xed/0x100()
Hardware name:
Modules linked in:
Pid: 0, comm: swapper Not tainted 3.0.0-rc1-test-00054-g1d68b67 #1
Call Trace:
 [<ffffffff810626ad>] warn_slowpath_common+0xad/0xf0
 [<ffffffff8106270a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff810b537d>] lockdep_trace_alloc+0xed/0x100
 [<ffffffff81182fb0>] __kmalloc_node+0x30/0x2f0
 [<ffffffff81153eda>] pcpu_mem_alloc+0x13a/0x180
 [<ffffffff82be022c>] percpu_init_late+0x48/0xc2
 [<ffffffff82bd630c>] ? mem_init+0xd8/0xe3
 [<ffffffff82bbcc73>] start_kernel+0x1c2/0x449
 [<ffffffff82bbc35c>] x86_64_start_reservations+0x163/0x167
 [<ffffffff82bbc493>] x86_64_start_kernel+0x133/0x142^M
---[ end trace a7919e7f17c0a725 ]---

Then I ran a ktest.pl config_bisect and it came up with this config
as the problem:

  CONFIG_SLOB

Looking at what is different between SLOB and SLAB and SLUB, I found
that the gfp flags are masked against gfp_allowed_mask in
SLAB and SLUB, but not SLOB.

On boot up, interrupts are disabled and lockdep will warn if some flags
are set in gfp and interrupts are disabled. But these flags are masked
off with the gfp_allowed_mask during boot. Because SLOB does not
mask the flags against gfp_allowed_mask it triggers the warn on.

Adding this mask fixes the bug. I also found that kmem_cache_alloc_node()
was missing both the mask and the lockdep check, and that was added too.

Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

md: check ->hot_remove_disk when removing disk
Check pers->hot_remove_disk instead of pers->hot_add_disk in slot_store()
during disk removal. The linear personality only has ->hot_add_disk and
no ->hot_remove_disk, so that removing disk in the array resulted to
following kernel bug:

$ sudo mdadm --create /dev/md0 --level=linear --raid-devices=4 /dev/loop[0-3]
$ echo none | sudo tee /sys/block/md0/md/dev-loop2/slot
 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [<          (null)>]           (null)
 PGD c9f5d067 PUD 8575a067 PMD 0
 Oops: 0010 [#1] SMP
 CPU 2
 Modules linked in: linear loop bridge stp llc kvm_intel kvm asus_atk0110 sr_mod cdrom sg

 Pid: 10450, comm: tee Not tainted 3.0.0-rc1-leonard+ #173 System manufacturer System Product Name/P5G41TD-M PRO
 RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
 RSP: 0018:ffff880085757df0  EFLAGS: 00010282
 RAX: ffffffffa00168e0 RBX: ffff8800d1431800 RCX: 000000000000006e
 RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff88008543c000
 RBP: ffff880085757e48 R08: 0000000000000002 R09: 000000000000000a
 R10: 0000000000000000 R11: ffff88008543c2e0 R12: 00000000ffffffff
 R13: ffff8800b4641000 R14: 0000000000000005 R15: 0000000000000000
 FS:  00007fe8c9e05700(0000) GS:ffff88011fa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 00000000b4502000 CR4: 00000000000406e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process tee (pid: 10450, threadinfo ffff880085756000, task ffff8800c9f08000)
 Stack:
  ffffffff8138496a ffff8800b4641000 ffff88008543c268 0000000000000000
  ffff8800b4641000 ffff88008543c000 ffff8800d1431868 ffffffff81a78a90
  ffff8800b4641000 ffff88008543c000 ffff8800d1431800 ffff880085757e98
 Call Trace:
  [<ffffffff8138496a>] ? slot_store+0xaa/0x265
  [<ffffffff81384bae>] rdev_attr_store+0x89/0xa8
  [<ffffffff8115a96a>] sysfs_write_file+0x108/0x144
  [<ffffffff81106b87>] vfs_write+0xb1/0x10d
  [<ffffffff8106e6c0>] ? trace_hardirqs_on_caller+0x111/0x135
  [<ffffffff81106cac>] sys_write+0x4d/0x77
  [<ffffffff814fe702>] system_call_fastpath+0x16/0x1b
 Code:  Bad RIP value.
 RIP  [<          (null)>]           (null)
  RSP <ffff880085757df0>
 CR2: 0000000000000000
 ---[ end trace ba5fc64319a826fb ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

AppArmor: Fix sleep in invalid context from task_setrlimit
Affected kernels 2.6.36 - 3.0

AppArmor may do a GFP_KERNEL memory allocation with task_lock(tsk->group_leader);
held when called from security_task_setrlimit.  This will only occur when the
task's current policy has been replaced, and the task's creds have not been
updated before entering the LSM security_task_setrlimit() hook.

BUG: sleeping function called from invalid context at mm/slub.c:847
 in_atomic(): 1, irqs_disabled(): 0, pid: 1583, name: cupsd
 2 locks held by cupsd/1583:
  #0:  (tasklist_lock){.+.+.+}, at: [<ffffffff8104dafa>] do_prlimit+0x61/0x189
  #1:  (&(&p->alloc_lock)->rlock){+.+.+.}, at: [<ffffffff8104db2d>]
do_prlimit+0x94/0x189
 Pid: 1583, comm: cupsd Not tainted 3.0.0-rc2-git1 #7
 Call Trace:
  [<ffffffff8102ebf2>] __might_sleep+0x10d/0x112
  [<ffffffff810e6f46>] slab_pre_alloc_hook.isra.49+0x2d/0x33
  [<ffffffff810e7bc4>] kmem_cache_alloc+0x22/0x132
  [<ffffffff8105b6e6>] prepare_creds+0x35/0xe4
  [<ffffffff811c0675>] aa_replace_current_profile+0x35/0xb2
  [<ffffffff811c4d2d>] aa_current_profile+0x45/0x4c
  [<ffffffff811c4d4d>] apparmor_task_setrlimit+0x19/0x3a
  [<ffffffff811beaa5>] security_task_setrlimit+0x11/0x13
  [<ffffffff8104db6b>] do_prlimit+0xd2/0x189
  [<ffffffff8104dea9>] sys_setrlimit+0x3b/0x48
  [<ffffffff814062bb>] system_call_fastpath+0x16/0x1b

Signed-off-by: John Johansen <john.johansen@canonical.com>
Reported-by: Miles Lane <miles.lane@gmail.com>
Cc: stable@kernel.org
Signed-off-by: James Morris <jmorris@namei.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

mwifiex: Fixing NULL pointer dereference
Following OOPS was seen when booting with card inserted

 BUG: unable to handle kernel NULL pointer dereference at 0000004c
 IP: [<f8b7718c>] cfg80211_get_drvinfo+0x21/0x115 [cfg80211]
 *pde = 00000000
 Oops: 0000 [#1] SMP
 Modules linked in: iwl3945 iwl_legacy mwifiex_sdio mac80211 11 sdhci_pci sdhci pl2303

'ethtool' on the mwifiex device returned this OOPS as
wiphy_dev() returned NULL.

Adding missing set_wiphy_dev() call to fix the problem.

Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

oprofile, x86: Fix nmi-unsafe callgraph support
Current oprofile's x86 callgraph support may trigger page faults
throwing the BUG_ON(in_nmi()) message below. This patch fixes this by
using the same nmi-safe copy-from-user code as in perf.

------------[ cut here ]------------
kernel BUG at .../arch/x86/kernel/traps.c:436!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:07:00.0/0000:08:04.0/net/eth0/broadcast
CPU 5
Modules linked in:

Pid: 8611, comm: opcontrol Not tainted 2.6.39-00007-gfe47ae7 #1 Advanced Micro Device Anaheim/Anaheim
RIP: 0010:[<ffffffff813e8e35>]  [<ffffffff813e8e35>] do_nmi+0x22/0x1ee
RSP: 0000:ffff88042fd47f28  EFLAGS: 00010002
RAX: ffff88042c0a7fd8 RBX: 0000000000000001 RCX: 00000000c0000101
RDX: 00000000ffff8804 RSI: ffffffffffffffff RDI: ffff88042fd47f58
RBP: ffff88042fd47f48 R08: 0000000000000004 R09: 0000000000001484
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88042fd47f58
R13: 0000000000000000 R14: ffff88042fd47d98 R15: 0000000000000020
FS:  00007fca25e56700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000074 CR3: 000000042d28b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process opcontrol (pid: 8611, threadinfo ffff88042c0a6000, task ffff88042c532310)
Stack:
 0000000000000000 0000000000000001 ffff88042c0a7fd8 0000000000000000
 ffff88042fd47de8 ffffffff813e897a 0000000000000020 ffff88042fd47d98
 0000000000000000 ffff88042c0a7fd8 ffff88042fd47de8 0000000000000074
Call Trace:
 <NMI>
 [<ffffffff813e897a>] nmi+0x1a/0x20
 [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771
 <<EOE>>
Code: ff 59 5b 41 5c 41 5d c9 c3 55 65 48 8b 04 25 88 b5 00 00 48 89 e5 41 55 41 54 49 89 fc 53 48 83 ec 08 f6 80 47 e0 ff ff 04 74 04 <0f> 0b eb fe 81 80 44 e0 ff ff 00 00 01 04 65 ff 04 25 c4 0f 01
RIP  [<ffffffff813e8e35>] do_nmi+0x22/0x1ee
 RSP <ffff88042fd47f28>
---[ end trace ed6752185092104b ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 8611, comm: opcontrol Tainted: G      D     2.6.39-00007-gfe47ae7 #1
Call Trace:
 <NMI>  [<ffffffff813e5e0a>] panic+0x8c/0x188
 [<ffffffff813e915c>] oops_end+0x81/0x8e
 [<ffffffff8100403d>] die+0x55/0x5e
 [<ffffffff813e8c45>] do_trap+0x11c/0x12b
 [<ffffffff810023c8>] do_invalid_op+0x91/0x9a
 [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee
 [<ffffffff8131e6fa>] ? oprofile_add_sample+0x83/0x95
 [<ffffffff81321670>] ? op_amd_check_ctrs+0x4f/0x2cf
 [<ffffffff813ee4d5>] invalid_op+0x15/0x20
 [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee
 [<ffffffff813e8e7a>] ? do_nmi+0x67/0x1ee
 [<ffffffff813e897a>] nmi+0x1a/0x20
 [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771
 <<EOE>>

Cc: John Lumby <johnlumby@hotmail.com>
Cc: Maynard Johnson <maynardj@us.ibm.com>
Cc: <stable@kernel.org> # .37+
Signed-off-by: Robert Richter <robert.richter@amd.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

mm: fix wrong kunmap_atomic() pointer
Running a ktest.pl test, I hit the following bug on x86_32:

  ------------[ cut here ]------------
  WARNING: at arch/x86/mm/highmem_32.c:81 __kunmap_atomic+0x64/0xc1()
   Hardware name:
  Modules linked in:
  Pid: 93, comm: sh Not tainted 2.6.39-test+ #1
  Call Trace:
   [<c04450da>] warn_slowpath_common+0x7c/0x91
   [<c042f5df>] ? __kunmap_atomic+0x64/0xc1
   [<c042f5df>] ? __kunmap_atomic+0x64/0xc1^M
   [<c0445111>] warn_slowpath_null+0x22/0x24
   [<c042f5df>] __kunmap_atomic+0x64/0xc1
   [<c04d4a22>] unmap_vmas+0x43a/0x4e0
   [<c04d9065>] exit_mmap+0x91/0xd2
   [<c0443057>] mmput+0x43/0xad
   [<c0448358>] exit_mm+0x111/0x119
   [<c044855f>] do_exit+0x1ff/0x5fa
   [<c0454ea2>] ? set_current_blocked+0x3c/0x40
   [<c0454f24>] ? sigprocmask+0x7e/0x8e
   [<c0448b55>] do_group_exit+0x65/0x88
   [<c0448b90>] sys_exit_group+0x18/0x1c
   [<c0c3915f>] sysenter_do_call+0x12/0x38
  ---[ end trace 8055f74ea3c0eb62 ]---

Running a ktest.pl git bisect, found the culprit: commit e303297
("mm: extended batches for generic mmu_gather")

But although this was the commit triggering the bug, it was not the one
originally responsible for the bug.  That was commit d16dfc5 ("mm:
mmu_gather rework").

The code in zap_pte_range() has something that looks like the following:

	pte =  pte_offset_map_lock(mm, pmd, addr, &ptl);
	do {
		[...]
	} while (pte++, addr += PAGE_SIZE, addr != end);
	pte_unmap_unlock(pte - 1, ptl);

The pte starts off pointing at the first element in the page table
directory that was returned by the pte_offset_map_lock().  When it's done
with the page, pte will be pointing to anything between the next entry and
the first entry of the next page inclusive.  By doing a pte - 1, this puts
the pte back onto the original page, which is all that pte_unmap_unlock()
needs.

In most archs (64 bit), this is not an issue as the pte is ignored in the
pte_unmap_unlock().  But on 32 bit archs, where things may be kmapped, it
is essential that the pte passed to pte_unmap_unlock() resides on the same
page that was given by pte_offest_map_lock().

The problem came in d16dfc5 ("mm: mmu_gather rework") where it introduced
a "break;" from the while loop.  This alone did not seem to easily trigger
the bug.  But the modifications made by e303297 caused that "break;" to
be hit on the first iteration, before the pte++.

The pte not being incremented will now cause pte_unmap_unlock(pte - 1) to
be pointing to the previous page.  This will cause the wrong page to be
unmapped, and also trigger the warning above.

The simple solution is to just save the pointer given by
pte_offset_map_lock() and use it in the unlock.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

btrfs: fix wrong reservation when doing delayed inode operations
We have migrated the space for the delayed inode items from
trans_block_rsv to global_block_rsv, but we forgot to set trans->block_rsv to
global_block_rsv when we doing delayed inode operations, and the following Oops
happened:

[ 9792.654889] ------------[ cut here ]------------
[ 9792.654898] WARNING: at fs/btrfs/extent-tree.c:5681
btrfs_alloc_free_block+0xca/0x27c [btrfs]()
[ 9792.654899] Hardware name: To Be Filled By O.E.M.
[ 9792.654900] Modules linked in: btrfs zlib_deflate libcrc32c
ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables
arc4 rt61pci rt2x00pci rt2x00lib snd_hda_codec_hdmi mac80211
snd_hda_codec_realtek cfg80211 snd_hda_intel edac_core snd_seq rfkill
pcspkr serio_raw snd_hda_codec eeprom_93cx6 edac_mce_amd sp5100_tco
i2c_piix4 k10temp snd_hwdep snd_seq_device snd_pcm floppy r8169 xhci_hcd
mii snd_timer snd soundcore snd_page_alloc ipv6 firewire_ohci pata_acpi
ata_generic firewire_core pata_via crc_itu_t radeon ttm drm_kms_helper
drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[ 9792.654919] Pid: 2762, comm: rm Tainted: G        W   2.6.39+ #1
[ 9792.654920] Call Trace:
[ 9792.654922]  [<ffffffff81053c4a>] warn_slowpath_common+0x83/0x9b
[ 9792.654925]  [<ffffffff81053c7c>] warn_slowpath_null+0x1a/0x1c
[ 9792.654933]  [<ffffffffa038e747>] btrfs_alloc_free_block+0xca/0x27c [btrfs]
[ 9792.654945]  [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs]
[ 9792.654953]  [<ffffffffa038189b>] __btrfs_cow_block+0xfc/0x30c [btrfs]
[ 9792.654963]  [<ffffffffa0396aa6>] ? btrfs_buffer_uptodate+0x47/0x58 [btrfs]
[ 9792.654970]  [<ffffffffa0382e48>] ? read_block_for_search+0x94/0x368 [btrfs]
[ 9792.654978]  [<ffffffffa0381ba9>] btrfs_cow_block+0xfe/0x146 [btrfs]
[ 9792.654986]  [<ffffffffa03848b0>] btrfs_search_slot+0x14d/0x4b6 [btrfs]
[ 9792.654997]  [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs]
[ 9792.655022]  [<ffffffffa03938e8>] btrfs_lookup_inode+0x2f/0x8f [btrfs]
[ 9792.655025]  [<ffffffff8147afac>] ? _cond_resched+0xe/0x22
[ 9792.655027]  [<ffffffff8147b892>] ? mutex_lock+0x29/0x50
[ 9792.655039]  [<ffffffffa03d41b1>] btrfs_update_delayed_inode+0x72/0x137 [btrfs]
[ 9792.655051]  [<ffffffffa03d4ea2>] btrfs_run_delayed_items+0x90/0xdb [btrfs]
[ 9792.655062]  [<ffffffffa039a69b>] btrfs_commit_transaction+0x228/0x654 [btrfs]
[ 9792.655064]  [<ffffffff8106e8da>] ? remove_wait_queue+0x3a/0x3a
[ 9792.655075]  [<ffffffffa03a2fa5>] btrfs_evict_inode+0x14d/0x202 [btrfs]
[ 9792.655077]  [<ffffffff81132bd6>] evict+0x71/0x111
[ 9792.655079]  [<ffffffff81132de0>] iput+0x12a/0x132
[ 9792.655081]  [<ffffffff8112aa3a>] do_unlinkat+0x106/0x155
[ 9792.655083]  [<ffffffff81127b83>] ? path_put+0x1f/0x23
[ 9792.655085]  [<ffffffff8109c53c>] ? audit_syscall_entry+0x145/0x171
[ 9792.655087]  [<ffffffff81128410>] ? putname+0x34/0x36
[ 9792.655090]  [<ffffffff8112b441>] sys_unlinkat+0x29/0x2b
[ 9792.655092]  [<ffffffff81482c42>] system_call_fastpath+0x16/0x1b
[ 9792.655093] ---[ end trace 02b696eb02b3f768 ]---

This patch fix it by setting the reservation of the transaction handle to the
correct one.

Reported-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

target: Fix transport_get_lun_for_tmr failure cases
This patch fixes two possible NULL pointer dereferences in target v4.0
code where se_tmr release path in core_tmr_release_req() can OOPs upon
transport_get_lun_for_tmr() failure by attempting to access se_device or
se_tmr->tmr_list without a valid member of se_device->tmr_list during
transport_free_se_cmd() release.  This patch moves the se_tmr->tmr_dev
pointer assignment in transport_get_lun_for_tmr() until after possible
-ENODEV failures during unpacked_lun lookup.

This addresses an OOPs originally reported with LIO v4.1 upstream on
.39 code here:

    TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x00000000
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000550
    IP: [<ffffffff81035ec4>] __ticket_spin_trylock+0x4/0x20
    PGD 0
    Oops: 0000 [#1] SMP
    last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map
    CPU 1
    Modules linked in: netconsole target_core_pscsi target_core_file
tcm_qla2xxx target_core_iblock tcm_loop target_core_mod configfs
ipmi_devintf ipmi_si ipmi_msghandler serio_raw i7core_edac ioatdma dca
edac_core ps_bdrv ses enclosure usbhid usb_storage ahci qla2xxx hid
uas e1000e mpt2sas libahci mlx4_core scsi_transport_fc
scsi_transport_sas raid_class scsi_tgt [last unloaded: netconsole]

    Pid: 0, comm: kworker/0:0 Tainted: G        W   2.6.39+ #1 Xyratex Storage Server
    RIP: 0010:[<ffffffff81035ec4>] [<ffffffff81035ec4>]__ticket_spin_trylock+0x4/0x20
    RSP: 0018:ffff88063e803c08  EFLAGS: 00010286
    RAX: ffff880619ab45e0 RBX: 0000000000000550 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000550
    RBP: ffff88063e803c08 R08: 0000000000000002 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000568
    R13: 0000000000000001 R14: 0000000000000000 R15: ffff88060cd96a20
    FS:  0000000000000000(0000) GS:ffff88063e800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000550 CR3: 0000000001a03000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process kworker/0:0 (pid: 0, threadinfo ffff880619ab8000, task ffff880619ab45e0)
    Stack:
     ffff88063e803c28 ffffffff812cf039 0000000000000550 0000000000000568
     ffff88063e803c58 ffffffff8157071e ffffffffa028a1dc ffff88060f7e4600
     0000000000000550 ffff880616961480 ffff88063e803c78 ffffffffa028a1dc
    Call Trace:
<IRQ>
     [<ffffffff812cf039>] do_raw_spin_trylock+0x19/0x50
     [<ffffffff8157071e>] _raw_spin_lock+0x3e/0x70
     [<ffffffffa028a1dc>] ? core_tmr_release_req+0x2c/0x60 [target_core_mod]
     [<ffffffffa028a1dc>] core_tmr_release_req+0x2c/0x60 [target_core_mod]
     [<ffffffffa028d0d2>] transport_free_se_cmd+0x22/0x50 [target_core_mod]
     [<ffffffffa028d120>] transport_release_cmd_to_pool+0x20/0x40 [target_core_mod]
     [<ffffffffa028e525>] transport_generic_free_cmd+0xa5/0xb0 [target_core_mod]
     [<ffffffffa0147cc4>] tcm_qla2xxx_handle_tmr+0xc4/0xd0 [tcm_qla2xxx]
     [<ffffffffa0191ba3>] __qla24xx_handle_abts+0xd3/0x150 [qla2xxx]
     [<ffffffffa0197651>] qla_tgt_response_pkt+0x171/0x520 [qla2xxx]
     [<ffffffffa0197a2d>] qla_tgt_response_pkt_all_vps+0x2d/0x220 [qla2xxx]
     [<ffffffffa0171dd3>] qla24xx_process_response_queue+0x1a3/0x670 [qla2xxx]
     [<ffffffffa0196281>] ? qla24xx_atio_pkt+0x81/0x120 [qla2xxx]
     [<ffffffffa0174025>] ? qla24xx_msix_default+0x45/0x2a0 [qla2xxx]
     [<ffffffffa0174198>] qla24xx_msix_default+0x1b8/0x2a0 [qla2xxx]
     [<ffffffff810dadb4>] handle_irq_event_percpu+0x54/0x210
     [<ffffffff810dafb8>] handle_irq_event+0x48/0x70
     [<ffffffff810dd5ee>] ? handle_edge_irq+0x1e/0x110
     [<ffffffff810dd647>] handle_edge_irq+0x77/0x110
     [<ffffffff8100d362>] handle_irq+0x22/0x40
     [<ffffffff8157b28d>] do_IRQ+0x5d/0xe0
     [<ffffffff81571413>] common_interrupt+0x13/0x13
<EOI>
     [<ffffffff813003f7>] ? intel_idle+0xd7/0x130
     [<ffffffff813003f0>] ? intel_idle+0xd0/0x130
     [<ffffffff8144832b>] cpuidle_idle_call+0xab/0x1c0
     [<ffffffff8100a26b>] cpu_idle+0xab/0xf0
     [<ffffffff81566c59>] start_secondary+0x1cb/0x1d2

Reported-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

USB: ehci-ath79: fix a NULL pointer dereference
Loading the ehci-hcd module on the ath79 platform causes
a NULL pointer dereference:

CPU 0 Unable to handle kernel paging request at virtual address 00000000, epc == c0252928, ra == c00de968
Oops[#1]:
Cpu 0
$ 0   : 00000000 00000070 00000001 00000000
$ 4   : 802cf870 0000117e ffffffff 8019c7bc
$ 8   : 0000000a 00000002 00000001 fffffffb
$12   : 8026ef20 0000000f ffffff80 802dad3c
$16   : 8077a2d4 8077a200 c00f3484 8019ed84
$20   : c00f0000 00000003 000000a0 80262c2c
$24   : 00000002 80079da0
$28   : 80788000 80789c80 80262b14 c00de968
Hi    : 00000000
Lo    : b61f0000
epc   : c0252928 __mod_vermagic5+0xc260/0xc7e8 [ehci_hcd]
    Not tainted
ra    : c00de968 usb_add_hcd+0x2a4/0x858 [usbcore]
Status: 1000c003    KERNEL EXL IE
Cause : 00800008
BadVA : 00000000
PrId  : 00019374 (MIPS 24Kc)
Modules linked in: ehci_hcd(+) pppoe pppox ipt_REJECT xt_TCPMSS ipt_LOG
xt_comment xt_multiport xt_mac xt_limit iptable_mangle iptable_filte
r ip_tables xt_tcpudp x_tables ppp_async ppp_generic slhc ath mac80211
usbcore nls_base input_polldev crc_ccitt cfg80211 compat input_core a
rc4 aes_generic crypto_algapi
Process insmod (pid: 379, threadinfo=80788000, task=80ca2180,
tls=77fe52d0)
Stack : c0253184 80c57d80 80789cac 8077a200 00000001 8019edc0 807fa800 8077a200
        8077a290 c00f3484 8019ed84 c00f0000 00000003 000000a0 80262c2c c00de968
        802d0000 800878cc c0253228 c02528e4 c0253184 80c57d80 80bf6800 80ca2180
        8007b75c 00000000 8077a200 802cf830 802d0000 00000003 fffffff4 00000015
        00000348 00000124 800b189c c024bb4c c0255000 801a27e8 c0253228 c02528e4
        ...
Call Trace:
[<c0252928>] __mod_vermagic5+0xc260/0xc7e8 [ehci_hcd]

It is caused by:

  commit c430131
  Author: Jan Andersson <jan@gaisler.com>
  Date:   Tue May 3 20:11:57 2011 +0200

      USB: EHCI: Support controllers with big endian capability regs

      The two first HC capability registers (CAPLENGTH and HCIVERSION)
      are defined as one 8-bit and one 16-bit register. Most HC
      implementations have selected to treat these registers as part
      of a 32-bit register, giving the same layout for both big and
      small endian systems.

      This patch adds a new quirk, big_endian_capbase, to support
      controllers with big endian register interfaces that treat
      HCIVERSION and CAPLENGTH as individual registers.

      Signed-off-by: Jan Andersson <jan@gaisler.com>
      Acked-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

The reading of the HC capability register has been moved by that
commit to a place where the ehci->caps field is not initialized
yet. This patch moves the reading of the register back to the
original place.

Acked-by: Jan Andersson <jan@gaisler.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

x86-32, NUMA: Fix boot regression caused by NUMA init unification on …
…highmem machines

During 32/64 NUMA init unification, commit 797390d ("x86-32,
NUMA: use sparse_memory_present_with_active_regions()") made
32bit mm init call memory_present() automatically from
active_regions instead of leaving it to each NUMA init path.

This commit description is inaccurate - memory_present() calls
aren't the same for flat and numaq.  After the commit,
memory_present() is only called for the intersection of e820 and
NUMA layout.  Before, on flatmem, memory_present() would be
called from 0 to max_pfn.  After, it would be called only on the
areas that e820 indicates to be populated.

This is how x86_64 works and should be okay as memmap is allowed
to contain holes; however, x86_32 DISCONTIGMEM is missing
early_pfn_valid(), which makes memmap_init_zone() assume that
memmap doesn't contain any hole.  This leads to the following
oops if e820 map contains holes as it often does on machine with
near or more 4GiB of memory by calling pfn_to_page() on a pfn
which isn't mapped to a NUMA node, a reported by Conny Seidel:

  BUG: unable to handle kernel paging request at 000012b0
  IP: [<c1aa13ce>] memmap_init_zone+0x6c/0xf2
  *pdpt =3D 0000000000000000 *pde =3D f000eef3f000ee00
  Oops: 0000 [#1] SMP
  last sysfs file:
  Modules linked in:

  Pid: 0, comm: swapper Not tainted 2.6.39-rc5-00164-g797390d #1 To Be Filled By O.E.M. To Be Filled By O.E.M./E350M1
  EIP: 0060:[<c1aa13ce>] EFLAGS: 00010012 CPU: 0
  EIP is at memmap_init_zone+0x6c/0xf2
  EAX: 00000000 EBX: 000a8000 ECX: 000a7fff EDX: f2c00b80
  ESI: 000a8000 EDI: f2c00800 EBP: c19ffe54 ESP: c19ffe34
   DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
  Process swapper (pid: 0, ti=3Dc19fe000 task=3Dc1a07f60 task.ti=3Dc19fe000)
  Stack:
   00000002 00000000 0023f000 00000000 10000000 00000a00 f2c00000 f2c00b58
   c19ffeb0 c1a80f24 000375fe 00000000 f2c00800 00000800 00000100 00000030
   c1abb768 0000003c 00000000 00000000 00000004 00207a02 f2c00800 000375fe
  Call Trace:
   [<c1a80f24>] free_area_init_node+0x358/0x385
   [<c1a81384>] free_area_init_nodes+0x420/0x487
   [<c1a79326>] paging_init+0x114/0x11b
   [<c1a6cb13>] setup_arch+0xb37/0xc0a
   [<c1a69554>] start_kernel+0x76/0x316
   [<c1a690a8>] i386_start_kernel+0xa8/0xb0

This patch fixes the bug by defining early_pfn_valid() to be the
same as pfn_valid() when DISCONTIGMEM.

Reported-bisected-and-tested-by: Conny Seidel <conny.seidel@amd.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: hans.rosenfeld@amd.com
Cc: Christoph Lameter <cl@linux.com>
Cc: Conny Seidel <conny.seidel@amd.com>
Link: http://lkml.kernel.org/r/20110628094107.GB3386@htj.dyndns.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

EnJens pushed a commit that referenced this issue Oct 2, 2012

natsemi: silence dma-debug warnings
This silences dma-debug warnings:

https://lkml.org/lkml/2011/6/30/341

------------[ cut here ]------------
WARNING: at /home/jimc/projects/lx/linux-2.6/lib/dma-debug.c:820
check_unmap+0x1fe/0x56a()
natsemi 0000:00:06.0: DMA-API: device driver frees DMA memory with
different size [device address=0x0000000006ef0040] [map size=1538
bytes] [unmap size=1522 bytes]
Modules linked in: pc8736x_gpio pc87360 hwmon_vid scx200_gpio nsc_gpio
scx200_hrt scx200_acb i2c_core arc4 rtl8180 mac80211 eeprom_93cx6
cfg80211 pcspkr rfkill scx200 ide_gd_mod ide_pci_generic ohci_hcd
usbcore sc1200 ide_core
Pid: 870, comm: collector Not tainted 3.0.0-rc5-sk-00080-gca56a95 #1
Call Trace:
 [<c011a556>] warn_slowpath_common+0x4a/0x5f
 [<c02565cb>] ? check_unmap+0x1fe/0x56a
 [<c011a5cf>] warn_slowpath_fmt+0x26/0x2a
 [<c02565cb>] check_unmap+0x1fe/0x56a
 [<c0256aaa>] debug_dma_unmap_page+0x53/0x5b
 [<c029d6cd>] pci_unmap_single+0x4d/0x57
 [<c029ea0a>] natsemi_poll+0x343/0x5ca
 [<c0116f41>] ? try_to_wake_up+0xea/0xfc
 [<c0122416>] ? spin_unlock_irq.clone.28+0x18/0x23
 [<c02d4667>] net_rx_action+0x3f/0xe5
 [<c011e35e>] __do_softirq+0x5b/0xd1
 [<c011e303>] ? local_bh_enable+0xa/0xa
 <IRQ>  [<c011e54b>] ? irq_exit+0x34/0x75
 [<c01034b9>] ? do_IRQ+0x66/0x79
 [<c034e869>] ? common_interrupt+0x29/0x30
 [<c0115ed0>] ? finish_task_switch.clone.118+0x31/0x72
 [<c034cb92>] ? schedule+0x3b2/0x3f1
 [<c012f4b0>] ? hrtimer_start_range_ns+0x10/0x12
 [<c012f4ce>] ? hrtimer_start_expires+0x1c/0x24
 [<c034d5aa>] ? schedule_hrtimeout_range_clock+0x8e/0xb4
 [<c012ed27>] ? update_rmtp+0x68/0x68
 [<c034d5da>] ? schedule_hrtimeout_range+0xa/0xc
 [<c017a913>] ? poll_schedule_timeout+0x27/0x3e
 [<c017b051>] ? do_select+0x488/0x4cd
 [<c0115ee2>] ? finish_task_switch.clone.118+0x43/0x72
 [<c01157ad>] ? need_resched+0x14/0x1e
 [<c017a99e>] ? poll_freewait+0x74/0x74
 [<c01157ad>] ? need_resched+0x14/0x1e
 [<c034cbc1>] ? schedule+0x3e1/0x3f1
 [<c011e55e>] ? irq_exit+0x47/0x75
 [<c01157ad>] ? need_resched+0x14/0x1e
 [<c034cf8a>] ? preempt_schedule_irq+0x44/0x4a
 [<c034dd1e>] ? need_resched+0x17/0x19
 [<c024bc12>] ? put_dec_full+0x7b/0xaa
 [<c0240060>] ? blkdev_ioctl+0x434/0x618
 [<c024bc70>] ? put_dec+0x2f/0x6d
 [<c024c6a5>] ? number.clone.1+0x10b/0x1d0
 [<c034cf8a>] ? preempt_schedule_irq+0x44/0x4a
 [<c034dd1e>] ? need_resched+0x17/0x19
 [<c024d046>] ? vsnprintf+0x225/0x264
 [<c024cea0>] ? vsnprintf+0x7f/0x264
 [<c018346f>] ? seq_printf+0x22/0x40
 [<c01a2fcc>] ? do_task_stat+0x582/0x5a3
 [<c017a913>] ? poll_schedule_timeout+0x27/0x3e
 [<c017b1b5>] ? core_sys_select+0x11f/0x1a3
 [<c017a913>] ? poll_schedule_timeout+0x27/0x3e
 [<c01a34a1>] ? proc_tgid_stat+0xd/0xf
 [<c012357c>] ? recalc_sigpending+0x32/0x35
 [<c0123b9c>] ? __set_task_blocked+0x64/0x6a
 [<c011dfb0>] ? timespec_add_safe+0x24/0x48
 [<c0123449>] ? spin_unlock_irq.clone.16+0x18/0x23
 [<c017b3a1>] ? sys_pselect6+0xe5/0x13e
 [<c034dd65>] ? syscall_call+0x7/0xb
 [<c0340000>] ? rpc_clntdir_depopulate+0x26/0x30
---[ end trace 180dcac41a50938b ]---

Reported-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

EnJens pushed a commit that referenced this issue Oct 2, 2012

ftrace: Fix regression of :mod:module function enabling
The new code that allows different utilities to pick and choose
what functions they trace broke the :mod: hook that allows users
to trace only functions of a particular module.

The reason is that the :mod: hook bypasses the hash that is setup
to allow individual users to trace their own functions and uses
the global hash directly. But if the global hash has not been
set up, it will cause a bug:

echo '*:mod:radeon' > /sys/kernel/debug/set_ftrace_filter

produces:

 [drm:drm_mode_getfb] *ERROR* invalid framebuffer id
 [drm:radeon_crtc_page_flip] *ERROR* failed to reserve new rbo buffer before flip
 BUG: unable to handle kernel paging request at ffffffff8160ec90
 IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
 PGD 1a05067 PUD 1a09063 PMD 80000000016001e1
 Oops: 0003 [#1] SMP Jul  7 04:02:28 phyllis kernel: [55303.858604] CPU 1
 Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt

 Pid: 10344, comm: bash Tainted: G        WC  3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ
 RIP: 0010:[<ffffffff810d9136>]  [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
 RSP: 0018:ffff88003a96bda8  EFLAGS: 00010246
 RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940
 RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78
 R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000
 FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470)
 Stack:
  0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5
  ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80
  ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00
 Call Trace:
  [<ffffffff810d92f5>] match_records+0x155/0x1b0
  [<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100
  [<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210
  [<ffffffff810db09f>] ftrace_filter_write+0xf/0x20
  [<ffffffff81166e48>] vfs_write+0xc8/0x190
  [<ffffffff81167001>] sys_write+0x51/0x90
  [<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b
 Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2
 RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
  RSP <ffff88003a96bda8>
 CR2: ffffffff8160ec90
 ---[ end trace a5d031828efdd88e ]---

Reported-by: Brian Marete <marete@toshnix.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

ssb: fix init regression of hostmode PCI core
Our workarounds seem to be clientmode PCI specific. Using SPROM
workaround on SoC resulted in Oops:

Data bus error, epc == 8017ed58, ra == 80225838
 Oops[#1]:
 Cpu 0
 $ 0   : 00000000 10008000 b8000000 00000001
 $ 4   : 80293b5c 00000caa ffffffff 00000000
 $ 8   : 0000000a 00000003 00000001 696d6d20
 $12   : ffffffff 00000000 00000000 ffffffff
 $16   : 802d0140 b8004800 802c0000 00000000
 $20   : 00000000 802c0000 00000000 802d04d4
 $24   : 00000018 80151a00
 $28   : 81816000 81817df8 8029bda0 80225838
 Hi    : 00000000
 Lo    : 00000000
 epc   : 8017ed58 ssb_ssb_read16+0x48/0x60
   Not tainted
 ra    : 80225838 ssb_pcicore_init+0x54/0x3b4

Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Tested-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

Bluetooth: Fix potential deadlock in hci_core
Since hdev->lock may be acquired by threads runnning in interrupt
context, all threads running in process context should disable
local bottom halve before locking hdev->lock. This can be done by
using hci_dev_lock_bh macro.

This way, we avoid potencial deadlocks like this one reported by
CONFIG_PROVE_LOCKING=y.

[  304.788780] =================================
[  304.789686] [ INFO: inconsistent lock state ]
[  304.789686] 2.6.39+ #1
[  304.789686] ---------------------------------
[  304.789686] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[  304.789686] ksoftirqd/0/3 [HC0[0]:SC1[1]:HE1:SE0] takes:
[  304.789686]  (&(&hdev->lock)->rlock){+.?...}, at: [<ffffffffa000bbfe>] hci_conn_check_pending+0x38/0x76 [bluetooth]
[  304.789686] {SOFTIRQ-ON-W} state was registered at:
[  304.789686]   [<ffffffff8105188b>] __lock_acquire+0x347/0xd52
[  304.789686]   [<ffffffff810526ac>] lock_acquire+0x8a/0xa7
[  304.789686]   [<ffffffff812b3758>] _raw_spin_lock+0x2c/0x3b
[  304.789686]   [<ffffffffa0009cf0>] hci_blacklist_del+0x1f/0x8a [bluetooth]
[  304.789686]   [<ffffffffa00139fd>] hci_sock_ioctl+0x2d9/0x314 [bluetooth]
[  304.789686]   [<ffffffff812197d8>] sock_ioctl+0x1f2/0x214
[  304.789686]   [<ffffffff810b0fd6>] do_vfs_ioctl+0x46c/0x4ad
[  304.789686]   [<ffffffff810b1059>] sys_ioctl+0x42/0x65
[  304.789686]   [<ffffffff812b4892>] system_call_fastpath+0x16/0x1b
[  304.789686] irq event stamp: 9768
[  304.789686] hardirqs last  enabled at (9768): [<ffffffff812b40d4>] restore_args+0x0/0x30
[  304.789686] hardirqs last disabled at (9767): [<ffffffff812b3f6a>] save_args+0x6a/0x70
[  304.789686] softirqs last  enabled at (9726): [<ffffffff8102fa9b>] __do_softirq+0x129/0x13f
[  304.789686] softirqs last disabled at (9739): [<ffffffff8102fb33>] run_ksoftirqd+0x82/0x133
[  304.789686]
[  304.789686] other info that might help us debug this:
[  304.789686]  Possible unsafe locking scenario:
[  304.789686]
[  304.789686]        CPU0
[  304.789686]        ----
[  304.789686]   lock(&(&hdev->lock)->rlock);
[  304.789686]   <Interrupt>
[  304.789686]     lock(&(&hdev->lock)->rlock);
[  304.789686]
[  304.789686]  *** DEADLOCK ***
[  304.789686]
[  304.789686] 1 lock held by ksoftirqd/0/3:
[  304.789686]  #0:  (hci_task_lock){++.-..}, at: [<ffffffffa0008353>] hci_rx_task+0x49/0x2f3 [bluetooth]
[  304.789686]
[  304.789686] stack backtrace:
[  304.789686] Pid: 3, comm: ksoftirqd/0 Not tainted 2.6.39+ #1
[  304.789686] Call Trace:
[  304.789686]  [<ffffffff812ae901>] print_usage_bug+0x1e7/0x1f8
[  304.789686]  [<ffffffff8100a796>] ? save_stack_trace+0x27/0x44
[  304.789686]  [<ffffffff8104fc3f>] ? print_irq_inversion_bug.part.26+0x19a/0x19a
[  304.789686]  [<ffffffff810504bb>] mark_lock+0x106/0x258
[  304.789686]  [<ffffffff812b40d4>] ? retint_restore_args+0x13/0x13
[  304.789686]  [<ffffffff81051817>] __lock_acquire+0x2d3/0xd52
[  304.789686]  [<ffffffff8102be73>] ? vprintk+0x3ab/0x3d7
[  304.789686]  [<ffffffff812ae126>] ? printk+0x3c/0x3e
[  304.789686]  [<ffffffff810526ac>] lock_acquire+0x8a/0xa7
[  304.789686]  [<ffffffffa000bbfe>] ? hci_conn_check_pending+0x38/0x76 [bluetooth]
[  304.789686]  [<ffffffff811601c6>] ? __dynamic_pr_debug+0x10c/0x11a
[  304.789686]  [<ffffffff812b3758>] _raw_spin_lock+0x2c/0x3b
[  304.789686]  [<ffffffffa000bbfe>] ? hci_conn_check_pending+0x38/0x76 [bluetooth]
[  304.789686]  [<ffffffffa000bbfe>] hci_conn_check_pending+0x38/0x76 [bluetooth]
[  304.789686]  [<ffffffffa000c561>] hci_event_packet+0x38e/0x3e12 [bluetooth]
[  304.789686]  [<ffffffff81052615>] ? lock_release+0x16c/0x179
[  304.789686]  [<ffffffff812b3b41>] ? _raw_read_unlock+0x23/0x27
[  304.789686]  [<ffffffffa0013e7f>] ? hci_send_to_sock+0x179/0x188 [bluetooth]
[  304.789686]  [<ffffffffa00083d2>] hci_rx_task+0xc8/0x2f3 [bluetooth]
[  304.789686]  [<ffffffff8102f5a9>] tasklet_action+0x87/0xe6
[  304.789686]  [<ffffffff8102fa11>] __do_softirq+0x9f/0x13f
[  304.789686]  [<ffffffff8102fb33>] run_ksoftirqd+0x82/0x133
[  304.789686]  [<ffffffff8102fab1>] ? __do_softirq+0x13f/0x13f
[  304.789686]  [<ffffffff81040f0a>] kthread+0x7f/0x87
[  304.789686]  [<ffffffff812b55c4>] kernel_thread_helper+0x4/0x10
[  304.789686]  [<ffffffff812b40d4>] ? retint_restore_args+0x13/0x13
[  304.789686]  [<ffffffff81040e8b>] ? __init_kthread_worker+0x53/0x53
[  304.789686]  [<ffffffff812b55c0>] ? gs_change+0x13/0x13

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>

EnJens pushed a commit that referenced this issue Oct 2, 2012

Bluetooth: Fix potential deadlock in mgmt
All threads running in process context should disable local bottom
halve before locking hdev->lock.

This patch fix the following message generated when Bluetooh module
is loaded with enable_mgmt=y (CONFIG_PROVE_LOCKING enabled).

[  107.880781] =================================
[  107.881631] [ INFO: inconsistent lock state ]
[  107.881631] 2.6.39+ #1
[  107.881631] ---------------------------------
[  107.881631] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[  107.881631] rcuc0/7 [HC0[0]:SC1[3]:HE1:SE0] takes:
[  107.881631]  (&(&hdev->lock)->rlock){+.?...}, at: [<ffffffffa0012c8d>] mgmt_set_local_name_complete+0x84/0x10b [bluetooth]
[  107.881631] {SOFTIRQ-ON-W} state was registered at:
[  107.881631]   [<ffffffff8105188b>] __lock_acquire+0x347/0xd52
[  107.881631]   [<ffffffff810526ac>] lock_acquire+0x8a/0xa7
[  107.881631]   [<ffffffff812b3758>] _raw_spin_lock+0x2c/0x3b
[  107.881631]   [<ffffffffa0011cc2>] mgmt_control+0xd4d/0x175b [bluetooth]
[  107.881631]   [<ffffffffa0013275>] hci_sock_sendmsg+0x97/0x293 [bluetooth]
[  107.881631]   [<ffffffff8121940c>] sock_aio_write+0x126/0x13a
[  107.881631]   [<ffffffff810a35fa>] do_sync_write+0xba/0xfa
[  107.881631]   [<ffffffff810a3beb>] vfs_write+0xaa/0xca
[  107.881631]   [<ffffffff810a3d80>] sys_write+0x45/0x69
[  107.881631]   [<ffffffff812b4892>] system_call_fastpath+0x16/0x1b
[  107.881631] irq event stamp: 2100876
[  107.881631] hardirqs last  enabled at (2100876): [<ffffffff812b40d4>] restore_args+0x0/0x30
[  107.881631] hardirqs last disabled at (2100875): [<ffffffff812b3f6a>] save_args+0x6a/0x70
[  107.881631] softirqs last  enabled at (2100862): [<ffffffff8106a805>] rcu_cpu_kthread+0x2b5/0x2e2
[  107.881631] softirqs last disabled at (2100863): [<ffffffff812b56bc>] call_softirq+0x1c/0x26
[  107.881631]
[  107.881631] other info that might help us debug this:
[  107.881631]  Possible unsafe locking scenario:
[  107.881631]
[  107.881631]        CPU0
[  107.881631]        ----
[  107.881631]   lock(&(&hdev->lock)->rlock);
[  107.881631]   <Interrupt>
[  107.881631]     lock(&(&hdev->lock)->rlock);
[  107.881631]
[  107.881631]  *** DEADLOCK ***
[  107.881631]
[  107.881631] 1 lock held by rcuc0/7:
[  107.881631]  #0:  (hci_task_lock){++.-..}, at: [<ffffffffa0008353>] hci_rx_task+0x49/0x2f3 [bluetooth]
[  107.881631]
[  107.881631] stack backtrace:
[  107.881631] Pid: 7, comm: rcuc0 Not tainted 2.6.39+ #1
[  107.881631] Call Trace:
[  107.881631]  <IRQ>  [<ffffffff812ae901>] print_usage_bug+0x1e7/0x1f8
[  107.881631]  [<ffffffff8100a796>] ? save_stack_trace+0x27/0x44
[  107.881631]  [<ffffffff8104fc3f>] ? print_irq_inversion_bug.part.26+0x19a/0x19a
[  107.881631]  [<ffffffff810504bb>] mark_lock+0x106/0x258
[  107.881631]  [<ffffffff81051817>] __lock_acquire+0x2d3/0xd52
[  107.881631]  [<ffffffff8102be73>] ? vprintk+0x3ab/0x3d7
[  107.881631]  [<ffffffff810526ac>] lock_acquire+0x8a/0xa7
[  107.881631]  [<ffffffffa0012c8d>] ? mgmt_set_local_name_complete+0x84/0x10b [bluetooth]
[  107.881631]  [<ffffffff81052615>] ? lock_release+0x16c/0x179
[  107.881631]  [<ffffffff812b3952>] _raw_spin_lock_bh+0x31/0x40
[  107.881631]  [<ffffffffa0012c8d>] ? mgmt_set_local_name_complete+0x84/0x10b [bluetooth]
[  107.881631]  [<ffffffffa0012c8d>] mgmt_set_local_name_complete+0x84/0x10b [bluetooth]
[  107.881631]  [<ffffffffa000d3fe>] hci_event_packet+0x122b/0x3e12 [bluetooth]
[  107.881631]  [<ffffffff81050658>] ? mark_held_locks+0x4b/0x6d
[  107.881631]  [<ffffffff812b3cff>] ? _raw_spin_unlock_irqrestore+0x40/0x4d
[  107.881631]  [<ffffffff810507b9>] ? trace_hardirqs_on_caller+0x13f/0x172
[  107.881631]  [<ffffffff812b3d07>] ? _raw_spin_unlock_irqrestore+0x48/0x4d
[  107.881631]  [<ffffffffa00083d2>] hci_rx_task+0xc8/0x2f3 [bluetooth]
[  107.881631]  [<ffffffff8102f836>] ? __local_bh_enable+0x90/0xa4
[  107.881631]  [<ffffffff8102f5a9>] tasklet_action+0x87/0xe6
[  107.881631]  [<ffffffff8102fa11>] __do_softirq+0x9f/0x13f
[  107.881631]  [<ffffffff812b56bc>] call_softirq+0x1c/0x26
[  107.881631]  <EOI>  [<ffffffff810033b8>] ? do_softirq+0x46/0x9a
[  107.881631]  [<ffffffff8106a805>] ? rcu_cpu_kthread+0x2b5/0x2e2
[  107.881631]  [<ffffffff8102f906>] _local_bh_enable_ip+0xac/0xc9
[  107.881631]  [<ffffffff8102f93b>] local_bh_enable+0xd/0xf
[  107.881631]  [<ffffffff8106a805>] rcu_cpu_kthread+0x2b5/0x2e2
[  107.881631]  [<ffffffff81041586>] ? __init_waitqueue_head+0x46/0x46
[  107.881631]  [<ffffffff8106a550>] ? rcu_yield.constprop.42+0x98/0x98
[  107.881631]  [<ffffffff81040f0a>] kthread+0x7f/0x87
[  107.881631]  [<ffffffff812b55c4>] kernel_thread_helper+0x4/0x10
[  107.881631]  [<ffffffff812b40d4>] ? retint_restore_args+0x13/0x13
[  107.881631]  [<ffffffff81040e8b>] ? __init_kthread_worker+0x53/0x53
[  107.881631]  [<ffffffff812b55c0>] ? gs_change+0x13/0x13

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>

EnJens pushed a commit that referenced this issue Oct 2, 2012

FS-Cache: Add a helper to bulk uncache pages on an inode
Add an FS-Cache helper to bulk uncache pages on an inode.  This will
only work for the circumstance where the pages in the cache correspond
1:1 with the pages attached to an inode's page cache.

This is required for CIFS and NFS: When disabling inode cookie, we were
returning the cookie and setting cifsi->fscache to NULL but failed to
invalidate any previously mapped pages.  This resulted in "Bad page
state" errors and manifested in other kind of errors when running
fsstress.  Fix it by uncaching mapped pages when we disable the inode
cookie.

This patch should fix the following oops and "Bad page state" errors
seen during fsstress testing.

  ------------[ cut here ]------------
  kernel BUG at fs/cachefiles/namei.c:201!
  invalid opcode: 0000 [#1] SMP
  Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
  RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  RSP: 0018:ffff88002ce6dd00  EFLAGS: 00010282
  RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
  RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
  R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
  R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
  FS:  00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
  Stack:
   0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
   ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
   ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
  Call Trace:
   cachefiles_lookup_object+0x78/0xd4 [cachefiles]
   fscache_lookup_object+0x131/0x16d [fscache]
   fscache_object_work_func+0x1bc/0x669 [fscache]
   process_one_work+0x186/0x298
   worker_thread+0xda/0x15d
   kthread+0x84/0x8c
   kernel_thread_helper+0x4/0x10
  RIP  cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  ---[ end trace 1d481c9af1804caa ]---

I tested the uncaching by the following means:

 (1) Create a big file on my NFS server (104857600 bytes).

 (2) Read the file into the cache with md5sum on the NFS client.  Look in
     /proc/fs/fscache/stats:

	Pages  : mrk=25601 unc=0

 (3) Open the file for read/write ("bash 5<>/warthog/bigfile").  Look in proc
     again:

	Pages  : mrk=25601 unc=25601

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

USB: EHCI: Allow users to override 80% max periodic bandwidth
There are cases, when 80% max isochronous bandwidth is too limiting.

For example I have two USB video capture cards which stream uncompressed
video, and to stream full NTSC + PAL videos we'd need

    NTSC 640x480 YUV422 @30fps      ~17.6 MB/s
    PAL  720x576 YUV422 @25fps      ~19.7 MB/s

isoc bandwidth.

Now, due to limited alt settings in capture devices NTSC one ends up
streaming with max_pkt_size=2688  and  PAL with max_pkt_size=2892, both
with interval=1. In terms of microframe time allocation this gives

    NTSC    ~53us
    PAL     ~57us

and together

    ~110us  >  100us == 80% of 125us uframe time.

So those two devices can't work together simultaneously because the'd
over allocate isochronous bandwidth.

80% seemed a bit arbitrary to me, and I've tried to raise it to 90% and
both devices started to work together, so I though sometimes it would be
a good idea for users to override hardcoded default of max 80% isoc
bandwidth.

After all, isn't it a user who should decide how to load the bus? If I
can live with 10% or even 5% bulk bandwidth that should be ok. I'm a USB
newcomer, but that 80% set in stone by USB 2.0 specification seems to be
chosen pretty arbitrary to me, just to serve as a reasonable default.

NOTE 1
~~~~~~

for two streams with max_pkt_size=3072 (worst case) both time
allocation would be 60us+60us=120us which is 96% periodic bandwidth
leaving 4% for bulk and control.  Alan Stern suggested that bulk then
would be problematic (less than 300*8 bittimes left per microframe), but
I think that is still enough for control traffic.

NOTE 2
~~~~~~

Sarah Sharp expressed concern that maxing out periodic bandwidth
could lead to vendor-specific hardware bugs on host controllers, because

> It's entirely possible that you'll run into
> vendor-specific bugs if you try to pack the schedule with isochronous
> transfers.  I don't think any hardware designer would seriously test or
> validate their hardware with a schedule that is basically a violation of
> the USB bus spec (more than 80% for periodic transfers).

So far I've only tested this patch on my HP Mini 5103 with N10 chipset

    kirr@mini:~$ lspci
    00:00.0 Host bridge: Intel Corporation N10 Family DMI Bridge
    00:02.0 VGA compatible controller: Intel Corporation N10 Family Integrated Graphics Controller
    00:02.1 Display controller: Intel Corporation N10 Family Integrated Graphics Controller
    00:1b.0 Audio device: Intel Corporation N10/ICH 7 Family High Definition Audio Controller (rev 02)
    00:1c.0 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 1 (rev 02)
    00:1c.3 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 4 (rev 02)
    00:1d.0 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #1 (rev 02)
    00:1d.1 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #2 (rev 02)
    00:1d.2 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #3 (rev 02)
    00:1d.3 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #4 (rev 02)
    00:1d.7 USB Controller: Intel Corporation N10/ICH 7 Family USB2 EHCI Controller (rev 02)
    00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
    00:1f.0 ISA bridge: Intel Corporation NM10 Family LPC Controller (rev 02)
    00:1f.2 SATA controller: Intel Corporation N10/ICH7 Family SATA AHCI Controller (rev 02)
    01:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01)
    02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8059 PCI-E Gigabit Ethernet Controller (rev 11)

and the system works stable with 110us/uframe (~88%) isoc bandwith allocated for
above-mentioned isochronous transfers.

NOTE 3
~~~~~~

This feature is off by default. I mean max periodic bandwidth is set to
100us/uframe by default exactly as it was before the patch. So only those of us
who need the extreme settings are taking the risk - normal users who do not
alter uframe_periodic_max sysfs attribute should not see any change at all.

NOTE 4
~~~~~~

I've tried to update documentation in Documentation/ABI/ thoroughly, but
only "TBD" was put into Documentation/usb/ehci.txt -- the text there seems
to be outdated and much needing refreshing, before it could be amended.

Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

x86, ioapic: Format clean up for IOAPIC output
When IOAPIC data is displayed in "dmesg" with the help of the
boot parameter "apic=debug" certain values are not formatted
correctly wrt their size.

In the "dmesg" snippet below, note that the output for "max
redirection entries", and "IO APIC version" which are each
defined to be just 8-bits long are displayed as 2 bytes in
length. Similarly, "Dst" under the "IRQ redirection table"
should only be 8-bits long.

IO APIC #0......
...
...
.... register #1: 00170020
.......     : max redirection entries: 0017
.......     : PRQ implemented: 0
.......     : IO APIC version: 0020
...
...
.... IRQ redirection table:
 NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
 00 000 1    0    0   0   0    0    0    00
 01 000 0    0    0   0   0    0    0    31
 02 000 0    0    0   0   0    0    0    30
 03 000 1    0    0   0   0    0    0    33
...
...

Do some formatting clean up, so you will see output like below:

IO APIC #0......
...
...
.... register #1: 00170020
.......     : max redirection entries: 17
.......     : PRQ implemented: 0
.......     : IO APIC version: 20
...
...
.... IRQ redirection table:
 NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
 00 00  1    0    0   0   0    0    0    00
 01 00  0    0    0   0   0    0    0    31
 02 00  0    0    0   0   0    0    0    30
 03 00  1    0    0   0   0    0    0    33
...
...

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Link: http://lkml.kernel.org/r/20110708184557.2734.61830.sendpatchset@nchumbalkar.americas.cpqcorp.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>

EnJens pushed a commit that referenced this issue Oct 2, 2012

pch_dma: Fix channel locking
Fix for the following INFO message

=================================
[ INFO: inconsistent lock state ]
2.6.39+ #89
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
rs232/822 [HC1[1]:SC0[0]:HE0:SE1] takes:
 (&(&pd_chan->lock)->rlock){?.....}, at: [<c123b9a1>] pdc_desc_get+0x16/0xab
{HARDIRQ-ON-W} state was registered at:
  [<c104fe28>] mark_irqflags+0xbd/0x11a
  [<c1050386>] __lock_acquire+0x501/0x6bb
  [<c1050945>] lock_acquire+0x63/0x7b
  [<c131c51d>] _raw_spin_lock_bh+0x43/0x51
  [<c123bee4>] pd_alloc_chan_resources+0x92/0x11e
  [<c123ad62>] dma_chan_get+0x9b/0x107
  [<c123b2d1>] __dma_request_channel+0x61/0xdc
  [<c11ba24b>] pch_request_dma+0x61/0x19e
  [<c11bb3b8>] pch_uart_startup+0x16a/0x1a2
  [<c11b8446>] uart_startup+0x87/0x147
  [<c11b9183>] uart_open+0x117/0x13e
  [<c11a5c7d>] tty_open+0x23c/0x34c
  [<c1097705>] chrdev_open+0x140/0x15f
  [<c10930a6>] __dentry_open.clone.14+0x14a/0x22b
  [<c1093dfb>] nameidata_to_filp+0x36/0x40
  [<c109f28b>] do_last+0x513/0x635
  [<c109f4af>] path_openat+0x9c/0x2aa
  [<c109f6e4>] do_filp_open+0x27/0x69
  [<c1093f02>] do_sys_open+0xfd/0x184
  [<c1093fad>] sys_open+0x24/0x2a
  [<c131d58c>] sysenter_do_call+0x12/0x32
irq event stamp: 2522
hardirqs last  enabled at (2521): [<c131ca3b>] _raw_spin_unlock_irqrestore+0x36/0x52
hardirqs last disabled at (2522): [<c131db27>] common_interrupt+0x27/0x34
softirqs last  enabled at (2354): [<c102fa11>] __do_softirq+0x10a/0x11a
softirqs last disabled at (2299): [<c10041a4>] do_softirq+0x57/0xa4

other info that might help us debug this:
2 locks held by rs232/822:
 #0:  (&tty->atomic_write_lock){+.+.+.}, at: [<c11a4b7a>] tty_write_lock+0x14/0x3c
 #1:  (&port_lock_key){-.....}, at: [<c11bad72>] pch_uart_interrupt+0x17/0x1e9

stack backtrace:
Pid: 822, comm: rs232 Not tainted 2.6.39+ #89
Call Trace:
 [<c1319f90>] ? printk+0x19/0x1b
 [<c104f893>] print_usage_bug+0x184/0x18f
 [<c104e5b1>] ? print_irq_inversion_bug+0x10e/0x10e
 [<c104f943>] mark_lock_irq+0xa5/0x1f6
 [<c104fc9c>] mark_lock+0x208/0x2d7
 [<c104fdc0>] mark_irqflags+0x55/0x11a
 [<c1050386>] __lock_acquire+0x501/0x6bb
 [<c10042ee>] ? dump_trace+0x92/0xb6
 [<c1050945>] lock_acquire+0x63/0x7b
 [<c123b9a1>] ? pdc_desc_get+0x16/0xab
 [<c131c2d0>] _raw_spin_lock+0x3e/0x4c
 [<c123b9a1>] ? pdc_desc_get+0x16/0xab
 [<c123b9a1>] pdc_desc_get+0x16/0xab
 [<c10504d8>] ? __lock_acquire+0x653/0x6bb
 [<c123bb2c>] pd_prep_slave_sg+0x7c/0x1cb
 [<c1006c3f>] ? nommu_map_sg+0x6e/0x81
 [<c11bace6>] dma_handle_tx+0x2cf/0x344
 [<c11bad72>] ? pch_uart_interrupt+0x17/0x1e9
 [<c11baebb>] pch_uart_interrupt+0x160/0x1e9
 [<c10642fb>] handle_irq_event_percpu+0x25/0x127
 [<c1064429>] handle_irq_event+0x2c/0x43
 [<c1065e0d>] ? handle_fasteoi_irq+0x84/0x84
 [<c1065eb9>] handle_edge_irq+0xac/0xce
 <IRQ>  [<c1003ecb>] ? do_IRQ+0x38/0x9d
 [<c131db2e>] ? common_interrupt+0x2e/0x34
 [<c105007b>] ? __lock_acquire+0x1f6/0x6bb
 [<c131ca3d>] ? _raw_spin_unlock_irqrestore+0x38/0x52
 [<c11b798b>] ? uart_start+0x2d/0x32
 [<c11b7998>] ? uart_flush_chars+0x8/0xa
 [<c11a7962>] ? n_tty_write+0x12c/0x1c6
 [<c1027a73>] ? try_to_wake_up+0x251/0x251
 [<c11a4d0b>] ? tty_write+0x169/0x1dc
 [<c11a7836>] ? n_tty_ioctl+0xb7/0xb7
 [<c1094841>] ? vfs_write+0x91/0x10d
 [<c11a4ba2>] ? tty_write_lock+0x3c/0x3c
 [<c1094a69>] ? sys_write+0x3e/0x63
 [<c131d58c>] ? sysenter_do_call+0x12/0x32

Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Tested-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

gma500: skip getting modes via DDC on Moorestown
Moorestown does not have a DDC bus, skip getting modes via DDC. This
fixes the following bug:

BUG: unable to handle kernel NULL pointer dereference at 00000010
IP: [<c1172ff7>] i2c_transfer+0x17/0xb0
*pde = 00000000
Oops: 0000 [#1]

Call Trace:
 [<c1153ae9>] drm_do_probe_ddc_edid+0x59/0x90
 [<c1153cb4>] drm_get_edid+0x24/0x250
 [<c11805d2>] psb_intel_ddc_get_modes+0x22/0x60
 [<c117fe11>] psb_intel_lvds_get_modes+0x21/0x80

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

mwl8k: Fixing sta dereference when ieee80211_tx_info->control.sta is …
…NULL

Following oops was seen on SMP machine

>BUG: unable to handle kernel NULL pointer dereference at 00000012
>IP: [<f8c56691>] mwl8k_tx+0x20e/0x561 [mwl8k]
>*pde = 00000000
>Oops: 0000 [#1] SMP
>Modules linked in: mwl8k mac80211 cfg80211 [last unloaded: cfg80211]

As ieee80211_tx_info->control.sta may be NULL during ->tx call, avoiding sta
dereference in such scenario with the following patch.

Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

[SCSI] bnx2i: Fixed kernel panic due to illegal usage of sc->request-…
…>cpu

A kernel panic was observed when passing the sc->request->cpu = -1 to
retrieve the per_cpu variable pointer:
 #0 [ffff880011203960] machine_kexec at ffffffff81022bc3
 #1 [ffff8800112039b0] crash_kexec at ffffffff81088630
 #2 [ffff880011203a80] __die at ffffffff8139ea20
 #3 [ffff880011203aa0] no_context at ffffffff8102f3a7
 #4 [ffff880011203ae0] __bad_area_nosemaphore at ffffffff8102f665
 #5 [ffff880011203ba0] retint_signal at ffffffff8139dd1f
 #6 [ffff880011203cc8] bnx2i_indicate_kcqe at ffffffffa03dc4f2
 #7 [ffff880011203da8] service_kcqes at ffffffffa03cb04f
 #8 [ffff880011203e68] cnic_service_bnx2x_kcq at ffffffffa03cb14a
 #9 [ffff880011203e88] cnic_service_bnx2x_bh at ffffffffa03cb1b3

The problem lies in the slow path sg_io (and perhaps sg_scsi_ioctl) call to
blk_get_request->get_request/wait->blk_alloc_request->blk_rq_init which
re-initializes the request->cpu to -1.  There is no assignment for cpu from
that to the request_fn call to low level drivers.

When this happens, the sc->request->cpu will be using the init value of
-1.  This will create a kernel panic when it hits bnx2i because the code
refers it to get the per_cpu variables ptr.

This change is to put in a guard against that and also for cases when
bio affinity/queue completion to the same cpu is not enabled.  In those
cases, the request->cpu will remain a -1 also.

This bug was created from commit:  b5cf6b6

For the case when the blk layer did not setup the request->cpu, bnx2i
will complete the sc with the current CPU of the thread.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

lguest: Do not exit on non-fatal errors
Do not exit on some non-fatal errors:

- writev() fails in net_output(). The result is a lost packet or packets.
- writev() fails in console_output(). The result is partially lost console
output.
- readv() fails in net_input(). The result is a lost packet or packets.

Rather than bringing the guest down, this patch ignores e.g. an allocation
failure on the host side. Example:

lguest: page allocation failure. order:4, mode:0x4d0
Pid: 4045, comm: lguest Tainted: G        W   2.6.36 #1
Call Trace:
 [<c138d614>] ? printk+0x18/0x1c
 [<c106a4e2>] __alloc_pages_nodemask+0x4d2/0x570
 [<c1087954>] cache_alloc_refill+0x2a4/0x4d0
 [<c1305149>] ? __netif_receive_skb+0x189/0x270
 [<c1087c5a>] __kmalloc+0xda/0xf0
 [<c12fffa5>] __alloc_skb+0x55/0x100
 [<c1305519>] ? net_rx_action+0x79/0x100
 [<c12fafed>] sock_alloc_send_pskb+0x18d/0x280
 [<c11fda25>] ? _copy_from_user+0x35/0x130
 [<c13010b6>] ? memcpy_fromiovecend+0x56/0x80
 [<c12a74dc>] tun_chr_aio_write+0x1cc/0x500
 [<c108a125>] do_sync_readv_writev+0x95/0xd0
 [<c11fda25>] ? _copy_from_user+0x35/0x130
 [<c1089fa8>] ? rw_copy_check_uvector+0x58/0x100
 [<c108a7bc>] do_readv_writev+0x9c/0x1d0
 [<c12a7310>] ? tun_chr_aio_write+0x0/0x500
 [<c108a93a>] vfs_writev+0x4a/0x60
 [<c108aa21>] sys_writev+0x41/0x80
 [<c138f061>] syscall_call+0x7/0xb
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
HighMem per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
active_anon:134651 inactive_anon:50543 isolated_anon:0
 active_file:96881 inactive_file:132007 isolated_file:0
 unevictable:0 dirty:3 writeback:0 unstable:0
 free:91374 slab_reclaimable:6300 slab_unreclaimable:2802
 mapped:2281 shmem:9 pagetables:330 bounce:0
DMA free:3524kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:8kB active_file:8760kB inactive_file:2760kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB mlocked:0kB dirty:0kB writeback:0kB mapped:16kB shmem:0kB slab_reclaimable:88kB slab_unreclaimable:148kB kernel_stack:40kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 865 2016 2016
Normal free:150100kB min:3728kB low:4660kB high:5592kB active_anon:6224kB inactive_anon:15772kB active_file:324084kB inactive_file:325944kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:885944kB mlocked:0kB dirty:12kB writeback:0kB mapped:1520kB shmem:0kB slab_reclaimable:25112kB slab_unreclaimable:11060kB kernel_stack:1888kB pagetables:1320kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 9207 9207
HighMem free:211872kB min:512kB low:1752kB high:2992kB active_anon:532380kB inactive_anon:186392kB active_file:54680kB inactive_file:199324kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1178504kB mlocked:0kB dirty:0kB writeback:0kB mapped:7588kB shmem:36kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 3*4kB 65*8kB 35*16kB 18*32kB 11*64kB 9*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3524kB
Normal: 35981*4kB 344*8kB 158*16kB 28*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 150100kB
HighMem: 5732*4kB 5462*8kB 2826*16kB 1598*32kB 84*64kB 10*128kB 7*256kB 1*512kB 1*1024kB 1*2048kB 9*4096kB = 211872kB
231237 total pagecache pages
2340 pages in swap cache
Swap cache stats: add 160060, delete 157720, find 189017/194106
Free swap  = 4179840kB
Total swap = 4194300kB
524271 pages RAM
296946 pages HighMem
5668 pages reserved
867664 pages shared
82155 pages non-shared

Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

EnJens pushed a commit that referenced this issue Oct 2, 2012

MIPS: pfn_valid() is broken on low memory HIGHMEM systems
pfn_valid() compares the PFN to max_mapnr:

        __pfn >= min_low_pfn && __pfn < max_mapnr;

On HIGHMEM kernels, highend_pfn is used to set the value of max_mapnr.
Unfortunately, highend_pfn is left at zero if the system does not
actually have enough RAM to reach into the HIGHMEM range.  This causes
pfn_valid() to always return false, and when debug checks are enabled
the kernel will fail catastrophically:

Memory: 22432k/32768k available (2249k kernel code, 10336k reserved, 653k data, 1352k init, 0k highmem)
NR_IRQS:128
kfree_debugcheck: out of range ptr 81c02900h.
Kernel bug detected[#1]:
Cpu 0
$ 0   : 00000000 10008400 00000034 00000000
$ 4   : 8003e160 802a0000 8003e160 00000000
$ 8   : 00000000 0000003e 00000747 00000747
...

On such a configuration, max_low_pfn should be used to set max_mapnr.

This was seen on 2.6.34.

Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
To: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/1992/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

libceph: fix msgpool
There were several problems here:

 1- we weren't tagging allocations with the pool, so they were never
    returned to the pool.
 2- msgpool_put didn't add back to the mempool, even it were called.
 3- msgpool_release didn't clear the pool pointer, so it would have looped
    had #1 not been broken.

These may or may not have been responsible for #1136 or #1381 (BUG due to
non-empty mempool on umount).  I can't seem to trigger the crash now using
the method I was using before.

Signed-off-by: Sage Weil <sage@newdream.net>

EnJens pushed a commit that referenced this issue Oct 2, 2012

Revert "memcg: get rid of percpu_charge_mutex lock"
This reverts commit 8521fc5.

The patch incorrectly assumes that using atomic FLUSHING_CACHED_CHARGE
bit operations is sufficient but that is not true.  Johannes Weiner has
reported a crash during parallel memory cgroup removal:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
  IP: [<ffffffff81083b70>] css_is_ancestor+0x20/0x70
  Oops: 0000 [#1] PREEMPT SMP
  Pid: 19677, comm: rmdir Tainted: G        W   3.0.0-mm1-00188-gf38d32b #35 ECS MCP61M-M3/MCP61M-M3
  RIP: 0010:[<ffffffff81083b70>]  css_is_ancestor+0x20/0x70
  RSP: 0018:ffff880077b09c88  EFLAGS: 00010202
  Process rmdir (pid: 19677, threadinfo ffff880077b08000, task ffff8800781bb310)
  Call Trace:
   [<ffffffff810feba3>] mem_cgroup_same_or_subtree+0x33/0x40
   [<ffffffff810feccf>] drain_all_stock+0x11f/0x170
   [<ffffffff81103211>] mem_cgroup_force_empty+0x231/0x6d0
   [<ffffffff811036c4>] mem_cgroup_pre_destroy+0x14/0x20
   [<ffffffff81080559>] cgroup_rmdir+0xb9/0x500
   [<ffffffff81114d26>] vfs_rmdir+0x86/0xe0
   [<ffffffff81114e7b>] do_rmdir+0xfb/0x110
   [<ffffffff81114ea6>] sys_rmdir+0x16/0x20
   [<ffffffff8154d76b>] system_call_fastpath+0x16/0x1b

We are crashing because we try to dereference cached memcg when we are
checking whether we should wait for draining on the cache.  The cache is
already cleaned up, though.

There is also a theoretical chance that the cached memcg gets freed
between we test for the FLUSHING_CACHED_CHARGE and dereference it in
mem_cgroup_same_or_subtree:

        CPU0                    CPU1                         CPU2
  mem=stock->cached
  stock->cached=NULL
                              clear_bit
                                                        test_and_set_bit
  test_bit()                    ...
  <preempted>             mem_cgroup_destroy
  use after free

The percpu_charge_mutex protected from this race because sync draining
is exclusive.

It is safer to revert now and come up with a more parallel
implementation later.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Johannes Weiner <jweiner@redhat.com>
Acked-by: Johannes Weiner <jweiner@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

rt2x00: fix crash in rt2800usb_write_tx_desc
Patch should fix this oops:

BUG: unable to handle kernel NULL pointer dereference at 000000a0
IP: [<f8e06078>] rt2800usb_write_tx_desc+0x18/0xc0 [rt2800usb]
*pdpt = 000000002408c001 *pde = 0000000024079067 *pte = 0000000000000000
Oops: 0000 [#1] SMP
EIP: 0060:[<f8e06078>] EFLAGS: 00010282 CPU: 0
EIP is at rt2800usb_write_tx_desc+0x18/0xc0 [rt2800usb]
EAX: 00000035 EBX: ef2bef10 ECX: 00000000 EDX: d40958a0
ESI: ef1865f8 EDI: ef1865f8 EBP: d4095878 ESP: d409585c
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Call Trace:
 [<f8da5e85>] rt2x00queue_write_tx_frame+0x155/0x300 [rt2x00lib]
 [<f8da424c>] rt2x00mac_tx+0x7c/0x370 [rt2x00lib]
 [<c04882b2>] ? mark_held_locks+0x62/0x90
 [<c081f645>] ? _raw_spin_unlock_irqrestore+0x35/0x60
 [<c04884ba>] ? trace_hardirqs_on_caller+0x5a/0x170
 [<c04885db>] ? trace_hardirqs_on+0xb/0x10
 [<f8d618ac>] __ieee80211_tx+0x5c/0x1e0 [mac80211]
 [<f8d631fc>] ieee80211_tx+0xbc/0xe0 [mac80211]
 [<f8d63163>] ? ieee80211_tx+0x23/0xe0 [mac80211]
 [<f8d632e1>] ieee80211_xmit+0xc1/0x200 [mac80211]
 [<f8d63220>] ? ieee80211_tx+0xe0/0xe0 [mac80211]
 [<c0487d45>] ? lock_release_holdtime+0x35/0x1b0
 [<f8d63986>] ? ieee80211_subif_start_xmit+0x446/0x5f0 [mac80211]
 [<f8d637dd>] ieee80211_subif_start_xmit+0x29d/0x5f0 [mac80211]
 [<f8d63924>] ? ieee80211_subif_start_xmit+0x3e4/0x5f0 [mac80211]
 [<c0760188>] ? sock_setsockopt+0x6a8/0x6f0
 [<c0760000>] ? sock_setsockopt+0x520/0x6f0
 [<c076daef>] dev_hard_start_xmit+0x2ef/0x650

Oops might happen because we perform parallel putting new entries in a
queue (rt2x00queue_write_tx_frame()) and removing entries after
finishing transmitting (rt2800usb_work_txdone()). There are cases when
_txdone may process an entry that was not fully send and nullify
entry->skb .

To fix check in _txdone if entry has flags that indicate pending
transmission and wait until flags get cleared.

Reported-by: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: stable@kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

rt2x00: fix crash in rt2800usb_get_txwi
Patch should fix this oops:

BUG: unable to handle kernel NULL pointer dereference at 000000a0
IP: [<f81b30c9>] rt2800usb_get_txwi+0x19/0x70 [rt2800usb]
*pdpt = 0000000000000000 *pde = f000ff53f000ff53
Oops: 0000 [#1] SMP
Pid: 198, comm: kworker/u:3 Tainted: G        W   3.0.0-wl+ #9 LENOVO 6369CTO/6369CTO
EIP: 0060:[<f81b30c9>] EFLAGS: 00010283 CPU: 1
EIP is at rt2800usb_get_txwi+0x19/0x70 [rt2800usb]
EAX: 00000000 EBX: f465e140 ECX: f4494960 EDX: ef24c5f8
ESI: 810f21f5 EDI: f1da9960 EBP: f4581e80 ESP: f4581e70
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process kworker/u:3 (pid: 198, ti=f4580000 task=f4494960 task.ti=f4580000)
Call Trace:
 [<f804790f>] rt2800_txdone_entry+0x2f/0xf0 [rt2800lib]
 [<c045110d>] ? warn_slowpath_common+0x7d/0xa0
 [<f81b3a38>] ? rt2800usb_work_txdone+0x288/0x360 [rt2800usb]
 [<f81b3a38>] ? rt2800usb_work_txdone+0x288/0x360 [rt2800usb]
 [<f81b3a13>] rt2800usb_work_txdone+0x263/0x360 [rt2800usb]
 [<c046a8d6>] process_one_work+0x186/0x440
 [<c046a85a>] ? process_one_work+0x10a/0x440
 [<f81b37b0>] ? rt2800usb_probe_hw+0x120/0x120 [rt2800usb]
 [<c046c283>] worker_thread+0x133/0x310
 [<c04885db>] ? trace_hardirqs_on+0xb/0x10
 [<c046c150>] ? manage_workers+0x1e0/0x1e0
 [<c047054c>] kthread+0x7c/0x90
 [<c04704d0>] ? __init_kthread_worker+0x60/0x60
 [<c0826b42>] kernel_thread_helper+0x6/0x1

Oops might happen because we check rt2x00queue_empty(queue) twice,
but this condition can change and we can process entry in
rt2800_txdone_entry(), which was already processed by
rt2800usb_txdone_entry_check() -> rt2x00lib_txdone_noinfo() and
has nullify entry->skb .

Reported-by: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: stable@kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

usb: musb: gadget: fix error path
In case one "forgot" to load the receiver i.e. doing
|modprobe omap2430
|modprobe musb_hdrc

he ends up with:

|musb-hdrc: version 6.0, ?dma?, otg (peripheral+host)
|HS USB OTG: no transceiver configured
|musb-hdrc musb-hdrc: musb_init_controller failed with status -19
|(NULL device *): gadget not registered.
|Unable to handle kernel NULL pointer dereference at virtual address 0000001c
|Internal error: Oops: 17 [#1] SMP
|[<c011383c>] (sysfs_find_dirent+0x4/0x60) from [<c01138c0>] (sysfs_get_dirent+0x28/0x78)
|[<c01138c0>] (sysfs_get_dirent+0x28/0x78) from [<c0115b78>] (sysfs_unmerge_group+0x1c/0x90)
|[<c0115b78>] (sysfs_unmerge_group+0x1c/0x90) from [<c0179ba4>] (dpm_sysfs_remove+0x14/0x3c)
|[<c0179ba4>] (dpm_sysfs_remove+0x14/0x3c) from [<c01742f8>] (device_del+0x40/0x1b4)
|[<c01742f8>] (device_del+0x40/0x1b4) from [<c0174478>] (device_unregister+0xc/0x18)
|[<c0174478>] (device_unregister+0xc/0x18) from [<bf0489b4>] (musb_free+0x24/0x88 [musb_hdrc])
|[<bf0489b4>] (musb_free+0x24/0x88 [musb_hdrc]) from [<bf057d18>] (musb_probe+0xb50/0xe3c [musb_hdrc])
|[<bf057d18>] (musb_probe+0xb50/0xe3c [musb_hdrc]) from [<c01779c4>] (platform_drv_probe+0x1c/0x24)

The problem is that musb_free() tries to figure out what was
initializued and what wasn't and clean up only the initialized part.
This works well for usb_del_gadget_udc() but device_unregister() can't
deal with it. Therefore we rely on the fact the we always have a parent
device and only then remove the device.
I broke this in 0f91349 ("usb: gadget: convert all users to the new udc
infrastructure")

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

mmc: sdhci: check host->clock before using it as a denominator
Sometimes host->clock could be zero which is a legal situation. This
patch checks host->clock before usage as a denominator when timeout is
calculated. A similar patch is applied for mmc core (see commit e9b8684,
"mmc: fix division by zero in MMC core").

Without this patch, the execution of the sdhci_calc_timeout could end up
with a backtrace:

<0>[    4.014319] divide error: 0000 [#1] PREEMPT SMP
<4>[    4.014352] Modules linked in: g_ether
<4>[    4.014376]
<4>[    4.014393] Pid: 33, comm: kworker/u:2 Not tainted 3.0.0+ #646
<4>[    4.014421] EIP: 0060:[<c12fa38e>] EFLAGS: 00010046 CPU: 1
<4>[    4.014449] EIP is at sdhci_calc_timeout+0x2e/0x100
<4>[    4.014468] EAX: 00000000 EBX: f5930fc8 ECX: 00000000 EDX: 00000000
<4>[    4.014488] ESI: f5291de8 EDI: f5291db8 EBP: f5291c6c ESP: f5291c50
<4>[    4.014508]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
<0>[    4.014529] Process kworker/u:2 (pid: 33, ti=f5290000 task=f53065a0 task.ti=f5290000)
<0>[    4.014546] Stack:
<4>[    4.014557]  00000082 c1054fdd f5291c78 04000000 f5930fc8 f5291de8 f5291db8 f5291cac
<4>[    4.014611]  c12fab7c c107a98b f5291c88 c13b6d3f f593109c f5882000 f5291cac c1054fdd
<4>[    4.014663]  00000000 00000000 f5882000 00000082 f5930fc8 f5291db8 0000000a f5291ccc
<0>[    4.014716] Call Trace:
<4>[    4.014743]  [<c1054fdd>] ? mod_timer+0x11d/0x380
<4>[    4.014770]  [<c12fab7c>] sdhci_prepare_data+0x2c/0x3a0
<4>[    4.014798]  [<c107a98b>] ? trace_hardirqs_off+0xb/0x10
<4>[    4.014827]  [<c13b6d3f>] ? _raw_spin_unlock_irqrestore+0x2f/0x60
<4>[    4.014854]  [<c1054fdd>] ? mod_timer+0x11d/0x380
<4>[    4.014880]  [<c12fc7db>] sdhci_send_command+0xdb/0x210
<4>[    4.014906]  [<c12fd5f3>] sdhci_request+0xc3/0x150
<4>[    4.014932]  [<c12ec56a>] mmc_start_request+0xda/0x200
<4>[    4.014960]  [<c120d7c2>] ? __raw_spin_lock_init+0x32/0x60
<4>[    4.014989]  [<c1066a85>] ? __init_waitqueue_head+0x35/0x50
<4>[    4.015015]  [<c12ec70b>] mmc_wait_for_req+0x7b/0x90
<4>[    4.015045]  [<c12f0c67>] mmc_send_cxd_data+0xf7/0x130
<4>[    4.015076]  [<c12ecbc0>] ? mmc_erase+0x140/0x140
<4>[    4.015102]  [<c12f139d>] mmc_send_ext_csd+0x1d/0x20
<4>[    4.015125]  [<c12efef0>] mmc_get_ext_csd+0x70/0x140
<4>[    4.015151]  [<c12effe8>] mmc_compare_ext_csds+0x28/0x190
<4>[    4.015176]  [<c12f039f>] mmc_init_card+0x24f/0x650
<4>[    4.015201]  [<c13b6d5d>] ? _raw_spin_unlock_irqrestore+0x4d/0x60
<4>[    4.015226]  [<c107fd9c>] ? trace_hardirqs_on_caller+0x11c/0x160
<4>[    4.015255]  [<c12f09a4>] mmc_attach_mmc+0xa4/0x190
<4>[    4.015282]  [<c12ee3f0>] mmc_rescan+0x210/0x240
<4>[    4.015311]  [<c105f9b6>] process_one_work+0x176/0x550
<4>[    4.015336]  [<c105f93a>] ? process_one_work+0xfa/0x550
<4>[    4.015360]  [<c12ee1e0>] ? mmc_init_erase+0x140/0x140
<4>[    4.015385]  [<c1061c2a>] worker_thread+0x12a/0x2c0
<4>[    4.015410]  [<c1061b00>] ? manage_workers.clone.18+0x100/0x100
<4>[    4.015437]  [<c1066244>] kthread+0x74/0x80
<4>[    4.015463]  [<c10661d0>] ? __init_kthread_worker+0x60/0x60
<4>[    4.015490]  [<c13b7dfa>] kernel_thread_helper+0x6/0xd
<0>[    4.015507] Code: 57 89 d7 56 53 89 c3 83 ec 10 8b 40 04 8b 72 28 f6 c4 10 89 45 f0 0f 85 91 00 00 00 85 f6 0f 84 c1 00 00 00 8b 4e 04 31 d2 89 c8 <f7> 73 58 ba d3 4d 62 10 89 c1 8b 06 f7 e2 c1 ea 06 01 d1 f7 45
<0>[    4.015829] EIP: [<c12fa38e>] sdhci_calc_timeout+0x2e/0x100 SS:ESP 0068:f5291c50

Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

OMAP4: clock: re-enable previous clockdomain enable/disable sequence
After commit 665d001 ("OMAP2+: hwmod:
Follow the recommended PRCM module enable sequence"), device drivers
for OMAP IP blocks that do not use runtime PM can cause oopses or
kernel instability[1][2].

This is because those non-runtime PM drivers do not use the hwmod
code, which implements the correct IP block enable and disable
sequence.

Several options for dealing with this problem have been proposed:

1. Add a new field to the OMAP struct clk to mark clocks that are
   currently used by non-runtime PM drivers.  Modify the clock code to
   use the old clockdomain sequence for these marked clocks.  As
   drivers are converted to use runtime PM, remove the annotation from
   the clocks.

2. Similar to #1, but associate the flag with the struct omap_clk
   instead.

3. Add IDLEST wait support to the OMAP4 clock code, similar to the way
   it is implemented for OMAP2/3, and enable it in each struct clk
   currently used by non-runtime PM drivers.  As drivers are converted
   to use runtime PM, remove the annotation from the clocks.

4. Do nothing; leave the problem to those responsible for the
   unconverted drivers.

5. Re-enable clock-based clockdomain control in the OMAP4 clock code.
   This would revert back to the behavior of Linux 3.0, simply with a
   slightly longer module enable/disable latency.

Unfortunately, no approach seemed particularly good.  Options 1
through 3 seemed unwise due to the following reasons:

A. The OMAP struct clks are intended primarily to describe hardware
   clock nodes, and the intention is that no driver-specific data
   should be stored there (applies to #1)

B. The resulting patch would have been quite large for the -rc series
   (applies to #1, #2, #3)

C. The patch would have been a new, yet temporary hack; and similar fixes
   have drawn negative comments in the recent past (see for example [3])

Option 4 is undesirable because commit
665d001 ("OMAP2+: hwmod: Follow the
recommended PRCM module enable sequence") has resulted in a less
stable kernel; and kernel stability is more important than OMAP4 power
management.

Option 5 is the approach taken in this patch.  This seemed to be the
least intrusive approach for 3.1-rc.

The approach in this patch was originally proposed by Ohad Ben-Cohen
<ohad@wizery.com>.  I'm simply writing the commit message and passing
it along.

...

Thanks to Luciano Coelho <coelho@ti.com> for reporting the problem.
Thanks to Ohad Ben-Cohen <ohad@wizery.com> for tracking the problem
down, generating a temporary workaround, and proposing a patch to deal
with the problem.  Thanks to Rajendra Nayak <rnayak@ti.com> for
proposing another patch to deal with the problem.  Thanks to Felipe
Balbi <balbi@ti.com> for comments.

1. Coelho, Luciano <coelho@ti.com>.  _Re: Oops on ehci_hcd when
   booting 3.0.0-rc2 on panda_.  Tue, 09 Aug 2011 14:26:08 +0300.
   Posted to the <linux-omap@vger.kernel.org> mailing list.  Available
   from (among others)
   http://www.spinics.net/linux/lists/linux-omap/msg55213.html

2. Munegowda, Keshava <keshava_mgowda@ti.com>. _Re: Oops on ehci_hcd
   when booting 3.0.0-rc2 on panda_.  Thu, 11 Aug 2011 13:51:05 +0530.
   Posted to the <linux-omap@vger.kernel.org> mailing list.  Available
   from (among others)
   http://www.spinics.net/linux/lists/linux-omap/msg55371.html

3. King, Russell <linux@arm.linux.org.uk>.  _Re: [PATCH 5/8] OMAP4:
   PM: TEMP: Prevent l3init from idling/force sleep_.  Thu, 23 Jun
   2011 16:22:49 +0100.  Posted to the <linux-omap@vger.kernel.org>
   mailing list.  Available from (among others)
   http://www.mail-archive.com/linux-omap@vger.kernel.org/msg51392.html

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Luciano Coelho <coelho@ti.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Rajendra Nayak <rnayak@ti.com>
Cc: Benoît Cousson <b-cousson@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

arp: fix rcu lockdep splat in arp_process()
Dave Jones reported a lockdep splat triggered by an arp_process() call
from parp_redo().

Commit faa9dcf (arp: RCU changes) is the origin of the bug, since
it assumed arp_process() was called under rcu_read_lock(), which is not
true in this particular path.

Instead of adding rcu_read_lock() in parp_redo(), I chose to add it in
neigh_proxy_process() to take care of IPv6 side too.

 ===================================================
 [ INFO: suspicious rcu_dereference_check() usage. ]
 ---------------------------------------------------
 include/linux/inetdevice.h:209 invoked rcu_dereference_check() without
protection!

 other info that might help us debug this:

 rcu_scheduler_active = 1, debug_locks = 0
 4 locks held by setfiles/2123:
  #0:  (&sb->s_type->i_mutex_key#13){+.+.+.}, at: [<ffffffff8114cbc4>]
walk_component+0x1ef/0x3e8
  #1:  (&isec->lock){+.+.+.}, at: [<ffffffff81204bca>]
inode_doinit_with_dentry+0x3f/0x41f
  #2:  (&tbl->proxy_timer){+.-...}, at: [<ffffffff8106a803>]
run_timer_softirq+0x157/0x372
  #3:  (class){+.-...}, at: [<ffffffff8141f256>] neigh_proxy_process
+0x36/0x103

 stack backtrace:
 Pid: 2123, comm: setfiles Tainted: G        W
3.1.0-0.rc2.git7.2.fc16.x86_64 #1
 Call Trace:
  <IRQ>  [<ffffffff8108ca23>] lockdep_rcu_dereference+0xa7/0xaf
  [<ffffffff8146a0b7>] __in_dev_get_rcu+0x55/0x5d
  [<ffffffff8146a751>] arp_process+0x25/0x4d7
  [<ffffffff8146ac11>] parp_redo+0xe/0x10
  [<ffffffff8141f2ba>] neigh_proxy_process+0x9a/0x103
  [<ffffffff8106a8c4>] run_timer_softirq+0x218/0x372
  [<ffffffff8106a803>] ? run_timer_softirq+0x157/0x372
  [<ffffffff8141f220>] ? neigh_stat_seq_open+0x41/0x41
  [<ffffffff8108f2f0>] ? mark_held_locks+0x6d/0x95
  [<ffffffff81062bb6>] __do_softirq+0x112/0x25a
  [<ffffffff8150d27c>] call_softirq+0x1c/0x30
  [<ffffffff81010bf5>] do_softirq+0x4b/0xa2
  [<ffffffff81062f65>] irq_exit+0x5d/0xcf
  [<ffffffff8150dc11>] smp_apic_timer_interrupt+0x7c/0x8a
  [<ffffffff8150baf3>] apic_timer_interrupt+0x73/0x80
  <EOI>  [<ffffffff8108f439>] ? trace_hardirqs_on_caller+0x121/0x158
  [<ffffffff814fc285>] ? __slab_free+0x30/0x24c
  [<ffffffff814fc283>] ? __slab_free+0x2e/0x24c
  [<ffffffff81204e74>] ? inode_doinit_with_dentry+0x2e9/0x41f
  [<ffffffff81204e74>] ? inode_doinit_with_dentry+0x2e9/0x41f
  [<ffffffff81204e74>] ? inode_doinit_with_dentry+0x2e9/0x41f
  [<ffffffff81130cb0>] kfree+0x108/0x131
  [<ffffffff81204e74>] inode_doinit_with_dentry+0x2e9/0x41f
  [<ffffffff81204fc6>] selinux_d_instantiate+0x1c/0x1e
  [<ffffffff81200f4f>] security_d_instantiate+0x21/0x23
  [<ffffffff81154625>] d_instantiate+0x5c/0x61
  [<ffffffff811563ca>] d_splice_alias+0xbc/0xd2
  [<ffffffff811b17ff>] ext4_lookup+0xba/0xeb
  [<ffffffff8114bf1e>] d_alloc_and_lookup+0x45/0x6b
  [<ffffffff8114cbea>] walk_component+0x215/0x3e8
  [<ffffffff8114cdf8>] lookup_last+0x3b/0x3d
  [<ffffffff8114daf3>] path_lookupat+0x82/0x2af
  [<ffffffff8110fc53>] ? might_fault+0xa5/0xac
  [<ffffffff8110fc0a>] ? might_fault+0x5c/0xac
  [<ffffffff8114c564>] ? getname_flags+0x31/0x1ca
  [<ffffffff8114dd48>] do_path_lookup+0x28/0x97
  [<ffffffff8114df2c>] user_path_at+0x59/0x96
  [<ffffffff811467ad>] ? cp_new_stat+0xf7/0x10d
  [<ffffffff811469a6>] vfs_fstatat+0x44/0x6e
  [<ffffffff811469ee>] vfs_lstat+0x1e/0x20
  [<ffffffff81146b3d>] sys_newlstat+0x1a/0x33
  [<ffffffff8108f439>] ? trace_hardirqs_on_caller+0x121/0x158
  [<ffffffff812535fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff8150af82>] system_call_fastpath+0x16/0x1b

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

EnJens pushed a commit that referenced this issue Oct 2, 2012

lockdep: Add helper function for dir vs file i_mutex annotation
Purely in-memory filesystems do not use the inode hash as the dcache
tells us if an entry already exists.  As a result, they do not call
unlock_new_inode, and thus directory inodes do not get put into a
different lockdep class for i_sem.

We need the different lockdep classes, because the locking order for
i_mutex is different for directory inodes and regular inodes.  Directory
inodes can do "readdir()", which takes i_mutex *before* possibly taking
mm->mmap_sem (due to a page fault while copying the directory entry to
user space).

In contrast, regular inodes can be mmap'ed, which takes mm->mmap_sem
before accessing i_mutex.

The two cases can never happen for the same inode, so no real deadlock
can occur, but without the different lockdep classes, lockdep cannot
understand that.  As a result, if CONFIG_DEBUG_LOCK_ALLOC is set, this
can lead to false positives from lockdep like below:

    find/645 is trying to acquire lock:
     (&mm->mmap_sem){++++++}, at: [<ffffffff81109514>] might_fault+0x5c/0xac

    but task is already holding lock:
     (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffff81149f34>]
    vfs_readdir+0x5b/0xb4

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&sb->s_type->i_mutex_key#15){+.+.+.}:
          [<ffffffff8108ac26>] lock_acquire+0xbf/0x103
          [<ffffffff814db822>] __mutex_lock_common+0x4c/0x361
          [<ffffffff814dbc46>] mutex_lock_nested+0x40/0x45
          [<ffffffff811daa87>] hugetlbfs_file_mmap+0x82/0x110
          [<ffffffff81111557>] mmap_region+0x258/0x432
          [<ffffffff811119dd>] do_mmap_pgoff+0x2ac/0x306
          [<ffffffff81111b4f>] sys_mmap_pgoff+0x118/0x16a
          [<ffffffff8100c858>] sys_mmap+0x22/0x24
          [<ffffffff814e3ec2>] system_call_fastpath+0x16/0x1b

    -> #0 (&mm->mmap_sem){++++++}:
          [<ffffffff8108a4bc>] __lock_acquire+0xa1a/0xcf7
          [<ffffffff8108ac26>] lock_acquire+0xbf/0x103
          [<ffffffff81109541>] might_fault+0x89/0xac
          [<ffffffff81149cff>] filldir+0x6f/0xc7
          [<ffffffff811586ea>] dcache_readdir+0x67/0x205
          [<ffffffff81149f54>] vfs_readdir+0x7b/0xb4
          [<ffffffff8114a073>] sys_getdents+0x7e/0xd1
          [<ffffffff814e3ec2>] system_call_fastpath+0x16/0x1b

This patch moves the directory vs file lockdep annotation into a helper
function that can be called by in-memory filesystems and has hugetlbfs
call it.

Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

[SCSI] fcoe: Fix deadlock between fip's recv_work and rtnl
The rtnl cannot be held durrng the fcoe_interface_put.
If it is the last reference on the fcoe_interface the
fcoe_ctlr_destroy will be called as a part of the
cleanup, ultimately calling cancel_work_sync(&fip->recv_work);

If we are processing a flogi response we will be in
the recv_work context and we will lock the rtnl to
add a new unicast MAC address. This is how the deadlock
can occur.

The fix is simply to move the rtnl_lock/unlock into
fcoe_interface_cleanup so that it can be unlocked before
fcoe_interface_put is called.

Here is the lockdep report:

Jul 21 11:26:35 bubba [  223.870702]
ul 21 11:26:35 bubba [  223.870704] =======================================================
Jul 21 11:26:35 bubba [  223.871255] [ INFO: possible circular locking dependency detected ]
Jul 21 11:26:35 bubba [  223.871530] 3.0.0-rc7+ #1
Jul 21 11:26:35 bubba [  223.871797] -------------------------------------------------------
Jul 21 11:26:35 bubba [  223.872072] lockdeptest.sh/3464 is trying to acquire lock:
Jul 21 11:26:35 bubba [  223.872345]  ((&fip->recv_work)
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff810531f1>] wait_on_work+0x0/0xbd
Jul 21 11:26:35 bubba [  223.873022]
Jul 21 11:26:35 bubba [  223.873023] but task is already holding lock:
Jul 21 11:26:35 bubba [  223.873555]  (rtnl_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14
Jul 21 11:26:35 bubba [  223.874229]
Jul 21 11:26:35 bubba [  223.874230] which lock already depends on the new lock.
Jul 21 11:26:35 bubba [  223.874231]
Jul 21 11:26:35 bubba [  223.875032]
Jul 21 11:26:35 bubba [  223.875033] the existing dependency chain (in reverse order) is:
Jul 21 11:26:35 bubba [  223.875573]
Jul 21 11:26:35 bubba [  223.875573] -> #1
Jul 21 11:26:35 bubba (rtnl_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba :
Jul 21 11:26:35 bubba [  223.876301]
Jul 21 11:26:35 bubba [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7
Jul 21 11:26:35 bubba [  223.876645]
Jul 21 11:26:35 bubba [<ffffffff8151d975>] __mutex_lock_common+0x47/0x30d
Jul 21 11:26:35 bubba [  223.876991]
Jul 21 11:26:35 bubba [<ffffffff8151dd36>] mutex_lock_nested+0x3b/0x40
Jul 21 11:26:35 bubba [  223.877334]
Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14
Jul 21 11:26:35 bubba [  223.877675]
Jul 21 11:26:35 bubba [<ffffffffa003d5a0>] fcoe_update_src_mac+0x2b/0x80 [fcoe]
Jul 21 11:26:35 bubba [  223.878022]
Jul 21 11:26:35 bubba [<ffffffffa003d698>] fcoe_flogi_resp+0x5e/0x79 [fcoe]
Jul 21 11:26:35 bubba [  223.878366]
Jul 21 11:26:35 bubba [<ffffffffa001566f>] fc_exch_recv+0x7f5/0x9da [libfc]
Jul 21 11:26:35 bubba [  223.878713]
Jul 21 11:26:35 bubba [<ffffffffa00327d8>] fcoe_ctlr_recv_work+0x71f/0x10dc [libfcoe]
Jul 21 11:26:35 bubba [  223.879258]
Jul 21 11:26:35 bubba [<ffffffff81053761>] process_one_work+0x1d7/0x347
Jul 21 11:26:35 bubba [  223.879601]
Jul 21 11:26:35 bubba [<ffffffff81054ade>] worker_thread+0xf8/0x17c
Jul 21 11:26:35 bubba [  223.879944]
Jul 21 11:26:35 bubba [<ffffffff81058184>] kthread+0x7d/0x85
Jul 21 11:26:35 bubba [  223.880287]
Jul 21 11:26:35 bubba [<ffffffff81526414>] kernel_thread_helper+0x4/0x10
Jul 21 11:26:35 bubba [  223.880634]
Jul 21 11:26:35 bubba [  223.880635] -> #0
Jul 21 11:26:35 bubba ((&fip->recv_work)
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba :
Jul 21 11:26:35 bubba [  223.881357]
Jul 21 11:26:35 bubba [<ffffffff8106b93e>] __lock_acquire+0xb1d/0xe2c
Jul 21 11:26:35 bubba [  223.881695]
Jul 21 11:26:35 bubba [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7
Jul 21 11:26:35 bubba [  223.882033]
Jul 21 11:26:35 bubba [<ffffffff81053241>] wait_on_work+0x50/0xbd
Jul 21 11:26:35 bubba [  223.882378]
Jul 21 11:26:35 bubba [<ffffffff81053b32>] __cancel_work_timer+0xb6/0xf4
Jul 21 11:26:35 bubba [  223.882718]
Jul 21 11:26:35 bubba [<ffffffff81053b8a>] cancel_work_sync+0xb/0xd
Jul 21 11:26:35 bubba [  223.883057]
Jul 21 11:26:35 bubba [<ffffffffa00317e6>] fcoe_ctlr_destroy+0x1d/0x67 [libfcoe]
Jul 21 11:26:35 bubba [  223.883399]
Jul 21 11:26:35 bubba [<ffffffffa003e51e>] fcoe_interface_release+0x21/0x45 [fcoe]
Jul 21 11:26:35 bubba [  223.883940]
Jul 21 11:26:35 bubba [<ffffffff811fbbe6>] kref_put+0x43/0x4d
Jul 21 11:26:35 bubba [  223.884280]
Jul 21 11:26:35 bubba [<ffffffffa003ebba>] fcoe_interface_put+0x17/0x19 [fcoe]
Jul 21 11:26:35 bubba [  223.884624]
Jul 21 11:26:35 bubba [<ffffffffa003f2a6>] fcoe_interface_cleanup+0x188/0x193 [fcoe]
Jul 21 11:26:35 bubba [  223.885163]
Jul 21 11:26:35 bubba [<ffffffffa003f303>] fcoe_destroy+0x52/0x72 [fcoe]
Jul 21 11:26:35 bubba [  223.885502]
Jul 21 11:26:35 bubba [<ffffffffa00340a4>] fcoe_transport_destroy+0xab/0x110 [libfcoe]
Jul 21 11:26:35 bubba [  223.886045]
Jul 21 11:26:35 bubba [<ffffffff81056153>] param_attr_store+0x43/0x62
Jul 21 11:26:35 bubba [  223.886385]
Jul 21 11:26:35 bubba [<ffffffff8105602d>] module_attr_store+0x21/0x25
Jul 21 11:26:35 bubba [  223.886728]
Jul 21 11:26:35 bubba [<ffffffff8114c23d>] sysfs_write_file+0x103/0x13f
Jul 21 11:26:35 bubba [  223.887068]
Jul 21 11:26:35 bubba [<ffffffff810f3e7b>] vfs_write+0xa7/0xfa
Jul 21 11:26:35 bubba [  223.887406]
Jul 21 11:26:35 bubba [<ffffffff810f4073>] sys_write+0x45/0x69
Jul 21 11:26:35 bubba [  223.887742]
Jul 21 11:26:35 bubba [<ffffffff815252bb>] system_call_fastpath+0x16/0x1b
Jul 21 11:26:35 bubba [  223.888083]
Jul 21 11:26:35 bubba [  223.888084] other info that might help us debug this:
Jul 21 11:26:35 bubba [  223.888085]
Jul 21 11:26:35 bubba [  223.888879]  Possible unsafe locking scenario:
Jul 21 11:26:35 bubba [  223.888881]
Jul 21 11:26:35 bubba [  223.889411]        CPU0                    CPU1
Jul 21 11:26:35 bubba [  223.889683]        ----                    ----
Jul 21 11:26:35 bubba [  223.889955]   lock(
Jul 21 11:26:35 bubba rtnl_mutex
Jul 21 11:26:35 bubba );
Jul 21 11:26:35 bubba [  223.890349]                                lock(
Jul 21 11:26:35 bubba (&fip->recv_work)
Jul 21 11:26:35 bubba );
Jul 21 11:26:35 bubba [  223.890751]                                lock(
Jul 21 11:26:35 bubba rtnl_mutex
Jul 21 11:26:35 bubba );
Jul 21 11:26:35 bubba [  223.891154]   lock(
Jul 21 11:26:35 bubba (&fip->recv_work)
Jul 21 11:26:35 bubba );
Jul 21 11:26:35 bubba [  223.891549]
Jul 21 11:26:35 bubba [  223.891550]  *** DEADLOCK ***
Jul 21 11:26:35 bubba [  223.891551]
Jul 21 11:26:35 bubba [  223.892347] 6 locks held by lockdeptest.sh/3464:
Jul 21 11:26:35 bubba [  223.892621]  #0:
Jul 21 11:26:35 bubba (&buffer->mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff8114c171>] sysfs_write_file+0x37/0x13f
Jul 21 11:26:35 bubba [  223.893359]  #1:
Jul 21 11:26:35 bubba (s_active
Jul 21 11:26:35 bubba ){++++.+}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff8114c21c>] sysfs_write_file+0xe2/0x13f
Jul 21 11:26:35 bubba [  223.894094]  #2:
Jul 21 11:26:35 bubba (param_lock
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff81056146>] param_attr_store+0x36/0x62
Jul 21 11:26:35 bubba [  223.894835]  #3:
Jul 21 11:26:35 bubba (ft_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffffa0034017>] fcoe_transport_destroy+0x1e/0x110 [libfcoe]
Jul 21 11:26:35 bubba [  223.895574]  #4:
Jul 21 11:26:35 bubba (fcoe_config_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffffa003f2c9>] fcoe_destroy+0x18/0x72 [fcoe]
Jul 21 11:26:35 bubba [  223.896314]  #5:
Jul 21 11:26:35 bubba (rtnl_mutex
Jul 21 11:26:35 bubba ){+.+.+.}
Jul 21 11:26:35 bubba , at:
Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14
Jul 21 11:26:35 bubba [  223.897047]
Jul 21 11:26:35 bubba [  223.897048] stack backtrace:
Jul 21 11:26:35 bubba [  223.897578] Pid: 3464, comm: lockdeptest.sh Not tainted 3.0.0-rc7+ #1
Jul 21 11:26:35 bubba [  223.897853] Call Trace:
Jul 21 11:26:35 bubba [  223.898128]  [<ffffffff81068e16>] print_circular_bug+0x1f8/0x209
Jul 21 11:26:35 bubba [  223.898416]  [<ffffffff8106b93e>] __lock_acquire+0xb1d/0xe2c
Jul 21 11:26:35 bubba [  223.898699]  [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6
Jul 21 11:26:35 bubba [  223.898982]  [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7
Jul 21 11:26:35 bubba [  223.899263]  [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6
Jul 21 11:26:35 bubba [  223.899547]  [<ffffffff8104a097>] ? mod_timer+0x8f/0x98
Jul 21 11:26:35 bubba [  223.899827]  [<ffffffff81053241>] wait_on_work+0x50/0xbd
Jul 21 11:26:35 bubba [  223.900108]  [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6
Jul 21 11:26:35 bubba [  223.900390]  [<ffffffff81053b32>] __cancel_work_timer+0xb6/0xf4
Jul 21 11:26:35 bubba [  223.900671]  [<ffffffff81053b8a>] cancel_work_sync+0xb/0xd
Jul 21 11:26:35 bubba [  223.900953]  [<ffffffffa00317e6>] fcoe_ctlr_destroy+0x1d/0x67 [libfcoe]
Jul 21 11:26:35 bubba [  223.901237]  [<ffffffffa003e51e>] fcoe_interface_release+0x21/0x45 [fcoe]
Jul 21 11:26:35 bubba [  223.901522]  [<ffffffffa003e4fd>] ? fcoe_enable+0x6b/0x6b [fcoe]
Jul 21 11:26:35 bubba [  223.901803]  [<ffffffff811fbbe6>] kref_put+0x43/0x4d
Jul 21 11:26:35 bubba [  223.902083]  [<ffffffffa003ebba>] fcoe_interface_put+0x17/0x19 [fcoe]
Jul 21 11:26:35 bubba [  223.902367]  [<ffffffffa003f2a6>] fcoe_interface_cleanup+0x188/0x193 [fcoe]
Jul 21 11:26:35 bubba [  223.902653]  [<ffffffff8151dd36>] ? mutex_lock_nested+0x3b/0x40
Jul 21 11:26:35 bubba [  223.902939]  [<ffffffffa003f303>] fcoe_destroy+0x52/0x72 [fcoe]
Jul 21 11:26:35 bubba [  223.903223]  [<ffffffffa00340a4>] fcoe_transport_destroy+0xab/0x110 [libfcoe]
Jul 21 11:26:35 bubba [  223.903508]  [<ffffffff81056153>] param_attr_store+0x43/0x62
Jul 21 11:26:35 bubba [  223.903792]  [<ffffffff8105602d>] module_attr_store+0x21/0x25
Jul 21 11:26:35 bubba [  223.904075]  [<ffffffff8114c23d>] sysfs_write_file+0x103/0x13f
Jul 21 11:26:35 bubba [  223.904357]  [<ffffffff810f3e7b>] vfs_write+0xa7/0xfa
Jul 21 11:26:35 bubba [  223.904642]  [<ffffffff810f51d6>] ? fget_light+0x35/0x96
Jul 21 11:26:35 bubba [  223.904923]  [<ffffffff810f4073>] sys_write+0x45/0x69
Jul 21 11:26:35 bubba [  223.905204]  [<ffffffff815252bb>] system_call_fastpath+0x16/0x1b
Jul 21 11:26:36 bubba [  223.964438] ixgbe 0000:05:00.0: eth3: detected SFP+: 5
Jul 21 11:26:37 bubba [  225.196702] ixgbe 0000:05:00.0: eth3: NIC Link is Up 10 Gbps, Flow Control: None

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Reviewed-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

mmc: core: prevent aggressive clock gating racing with ios updates
We have seen at least two different races when clock gating kicks in in a
middle of ios structure update.

First one happens when ios->clock is changed outside of aggressive clock
gating framework, for example via mmc_set_clock(). The race might happen
when we run following code:

mmc_set_ios():
	...
	if (ios->clock > 0)
		mmc_set_ungated(host);

Now if gating kicks in right after the condition check we end up setting
host->clk_gated to false even though we have just gated the clock. Next
time a request is started we try to ungate and restore the clock in
mmc_host_clk_hold(). However since we have host->clk_gated set to false the
original clock is not restored.

This eventually will cause the host controller to hang since its clock is
disabled while we are trying to issue a request. For example on Intel
Medfield platform we see:

[   13.818610] mmc2: Timeout waiting for hardware interrupt.
[   13.818698] sdhci: =========== REGISTER DUMP (mmc2)===========
[   13.818753] sdhci: Sys addr: 0x00000000 | Version:  0x00008901
[   13.818804] sdhci: Blk size: 0x00000000 | Blk cnt:  0x00000000
[   13.818853] sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[   13.818903] sdhci: Present:  0x1fff0000 | Host ctl: 0x00000001
[   13.818951] sdhci: Power:    0x0000000d | Blk gap:  0x00000000
[   13.819000] sdhci: Wake-up:  0x00000000 | Clock:    0x00000000
[   13.819049] sdhci: Timeout:  0x00000000 | Int stat: 0x00000000
[   13.819098] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
[   13.819147] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[   13.819196] sdhci: Caps:     0x6bee32b2 | Caps_1:   0x00000000
[   13.819245] sdhci: Cmd:      0x00000000 | Max curr: 0x00000000
[   13.819292] sdhci: Host ctl2: 0x00000000
[   13.819331] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[   13.819377] sdhci: ===========================================
[   13.919605] mmc2: Reset 0x2 never completed.

and it never recovers.

Second race might happen while running mmc_power_off():

static void mmc_power_off(struct mmc_host *host)
{
	host->ios.clock = 0;
	host->ios.vdd = 0;

[ clock gating kicks in here ]

	/*
	 * Reset ocr mask to be the highest possible voltage supported for
	 * this mmc host. This value will be used at next power up.
	 */
	host->ocr = 1 << (fls(host->ocr_avail) - 1);

	if (!mmc_host_is_spi(host)) {
		host->ios.bus_mode = MMC_BUSMODE_OPENDRAIN;
		host->ios.chip_select = MMC_CS_DONTCARE;
	}
	host->ios.power_mode = MMC_POWER_OFF;
	host->ios.bus_width = MMC_BUS_WIDTH_1;
	host->ios.timing = MMC_TIMING_LEGACY;
	mmc_set_ios(host);
}

If the clock gating worker kicks in while we are only partially updated the
ios structure the host controller gets incomplete ios and might not work as
supposed. Again on Intel Medfield platform we get:

[    4.185349] kernel BUG at drivers/mmc/host/sdhci.c:1155!
[    4.185422] invalid opcode: 0000 [#1] PREEMPT SMP
[    4.185509] Modules linked in:
[    4.185565]
[    4.185608] Pid: 4, comm: kworker/0:0 Not tainted 3.0.0+ #240 Intel Corporation Medfield/iCDKA
[    4.185742] EIP: 0060:[<c136364e>] EFLAGS: 00010083 CPU: 0
[    4.185827] EIP is at sdhci_set_power+0x3e/0xd0
[    4.185891] EAX: f5ff98e0 EBX: f5ff98e0 ECX: 00000000 EDX: 00000001
[    4.185970] ESI: f5ff977c EDI: f5ff9904 EBP: f644fe98 ESP: f644fe94
[    4.186049]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    4.186125] Process kworker/0:0 (pid: 4, ti=f644e000 task=f644c0e0 task.ti=f644e000)
[    4.186219] Stack:
[    4.186257]  f5ff98e0 f644feb0 c1365173 00000282 f5ff9460 f5ff96e0 f5ff96e0 f644feec
[    4.186418]  c1355bd8 f644c0e0 c1499c3d f5ff96e0 f644fed4 00000006 f5ff96e0 00000286
[    4.186579]  f644fedc c107922b f644feec 00000286 f5ff9460 f5ff9700 f644ff10 c135839e
[    4.186739] Call Trace:
[    4.186802]  [<c1365173>] sdhci_set_ios+0x1c3/0x340
[    4.186883]  [<c1355bd8>] mmc_gate_clock+0x68/0x120
[    4.186963]  [<c1499c3d>] ? _raw_spin_unlock_irqrestore+0x4d/0x60
[    4.187052]  [<c107922b>] ? trace_hardirqs_on+0xb/0x10
[    4.187134]  [<c135839e>] mmc_host_clk_gate_delayed+0xbe/0x130
[    4.187219]  [<c105ec09>] ? process_one_work+0xf9/0x5b0
[    4.187300]  [<c135841d>] mmc_host_clk_gate_work+0xd/0x10
[    4.187379]  [<c105ec82>] process_one_work+0x172/0x5b0
[    4.187457]  [<c105ec09>] ? process_one_work+0xf9/0x5b0
[    4.187538]  [<c1358410>] ? mmc_host_clk_gate_delayed+0x130/0x130
[    4.187625]  [<c105f3c8>] worker_thread+0x118/0x330
[    4.187700]  [<c1496cee>] ? preempt_schedule+0x2e/0x50
[    4.187779]  [<c105f2b0>] ? rescuer_thread+0x1f0/0x1f0
[    4.187857]  [<c1062cf4>] kthread+0x74/0x80
[    4.187931]  [<c1062c80>] ? __init_kthread_worker+0x60/0x60
[    4.188015]  [<c149acfa>] kernel_thread_helper+0x6/0xd
[    4.188079] Code: 81 fa 00 00 04 00 0f 84 a7 00 00 00 7f 21 81 fa 80 00 00 00 0f 84 92 00 00 00 81 fa 00 00 0
[    4.188780] EIP: [<c136364e>] sdhci_set_power+0x3e/0xd0 SS:ESP 0068:f644fe94
[    4.188898] ---[ end trace a7b23eecc71777e4 ]---

This BUG() comes from the fact that ios.power_mode was still in previous
value (MMC_POWER_ON) and ios.vdd was set to zero.

We prevent these by inhibiting the clock gating while we update the ios
structure.

Both problems can be reproduced by simply running the device in a reboot
loop.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Chris Ball <cjb@laptop.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

dmaengine/ste_dma40: fix Oops due to double free of client descriptor
The client list may exist in two lists at the same time. This makes free
fail since the same desc is freed multiple times. Remove desc from
client list when adding it to the pending queue. Move free of client owned
descriptors from free_dma() to terminate_all().

Unable to handle kernel paging request at virtual address 00100104
pgd = dea8c000
[00100104] *pgd=1ea62831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] PREEMPT SMP
Modules linked in:
CPU: 0    Not tainted  (3.1.0-rc3+ #58)
PC is at d40_free_chan_resources+0x64/0x330

Signed-off-by: Per Forlin <per.forlin@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

xen/irq: Alter the locking to use a mutex instead of a spinlock.
When we allocate/change the IRQ informations, we do not
need to use spinlocks. We can use a mutex (which is
what the generic IRQ code does for allocations/changes).
Fixes a slew of:

BUG: sleeping function called from invalid context at /linux/kernel/mutex.c:271
in_atomic(): 1, irqs_disabled(): 0, pid: 3216, name: xenstored
2 locks held by xenstored/3216:
 #0:  (&u->bind_mutex){......}, at: [<ffffffffa02e0920>] evtchn_ioctl+0x30/0x3a0 [xen_evtchn]
 #1:  (irq_mapping_update_lock){......}, at: [<ffffffff8138b274>] bind_evtchn_to_irq+0x24/0x90
Pid: 3216, comm: xenstored Not tainted 3.1.0-rc6-00021-g437a3d1 #2
Call Trace:
 [<ffffffff81088d10>] __might_sleep+0x100/0x130
 [<ffffffff81645c2f>] mutex_lock_nested+0x2f/0x50
 [<ffffffff81627529>] __irq_alloc_descs+0x49/0x200
 [<ffffffffa02e0920>] ? evtchn_ioctl+0x30/0x3a0 [xen_evtchn]
 [<ffffffff8138b214>] xen_allocate_irq_dynamic+0x34/0x70
 [<ffffffff8138b2ad>] bind_evtchn_to_irq+0x5d/0x90
 [<ffffffffa02e03c0>] ? evtchn_bind_to_user+0x60/0x60 [xen_evtchn]
 [<ffffffff8138c282>] bind_evtchn_to_irqhandler+0x32/0x80
 [<ffffffffa02e03a9>] evtchn_bind_to_user+0x49/0x60 [xen_evtchn]
 [<ffffffffa02e0a34>] evtchn_ioctl+0x144/0x3a0 [xen_evtchn]
 [<ffffffff811b4070>] ? vfsmount_lock_local_unlock+0x50/0x80
 [<ffffffff811a6a1a>] do_vfs_ioctl+0x9a/0x5e0
 [<ffffffff811b476f>] ? mntput+0x1f/0x30
 [<ffffffff81196259>] ? fput+0x199/0x240
 [<ffffffff811a7001>] sys_ioctl+0xa1/0xb0
 [<ffffffff8164ea82>] system_call_fastpath+0x16/0x1b

Reported-by: Jim Burns <jim_burn@bellsouth.net>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

MIPS: Octeon: Select CONFIG_HOLES_IN_ZONE
Current Octeon systems do in fact have holes in their memory zones.
We need to select HOLES_IN_ZONE.  If we do not, some memory
configurations will result in crashes at boot time like this:

.
.
.
CPU 6 Unable to handle kernel paging request at virtual address 0000000000700000, epc == ffffffff8118fe00, ra == ffffffff8118fe9c
Oops[#1]:
Cpu 6
.
.
.
        ...
Call Trace:
[<ffffffff8118fe00>] setup_per_zone_wmarks+0x1b0/0x338
[<ffffffff815cd738>] init_per_zone_wmark_min+0x64/0xd0
[<ffffffff81100438>] do_one_initcall+0x38/0x160
.
.
.

Reported-by: Jason Kwon <jason.kwon@ericsson.com>
Signed-off-by: David Daney <david.daney@cavium.com>
To: linux-mips@linux-mips.org
Cc: Jason Kwon <jason.kwon@ericsson.com>
Patchwork: https://patchwork.linux-mips.org/patch/2724/
Tested-by: Guenter Roeck<guenter.roeck@ericsson.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

[SCSI] libsas: fix panic when single phy is disabled on a wide port
When a wide port is being utilized to a target, if one disables only one
of the
phys, we get an OS crash:

BUG: unable to handle kernel NULL pointer dereference at
0000000000000238
IP: [<ffffffff814ca9b1>] mutex_lock+0x21/0x50
PGD 4103f5067 PUD 41dba9067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/bus/pci/slots/5/address
CPU 0
Modules linked in: pm8001(U) ses enclosure fuse nfsd exportfs autofs4
ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl
auth_rpcgss 8021q fcoe libfcoe garp libfc scsi_transport_fc stp scsi_tgt
llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 sr_mod cdrom
dm_mirror dm_region_hash dm_log uinput sg i2c_i801 i2c_core iTCO_wdt
iTCO_vendor_support e1000e mlx4_ib ib_mad ib_core mlx4_en mlx4_core ext3
jbd mbcache sd_mod crc_t10dif usb_storage ata_generic pata_acpi ata_piix
libsas(U) scsi_transport_sas dm_mod [last unloaded: pm8001]

Modules linked in: pm8001(U) ses enclosure fuse nfsd exportfs autofs4
ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl
auth_rpcgss 8021q fcoe libfcoe garp libfc scsi_transport_fc stp scsi_tgt
llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 sr_mod cdrom
dm_mirror dm_region_hash dm_log uinput sg i2c_i801 i2c_core iTCO_wdt
iTCO_vendor_support e1000e mlx4_ib ib_mad ib_core mlx4_en mlx4_core ext3
jbd mbcache sd_mod crc_t10dif usb_storage ata_generic pata_acpi ata_piix
libsas(U) scsi_transport_sas dm_mod [last unloaded: pm8001]
Pid: 5146, comm: scsi_wq_5 Not tainted
2.6.32-71.29.1.el6.lustre.7.x86_64 #1 Storage Server
RIP: 0010:[<ffffffff814ca9b1>]  [<ffffffff814ca9b1>]
mutex_lock+0x21/0x50
RSP: 0018:ffff8803e4e33d30  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000238 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8803e664c800 RDI: 0000000000000238
RBP: ffff8803e4e33d40 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000238 R14: ffff88041acb7200 R15: ffff88041c51ada0
FS:  0000000000000000(0000) GS:ffff880028200000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000238 CR3: 0000000410143000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process scsi_wq_5 (pid: 5146, threadinfo ffff8803e4e32000, task
ffff8803e4e294a0)
Stack:
 ffff8803e664c800 0000000000000000 ffff8803e4e33d70 ffffffffa001f06e
<0> ffff8803e4e33d60 ffff88041c51ada0 ffff88041acb7200 ffff88041bc0aa00
<0> ffff8803e4e33d90 ffffffffa0032b6c 0000000000000014 ffff88041acb7200
Call Trace:
 [<ffffffffa001f06e>] sas_port_delete_phy+0x2e/0xa0 [scsi_transport_sas]
 [<ffffffffa0032b6c>] sas_unregister_devs_sas_addr+0xac/0xe0 [libsas]
 [<ffffffffa0034914>] sas_ex_revalidate_domain+0x204/0x330 [libsas]
 [<ffffffffa00307f0>] ? sas_revalidate_domain+0x0/0x90 [libsas]
 [<ffffffffa0030855>] sas_revalidate_domain+0x65/0x90 [libsas]
 [<ffffffff8108c7d0>] worker_thread+0x170/0x2a0
 [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108c660>] ? worker_thread+0x0/0x2a0
 [<ffffffff81091b36>] kthread+0x96/0xa0
 [<ffffffff810141ca>] child_rip+0xa/0x20
 [<ffffffff81091aa0>] ? kthread+0x0/0xa0
 [<ffffffff810141c0>] ? child_rip+0x0/0x20
Code: ff ff 85 c0 75 ed eb d6 66 90 55 48 89 e5 48 83 ec 10 48 89 1c 24
4c 89 64 24 08 0f 1f 44 00 00 48 89 fb e8 92 f4 ff ff 48 89 df <f0> ff
0f 79 05 e8 25 00 00 00 65 48 8b 04 25 08 cc 00 00 48 2d
RIP  [<ffffffff814ca9b1>] mutex_lock+0x21/0x50
 RSP <ffff8803e4e33d30>
CR2: 0000000000000238

The following patch is admittedly a band-aid, and does not solve the
root cause, but it still is a good candidate for hardening as a pointer
check before reference.

Signed-off-by: Mark Salyzyn <mark_salyzyn@us.xyratex.com>
Tested-by: Jack Wang <jack_wang@usish.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

x86/PCI: use host bridge _CRS info on ASUS M2V-MX SE
In summary, this DMI quirk uses the _CRS info by default for the ASUS
M2V-MX SE by turning on `pci=use_crs` and is similar to the quirk
added by commit 2491762 ("x86/PCI: use host bridge _CRS info on
ASRock ALiveSATA2-GLAN") whose commit message should be read for further
information.

Since commit 3e3da00 ("x86/pci: AMD one chain system to use pci
read out res") Linux gives the following oops:

    parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
    HDA Intel 0000:20:01.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
    HDA Intel 0000:20:01.0: setting latency timer to 64
    BUG: unable to handle kernel paging request at ffffc90011c08000
    IP: [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
    PGD 13781a067 PUD 13781b067 PMD 1300ba067 PTE 800000fd00000173
    Oops: 0009 [#1] SMP
    last sysfs file: /sys/module/snd_pcm/initstate
    CPU 0
    Modules linked in: snd_hda_intel(+) snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event tpm_tis tpm snd_seq tpm_bios psmouse parport_pc snd_timer snd_seq_device parport processor evdev snd i2c_viapro thermal_sys amd64_edac_mod k8temp i2c_core soundcore shpchp pcspkr serio_raw asus_atk0110 pci_hotplug edac_core button snd_page_alloc edac_mce_amd ext3 jbd mbcache sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif sr_mod cdrom ata_generic uhci_hcd sata_via pata_via libata ehci_hcd usbcore scsi_mod via_rhine mii nls_base [last unloaded: scsi_wait_scan]
    Pid: 1153, comm: work_for_cpu Not tainted 2.6.37-1-amd64 #1 M2V-MX SE/System Product Name
    RIP: 0010:[<ffffffffa0578402>]  [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
    RSP: 0018:ffff88013153fe50  EFLAGS: 00010286
    RAX: ffffc90011c08000 RBX: ffff88013029ec00 RCX: 0000000000000006
    RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ffff88013341d000 R08: 0000000000000000 R09: 0000000000000040
    R10: 0000000000000286 R11: 0000000000003731 R12: ffff88013029c400
    R13: 0000000000000000 R14: 0000000000000000 R15: ffff88013341d090
    FS:  0000000000000000(0000) GS:ffff8800bfc00000(0000) knlGS:00000000f7610ab0
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: ffffc90011c08000 CR3: 0000000132f57000 CR4: 00000000000006f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process work_for_cpu (pid: 1153, threadinfo ffff88013153e000, task ffff8801303c86c0)
    Stack:
     0000000000000005 ffffffff8123ad65 00000000000136c0 ffff88013029c400
     ffff8801303c8998 ffff88013341d000 ffff88013341d090 ffff8801322d9dc8
     ffff88013341d208 0000000000000000 0000000000000000 ffffffff811ad232
    Call Trace:
     [<ffffffff8123ad65>] ? __pm_runtime_set_status+0x162/0x186
     [<ffffffff811ad232>] ? local_pci_probe+0x49/0x92
     [<ffffffff8105afc5>] ? do_work_for_cpu+0x0/0x1b
     [<ffffffff8105afc5>] ? do_work_for_cpu+0x0/0x1b
     [<ffffffff8105afd0>] ? do_work_for_cpu+0xb/0x1b
     [<ffffffff8105fd3f>] ? kthread+0x7a/0x82
     [<ffffffff8100a824>] ? kernel_thread_helper+0x4/0x10
     [<ffffffff8105fcc5>] ? kthread+0x0/0x82
     [<ffffffff8100a820>] ? kernel_thread_helper+0x0/0x10
    Code: f4 01 00 00 ef 31 f6 48 89 df e8 29 dd ff ff 85 c0 0f 88 2b 03 00 00 48 89 ef e8 b4 39 c3 e0 8b 7b 40 e8 fc 9d b1 e0 48 8b 43 38 <66> 8b 10 66 89 14 24 8b 43 14 83 e8 03 83 f8 01 77 32 31 d2 be
    RIP  [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
     RSP <ffff88013153fe50>
    CR2: ffffc90011c08000
    ---[ end trace 8d1f3ebc136437fd ]---

Trusting the ACPI _CRS information (`pci=use_crs`) fixes this problem.

    $ dmesg | grep -i crs # with the quirk
    PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug

The match has to be against the DMI board entries though since the vendor entries are not populated.

    DMI: System manufacturer System Product Name/M2V-MX SE, BIOS 0304    10/30/2007

This quirk should be removed when `pci=use_crs` is enabled for machines
from 2006 or earlier or some other solution is implemented.

Using coreboot [1] with this board the problem does not exist but this
quirk also does not affect it either. To be safe though the check is
tightened to only take effect when the BIOS from American Megatrends is
used.

        15:13 < ruik> but coreboot does not need that
        15:13 < ruik> because i have there only one root bus
        15:13 < ruik> the audio is behind a bridge

        $ sudo dmidecode
        BIOS Information
                Vendor: American Megatrends Inc.
                Version: 0304
                Release Date: 10/30/2007

[1] http://www.coreboot.org/

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=30552

Cc: stable@kernel.org (2.6.34)
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

EnJens pushed a commit that referenced this issue Oct 2, 2012

intel-iommu: Fix AB-BA lockdep report
When unbinding a device so that I could pass it through to a KVM VM, I
got the lockdep report below.  It looks like a legitimate lock
ordering problem:

 - domain_context_mapping_one() takes iommu->lock and calls
   iommu_support_dev_iotlb(), which takes device_domain_lock (inside
   iommu->lock).

 - domain_remove_one_dev_info() starts by taking device_domain_lock
   then takes iommu->lock inside it (near the end of the function).

So this is the classic AB-BA deadlock.  It looks like a safe fix is to
simply release device_domain_lock a bit earlier, since as far as I can
tell, it doesn't protect any of the stuff accessed at the end of
domain_remove_one_dev_info() anyway.

BTW, the use of device_domain_lock looks a bit unsafe to me... it's
at least not obvious to me why we aren't vulnerable to the race below:

  iommu_support_dev_iotlb()
                                          domain_remove_dev_info()

  lock device_domain_lock
    find info
  unlock device_domain_lock

                                          lock device_domain_lock
                                            find same info
                                          unlock device_domain_lock

                                          free_devinfo_mem(info)

  do stuff with info after it's free

However I don't understand the locking here well enough to know if
this is a real problem, let alone what the best fix is.

Anyway here's the full lockdep output that prompted all of this:

     =======================================================
     [ INFO: possible circular locking dependency detected ]
     2.6.39.1+ #1
     -------------------------------------------------------
     bash/13954 is trying to acquire lock:
      (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230

     but task is already holding lock:
      (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     which lock already depends on the new lock.

     the existing dependency chain (in reverse order) is:

     -> #1 (device_domain_lock){-.-...}:
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
            [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
            [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
            [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
            [<ffffffff81cab204>] pci_iommu_init+0x16/0x41
            [<ffffffff81002165>] do_one_initcall+0x45/0x190
            [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
            [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10

     -> #0 (&(&iommu->lock)->rlock){......}:
            [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
            [<ffffffff812f8b42>] device_notifier+0x72/0x90
            [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
            [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
            [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
            [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
            [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
            [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
            [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
            [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
            [<ffffffff8117569e>] vfs_write+0xce/0x190
            [<ffffffff811759e4>] sys_write+0x54/0xa0
            [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

     other info that might help us debug this:

     6 locks held by bash/13954:
      #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
      #1:  (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
      #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
      #3:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
      #4:  (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
      #5:  (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     stack backtrace:
     Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
     Call Trace:
      [<ffffffff810993a7>] print_circular_bug+0xf7/0x100
      [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
      [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
      [<ffffffff812f8b42>] device_notifier+0x72/0x90
      [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
      [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
      [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
      [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
      [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
      [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
      [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
      [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
      [<ffffffff8117569e>] vfs_write+0xce/0x190
      [<ffffffff811759e4>] sys_write+0x54/0xa0
      [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

EnJens pushed a commit that referenced this issue Oct 2, 2012

crypto: ghash - Avoid null pointer dereference if no key is set
The ghash_update function passes a pointer to gf128mul_4k_lle which will
be NULL if ghash_setkey is not called or if the most recent call to
ghash_setkey failed to allocate memory.  This causes an oops.  Fix this
up by returning an error code in the null case.

This is trivially triggered from unprivileged userspace through the
AF_ALG interface by simply writing to the socket without setting a key.

The ghash_final function has a similar issue, but triggering it requires
a memory allocation failure in ghash_setkey _after_ at least one
successful call to ghash_update.

  BUG: unable to handle kernel NULL pointer dereference at 00000670
  IP: [<d88c92d4>] gf128mul_4k_lle+0x23/0x60 [gf128mul]
  *pde = 00000000
  Oops: 0000 [#1] PREEMPT SMP
  Modules linked in: ghash_generic gf128mul algif_hash af_alg nfs lockd nfs_acl sunrpc bridge ipv6 stp llc

  Pid: 1502, comm: hashatron Tainted: G        W   3.1.0-rc9-00085-ge9308cf #32 Bochs Bochs
  EIP: 0060:[<d88c92d4>] EFLAGS: 00000202 CPU: 0
  EIP is at gf128mul_4k_lle+0x23/0x60 [gf128mul]
  EAX: d69db1f0 EBX: d6b8ddac ECX: 00000004 EDX: 00000000
  ESI: 00000670 EDI: d6b8ddac EBP: d6b8ddc8 ESP: d6b8dda4
   DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
  Process hashatron (pid: 1502, ti=d6b8c000 task=d6810000 task.ti=d6b8c000)
  Stack:
   00000000 d69db1f0 00000163 00000000 d6b8ddc8 c101a520 d69db1f0 d52aa000
   00000ff0 d6b8dde8 d88d310f d6b8a3f8 d52aa000 00001000 d88d502c d6b8ddfc
   00001000 d6b8ddf4 c11676ed d69db1e8 d6b8de24 c11679ad d52aa000 00000000
  Call Trace:
   [<c101a520>] ? kmap_atomic_prot+0x37/0xa6
   [<d88d310f>] ghash_update+0x85/0xbe [ghash_generic]
   [<c11676ed>] crypto_shash_update+0x18/0x1b
   [<c11679ad>] shash_ahash_update+0x22/0x36
   [<c11679cc>] shash_async_update+0xb/0xd
   [<d88ce0ba>] hash_sendpage+0xba/0xf2 [algif_hash]
   [<c121b24c>] kernel_sendpage+0x39/0x4e
   [<d88ce000>] ? 0xd88cdfff
   [<c121b298>] sock_sendpage+0x37/0x3e
   [<c121b261>] ? kernel_sendpage+0x4e/0x4e
   [<c10b4dbc>] pipe_to_sendpage+0x56/0x61
   [<c10b4e1f>] splice_from_pipe_feed+0x58/0xcd
   [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10
   [<c10b51f5>] __splice_from_pipe+0x36/0x55
   [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10
   [<c10b6383>] splice_from_pipe+0x51/0x64
   [<c10b63c2>] ? default_file_splice_write+0x2c/0x2c
   [<c10b63d5>] generic_splice_sendpage+0x13/0x15
   [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10
   [<c10b527f>] do_splice_from+0x5d/0x67
   [<c10b6865>] sys_splice+0x2bf/0x363
   [<c129373b>] ? sysenter_exit+0xf/0x16
   [<c104dc1e>] ? trace_hardirqs_on_caller+0x10e/0x13f
   [<c129370c>] sysenter_do_call+0x12/0x32
  Code: 83 c4 0c 5b 5e 5f c9 c3 55 b9 04 00 00 00 89 e5 57 8d 7d e4 56 53 8d 5d e4 83 ec 18 89 45 e0 89 55 dc 0f b6 70 0f c1 e6 04 01 d6 <f3> a5 be 0f 00 00 00 4e 89 d8 e8 48 ff ff ff 8b 45 e0 89 da 0f
  EIP: [<d88c92d4>] gf128mul_4k_lle+0x23/0x60 [gf128mul] SS:ESP 0068:d6b8dda4
  CR2: 0000000000000670
  ---[ end trace 4eaa2a86a8e2da24 ]---
  note: hashatron[1502] exited with preempt_count 1
  BUG: scheduling while atomic: hashatron/1502/0x10000002
  INFO: lockdep is turned off.
  [...]

Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
Cc: stable@kernel.org [2.6.37+]
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

EnJens pushed a commit that referenced this issue Oct 2, 2012

apic, i386/bigsmp: Fix false warnings regarding logical APIC ID misma…
…tches

commit 838312b upstream.

These warnings (generally one per CPU) are a result of
initializing x86_cpu_to_logical_apicid while apic_default is
still in use, but the check in setup_local_APIC() being done
when apic_bigsmp was already used as an override in
default_setup_apic_routing():

 Overriding APIC driver with bigsmp
 Enabling APIC mode:  Physflat.  Using 5 I/O APICs
 ------------[ cut here ]------------
 WARNING: at .../arch/x86/kernel/apic/apic.c:1239
 ...
 CPU 1 irqstacks, hard=f1c9a000 soft=f1c9c000
 Booting Node   0, Processors  #1
 smpboot cpu 1: start_ip = 9e000
 Initializing CPU#1
 ------------[ cut here ]------------
 WARNING: at .../arch/x86/kernel/apic/apic.c:1239
 setup_local_APIC+0x137/0x46b() Hardware name: ...
 CPU1 logical APIC ID: 2 != 8
 ...

Fix this (for the time being, i.e. until
x86_32_early_logical_apicid() will get removed again, as Tejun
says ought to be possible) by overriding the previously stored
values at the point where the APIC driver gets overridden.

v2: Move this and the pre-existing override logic into
    arch/x86/kernel/apic/bigsmp_32.c.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/4E835D16020000780005844C@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

scsi_dh: check queuedata pointer before proceeding further
commit a18a920 upstream.

This patch validates sdev pointer in scsi_dh_activate before proceeding further.

Without this check we might see the panic as below. I have seen this
panic multiple times..

Call trace:

 #0 [ffff88007d647b50] machine_kexec at ffffffff81020902
 #1 [ffff88007d647ba0] crash_kexec at ffffffff810875b0
 #2 [ffff88007d647c70] oops_end at ffffffff8139c650
 #3 [ffff88007d647c90] __bad_area_nosemaphore at ffffffff8102dd15
 #4 [ffff88007d647d50] page_fault at ffffffff8139b8cf
    [exception RIP: scsi_dh_activate+0x82]
    RIP: ffffffffa0041922  RSP: ffff88007d647e00  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 00000000000093c5
    RDX: 00000000000093c5  RSI: ffffffffa02e6640  RDI: ffff88007cc88988
    RBP: 000000000000000f   R8: ffff88007d646000   R9: 0000000000000000
    R10: ffff880082293790  R11: 00000000ffffffff  R12: ffff88007cc88988
    R13: 0000000000000000  R14: 0000000000000286  R15: ffff880037b845e0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
 #5 [ffff88007d647e38] run_workqueue at ffffffff81060268
 #6 [ffff88007d647e78] worker_thread at ffffffff81060386
 #7 [ffff88007d647ee8] kthread at ffffffff81064436
 #8 [ffff88007d647f48] kernel_thread at ffffffff81003fba

Signed-off-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

block: make gendisk hold a reference to its queue
commit f992ae8 upstream.

The following command sequence triggers an oops.

# mount /dev/sdb1 /mnt
# echo 1 > /sys/class/scsi_device/0\:0\:1\:0/device/delete
# umount /mnt

 general protection fault: 0000 [#1] PREEMPT SMP
 CPU 2
 Modules linked in:

 Pid: 791, comm: umount Not tainted 3.1.0-rc3-work+ #8 Bochs Bochs
 RIP: 0010:[<ffffffff810d0879>]  [<ffffffff810d0879>] __lock_acquire+0x389/0x1d60
...
 Call Trace:
  [<ffffffff810d2845>] lock_acquire+0x95/0x140
  [<ffffffff81aed87b>] _raw_spin_lock+0x3b/0x50
  [<ffffffff811573bc>] bdi_lock_two+0x5c/0x70
  [<ffffffff811c2f6c>] bdev_inode_switch_bdi+0x4c/0xf0
  [<ffffffff811c3fcb>] __blkdev_put+0x11b/0x1d0
  [<ffffffff811c4010>] __blkdev_put+0x160/0x1d0
  [<ffffffff811c40df>] blkdev_put+0x5f/0x190
  [<ffffffff8118f18d>] kill_block_super+0x4d/0x80
  [<ffffffff8118f4a5>] deactivate_locked_super+0x45/0x70
  [<ffffffff8119003a>] deactivate_super+0x4a/0x70
  [<ffffffff811ac4ad>] mntput_no_expire+0xed/0x130
  [<ffffffff811acf2e>] sys_umount+0x7e/0x3a0
  [<ffffffff81aeeeab>] system_call_fastpath+0x16/0x1b

This is because bdev holds on to disk but disk doesn't pin the
associated queue.  If a SCSI device is removed while the device is
still open, the sdev puts the base reference to the queue on release.
When the bdev is finally released, the associated queue is already
gone along with the bdi and bdev_inode_switch_bdi() ends up
dereferencing already freed bdi.

Even if it were not for this bug, disk not holding onto the associated
queue is very unusual and error-prone.

Fix it by making add_disk() take an extra reference to its queue and
put it on disk_release() and ensuring that disk and its fops owner are
put in that order after all accesses to the disk and queue are
complete.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

KEYS: Fix a NULL pointer deref in the user-defined key type
commit 9f35a33 upstream.

Fix a NULL pointer deref in the user-defined key type whereby updating a
negative key into a fully instantiated key will cause an oops to occur
when the code attempts to free the non-existent old payload.

This results in an oops that looks something like the following:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  IP: [<ffffffff81085fa1>] __call_rcu+0x11/0x13e
  PGD 3391d067 PUD 3894a067 PMD 0
  Oops: 0002 [#1] SMP
  CPU 1
  Pid: 4354, comm: keyctl Not tainted 3.1.0-fsdevel+ #1140                  /DG965RY
  RIP: 0010:[<ffffffff81085fa1>]  [<ffffffff81085fa1>] __call_rcu+0x11/0x13e
  RSP: 0018:ffff88003d591df8  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000006e
  RDX: ffffffff8161d0c0 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff88003d591e18 R08: 0000000000000000 R09: ffffffff8152fa6c
  R10: 0000000000000000 R11: 0000000000000300 R12: ffff88003b8f9538
  R13: ffffffff8161d0c0 R14: ffff88003b8f9d50 R15: ffff88003c69f908
  FS:  00007f97eb18c720(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000008 CR3: 000000003d47a000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process keyctl (pid: 4354, threadinfo ffff88003d590000, task ffff88003c78a040)
  Stack:
   ffff88003e0ffde0 ffff88003b8f9538 0000000000000001 ffff88003b8f9d50
   ffff88003d591e28 ffffffff810860f0 ffff88003d591e68 ffffffff8117bfea
   ffff88003d591e68 ffffffff00000000 ffff88003e0ffde1 ffff88003e0ffde0
  Call Trace:
   [<ffffffff810860f0>] call_rcu_sched+0x10/0x12
   [<ffffffff8117bfea>] user_update+0x8d/0xa2
   [<ffffffff8117723a>] key_create_or_update+0x236/0x270
   [<ffffffff811789b1>] sys_add_key+0x123/0x17e
   [<ffffffff813b84bb>] system_call_fastpath+0x16/0x1b

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

cfg80211: fix bug on regulatory core exit on access to last_request
commit 58ebacc upstream.

Commit 4d9d88d by Scott James Remnant <keybuk@google.com> added
the .uevent() callback for the regulatory device used during
the platform device registration. The change was done to account
for queuing up udev change requests through udevadm triggers.
The change also meant that upon regulatory core exit we will now
send a uevent() but the uevent() callback, reg_device_uevent(),
also accessed last_request. Right before commiting device suicide
we free'd last_request but never set it to NULL so
platform_device_unregister() would lead to bogus kernel paging
request. Fix this and also simply supress uevents right before
we commit suicide as they are pointless.

This fix is required for kernels >= v2.6.39

$ git describe --contains 4d9d88d
v2.6.39-rc1~468^2~25^2^2~21

The impact of not having this present is that a bogus paging
access may occur (only read) upon cfg80211 unload time. You
may also get this BUG complaint below. Although Johannes
could not reproduce the issue this fix is theoretically correct.

mac80211_hwsim: unregister radios
mac80211_hwsim: closing netlink
BUG: unable to handle kernel paging request at ffff88001a06b5ab
IP: [<ffffffffa030df9a>] reg_device_uevent+0x1a/0x50 [cfg80211]
PGD 1836063 PUD 183a063 PMD 1ffcb067 PTE 1a06b160
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU 0
Modules linked in: cfg80211(-) [last unloaded: mac80211]

Pid: 2279, comm: rmmod Tainted: G        W   3.1.0-wl+ #663 Bochs Bochs
RIP: 0010:[<ffffffffa030df9a>]  [<ffffffffa030df9a>] reg_device_uevent+0x1a/0x50 [cfg80211]
RSP: 0000:ffff88001c5f9d58  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88001d2eda88 RCX: ffff88001c7468fc
RDX: ffff88001a06b5a0 RSI: ffff88001c7467b0 RDI: ffff88001c7467b0
RBP: ffff88001c5f9d58 R08: 000000000000ffff R09: 000000000000ffff
R10: 0000000000000000 R11: 0000000000000001 R12: ffff88001c7467b0
R13: ffff88001d2eda78 R14: ffffffff8164a840 R15: 0000000000000001
FS:  00007f8a91d8a6e0(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff88001a06b5ab CR3: 000000001c62e000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 2279, threadinfo ffff88001c5f8000, task ffff88000023c780)
Stack:
 ffff88001c5f9d98 ffffffff812ff7e5 ffffffff8176ab3d ffff88001c7468c2
 000000000000ffff ffff88001d2eda88 ffff88001c7467b0 ffff880000114820
 ffff88001c5f9e38 ffffffff81241dc7 ffff88001c5f9db8 ffffffff81040189
Call Trace:
 [<ffffffff812ff7e5>] dev_uevent+0xc5/0x170
 [<ffffffff81241dc7>] kobject_uevent_env+0x1f7/0x490
 [<ffffffff81040189>] ? sub_preempt_count+0x29/0x60
 [<ffffffff814cab1a>] ? _raw_spin_unlock_irqrestore+0x4a/0x90
 [<ffffffff81305307>] ? devres_release_all+0x27/0x60
 [<ffffffff8124206b>] kobject_uevent+0xb/0x10
 [<ffffffff812fee27>] device_del+0x157/0x1b0
 [<ffffffff8130377d>] platform_device_del+0x1d/0x90
 [<ffffffff81303b76>] platform_device_unregister+0x16/0x30
 [<ffffffffa030fffd>] regulatory_exit+0x5d/0x180 [cfg80211]
 [<ffffffffa032bec3>] cfg80211_exit+0x2b/0x45 [cfg80211]
 [<ffffffff8109a84c>] sys_delete_module+0x16c/0x220
 [<ffffffff8108a23e>] ? trace_hardirqs_on_caller+0x7e/0x120
 [<ffffffff814cba02>] system_call_fastpath+0x16/0x1b
Code: <all your base are belong to me>
RIP  [<ffffffffa030df9a>] reg_device_uevent+0x1a/0x50 [cfg80211]
 RSP <ffff88001c5f9d58>
CR2: ffff88001a06b5ab
---[ end trace 147c5099a411e8c0 ]---

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Scott James Remnant <keybuk@google.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

[mach-tegra/ventana] use tegra_reserve() to handle carveout memory al…
…location

For K36, the kernel command line will pass "mem=size@base" from bootloader
to kernel. Apparently, we were having two different ways to interpret
it
    1) size = total physical memory size - carveout size
    2) Or size = total physical memory size

Ventana is the only platform to use #1. Switch it to #2 which requires
tegra_reserve() to handle carveout memory allocation.

Change carveout size to 256MB as well.

Original-Change-Id: Ifc24c1a5f6300d827068c67c0580cae7eb4ec229
Reviewed-on: http://git-master/r/17444
Reviewed-by: Varun Colbert <vcolbert@nvidia.com>
Tested-by: Varun Colbert <vcolbert@nvidia.com>

Rebase-Id: Rc02c5fb338f88ac0175f9cb01fbb4d2c5b9c9c67

EnJens pushed a commit that referenced this issue Oct 2, 2012

rtlwifi: fix lps_lock deadlock
commit e55b32c upstream.

rtl_lps_leave can be called from interrupt context, so we have to
disable interrupts when taking lps_lock.

Below is full lockdep info about deadlock:

[   93.815269] =================================
[   93.815390] [ INFO: inconsistent lock state ]
[   93.815472] 2.6.41.1-3.offch.fc15.x86_64.debug #1
[   93.815556] ---------------------------------
[   93.815635] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[   93.815743] swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[   93.815832]  (&(&rtlpriv->locks.lps_lock)->rlock){+.?...}, at: [<ffffffffa025dad6>] rtl_lps_leave+0x26/0x103 [rtlwifi]
[   93.815947] {SOFTIRQ-ON-W} state was registered at:
[   93.815947]   [<ffffffff8108e10d>] __lock_acquire+0x369/0xd0c
[   93.815947]   [<ffffffff8108efb3>] lock_acquire+0xf3/0x13e
[   93.815947]   [<ffffffff814e981d>] _raw_spin_lock+0x45/0x79
[   93.815947]   [<ffffffffa025de34>] rtl_swlps_rf_awake+0x5a/0x76 [rtlwifi]
[   93.815947]   [<ffffffffa025aec0>] rtl_op_config+0x12a/0x32a [rtlwifi]
[   93.815947]   [<ffffffffa01d614b>] ieee80211_hw_config+0x124/0x129 [mac80211]
[   93.815947]   [<ffffffffa01e0af3>] ieee80211_dynamic_ps_disable_work+0x32/0x47 [mac80211]
[   93.815947]   [<ffffffff81075aa5>] process_one_work+0x205/0x3e7
[   93.815947]   [<ffffffff81076753>] worker_thread+0xda/0x15d
[   93.815947]   [<ffffffff8107a119>] kthread+0xa8/0xb0
[   93.815947]   [<ffffffff814f3184>] kernel_thread_helper+0x4/0x10
[   93.815947] irq event stamp: 547822
[   93.815947] hardirqs last  enabled at (547822): [<ffffffff814ea1a7>] _raw_spin_unlock_irqrestore+0x45/0x61
[   93.815947] hardirqs last disabled at (547821): [<ffffffff814e9987>] _raw_spin_lock_irqsave+0x22/0x8e
[   93.815947] softirqs last  enabled at (547790): [<ffffffff810623ed>] _local_bh_enable+0x13/0x15
[   93.815947] softirqs last disabled at (547791): [<ffffffff814f327c>] call_softirq+0x1c/0x30
[   93.815947]
[   93.815947] other info that might help us debug this:
[   93.815947]  Possible unsafe locking scenario:
[   93.815947]
[   93.815947]        CPU0
[   93.815947]        ----
[   93.815947]   lock(&(&rtlpriv->locks.lps_lock)->rlock);
[   93.815947]   <Interrupt>
[   93.815947]     lock(&(&rtlpriv->locks.lps_lock)->rlock);
[   93.815947]
[   93.815947]  *** DEADLOCK ***
[   93.815947]
[   93.815947] no locks held by swapper/0.
[   93.815947]
[   93.815947] stack backtrace:
[   93.815947] Pid: 0, comm: swapper Not tainted 2.6.41.1-3.offch.fc15.x86_64.debug #1
[   93.815947] Call Trace:
[   93.815947]  <IRQ>  [<ffffffff814dfd00>] print_usage_bug+0x1e7/0x1f8
[   93.815947]  [<ffffffff8101a849>] ? save_stack_trace+0x2c/0x49
[   93.815947]  [<ffffffff8108d55c>] ? print_irq_inversion_bug.part.18+0x1a0/0x1a0
[   93.815947]  [<ffffffff8108dc8a>] mark_lock+0x106/0x220
[   93.815947]  [<ffffffff8108e099>] __lock_acquire+0x2f5/0xd0c
[   93.815947]  [<ffffffff810152af>] ? native_sched_clock+0x34/0x36
[   93.830125]  [<ffffffff810152ba>] ? sched_clock+0x9/0xd
[   93.830125]  [<ffffffff81080181>] ? sched_clock_local+0x12/0x75
[   93.830125]  [<ffffffffa025dad6>] ? rtl_lps_leave+0x26/0x103 [rtlwifi]
[   93.830125]  [<ffffffff8108efb3>] lock_acquire+0xf3/0x13e
[   93.830125]  [<ffffffffa025dad6>] ? rtl_lps_leave+0x26/0x103 [rtlwifi]
[   93.830125]  [<ffffffff814e981d>] _raw_spin_lock+0x45/0x79
[   93.830125]  [<ffffffffa025dad6>] ? rtl_lps_leave+0x26/0x103 [rtlwifi]
[   93.830125]  [<ffffffff81422467>] ? skb_dequeue+0x62/0x6d
[   93.830125]  [<ffffffffa025dad6>] rtl_lps_leave+0x26/0x103 [rtlwifi]
[   93.830125]  [<ffffffffa025f677>] _rtl_pci_ips_leave_tasklet+0xe/0x10 [rtlwifi]
[   93.830125]  [<ffffffff8106281f>] tasklet_action+0x8d/0xee
[   93.830125]  [<ffffffff810629ce>] __do_softirq+0x112/0x25a
[   93.830125]  [<ffffffff814f327c>] call_softirq+0x1c/0x30
[   93.830125]  [<ffffffff81010bf6>] do_softirq+0x4b/0xa1
[   93.830125]  [<ffffffff81062d7d>] irq_exit+0x5d/0xcf
[   93.830125]  [<ffffffff814f3b7e>] do_IRQ+0x8e/0xa5
[   93.830125]  [<ffffffff814ea533>] common_interrupt+0x73/0x73
[   93.830125]  <EOI>  [<ffffffff8108b825>] ? trace_hardirqs_off+0xd/0xf
[   93.830125]  [<ffffffff812bb6d5>] ? intel_idle+0xe5/0x10c
[   93.830125]  [<ffffffff812bb6d1>] ? intel_idle+0xe1/0x10c
[   93.830125]  [<ffffffff813f8d5e>] cpuidle_idle_call+0x11c/0x1fe
[   93.830125]  [<ffffffff8100e2ef>] cpu_idle+0xab/0x101
[   93.830125]  [<ffffffff814c6373>] rest_init+0xd7/0xde
[   93.830125]  [<ffffffff814c629c>] ? csum_partial_copy_generic+0x16c/0x16c
[   93.830125]  [<ffffffff81d4bbb0>] start_kernel+0x3dd/0x3ea
[   93.830125]  [<ffffffff81d4b2c4>] x86_64_start_reservations+0xaf/0xb3
[   93.830125]  [<ffffffff81d4b140>] ? early_idt_handlers+0x140/0x140
[   93.830125]  [<ffffffff81d4b3ca>] x86_64_start_kernel+0x102/0x111

Resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=755154

Reported-by: vjain02@students.poly.edu
Reported-and-tested-by: Oliver Paukstadt <pstadt@sourcentral.org>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, …
…regardless of lazy_mmu mode

commit 2cd1c8d upstream.

Fix an outstanding issue that has been reported since 2.6.37.
Under a heavy loaded machine processing "fork()" calls could
crash with:

BUG: unable to handle kernel paging request at f573fc8c
IP: [<c01abc54>] swap_count_continued+0x104/0x180
*pdpt = 000000002a3b9027 *pde = 0000000001bed067 *pte = 0000000000000000 Oops: 0000 [#1] SMP
Modules linked in:
Pid: 1638, comm: apache2 Not tainted 3.0.4-linode37 #1
EIP: 0061:[<c01abc54>] EFLAGS: 00210246 CPU: 3
EIP is at swap_count_continued+0x104/0x180
.. snip..
Call Trace:
 [<c01ac222>] ? __swap_duplicate+0xc2/0x160
 [<c01040f7>] ? pte_mfn_to_pfn+0x87/0xe0
 [<c01ac2e4>] ? swap_duplicate+0x14/0x40
 [<c01a0a6b>] ? copy_pte_range+0x45b/0x500
 [<c01a0ca5>] ? copy_page_range+0x195/0x200
 [<c01328c6>] ? dup_mmap+0x1c6/0x2c0
 [<c0132cf8>] ? dup_mm+0xa8/0x130
 [<c013376a>] ? copy_process+0x98a/0xb30
 [<c013395f>] ? do_fork+0x4f/0x280
 [<c01573b3>] ? getnstimeofday+0x43/0x100
 [<c010f770>] ? sys_clone+0x30/0x40
 [<c06c048d>] ? ptregs_clone+0x15/0x48
 [<c06bfb71>] ? syscall_call+0x7/0xb

The problem is that in copy_page_range() we turn lazy mode on,
and then in swap_entry_free() we call swap_count_continued()
which ends up in:

         map = kmap_atomic(page, KM_USER0) + offset;

and then later we touch *map.

Since we are running in batched mode (lazy) we don't actually
set up the PTE mappings and the kmap_atomic is not done
synchronously and ends up trying to dereference a page that has
not been set.

Looking at kmap_atomic_prot_pfn(), it uses
'arch_flush_lazy_mmu_mode' and doing the same in
kmap_atomic_prot() and __kunmap_atomic() makes the problem go
away.

Interestingly, commit b8bcfe9 ("x86/paravirt: remove lazy
mode in interrupts") removed part of this to fix an interrupt
issue - but it went to far and did not consider this scenario.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

oprofile, x86: Fix crash when unloading module (nmi timer mode)
commit 97f7f81 upstream.

If oprofile uses the nmi timer interrupt there is a crash while
unloading the module. The bug can be triggered with oprofile build as
module and kernel parameter nolapic set. This patch fixes this.

oprofile: using NMI timer interrupt.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
PGD 42dbca067 PUD 41da6a067 PMD 0
Oops: 0002 [#1] PREEMPT SMP
CPU 5
Modules linked in: oprofile(-) [last unloaded: oprofile]

Pid: 2518, comm: modprobe Not tainted 3.1.0-rc7-00019-gb2fb49d #19 Advanced Micro Device Anaheim/Anaheim
RIP: 0010:[<ffffffff8123c226>]  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
RSP: 0018:ffff88041ef71e98  EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffffffffa0017100 RCX: dead000000200200
RDX: 0000000000000000 RSI: dead000000100100 RDI: ffffffff8178c620
RBP: ffff88041ef71ea8 R08: 0000000000000001 R09: 0000000000000082
R10: 0000000000000000 R11: ffff88041ef71de8 R12: 0000000000000080
R13: fffffffffffffff5 R14: 0000000000000001 R15: 0000000000610210
FS:  00007fc902f20700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 000000041cdb6000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 2518, threadinfo ffff88041ef70000, task ffff88041d348040)
Stack:
 ffff88041ef71eb8 ffffffffa0017790 ffff88041ef71eb8 ffffffffa0013532
 ffff88041ef71ec8 ffffffffa00132d6 ffff88041ef71ed8 ffffffffa00159b2
 ffff88041ef71f78 ffffffff81073115 656c69666f72706f 0000000000610200
Call Trace:
 [<ffffffffa0013532>] op_nmi_exit+0x15/0x17 [oprofile]
 [<ffffffffa00132d6>] oprofile_arch_exit+0xe/0x10 [oprofile]
 [<ffffffffa00159b2>] oprofile_exit+0x1e/0x20 [oprofile]
 [<ffffffff81073115>] sys_delete_module+0x1c3/0x22f
 [<ffffffff811bf09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8148070b>] system_call_fastpath+0x16/0x1b
Code: 20 c6 78 81 e8 c5 cc 23 00 48 8b 13 48 8b 43 08 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de 48 c7 c7 20 c6 78 81
 89 42 08 48 89 10 48 89 33 48 89 4b 08 e8 a6 c0 23 00 5a 5b
RIP  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
 RSP <ffff88041ef71e98>
CR2: 0000000000000008
---[ end trace 43a541a52956b7b0 ]---

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

add missing .set function for NT_S390_LAST_BREAK regset
commit b934069 upstream.

The last breaking event address is a read-only value, the regset misses the
.set function. If a PTRACE_SETREGSET is done for NT_S390_LAST_BREAK we
get an oops due to a branch to zero:

Kernel BUG at 0000000000000002 verbose debug info unavailable
illegal operation: 0001 #1 SMP
...
Call Trace:
(<0000000000158294> ptrace_regset+0x184/0x188)
 <00000000001595b6> ptrace_request+0x37a/0x4fc
 <0000000000109a78> arch_ptrace+0x108/0x1fc
 <00000000001590d6> SyS_ptrace+0xaa/0x12c
 <00000000005c7a42> sysc_noemu+0x16/0x1c
 <000003fffd5ec10c> 0x3fffd5ec10c
Last Breaking-Event-Address:
 <0000000000158242> ptrace_regset+0x132/0x188

Add a nop .set function to prevent the branch to zero.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

oprofile: Fix crash when unloading module (hr timer mode)
commit 87121ca upstream.

Oprofile may crash in a KVM guest while unlaoding modules. This
happens if oprofile_arch_init() fails and oprofile switches to the hr
timer mode as a fallback. In this case oprofile_arch_exit() is called,
but it never was initialized properly which causes the crash. This
patch fixes this.

oprofile: using timer interrupt.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
PGD 41da3f067 PUD 41d80e067 PMD 0
Oops: 0002 [#1] PREEMPT SMP
CPU 5
Modules linked in: oprofile(-)

Pid: 2382, comm: modprobe Not tainted 3.1.0-rc7-00018-g709a39d #18 Advanced Micro Device Anaheim/Anaheim
RIP: 0010:[<ffffffff8123c226>]  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
RSP: 0018:ffff88041de1de98  EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffffffffa00060e0 RCX: dead000000200200
RDX: 0000000000000000 RSI: dead000000100100 RDI: ffffffff8178c620
RBP: ffff88041de1dea8 R08: 0000000000000001 R09: 0000000000000082
R10: 0000000000000000 R11: ffff88041de1dde8 R12: 0000000000000080
R13: fffffffffffffff5 R14: 0000000000000001 R15: 0000000000610210
FS:  00007f9ae5bef700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 000000041ca44000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 2382, threadinfo ffff88041de1c000, task ffff88042db6d040)
Stack:
 ffff88041de1deb8 ffffffffa0006770 ffff88041de1deb8 ffffffffa000251e
 ffff88041de1dec8 ffffffffa00022c2 ffff88041de1ded8 ffffffffa0004993
 ffff88041de1df78 ffffffff81073115 656c69666f72706f 0000000000610200
Call Trace:
 [<ffffffffa000251e>] op_nmi_exit+0x15/0x17 [oprofile]
 [<ffffffffa00022c2>] oprofile_arch_exit+0xe/0x10 [oprofile]
 [<ffffffffa0004993>] oprofile_exit+0x13/0x15 [oprofile]
 [<ffffffff81073115>] sys_delete_module+0x1c3/0x22f
 [<ffffffff811bf09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8148070b>] system_call_fastpath+0x16/0x1b
Code: 20 c6 78 81 e8 c5 cc 23 00 48 8b 13 48 8b 43 08 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de 48 c7 c7 20 c6 78 81
 89 42 08 48 89 10 48 89 33 48 89 4b 08 e8 a6 c0 23 00 5a 5b
RIP  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
 RSP <ffff88041de1de98>
CR2: 0000000000000008
---[ end trace 06d4e95b6aa3b437 ]---

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

thp: add compound tail page _mapcount when mapped
commit b6999b1 upstream.

With the 3.2-rc kernel, IOMMU 2M pages in KVM works.  But when I tried
to use IOMMU 1GB pages in KVM, I encountered an oops and the 1GB page
failed to be used.

The root cause is that 1GB page allocation calls gup_huge_pud() while 2M
page calls gup_huge_pmd.  If compound pages are used and the page is a
tail page, gup_huge_pmd() increases _mapcount to record tail page are
mapped while gup_huge_pud does not do that.

So when the mapped page is relesed, it will result in kernel oops
because the page is not marked mapped.

This patch add tail process for compound page in 1GB huge page which
keeps the same process as 2M page.

Reproduce like:
1. Add grub boot option: hugepagesz=1G hugepages=8
2. mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages
3. qemu-kvm -m 2048 -hda os-kvm.img -cpu kvm64 -smp 4 -mem-path /dev/hugepages
	-net none -device pci-assign,host=07:00.1

  kernel BUG at mm/swap.c:114!
  invalid opcode: 0000 [#1] SMP
  Call Trace:
    put_page+0x15/0x37
    kvm_release_pfn_clean+0x31/0x36
    kvm_iommu_put_pages+0x94/0xb1
    kvm_iommu_unmap_memslots+0x80/0xb6
    kvm_assign_device+0xba/0x117
    kvm_vm_ioctl_assigned_device+0x301/0xa47
    kvm_vm_ioctl+0x36c/0x3a2
    do_vfs_ioctl+0x49e/0x4e4
    sys_ioctl+0x5a/0x7c
    system_call_fastpath+0x16/0x1b
  RIP  put_compound_page+0xd4/0x168

Signed-off-by: Youquan Song <youquan.song@intel.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

thp: set compound tail page _count to zero
commit 58a84aa upstream.

Commit 70b50f9 ("mm: thp: tail page refcounting fix") keeps all
page_tail->_count zero at all times.  But the current kernel does not
set page_tail->_count to zero if a 1GB page is utilized.  So when an
IOMMU 1GB page is used by KVM, it wil result in a kernel oops because a
tail page's _count does not equal zero.

  kernel BUG at include/linux/mm.h:386!
  invalid opcode: 0000 [#1] SMP
  Call Trace:
    gup_pud_range+0xb8/0x19d
    get_user_pages_fast+0xcb/0x192
    ? trace_hardirqs_off+0xd/0xf
    hva_to_pfn+0x119/0x2f2
    gfn_to_pfn_memslot+0x2c/0x2e
    kvm_iommu_map_pages+0xfd/0x1c1
    kvm_iommu_map_memslots+0x7c/0xbd
    kvm_iommu_map_guest+0xaa/0xbf
    kvm_vm_ioctl_assigned_device+0x2ef/0xa47
    kvm_vm_ioctl+0x36c/0x3a2
    do_vfs_ioctl+0x49e/0x4e4
    sys_ioctl+0x5a/0x7c
    system_call_fastpath+0x16/0x1b
  RIP  gup_huge_pud+0xf2/0x159

Signed-off-by: Youquan Song <youquan.song@intel.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

mm: Ensure that pfn_valid() is called once per pageblock when reservi…
…ng pageblocks

commit d021563 upstream.

setup_zone_migrate_reserve() expects that zone->start_pfn starts at
pageblock_nr_pages aligned pfn otherwise we could access beyond an
existing memblock resulting in the following panic if
CONFIG_HOLES_IN_ZONE is not configured and we do not check pfn_valid:

  IP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180
  *pdpt = 0000000000000000 *pde = f000ff53f000ff53
  Oops: 0000 [#1] SMP
  Pid: 1, comm: swapper Not tainted 3.0.7-0.7-pae #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
  EIP: 0060:[<c02d331d>] EFLAGS: 00010006 CPU: 0
  EIP is at setup_zone_migrate_reserve+0xcd/0x180
  EAX: 000c0000 EBX: f5801fc0 ECX: 000c0000 EDX: 00000000
  ESI: 000c01fe EDI: 000c01fe EBP: 00140000 ESP: f2475f58
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
  Process swapper (pid: 1, ti=f2474000 task=f2472cd0 task.ti=f2474000)
  Call Trace:
  [<c02d389c>] __setup_per_zone_wmarks+0xec/0x160
  [<c02d3a1f>] setup_per_zone_wmarks+0xf/0x20
  [<c08a771c>] init_per_zone_wmark_min+0x27/0x86
  [<c020111b>] do_one_initcall+0x2b/0x160
  [<c086639d>] kernel_init+0xbe/0x157
  [<c05cae26>] kernel_thread_helper+0x6/0xd
  Code: a5 39 f5 89 f7 0f 46 fd 39 cf 76 40 8b 03 f6 c4 08 74 32 eb 91 90 89 c8 c1 e8 0e 0f be 80 80 2f 86 c0 8b 14 85 60 2f 86 c0 89 c8 <2b> 82 b4 12 00 00 c1 e0 05 03 82 ac 12 00 00 8b 00 f6 c4 08 0f
  EIP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180 SS:ESP 0068:f2475f58
  CR2: 00000000000012b4

We crashed in pageblock_is_reserved() when accessing pfn 0xc0000 because
highstart_pfn = 0x36ffe.

The issue was introduced in 3.0-rc1 by 6d3163c ("mm: check if any page
in a pageblock is reserved before marking it MIGRATE_RESERVE").

Make sure that start_pfn is always aligned to pageblock_nr_pages to
ensure that pfn_valid s always called at the start of each pageblock.
Architectures with holes in pageblocks will be correctly handled by
pfn_valid_within in pageblock_is_reserved.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Dang Bo <bdang@vmware.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Arve Hjnnevg <arve@android.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

target/file: walk properly over sg list
commit 9649fa1 upstream.

This patch changes fileio to use for_each_sg() when walking se_task->task_sg
memory passed into from loopback LLD struct scsi_cmnd scatterlist memory.

This addresses an issue where FILEIO backends with loopback where hitting the
following OOPs with mkfs.ext2:

|kernel BUG at include/linux/scatterlist.h:97!
|invalid opcode: 0000 [#1] PREEMPT SMP
|Modules linked in: sd_mod tcm_loop target_core_stgt scsi_tgt target_core_pscsi target_core_file target_core_iblock target_core_mod configfs scsi_mod
|
|Pid: 671, comm: LIO_fileio Not tainted 3.1.0-rc10+ #139 Bochs Bochs
|EIP: 0060:[<e0afd746>] EFLAGS: 00010202 CPU: 0
|EIP is at fd_do_task+0x396/0x420 [target_core_file]
| [<e0aa7884>] __transport_execute_tasks+0xd4/0x190 [target_core_mod]
| [<e0aa797c>] transport_execute_tasks+0x3c/0xf0 [target_core_mod]
|EIP: [<e0afd746>] fd_do_task+0x396/0x420 [target_core_file] SS:ESP 0068:dea47e90

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

xen/pm_idle: Make pm_idle be default_idle under Xen.
commit e5fd47b upstream.

The idea behind commit d91ee58 ("cpuidle: replace xen access to x86
pm_idle and default_idle") was to have one call - disable_cpuidle()
which would make pm_idle not be molested by other code.  It disallows
cpuidle_idle_call to be set to pm_idle (which is excellent).

But in the select_idle_routine() and idle_setup(), the pm_idle can still
be set to either: amd_e400_idle, mwait_idle or default_idle.  This
depends on some CPU flags (MWAIT) and in AMD case on the type of CPU.

In case of mwait_idle we can hit some instances where the hypervisor
(Amazon EC2 specifically) sets the MWAIT and we get:

  Brought up 2 CPUs
  invalid opcode: 0000 [#1] SMP

  Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1
  RIP: e030:[<ffffffff81015d1d>]  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
  ...
  Call Trace:
   [<ffffffff8100e2ed>] cpu_idle+0xae/0xe8
   [<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10
  RIP  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
   RSP <ffff8801d28ddf10>

In the case of amd_e400_idle we don't get so spectacular crashes, but we
do end up making an MSR which is trapped in the hypervisor, and then
follow it up with a yield hypercall.  Meaning we end up going to
hypervisor twice instead of just once.

The previous behavior before v3.0 was that pm_idle was set to
default_idle regardless of select_idle_routine/idle_setup.

We want to do that, but only for one specific case: Xen.  This patch
does that.

Fixes RH BZ #739499 and Ubuntu #881076
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

ext4: avoid hangs in ext4_da_should_update_i_disksize()
commit ea51d13 upstream.

If the pte mapping in generic_perform_write() is unmapped between
iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
"copied" parameter to ->end_write can be zero. ext4 couldn't cope with
it with delayed allocations enabled. This skips the i_disksize
enlargement logic if copied is zero and no new data was appeneded to
the inode.

 gdb> bt
 #0  0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\
 08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 #2  0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\
 ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
 #3  generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\
 os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
 #4  0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\
 xffff88001e26be40) at mm/filemap.c:2600
 #5  0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\
 zed out>, pos=<value optimized out>) at mm/filemap.c:2632
 #6  0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\
 t fs/ext4/file.c:136
 #7  0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \
 ppos=0xffff88001e26bf48) at fs/read_write.c:406
 #8  0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\
 000, pos=0xffff88001e26bf48) at fs/read_write.c:435
 #9  0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\
 4000) at fs/read_write.c:487
 #10 <signal handler called>
 #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
 #12 0x0000000000000000 in ?? ()
 gdb> print offset
 $22 = 0xffffffffffffffff
 gdb> print idx
 $23 = 0xffffffff
 gdb> print inode->i_blkbits
 $24 = 0xc
 gdb> up
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 2512                    if (ext4_da_should_update_i_disksize(page, end)) {
 gdb> print start
 $25 = 0x0
 gdb> print end
 $26 = 0xffffffffffffffff
 gdb> print pos
 $27 = 0x108000
 gdb> print new_i_size
 $28 = 0x108000
 gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
 $29 = 0xd9000
 gdb> down
 2467            for (i = 0; i < idx; i++)
 gdb> print i
 $30 = 0xd44acbee

This is 100% reproducible with some autonuma development code tuned in
a very aggressive manner (not normal way even for knumad) which does
"exotic" changes to the ptes. It wouldn't normally trigger but I don't
see why it can't happen normally if the page is added to swap cache in
between the two faults leading to "copied" being zero (which then
hangs in ext4). So it should be fixed. Especially possible with lumpy
reclaim (albeit disabled if compaction is enabled) as that would
ignore the young bits in the ptes.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

usb: usb-storage doesn't support dynamic id currently, the patch disa…
…bles the feature to fix an oops

commit 1a3a026 upstream.

Echo vendor and product number of a non usb-storage device to
usb-storage driver's new_id, then plug in the device to host and you
will find following oops msg, the root cause is usb_stor_probe1()
refers invalid id entry if giving a dynamic id, so just disable the
feature.

[ 3105.018012] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 3105.018062] CPU 0
[ 3105.018075] Modules linked in: usb_storage usb_libusual bluetooth
dm_crypt binfmt_misc snd_hda_codec_analog snd_hda_intel snd_hda_codec
snd_hwdep hp_wmi ppdev sparse_keymap snd_pcm snd_seq_midi snd_rawmidi
snd_seq_midi_event snd_seq snd_timer snd_seq_device psmouse snd
serio_raw tpm_infineon soundcore i915 snd_page_alloc tpm_tis
parport_pc tpm tpm_bios drm_kms_helper drm i2c_algo_bit video lp
parport usbhid hid sg sr_mod sd_mod ehci_hcd uhci_hcd usbcore e1000e
usb_common floppy
[ 3105.018408]
[ 3105.018419] Pid: 189, comm: khubd Tainted: G          I  3.2.0-rc7+
#29 Hewlett-Packard HP Compaq dc7800p Convertible Minitower/0AACh
[ 3105.018481] RIP: 0010:[<ffffffffa045830d>]  [<ffffffffa045830d>]
usb_stor_probe1+0x2fd/0xc20 [usb_storage]
[ 3105.018536] RSP: 0018:ffff880056a3d830  EFLAGS: 00010286
[ 3105.018562] RAX: ffff880065f4e648 RBX: ffff88006bb28000 RCX: 0000000000000000
[ 3105.018597] RDX: ffff88006f23c7b0 RSI: 0000000000000001 RDI: 0000000000000206
[ 3105.018632] RBP: ffff880056a3d900 R08: 0000000000000000 R09: ffff880067365000
[ 3105.018665] R10: 00000000000002ac R11: 0000000000000010 R12: ffff6000b41a7340
[ 3105.018698] R13: ffff880065f4ef60 R14: ffff88006bb28b88 R15: ffff88006f23d270
[ 3105.018733] FS:  0000000000000000(0000) GS:ffff88007a200000(0000)
knlGS:0000000000000000
[ 3105.018773] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 3105.018801] CR2: 00007fc99c8c4650 CR3: 0000000001e05000 CR4: 00000000000006f0
[ 3105.018835] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3105.018870] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3105.018906] Process khubd (pid: 189, threadinfo ffff880056a3c000,
task ffff88005677a400)
[ 3105.018945] Stack:
[ 3105.018959]  0000000000000000 0000000000000000 ffff880056a3d8d0
0000000000000002
[ 3105.019011]  0000000000000000 ffff880056a3d918 ffff880000000000
0000000000000002
[ 3105.019058]  ffff880056a3d8d0 0000000000000012 ffff880056a3d8d0
0000000000000006
[ 3105.019105] Call Trace:
[ 3105.019128]  [<ffffffffa0458cd4>] storage_probe+0xa4/0xe0 [usb_storage]
[ 3105.019173]  [<ffffffffa0097822>] usb_probe_interface+0x172/0x330 [usbcore]
[ 3105.019211]  [<ffffffff815fda67>] driver_probe_device+0x257/0x3b0
[ 3105.019243]  [<ffffffff815fdd43>] __device_attach+0x73/0x90
[ 3105.019272]  [<ffffffff815fdcd0>] ? __driver_attach+0x110/0x110
[ 3105.019303]  [<ffffffff815fb93c>] bus_for_each_drv+0x9c/0xf0
[ 3105.019334]  [<ffffffff815fd6c7>] device_attach+0xf7/0x120
[ 3105.019364]  [<ffffffff815fc905>] bus_probe_device+0x45/0x80
[ 3105.019396]  [<ffffffff815f98a6>] device_add+0x876/0x990
[ 3105.019434]  [<ffffffffa0094e42>] usb_set_configuration+0x822/0x9e0 [usbcore]
[ 3105.019479]  [<ffffffffa00a3492>] generic_probe+0x62/0xf0 [usbcore]
[ 3105.019518]  [<ffffffffa0097a46>] usb_probe_device+0x66/0xb0 [usbcore]
[ 3105.019555]  [<ffffffff815fda67>] driver_probe_device+0x257/0x3b0
[ 3105.019589]  [<ffffffff815fdd43>] __device_attach+0x73/0x90
[ 3105.019617]  [<ffffffff815fdcd0>] ? __driver_attach+0x110/0x110
[ 3105.019648]  [<ffffffff815fb93c>] bus_for_each_drv+0x9c/0xf0
[ 3105.019680]  [<ffffffff815fd6c7>] device_attach+0xf7/0x120
[ 3105.019709]  [<ffffffff815fc905>] bus_probe_device+0x45/0x80
[ 3105.021040] usb usb6: usb auto-resume
[ 3105.021045] usb usb6: wakeup_rh
[ 3105.024849]  [<ffffffff815f98a6>] device_add+0x876/0x990
[ 3105.025086]  [<ffffffffa0088987>] usb_new_device+0x1e7/0x2b0 [usbcore]
[ 3105.025086]  [<ffffffffa008a4d7>] hub_thread+0xb27/0x1ec0 [usbcore]
[ 3105.025086]  [<ffffffff810d5200>] ? wake_up_bit+0x50/0x50
[ 3105.025086]  [<ffffffffa00899b0>] ? usb_remote_wakeup+0xa0/0xa0 [usbcore]
[ 3105.025086]  [<ffffffff810d49b8>] kthread+0xd8/0xf0
[ 3105.025086]  [<ffffffff81939884>] kernel_thread_helper+0x4/0x10
[ 3105.025086]  [<ffffffff8192a8c0>] ? _raw_spin_unlock_irq+0x50/0x80
[ 3105.025086]  [<ffffffff8192b1b4>] ? retint_restore_args+0x13/0x13
[ 3105.025086]  [<ffffffff810d48e0>] ? __init_kthread_worker+0x80/0x80
[ 3105.025086]  [<ffffffff81939880>] ? gs_change+0x13/0x13
[ 3105.025086] Code: 00 48 83 05 cd ad 00 00 01 48 83 05 cd ad 00 00
01 4c 8b ab 30 0c 00 00 48 8b 50 08 48 83 c0 30 48 89 45 a0 4c 89 a3
40 0c 00 00 <41> 0f b6 44 24 10 48 89 55 a8 3c ff 0f 84 b8 04 00 00 48
83 05
[ 3105.025086] RIP  [<ffffffffa045830d>] usb_stor_probe1+0x2fd/0xc20
[usb_storage]
[ 3105.025086]  RSP <ffff880056a3d830>
[ 3105.060037] hub 6-0:1.0: hub_resume
[ 3105.062616] usb usb5: usb auto-resume
[ 3105.064317] ehci_hcd 0000:00:1d.7: resume root hub
[ 3105.094809] ---[ end trace a7919e7f17c0a727 ]---
[ 3105.130069] hub 5-0:1.0: hub_resume
[ 3105.132131] usb usb4: usb auto-resume
[ 3105.132136] usb usb4: wakeup_rh
[ 3105.180059] hub 4-0:1.0: hub_resume
[ 3106.290052] usb usb6: suspend_rh (auto-stop)
[ 3106.290077] usb usb4: suspend_rh (auto-stop)

Signed-off-by: Huajun Li <huajun.li.lee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

mtd: mtd_blkdevs: don't increase 'open' count on error path
commit 342ff28 upstream.

Some error paths in mtd_blkdevs were fixed in the following commit:

    commit 94735ec
    mtd: mtd_blkdevs: fix error path in blktrans_open

But on these error paths, the block device's `dev->open' count is
already incremented before we check for errors. This meant that, while
the error path was handled correctly on the first time through
blktrans_open(), the device is erroneously considered already open on
the second time through.

This problem can be seen, for instance, when a UBI volume is
simultaneously mounted as a UBIFS partition and read through its
corresponding gluebi mtdblockX device. This results in blktrans_open()
passing its error checks (with `dev->open > 0') without actually having
a handle on the device. Here's a summarized log of the actions and
results with nandsim:

    # modprobe nandsim
    # modprobe mtdblock
    # modprobe gluebi
    # modprobe ubifs
    # ubiattach /dev/ubi_ctrl -m 0
    ...
    # ubimkvol /dev/ubi0 -N test -s 16MiB
    ...
    # mount -t ubifs ubi0:test /mnt
    # ls /dev/mtdblock*
    /dev/mtdblock0  /dev/mtdblock1
    # cat /dev/mtdblock1 > /dev/null
    cat: can't open '/dev/mtdblock4': Device or resource busy
    # cat /dev/mtdblock1 > /dev/null

    CPU 0 Unable to handle kernel paging request at virtual address
    fffffff0, epc == 8031536c, ra == 8031f280
    Oops[#1]:
    ...
    Call Trace:
    [<8031536c>] ubi_leb_read+0x14/0x164
    [<8031f280>] gluebi_read+0xf0/0x148
    [<802edba8>] mtdblock_readsect+0x64/0x198
    [<802ecfe4>] mtd_blktrans_thread+0x330/0x3f4
    [<8005be98>] kthread+0x88/0x90
    [<8000bc04>] kernel_thread_helper+0x10/0x18

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

EnJens pushed a commit that referenced this issue Oct 2, 2012

media: video: tegra: tegra_camera: re-arch power and clock
There are a couple of issues found in tegra_camera.

1. clock enable/disable is controlled by user space.
   -> If client process crashes, there is no way to disable clock.
2. power enable/disable is associated with clock enable.
   -> There is no reason to relate power with clock.
   -> There is only one regulator for this driver.
   -> As same as #1, it may leave power up when client process crashes.
3. driver allows multiple clients to access.
   -> This is not the case for this driver.

This changes addresses the problems described above.

Bug 948780

Change-Id: Ie534771327175f56cf0e138f1c07096ddba470a8
Signed-off-by: Jihoon Bang <jbang@nvidia.com>
Reviewed-on: http://git-master/r/92386
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: David Schalig <dschalig@nvidia.com>
Reviewed-by: Dan Willemsen <dwillemsen@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment