ath9k_htc: irrational sleep behaviour in mesh PS #15

Closed
mporsch opened this issue Nov 1, 2012 · 1 comment

mporsch commented Nov 1, 2012

Issue:

The hardware wakes up far too frequently and too early, and then starts passing beacons to the driver.
-> Power consumption is too high because the radio is not asleep.

Background:

To implement the wakeup scheduling for mesh PS in ath9k_htc, we employ two routines:

  • ath9k_hw_set_beacon_timers:
    (slightly modified so that it does not overwrite our own TBTT timer)
    programs the hardware registers with the given timing information (TBTT, beacon interval, sleep duration, ...)
    TBTT = most imminent neighbor beacon to wake up for
  • ath9k_hw_setpower:
    (unchanged vanilla code)
    puts the hardware in NETWORK_SLEEP mode (= disables the radio and waits for the TBTT interrupt)
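
The TBTT selection described above can be sketched in plain C. This is only an illustration and not the driver code: `struct neighbor`, `next_tbtt_us` and `pick_next_wakeup` are hypothetical names, and the sketch assumes that beacon intervals are given in TU (1 TU = 1024 us) and that all timestamps are microseconds on one common clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TU_TO_US(tu) ((uint64_t)(tu) * 1024u)  /* 1 TU = 1024 us */

/* Hypothetical per-neighbor record; not a structure from ath9k_htc. */
struct neighbor {
    uint64_t last_beacon_us;  /* time of the last known beacon of this peer */
    uint32_t beacon_int_tu;   /* beacon interval in TU */
};

/* First TBTT of this neighbor that lies strictly after now_us. */
static uint64_t next_tbtt_us(const struct neighbor *n, uint64_t now_us)
{
    uint64_t bi_us = TU_TO_US(n->beacon_int_tu);

    assert(bi_us > 0);
    if (now_us < n->last_beacon_us)
        return n->last_beacon_us;

    uint64_t elapsed = now_us - n->last_beacon_us;
    return n->last_beacon_us + (elapsed / bi_us + 1) * bi_us;
}

/* Index of the neighbor whose next TBTT is most imminent, or -1 if the
 * list is empty. The chosen TBTT is stored in *tbtt_us. */
static int pick_next_wakeup(const struct neighbor *list, size_t count,
                            uint64_t now_us, uint64_t *tbtt_us)
{
    int best = -1;

    for (size_t i = 0; i < count; i++) {
        uint64_t t = next_tbtt_us(&list[i], now_us);

        if (best < 0 || t < *tbtt_us) {
            best = (int)i;
            *tbtt_us = t;
        }
    }
    return best;
}
```

With BI=1000 TU, as reported for 00:03:7f:10:4e:31 in the trace, the interval is 1000 * 1024 us = 1024000 us, which matches the roughly 1.024 s spacing of that station's "bcn" lines.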

Approaches tried so far without success:

  • modifying CAP_AUTOSLEEP (the ar9280 does not have this capability, while the ar9271 does)
  • modifying the timing information passed to the hardware

Dmesg trace:

  • AWAKE -> NETWORK SLEEP indicates that the radio is (or should be) put to sleep
  • "bcn " is printed when a beacon from any station is received
  • -> the trace shows that beacons are still received after NETWORK SLEEP

[ 146.435133] mesh0: adding 00:03:7f:10:4e:31 to wakeup list
[ 146.437802] ath: phy0: AWAKE -> NETWORK SLEEP
[ 146.445256] ath: phy0: opmode is MESH_POINT
[ 146.445268] ath: phy0: adding 00:03:7f:10:4e:31 to wakeup list
[ 146.451251] ath: phy0: 00:03:7f:10:4e:31's beacon not received yet
[ 146.994165] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 147.036151] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 147.077321] ath: phy0: bcn cc:7d:37:81:47:80
[ 147.097287] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 147.127184] ath: phy0: bcn f8:d1:11:65:f0:13
[ 147.127372] ath: phy0: NETWORK SLEEP -> AWAKE
[ 147.140415] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 147.151240] ath: phy0: 00:03:7f:10:4e:31's beacon not received yet
[ 147.183169] ath: phy0: bcn cc:7d:37:81:47:80
[ 147.185126] ath: phy0: bcn 00:03:7f:10:4e:31
[ 147.185137] ath: phy0: updating 00:03:7f:10:4e:31 : BI=1000, DP=4, DC=1
[ 147.191259] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 1006592us
[ 147.199324] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 147.219242] ath: phy0: AWAKE -> NETWORK SLEEP
[ 147.227334] ath: phy0: NETWORK SLEEP -> AWAKE
[ 147.244446] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 147.251261] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 947200us
[ 147.277238] ath: phy0: AWAKE -> NETWORK SLEEP
[ 147.282161] ath: phy0: bcn cc:7d:37:81:47:80
[ 148.018335] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 148.060150] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 148.120325] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 148.169223] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 148.173145] ath: phy0: bcn f8:d1:11:65:f0:13
[ 148.173313] ath: phy0: NETWORK SLEEP -> AWAKE
[ 148.197239] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 1024us
[ 148.209207] ath: phy0: bcn cc:7d:37:81:47:80
[ 148.215215] ath: phy0: bcn c0:ea:e4:14:a4:9b
[ 148.223354] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 148.223459] ath: phy0: AWAKE -> NETWORK SLEEP
[ 148.231318] ath: phy0: NETWORK SLEEP -> AWAKE
[ 148.249277] ath: phy0: 00:03:7f:10:4e:31's beacon missed 1 time(s) -> next try in 962560us (margin 20TU)
[ 148.249288] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 962560us
[ 148.265270] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 148.275294] ath: phy0: AWAKE -> NETWORK SLEEP
[ 149.042349] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 149.085287] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 149.175213] ath: phy0: bcn f8:d1:11:65:f0:13
[ 149.175397] ath: phy0: NETWORK SLEEP -> AWAKE
[ 149.186210] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 149.199298] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 12288us
[ 149.227275] ath: phy0: AWAKE -> NETWORK SLEEP
[ 149.229249] ath: phy0: bcn 00:03:7f:10:4e:31
[ 149.229261] ath: phy0: updating 00:03:7f:10:4e:31 : BI=1000, DP=4, DC=3
[ 149.235301] ath: phy0: NETWORK SLEEP -> AWAKE
[ 149.247334] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 149.259284] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 984064us
[ 149.285162] ath: phy0: AWAKE -> NETWORK SLEEP
[ 149.293188] ath: phy0: NETWORK SLEEP -> AWAKE
[ 149.311252] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 931840us
[ 149.330153] ath: phy0: bcn cc:7d:37:81:47:80
[ 149.337259] ath: phy0: AWAKE -> NETWORK SLEEP
[ 150.066166] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 150.111252] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 150.155615] ath: phy0: bcn cc:7d:37:81:47:80
[ 150.169231] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 150.199294] ath: phy0: bcn f8:d1:11:65:f0:13
[ 150.199480] ath: phy0: NETWORK SLEEP -> AWAKE
[ 150.210256] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 150.224283] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 18432us
[ 150.250326] ath: phy0: AWAKE -> NETWORK SLEEP
[ 150.252340] ath: phy0: bcn cc:7d:37:81:47:80
[ 150.254352] ath: phy0: bcn 00:03:7f:10:4e:31
[ 150.254364] ath: phy0: updating 00:03:7f:10:4e:31 : BI=1000, DP=4, DC=2
[ 150.258330] ath: phy0: NETWORK SLEEP -> AWAKE
[ 150.271391] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 150.282156] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 986112us
[ 150.308258] ath: phy0: AWAKE -> NETWORK SLEEP
[ 150.314502] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 150.319213] ath: phy0: NETWORK SLEEP -> AWAKE
[ 150.337262] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 930816us
[ 150.363231] ath: phy0: AWAKE -> NETWORK SLEEP
[ 150.373153] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 150.459193] ath: phy0: bcn cc:7d:37:81:47:80
[ 150.476211] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 150.566352] ath: phy0: bcn cc:7d:37:81:47:80
[ 150.578265] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 150.622127] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 150.661275] ath: phy0: bcn cc:7d:37:81:47:80
[ 150.681195] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 150.722182] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 150.783224] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 150.827215] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 150.866296] ath: phy0: bcn cc:7d:37:81:47:80
[ 150.885201] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 150.970277] ath: phy0: bcn cc:7d:37:81:47:80
[ 151.030942] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 151.071003] ath: phy0: bcn cc:7d:37:81:47:80
[ 151.090380] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 151.132317] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 151.173318] ath: phy0: bcn cc:7d:37:81:47:80
[ 151.193215] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 151.223310] ath: phy0: bcn f8:d1:11:65:f0:13
[ 151.223483] ath: phy0: NETWORK SLEEP -> AWAKE
[ 151.234131] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 151.247258] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 20480us
[ 151.273265] ath: phy0: AWAKE -> NETWORK SLEEP
[ 151.276376] ath: phy0: bcn cc:7d:37:81:47:80
[ 152.156245] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 152.156457] ath: phy0: NETWORK SLEEP -> AWAKE
[ 152.174306] ath: phy0: 00:03:7f:10:4e:31's beacon missed 1 time(s) -> next try in 107520us (margin 20TU)
[ 152.174318] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 107520us
[ 152.197466] ath: phy0: bcn cc:7d:37:81:47:80
[ 152.200267] ath: phy0: AWAKE -> NETWORK SLEEP
[ 153.138192] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 153.138438] ath: phy0: NETWORK SLEEP -> AWAKE
[ 153.156262] ath: phy0: 00:03:7f:10:4e:31's beacon missed 2 time(s) -> next try in 139264us (margin 30TU)
[ 153.156275] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 139264us
[ 153.181444] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 153.181574] ath: phy0: AWAKE -> NETWORK SLEEP
[ 153.324279] ath: phy0: bcn cc:7d:37:81:47:80
[ 153.326263] ath: phy0: bcn 00:03:7f:10:4e:31
[ 153.326275] ath: phy0: updating 00:03:7f:10:4e:31 : BI=1000, DP=4, DC=3
[ 153.326424] ath: phy0: NETWORK SLEEP -> AWAKE
[ 153.343496] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 153.344490] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 994304us
[ 153.370304] ath: phy0: AWAKE -> NETWORK SLEEP
[ 153.378321] ath: phy0: NETWORK SLEEP -> AWAKE
[ 153.402265] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 936960us
[ 153.426150] ath: phy0: bcn cc:7d:37:81:47:80
[ 153.428153] ath: phy0: AWAKE -> NETWORK SLEEP
[ 154.162193] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 154.307234] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 154.348266] ath: phy0: bcn cc:7d:37:81:47:80
[ 154.352225] ath: phy0: bcn 00:03:7f:10:4e:31
[ 154.352236] ath: phy0: updating 00:03:7f:10:4e:31 : BI=1000, DP=4, DC=2
[ 154.352360] ath: phy0: NETWORK SLEEP -> AWAKE
[ 154.368266] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 154.370441] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 994304us
[ 154.397257] ath: phy0: AWAKE -> NETWORK SLEEP
[ 154.405273] ath: phy0: NETWORK SLEEP -> AWAKE
[ 154.429246] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 935936us
[ 154.455211] ath: phy0: AWAKE -> NETWORK SLEEP
[ 155.186107] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 155.229198] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 155.269146] ath: phy0: bcn cc:7d:37:81:47:80
[ 155.289303] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 155.319342] ath: phy0: bcn f8:d1:11:65:f0:13
[ 155.319526] ath: phy0: NETWORK SLEEP -> AWAKE
[ 155.331229] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 155.343215] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 21504us
[ 155.369254] ath: phy0: AWAKE -> NETWORK SLEEP
[ 155.373193] ath: phy0: bcn cc:7d:37:81:47:80
[ 155.375364] ath: phy0: bcn 00:03:7f:10:4e:31
[ 155.375437] ath: phy0: updating 00:03:7f:10:4e:31 : BI=1000, DP=4, DC=1
[ 155.377329] ath: phy0: NETWORK SLEEP -> AWAKE
[ 155.401294] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 986112us
[ 155.427268] ath: phy0: AWAKE -> NETWORK SLEEP
[ 155.433201] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 155.435307] ath: phy0: NETWORK SLEEP -> AWAKE
[ 155.453237] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 934912us
[ 155.474129] ath: phy0: bcn cc:7d:37:81:47:80
[ 155.479302] ath: phy0: AWAKE -> NETWORK SLEEP
[ 156.260221] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 156.313314] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 156.355248] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 156.396213] ath: phy0: bcn cc:7d:37:81:47:80
[ 156.398319] ath: phy0: bcn 00:03:7f:10:4e:31
[ 156.398331] ath: phy0: updating 00:03:7f:10:4e:31 : BI=1000, DP=4, DC=0
[ 156.398615] ath: phy0: NETWORK SLEEP -> AWAKE
[ 156.416334] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 994304us
[ 156.442244] ath: phy0: AWAKE -> NETWORK SLEEP
[ 156.450349] ath: phy0: NETWORK SLEEP -> AWAKE
[ 156.474252] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 936960us
[ 156.498227] ath: phy0: bcn cc:7d:37:81:47:80
[ 156.500314] ath: phy0: AWAKE -> NETWORK SLEEP
[ 157.234186] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 157.276241] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 157.317320] ath: phy0: bcn cc:7d:37:81:47:80
[ 157.336439] ath: phy0: bcn a0:21:b7:9a:fa:c3
[ 157.378187] ath: phy0: bcn c0:c1:c0:3b:5f:be
[ 157.420331] ath: phy0: bcn cc:7d:37:81:47:80
[ 157.422165] ath: phy0: bcn 00:03:7f:10:4e:31
[ 157.422181] ath: phy0: updating 00:03:7f:10:4e:31 : BI=1000, DP=4, DC=3
[ 157.422425] ath: phy0: NETWORK SLEEP -> AWAKE
[ 157.440242] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 994304us
[ 157.466333] ath: phy0: AWAKE -> NETWORK SLEEP
[ 157.474343] ath: phy0: NETWORK SLEEP -> AWAKE
[ 157.498337] ath: phy0: next wakeup is 00:03:7f:10:4e:31 in 936960us
[ 157.524381] ath: phy0: AWAKE -> NETWORK SLEEP
[ 157.525368] ath: phy0: bcn cc:7d:37:81:47:80

@ghost ghost assigned mporsch Nov 1, 2012

mporsch commented Nov 2, 2012

Solution:

Added a workaround that puts the hardware back to sleep immediately when it is woken up before the scheduled TBTT.
This may cost a few milliamperes more.
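
The decision behind such a workaround could look roughly like the following. The actual patch is not shown in this issue, so this is only a sketch under assumptions: `woke_too_early` is a hypothetical helper, timestamps are in microseconds, and the guard margin before the TBTT is an assumed parameter (the trace above mentions margins of 20 TU and 30 TU).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if a wakeup happened earlier than
 * margin_us before the scheduled TBTT, in which case the radio could be
 * sent straight back to NETWORK SLEEP. */
static bool woke_too_early(uint64_t now_us, uint64_t tbtt_us, uint64_t margin_us)
{
    assert(margin_us <= tbtt_us);  /* avoid unsigned wrap-around */
    return now_us < tbtt_us - margin_us;
}
```

A caller would check this on every NETWORK SLEEP -> AWAKE transition and only stay awake when the function returns false.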

@mporsch mporsch closed this as completed Nov 2, 2012
twpedersen pushed a commit that referenced this issue May 22, 2013
When hot removing memory presented at boot time, following messages are shown:

  kernel BUG at mm/slub.c:3409!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge stp llc ipmi_devintf ipmi_msghandler sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc vfat fat dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode pcspkr sg i2c_i801 lpc_ich mfd_core igb i2c_algo_bit i2c_core e1000e ptp pps_core tpm_infineon ioatdma dca sr_mod cdrom sd_mod crc_t10dif usb_storage megaraid_sas lpfc scsi_transport_fc scsi_tgt scsi_mod
  CPU 0
  Pid: 5091, comm: kworker/0:2 Tainted: G        W    3.9.0-rc6+ #15
  RIP: kfree+0x232/0x240
  Process kworker/0:2 (pid: 5091, threadinfo ffff88084678c000, task ffff88083928ca80)
  Call Trace:
    __release_region+0xd4/0xe0
    __remove_pages+0x52/0x110
    arch_remove_memory+0x89/0xd0
    remove_memory+0xc4/0x100
    acpi_memory_device_remove+0x6d/0xb1
    acpi_device_remove+0x89/0xab
    __device_release_driver+0x7c/0xf0
    device_release_driver+0x2f/0x50
    acpi_bus_device_detach+0x6c/0x70
    acpi_ns_walk_namespace+0x11a/0x250
    acpi_walk_namespace+0xee/0x137
    acpi_bus_trim+0x33/0x7a
    acpi_bus_hot_remove_device+0xc4/0x1a1
    acpi_os_execute_deferred+0x27/0x34
    process_one_work+0x1f7/0x590
    worker_thread+0x11a/0x370
    kthread+0xee/0x100
    ret_from_fork+0x7c/0xb0
  RIP  [<ffffffff811c41d2>] kfree+0x232/0x240
   RSP <ffff88084678d968>

The messages appear because a resource structure allocated by bootmem
is released with kfree().  So when we release a resource structure, we
should check whether it was allocated by bootmem or not.

But even if we know a resource structure was allocated by bootmem, we
cannot release it, since SLxB cannot handle it.  So, to reuse such a
resource structure, this patch remembers it in bootmem_resource as
follows:

When releasing a resource structure with free_resource(), free_resource()
checks whether the resource structure was allocated by bootmem or not.
If it was allocated by bootmem, free_resource() adds it to
bootmem_resource.  If it was not allocated by bootmem, free_resource()
releases it with kfree().

And when getting a new resource structure with get_resource(),
get_resource() checks whether bootmem_resource holds any released
resource structures.  If there is a released resource structure,
get_resource() returns it.  If there is no released resource
structure, get_resource() returns a new resource structure allocated by
kzalloc().
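
The recycle scheme described above can be sketched in userspace C. This is not the kernel patch: malloc-style allocation stands in for kzalloc()/kfree(), the `from_bootmem` flag is an assumption made for the sketch (the kernel has to derive this from the memory itself), locking is omitted, and the allocator is called alloc_resource() following the rename noted in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's struct resource. */
struct resource {
    struct resource *sibling;  /* reused here to chain the recycle list */
    bool from_bootmem;         /* assumption: explicit flag for the sketch */
};

/* Head of the list of released bootmem-allocated structures. */
static struct resource *bootmem_resource_free;

static void free_resource(struct resource *res)
{
    if (!res)
        return;

    if (res->from_bootmem) {
        /* Cannot be handed back to the allocator: park it for reuse. */
        res->sibling = bootmem_resource_free;
        bootmem_resource_free = res;
    } else {
        free(res);
    }
}

static struct resource *alloc_resource(void)
{
    struct resource *res = bootmem_resource_free;

    if (res) {
        /* Reuse a parked structure before allocating a new one. */
        assert(res->from_bootmem);
        bootmem_resource_free = res->sibling;
        res->sibling = NULL;
        return res;
    }
    return calloc(1, sizeof(*res));  /* zeroed, like kzalloc() */
}
```

The point of the design is that a bootmem-allocated structure is never passed to the slab allocator; it only moves between "in use" and the recycle list.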

[akpm@linux-foundation.org: s/get_resource/alloc_resource/]
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ctwitty pushed a commit that referenced this issue Jun 3, 2013
With the rwsem lock around
__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT), we
get a circular dependency when we call sysfs_remove_group().

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.9.0-rc7+ #15 Not tainted
 -------------------------------------------------------
 cat/2387 is trying to acquire lock:
  (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<c02f6179>] lock_policy_rwsem_read+0x25/0x34

 but task is already holding lock:
  (s_active#41){++++.+}, at: [<c00f9bf7>] sysfs_read_file+0x4f/0xcc

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

-> #1 (s_active#41){++++.+}:
        [<c0055a79>] lock_acquire+0x61/0xbc
        [<c00fabf1>] sysfs_addrm_finish+0xc1/0x128
        [<c00f9819>] sysfs_hash_and_remove+0x35/0x64
        [<c00fbe6f>] remove_files.isra.0+0x1b/0x24
        [<c00fbea5>] sysfs_remove_group+0x2d/0xa8
        [<c02f9a0b>] cpufreq_governor_interactive+0x13b/0x35c
        [<c02f61df>] __cpufreq_governor+0x2b/0x8c
        [<c02f6579>] __cpufreq_set_policy+0xa9/0xf8
        [<c02f6b75>] store_scaling_governor+0x61/0x100
        [<c02f6f4d>] store+0x39/0x60
        [<c00f9b81>] sysfs_write_file+0xed/0x114
        [<c00b3fd1>] vfs_write+0x65/0xd8
        [<c00b424b>] sys_write+0x2f/0x50
        [<c000cdc1>] ret_fast_syscall+0x1/0x52

-> #0 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
        [<c0055253>] __lock_acquire+0xef3/0x13dc
        [<c0055a79>] lock_acquire+0x61/0xbc
        [<c03ee1f5>] down_read+0x25/0x30
        [<c02f6179>] lock_policy_rwsem_read+0x25/0x34
        [<c02f6edd>] show+0x21/0x58
        [<c00f9c0f>] sysfs_read_file+0x67/0xcc
        [<c00b40a7>] vfs_read+0x63/0xd8
        [<c00b41fb>] sys_read+0x2f/0x50
        [<c000cdc1>] ret_fast_syscall+0x1/0x52

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(s_active#41);
                                lock(&per_cpu(cpu_policy_rwsem, cpu));
                                lock(s_active#41);
   lock(&per_cpu(cpu_policy_rwsem, cpu));

  *** DEADLOCK ***

 2 locks held by cat/2387:
  #0:  (&buffer->mutex){+.+.+.}, at: [<c00f9bcd>] sysfs_read_file+0x25/0xcc
  #1:  (s_active#41){++++.+}, at: [<c00f9bf7>] sysfs_read_file+0x4f/0xcc

 stack backtrace:
 [<c0011d55>] (unwind_backtrace+0x1/0x9c) from [<c03e9a09>] (print_circular_bug+0x19d/0x1e8)
 [<c03e9a09>] (print_circular_bug+0x19d/0x1e8) from [<c0055253>] (__lock_acquire+0xef3/0x13dc)
 [<c0055253>] (__lock_acquire+0xef3/0x13dc) from [<c0055a79>] (lock_acquire+0x61/0xbc)
 [<c0055a79>] (lock_acquire+0x61/0xbc) from [<c03ee1f5>] (down_read+0x25/0x30)
 [<c03ee1f5>] (down_read+0x25/0x30) from [<c02f6179>] (lock_policy_rwsem_read+0x25/0x34)
 [<c02f6179>] (lock_policy_rwsem_read+0x25/0x34) from [<c02f6edd>] (show+0x21/0x58)
 [<c02f6edd>] (show+0x21/0x58) from [<c00f9c0f>] (sysfs_read_file+0x67/0xcc)
 [<c00f9c0f>] (sysfs_read_file+0x67/0xcc) from [<c00b40a7>] (vfs_read+0x63/0xd8)
 [<c00b40a7>] (vfs_read+0x63/0xd8) from [<c00b41fb>] (sys_read+0x2f/0x50)
 [<c00b41fb>] (sys_read+0x2f/0x50) from [<c000cdc1>] (ret_fast_syscall+0x1/0x52)

This lock isn't required while calling __cpufreq_governor(policy,
CPUFREQ_GOV_POLICY_EXIT). Remove it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
twpedersen pushed a commit that referenced this issue Jul 16, 2013
The recent modification in the cpuidle framework consolidated the
timer broadcast code across the different drivers by setting a new
flag in the idle state. It tells the cpuidle core code to enter/exit
the broadcast mode for the cpu when entering a deep idle state. The
broadcast timer enter/exit is no longer handled by the back-end
driver.

This change caused local interrupts to be enabled *before* calling
CLOCK_EVENT_NOTIFY_EXIT.

On a Tegra114, a four-core system, the following warning appeared once
the flag had been introduced in the driver:

WARNING: at kernel/time/tick-broadcast.c:578 tick_broadcast_oneshot_control
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.0-rc3-next-20130529+ #15
[<c00667f8>] (tick_broadcast_oneshot_control+0x1a4/0x1d0) from [<c0065cd0>] (tick_notify+0x240/0x40c)
[<c0065cd0>] (tick_notify+0x240/0x40c) from [<c0044724>] (notifier_call_chain+0x44/0x84)
[<c0044724>] (notifier_call_chain+0x44/0x84) from [<c0044828>] (raw_notifier_call_chain+0x18/0x20)
[<c0044828>] (raw_notifier_call_chain+0x18/0x20) from [<c00650cc>] (clockevents_notify+0x28/0x170)
[<c00650cc>] (clockevents_notify+0x28/0x170) from [<c033f1f0>] (cpuidle_idle_call+0x11c/0x168)
[<c033f1f0>] (cpuidle_idle_call+0x11c/0x168) from [<c000ea94>] (arch_cpu_idle+0x8/0x38)
[<c000ea94>] (arch_cpu_idle+0x8/0x38) from [<c005ea80>] (cpu_startup_entry+0x60/0x134)
[<c005ea80>] (cpu_startup_entry+0x60/0x134) from [<804fe9a4>] (0x804fe9a4)

I don't have the hardware, so I wasn't able to reproduce the warning
but after looking a while at the code, I deduced the following:

 1. the CPU2 enters a deep idle state and sets the broadcast timer

 2. the timer expires, the tick_handle_oneshot_broadcast function is
    called, setting the tick_broadcast_pending_mask and waking up the
    idle cpu CPU2

 3. the CPU2 exits idle, handles the interrupt and then invokes
    tick_broadcast_oneshot_control with CLOCK_EVENT_NOTIFY_EXIT which
    runs the following code:

    [...]
    if (dev->next_event.tv64 == KTIME_MAX)
            goto out;

    if (cpumask_test_and_clear_cpu(cpu,
                                 tick_broadcast_pending_mask))
            goto out;
    [...]

    So if there is no next event scheduled for CPU2, we fulfil the
    first condition and jump out without clearing the
    tick_broadcast_pending_mask.

 4. CPU2 goes to deep idle again and calls
    tick_broadcast_oneshot_control with CLOCK_NOTIFY_EVENT_ENTER but
    with the tick_broadcast_pending_mask set for CPU2, triggering the
    warning.

The issue only surfaced due to the modifications of the cpuidle
framework, which resulted in interrupts being enabled before the call
to the clockevents code. If the call happens before interrupts have
been enabled, the warning cannot trigger, because there is still the
event pending which caused the broadcast timer expiry.

Move the check for the next event below the check for the pending bit,
so the pending bit gets cleared whether an event is scheduled on the
cpu or not.
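
The effect of swapping the two checks can be shown with a small userspace model. It is a sketch only: a 32-bit mask stands in for tick_broadcast_pending_mask, `NO_EVENT` stands in for KTIME_MAX, and `exit_old`/`exit_new` are hypothetical names for the exit path before and after the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_EVENT INT64_MAX     /* stands in for KTIME_MAX */

static uint32_t pending_mask;  /* stands in for tick_broadcast_pending_mask */

/* Clear the bit of @cpu and report whether it was set before. */
static bool test_and_clear(uint32_t *mask, unsigned cpu)
{
    assert(cpu < 32);
    bool was_set = (*mask >> cpu) & 1u;
    *mask &= ~(UINT32_C(1) << cpu);
    return was_set;
}

/* Order before the fix: a CPU without a next event leaves early and
 * its pending bit stays set. */
static void exit_old(unsigned cpu, int64_t next_event)
{
    if (next_event == NO_EVENT)
        return;
    if (test_and_clear(&pending_mask, cpu))
        return;
    /* ... the local timer would be reprogrammed here ... */
}

/* Order after the fix: the pending bit is cleared in every case. */
static void exit_new(unsigned cpu, int64_t next_event)
{
    if (test_and_clear(&pending_mask, cpu))
        return;
    if (next_event == NO_EVENT)
        return;
    /* ... the local timer would be reprogrammed here ... */
}
```

Calling exit_old() for CPU2 with its bit set and no next event leaves the bit set, which is the state that triggers the warning on the next idle entry; exit_new() clears it.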

[ tglx: Massaged changelog ]

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reported-and-tested-by: Joseph Lo <josephl@nvidia.com>
Cc: Stephen Warren <swarren@nvidia.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linaro-kernel@lists.linaro.org
Link: http://lkml.kernel.org/r/1371485735-31249-1-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
jasonabele pushed a commit that referenced this issue Aug 16, 2013
…s struct file

commit e4daf1f upstream.

The following call chain:
------------------------------------------------------------
nfs4_get_vfs_file
- nfsd_open
  - dentry_open
    - do_dentry_open
      - __get_file_write_access
        - get_write_access
          - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY;
------------------------------------------------------------

can result in the following state:
------------------------------------------------------------
struct nfs4_file {
...
  fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0},
  fi_access = {{
      counter = 0x1
    }, {
      counter = 0x0
    }},
...
------------------------------------------------------------

1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
NULL, hence nfsd_open() is called where we get status set to an error
and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach
nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented.

2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but
nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented.
Thus we leave a landmine in the form of the nfs4_file data structure in
an incorrect state.

3) Eventually, when __nfs4_file_put_access() is called it finds
fi_access[O_WRONLY] being non-zero, it decrements it and calls
nfs4_file_put_fd() which tries to fput -ETXTBSY.
------------------------------------------------------------
...
     [exception RIP: fput+0x9]
     RIP: ffffffff81177fa9  RSP: ffff88062e365c90  RFLAGS: 00010282
     RAX: ffff880c2b3d99cc  RBX: ffff880c2b3d9978  RCX: 0000000000000002
     RDX: dead000000100101  RSI: 0000000000000001  RDI: ffffffffffffffe6
     RBP: ffff88062e365c90   R8: ffff88041fe797d8   R9: ffff88062e365d58
     R10: 0000000000000008  R11: 0000000000000000  R12: 0000000000000001
     R13: 0000000000000007  R14: 0000000000000000  R15: 0000000000000000
     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
  #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd]
 #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd]
 #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd]
 #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd]
 #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd]
 #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd]
 #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd]
 #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc]
 #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc]
 #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd]
 #19 [ffff88062e365ee8] kthread at ffffffff81090886
 #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a
------------------------------------------------------------
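
The value 0xffffffffffffffe6 seen in fi_fds and in RDI above is an errno stored in a pointer: read as a signed number it is -26, i.e. -ETXTBSY. A userspace sketch modelled on the kernel's IS_ERR()/PTR_ERR() convention shows the decoding. The function names `is_err_value` and `ptr_err` are made up for this example, and the sketch assumes the usual two's-complement pointer representation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095  /* same bound the kernel uses for error pointers */

/* True if the pointer value lies in the top MAX_ERRNO addresses, which
 * are reserved for encoded error codes. */
static bool is_err_value(uintptr_t p)
{
    return p >= (uintptr_t)-MAX_ERRNO;
}

/* Decode the negative errno stored in an error pointer value. */
static long ptr_err(uintptr_t p)
{
    assert(is_err_value(p));
    return (long)(intptr_t)p;
}
```

Checking fp->fi_fds[O_WRONLY] only against NULL therefore treats this error value as a valid struct file pointer, which is what later ends up in fput().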

Signed-off-by: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
chunyeow pushed a commit that referenced this issue Aug 29, 2013
Several people reported the warning: "kernel BUG at kernel/timer.c:729!"
and the stack trace is:

	#7 [ffff880214d25c10] mod_timer+501 at ffffffff8106d905
	#8 [ffff880214d25c50] br_multicast_del_pg.isra.20+261 at ffffffffa0731d25 [bridge]
	#9 [ffff880214d25c80] br_multicast_disable_port+88 at ffffffffa0732948 [bridge]
	#10 [ffff880214d25cb0] br_stp_disable_port+154 at ffffffffa072bcca [bridge]
	#11 [ffff880214d25ce8] br_device_event+520 at ffffffffa072a4e8 [bridge]
	#12 [ffff880214d25d18] notifier_call_chain+76 at ffffffff8164aafc
	#13 [ffff880214d25d50] raw_notifier_call_chain+22 at ffffffff810858f6
	#14 [ffff880214d25d60] call_netdevice_notifiers+45 at ffffffff81536aad
	#15 [ffff880214d25d80] dev_close_many+183 at ffffffff81536d17
	#16 [ffff880214d25dc0] rollback_registered_many+168 at ffffffff81537f68
	#17 [ffff880214d25de8] rollback_registered+49 at ffffffff81538101
	#18 [ffff880214d25e10] unregister_netdevice_queue+72 at ffffffff815390d8
	#19 [ffff880214d25e30] __tun_detach+272 at ffffffffa074c2f0 [tun]
	#20 [ffff880214d25e88] tun_chr_close+45 at ffffffffa074c4bd [tun]
	#21 [ffff880214d25ea8] __fput+225 at ffffffff8119b1f1
	#22 [ffff880214d25ef0] ____fput+14 at ffffffff8119b3fe
	#23 [ffff880214d25f00] task_work_run+159 at ffffffff8107cf7f
	#24 [ffff880214d25f30] do_notify_resume+97 at ffffffff810139e1
	#25 [ffff880214d25f50] int_signal+18 at ffffffff8164f292

This happens because I forgot to check whether mp->timer is armed in
br_multicast_del_pg(). The bug was introduced by
commit 9f00b2e (bridge: only expire the mdb entry
when query is received).

Same for __br_mdb_del().

Tested-by: poma <pomidorabelisima@gmail.com>
Reported-by: LiYonghua <809674045@qq.com>
Reported-by: Robert Hancock <hancockrwd@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ctwitty pushed a commit that referenced this issue Sep 5, 2013
…s struct file

The following call chain:
------------------------------------------------------------
nfs4_get_vfs_file
- nfsd_open
  - dentry_open
    - do_dentry_open
      - __get_file_write_access
        - get_write_access
          - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY;
------------------------------------------------------------

can result in the following state:
------------------------------------------------------------
struct nfs4_file {
...
  fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0},
  fi_access = {{
      counter = 0x1
    }, {
      counter = 0x0
    }},
...
------------------------------------------------------------

1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
NULL, hence nfsd_open() is called where we get status set to an error
and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach
nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented.

2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but
nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented.
Thus we leave a landmine in the form of the nfs4_file data structure in
an incorrect state.

3) Eventually, when __nfs4_file_put_access() is called it finds
fi_access[O_WRONLY] being non-zero, it decrements it and calls
nfs4_file_put_fd() which tries to fput -ETXTBSY.
------------------------------------------------------------
...
     [exception RIP: fput+0x9]
     RIP: ffffffff81177fa9  RSP: ffff88062e365c90  RFLAGS: 00010282
     RAX: ffff880c2b3d99cc  RBX: ffff880c2b3d9978  RCX: 0000000000000002
     RDX: dead000000100101  RSI: 0000000000000001  RDI: ffffffffffffffe6
     RBP: ffff88062e365c90   R8: ffff88041fe797d8   R9: ffff88062e365d58
     R10: 0000000000000008  R11: 0000000000000000  R12: 0000000000000001
     R13: 0000000000000007  R14: 0000000000000000  R15: 0000000000000000
     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
  #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd]
 #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd]
 #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd]
 #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd]
 #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd]
 #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd]
 #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd]
 #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc]
 #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc]
 #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd]
 #19 [ffff88062e365ee8] kthread at ffffffff81090886
 #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a
------------------------------------------------------------

Cc: stable@vger.kernel.org
Signed-off-by: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
mporsch pushed a commit that referenced this issue Oct 18, 2013
When booting secondary CPUs, announce_cpu() is called to show which cpu has
been brought up. For example:

[    0.402751] smpboot: Booting Node   0, Processors  #1 #2 #3 #4 #5 OK
[    0.525667] smpboot: Booting Node   1, Processors  #6 #7 #8 #9 #10 #11 OK
[    0.755592] smpboot: Booting Node   0, Processors  #12 #13 #14 #15 #16 #17 OK
[    0.890495] smpboot: Booting Node   1, Processors  #18 #19 #20 #21 #22 #23

But the last "OK" is lost, because 'nr_cpu_ids-1' represents the maximum
possible cpu id. It should use the maximum present cpu id in case not all
CPUs booted up.
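
The difference between the two end-of-list tests can be sketched as follows. This is not the smpboot code: the present map is modelled as a plain bool array, and `last_present_cpu`, `is_last_old` and `is_last_new` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Highest index set in a present-CPU map, or -1 if the map is empty. */
static int last_present_cpu(const bool *present, size_t nr_cpu_ids)
{
    assert(present != NULL);
    for (size_t i = nr_cpu_ids; i-- > 0; )
        if (present[i])
            return (int)i;
    return -1;
}

/* Old test: only true for the highest *possible* cpu id. */
static bool is_last_old(int cpu, size_t nr_cpu_ids)
{
    return cpu == (int)nr_cpu_ids - 1;
}

/* New test: true for the highest *present* cpu id. */
static bool is_last_new(int cpu, const bool *present, size_t nr_cpu_ids)
{
    return cpu == last_present_cpu(present, nr_cpu_ids);
}
```

With 24 present CPUs on a system where nr_cpu_ids is 32, the old test never fires for CPU #23, so the trailing "OK" is never printed; the new test fires exactly there.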

Signed-off-by: Libin <huawei.libin@huawei.com>
Cc: <guohanjun@huawei.com>
Cc: <wangyijing@huawei.com>
Cc: <fenghua.yu@intel.com>
Cc: <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1378378676-18276-1-git-send-email-huawei.libin@huawei.com
[ tweaked the changelog, removed unnecessary line break, tweaked the format to align the fields vertically. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
silverjam pushed a commit that referenced this issue Nov 19, 2013
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, clean up the code and bind the
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      #1  #2  #3  #4  #5  #6  #7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  #8  #9 #10 #11 #12 #13 #14 #15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    #1 #2 #3 #4 #5 #6 #7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
silverjam pushed a commit that referenced this issue Nov 19, 2013
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
[    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
[    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
[    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
[    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.
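
For readers unfamiliar with the idea, a division-avoiding digit count can be sketched as below. This is an illustrative user-space version and not necessarily the exact code that was merged: it compares the value against growing powers of ten instead of dividing it down, and it uses a `long long` multiplier (an assumption of this sketch) so the multiplication cannot overflow for large `int` inputs.

```c
#include <assert.h>

/* Count the decimal digits of a non-negative value without any division. */
static int num_digits(int val)
{
    long long m = 10;   /* wider than int so m *= 10 cannot overflow */
    int d = 1;

    assert(val >= 0);
    while (val >= m) {
        m *= 10;
        d++;
    }
    return d;
}
```

The result gives the field width needed to right-align CPU numbers in the table above, e.g. three columns for CPU #127.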

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
silverjam pushed a commit that referenced this issue Nov 19, 2013
…heck

If the type we receive is greater than ST_MAX_CHANNELS, we cannot use it
as an array index, since doing so would access memory outside the array.
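
The kind of check described here might look like the following sketch. It is not the driver's code: `struct chan_handler`, `lookup_channel()` and the value 16 for `ST_MAX_CHANNELS` are assumptions made for illustration. The sketch rejects `type >= ST_MAX_CHANNELS`, because for a table with `ST_MAX_CHANNELS` entries an index equal to the size is already out of range.

```c
#include <assert.h>
#include <stddef.h>

#define ST_MAX_CHANNELS 16   /* assumed value, for illustration only */

/* Hypothetical per-channel entry. */
struct chan_handler {
    int registered;
};

/* Return the entry for 'type', or NULL if 'type' cannot index the table. */
static const struct chan_handler *
lookup_channel(const struct chan_handler *table, unsigned int type)
{
    assert(table != NULL);
    if (type >= ST_MAX_CHANNELS)
        return NULL;         /* never index past the end of the table */
    return &table[type];
}
```

A caller would drop or resynchronise on a NULL result instead of dereferencing whatever happens to sit behind the table.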

 Unable to handle kernel NULL pointer dereference at virtual address 0000001b
 pgd = c0004000
 [0000001b] *pgd=00000000
 Internal error: Oops: 17 [#1] PREEMPT SMP ARM
 Modules linked in: btwilink wl12xx wlcore mac80211 cfg80211 rfcomm bnep bluo
 CPU: 0    Tainted: G        W     (3.4.0+ #15)
 PC is at st_int_recv+0x278/0x344
 LR is at get_parent_ip+0x14/0x30
 pc : [<c03b01a8>]    lr : [<c007273c>]    psr: 200f0193
 sp : dc631ed0  ip : e3e21c24  fp : dc631f04
 r10: 00000000  r9 : 600f0113  r8 : 0000003f
 r7 : e3e21b14  r6 : 00000067  r5 : e2e49c1c  r4 : e3e21a80
 r3 : 00000001  r2 : 00000001  r1 : 00000001  r0 : 600f0113
 Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
 Control: 10c5387d  Table: 9c50004a  DAC: 00000015

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
silverjam pushed a commit that referenced this issue Nov 19, 2013
loop: fix crash if blk_alloc_queue fails

If blk_alloc_queue fails, loop_add cleans up, but it doesn't release the
identifier allocated with idr_alloc. That causes a crash on module unload in
idr_for_each(&loop_index_idr, &loop_exit_cb, NULL); where we attempt to
remove the non-existent device with that id.
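
The error-path pattern can be illustrated with a toy id allocator. This is a user-space sketch, not the loop driver: `id_alloc()`, `id_remove()`, `device_add()` and the `queue_alloc_fails` flag are hypothetical stand-ins for idr_alloc, the idr removal call and the failing blk_alloc_queue step.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 8

static bool id_in_use[MAX_IDS];   /* toy replacement for the id registry */

static int id_alloc(void)
{
    for (int i = 0; i < MAX_IDS; i++) {
        if (!id_in_use[i]) {
            id_in_use[i] = true;
            return i;
        }
    }
    return -1;
}

static void id_remove(int id)
{
    assert(id >= 0 && id < MAX_IDS);
    id_in_use[id] = false;
}

static int ids_in_use(void)
{
    int n = 0;

    for (int i = 0; i < MAX_IDS; i++)
        n += id_in_use[i];
    return n;
}

/* Returns the new id, or -1 on failure with no id left registered. */
static int device_add(bool queue_alloc_fails)
{
    int id = id_alloc();

    if (id < 0)
        return -1;
    if (queue_alloc_fails)
        goto out_free_id;     /* simulated allocation failure */
    return id;

out_free_id:
    id_remove(id);            /* the cleanup step the failure path needs */
    return -1;
}
```

Without the `id_remove()` call, a later walk over all registered ids would visit an id that has no device behind it, which matches the crash shown below.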

BUG: unable to handle kernel NULL pointer dereference at 0000000000000380
IP: [<ffffffff812057c9>] del_gendisk+0x19/0x2d0
PGD 43d399067 PUD 43d0ad067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: loop(-) dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_loop dm_mod ip6table_filter ip6_tables uvesafb cfbcopyarea cfbimgblt cfbfillrect fbcon font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor fb fbdev msr ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_state ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc tun ipv6 cpufreq_userspace cpufreq_stats cpufreq_ondemand cpufreq_conservative cpufreq_powersave spadfs fuse hid_generic usbhid hid raid0 md_mod dmi_sysfs nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc lm85 hwmon_vid snd_hwdep snd_usbmidi_lib snd_rawmidi snd soundcore acpi_cpufreq ohci_hcd freq_table tg3 ehci_pci mperf ehci_hcd kvm_amd kvm sata_svw serverworks libphy libata ide_core k10temp usbcore hwmon microcode ptp pcspkr pps_core e100 skge mii usb_common i2c_piix4 floppy evdev rtc_cmos i2c_core processor button unix
CPU: 7 PID: 2735 Comm: rmmod Tainted: G        W    3.10.15-devel #15
Hardware name: empty empty/S3992-E, BIOS 'V1.06   ' 06/09/2009
task: ffff88043d38e780 ti: ffff88043d21e000 task.ti: ffff88043d21e000
RIP: 0010:[<ffffffff812057c9>]  [<ffffffff812057c9>] del_gendisk+0x19/0x2d0
RSP: 0018:ffff88043d21fe10  EFLAGS: 00010282
RAX: ffffffffa05102e0 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88043ea82800 RDI: 0000000000000000
RBP: ffff88043d21fe48 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000000 R12: 00000000000000ff
R13: 0000000000000080 R14: 0000000000000000 R15: ffff88043ea82800
FS:  00007ff646534700(0000) GS:ffff880447000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000380 CR3: 000000043e9bf000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffffffff8100aba4 0000000000000092 ffff88043d21fe48 ffff88043ea82800
 00000000000000ff ffff88043d21fe98 0000000000000000 ffff88043d21fe60
 ffffffffa05102b4 0000000000000000 ffff88043d21fe70 ffffffffa05102ec
Call Trace:
 [<ffffffff8100aba4>] ? native_sched_clock+0x24/0x80
 [<ffffffffa05102b4>] loop_remove+0x14/0x40 [loop]
 [<ffffffffa05102ec>] loop_exit_cb+0xc/0x10 [loop]
 [<ffffffff81217b74>] idr_for_each+0x104/0x190
 [<ffffffffa05102e0>] ? loop_remove+0x40/0x40 [loop]
 [<ffffffff8109adc5>] ? trace_hardirqs_on_caller+0x105/0x1d0
 [<ffffffffa05135dc>] loop_exit+0x34/0xa58 [loop]
 [<ffffffff810a98ea>] SyS_delete_module+0x13a/0x260
 [<ffffffff81221d5e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff813cff16>] system_call_fastpath+0x1a/0x1f
Code: f0 4c 8b 6d f8 c9 c3 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 55 4c 8d af 80 00 00 00 41 54 53 48 89 fb 48 83 ec 18 <48> 83 bf 80 03 00
00 00 74 4d e8 98 fe ff ff 31 f6 48 c7 c7 20
RIP  [<ffffffff812057c9>] del_gendisk+0x19/0x2d0
 RSP <ffff88043d21fe10>
CR2: 0000000000000380
---[ end trace 64ec069ec70f1309 ]---

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org	# 3.1+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
silverjam pushed a commit that referenced this issue Nov 19, 2013
…ux/kernel/git/tip/tip

Pull x86 boot changes from Ingo Molnar:
 "Two changes that prettify and compactify the SMP bootup output from:

     smpboot: Booting Node   0, Processors  #1 #2 #3 OK
     smpboot: Booting Node   1, Processors  #4 #5 #6 #7 OK
     smpboot: Booting Node   2, Processors  #8 #9 #10 #11 OK
     smpboot: Booting Node   3, Processors  #12 #13 #14 #15 OK
     Brought up 16 CPUs

  to something like:

     x86: Booting SMP configuration:
     .... node  #0, CPUs:        #1  #2  #3
     .... node  #1, CPUs:    #4  #5  #6  #7
     .... node  #2, CPUs:    #8  #9 #10 #11
     .... node  #3, CPUs:   #12 #13 #14 #15
     x86: Booted up 4 nodes, 16 CPUs"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Further compress CPUs bootup message
  x86: Improve the printout of the SMP bootup CPU table
ctwitty pushed a commit that referenced this issue Mar 13, 2014
 BUG: sleeping function called from invalid context at mm/mempool.c:203
 in_atomic(): 1, irqs_disabled(): 0, pid: 43502, name: linbug
 no locks held by linbug/43502.
 CPU: 7 PID: 43502 Comm: linbug Not tainted 3.13.0-rc1+ #15
 Hardware name:
  0000000000000010 ffff88005ebd1878 ffffffff8172d512 ffff8801752bc1c0
  ffff8801752bc1c0 ffff88005ebd1898 ffffffff8109d1f6 ffff88005f9a3c58
  ffff880177f0f080 ffff88005ebd1918 ffffffff81161f43 ffff88005ebd18f8
 Call Trace:
  [<ffffffff8172d512>] dump_stack+0x4e/0x68
  [<ffffffff8109d1f6>] __might_sleep+0xe6/0x120
  [<ffffffff81161f43>] mempool_alloc+0x93/0x170
  [<ffffffff810c0c34>] ? mark_held_locks+0x74/0x140
  [<ffffffff8118a826>] ? follow_page_mask+0x556/0x600
  [<ffffffff814107ae>] dmaengine_get_unmap_data+0x2e/0x60
  [<ffffffff81410f11>] dma_async_memcpy_pg_to_pg+0x41/0x1c0
  [<ffffffff814110e0>] dma_async_memcpy_buf_to_pg+0x50/0x60
  [<ffffffff81411bdc>] dma_memcpy_to_iovec+0xfc/0x190
  [<ffffffff816163af>] dma_skb_copy_datagram_iovec+0x6f/0x2b0

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
ctwitty pushed a commit that referenced this issue Mar 26, 2014
When allocating an skb for a new PDU, l2cap_chan is unlocked so that we
can sleep waiting for memory; otherwise a deadlock is possible, as
fixed in e454c84. However, in a6a5568 the lock was moved from socket
to channel level, and it is no longer safe to just unlock and lock again
without checking the l2cap_chan state, since the channel can be
disconnected while the lock is not held.

This patch adds the missing checks for l2cap_chan state when returning
from the call which allocates the skb.
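
The unlock, allocate, relock, re-check sequence can be sketched in user-space C as follows. Everything here is a simplified assumption for illustration, not the Bluetooth stack's code: `struct chan`, the integer `locked` flag standing in for the channel lock, the two allocator callbacks, and the choice of `-ENOTCONN` as the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

enum chan_state { CHAN_CONNECTED, CHAN_DISCONNECTED };

struct chan {
    enum chan_state state;
    int locked;               /* toy stand-in for the channel lock */
};

typedef void *(*alloc_fn)(struct chan *c);

/* Allocation during which nothing else happens. */
static void *alloc_ok(struct chan *c)
{
    (void)c;
    return malloc(16);
}

/* Simulates a disconnect racing in while the lock is dropped. */
static void *alloc_while_disconnecting(struct chan *c)
{
    assert(!c->locked);
    c->state = CHAN_DISCONNECTED;
    return malloc(16);
}

static int send_pdu(struct chan *c, alloc_fn alloc)
{
    void *buf;

    assert(c->locked);
    c->locked = 0;            /* drop the lock so the allocation may sleep */
    buf = alloc(c);
    c->locked = 1;            /* take the lock again */

    if (!buf)
        return -ENOMEM;
    if (c->state != CHAN_CONNECTED) {
        free(buf);            /* state changed while unlocked: back out */
        return -ENOTCONN;
    }

    /* A real sender would queue the buffer here; the sketch just frees it. */
    free(buf);
    return 0;
}
```

Calling `send_pdu()` with `alloc_while_disconnecting` returns an error instead of going on to use a channel that was torn down in the meantime.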

Scenario is easily reproducible by running rfcomm-tester in a loop.

BUG: unable to handle kernel NULL pointer dereference at         (null)
IP: [<ffffffffa0442169>] l2cap_do_send+0x29/0x120 [bluetooth]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 7 PID: 4038 Comm: krfcommd Not tainted 3.14.0-rc2+ #15
Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A10 11/24/2011
task: ffff8802bdd731c0 ti: ffff8801ec986000 task.ti: ffff8801ec986000
RIP: 0010:[<ffffffffa0442169>]  [<ffffffffa0442169>] l2cap_do_send+0x29/0x120
RSP: 0018:ffff8801ec987ad8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8800c5796800 RCX: 0000000000000000
RDX: ffff880410e7a800 RSI: ffff8802b6c1da00 RDI: ffff8800c5796800
RBP: ffff8801ec987af8 R08: 00000000000000c0 R09: 0000000000000300
R10: 000000000000573b R11: 000000000000573a R12: ffff8802b6c1da00
R13: 0000000000000000 R14: ffff8802b6c1da00 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88042dce0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000041257c000 CR4: 00000000000407e0
Stack:
 ffff8801ec987d78 ffff8800c5796800 ffff8801ec987d78 0000000000000000
 ffff8801ec987ba8 ffffffffa0449e37 0000000000000004 ffff8801ec987af0
 ffff8801ec987d40 0000000000000282 0000000000000000 ffffffff00000004
Call Trace:
 [<ffffffffa0449e37>] l2cap_chan_send+0xaa7/0x1120 [bluetooth]
 [<ffffffff81770100>] ? _raw_spin_unlock_bh+0x20/0x40
 [<ffffffffa045188b>] l2cap_sock_sendmsg+0xcb/0x110 [bluetooth]
 [<ffffffff81652b0f>] sock_sendmsg+0xaf/0xc0
 [<ffffffff810a8381>] ? update_curr+0x141/0x200
 [<ffffffff810a8961>] ? dequeue_entity+0x181/0x520
 [<ffffffff81652b60>] kernel_sendmsg+0x40/0x60
 [<ffffffffa04a8505>] rfcomm_send_frame+0x45/0x70 [rfcomm]
 [<ffffffff810766f0>] ? internal_add_timer+0x20/0x50
 [<ffffffffa04a8564>] rfcomm_send_cmd+0x34/0x60 [rfcomm]
 [<ffffffffa04a8605>] rfcomm_send_disc+0x75/0xa0 [rfcomm]
 [<ffffffffa04aacec>] rfcomm_run+0x8cc/0x1a30 [rfcomm]
 [<ffffffffa04aa420>] ? rfcomm_check_accept+0xc0/0xc0 [rfcomm]
 [<ffffffff8108e3a9>] kthread+0xc9/0xe0
 [<ffffffff8108e2e0>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff817795fc>] ret_from_fork+0x7c/0xb0
 [<ffffffff8108e2e0>] ? flush_kthread_worker+0xb0/0xb0
Code: 00 00 66 66 66 66 90 55 48 89 e5 48 83 ec 20 f6 05 d6 a3 02 00 04
RIP  [<ffffffffa0442169>] l2cap_do_send+0x29/0x120 [bluetooth]
 RSP <ffff8801ec987ad8>
CR2: 0000000000000000

Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
@mporsch mporsch removed their assignment Jan 27, 2015
This issue was closed.