Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove trailing slash from kernel.org link #28

Closed
wants to merge 1 commit into from

Conversation

mikegagnon
Copy link

A trailing slash on a URL connotes a link to a directory; an absence of a trailing slash connotes a link to a page.

@mikegagnon mikegagnon closed this Mar 1, 2013
@mikegagnon mikegagnon deleted the patch-1 branch March 1, 2013 21:26
koct9i pushed a commit to koct9i/linux that referenced this pull request Mar 7, 2013
should fix the following issue

	[ 3229.815012] [ BUG: lock held when returning to user space! ]
	[ 3229.815016] 3.5.0-rc7-wl torvalds#28 Tainted: G        W  O
	[ 3229.815017]
	------------------------------------------------
	[ 3229.815019] wpa_supplicant/5783 is leaving the kernel with locks still held!
	[ 3229.815022] 1 lock held by wpa_supplicant/5783:
	[ 3229.815023]  #0: (reg_mutex){+.+.+.}, at: [<fa65834d>]
	reg_last_request_cell_base+0x1d/0x60 [cfg80211]

Cc: Luis Rodriguez <mcgrof@gmail.com>
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Tested-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
cianmcgovern pushed a commit to cianmcgovern/linux that referenced this pull request Mar 10, 2013
commit 62d2feb upstream.

Fix crash after issuing:
	echo hmc5843 0x1e > /sys/class/i2c-dev/i2c-2/device/new_device

	[   37.180999] device: '2-001e': device_add
	[   37.188293] bus: 'i2c': add device 2-001e
	[   37.194549] PM: Adding info for i2c:2-001e
	[   37.200958] bus: 'i2c': driver_probe_device: matched device 2-001e with driver hmc5843
	[   37.210815] bus: 'i2c': really_probe: probing driver hmc5843 with device 2-001e
	[   37.224884] HMC5843 initialized
	[   37.228759] ------------[ cut here ]------------
	[   37.233612] kernel BUG at mm/slab.c:505!
	[   37.237701] Internal error: Oops - BUG: 0 [#1] PREEMPT
	[   37.243103] Modules linked in:
	[   37.246337] CPU: 0    Not tainted  (3.3.1-gta04+ torvalds#28)
	[   37.251647] PC is at kfree+0x84/0x144
	[   37.255493] LR is at kfree+0x20/0x144
	[   37.259338] pc : [<c00b408c>]    lr : [<c00b4028>]    psr: 40000093
	[   37.259368] sp : de249cd8  ip : 0000000c  fp : 00000090
	[   37.271362] r10: 0000000a  r9 : de229eac  r8 : c0236274
	[   37.276855] r7 : c09d6490  r6 : a0000013  r5 : de229c00  r4 : de229c10
	[   37.283691] r3 : c0f00218  r2 : 00000400  r1 : c0eea000  r0 : c00b4028
	[   37.290527] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
	[   37.298095] Control: 10c5387d  Table: 9e1d0019  DAC: 00000015
	[   37.304107] Process sh (pid: 91, stack limit = 0xde2482f0)
	[   37.309844] Stack: (0xde249cd8 to 0xde24a000)
	[   37.314422] 9cc0:                                                       de229c10 de229c00
	[   37.322998] 9ce0: de229c10 ffffffea 00000005 c0236274 de140a80 c00b4798 dec00080 de140a80
	[   37.331573] 9d00: c032f37c dec00080 000080d0 00000001 de229c00 de229c10 c048d578 00000005
	[   37.340148] 9d20: de229eac 0000000a 00000090 c032fa40 00000001 00000000 00000001 de229c10
	[   37.348724] 9d40: de229eac 00000029 c075b558 00000001 00000003 00000004 de229c10 c048d594
	[   37.357299] 9d60: 00000000 60000013 00000018 205b0007 37332020 3432322e 5d343838 c0060020
	[   37.365905] 9d80: de251600 00000001 00000000 de251600 00000001 c0065a84 de229c00 de229c48
	[   37.374481] 9da0: 00000006 0048d62c de229c38 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.383056] 9dc0: 00000000 c048d62c 00000000 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.391632] 9de0: 00000000 c048d62c 00000000 c0330164 00000000 de1f6c20 c048d62c de1f6c00
	[   37.400207] 9e00: c0330078 de1f6c04 c078d714 de189b58 00000000 c02ccfd8 de1f6c20 c0795f40
	[   37.408782] 9e20: c0238330 00000000 00000000 c02381a8 de1b9fc0 de1f6c20 de1f6c20 de249e48
	[   37.417358] 9e40: c0238330 c0236bb0 decdbed8 de7d0f14 de1f6c20 de1f6c20 de1f6c54 de1f6c20
	[   37.425933] 9e60: 00000000 c0238030 de1f6c20 c078d7bc de1f6c20 c02377ec de1f6c20 de1f6c28
	[   37.434509] 9e80: dee64cb0 c0236138 c047c554 de189b58 00000000 c004b45c de1f6c20 de1f6cd8
	[   37.443084] 9ea0: c0edfa6c de1f6c00 dee64c68 de1f6c04 de1f6c20 dee64cb8 c047c554 de189b58
	[   37.451690] 9ec0: 00000000 c02cd634 dee64c68 de249ef4 de23b008 dee64cb0 0000000d de23b000
	[   37.460266] 9ee0: de23b007 c02cd78c 00000002 00000000 00000000 35636d68 00333438 00000000
	[   37.468841] 9f00: 00000000 00000000 001e0000 00000000 00000000 00000000 00000000 0a10cec0
	[   37.477416] 9f20: 00000002 de249f80 0000000d dee62990 de189b40 c0234d88 0000000d c010c354
	[   37.485992] 9f40: 0000000d de210f28 000acc88 de249f80 0000000d de248000 00000000 c00b7bf8
	[   37.494567] 9f60: de210f28 000acc88 de210f28 000acc88 00000000 00000000 0000000d c00b7ed8
	[   37.503143] 9f80: 00000000 00000000 0000000d 00000000 0007fa28 0000000d 000acc88 00000004
	[   37.511718] 9fa0: c000e544 c000e380 0007fa28 0000000d 00000001 000acc88 0000000d 00000000
	[   37.520294] 9fc0: 0007fa28 0000000d 000acc88 00000004 00000001 00000020 00000002 00000000
	[   37.528869] 9fe0: 00000000 beab8624 0000ea05 b6eaebac 600d0010 00000001 00000000 00000000
	[   37.537475] [<c00b408c>] (kfree+0x84/0x144) from [<c0236274>] (device_add+0x530/0x57c)
	[   37.545806] [<c0236274>] (device_add+0x530/0x57c) from [<c032fa40>] (iio_device_register+0x8c8/0x990)
	[   37.555480] [<c032fa40>] (iio_device_register+0x8c8/0x990) from [<c0330164>] (hmc5843_probe+0xec/0x114)
	[   37.565338] [<c0330164>] (hmc5843_probe+0xec/0x114) from [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8)
	[   37.574737] [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8) from [<c02381a8>] (driver_probe_device+0x118/0x218)
	[   37.584777] [<c02381a8>] (driver_probe_device+0x118/0x218) from [<c0236bb0>] (bus_for_each_drv+0x4c/0x84)
	[   37.594818] [<c0236bb0>] (bus_for_each_drv+0x4c/0x84) from [<c0238030>] (device_attach+0x78/0xa4)
	[   37.604125] [<c0238030>] (device_attach+0x78/0xa4) from [<c02377ec>] (bus_probe_device+0x28/0x9c)
	[   37.613433] [<c02377ec>] (bus_probe_device+0x28/0x9c) from [<c0236138>] (device_add+0x3f4/0x57c)
	[   37.622650] [<c0236138>] (device_add+0x3f4/0x57c) from [<c02cd634>] (i2c_new_device+0xf8/0x19c)
	[   37.631805] [<c02cd634>] (i2c_new_device+0xf8/0x19c) from [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130)
	[   37.641754] [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130) from [<c0234d88>] (dev_attr_store+0x18/0x24)
	[   37.651611] [<c0234d88>] (dev_attr_store+0x18/0x24) from [<c010c354>] (sysfs_write_file+0x10c/0x140)
	[   37.661193] [<c010c354>] (sysfs_write_file+0x10c/0x140) from [<c00b7bf8>] (vfs_write+0xb0/0x178)
	[   37.670410] [<c00b7bf8>] (vfs_write+0xb0/0x178) from [<c00b7ed8>] (sys_write+0x3c/0x68)
	[   37.678833] [<c00b7ed8>] (sys_write+0x3c/0x68) from [<c000e380>] (ret_fast_syscall+0x0/0x3c)
	[   37.687683] Code: 1593301c e5932000 e3120080 1a000000 (e7f001f2)
	[   37.700775] ---[ end trace aaf805debdb69390 ]---

Client data was assigned to iio_dev structure in probe but in
hmc5843_init_client function casted to private driver data structure which
is wrong. Possibly calling mutex_init(&data->lock); corrupt data
which the lead to above crash.

Signed-off-by: Marek Belisko <marek.belisko@open-nandra.com>
Acked-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
torvalds pushed a commit that referenced this pull request Mar 27, 2013
With git commit c705c78
"acpi: Export the acpi_processor_get_performance_info" we are now
using a different mechanism to access the P-states.

The acpi_processor per-cpu structure is set and filtered by the
core ACPI code which shrinks the per_cpu contents to only online CPUs.
In the past we would call acpi_processor_register_performance()
which would have not tried to dereference offline cpus.

With the new patch and the fact that the loop we take is for
for_all_possible_cpus we end up crashing on some machines.
We could modify the loop to be for online_cpus - but all the other
loops in the code use possible_cpus (for a good reason) - so lets
leave it as so and just check if per_cpu(processor) is NULL.

With this patch we will bypass the !online but possible CPUs.
This fixes:

IP: [<ffffffffa00d13b5>] xen_acpi_processor_init+0x1b6/0xe01 [xen_acpi_processor]
PGD 4126e6067 PUD 4126e3067 PMD 0
Oops: 0002 [#1] SMP
Pid: 432, comm: modprobe Not tainted 3.9.0-rc3+ #28 To be filled by O.E.M. To be filled by O.E.M./M5A97
RIP: e030:[<ffffffffa00d13b5>]  [<ffffffffa00d13b5>] xen_acpi_processor_init+0x1b6/0xe01 [xen_acpi_processor]
RSP: e02b:ffff88040c8a3ce8  EFLAGS: 00010282
.. snip..
Call Trace:
 [<ffffffffa00d11ff>] ? read_acpi_id+0x12b/0x12b [xen_acpi_processor]
 [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
 [<ffffffff810c42c3>] load_module+0x1cd3/0x2870
 [<ffffffff81319b70>] ? ddebug_proc_open+0xc0/0xc0
 [<ffffffff810c4f37>] sys_init_module+0xd7/0x120
 [<ffffffff8166ce19>] system_call_fastpath+0x16/0x1b

on some machines.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
sanyaade-embedded-systems pushed a commit to sanyaade-embedded-systems/linux-1 that referenced this pull request Apr 4, 2013
commit 62d2feb upstream.

Fix crash after issuing:
	echo hmc5843 0x1e > /sys/class/i2c-dev/i2c-2/device/new_device

	[   37.180999] device: '2-001e': device_add
	[   37.188293] bus: 'i2c': add device 2-001e
	[   37.194549] PM: Adding info for i2c:2-001e
	[   37.200958] bus: 'i2c': driver_probe_device: matched device 2-001e with driver hmc5843
	[   37.210815] bus: 'i2c': really_probe: probing driver hmc5843 with device 2-001e
	[   37.224884] HMC5843 initialized
	[   37.228759] ------------[ cut here ]------------
	[   37.233612] kernel BUG at mm/slab.c:505!
	[   37.237701] Internal error: Oops - BUG: 0 [#1] PREEMPT
	[   37.243103] Modules linked in:
	[   37.246337] CPU: 0    Not tainted  (3.3.1-gta04+ torvalds#28)
	[   37.251647] PC is at kfree+0x84/0x144
	[   37.255493] LR is at kfree+0x20/0x144
	[   37.259338] pc : [<c00b408c>]    lr : [<c00b4028>]    psr: 40000093
	[   37.259368] sp : de249cd8  ip : 0000000c  fp : 00000090
	[   37.271362] r10: 0000000a  r9 : de229eac  r8 : c0236274
	[   37.276855] r7 : c09d6490  r6 : a0000013  r5 : de229c00  r4 : de229c10
	[   37.283691] r3 : c0f00218  r2 : 00000400  r1 : c0eea000  r0 : c00b4028
	[   37.290527] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
	[   37.298095] Control: 10c5387d  Table: 9e1d0019  DAC: 00000015
	[   37.304107] Process sh (pid: 91, stack limit = 0xde2482f0)
	[   37.309844] Stack: (0xde249cd8 to 0xde24a000)
	[   37.314422] 9cc0:                                                       de229c10 de229c00
	[   37.322998] 9ce0: de229c10 ffffffea 00000005 c0236274 de140a80 c00b4798 dec00080 de140a80
	[   37.331573] 9d00: c032f37c dec00080 000080d0 00000001 de229c00 de229c10 c048d578 00000005
	[   37.340148] 9d20: de229eac 0000000a 00000090 c032fa40 00000001 00000000 00000001 de229c10
	[   37.348724] 9d40: de229eac 00000029 c075b558 00000001 00000003 00000004 de229c10 c048d594
	[   37.357299] 9d60: 00000000 60000013 00000018 205b0007 37332020 3432322e 5d343838 c0060020
	[   37.365905] 9d80: de251600 00000001 00000000 de251600 00000001 c0065a84 de229c00 de229c48
	[   37.374481] 9da0: 00000006 0048d62c de229c38 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.383056] 9dc0: 00000000 c048d62c 00000000 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.391632] 9de0: 00000000 c048d62c 00000000 c0330164 00000000 de1f6c20 c048d62c de1f6c00
	[   37.400207] 9e00: c0330078 de1f6c04 c078d714 de189b58 00000000 c02ccfd8 de1f6c20 c0795f40
	[   37.408782] 9e20: c0238330 00000000 00000000 c02381a8 de1b9fc0 de1f6c20 de1f6c20 de249e48
	[   37.417358] 9e40: c0238330 c0236bb0 decdbed8 de7d0f14 de1f6c20 de1f6c20 de1f6c54 de1f6c20
	[   37.425933] 9e60: 00000000 c0238030 de1f6c20 c078d7bc de1f6c20 c02377ec de1f6c20 de1f6c28
	[   37.434509] 9e80: dee64cb0 c0236138 c047c554 de189b58 00000000 c004b45c de1f6c20 de1f6cd8
	[   37.443084] 9ea0: c0edfa6c de1f6c00 dee64c68 de1f6c04 de1f6c20 dee64cb8 c047c554 de189b58
	[   37.451690] 9ec0: 00000000 c02cd634 dee64c68 de249ef4 de23b008 dee64cb0 0000000d de23b000
	[   37.460266] 9ee0: de23b007 c02cd78c 00000002 00000000 00000000 35636d68 00333438 00000000
	[   37.468841] 9f00: 00000000 00000000 001e0000 00000000 00000000 00000000 00000000 0a10cec0
	[   37.477416] 9f20: 00000002 de249f80 0000000d dee62990 de189b40 c0234d88 0000000d c010c354
	[   37.485992] 9f40: 0000000d de210f28 000acc88 de249f80 0000000d de248000 00000000 c00b7bf8
	[   37.494567] 9f60: de210f28 000acc88 de210f28 000acc88 00000000 00000000 0000000d c00b7ed8
	[   37.503143] 9f80: 00000000 00000000 0000000d 00000000 0007fa28 0000000d 000acc88 00000004
	[   37.511718] 9fa0: c000e544 c000e380 0007fa28 0000000d 00000001 000acc88 0000000d 00000000
	[   37.520294] 9fc0: 0007fa28 0000000d 000acc88 00000004 00000001 00000020 00000002 00000000
	[   37.528869] 9fe0: 00000000 beab8624 0000ea05 b6eaebac 600d0010 00000001 00000000 00000000
	[   37.537475] [<c00b408c>] (kfree+0x84/0x144) from [<c0236274>] (device_add+0x530/0x57c)
	[   37.545806] [<c0236274>] (device_add+0x530/0x57c) from [<c032fa40>] (iio_device_register+0x8c8/0x990)
	[   37.555480] [<c032fa40>] (iio_device_register+0x8c8/0x990) from [<c0330164>] (hmc5843_probe+0xec/0x114)
	[   37.565338] [<c0330164>] (hmc5843_probe+0xec/0x114) from [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8)
	[   37.574737] [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8) from [<c02381a8>] (driver_probe_device+0x118/0x218)
	[   37.584777] [<c02381a8>] (driver_probe_device+0x118/0x218) from [<c0236bb0>] (bus_for_each_drv+0x4c/0x84)
	[   37.594818] [<c0236bb0>] (bus_for_each_drv+0x4c/0x84) from [<c0238030>] (device_attach+0x78/0xa4)
	[   37.604125] [<c0238030>] (device_attach+0x78/0xa4) from [<c02377ec>] (bus_probe_device+0x28/0x9c)
	[   37.613433] [<c02377ec>] (bus_probe_device+0x28/0x9c) from [<c0236138>] (device_add+0x3f4/0x57c)
	[   37.622650] [<c0236138>] (device_add+0x3f4/0x57c) from [<c02cd634>] (i2c_new_device+0xf8/0x19c)
	[   37.631805] [<c02cd634>] (i2c_new_device+0xf8/0x19c) from [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130)
	[   37.641754] [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130) from [<c0234d88>] (dev_attr_store+0x18/0x24)
	[   37.651611] [<c0234d88>] (dev_attr_store+0x18/0x24) from [<c010c354>] (sysfs_write_file+0x10c/0x140)
	[   37.661193] [<c010c354>] (sysfs_write_file+0x10c/0x140) from [<c00b7bf8>] (vfs_write+0xb0/0x178)
	[   37.670410] [<c00b7bf8>] (vfs_write+0xb0/0x178) from [<c00b7ed8>] (sys_write+0x3c/0x68)
	[   37.678833] [<c00b7ed8>] (sys_write+0x3c/0x68) from [<c000e380>] (ret_fast_syscall+0x0/0x3c)
	[   37.687683] Code: 1593301c e5932000 e3120080 1a000000 (e7f001f2)
	[   37.700775] ---[ end trace aaf805debdb69390 ]---

Client data was assigned to iio_dev structure in probe but in
hmc5843_init_client function casted to private driver data structure which
is wrong. Possibly calling mutex_init(&data->lock); corrupt data
which the lead to above crash.

Signed-off-by: Marek Belisko <marek.belisko@open-nandra.com>
Acked-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
rzr pushed a commit to rzr/linux that referenced this pull request Apr 11, 2013
Fix crash after issuing:
	echo hmc5843 0x1e > /sys/class/i2c-dev/i2c-2/device/new_device

	[   37.180999] device: '2-001e': device_add
	[   37.188293] bus: 'i2c': add device 2-001e
	[   37.194549] PM: Adding info for i2c:2-001e
	[   37.200958] bus: 'i2c': driver_probe_device: matched device 2-001e with driver hmc5843
	[   37.210815] bus: 'i2c': really_probe: probing driver hmc5843 with device 2-001e
	[   37.224884] HMC5843 initialized
	[   37.228759] ------------[ cut here ]------------
	[   37.233612] kernel BUG at mm/slab.c:505!
	[   37.237701] Internal error: Oops - BUG: 0 [#1] PREEMPT
	[   37.243103] Modules linked in:
	[   37.246337] CPU: 0    Not tainted  (3.3.1-gta04+ torvalds#28)
	[   37.251647] PC is at kfree+0x84/0x144
	[   37.255493] LR is at kfree+0x20/0x144
	[   37.259338] pc : [<c00b408c>]    lr : [<c00b4028>]    psr: 40000093
	[   37.259368] sp : de249cd8  ip : 0000000c  fp : 00000090
	[   37.271362] r10: 0000000a  r9 : de229eac  r8 : c0236274
	[   37.276855] r7 : c09d6490  r6 : a0000013  r5 : de229c00  r4 : de229c10
	[   37.283691] r3 : c0f00218  r2 : 00000400  r1 : c0eea000  r0 : c00b4028
	[   37.290527] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
	[   37.298095] Control: 10c5387d  Table: 9e1d0019  DAC: 00000015
	[   37.304107] Process sh (pid: 91, stack limit = 0xde2482f0)
	[   37.309844] Stack: (0xde249cd8 to 0xde24a000)
	[   37.314422] 9cc0:                                                       de229c10 de229c00
	[   37.322998] 9ce0: de229c10 ffffffea 00000005 c0236274 de140a80 c00b4798 dec00080 de140a80
	[   37.331573] 9d00: c032f37c dec00080 000080d0 00000001 de229c00 de229c10 c048d578 00000005
	[   37.340148] 9d20: de229eac 0000000a 00000090 c032fa40 00000001 00000000 00000001 de229c10
	[   37.348724] 9d40: de229eac 00000029 c075b558 00000001 00000003 00000004 de229c10 c048d594
	[   37.357299] 9d60: 00000000 60000013 00000018 205b0007 37332020 3432322e 5d343838 c0060020
	[   37.365905] 9d80: de251600 00000001 00000000 de251600 00000001 c0065a84 de229c00 de229c48
	[   37.374481] 9da0: 00000006 0048d62c de229c38 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.383056] 9dc0: 00000000 c048d62c 00000000 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.391632] 9de0: 00000000 c048d62c 00000000 c0330164 00000000 de1f6c20 c048d62c de1f6c00
	[   37.400207] 9e00: c0330078 de1f6c04 c078d714 de189b58 00000000 c02ccfd8 de1f6c20 c0795f40
	[   37.408782] 9e20: c0238330 00000000 00000000 c02381a8 de1b9fc0 de1f6c20 de1f6c20 de249e48
	[   37.417358] 9e40: c0238330 c0236bb0 decdbed8 de7d0f14 de1f6c20 de1f6c20 de1f6c54 de1f6c20
	[   37.425933] 9e60: 00000000 c0238030 de1f6c20 c078d7bc de1f6c20 c02377ec de1f6c20 de1f6c28
	[   37.434509] 9e80: dee64cb0 c0236138 c047c554 de189b58 00000000 c004b45c de1f6c20 de1f6cd8
	[   37.443084] 9ea0: c0edfa6c de1f6c00 dee64c68 de1f6c04 de1f6c20 dee64cb8 c047c554 de189b58
	[   37.451690] 9ec0: 00000000 c02cd634 dee64c68 de249ef4 de23b008 dee64cb0 0000000d de23b000
	[   37.460266] 9ee0: de23b007 c02cd78c 00000002 00000000 00000000 35636d68 00333438 00000000
	[   37.468841] 9f00: 00000000 00000000 001e0000 00000000 00000000 00000000 00000000 0a10cec0
	[   37.477416] 9f20: 00000002 de249f80 0000000d dee62990 de189b40 c0234d88 0000000d c010c354
	[   37.485992] 9f40: 0000000d de210f28 000acc88 de249f80 0000000d de248000 00000000 c00b7bf8
	[   37.494567] 9f60: de210f28 000acc88 de210f28 000acc88 00000000 00000000 0000000d c00b7ed8
	[   37.503143] 9f80: 00000000 00000000 0000000d 00000000 0007fa28 0000000d 000acc88 00000004
	[   37.511718] 9fa0: c000e544 c000e380 0007fa28 0000000d 00000001 000acc88 0000000d 00000000
	[   37.520294] 9fc0: 0007fa28 0000000d 000acc88 00000004 00000001 00000020 00000002 00000000
	[   37.528869] 9fe0: 00000000 beab8624 0000ea05 b6eaebac 600d0010 00000001 00000000 00000000
	[   37.537475] [<c00b408c>] (kfree+0x84/0x144) from [<c0236274>] (device_add+0x530/0x57c)
	[   37.545806] [<c0236274>] (device_add+0x530/0x57c) from [<c032fa40>] (iio_device_register+0x8c8/0x990)
	[   37.555480] [<c032fa40>] (iio_device_register+0x8c8/0x990) from [<c0330164>] (hmc5843_probe+0xec/0x114)
	[   37.565338] [<c0330164>] (hmc5843_probe+0xec/0x114) from [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8)
	[   37.574737] [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8) from [<c02381a8>] (driver_probe_device+0x118/0x218)
	[   37.584777] [<c02381a8>] (driver_probe_device+0x118/0x218) from [<c0236bb0>] (bus_for_each_drv+0x4c/0x84)
	[   37.594818] [<c0236bb0>] (bus_for_each_drv+0x4c/0x84) from [<c0238030>] (device_attach+0x78/0xa4)
	[   37.604125] [<c0238030>] (device_attach+0x78/0xa4) from [<c02377ec>] (bus_probe_device+0x28/0x9c)
	[   37.613433] [<c02377ec>] (bus_probe_device+0x28/0x9c) from [<c0236138>] (device_add+0x3f4/0x57c)
	[   37.622650] [<c0236138>] (device_add+0x3f4/0x57c) from [<c02cd634>] (i2c_new_device+0xf8/0x19c)
	[   37.631805] [<c02cd634>] (i2c_new_device+0xf8/0x19c) from [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130)
	[   37.641754] [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130) from [<c0234d88>] (dev_attr_store+0x18/0x24)
	[   37.651611] [<c0234d88>] (dev_attr_store+0x18/0x24) from [<c010c354>] (sysfs_write_file+0x10c/0x140)
	[   37.661193] [<c010c354>] (sysfs_write_file+0x10c/0x140) from [<c00b7bf8>] (vfs_write+0xb0/0x178)
	[   37.670410] [<c00b7bf8>] (vfs_write+0xb0/0x178) from [<c00b7ed8>] (sys_write+0x3c/0x68)
	[   37.678833] [<c00b7ed8>] (sys_write+0x3c/0x68) from [<c000e380>] (ret_fast_syscall+0x0/0x3c)
	[   37.687683] Code: 1593301c e5932000 e3120080 1a000000 (e7f001f2)
	[   37.700775] ---[ end trace aaf805debdb69390 ]---

Client data was assigned to iio_dev structure in probe but in
hmc5843_init_client function casted to private driver data structure which
is wrong. Possibly calling mutex_init(&data->lock); corrupt data
which the lead to above crash.

Signed-off-by: Marek Belisko <marek.belisko@open-nandra.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
rzr pushed a commit to rzr/linux that referenced this pull request Apr 11, 2013
The driver calls cxgb_vlan_mode() from init_one().  This calls into
synchronize_rx(), which locks all the q locks, but the q locks are not
initialized until cxgb_up() -> setup_sge_qsets().  So move the call to
cxgb_vlan_mode() into cxgb_up(), after the call to setup_sge_qsets().
We also move the body of these functions up higher to avoid having to
a forward declaration.

This was found because of the lockdep warning:

    INFO: trying to register non-static key.
    the code is fine but needs lockdep annotation.
    turning off the locking correctness validator.
    Pid: 323, comm: work_for_cpu Not tainted 3.4.0-rc5 torvalds#28
    Call Trace:
     [<ffffffff8106e767>] register_lock_class+0x108/0x2d0
     [<ffffffff8106ff42>] __lock_acquire+0xd3/0xd06
     [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
     [<ffffffff813862a6>] _raw_spin_lock_irq+0x36/0x45
     [<ffffffffa01e71aa>] cxgb_vlan_mode+0x96/0xcb [cxgb3]
     [<ffffffffa01f90eb>] init_one+0x8c4/0x980 [cxgb3]
     [<ffffffff811fcbf0>] local_pci_probe+0x3f/0x70
     [<ffffffff81042206>] do_work_for_cpu+0x10/0x22
     [<ffffffff810482de>] kthread+0xa1/0xa9
     [<ffffffff8138e234>] kernel_thread_helper+0x4/0x10

Contrary to what lockdep says, the code is not fine: we are locking an
uninitialized spinlock.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mikey pushed a commit to mikey/linux that referenced this pull request Jul 31, 2013
commit 62d2feb upstream.

Fix crash after issuing:
	echo hmc5843 0x1e > /sys/class/i2c-dev/i2c-2/device/new_device

	[   37.180999] device: '2-001e': device_add
	[   37.188293] bus: 'i2c': add device 2-001e
	[   37.194549] PM: Adding info for i2c:2-001e
	[   37.200958] bus: 'i2c': driver_probe_device: matched device 2-001e with driver hmc5843
	[   37.210815] bus: 'i2c': really_probe: probing driver hmc5843 with device 2-001e
	[   37.224884] HMC5843 initialized
	[   37.228759] ------------[ cut here ]------------
	[   37.233612] kernel BUG at mm/slab.c:505!
	[   37.237701] Internal error: Oops - BUG: 0 [#1] PREEMPT
	[   37.243103] Modules linked in:
	[   37.246337] CPU: 0    Not tainted  (3.3.1-gta04+ torvalds#28)
	[   37.251647] PC is at kfree+0x84/0x144
	[   37.255493] LR is at kfree+0x20/0x144
	[   37.259338] pc : [<c00b408c>]    lr : [<c00b4028>]    psr: 40000093
	[   37.259368] sp : de249cd8  ip : 0000000c  fp : 00000090
	[   37.271362] r10: 0000000a  r9 : de229eac  r8 : c0236274
	[   37.276855] r7 : c09d6490  r6 : a0000013  r5 : de229c00  r4 : de229c10
	[   37.283691] r3 : c0f00218  r2 : 00000400  r1 : c0eea000  r0 : c00b4028
	[   37.290527] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
	[   37.298095] Control: 10c5387d  Table: 9e1d0019  DAC: 00000015
	[   37.304107] Process sh (pid: 91, stack limit = 0xde2482f0)
	[   37.309844] Stack: (0xde249cd8 to 0xde24a000)
	[   37.314422] 9cc0:                                                       de229c10 de229c00
	[   37.322998] 9ce0: de229c10 ffffffea 00000005 c0236274 de140a80 c00b4798 dec00080 de140a80
	[   37.331573] 9d00: c032f37c dec00080 000080d0 00000001 de229c00 de229c10 c048d578 00000005
	[   37.340148] 9d20: de229eac 0000000a 00000090 c032fa40 00000001 00000000 00000001 de229c10
	[   37.348724] 9d40: de229eac 00000029 c075b558 00000001 00000003 00000004 de229c10 c048d594
	[   37.357299] 9d60: 00000000 60000013 00000018 205b0007 37332020 3432322e 5d343838 c0060020
	[   37.365905] 9d80: de251600 00000001 00000000 de251600 00000001 c0065a84 de229c00 de229c48
	[   37.374481] 9da0: 00000006 0048d62c de229c38 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.383056] 9dc0: 00000000 c048d62c 00000000 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.391632] 9de0: 00000000 c048d62c 00000000 c0330164 00000000 de1f6c20 c048d62c de1f6c00
	[   37.400207] 9e00: c0330078 de1f6c04 c078d714 de189b58 00000000 c02ccfd8 de1f6c20 c0795f40
	[   37.408782] 9e20: c0238330 00000000 00000000 c02381a8 de1b9fc0 de1f6c20 de1f6c20 de249e48
	[   37.417358] 9e40: c0238330 c0236bb0 decdbed8 de7d0f14 de1f6c20 de1f6c20 de1f6c54 de1f6c20
	[   37.425933] 9e60: 00000000 c0238030 de1f6c20 c078d7bc de1f6c20 c02377ec de1f6c20 de1f6c28
	[   37.434509] 9e80: dee64cb0 c0236138 c047c554 de189b58 00000000 c004b45c de1f6c20 de1f6cd8
	[   37.443084] 9ea0: c0edfa6c de1f6c00 dee64c68 de1f6c04 de1f6c20 dee64cb8 c047c554 de189b58
	[   37.451690] 9ec0: 00000000 c02cd634 dee64c68 de249ef4 de23b008 dee64cb0 0000000d de23b000
	[   37.460266] 9ee0: de23b007 c02cd78c 00000002 00000000 00000000 35636d68 00333438 00000000
	[   37.468841] 9f00: 00000000 00000000 001e0000 00000000 00000000 00000000 00000000 0a10cec0
	[   37.477416] 9f20: 00000002 de249f80 0000000d dee62990 de189b40 c0234d88 0000000d c010c354
	[   37.485992] 9f40: 0000000d de210f28 000acc88 de249f80 0000000d de248000 00000000 c00b7bf8
	[   37.494567] 9f60: de210f28 000acc88 de210f28 000acc88 00000000 00000000 0000000d c00b7ed8
	[   37.503143] 9f80: 00000000 00000000 0000000d 00000000 0007fa28 0000000d 000acc88 00000004
	[   37.511718] 9fa0: c000e544 c000e380 0007fa28 0000000d 00000001 000acc88 0000000d 00000000
	[   37.520294] 9fc0: 0007fa28 0000000d 000acc88 00000004 00000001 00000020 00000002 00000000
	[   37.528869] 9fe0: 00000000 beab8624 0000ea05 b6eaebac 600d0010 00000001 00000000 00000000
	[   37.537475] [<c00b408c>] (kfree+0x84/0x144) from [<c0236274>] (device_add+0x530/0x57c)
	[   37.545806] [<c0236274>] (device_add+0x530/0x57c) from [<c032fa40>] (iio_device_register+0x8c8/0x990)
	[   37.555480] [<c032fa40>] (iio_device_register+0x8c8/0x990) from [<c0330164>] (hmc5843_probe+0xec/0x114)
	[   37.565338] [<c0330164>] (hmc5843_probe+0xec/0x114) from [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8)
	[   37.574737] [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8) from [<c02381a8>] (driver_probe_device+0x118/0x218)
	[   37.584777] [<c02381a8>] (driver_probe_device+0x118/0x218) from [<c0236bb0>] (bus_for_each_drv+0x4c/0x84)
	[   37.594818] [<c0236bb0>] (bus_for_each_drv+0x4c/0x84) from [<c0238030>] (device_attach+0x78/0xa4)
	[   37.604125] [<c0238030>] (device_attach+0x78/0xa4) from [<c02377ec>] (bus_probe_device+0x28/0x9c)
	[   37.613433] [<c02377ec>] (bus_probe_device+0x28/0x9c) from [<c0236138>] (device_add+0x3f4/0x57c)
	[   37.622650] [<c0236138>] (device_add+0x3f4/0x57c) from [<c02cd634>] (i2c_new_device+0xf8/0x19c)
	[   37.631805] [<c02cd634>] (i2c_new_device+0xf8/0x19c) from [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130)
	[   37.641754] [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130) from [<c0234d88>] (dev_attr_store+0x18/0x24)
	[   37.651611] [<c0234d88>] (dev_attr_store+0x18/0x24) from [<c010c354>] (sysfs_write_file+0x10c/0x140)
	[   37.661193] [<c010c354>] (sysfs_write_file+0x10c/0x140) from [<c00b7bf8>] (vfs_write+0xb0/0x178)
	[   37.670410] [<c00b7bf8>] (vfs_write+0xb0/0x178) from [<c00b7ed8>] (sys_write+0x3c/0x68)
	[   37.678833] [<c00b7ed8>] (sys_write+0x3c/0x68) from [<c000e380>] (ret_fast_syscall+0x0/0x3c)
	[   37.687683] Code: 1593301c e5932000 e3120080 1a000000 (e7f001f2)
	[   37.700775] ---[ end trace aaf805debdb69390 ]---

Client data was assigned to iio_dev structure in probe but in
hmc5843_init_client function casted to private driver data structure which
is wrong. Possibly calling mutex_init(&data->lock); corrupt data
which the lead to above crash.

Signed-off-by: Marek Belisko <marek.belisko@open-nandra.com>
Acked-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swarren pushed a commit to swarren/linux-tegra that referenced this pull request Sep 9, 2013
Oleksii reported that he had seen an oops similar to this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ipt_MASQUERADE xt_REDIRECT xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables carl9170 ath usb_storage f2fs nfnetlink_log nfnetlink md4 cifs dns_resolver hid_generic usbhid hid af_packet uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev rfcomm btusb bnep bluetooth qmi_wwan qcserial cdc_wdm usb_wwan usbnet usbserial mii snd_hda_codec_hdmi snd_hda_codec_realtek iwldvm mac80211 coretemp intel_powerclamp kvm_intel kvm iwlwifi snd_hda_intel cfg80211 snd_hda_codec xhci_hcd e1000e ehci_pci snd_hwdep sdhci_pci snd_pcm ehci_hcd microcode psmouse sdhci thinkpad_acpi mmc_core i2c_i801 pcspkr usbcore hwmon snd_timer snd_page_alloc snd ptp rfkill pps_core soundcore evdev usb_common vboxnetflt(O) vboxdrv(O)Oops#2 Part8
 loop tun binfmt_misc fuse msr acpi_call(O) ipv6 autofs4
CPU: 0 PID: 21612 Comm: kworker/0:1 Tainted: G        W  O 3.10.1SIGN torvalds#28
Hardware name: LENOVO 2306CTO/2306CTO, BIOS G2ET92WW (2.52 ) 02/22/2013
Workqueue: cifsiod cifs_echo_request [cifs]
task: ffff8801e1f416f0 ti: ffff880148744000 task.ti: ffff880148744000
RIP: 0010:[<ffffffff814dcc13>]  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
RSP: 0000:ffff880148745b00  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880148745b78 RCX: 0000000000000048
RDX: ffff880148745c90 RSI: ffff880181864a00 RDI: ffff880148745b78
RBP: ffff880148745c48 R08: 0000000000000048 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880181864a00
R13: ffff880148745c90 R14: 0000000000000048 R15: 0000000000000048
FS:  0000000000000000(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000088 CR3: 000000020c42c000 CR4: 00000000001407b0
Oops#2 Part7
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff880148745b30 ffffffff810c4af9 0000004848745b30 ffff880181864a00
 ffffffff81ffbc40 0000000000000000 ffff880148745c90 ffffffff810a5aab
 ffff880148745bc0 ffffffff81ffbc40 ffff880148745b60 ffffffff815a9fb8
Call Trace:
 [<ffffffff810c4af9>] ? finish_task_switch+0x49/0xe0
 [<ffffffff810a5aab>] ? lock_timer_base.isra.36+0x2b/0x50
 [<ffffffff815a9fb8>] ? _raw_spin_unlock_irqrestore+0x18/0x40
 [<ffffffff810a673f>] ? try_to_del_timer_sync+0x4f/0x70
 [<ffffffff815aa38f>] ? _raw_spin_unlock_bh+0x1f/0x30
 [<ffffffff814dcc87>] kernel_sendmsg+0x37/0x50
 [<ffffffffa081a0e0>] smb_send_kvec+0xd0/0x1d0 [cifs]
 [<ffffffffa081a263>] smb_send_rqst+0x83/0x1f0 [cifs]
 [<ffffffffa081ab6c>] cifs_call_async+0xec/0x1b0 [cifs]
 [<ffffffffa08245e0>] ? free_rsp_buf+0x40/0x40 [cifs]
Oops#2 Part6
 [<ffffffffa082606e>] SMB2_echo+0x8e/0xb0 [cifs]
 [<ffffffffa0808789>] cifs_echo_request+0x79/0xa0 [cifs]
 [<ffffffff810b45b3>] process_one_work+0x173/0x4a0
 [<ffffffff810b52a1>] worker_thread+0x121/0x3a0
 [<ffffffff810b5180>] ? manage_workers.isra.27+0x2b0/0x2b0
 [<ffffffff810bae00>] kthread+0xc0/0xd0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff815b199c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
Code: 84 24 b8 00 00 00 4c 89 f1 4c 89 ea 4c 89 e6 48 89 df 4c 89 60 18 48 c7 40 28 00 00 00 00 4c 89 68 30 44 89 70 14 49 8b 44 24 28 <ff> 90 88 00 00 00 3d ef fd ff ff 74 10 48 8d 65 e0 5b 41 5c 41
 RIP  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
 RSP <ffff880148745b00>
CR2: 0000000000000088

The client was in the middle of trying to send a frame when the
server->ssocket pointer got zeroed out. In most places, that we access
that pointer, the srv_mutex is held. There's only one spot that I see
that the server->ssocket pointer gets set and the srv_mutex isn't held.
This patch corrects that.

The upstream bug report was here:

    https://bugzilla.kernel.org/show_bug.cgi?id=60557

Cc: <stable@vger.kernel.org>
Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
damentz referenced this pull request in zen-kernel/zen-kernel Sep 27, 2013
commit 73e216a upstream.

Oleksii reported that he had seen an oops similar to this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ipt_MASQUERADE xt_REDIRECT xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables carl9170 ath usb_storage f2fs nfnetlink_log nfnetlink md4 cifs dns_resolver hid_generic usbhid hid af_packet uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev rfcomm btusb bnep bluetooth qmi_wwan qcserial cdc_wdm usb_wwan usbnet usbserial mii snd_hda_codec_hdmi snd_hda_codec_realtek iwldvm mac80211 coretemp intel_powerclamp kvm_intel kvm iwlwifi snd_hda_intel cfg80211 snd_hda_codec xhci_hcd e1000e ehci_pci snd_hwdep sdhci_pci snd_pcm ehci_hcd microcode psmouse sdhci thinkpad_acpi mmc_core i2c_i801 pcspkr usbcore hwmon snd_timer snd_page_alloc snd ptp rfkill pps_core soundcore evdev usb_common vboxnetflt(O) vboxdrv(O)Oops#2 Part8
 loop tun binfmt_misc fuse msr acpi_call(O) ipv6 autofs4
CPU: 0 PID: 21612 Comm: kworker/0:1 Tainted: G        W  O 3.10.1SIGN #28
Hardware name: LENOVO 2306CTO/2306CTO, BIOS G2ET92WW (2.52 ) 02/22/2013
Workqueue: cifsiod cifs_echo_request [cifs]
task: ffff8801e1f416f0 ti: ffff880148744000 task.ti: ffff880148744000
RIP: 0010:[<ffffffff814dcc13>]  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
RSP: 0000:ffff880148745b00  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880148745b78 RCX: 0000000000000048
RDX: ffff880148745c90 RSI: ffff880181864a00 RDI: ffff880148745b78
RBP: ffff880148745c48 R08: 0000000000000048 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880181864a00
R13: ffff880148745c90 R14: 0000000000000048 R15: 0000000000000048
FS:  0000000000000000(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000088 CR3: 000000020c42c000 CR4: 00000000001407b0
Oops#2 Part7
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff880148745b30 ffffffff810c4af9 0000004848745b30 ffff880181864a00
 ffffffff81ffbc40 0000000000000000 ffff880148745c90 ffffffff810a5aab
 ffff880148745bc0 ffffffff81ffbc40 ffff880148745b60 ffffffff815a9fb8
Call Trace:
 [<ffffffff810c4af9>] ? finish_task_switch+0x49/0xe0
 [<ffffffff810a5aab>] ? lock_timer_base.isra.36+0x2b/0x50
 [<ffffffff815a9fb8>] ? _raw_spin_unlock_irqrestore+0x18/0x40
 [<ffffffff810a673f>] ? try_to_del_timer_sync+0x4f/0x70
 [<ffffffff815aa38f>] ? _raw_spin_unlock_bh+0x1f/0x30
 [<ffffffff814dcc87>] kernel_sendmsg+0x37/0x50
 [<ffffffffa081a0e0>] smb_send_kvec+0xd0/0x1d0 [cifs]
 [<ffffffffa081a263>] smb_send_rqst+0x83/0x1f0 [cifs]
 [<ffffffffa081ab6c>] cifs_call_async+0xec/0x1b0 [cifs]
 [<ffffffffa08245e0>] ? free_rsp_buf+0x40/0x40 [cifs]
Oops#2 Part6
 [<ffffffffa082606e>] SMB2_echo+0x8e/0xb0 [cifs]
 [<ffffffffa0808789>] cifs_echo_request+0x79/0xa0 [cifs]
 [<ffffffff810b45b3>] process_one_work+0x173/0x4a0
 [<ffffffff810b52a1>] worker_thread+0x121/0x3a0
 [<ffffffff810b5180>] ? manage_workers.isra.27+0x2b0/0x2b0
 [<ffffffff810bae00>] kthread+0xc0/0xd0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff815b199c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
Code: 84 24 b8 00 00 00 4c 89 f1 4c 89 ea 4c 89 e6 48 89 df 4c 89 60 18 48 c7 40 28 00 00 00 00 4c 89 68 30 44 89 70 14 49 8b 44 24 28 <ff> 90 88 00 00 00 3d ef fd ff ff 74 10 48 8d 65 e0 5b 41 5c 41
 RIP  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
 RSP <ffff880148745b00>
CR2: 0000000000000088

The client was in the middle of trying to send a frame when the
server->ssocket pointer got zeroed out. In most places, that we access
that pointer, the srv_mutex is held. There's only one spot that I see
that the server->ssocket pointer gets set and the srv_mutex isn't held.
This patch corrects that.

The upstream bug report was here:

    https://bugzilla.kernel.org/show_bug.cgi?id=60557

Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mdrjr referenced this pull request in hardkernel/linux Sep 27, 2013
commit 73e216a upstream.

Oleksii reported that he had seen an oops similar to this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ipt_MASQUERADE xt_REDIRECT xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables carl9170 ath usb_storage f2fs nfnetlink_log nfnetlink md4 cifs dns_resolver hid_generic usbhid hid af_packet uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev rfcomm btusb bnep bluetooth qmi_wwan qcserial cdc_wdm usb_wwan usbnet usbserial mii snd_hda_codec_hdmi snd_hda_codec_realtek iwldvm mac80211 coretemp intel_powerclamp kvm_intel kvm iwlwifi snd_hda_intel cfg80211 snd_hda_codec xhci_hcd e1000e ehci_pci snd_hwdep sdhci_pci snd_pcm ehci_hcd microcode psmouse sdhci thinkpad_acpi mmc_core i2c_i801 pcspkr usbcore hwmon snd_timer snd_page_alloc snd ptp rfkill pps_core soundcore evdev usb_common vboxnetflt(O) vboxdrv(O)Oops#2 Part8
 loop tun binfmt_misc fuse msr acpi_call(O) ipv6 autofs4
CPU: 0 PID: 21612 Comm: kworker/0:1 Tainted: G        W  O 3.10.1SIGN #28
Hardware name: LENOVO 2306CTO/2306CTO, BIOS G2ET92WW (2.52 ) 02/22/2013
Workqueue: cifsiod cifs_echo_request [cifs]
task: ffff8801e1f416f0 ti: ffff880148744000 task.ti: ffff880148744000
RIP: 0010:[<ffffffff814dcc13>]  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
RSP: 0000:ffff880148745b00  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880148745b78 RCX: 0000000000000048
RDX: ffff880148745c90 RSI: ffff880181864a00 RDI: ffff880148745b78
RBP: ffff880148745c48 R08: 0000000000000048 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880181864a00
R13: ffff880148745c90 R14: 0000000000000048 R15: 0000000000000048
FS:  0000000000000000(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000088 CR3: 000000020c42c000 CR4: 00000000001407b0
Oops#2 Part7
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff880148745b30 ffffffff810c4af9 0000004848745b30 ffff880181864a00
 ffffffff81ffbc40 0000000000000000 ffff880148745c90 ffffffff810a5aab
 ffff880148745bc0 ffffffff81ffbc40 ffff880148745b60 ffffffff815a9fb8
Call Trace:
 [<ffffffff810c4af9>] ? finish_task_switch+0x49/0xe0
 [<ffffffff810a5aab>] ? lock_timer_base.isra.36+0x2b/0x50
 [<ffffffff815a9fb8>] ? _raw_spin_unlock_irqrestore+0x18/0x40
 [<ffffffff810a673f>] ? try_to_del_timer_sync+0x4f/0x70
 [<ffffffff815aa38f>] ? _raw_spin_unlock_bh+0x1f/0x30
 [<ffffffff814dcc87>] kernel_sendmsg+0x37/0x50
 [<ffffffffa081a0e0>] smb_send_kvec+0xd0/0x1d0 [cifs]
 [<ffffffffa081a263>] smb_send_rqst+0x83/0x1f0 [cifs]
 [<ffffffffa081ab6c>] cifs_call_async+0xec/0x1b0 [cifs]
 [<ffffffffa08245e0>] ? free_rsp_buf+0x40/0x40 [cifs]
Oops#2 Part6
 [<ffffffffa082606e>] SMB2_echo+0x8e/0xb0 [cifs]
 [<ffffffffa0808789>] cifs_echo_request+0x79/0xa0 [cifs]
 [<ffffffff810b45b3>] process_one_work+0x173/0x4a0
 [<ffffffff810b52a1>] worker_thread+0x121/0x3a0
 [<ffffffff810b5180>] ? manage_workers.isra.27+0x2b0/0x2b0
 [<ffffffff810bae00>] kthread+0xc0/0xd0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff815b199c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
Code: 84 24 b8 00 00 00 4c 89 f1 4c 89 ea 4c 89 e6 48 89 df 4c 89 60 18 48 c7 40 28 00 00 00 00 4c 89 68 30 44 89 70 14 49 8b 44 24 28 <ff> 90 88 00 00 00 3d ef fd ff ff 74 10 48 8d 65 e0 5b 41 5c 41
 RIP  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
 RSP <ffff880148745b00>
CR2: 0000000000000088

The client was in the middle of trying to send a frame when the
server->ssocket pointer got zeroed out. In most places, that we access
that pointer, the srv_mutex is held. There's only one spot that I see
that the server->ssocket pointer gets set and the srv_mutex isn't held.
This patch corrects that.

The upstream bug report was here:

    https://bugzilla.kernel.org/show_bug.cgi?id=60557

Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
dormando pushed a commit to fastly/linux that referenced this pull request Oct 3, 2013
commit 73e216a upstream.

Oleksii reported that he had seen an oops similar to this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ipt_MASQUERADE xt_REDIRECT xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables carl9170 ath usb_storage f2fs nfnetlink_log nfnetlink md4 cifs dns_resolver hid_generic usbhid hid af_packet uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev rfcomm btusb bnep bluetooth qmi_wwan qcserial cdc_wdm usb_wwan usbnet usbserial mii snd_hda_codec_hdmi snd_hda_codec_realtek iwldvm mac80211 coretemp intel_powerclamp kvm_intel kvm iwlwifi snd_hda_intel cfg80211 snd_hda_codec xhci_hcd e1000e ehci_pci snd_hwdep sdhci_pci snd_pcm ehci_hcd microcode psmouse sdhci thinkpad_acpi mmc_core i2c_i801 pcspkr usbcore hwmon snd_timer snd_page_alloc snd ptp rfkill pps_core soundcore evdev usb_common vboxnetflt(O) vboxdrv(O)Oops#2 Part8
 loop tun binfmt_misc fuse msr acpi_call(O) ipv6 autofs4
CPU: 0 PID: 21612 Comm: kworker/0:1 Tainted: G        W  O 3.10.1SIGN torvalds#28
Hardware name: LENOVO 2306CTO/2306CTO, BIOS G2ET92WW (2.52 ) 02/22/2013
Workqueue: cifsiod cifs_echo_request [cifs]
task: ffff8801e1f416f0 ti: ffff880148744000 task.ti: ffff880148744000
RIP: 0010:[<ffffffff814dcc13>]  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
RSP: 0000:ffff880148745b00  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880148745b78 RCX: 0000000000000048
RDX: ffff880148745c90 RSI: ffff880181864a00 RDI: ffff880148745b78
RBP: ffff880148745c48 R08: 0000000000000048 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880181864a00
R13: ffff880148745c90 R14: 0000000000000048 R15: 0000000000000048
FS:  0000000000000000(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000088 CR3: 000000020c42c000 CR4: 00000000001407b0
Oops#2 Part7
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff880148745b30 ffffffff810c4af9 0000004848745b30 ffff880181864a00
 ffffffff81ffbc40 0000000000000000 ffff880148745c90 ffffffff810a5aab
 ffff880148745bc0 ffffffff81ffbc40 ffff880148745b60 ffffffff815a9fb8
Call Trace:
 [<ffffffff810c4af9>] ? finish_task_switch+0x49/0xe0
 [<ffffffff810a5aab>] ? lock_timer_base.isra.36+0x2b/0x50
 [<ffffffff815a9fb8>] ? _raw_spin_unlock_irqrestore+0x18/0x40
 [<ffffffff810a673f>] ? try_to_del_timer_sync+0x4f/0x70
 [<ffffffff815aa38f>] ? _raw_spin_unlock_bh+0x1f/0x30
 [<ffffffff814dcc87>] kernel_sendmsg+0x37/0x50
 [<ffffffffa081a0e0>] smb_send_kvec+0xd0/0x1d0 [cifs]
 [<ffffffffa081a263>] smb_send_rqst+0x83/0x1f0 [cifs]
 [<ffffffffa081ab6c>] cifs_call_async+0xec/0x1b0 [cifs]
 [<ffffffffa08245e0>] ? free_rsp_buf+0x40/0x40 [cifs]
Oops#2 Part6
 [<ffffffffa082606e>] SMB2_echo+0x8e/0xb0 [cifs]
 [<ffffffffa0808789>] cifs_echo_request+0x79/0xa0 [cifs]
 [<ffffffff810b45b3>] process_one_work+0x173/0x4a0
 [<ffffffff810b52a1>] worker_thread+0x121/0x3a0
 [<ffffffff810b5180>] ? manage_workers.isra.27+0x2b0/0x2b0
 [<ffffffff810bae00>] kthread+0xc0/0xd0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff815b199c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
Code: 84 24 b8 00 00 00 4c 89 f1 4c 89 ea 4c 89 e6 48 89 df 4c 89 60 18 48 c7 40 28 00 00 00 00 4c 89 68 30 44 89 70 14 49 8b 44 24 28 <ff> 90 88 00 00 00 3d ef fd ff ff 74 10 48 8d 65 e0 5b 41 5c 41
 RIP  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
 RSP <ffff880148745b00>
CR2: 0000000000000088

The client was in the middle of trying to send a frame when the
server->ssocket pointer got zeroed out. In most places, that we access
that pointer, the srv_mutex is held. There's only one spot that I see
that the server->ssocket pointer gets set and the srv_mutex isn't held.
This patch corrects that.

The upstream bug report was here:

    https://bugzilla.kernel.org/show_bug.cgi?id=60557

Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tositrino pushed a commit to tositrino/linux that referenced this pull request Oct 7, 2013
commit 73e216a upstream.

Oleksii reported that he had seen an oops similar to this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ipt_MASQUERADE xt_REDIRECT xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables carl9170 ath usb_storage f2fs nfnetlink_log nfnetlink md4 cifs dns_resolver hid_generic usbhid hid af_packet uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev rfcomm btusb bnep bluetooth qmi_wwan qcserial cdc_wdm usb_wwan usbnet usbserial mii snd_hda_codec_hdmi snd_hda_codec_realtek iwldvm mac80211 coretemp intel_powerclamp kvm_intel kvm iwlwifi snd_hda_intel cfg80211 snd_hda_codec xhci_hcd e1000e ehci_pci snd_hwdep sdhci_pci snd_pcm ehci_hcd microcode psmouse sdhci thinkpad_acpi mmc_core i2c_i801 pcspkr usbcore hwmon snd_timer snd_page_alloc snd ptp rfkill pps_core soundcore evdev usb_common vboxnetflt(O) vboxdrv(O)Oops#2 Part8
 loop tun binfmt_misc fuse msr acpi_call(O) ipv6 autofs4
CPU: 0 PID: 21612 Comm: kworker/0:1 Tainted: G        W  O 3.10.1SIGN torvalds#28
Hardware name: LENOVO 2306CTO/2306CTO, BIOS G2ET92WW (2.52 ) 02/22/2013
Workqueue: cifsiod cifs_echo_request [cifs]
task: ffff8801e1f416f0 ti: ffff880148744000 task.ti: ffff880148744000
RIP: 0010:[<ffffffff814dcc13>]  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
RSP: 0000:ffff880148745b00  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880148745b78 RCX: 0000000000000048
RDX: ffff880148745c90 RSI: ffff880181864a00 RDI: ffff880148745b78
RBP: ffff880148745c48 R08: 0000000000000048 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880181864a00
R13: ffff880148745c90 R14: 0000000000000048 R15: 0000000000000048
FS:  0000000000000000(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000088 CR3: 000000020c42c000 CR4: 00000000001407b0
Oops#2 Part7
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff880148745b30 ffffffff810c4af9 0000004848745b30 ffff880181864a00
 ffffffff81ffbc40 0000000000000000 ffff880148745c90 ffffffff810a5aab
 ffff880148745bc0 ffffffff81ffbc40 ffff880148745b60 ffffffff815a9fb8
Call Trace:
 [<ffffffff810c4af9>] ? finish_task_switch+0x49/0xe0
 [<ffffffff810a5aab>] ? lock_timer_base.isra.36+0x2b/0x50
 [<ffffffff815a9fb8>] ? _raw_spin_unlock_irqrestore+0x18/0x40
 [<ffffffff810a673f>] ? try_to_del_timer_sync+0x4f/0x70
 [<ffffffff815aa38f>] ? _raw_spin_unlock_bh+0x1f/0x30
 [<ffffffff814dcc87>] kernel_sendmsg+0x37/0x50
 [<ffffffffa081a0e0>] smb_send_kvec+0xd0/0x1d0 [cifs]
 [<ffffffffa081a263>] smb_send_rqst+0x83/0x1f0 [cifs]
 [<ffffffffa081ab6c>] cifs_call_async+0xec/0x1b0 [cifs]
 [<ffffffffa08245e0>] ? free_rsp_buf+0x40/0x40 [cifs]
Oops#2 Part6
 [<ffffffffa082606e>] SMB2_echo+0x8e/0xb0 [cifs]
 [<ffffffffa0808789>] cifs_echo_request+0x79/0xa0 [cifs]
 [<ffffffff810b45b3>] process_one_work+0x173/0x4a0
 [<ffffffff810b52a1>] worker_thread+0x121/0x3a0
 [<ffffffff810b5180>] ? manage_workers.isra.27+0x2b0/0x2b0
 [<ffffffff810bae00>] kthread+0xc0/0xd0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff815b199c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
Code: 84 24 b8 00 00 00 4c 89 f1 4c 89 ea 4c 89 e6 48 89 df 4c 89 60 18 48 c7 40 28 00 00 00 00 4c 89 68 30 44 89 70 14 49 8b 44 24 28 <ff> 90 88 00 00 00 3d ef fd ff ff 74 10 48 8d 65 e0 5b 41 5c 41
 RIP  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
 RSP <ffff880148745b00>
CR2: 0000000000000088

The client was in the middle of trying to send a frame when the
server->ssocket pointer got zeroed out. In most places, that we access
that pointer, the srv_mutex is held. There's only one spot that I see
that the server->ssocket pointer gets set and the srv_mutex isn't held.
This patch corrects that.

The upstream bug report was here:

    https://bugzilla.kernel.org/show_bug.cgi?id=60557

Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swarren pushed a commit to swarren/linux-tegra that referenced this pull request Oct 14, 2013
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, cleanup code and bind
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      #1  #2  #3  #4  #5  torvalds#6  torvalds#7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  torvalds#8  torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
swarren pushed a commit to swarren/linux-tegra that referenced this pull request Oct 14, 2013
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   torvalds#6   torvalds#7
[    0.603005] .... node   #1, CPUs:     torvalds#8   torvalds#9  torvalds#10  torvalds#11  torvalds#12  torvalds#13  torvalds#14  torvalds#15
[    1.200005] .... node   #2, CPUs:    torvalds#16  torvalds#17  torvalds#18  torvalds#19  torvalds#20  torvalds#21  torvalds#22  torvalds#23
[    1.796005] .... node   #3, CPUs:    torvalds#24  torvalds#25  torvalds#26  torvalds#27  torvalds#28  torvalds#29  torvalds#30  torvalds#31
[    2.393005] .... node   #4, CPUs:    torvalds#32  torvalds#33  torvalds#34  torvalds#35  torvalds#36  torvalds#37  torvalds#38  torvalds#39
[    2.996005] .... node   #5, CPUs:    torvalds#40  torvalds#41  torvalds#42  torvalds#43  torvalds#44  torvalds#45  torvalds#46  torvalds#47
[    3.600005] .... node   torvalds#6, CPUs:    torvalds#48  torvalds#49  torvalds#50  torvalds#51  #52  #53  torvalds#54  torvalds#55
[    4.202005] .... node   torvalds#7, CPUs:    torvalds#56  torvalds#57  #58  torvalds#59  torvalds#60  torvalds#61  torvalds#62  torvalds#63
[    4.811005] .... node   torvalds#8, CPUs:    torvalds#64  torvalds#65  torvalds#66  torvalds#67  torvalds#68  torvalds#69  #70  torvalds#71
[    5.421006] .... node   torvalds#9, CPUs:    torvalds#72  torvalds#73  torvalds#74  torvalds#75  torvalds#76  torvalds#77  torvalds#78  torvalds#79
[    6.032005] .... node  torvalds#10, CPUs:    torvalds#80  torvalds#81  torvalds#82  torvalds#83  torvalds#84  torvalds#85  torvalds#86  torvalds#87
[    6.648006] .... node  torvalds#11, CPUs:    torvalds#88  torvalds#89  torvalds#90  torvalds#91  torvalds#92  torvalds#93  torvalds#94  torvalds#95
[    7.262005] .... node  torvalds#12, CPUs:    torvalds#96  torvalds#97  torvalds#98  torvalds#99 torvalds#100 torvalds#101 torvalds#102 torvalds#103
[    7.865005] .... node  torvalds#13, CPUs:   torvalds#104 torvalds#105 torvalds#106 torvalds#107 torvalds#108 torvalds#109 torvalds#110 torvalds#111
[    8.466005] .... node  torvalds#14, CPUs:   torvalds#112 torvalds#113 torvalds#114 torvalds#115 torvalds#116 torvalds#117 torvalds#118 torvalds#119
[    9.073006] .... node  torvalds#15, CPUs:   torvalds#120 torvalds#121 torvalds#122 torvalds#123 torvalds#124 torvalds#125 torvalds#126 torvalds#127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
swarren pushed a commit to swarren/linux-tegra that referenced this pull request Oct 14, 2013
The 'driver' field of the i2c_client struct is redundant. The same data can be
accessed through to_i2c_driver(client->dev.driver). The generated code for both
approaches in more or less the same.

E.g. on ARM the expression client->driver->command(...) generates

		...
		ldr     r3, [r0, torvalds#28]
		ldr     r3, [r3, torvalds#32]
		blx     r3
		...

and the expression to_i2c_driver(client->dev.driver)->command(...) generates

		...
		ldr     r3, [r0, torvalds#160]
    	ldr     r3, [r3, #-4]
    	blx     r3
		...

Other architectures will generate similar code.

All users of the 'driver' field outside of the I2C core have already been
converted. So this only leaves the core itself. This patch converts the
remaining few users in the I2C core and then removes the 'driver' field from the
i2c_client struct.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
TechNexion pushed a commit to TechNexion/linux that referenced this pull request Oct 25, 2013
commit 62d2feb upstream.

Fix crash after issuing:
	echo hmc5843 0x1e > /sys/class/i2c-dev/i2c-2/device/new_device

	[   37.180999] device: '2-001e': device_add
	[   37.188293] bus: 'i2c': add device 2-001e
	[   37.194549] PM: Adding info for i2c:2-001e
	[   37.200958] bus: 'i2c': driver_probe_device: matched device 2-001e with driver hmc5843
	[   37.210815] bus: 'i2c': really_probe: probing driver hmc5843 with device 2-001e
	[   37.224884] HMC5843 initialized
	[   37.228759] ------------[ cut here ]------------
	[   37.233612] kernel BUG at mm/slab.c:505!
	[   37.237701] Internal error: Oops - BUG: 0 [#1] PREEMPT
	[   37.243103] Modules linked in:
	[   37.246337] CPU: 0    Not tainted  (3.3.1-gta04+ torvalds#28)
	[   37.251647] PC is at kfree+0x84/0x144
	[   37.255493] LR is at kfree+0x20/0x144
	[   37.259338] pc : [<c00b408c>]    lr : [<c00b4028>]    psr: 40000093
	[   37.259368] sp : de249cd8  ip : 0000000c  fp : 00000090
	[   37.271362] r10: 0000000a  r9 : de229eac  r8 : c0236274
	[   37.276855] r7 : c09d6490  r6 : a0000013  r5 : de229c00  r4 : de229c10
	[   37.283691] r3 : c0f00218  r2 : 00000400  r1 : c0eea000  r0 : c00b4028
	[   37.290527] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
	[   37.298095] Control: 10c5387d  Table: 9e1d0019  DAC: 00000015
	[   37.304107] Process sh (pid: 91, stack limit = 0xde2482f0)
	[   37.309844] Stack: (0xde249cd8 to 0xde24a000)
	[   37.314422] 9cc0:                                                       de229c10 de229c00
	[   37.322998] 9ce0: de229c10 ffffffea 00000005 c0236274 de140a80 c00b4798 dec00080 de140a80
	[   37.331573] 9d00: c032f37c dec00080 000080d0 00000001 de229c00 de229c10 c048d578 00000005
	[   37.340148] 9d20: de229eac 0000000a 00000090 c032fa40 00000001 00000000 00000001 de229c10
	[   37.348724] 9d40: de229eac 00000029 c075b558 00000001 00000003 00000004 de229c10 c048d594
	[   37.357299] 9d60: 00000000 60000013 00000018 205b0007 37332020 3432322e 5d343838 c0060020
	[   37.365905] 9d80: de251600 00000001 00000000 de251600 00000001 c0065a84 de229c00 de229c48
	[   37.374481] 9da0: 00000006 0048d62c de229c38 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.383056] 9dc0: 00000000 c048d62c 00000000 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.391632] 9de0: 00000000 c048d62c 00000000 c0330164 00000000 de1f6c20 c048d62c de1f6c00
	[   37.400207] 9e00: c0330078 de1f6c04 c078d714 de189b58 00000000 c02ccfd8 de1f6c20 c0795f40
	[   37.408782] 9e20: c0238330 00000000 00000000 c02381a8 de1b9fc0 de1f6c20 de1f6c20 de249e48
	[   37.417358] 9e40: c0238330 c0236bb0 decdbed8 de7d0f14 de1f6c20 de1f6c20 de1f6c54 de1f6c20
	[   37.425933] 9e60: 00000000 c0238030 de1f6c20 c078d7bc de1f6c20 c02377ec de1f6c20 de1f6c28
	[   37.434509] 9e80: dee64cb0 c0236138 c047c554 de189b58 00000000 c004b45c de1f6c20 de1f6cd8
	[   37.443084] 9ea0: c0edfa6c de1f6c00 dee64c68 de1f6c04 de1f6c20 dee64cb8 c047c554 de189b58
	[   37.451690] 9ec0: 00000000 c02cd634 dee64c68 de249ef4 de23b008 dee64cb0 0000000d de23b000
	[   37.460266] 9ee0: de23b007 c02cd78c 00000002 00000000 00000000 35636d68 00333438 00000000
	[   37.468841] 9f00: 00000000 00000000 001e0000 00000000 00000000 00000000 00000000 0a10cec0
	[   37.477416] 9f20: 00000002 de249f80 0000000d dee62990 de189b40 c0234d88 0000000d c010c354
	[   37.485992] 9f40: 0000000d de210f28 000acc88 de249f80 0000000d de248000 00000000 c00b7bf8
	[   37.494567] 9f60: de210f28 000acc88 de210f28 000acc88 00000000 00000000 0000000d c00b7ed8
	[   37.503143] 9f80: 00000000 00000000 0000000d 00000000 0007fa28 0000000d 000acc88 00000004
	[   37.511718] 9fa0: c000e544 c000e380 0007fa28 0000000d 00000001 000acc88 0000000d 00000000
	[   37.520294] 9fc0: 0007fa28 0000000d 000acc88 00000004 00000001 00000020 00000002 00000000
	[   37.528869] 9fe0: 00000000 beab8624 0000ea05 b6eaebac 600d0010 00000001 00000000 00000000
	[   37.537475] [<c00b408c>] (kfree+0x84/0x144) from [<c0236274>] (device_add+0x530/0x57c)
	[   37.545806] [<c0236274>] (device_add+0x530/0x57c) from [<c032fa40>] (iio_device_register+0x8c8/0x990)
	[   37.555480] [<c032fa40>] (iio_device_register+0x8c8/0x990) from [<c0330164>] (hmc5843_probe+0xec/0x114)
	[   37.565338] [<c0330164>] (hmc5843_probe+0xec/0x114) from [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8)
	[   37.574737] [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8) from [<c02381a8>] (driver_probe_device+0x118/0x218)
	[   37.584777] [<c02381a8>] (driver_probe_device+0x118/0x218) from [<c0236bb0>] (bus_for_each_drv+0x4c/0x84)
	[   37.594818] [<c0236bb0>] (bus_for_each_drv+0x4c/0x84) from [<c0238030>] (device_attach+0x78/0xa4)
	[   37.604125] [<c0238030>] (device_attach+0x78/0xa4) from [<c02377ec>] (bus_probe_device+0x28/0x9c)
	[   37.613433] [<c02377ec>] (bus_probe_device+0x28/0x9c) from [<c0236138>] (device_add+0x3f4/0x57c)
	[   37.622650] [<c0236138>] (device_add+0x3f4/0x57c) from [<c02cd634>] (i2c_new_device+0xf8/0x19c)
	[   37.631805] [<c02cd634>] (i2c_new_device+0xf8/0x19c) from [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130)
	[   37.641754] [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130) from [<c0234d88>] (dev_attr_store+0x18/0x24)
	[   37.651611] [<c0234d88>] (dev_attr_store+0x18/0x24) from [<c010c354>] (sysfs_write_file+0x10c/0x140)
	[   37.661193] [<c010c354>] (sysfs_write_file+0x10c/0x140) from [<c00b7bf8>] (vfs_write+0xb0/0x178)
	[   37.670410] [<c00b7bf8>] (vfs_write+0xb0/0x178) from [<c00b7ed8>] (sys_write+0x3c/0x68)
	[   37.678833] [<c00b7ed8>] (sys_write+0x3c/0x68) from [<c000e380>] (ret_fast_syscall+0x0/0x3c)
	[   37.687683] Code: 1593301c e5932000 e3120080 1a000000 (e7f001f2)
	[   37.700775] ---[ end trace aaf805debdb69390 ]---

Client data was assigned to iio_dev structure in probe but in
hmc5843_init_client function casted to private driver data structure which
is wrong. Possibly calling mutex_init(&data->lock); corrupt data
which the lead to above crash.

Signed-off-by: Marek Belisko <marek.belisko@open-nandra.com>
Acked-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
apxii pushed a commit to apxii/linux that referenced this pull request Nov 26, 2013
commit 73e216a upstream.

Oleksii reported that he had seen an oops similar to this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ipt_MASQUERADE xt_REDIRECT xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables carl9170 ath usb_storage f2fs nfnetlink_log nfnetlink md4 cifs dns_resolver hid_generic usbhid hid af_packet uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev rfcomm btusb bnep bluetooth qmi_wwan qcserial cdc_wdm usb_wwan usbnet usbserial mii snd_hda_codec_hdmi snd_hda_codec_realtek iwldvm mac80211 coretemp intel_powerclamp kvm_intel kvm iwlwifi snd_hda_intel cfg80211 snd_hda_codec xhci_hcd e1000e ehci_pci snd_hwdep sdhci_pci snd_pcm ehci_hcd microcode psmouse sdhci thinkpad_acpi mmc_core i2c_i801 pcspkr usbcore hwmon snd_timer snd_page_alloc snd ptp rfkill pps_core soundcore evdev usb_common vboxnetflt(O) vboxdrv(O)Oops#2 Part8
 loop tun binfmt_misc fuse msr acpi_call(O) ipv6 autofs4
CPU: 0 PID: 21612 Comm: kworker/0:1 Tainted: G        W  O 3.10.1SIGN torvalds#28
Hardware name: LENOVO 2306CTO/2306CTO, BIOS G2ET92WW (2.52 ) 02/22/2013
Workqueue: cifsiod cifs_echo_request [cifs]
task: ffff8801e1f416f0 ti: ffff880148744000 task.ti: ffff880148744000
RIP: 0010:[<ffffffff814dcc13>]  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
RSP: 0000:ffff880148745b00  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880148745b78 RCX: 0000000000000048
RDX: ffff880148745c90 RSI: ffff880181864a00 RDI: ffff880148745b78
RBP: ffff880148745c48 R08: 0000000000000048 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880181864a00
R13: ffff880148745c90 R14: 0000000000000048 R15: 0000000000000048
FS:  0000000000000000(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000088 CR3: 000000020c42c000 CR4: 00000000001407b0
Oops#2 Part7
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff880148745b30 ffffffff810c4af9 0000004848745b30 ffff880181864a00
 ffffffff81ffbc40 0000000000000000 ffff880148745c90 ffffffff810a5aab
 ffff880148745bc0 ffffffff81ffbc40 ffff880148745b60 ffffffff815a9fb8
Call Trace:
 [<ffffffff810c4af9>] ? finish_task_switch+0x49/0xe0
 [<ffffffff810a5aab>] ? lock_timer_base.isra.36+0x2b/0x50
 [<ffffffff815a9fb8>] ? _raw_spin_unlock_irqrestore+0x18/0x40
 [<ffffffff810a673f>] ? try_to_del_timer_sync+0x4f/0x70
 [<ffffffff815aa38f>] ? _raw_spin_unlock_bh+0x1f/0x30
 [<ffffffff814dcc87>] kernel_sendmsg+0x37/0x50
 [<ffffffffa081a0e0>] smb_send_kvec+0xd0/0x1d0 [cifs]
 [<ffffffffa081a263>] smb_send_rqst+0x83/0x1f0 [cifs]
 [<ffffffffa081ab6c>] cifs_call_async+0xec/0x1b0 [cifs]
 [<ffffffffa08245e0>] ? free_rsp_buf+0x40/0x40 [cifs]
Oops#2 Part6
 [<ffffffffa082606e>] SMB2_echo+0x8e/0xb0 [cifs]
 [<ffffffffa0808789>] cifs_echo_request+0x79/0xa0 [cifs]
 [<ffffffff810b45b3>] process_one_work+0x173/0x4a0
 [<ffffffff810b52a1>] worker_thread+0x121/0x3a0
 [<ffffffff810b5180>] ? manage_workers.isra.27+0x2b0/0x2b0
 [<ffffffff810bae00>] kthread+0xc0/0xd0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff815b199c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
Code: 84 24 b8 00 00 00 4c 89 f1 4c 89 ea 4c 89 e6 48 89 df 4c 89 60 18 48 c7 40 28 00 00 00 00 4c 89 68 30 44 89 70 14 49 8b 44 24 28 <ff> 90 88 00 00 00 3d ef fd ff ff 74 10 48 8d 65 e0 5b 41 5c 41
 RIP  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
 RSP <ffff880148745b00>
CR2: 0000000000000088

The client was in the middle of trying to send a frame when the
server->ssocket pointer got zeroed out. In most places, that we access
that pointer, the srv_mutex is held. There's only one spot that I see
that the server->ssocket pointer gets set and the srv_mutex isn't held.
This patch corrects that.

The upstream bug report was here:

    https://bugzilla.kernel.org/show_bug.cgi?id=60557

Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
raedwulf pushed a commit to raedwulf/linux that referenced this pull request Jan 4, 2014
Fix crash after issuing:
	echo hmc5843 0x1e > /sys/class/i2c-dev/i2c-2/device/new_device

	[   37.180999] device: '2-001e': device_add
	[   37.188293] bus: 'i2c': add device 2-001e
	[   37.194549] PM: Adding info for i2c:2-001e
	[   37.200958] bus: 'i2c': driver_probe_device: matched device 2-001e with driver hmc5843
	[   37.210815] bus: 'i2c': really_probe: probing driver hmc5843 with device 2-001e
	[   37.224884] HMC5843 initialized
	[   37.228759] ------------[ cut here ]------------
	[   37.233612] kernel BUG at mm/slab.c:505!
	[   37.237701] Internal error: Oops - BUG: 0 [#1] PREEMPT
	[   37.243103] Modules linked in:
	[   37.246337] CPU: 0    Not tainted  (3.3.1-gta04+ torvalds#28)
	[   37.251647] PC is at kfree+0x84/0x144
	[   37.255493] LR is at kfree+0x20/0x144
	[   37.259338] pc : [<c00b408c>]    lr : [<c00b4028>]    psr: 40000093
	[   37.259368] sp : de249cd8  ip : 0000000c  fp : 00000090
	[   37.271362] r10: 0000000a  r9 : de229eac  r8 : c0236274
	[   37.276855] r7 : c09d6490  r6 : a0000013  r5 : de229c00  r4 : de229c10
	[   37.283691] r3 : c0f00218  r2 : 00000400  r1 : c0eea000  r0 : c00b4028
	[   37.290527] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
	[   37.298095] Control: 10c5387d  Table: 9e1d0019  DAC: 00000015
	[   37.304107] Process sh (pid: 91, stack limit = 0xde2482f0)
	[   37.309844] Stack: (0xde249cd8 to 0xde24a000)
	[   37.314422] 9cc0:                                                       de229c10 de229c00
	[   37.322998] 9ce0: de229c10 ffffffea 00000005 c0236274 de140a80 c00b4798 dec00080 de140a80
	[   37.331573] 9d00: c032f37c dec00080 000080d0 00000001 de229c00 de229c10 c048d578 00000005
	[   37.340148] 9d20: de229eac 0000000a 00000090 c032fa40 00000001 00000000 00000001 de229c10
	[   37.348724] 9d40: de229eac 00000029 c075b558 00000001 00000003 00000004 de229c10 c048d594
	[   37.357299] 9d60: 00000000 60000013 00000018 205b0007 37332020 3432322e 5d343838 c0060020
	[   37.365905] 9d80: de251600 00000001 00000000 de251600 00000001 c0065a84 de229c00 de229c48
	[   37.374481] 9da0: 00000006 0048d62c de229c38 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.383056] 9dc0: 00000000 c048d62c 00000000 de229c00 de229c00 de1f6c00 de1f6c20 00000001
	[   37.391632] 9de0: 00000000 c048d62c 00000000 c0330164 00000000 de1f6c20 c048d62c de1f6c00
	[   37.400207] 9e00: c0330078 de1f6c04 c078d714 de189b58 00000000 c02ccfd8 de1f6c20 c0795f40
	[   37.408782] 9e20: c0238330 00000000 00000000 c02381a8 de1b9fc0 de1f6c20 de1f6c20 de249e48
	[   37.417358] 9e40: c0238330 c0236bb0 decdbed8 de7d0f14 de1f6c20 de1f6c20 de1f6c54 de1f6c20
	[   37.425933] 9e60: 00000000 c0238030 de1f6c20 c078d7bc de1f6c20 c02377ec de1f6c20 de1f6c28
	[   37.434509] 9e80: dee64cb0 c0236138 c047c554 de189b58 00000000 c004b45c de1f6c20 de1f6cd8
	[   37.443084] 9ea0: c0edfa6c de1f6c00 dee64c68 de1f6c04 de1f6c20 dee64cb8 c047c554 de189b58
	[   37.451690] 9ec0: 00000000 c02cd634 dee64c68 de249ef4 de23b008 dee64cb0 0000000d de23b000
	[   37.460266] 9ee0: de23b007 c02cd78c 00000002 00000000 00000000 35636d68 00333438 00000000
	[   37.468841] 9f00: 00000000 00000000 001e0000 00000000 00000000 00000000 00000000 0a10cec0
	[   37.477416] 9f20: 00000002 de249f80 0000000d dee62990 de189b40 c0234d88 0000000d c010c354
	[   37.485992] 9f40: 0000000d de210f28 000acc88 de249f80 0000000d de248000 00000000 c00b7bf8
	[   37.494567] 9f60: de210f28 000acc88 de210f28 000acc88 00000000 00000000 0000000d c00b7ed8
	[   37.503143] 9f80: 00000000 00000000 0000000d 00000000 0007fa28 0000000d 000acc88 00000004
	[   37.511718] 9fa0: c000e544 c000e380 0007fa28 0000000d 00000001 000acc88 0000000d 00000000
	[   37.520294] 9fc0: 0007fa28 0000000d 000acc88 00000004 00000001 00000020 00000002 00000000
	[   37.528869] 9fe0: 00000000 beab8624 0000ea05 b6eaebac 600d0010 00000001 00000000 00000000
	[   37.537475] [<c00b408c>] (kfree+0x84/0x144) from [<c0236274>] (device_add+0x530/0x57c)
	[   37.545806] [<c0236274>] (device_add+0x530/0x57c) from [<c032fa40>] (iio_device_register+0x8c8/0x990)
	[   37.555480] [<c032fa40>] (iio_device_register+0x8c8/0x990) from [<c0330164>] (hmc5843_probe+0xec/0x114)
	[   37.565338] [<c0330164>] (hmc5843_probe+0xec/0x114) from [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8)
	[   37.574737] [<c02ccfd8>] (i2c_device_probe+0xc4/0xf8) from [<c02381a8>] (driver_probe_device+0x118/0x218)
	[   37.584777] [<c02381a8>] (driver_probe_device+0x118/0x218) from [<c0236bb0>] (bus_for_each_drv+0x4c/0x84)
	[   37.594818] [<c0236bb0>] (bus_for_each_drv+0x4c/0x84) from [<c0238030>] (device_attach+0x78/0xa4)
	[   37.604125] [<c0238030>] (device_attach+0x78/0xa4) from [<c02377ec>] (bus_probe_device+0x28/0x9c)
	[   37.613433] [<c02377ec>] (bus_probe_device+0x28/0x9c) from [<c0236138>] (device_add+0x3f4/0x57c)
	[   37.622650] [<c0236138>] (device_add+0x3f4/0x57c) from [<c02cd634>] (i2c_new_device+0xf8/0x19c)
	[   37.631805] [<c02cd634>] (i2c_new_device+0xf8/0x19c) from [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130)
	[   37.641754] [<c02cd78c>] (i2c_sysfs_new_device+0xb4/0x130) from [<c0234d88>] (dev_attr_store+0x18/0x24)
	[   37.651611] [<c0234d88>] (dev_attr_store+0x18/0x24) from [<c010c354>] (sysfs_write_file+0x10c/0x140)
	[   37.661193] [<c010c354>] (sysfs_write_file+0x10c/0x140) from [<c00b7bf8>] (vfs_write+0xb0/0x178)
	[   37.670410] [<c00b7bf8>] (vfs_write+0xb0/0x178) from [<c00b7ed8>] (sys_write+0x3c/0x68)
	[   37.678833] [<c00b7ed8>] (sys_write+0x3c/0x68) from [<c000e380>] (ret_fast_syscall+0x0/0x3c)
	[   37.687683] Code: 1593301c e5932000 e3120080 1a000000 (e7f001f2)
	[   37.700775] ---[ end trace aaf805debdb69390 ]---

Client data was assigned to iio_dev structure in probe but in
hmc5843_init_client function casted to private driver data structure which
is wrong. Possibly calling mutex_init(&data->lock); corrupt data
which the lead to above crash.

Signed-off-by: Marek Belisko <marek.belisko@open-nandra.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
raedwulf pushed a commit to raedwulf/linux that referenced this pull request Jan 4, 2014
The driver calls cxgb_vlan_mode() from init_one().  This calls into
synchronize_rx(), which locks all the q locks, but the q locks are not
initialized until cxgb_up() -> setup_sge_qsets().  So move the call to
cxgb_vlan_mode() into cxgb_up(), after the call to setup_sge_qsets().
We also move the body of these functions up higher to avoid having to
a forward declaration.

This was found because of the lockdep warning:

    INFO: trying to register non-static key.
    the code is fine but needs lockdep annotation.
    turning off the locking correctness validator.
    Pid: 323, comm: work_for_cpu Not tainted 3.4.0-rc5 torvalds#28
    Call Trace:
     [<ffffffff8106e767>] register_lock_class+0x108/0x2d0
     [<ffffffff8106ff42>] __lock_acquire+0xd3/0xd06
     [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
     [<ffffffff813862a6>] _raw_spin_lock_irq+0x36/0x45
     [<ffffffffa01e71aa>] cxgb_vlan_mode+0x96/0xcb [cxgb3]
     [<ffffffffa01f90eb>] init_one+0x8c4/0x980 [cxgb3]
     [<ffffffff811fcbf0>] local_pci_probe+0x3f/0x70
     [<ffffffff81042206>] do_work_for_cpu+0x10/0x22
     [<ffffffff810482de>] kthread+0xa1/0xa9
     [<ffffffff8138e234>] kernel_thread_helper+0x4/0x10

Contrary to what lockdep says, the code is not fine: we are locking an
uninitialized spinlock.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mkuoppal pushed a commit to mkuoppal/linux that referenced this pull request Jan 17, 2014
We need to defer the free request until the object/vma is capable of
being freed - or else we have a problem  when we try to destroy the
context.

The exact same issue is described and fixed here:
commit e207804
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Dec 6 14:11:22 2013 -0800

    drm/i915: Defer request freeing

I had this fix previously, but decided not to keep it for some reason I
can no longer remember.

gem_reset_stats is a really good test at hitting the problem.

For the inquisitive:
[  170.516392] ------------[ cut here ]------------
[  170.517227] WARNING: CPU: 1 PID: 105 at drivers/gpu/drm/drm_mm.c:578 drm_mm_takedown+0x2e/0x30 [drm]()
[  170.518064] Memory manager not clean during takedown.
[  170.518941] CPU: 1 PID: 105 Comm: kworker/1:1 Not tainted 3.13.0-rc4-BEN+ torvalds#28
[  170.519787] Hardware name: Hewlett-Packard HP EliteBook 8470p/179B, BIOS 68ICF Ver. F.02 04/27/2012
[  170.520662] Call Trace:
[  170.521517]  [<ffffffff814f0589>] dump_stack+0x4e/0x7a
[  170.522373]  [<ffffffff81049e6d>] warn_slowpath_common+0x7d/0xa0
[  170.523227]  [<ffffffff81049edc>] warn_slowpath_fmt+0x4c/0x50
[  170.524079]  [<ffffffffa06c414e>] drm_mm_takedown+0x2e/0x30 [drm]
[  170.524934]  [<ffffffffa07213f3>] gen6_ppgtt_cleanup+0x23/0x110
[i915]
[  170.525777]  [<ffffffffa07837ed>] ppgtt_release.part.5+0x24/0x29
[i915]
[  170.526603]  [<ffffffffa071aaa5>] i915_gem_context_free+0x195/0x1a0
[i915]
[  170.527423]  [<ffffffffa071189d>] i915_gem_free_request+0x9d/0xb0
[i915]
[  170.528247]  [<ffffffffa0718af9>] i915_gem_reset+0x1f9/0x3f0 [i915]
[  170.529065]  [<ffffffffa0700cce>] i915_reset+0x4e/0x180 [i915]
[  170.529870]  [<ffffffffa070829d>] i915_error_work_func+0xcd/0x120
[i915]
[  170.530666]  [<ffffffff8106c13a>] process_one_work+0x1fa/0x6d0
[  170.531453]  [<ffffffff8106c0d8>] ? process_one_work+0x198/0x6d0
[  170.532230]  [<ffffffff8106c72b>] worker_thread+0x11b/0x3a0
[  170.532996]  [<ffffffff8106c610>] ? process_one_work+0x6d0/0x6d0
[  170.533771]  [<ffffffff810743ef>] kthread+0xff/0x120
[  170.534548]  [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80
[  170.535322]  [<ffffffff814f97ac>] ret_from_fork+0x7c/0xb0
[  170.536089]  [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80
[  170.536847] ---[ end trace 3d4c12892e42d58f ]---

v2: Whitespace fix. (Chris)

Note: This is a bug that only hits the ppgtt topic branch but I've
figured that doing the request cleanup in this order is generally the
right thing to do.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a code comment to clarify what's actually going on since
the lifetime rules aroung ppgtt cleanup are ... fuzzy a best atm. Also
add a note about why we need this.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
swarren pushed a commit to swarren/linux-tegra that referenced this pull request Jan 21, 2014
We need to defer the free request until the object/vma is capable of
being freed - or else we have a problem  when we try to destroy the
context.

The exact same issue is described and fixed here:
commit e207804
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Dec 6 14:11:22 2013 -0800

    drm/i915: Defer request freeing

I had this fix previously, but decided not to keep it for some reason I
can no longer remember.

gem_reset_stats is a really good test at hitting the problem.

For the inquisitive:
[  170.516392] ------------[ cut here ]------------
[  170.517227] WARNING: CPU: 1 PID: 105 at drivers/gpu/drm/drm_mm.c:578 drm_mm_takedown+0x2e/0x30 [drm]()
[  170.518064] Memory manager not clean during takedown.
[  170.518941] CPU: 1 PID: 105 Comm: kworker/1:1 Not tainted 3.13.0-rc4-BEN+ torvalds#28
[  170.519787] Hardware name: Hewlett-Packard HP EliteBook 8470p/179B, BIOS 68ICF Ver. F.02 04/27/2012
[  170.520662] Call Trace:
[  170.521517]  [<ffffffff814f0589>] dump_stack+0x4e/0x7a
[  170.522373]  [<ffffffff81049e6d>] warn_slowpath_common+0x7d/0xa0
[  170.523227]  [<ffffffff81049edc>] warn_slowpath_fmt+0x4c/0x50
[  170.524079]  [<ffffffffa06c414e>] drm_mm_takedown+0x2e/0x30 [drm]
[  170.524934]  [<ffffffffa07213f3>] gen6_ppgtt_cleanup+0x23/0x110
[i915]
[  170.525777]  [<ffffffffa07837ed>] ppgtt_release.part.5+0x24/0x29
[i915]
[  170.526603]  [<ffffffffa071aaa5>] i915_gem_context_free+0x195/0x1a0
[i915]
[  170.527423]  [<ffffffffa071189d>] i915_gem_free_request+0x9d/0xb0
[i915]
[  170.528247]  [<ffffffffa0718af9>] i915_gem_reset+0x1f9/0x3f0 [i915]
[  170.529065]  [<ffffffffa0700cce>] i915_reset+0x4e/0x180 [i915]
[  170.529870]  [<ffffffffa070829d>] i915_error_work_func+0xcd/0x120
[i915]
[  170.530666]  [<ffffffff8106c13a>] process_one_work+0x1fa/0x6d0
[  170.531453]  [<ffffffff8106c0d8>] ? process_one_work+0x198/0x6d0
[  170.532230]  [<ffffffff8106c72b>] worker_thread+0x11b/0x3a0
[  170.532996]  [<ffffffff8106c610>] ? process_one_work+0x6d0/0x6d0
[  170.533771]  [<ffffffff810743ef>] kthread+0xff/0x120
[  170.534548]  [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80
[  170.535322]  [<ffffffff814f97ac>] ret_from_fork+0x7c/0xb0
[  170.536089]  [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80
[  170.536847] ---[ end trace 3d4c12892e42d58f ]---

v2: Whitespace fix. (Chris)

Note: This is a bug that only hits the ppgtt topic branch but I've
figured that doing the request cleanup in this order is generally the
right thing to do.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a code comment to clarify what's actually going on since
the lifetime rules aroung ppgtt cleanup are ... fuzzy a best atm. Also
add a note about why we need this.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
torvalds pushed a commit that referenced this pull request Jan 30, 2014
We need to defer the free request until the object/vma is capable of
being freed - or else we have a problem  when we try to destroy the
context.

The exact same issue is described and fixed here:
commit e207804
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Dec 6 14:11:22 2013 -0800

    drm/i915: Defer request freeing

I had this fix previously, but decided not to keep it for some reason I
can no longer remember.

gem_reset_stats is a really good test at hitting the problem.

For the inquisitive:
[  170.516392] ------------[ cut here ]------------
[  170.517227] WARNING: CPU: 1 PID: 105 at drivers/gpu/drm/drm_mm.c:578 drm_mm_takedown+0x2e/0x30 [drm]()
[  170.518064] Memory manager not clean during takedown.
[  170.518941] CPU: 1 PID: 105 Comm: kworker/1:1 Not tainted 3.13.0-rc4-BEN+ #28
[  170.519787] Hardware name: Hewlett-Packard HP EliteBook 8470p/179B, BIOS 68ICF Ver. F.02 04/27/2012
[  170.520662] Call Trace:
[  170.521517]  [<ffffffff814f0589>] dump_stack+0x4e/0x7a
[  170.522373]  [<ffffffff81049e6d>] warn_slowpath_common+0x7d/0xa0
[  170.523227]  [<ffffffff81049edc>] warn_slowpath_fmt+0x4c/0x50
[  170.524079]  [<ffffffffa06c414e>] drm_mm_takedown+0x2e/0x30 [drm]
[  170.524934]  [<ffffffffa07213f3>] gen6_ppgtt_cleanup+0x23/0x110
[i915]
[  170.525777]  [<ffffffffa07837ed>] ppgtt_release.part.5+0x24/0x29
[i915]
[  170.526603]  [<ffffffffa071aaa5>] i915_gem_context_free+0x195/0x1a0
[i915]
[  170.527423]  [<ffffffffa071189d>] i915_gem_free_request+0x9d/0xb0
[i915]
[  170.528247]  [<ffffffffa0718af9>] i915_gem_reset+0x1f9/0x3f0 [i915]
[  170.529065]  [<ffffffffa0700cce>] i915_reset+0x4e/0x180 [i915]
[  170.529870]  [<ffffffffa070829d>] i915_error_work_func+0xcd/0x120
[i915]
[  170.530666]  [<ffffffff8106c13a>] process_one_work+0x1fa/0x6d0
[  170.531453]  [<ffffffff8106c0d8>] ? process_one_work+0x198/0x6d0
[  170.532230]  [<ffffffff8106c72b>] worker_thread+0x11b/0x3a0
[  170.532996]  [<ffffffff8106c610>] ? process_one_work+0x6d0/0x6d0
[  170.533771]  [<ffffffff810743ef>] kthread+0xff/0x120
[  170.534548]  [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80
[  170.535322]  [<ffffffff814f97ac>] ret_from_fork+0x7c/0xb0
[  170.536089]  [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80
[  170.536847] ---[ end trace 3d4c12892e42d58f ]---

v2: Whitespace fix. (Chris)

Note: This is a bug that only hits the ppgtt topic branch but I've
figured that doing the request cleanup in this order is generally the
right thing to do.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a code comment to clarify what's actually going on since
the lifetime rules aroung ppgtt cleanup are ... fuzzy a best atm. Also
add a note about why we need this.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
johnweber pushed a commit to wandboard-org/linux that referenced this pull request Apr 23, 2014
Kernel will dump when run deinterlace stress test.
It is caused by vditmpbuf being reallocated by another thread
when one thread accesses it.
Issue is fixed by putting these code in mutex.

Kernel dump log:
[Playing  ][Vol=01][00:00:10/00:00:30][fps:32]Unable to handle kernel
paging request at virtual address 607d6085
pgd = 80004000
[607d6085] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 50 Comm: ipu2_task Not tainted 3.10.17-02308-g3700819 torvalds#28
task: ac1dc700 ti: ac1ba000 task.ti: ac1ba000
PC is at __kmalloc+0x40/0x114
LR is at __kmalloc+0x14/0x114
pc : [<800bbd40>]    lr : [<800bbd14>]    psr: 200f0013
sp : ac1bbbc8  ip : 008cc000  fp : 00001e40
r10: ac772e00  r9 : 0057b255  r8 : 000000d0
r7 : 00000790  r6 : ac773800  r5 : 607d6085  r4 : ac001b00
r3 : 00000000  r2 : 814f92a0  r1 : 000000d0  r0 : 000398c9
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c53c7d  Table: 3c4c004a  DAC: 00000015
Process ipu2_task (pid: 50, stack limit = 0xac1ba238)
Stack: (0xac1bbbc8 to 0xac1bc000)

Signed-off-by: Sandor Yu <R01008@freescale.com>
ddstreet pushed a commit to ddstreet/linux that referenced this pull request Sep 25, 2014
I'm getting the spew below when booting with Haswell (Xeon
E5-2699 v3) CPUs and the "Cluster-on-Die" (CoD) feature enabled
in the BIOS.  It seems similar to the issue that some folks from
AMD ran in to on their systems and addressed in this commit:

  161270f ("x86/smp: Fix topology checks on AMD MCM CPUs")

Both these Intel and AMD systems break an assumption which is
being enforced by topology_sane(): a socket may not contain more
than one NUMA node.

AMD special-cased their system by looking for a cpuid flag.  The
Intel mode is dependent on BIOS options and I do not know of a
way which it is enumerated other than the tables being parsed
during the CPU bringup process.  In other words, we have to trust
the ACPI tables <shudder>.

This detects the situation where a NUMA node occurs at a place in
the middle of the "CPU" sched domains.  It replaces the default
topology with one that relies on the NUMA information from the
firmware (SRAT table) for all levels of sched domains above the
hyperthreads.

This also fixes a sysfs bug.  We used to freak out when we saw
the "mc" group cross a node boundary, so we stopped building the
MC group.  MC gets exported as the 'core_siblings_list' in
/sys/devices/system/cpu/cpu*/topology/ and this caused CPUs with
the same 'physical_package_id' to not be listed together in
'core_siblings_list'.  This violates a statement from
Documentation/ABI/testing/sysfs-devices-system-cpu:

	core_siblings: internal kernel map of cpu#'s hardware threads
	within the same physical_package_id.

	core_siblings_list: human-readable list of the logical CPU
	numbers within the same physical_package_id as cpu#.

The sysfs effects here cause an issue with the hwloc tool where
it gets confused and thinks there are more sockets than are
physically present.

Before this patch, there are two packages:

# cd /sys/devices/system/cpu/
# cat cpu*/topology/physical_package_id | sort | uniq -c
     18 0
     18 1

But 4 _sets_ of core siblings:

# cat cpu*/topology/core_siblings_list | sort | uniq -c
      9 0-8
      9 18-26
      9 27-35
      9 9-17

After this set, there are only 2 sets of core siblings, which
is what we expect for a 2-socket system.

# cat cpu*/topology/physical_package_id | sort | uniq -c
     18 0
     18 1
# cat cpu*/topology/core_siblings_list | sort | uniq -c
     18 0-17
     18 18-35

Example spew:
...
	NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
	 #2  #3  #4  #5  torvalds#6  torvalds#7  torvalds#8
	.... node  #1, CPUs:    torvalds#9
	------------[ cut here ]------------
	WARNING: CPU: 9 PID: 0 at /home/ak/hle/linux-hle-2.6/arch/x86/kernel/smpboot.c:306 topology_sane.isra.2+0x74/0x90()
	sched: CPU torvalds#9's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
	Modules linked in:
	CPU: 9 PID: 0 Comm: swapper/9 Not tainted 3.17.0-rc1-00293-g8e01c4d-dirty torvalds#631
	Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014
	0000000000000009 ffff88046ddabe00 ffffffff8172e485 ffff88046ddabe48
	ffff88046ddabe38 ffffffff8109691d 000000000000b001 0000000000000009
	ffff88086fc12580 000000000000b020 0000000000000009 ffff88046ddabe98
	Call Trace:
	[<ffffffff8172e485>] dump_stack+0x45/0x56
	[<ffffffff8109691d>] warn_slowpath_common+0x7d/0xa0
	[<ffffffff8109698c>] warn_slowpath_fmt+0x4c/0x50
	[<ffffffff81074f94>] topology_sane.isra.2+0x74/0x90
	[<ffffffff8107530e>] set_cpu_sibling_map+0x31e/0x4f0
	[<ffffffff8107568d>] start_secondary+0x1ad/0x240
	---[ end trace 3fe5f587a9fcde61 ]---
	torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 torvalds#16 torvalds#17
	.... node  #2, CPUs:   torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 torvalds#24 torvalds#25 torvalds#26
	.... node  #3, CPUs:   torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 torvalds#32 torvalds#33 torvalds#34 torvalds#35

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
[ Added LLC domain and s/match_mc/match_die/ ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: brice.goglin@gmail.com
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/20140918193334.C065EBCE@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
koct9i referenced this pull request in koct9i/linux Sep 27, 2014
GIT 2cf50762577a191f2f646e4034cc21f4764d90ab

commit 68c7ed167ef564c83a0022f2d0fe15cbdac0c5a4
Author: Peter Foley <pefoley2@pefoley.com>
Date:   Thu Sep 25 10:34:04 2014 +1000

    Documentation: disable vdso_test to avoid breakage with old glibc
    
    glibc versions older than 2.16 don't include sys/auxv.h which this
    executable uses.
    Since we don't have a good way to test for specific glibc versions in
    kbuild, just disable it for now.
    
    Signed-off-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

commit 03979f1100aca61cf16ac7f3e464d7f11747e86b
Author: Peter Foley <pefoley2@pefoley.com>
Date:   Thu Sep 25 10:34:04 2014 +1000

    Documentation: update vDSO makefile to build portable examples
    
    Signed-off-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

commit 13770826a9872de445703ee6c05b44f525f78513
Author: Peter Foley <pefoley2@pefoley.com>
Date:   Thu Sep 25 10:34:04 2014 +1000

    Documentation: update .gitignore files
    
    Add some missing files to .gitignore.
    Push Documentation/.gitignore down into subdirectories.
    
    Signed-off-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

commit ce0f3b2691c066255fcf25c77e929ff8cda73417
Author: Peter Foley <pefoley2@pefoley.com>
Date:   Thu Sep 25 10:34:03 2014 +1000

    Documentation: support glibc versions without htole macros
    
    glibc 2.9 introduced the htole<16/32/64> macros, add them to
    tools/include to support older versions of glibc.
    
    Reported-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

commit 77cb6d96708f5b681969ce144057ae1ebda96057
Author: Mark Brown <broonie@kernel.org>
Date:   Thu Sep 25 10:34:03 2014 +1000

    v4l2-pci-skeleton: Only build if PCI is available
    
    Currently arm64 does not support PCI but it does support v4l2. Since the
    PCI skeleton driver is built unconditionally as a module with no dependency
    on PCI this causes build failures for arm64 allmodconfig. Fix this by
    defining a symbol VIDEO_PCI_SKELETON for the skeleton and conditionalising
    the build on that.
    
    Signed-off-by: Mark Brown <broonie@linaro.org>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org> [added VIDEO dependencies]

commit 112a45ca901ced590ff71c5fd7d717f8771717ce
Author: Peter Foley <pefoley2@pefoley.com>
Date:   Thu Sep 25 10:34:03 2014 +1000

    Documentation: fix misc. warnings
    
    Fix a few warnings that gcc emits during a default build.
    
    Signed-off-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

commit b0fc375fe2c09477b5effc8e030668203d24c897
Author: Peter Foley <pefoley2@pefoley.com>
Date:   Thu Sep 25 10:34:03 2014 +1000

    Documentation: make functions static to avoid prototype warnings
    
    Signed-off-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

commit 593ed472bcea93949c72ddc45e340fee47a05aa3
Author: Peter Foley <pefoley2@pefoley.com>
Date:   Thu Sep 25 10:34:03 2014 +1000

    Documentation: add makefiles for more targets
    
    Add a bunch of previously unbuilt source files to the Documentation build
    machinery.
    
    Signed-off-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

commit cc67d673f252264535cef654c37c8dd621340e8b
Author: Peter Foley <pefoley2@pefoley.com>
Date:   Thu Sep 25 10:34:02 2014 +1000

    Documentation: use subdir-y to avoid unnecessary built-in.o files
    
    Change the Documentation makefiles from obj-m to subdir-y
    to avoid generating unnecessary built-in.o files since nothing
    in Documentation/ is ever linked in to vmlinux.
    
    Signed-off-by: Peter Foley <pefoley2@pefoley.com>
    Acked-by: Sam Ravnborg <sam@ravnborg.org>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

commit bb69afeca345380203c89d2f19e95eca786b5a4c
Author: Peter Foley <pefoley2@pefoley.com>
Date:   Thu Sep 25 10:34:02 2014 +1000

    Documentation: remove networking/.gitignore
    
    Remove empty networking/.gitignore
    
    Signed-off-by: Peter Foley <pefoley2@pefoley.com>
    Cc: rdunlap@infradead.org
    Cc: linux-doc@vger.kernel.org
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

commit 4d96fb1ec81118c6406fe6d3670f172b2faaedf3
Author: Heiko Stuebner <heiko.stuebner@bq.com>
Date:   Tue Sep 23 22:42:16 2014 +0200

    power: gpio-charger: do not use gpio value directly
    
    Some gpio implementations return interesting values for gpio_get_value when
    the value is not 0 - as seen on a imx6sl board. Therefore do not use the
    value returned from gpio_get_value directly but simply check for 0 or not 0.
    
    Signed-off-by: Heiko Stuebner <heiko.stuebner@bq.com>
    Reviewed-by: Doug Anderson <dianders@chromium.org>
    Tested-by: Doug Anderson <dianders@chromium.org>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit ddd26dff757d08d4eb309a28bf2a02372387e71f
Author: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Date:   Tue Sep 16 18:10:41 2014 +0200

    power: max8925: Use of_get_child_by_name
    
    Use of_get_child_by_name to obtain reference to charger node instead of
    of_find_node_by_name which can walk outside of the parent node.
    
    Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 920ac5be91bc447c5ef82f457207a169aa79c5f6
Author: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Date:   Tue Sep 16 18:10:40 2014 +0200

    power: max8925: Fix NULL ptr dereference on memory allocation failure
    
    Check the return value of devm_kzalloc() to fix possible NULL pointer
    dereference and properly exit the probe() on memory allocation failure.
    
    Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 628ef02c56e515430dd8d8439126dd0ecb8ce8bb
Author: Puthikorn Voravootivat <puthik@chromium.org>
Date:   Tue Sep 9 12:20:35 2014 -0700

    bq27x00_battery: Add support to bq27742
    
    Add support to bq27742 in bq27x00 driver. bq27742 register
    addresses are mostly mostly the same as bq27500 addresses
    with minor differences.
    
    Signed-off-by: Puthikorn Voravootivat <puthik@chromium.org>
    Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
    Reviewed-by: Rhyland Klein <rklein@nvidia.com>
    Reviewed-by: Benson Leung <bleung@chromium.org>
    Signed-off-by: Sebastian Reichel <sre@kernel.org>

commit 042e1c79166b9250edd8262bea84e1703f27ad2e
Author: Jin Yao <yao.jin@linux.intel.com>
Date:   Mon Sep 22 10:31:14 2014 -0700

    Input: soc_button_array - convert to platform bus
    
    ACPI device enumeration mechanism changed a lot since 3.16-rc1.
    ACPI device objects with _HID will be enumerated to platform bus by default.
    For the existing PNP drivers that probe the PNPACPI devices, the device ids
    are listed explicitly in drivers/acpi/acpi_pnp.c.
    But ACPI folks will continue their effort on shrinking this id list by
    converting the PNP drivers to platform drivers, for the devices that don't
    belong to PNP bus in nature.
    
    Signed-off-by: Jin Yao <yao.jin@intel.com>
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

commit 3049683eafdbbbd7350b0e5ca02a2d8c026a3362
Author: Marcos Paulo de Souza <marcos.souza.org@gmail.com>
Date:   Wed Sep 24 16:00:33 2014 -0700

    Input: i8042 - fix Asus X450LCP touchpad detection
    
    We need to add this module to the nomux table to be able to detect the
    touchpad.
    
    Cc: stablevger.kernel.org
    Signed-off-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

commit 3f3d0463636e38a27c3000d59889b8bdc9072ce1
Author: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Date:   Thu Sep 25 08:23:26 2014 +0900

    ARM: dts: Remove display timings node from exynos5250-snow
    
    Commit a3d72cad63ed ("ARM: dts: Clean up exynos5250-snow")
    improved the Snow DTS but the patch was rebased due conflicting
    changes and the merge resolution added a device node removed by
    commit a98c3c23868f ("ARM: dts: update display related nodes for
    exynos5250-snow").
    
    This patch removes the node again to keep the DTS as before.
    
    Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
    Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>

commit dd12ac7fb3988be113465f499ad91b30b8aa4bdd
Author: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Date:   Thu Sep 25 08:21:25 2014 +0900

    ARM: dts: Fix chip select GPIO on exynos5250-smdk5250
    
    Commit 65bbe3fee0e1 ("ARM: dts: Clean up exynos5250-smdk5250")
    improved the smdk5250 DTS but the patch was rebased due conflicting
    changes and the merge resolution added a regression of the bug fixed
    in commit e138d4333aa0 ("ARM: dts: fix the chip select gpios definition
    in the SPI nodes").
    
    This patch fixes the issue by removing the old cs-gpio and using
    the generic cs-gpios property.
    
    Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
    Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>

commit 5c4dd348af35a6f6db97b4f2401f74c71f7f3c7d
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Thu Sep 25 00:53:44 2014 +0200

    Revert "PM / Hibernate: Iterate over set bits instead of PFNs in swsusp_free()"
    
    Revert commit 6efde38f0769 (PM / Hibernate: Iterate over set bits
    instead of PFNs in swsusp_free()) that introduced a NULL pointer
    dereference during system resume from hibernation:
    
    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [<ffffffff810a8cc1>] swsusp_free+0x21/0x190
    PGD b39c2067 PUD b39c1067 PMD 0
    Oops: 0000 [#1] SMP
    Modules linked in: <irrelevant list of modules>
    CPU: 1 PID: 4898 Comm: s2disk Tainted: G         C     3.17-rc5-amd64 #1 Debian 3.17~rc5-1~exp1
    Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
    task: ffff88023155ea40 ti: ffff8800b3b14000 task.ti: ffff8800b3b14000
    RIP: 0010:[<ffffffff810a8cc1>]  [<ffffffff810a8cc1>]
    swsusp_free+0x21/0x190
    RSP: 0018:ffff8800b3b17ea8  EFLAGS: 00010246
    RAX: 0000000000000000 RBX: ffff8800b39bab00 RCX: 0000000000000001
    RDX: ffff8800b39bab10 RSI: ffff8800b39bab00 RDI: 0000000000000000
    RBP: 0000000000000010 R08: 0000000000000000 R09: 0000000000000000
    R10: ffff8800b39bab10 R11: 0000000000000246 R12: ffffea0000000000
    R13: ffff880232f485a0 R14: ffff88023ac27cd8 R15: ffff880232927590
    FS:  00007f406d83b700(0000) GS:ffff88023bc80000(0000)
    knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000000 CR3: 00000000b3a62000 CR4: 00000000000007e0
    Stack:
     ffff8800b39bab00 0000000000000010 ffff880232927590 ffffffff810acb4a
     ffff8800b39bab00 ffffffff811a955a ffff8800b39bab10 0000000000000000
     ffff88023155f098 ffffffff81a6b8c0 ffff88023155ea40 0000000000000007
    Call Trace:
     [<ffffffff810acb4a>] ? snapshot_release+0x2a/0xb0
     [<ffffffff811a955a>] ? __fput+0xca/0x1d0
     [<ffffffff81080627>] ? task_work_run+0x97/0xd0
     [<ffffffff81012d89>] ? do_notify_resume+0x69/0xa0
     [<ffffffff8151452a>] ? int_signal+0x12/0x17
    Code: 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 41 54 48 8b 05 ba 62 9c 00 49 bc 00 00 00 00 00 ea ff ff 48 8b 3d a1 62 9c 00 55 53 <48> 8b 10 48 89 50 18 48 8b 52 20 48 c7 40 28 00 00 00 00 c7 40
    RIP  [<ffffffff810a8cc1>] swsusp_free+0x21/0x190
     RSP <ffff8800b3b17ea8>
    CR2: 0000000000000000
    ---[ end trace f02be86a1ec0cccb ]---
    
    due to forbidden_pages_map being NULL in swsusp_free().
    
    Fixes: 6efde38f0769 "PM / Hibernate: Iterate over set bits instead of PFNs in swsusp_free()"
    Reported-by: Bjørn Mork <bjorn@mork.no>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

commit a928f97152ea39c31f8ea89b60dd30a169d4ed6e
Author: Peter Hüwe <PeterHuewe@gmx.de>
Date:   Fri Sep 12 21:09:47 2014 +0200

    i2c: acpi: Fix NULL Pointer dereference
    
    If adapter->dev.parent == NULL there is a NULL pointer dereference in
    acpi_i2c_install_space_handler and acpi_i2c_remove_space_handler.
    
    This is present since introduction of this code:
    366047515c6e "i2c: rework kernel config I2C_ACPI" or even
    da3c6647ee08 "I2C/ACPI: Clean up I2C ACPI code and Add CONFIG_I2C_ACPI"
    
    The adapter->dev.parent == NULL case is valid for the i2c_stub,
    so loading i2c_stub with ACPI_I2C_OPREGION enabled results in an oops.
    This is also valid at least for i2c_tiny_usb and i2c_robotfuzz_osif.
    
    Fix by checking whether it is null before calling ACPI_HANDLE.
    
    Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
    Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

commit c15d821ddb9dac9ac6b5beb75bf942f3bc3a4004
Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Date:   Tue Sep 23 10:35:54 2014 +0800

    gpio / ACPI: Use pin index and bit length
    
    Fix code when the operation region callback is for an gpio, which
    is not at index 0 and for partial pins in a GPIO definition.
    For example:
    Name (GMOD, ResourceTemplate ()
    {
    	//3 Outputs that define the Power mode of the device
    	GpioIo (Exclusive, PullDown, , , , "\\_SB.GPI2") {10, 11, 12}
    	})
    }
    
    If opregion callback calls is for:
    - Set pin 10, then address = 0 and bit length = 1
    - Set pin 11, then address = 1 and bit length = 1
    - Set for both pin 11 and pin 12, then address = 1, bit length = 2
    
    This change requires updated ACPICA gpio operation handler code to
    send the pin index and bit length.
    
    Fixes: 473ed7be0da0 (gpio / ACPI: Add support for ACPI GPIO operation regions)
    Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Acked-by: Linus Walleij <linus.walleij@linaro.org>
    Cc: 3.15+ <stable@vger.kernel.org> # 3.15+: 75ec6e55f138 ACPICA: Update to GPIO region handler interface.
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

commit 75ec6e55f1384548311a13ce4fcb39c516053314
Author: Bob Moore <Robert.Moore@intel.com>
Date:   Tue Sep 23 10:35:47 2014 +0800

    ACPICA: Update to GPIO region handler interface.
    
    Changes to correct several GPIO issues:
    
    1) The update_rule in a GPIO field definition is now ignored;
    a read-modify-write operation is never performed for GPIO fields.
    (Internally, this means that the field assembly/disassembly
    code is completely bypassed for GPIO.)
    
    2) The Address parameter passed to a GPIO region handler is
    now the bit offset of the field from a previous Connection()
    operator. Thus, it becomes a "Pin Number Index" into the
    Connection() resource descriptor.
    
    3) The bit_width parameter passed to a GPIO region handler is
    now the exact bit width of the GPIO field. Thus, it can be
    interpreted as "number of pins".
    
    Overall, we can now say that the region handler interface
    to GPIO handlers is a raw "bit/pin" addressed interface, not
    a byte-addressed interface like the system_memory handler interface.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lv Zheng <lv.zheng@intel.com>
    Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

commit 457920817e645a7dee42c2a75c81c5ed8e12ee1c
Author: Fu Zhonghui <zhonghui.fu@linux.intel.com>
Date:   Wed Sep 24 22:42:26 2014 +0200

    ACPI / platform / LPSS: disable async suspend/resume of LPSS devices
    
    On some systems (Asus T100 in particular) there are strict ordering
    dependencies between LPSS devices with respect to power management
    that break if they suspend/resume asynchronously.
    
    In theory it should be possible to follow those dependencies in the
    async suspend/resume case too (the ACPI tables tell as that the
    dependencies are there), but since we're missing infrastructure
    for that at the moment, disable async suspend/resume for all of
    the LPSS devices for the time being.
    
    Link: http://marc.info/?l=linux-acpi&m=141158962321905&w=2
    Fixes: 8ce62f85a81f (ACPI / platform / LPSS: Enable async suspend/resume of LPSS devices)
    Signed-off-by: Li Aubrey <aubrey.li@linux.intel.com>
    Signed-off-by: Fu Zhonghui <zhonghui.fu@linux.intel.com>
    Cc: 3.16+ <stable@vger.kernel.org> # 3.16+
    [ rjw: Changelog ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

commit c7da579763f29cf45a861ad4c339aba590d8b80d
Author: Johan Hedberg <johan.hedberg@intel.com>
Date:   Wed Sep 24 22:41:46 2014 +0300

    Bluetooth: Add retransmission effort into SCO parameter table
    
    It is expected that new parameter combinations will have the
    retransmission effort value different between some entries (mainly
    because of the new S4 configuration added by HFP 1.7), so it makes sense
    to move it into the table instead of having it hard coded based on the
    selected SCO_AIRMODE_*.
    
    Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
    Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

commit b2fc3f3c6d397d434174147eca3db1ec778195ce
Author: Olof Johansson <olof@lixom.net>
Date:   Wed Sep 24 11:42:38 2014 -0700

    drivers/soc: ti: fix build break with modules
    
    Fixes below build break by not switching to stubs when the driver is a module:
    
    drivers/soc/ti/knav_dma.c:418:7: error: redefinition of 'knav_dma_open_channel'
     void *knav_dma_open_channel(struct device *dev, const char *name,
           ^
    In file included from drivers/soc/ti/knav_dma.c:26:0:
    include/linux/soc/ti/knav_dma.h:165:21: note: previous definition of 'knav_dma_open_channel' was here
     static inline void *knav_dma_open_channel(struct device *dev, const char *name,
                         ^
    
    Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit 2d9251e3501356ceb44444a8f9a393b57163dc6a
Author: Matthias Brugger <matthias.bgg@gmail.com>
Date:   Mon Aug 18 16:58:00 2014 +0200

    ARM: multi_v7_defconfig: Enable Mediatek platform
    
    Enable Mediatek platform support for multi_v7_defconfig.
    
    Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit d66820853251e8a9b53125a95a773e482cd79136
Author: Matthias Brugger <matthias.bgg@gmail.com>
Date:   Mon Aug 18 16:58:00 2014 +0200

    ARM: mediatek: Add earlyprintk support for mt6589
    
    Enable low-level debug for Mediatek mt6589 SoC on UART0.
    
    Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit 7e9b2828f25ec156623da0c2156604066de5514d
Author: Matthias Brugger <matthias.bgg@gmail.com>
Date:   Mon Aug 18 16:58:00 2014 +0200

    ARM: dts: mt6589: Change compatible string for GIC
    
    This patch changes the compatible string of the GIC to the
    new "arm,cortex-a7-gic" which does reflect the actual hardware.
    
    Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit 6e9cb2633698ddadd2493b3793dbc9723f570538
Author: Matthias Brugger <matthias.bgg@gmail.com>
Date:   Mon Aug 18 16:58:00 2014 +0200

    ARM: dts: mediatek: Add compatible property for aquaris5
    
    Add the missing 'compatible' property to device tree root node of
    
     - mt6589-aquaris5.dts
    
    and document the new values.
    
    Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit d82df11466df3e0934c7e7aa2f5e08c284e1fd9d
Author: Matthias Brugger <matthias.bgg@gmail.com>
Date:   Mon Aug 18 16:58:00 2014 +0200

    ARM: dts: mt6589-aquaris5: Add boot argument earlyprintk
    
    Add boot argument for earlyprintk to the aquaris5 device tree file.
    
    Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit 510f1d72e526e776243397142cbcd459dd2a2efa
Author: Matthias Brugger <matthias.bgg@gmail.com>
Date:   Mon Aug 18 16:58:00 2014 +0200

    ARM: dts: mt6589: Fix typo in GIC unit address
    
    This changes the unit address of the gic node to it's first register area.
    
    Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit 995425883e4087a4bfd61d12e442089d1201fc5c
Author: Matthias Brugger <matthias.bgg@gmail.com>
Date:   Mon Aug 18 16:58:00 2014 +0200

    ARM: dts: Build dtb for Mediatek board
    
    This allows the "make dtbs" to build the aquaris5 dtb for the Mediatek
    SoC.
    
    Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit 00e978180b1fa211c30642139149ea448ff06c55
Author: Pawel Moll <pawel.moll@arm.com>
Date:   Mon Sep 15 15:33:48 2014 +0100

    bus: arm-ccn: Fix spurious warning message
    
    Because CCN's cycle counter always runs, it will generate
    an interrupt on overflow even if the relevant perf event
    was not requested, causing a spurious warning message.
    
    Fixed now by warning on only normal counter unwanted
    overflows. Also cleaning the overflow mask at init now,
    not to warn on event previously requested by firmware.
    
    Signed-off-by: Pawel Moll <pawel.moll@arm.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit 94e57fea62020dbf6e5d0093eabcd28366e86044
Author: Francesco Ruggeri <fruggeri@arista.com>
Date:   Wed Sep 24 10:12:41 2014 -0700

    PCI: Move PCI_VENDOR_ID_VMWARE to pci_ids.h
    
    Move PCI_VENDOR_ID_VMWARE from device-specific files to pci_ids.h.
    It is useful to always have access to it, especially when accessing
    subsystem_vendor_id on emulated devices.
    
    [bhelgaas: keep pci_ids.h sorted and use lower-case hex]
    Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 17497acbdce9506fd6a75115dee4ab80c3cc5ee5
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:50 2014 -0400

    blk-mq, percpu_ref: start q->mq_usage_counter in atomic mode
    
    blk-mq uses percpu_ref for its usage counter which tracks the number
    of in-flight commands and used to synchronously drain the queue on
    freeze.  percpu_ref shutdown takes measureable wallclock time as it
    involves a sched RCU grace period.  This means that draining a blk-mq
    takes measureable wallclock time.  One would think that this shouldn't
    matter as queue shutdown should be a rare event which takes place
    asynchronously w.r.t. userland.
    
    Unfortunately, SCSI probing involves synchronously setting up and then
    tearing down a lot of request_queues back-to-back for non-existent
    LUNs.  This means that SCSI probing may take above ten seconds when
    scsi-mq is used.
    
      [    0.949892] scsi host0: Virtio SCSI HBA
      [    1.007864] scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    1.1. PQ: 0 ANSI: 5
      [    1.021299] scsi 0:0:1:0: Direct-Access     QEMU     QEMU HARDDISK    1.1. PQ: 0 ANSI: 5
      [    1.520356] tsc: Refined TSC clocksource calibration: 2491.910 MHz
    
      <stall>
    
      [   16.186549] sd 0:0:0:0: Attached scsi generic sg0 type 0
      [   16.190478] sd 0:0:1:0: Attached scsi generic sg1 type 0
      [   16.194099] osd: LOADED open-osd 0.2.1
      [   16.203202] sd 0:0:0:0: [sda] 31457280 512-byte logical blocks: (16.1 GB/15.0 GiB)
      [   16.208478] sd 0:0:0:0: [sda] Write Protect is off
      [   16.211439] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
      [   16.218771] sd 0:0:1:0: [sdb] 31457280 512-byte logical blocks: (16.1 GB/15.0 GiB)
      [   16.223264] sd 0:0:1:0: [sdb] Write Protect is off
      [   16.225682] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    
    This is also the reason why request_queues start in bypass mode which
    is ended on blk_register_queue() as shutting down a fully functional
    queue also involves a RCU grace period and the queues for non-existent
    SCSI devices never reach registration.
    
    blk-mq basically needs to do the same thing - start the mq in a
    degraded mode which is faster to shut down and then make it fully
    functional only after the queue reaches registration.  percpu_ref
    recently grew facilities to force atomic operation until explicitly
    switched to percpu mode, which can be used for this purpose.  This
    patch makes blk-mq initialize q->mq_usage_counter in atomic mode and
    switch it to percpu mode only once blk_register_queue() is reached.
    
    Note that this issue was previously worked around by 0a30288da1ae
    ("blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during
    probe") for v3.17.  The temp fix was reverted in preparation of adding
    persistent atomic mode to percpu_ref by 9eca80461a45 ("Revert "blk-mq,
    percpu_ref: implement a kludge for SCSI blk-mq stall during probe"").
    This patch and the prerequisite percpu_ref changes will be merged
    during v3.18 devel cycle.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Christoph Hellwig <hch@infradead.org>
    Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
    Fixes: add703fda981 ("blk-mq: use percpu_ref for mq usage count")
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Johannes Weiner <hannes@cmpxchg.org>

commit e4de4d412fc0d07f820906dcd70ede06dad312e2
Author: Olof Johansson <olof@lixom.net>
Date:   Wed Sep 24 10:34:06 2014 -0700

    ARM: at91: Fix bad conflict resolution in board-dt-sama5.c
    
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit 3c9703a87bdda1fef221ab0957a3ec2a2126f129
Author: Olof Johansson <olof@lixom.net>
Date:   Wed Sep 24 10:32:20 2014 -0700

    Revert "ARM: make arrays containing machine compatible strings const"
    
    I dropped it from next/cleanup, doing a revert on for-next instead of
    rebuilding the branch.
    
    This reverts commit fbce9bc876ab2c905f1f82a78bc25745e98760d9.

commit 1cae13e75b7a7848c03138636d4eb8d8a5054dd5
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:50 2014 -0400

    percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky
    
    Currently, a percpu_ref which is initialized with
    PERPCU_REF_INIT_ATOMIC or switched to atomic mode via
    switch_to_atomic() automatically reverts to percpu mode on the first
    percpu_ref_reinit().  This makes the atomic mode difficult to use for
    cases where a percpu_ref is used as a persistent on/off switch which
    may be cycled multiple times.
    
    This patch makes such atomic state sticky so that it survives through
    kill/reinit cycles.  After this patch, atomic state is cleared only by
    an explicit percpu_ref_switch_to_percpu() call.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>

commit 2aad2a86f6685c10360ec8a5a55eb9ab7059cb72
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:50 2014 -0400

    percpu_ref: add PERCPU_REF_INIT_* flags
    
    With the recent addition of percpu_ref_reinit(), percpu_ref now can be
    used as a persistent switch which can be turned on and off repeatedly
    where turning off maps to killing the ref and waiting for it to drain;
    however, there currently isn't a way to initialize a percpu_ref in its
    off (killed and drained) state, which can be inconvenient for certain
    persistent switch use cases.
    
    Similarly, percpu_ref_switch_to_atomic/percpu() allow dynamic
    selection of operation mode; however, currently a newly initialized
    percpu_ref is always in percpu mode making it impossible to avoid the
    latency overhead of switching to atomic mode.
    
    This patch adds @flags to percpu_ref_init() and implements the
    following flags.
    
    * PERCPU_REF_INIT_ATOMIC	: start ref in atomic mode
    * PERCPU_REF_INIT_DEAD		: start ref killed and drained
    
    These flags should be able to serve the above two use cases.
    
    v2: target_core_tpg.c conversion was missing.  Fixed.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>

commit f47ad45784611297b699f3dffb6c7222b76afe64
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:49 2014 -0400

    percpu_ref: decouple switching to percpu mode and reinit
    
    percpu_ref has treated the dropping of the base reference and
    switching to atomic mode as an integral operation; however, there's
    nothing inherent tying the two together.
    
    The use cases for percpu_ref have been expanding continuously.  While
    the current init/kill/reinit/exit model can cover a lot, the coupling
    of kill/reinit with atomic/percpu mode switching is turning out to be
    too restrictive for use cases where many percpu_refs are created and
    destroyed back-to-back with only some of them reaching extended
    operation.  The coupling also makes implementing always-atomic debug
    mode difficult.
    
    This patch separates out percpu mode switching into
    percpu_ref_switch_to_percpu() and reimplements percpu_ref_reinit() on
    top of it.
    
    * DEAD still requires ATOMIC.  A dead ref can't be switched to percpu
      mode w/o going through reinit.
    
    v2: __percpu_ref_switch_to_percpu() was missing static.  Fixed.
        Reported by Fengguang aka kbuild test robot.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: kbuild test robot <fengguang.wu@intel.com>

commit 490c79a65708873228cf114cf00e32c204e4e907
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:49 2014 -0400

    percpu_ref: decouple switching to atomic mode and killing
    
    percpu_ref has treated the dropping of the base reference and
    switching to atomic mode as an integral operation; however, there's
    nothing inherent tying the two together.
    
    The use cases for percpu_ref have been expanding continuously.  While
    the current init/kill/reinit/exit model can cover a lot, the coupling
    of kill/reinit with atomic/percpu mode switching is turning out to be
    too restrictive for use cases where many percpu_refs are created and
    destroyed back-to-back with only some of them reaching extended
    operation.  The coupling also makes implementing always-atomic debug
    mode difficult.
    
    This patch separates out atomic mode switching into
    percpu_ref_switch_to_atomic() and reimplements
    percpu_ref_kill_and_confirm() on top of it.
    
    * The handling of __PERCPU_REF_ATOMIC and __PERCPU_REF_DEAD is now
      differentiated.  Among get/put operations, percpu_ref_tryget_live()
      is the only one which cares about DEAD.
    
    * percpu_ref_switch_to_atomic() can be called multiple times on the
      same ref.  This means that multiple @confirm_switch may get queued
      up which we can't do reliably without extra memory area.  This is
      handled by making the later invocation synchronously wait for the
      completion of the previous one.  This isn't particularly desirable
      but such synchronous waits shouldn't happen in most cases.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>

commit 27344a9017cdaff82a167827da3001a0918afdc3
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:49 2014 -0400

    percpu_ref: add PCPU_REF_DEAD
    
    percpu_ref will be restructured so that percpu/atomic mode switching
    and reference killing are dedoupled.  In preparation, add
    PCPU_REF_DEAD and PCPU_REF_ATOMIC_DEAD which is OR of ATOMIC and DEAD.
    For now, ATOMIC and DEAD are changed together and all PCPU_REF_ATOMIC
    uses are converted to PCPU_REF_ATOMIC_DEAD without causing any
    behavior changes.
    
    percpu_ref_init() now specifies an explicit alignment when allocating
    the percpu counters so that the pointer has enough unused low bits to
    accomodate the flags.  Note that one flag was fine as min alignment
    for percpu memory is 2 bytes but two flags are already too many for
    the natural alignment of unsigned longs on archs like cris and m68k.
    
    v2: The original patch had BUILD_BUG_ON() which triggers if unsigned
        long's alignment isn't enough to accomodate the flags, which
        triggered on cris and m64k.  percpu_ref_init() updated to specify
        the required alignment explicitly.  Reported by Fengguang.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    Cc: kbuild test robot <fengguang.wu@intel.com>

commit 9e804d1f58da1eca079f796347c1cf1d1df564e2
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:48 2014 -0400

    percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch
    
    percpu_ref will be restructured so that percpu/atomic mode switching
    and reference killing are dedoupled.  In preparation, do the following
    renames.
    
    * percpu_ref->confirm_kill	-> percpu_ref->confirm_switch
    * __PERCPU_REF_DEAD		-> __PERCPU_REF_ATOMIC
    * __percpu_ref_alive()		-> __ref_is_percpu()
    
    This patch is pure rename and doesn't introduce any functional
    changes.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>

commit eecc16ba9a49b05dd847a317af166a6728eb56ca
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:48 2014 -0400

    percpu_ref: replace pcpu_ prefix with percpu_
    
    percpu_ref uses pcpu_ prefix for internal stuff and percpu_ for
    externally visible ones.  This is the same convention used in the
    percpu allocator implementation.  It works fine there but percpu_ref
    doesn't have too much internal-only stuff and scattered usages of
    pcpu_ prefix are confusing than helpful.
    
    This patch replaces all pcpu_ prefixes with percpu_.  This is pure
    rename and there's no functional change.  Note that PCPU_REF_DEAD is
    renamed to __PERCPU_REF_DEAD to signify that the flag is internal.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>

commit 6251f9976af7656b6970a8820153f356430f5de2
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:48 2014 -0400

    percpu_ref: minor code and comment updates
    
    * Some comments became stale.  Updated.
    * percpu_ref_tryget() unnecessarily initializes @ret.  Removed.
    * A blank line removed from percpu_ref_kill_rcu().
    * Explicit function name in a WARN format string replaced with __func__.
    * WARN_ON() in percpu_ref_reinit() converted to WARN_ON_ONCE().
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>

commit a2237370194484ee6aeeff04b617e4b14d178966
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:31:48 2014 -0400

    percpu_ref: relocate percpu_ref_reinit()
    
    percpu_ref is gonna go through restructuring.  Move
    percpu_ref_reinit() after percpu_ref_kill_and_confirm().  This will
    make later changes easier to follow and result in cleaner
    organization.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Kent Overstreet <kmo@daterainc.com>

commit 1aafa57340c6d906a285d7823e0fe68696c1ae07
Author: Wei Xu <xuwei5@hisilicon.com>
Date:   Wed Sep 24 17:07:48 2014 +0800

    ARM: hisi: Fix platmcpm compilation when ARMv6 is selected
    
    When compiling with "ARCH=arm" and "allmodconfig",
    with commit: 9cdc99919a95e8b54c1998b65bb1bfdabd47d27b [2/7] ARM: hisi: enable MCPM implementation
    we will get:
    
       /tmp/cc6DjYjT.s: Assembler messages:
       /tmp/cc6DjYjT.s:63: Error: selected processor does not support ARM mode `ubfx r1,r0,#8,#8'
       /tmp/cc6DjYjT.s:761: Error: selected processor does not support ARM mode `isb '
       /tmp/cc6DjYjT.s:762: Error: selected processor does not support ARM mode `dsb '
       /tmp/cc6DjYjT.s:769: Error: selected processor does not support ARM mode `isb '
       /tmp/cc6DjYjT.s:775: Error: selected processor does not support ARM mode `isb '
       /tmp/cc6DjYjT.s:776: Error: selected processor does not support ARM mode `dsb '
       /tmp/cc6DjYjT.s:795: Error: selected processor does not support ARM mode `isb '
       /tmp/cc6DjYjT.s:801: Error: selected processor does not support ARM mode `isb '
       /tmp/cc6DjYjT.s:802: Error: selected processor does not support ARM mode `dsb '
    
    Fix platmcpm compilation when ARMv6 is selected.
    
    Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit 495aa2e1facebf770224c33aae2a01541889e54e
Author: Liviu Dudau <Liviu.Dudau@arm.com>
Date:   Tue Sep 23 20:01:14 2014 +0100

    arm64: Add architectural support for PCI
    
    Use the generic PCI domain and OF functions to provide support for PCI
    on arm64.
    
    [bhelgaas: Change comments to use generic PCI, not just PCIe.  Nothing at
    this level is PCIe-specific.]
    Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>

commit b766eafe68281a12169f9eb8c06fd80d2e7897d7
Author: Liviu Dudau <Liviu.Dudau@arm.com>
Date:   Tue Sep 23 20:01:13 2014 +0100

    PCI: Add pci_remap_iospace() to map bus I/O resources
    
    Add pci_remap_iospace() to map bus I/O resources into the CPU virtual
    address space.  Architectures with special needs may provide their own
    version, but most should be able to use this one.
    
    This function is useful for PCI host bridge drivers that need to map the
    PCI I/O resources into virtual memory space.
    
    [bhelgaas: phys_addr description, drop temporary "err" variable]
    Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Rob Herring <robh@kernel.org>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    CC: Arnd Bergmann <arnd@arndb.de>

commit 5a4f662d44116022e3378b32a1a79b718e110096
Author: Liviu Dudau <Liviu.Dudau@arm.com>
Date:   Tue Sep 23 20:01:12 2014 +0100

    PCI: Assign unassigned bus resources in pci_scan_root_bus()
    
    If the firmware has not assigned all the bus resources and we are not just
    probing the PCI buses, it makes sense to assign the unassigned resources
    in pci_scan_root_bus().
    
    Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: Arnd Bergmann <arnd@arndb.de>
    CC: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
    CC: Rob Herring <robh+dt@kernel.org>

commit 07428b62addb624a3b51b6e8a213fec9b97dbfd0
Author: Liviu Dudau <Liviu.Dudau@arm.com>
Date:   Wed Sep 24 11:27:33 2014 -0600

    of/pci: Add support for parsing PCI host bridge resources from DT
    
    Provide a function to parse the PCI DT ranges that can be used to create a
    pci_host_bridge structure together with its associated bus.
    
    Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
    [make io_base parameter optional]
    Signed-off-by: Robert Richter <rrichter@cavium.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: Arnd Bergmann <arnd@arndb.de>
    CC: Grant Likely <grant.likely@linaro.org>
    CC: Rob Herring <robh+dt@kernel.org>
    CC: Catalin Marinas <catalin.marinas@arm.com>

commit 9eca80461a45177e456219a9cd944c27675d6512
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Sep 24 13:07:33 2014 -0400

    Revert "blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe"
    
    This reverts commit 0a30288da1aec914e158c2d7a3482a85f632750f, which
    was a temporary fix for SCSI blk-mq stall issue.  The following
    patches will fix the issue properly by introducing atomic mode to
    percpu_ref.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Cc: Kent Overstreet <kmo@daterainc.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Christoph Hellwig <hch@lst.de>

commit 83417b202bf5aa2cd82637aa81aa9f065cfe5f21
Author: Wolfram Sang <wsa@the-dreams.de>
Date:   Mon Sep 22 19:41:00 2014 +0200

    i2c: move acpi code back into the core
    
    Commit 5d98e61d337c ("I2C/ACPI: Add i2c ACPI operation region support")
    renamed the i2c-core module. This may cause regressions for
    distributions, so put the ACPI code back into the core.
    
    Reported-by: Jean Delvare <jdelvare@suse.de>
    Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
    Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
    Cc: Lan Tianyu <tianyu.lan@intel.com>

commit 964356938fcd3c0001a786f55b9f0a0fbe47656a
Author: Andreas Werner <andreas.werner@men.de>
Date:   Wed Aug 27 19:53:06 2014 +0200

    hwmon: (menf21bmc) Introduce MEN14F021P00 BMC HWMON driver
    
    Added driver to support the 14F021P00 BMC Hardware Monitoring.
    The BMC is a Board Management Controller including monitoring of the
    board voltages.
    
    Signed-off-by: Andreas Werner <andreas.werner@men.de>
    Reviewed-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Lee Jones <lee.jones@linaro.org>

commit 38433639af915deeb0b0e28462dd740ce57b72fd
Author: Andreas Werner <andreas.werner@men.de>
Date:   Wed Aug 27 19:52:36 2014 +0200

    leds: leds-menf21bmc: Introduce MEN 14F021P00 BMC LED driver
    
    Added driver to support the 14F021P00 BMC LEDs.
    The BMC is a Board Management Controller including four LEDs which
    can be switched on and off.
    
    Signed-off-by: Andreas Werner <andreas.werner@men.de>
    Acked-by: Bryan Wu <cooloney@gmail.com>
    Signed-off-by: Lee Jones <lee.jones@linaro.org>

commit 5033263992eece84e19946d2cab940c86ec862ba
Author: Andreas Werner <andreas.werner@men.de>
Date:   Wed Aug 27 19:52:06 2014 +0200

    watchdog: menf21bmc_wdt: Introduce MEN 14F021P00 BMC Watchdog driver
    
    Added driver to support the 14F021P00 BMC Watchdog.
    The BMC is a Board Management Controller including watchdog functionality.
    
    Signed-off-by: Andreas Werner <andreas.werner@men.de>
    Reviewed-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Lee Jones <lee.jones@linaro.org>

commit d6cc1f5824cbca392d099f3bb0c441efd9e54de9
Author: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Date:   Fri Sep 12 08:54:00 2014 +0200

    Documentation: charger: max14577: Document exported sysfs entry
    
    Document the 'fast charge timer' setting exported by max14577 driver
    through sysfs entry.
    
    Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Acked-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lee Jones <lee.jones@linaro.org>

commit 8d70d68d7a1b3082ca5a3808be18103a83ae348d
Author: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Date:   Fri Sep 12 08:53:59 2014 +0200

    devicetree: mfd: max14577: Add device tree bindings document
    
    Add document describing device tree bindings for MAX14577 MFD
    drivers: MFD core, extcon, regulator and charger.
    
    Both MAX14577 and MAX77836 chipsets are documented.
    
    Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Reviewed-by: Tomasz Figa <t.figa@samsung.com>
    Acked-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lee Jones <lee.jones@linaro.org>

commit 2c33e9296202cd11bf2e2f801b69ffba0953748a
Author: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Date:   Fri Sep 12 08:53:58 2014 +0200

    power: max17040: Add ID for MAX77836 Fuel Gauge block
    
    MAX77836 has the same Fuel Gauge as MAX17040/17048. The max17040 driver
    can be safely re-used. The patch adds MAX77836 device to the array of
    i2c_device_id. Additionally it removes the id associated with MAX17040
    device as the value is not used.
    
    Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Acked-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lee Jones <lee.jones@linaro.org>

commit e30110e9c96f48aea01abc3e6dfadb369cbafec3
Author: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Date:   Fri Sep 12 08:53:57 2014 +0200

    charger: max14577: Configure battery-dependent settings from DTS and sysfs
    
    Remove hard-coded values for:
     - Fast Charge current,
     - End Of Charge current,
     - Fast Charge timer,
     - Overvoltage Protection Threshold,
     - Battery Constant Voltage,
    and use DTS or sysfs to configure them. This allows using the max14577 charger
    driver with different batteries.
    
    Now the charger driver requires valid configuration data from DTS. In
    case of wrong configuration data it fails during probe.
    
    The fast charge timer is configured through sysfs entry.
    
    Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Acked-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lee Jones <lee.jones@linaro.org>

commit b8f139f68f2099b7f8b4ef470a1e53210e3aa025
Author: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Date:   Fri Sep 12 08:53:56 2014 +0200

    regulator/mfd: max14577: Export symbols for calculating charger current
    
    This patch prepares for changing the max14577 charger driver to allow
    configuring battery-dependent settings from DTS.
    
    The patch moves from regulator driver to MFD core driver and exports:
     - function for calculating register value for charger's current;
     - table of limits for chargers (MAX14577, MAX77836).
    
    Previously they were used only by the max14577 regulator driver. In next
    patch the charger driver will use them as well. Exporting them will
    reduce unnecessary code duplication.
    
    Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Acked-by: Mark Brown <broonie@linaro.org>
    Acked-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lee Jones <lee.jones@linaro.org>

commit 3682a8ee87f9107253e51733f42da10160ce41e3
Author: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Date:   Fri Sep 12 08:53:55 2014 +0200

    charger: max14577: Add support for MAX77836 charger
    
    Add support for MAX77836 charger to the max14577 driver. The MAX77836
    charger is almost the same as 14577 model except:
     - No dead-battery detection;
     - Support for special charger (like in MAX77693);
     - Support for DX over-voltage protection (like in MAX77693);
     - Lower values of charging current (two times lower current for
       slow/fast charge, much lower EOC current);
     - Slightly different values in ChgTyp field of STATUS2 register. On
       MAX14577 0x6 is reserved and 0x7 dead battery. On the MAX77836 the
       0x6 means special charger and 0x7 is reserved. Regardless of these
       differences the driver maps them to one enum max14577_muic_charger_type.
    
    Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
    Acked-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lee Jones <lee.jones@linaro.org>

commit 2f4096e311ef0922c42cbf7bc5df44efb3aff716
Author: Quentin Lambert <lambert.quentin@gmail.com>
Date:   Sun Sep 7 20:04:28 2014 +0200

    PCI: Remove assignment from complicated "if" conditions
    
    The modifications effectively change the value of len_tmp
    in the case where the first condition is not met.
    
    Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 79e50e72986c9fcb06d707ce587cfd24fefa33e3
Author: Quentin Lambert <lambert.quentin@gmail.com>
Date:   Sun Sep 7 20:03:32 2014 +0200

    PCI: Remove assignment from "if" conditions
    
    The following Coccinelle semantic patch was used to find and correct cases
    of assignments in "if" conditions:
    
    @@
    expression var, expr;
    statement S;
    @@
    
    + var = expr;
      if(
    - (var = expr)
    + var
      ) S
    
    Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 656f978f9af9d8d77436e8159f51f7aa1e673309
Author: Quentin Lambert <lambert.quentin@gmail.com>
Date:   Sun Sep 7 20:02:47 2014 +0200

    PCI: Remove unnecessary curly braces
    
    Remove curly braces in simple "if" cases.
    
    No functional change.
    
    Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit e0c524049f8279d00d2fbd4748b03234a2726fdd
Author: Santosh Shilimkar <santosh.shilimkar@ti.com>
Date:   Thu Jul 10 11:30:08 2014 -0400

    MAINTAINERS: Add Keystone Multicore Navigator drivers entry
    
    Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

commit 88139ed030583557751e279968e13e892ae10825
Author: Santosh Shilimkar <santosh.shilimkar@ti.com>
Date:   Sun Mar 30 17:29:04 2014 -0400

    soc: ti: add Keystone Navigator DMA support
    
    The Keystone Navigator DMA driver sets up the dma channels and flows for
    the QMSS(Queue Manager SubSystem) who triggers the actual data movements
    across clients using destination queues. Every client modules like
    NETCP(Network Coprocessor), SRIO(Serial Rapid IO) and CRYPTO
    Engines has its own instance of packet dma hardware. QMSS has also
    an internal packet DMA module which is used as an infrastructure
    DMA with zero copy.
    
    Initially this driver was proposed as DMA engine driver but since the
    hardware is not typical DMA engine and hence doesn't comply with typical
    DMA engine driver needs, that approach was naked. Link to that
    discussion -
    	https://lkml.org/lkml/2014/3/18/340
    
    As aligned, now we pair the Navigator DMA with its companion Navigator
    QMSS subsystem driver.
    
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Kumar Gala <galak@codeaurora.org>
    Cc: Olof Johansson <olof@lixom.net>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Grant Likely <grant.likely@linaro.org>
    Cc: Rob Herring <robh+dt@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Sandeep Nair <sandeep_n@ti.com>
    Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

commit 8172296d8717be1951da4bb4feb2700a60e8cdde
Author: Santosh Shilimkar <santosh.shilimkar@ti.com>
Date:   Sun Mar 30 17:29:04 2014 -0400

    Documentation: dt: soc: add Keystone Navigator DMA bindings
    
    The Keystone Navigator DMA driver sets up the dma channels and flows for
    the QMSS(Queue Manager SubSystem) who triggers the actual data movements
    across clients using destination queues. Every client modules like
    NETCP(Network Coprocessor), SRIO(Serial Rapid IO) and CRYPTO
    Engines has its own instance of packet dma hardware. QMSS has also
    an internal packet DMA module which is used as an infrastructure
    DMA with zero copy.
    
    Initially this driver was proposed as DMA engine driver but since the
    hardware is not typical DMA engine and hence doesn't comply with typical
    DMA engine driver needs, that approach was naked. Link to that
    discussion -
    	https://lkml.org/lkml/2014/3/18/340
    
    As aligned, now we pair the Navigator DMA with its companion Navigator
    QMSS subsystem driver.
    
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Kumar Gala <galak@codeaurora.org>
    Cc: Olof Johansson <olof@lixom.net>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Grant Likely <grant.likely@linaro.org>
    Cc: Rob Herring <robh+dt@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Sandeep Nair <sandeep_n@ti.com>
    Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

commit 41f93af900a20d1a0a358b522b5129c89677e9dc
Author: Sandeep Nair <sandeep_n@ti.com>
Date:   Fri Feb 28 10:47:50 2014 -0500

    soc: ti: add Keystone Navigator QMSS driver
    
    The QMSS (Queue Manager Sub System) found on Keystone SOCs is one of
    the main hardware sub system which forms the backbone of the Keystone
    Multi-core Navigator. QMSS consist of queue managers, packed-data structure
    processors(PDSP), linking RAM, descriptor pools and infrastructure
    Packet DMA.
    
    The Queue Manager is a hardware module that is responsible for accelerating
    management of the packet queues. Packets are queued/de-queued by writing or
    reading descriptor address to a particular memory mapped location. The PDSPs
    perform QMSS related functions like accumulation, QoS, or event management.
    Linking RAM registers are used to link the descriptors which are stored in
    descriptor RAM. Descriptor RAM is configurable as internal or external memory.
    
    The QMSS driver manages the PDSP setups, linking RAM regions,
    queue pool management (allocation, push, pop and notify) and descriptor
    pool management. The specifics on the device tree bindings for
    QMSS can be found in:
    	Documentation/devicetree/bindings/soc/keystone-navigator-qmss.txt
    
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Kumar Gala <galak@codeaurora.org>
    Cc: Olof Johansson <olof@lixom.net>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Grant Likely <grant.likely@linaro.org>
    Cc: Rob Herring <robh+dt@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Sandeep Nair <sandeep_n@ti.com>
    Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

commit a4dfb8c41043dd6c2b9defbe846c44389c4b6f02
Author: Sandeep Nair <sandeep_n@ti.com>
Date:   Fri Feb 28 10:47:50 2014 -0500

    Documentation: dt: soc: add Keystone Navigator QMSS bindings
    
    The QMSS (Queue Manager Sub System) found on Keystone SOCs is one of
    the main hardware sub system which forms the backbone of the Keystone
    Multi-core Navigator. QMSS consist of queue managers, packed-data structure
    processors(PDSP), linking RAM, descriptor pools and infrastructure
    Packet DMA.
    
    The Queue Manager is a hardware module that is responsible for accelerating
    management of the packet queues. Packets are queued/de-queued by writing or
    reading descriptor address to a particular memory mapped location. The PDSPs
    perform QMSS related functions like accumulation, QoS, or event management.
    Linking RAM registers are used to link the descriptors which are stored in
    descriptor RAM. Descriptor RAM is configurable as internal or external memory.
    
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Kumar Gala <galak@codeaurora.org>
    Cc: Olof Johansson <olof@lixom.net>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Grant Likely <grant.likely@linaro.org>
    Cc: Rob Herring <robh+dt@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Sandeep Nair <sandeep_n@ti.com>
    Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

commit 382a9c9adc1cd540f5b714b65db315fc1c0b553d
Author: Quentin Lambert <lambert.quentin@gmail.com>
Date:   Sun Sep 7 20:02:04 2014 +0200

    PCI: Add space before open parenthesis
    
    Add space before open parenthesis as is conventional.
    
    No functional change.
    
    [bhelgaas: fix a few more in ibmphp, shpchp]
    Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit fe6ce32189136370b3d236d1c32ab812b93cd0d8
Author: Yijing Wang <wangyijing@huawei.com>
Date:   Wed Sep 24 11:09:59 2014 +0800

    PCI/MSI: Rename __read_msi_msg() to read_msi_msg()
    
    Rename __read_msi_msg() to read_msi_msg().
    
    No functional change.
    
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit c2eb2dc3461e522d6452e1f74e36194633d4db0d
Author: Yijing Wang <wangyijing@huawei.com>
Date:   Wed Sep 24 11:09:53 2014 +0800

    PCI/MSI: Remove unused read_msi_msg()
    
    Now no one uses read_msi_msg(), so remove it.
    
    No functional change.
    
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 11c013fd8afbc7c6b2e15608ed5fc2c249552998
Author: Yijing Wang <wangyijing@huawei.com>
Date:   Wed Sep 24 11:09:45 2014 +0800

    MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg()
    
    rtas_setup_msi_irqs() already has the struct msi_desc pointer required by
    __read_msi_msg(), so call it directly instead of having read_msi_msg() look
    it up from the IRQ.
    
    No functional change.
    
    [bhelgaas: changelog]
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au>
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    CC: linuxppc-dev@lists.ozlabs.org

commit eeeda4cd06e828b331b15741a204ff9f5874d28d
Author: Ben Hutchings <ben@decadent.org.uk>
Date:   Wed Sep 24 13:30:12 2014 +0100

    x86/relocs: Make per_cpu_load_addr static
    
    per_cpu_load_addr is only used for 64-bit relocations, but is
    declared in both configurations of relocs.c - with different
    types.  This has undefined behaviour in general.  GNU ld is
    documented to use the larger size in this case, but other tools
    may differ and some warn about this.
    
    References: https://bugs.debian.org/748577
    Reported-by: Michael Tautschnig <mt@debian.org>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    Cc: 748577@bugs.debian.org
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: http://lkml.kernel.org/r/1411561812.3659.23.camel@decadent.org.uk
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

commit 212be3b2320bcf33eff648bc4e1f0edbf4d90acf
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Sun Sep 21 20:42:32 2014 +0200

    x86/lib/Makefile: Remove the unnecessary "+= thunk_64.o"
    
    Trivial. We have "lib-y += thunk_$(BITS).o" at the start, no
    need to add thunk_64.o if !CONFIG_X86_32.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Andy Lutomirski <luto@amacapital.net>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: http://lkml.kernel.org/r/20140921184232.GB23727@redhat.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

commit 0ad6e3c5199be12c9745da8f8b9e3c9f8066c235
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Sun Sep 21 20:41:53 2014 +0200

    x86: Speed up ___preempt_schedule*() by using THUNK helpers
    
    ___preempt_schedule() does SAVE_ALL/RESTORE_ALL but this is
    suboptimal, we do not need to save/restore the callee-saved
    register. And we already have arch/x86/lib/thunk_*.S which
    implements the similar asm wrappers, so it makes sense to
    redefine ___preempt_schedule() as "THUNK ..." and remove
    preempt.S altogether.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Reviewed-by: Andy Lutomirski <luto@amacapital.net>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: http://lkml.kernel.org/r/20140921184153.GA23727@redhat.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

commit 03bd4e1f7265548832a76e7919a81f3137c44fd1
Author: Wanpeng Li <wanpeng.li@linux.intel.com>
Date:   Wed Sep 24 16:38:05 2014 +0800

    sched: Fix unreleased llc_shared_mask bit during CPU hotplug
    
    The following bug can be triggered by hot adding and removing a large number of
    xen domain0's vcpus repeatedly:
    
    	BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group
    	PGD 5a9d5067 PUD 13067 PMD 0
    	Oops: 0000 [#3] SMP
    	[...]
    	Call Trace:
    	load_balance
    	? _raw_spin_unlock_irqrestore
    	idle_balance
    	__schedule
    	schedule
    	schedule_timeout
    	? lock_timer_base
    	schedule_timeout_uninterruptible
    	msleep
    	lock_device_hotplug_sysfs
    	online_store
    	dev_attr_store
    	sysfs_write_file
    	vfs_write
    	SyS_write
    	system_call_fastpath
    
    Last level cache shared mask is built during CPU up and the
    build_sched_domain() routine takes advantage of it to setup
    the sched domain CPU topology.
    
    However, llc_shared_mask is not released during CPU disable,
    which leads to an invalid sched domainCPU topology.
    
    This patch fix it by releasing the llc_shared_mask correctly
    during CPU disable.
    
    Yasuaki also reported that this can happen on real hardware:
    
      https://lkml.org/lkml/2014/7/22/1018
    
    His case is here:
    
    	==
    	Here is an example on my system.
    	My system has 4 sockets and each socket has 15 cores and HT is
    	enabled. In this case, each core of sockes is numbered as
    	follows:
    
    		 | CPU#
    	Socket#0 | 0-14 , 60-74
    	Socket#1 | 15-29, 75-89
    	Socket#2 | 30-44, 90-104
    	Socket#3 | 45-59, 105-119
    
    	Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
    
    	It means that last level cache of Socket#2 is shared with
    	CPU#30-44 and 90-104.
    
    	When hot-removing socket#2 and #3, each core of sockets is
    	numbered as follows:
    
    		 | CPU#
    	Socket#0 | 0-14 , 60-74
    	Socket#1 | 15-29, 75-89
    
    	But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30
    	remains having 0x3fff80000001fffc0000000.
    
    	After that, when hot-adding socket#2 and #3, each core of
    	sockets is numbered as follows:
    
    		 | CPU#
    	Socket#0 | 0-14 , 60-74
    	Socket#1 | 15-29, 75-89
    	Socket#2 | 30-59
    	Socket#3 | 90-119
    
    	Then llc_shared_mask of CPU#30 becomes
    	0x3fff8000fffffffc0000000. It means that last level cache of
    	Socket#2 is shared with CPU#30-59 and 90-104. So the mask has
    	the wrong value.
    
    Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
    Tested-by: Linn Crosetto <linn@hp.com>
    Reviewed-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Toshi Kani <toshi.kani@hp.com>
    Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
    Cc: <stable@vger.kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Steven Rostedt <srostedt@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

commit 24832b4de315ad00e5430a53772750dfcf18514d
Author: Minghuan Lian <Minghuan.Lian@freescale.com>
Date:   Tue Sep 23 22:28:59 2014 +0800

    PCI: designware: Add get_msi_data() to pcie_host_ops
    
    Add a struct pcie_host_ops .get_msi_data() method for platforms to return
    their special MSI message data.
    
    Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Mohit KUMAR <mohit.kumar@st.com>

commit ee1b5b165c0a2f04d2107e634e51f05d0eb107de
Author: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Date:   Wed Sep 24 00:26:24 2014 +0100

    x86/intel/quark: Switch off CR4.PGE so TLB flush uses CR3 instead
    
    Quark x1000 advertises PGE via the standard CPUID method
    PGE bits exist in Quark X1000's PTEs. In order to flush
    an individual PTE it is necessary to reload CR3 irrespective
    of the PTE.PGE bit.
    
    See Quark Core_DevMan_001.pdf section 6.4.11
    
    This bug was fixed in Galileo kernels, unfixed vanilla kernels are expected to
    crash and burn on this platform.
    
    Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: <stable@vger.kernel.org>
    Link: http://lkml.kernel.org/r/1411514784-14885-1-git-send-email-pure.logic@nexus-software.ie
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

commit 450e344e421b9f555261a2d97952d9e71d4cb082
Author: Minghuan Lian <Minghuan.Lian@freescale.com>
Date:   Tue Sep 23 22:28:58 2014 +0800

    PCI: designware: Rename get_msi_data() to get_msi_addr()
    
    The struct pcie_host_ops .get_msi_data() method returns the MSI message
    address.  To accurately express its purpose, rename it to .get_msi_addr().
    
    Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Mohit KUMAR <mohit.kumar@st.com>

commit 0c61ea77cceafd1134225099961c2df0866b500f
Author: Minghuan Lian <Minghuan.Lian@fre…
aryabinin referenced this pull request in aryabinin/linux Oct 3, 2014
GIT 6fe676b243e5a0cb4cc4d9a4b094de8db0cdbf74

commit e500f488c27659bb6f5d313b336621f3daa67701
Author: Fabian Frederick <fabf@skynet.be>
Date:   Wed Oct 1 06:52:06 2014 +0200

    net/dccp/ccid.c: add __init to ccid_activate
    
    ccid_activate is only called by __init ccid_initialize_builtins in same module.
    
    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0c5b8a46294d43fc63788839d3c18de0961ec1bc
Author: Fabian Frederick <fabf@skynet.be>
Date:   Wed Oct 1 06:48:03 2014 +0200

    net/dccp/proto.c: add __init to dccp_mib_init
    
    dccp_mib_init is only called by __init dccp_init in same module.
    
    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 082f58ac4a48d3f5cb4597232cb2ac6823a96f43
Author: Quinn Tran <quinn.tran@qlogic.com>
Date:   Thu Sep 25 06:22:28 2014 -0400

    target: Fix queue full status NULL pointer for SCF_TRANSPORT_TASK_SENSE
    
    During temporary resource starvation at lower transport layer, command
    is placed on queue full retry path, which expose this problem.  The TCM
    queue full handling of SCF_TRANSPORT_TASK_SENSE currently sends the same
    cmd twice to lower layer.  The 1st time led to cmd normal free path.
    The 2nd time cause Null pointer access.
    
    This regression bug was originally introduced v3.1-rc code in the
    following commit:
    
    commit e057f53308a5f071556ee80586b99ee755bf07f5
    Author: Christoph Hellwig <hch@infradead.org>
    Date:   Mon Oct 17 13:56:41 2011 -0400
    
        target: remove the transport_qf_callback se_cmd callback
    
    Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
    Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
    Cc: <stable@vger.kernel.org> # v3.1+
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit db3a99b9921f27fe71ca8c0f218ee810e0e7fb69
Author: Joern Engel <joern@logfs.org>
Date:   Tue Sep 16 16:23:19 2014 -0400

    qla_target: rearrange struct qla_tgt_prm
    
    On most (non-x86) 64bit platforms this will remove 8 padding bytes
    from the structure.
    
    Signed-off-by: Joern Engel <joern@logfs.org>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit f9b6721a9cef94908467abf7a2cacbd15a7d23cb
Author: Joern Engel <joern@logfs.org>
Date:   Tue Sep 16 16:23:18 2014 -0400

    qla_target: improve qlt_unmap_sg()
    
    Remove the inline attribute.  Modern compilers ignore it and the
    function has grown beyond where inline made sense anyway.
    Remove the BUG_ON(!cmd->sg_mapped), and instead return if sg_mapped is
    not set.  Every caller is doing this check, so we might as well have it
    in one place instead of four.
    
    Signed-off-by: Joern Engel <joern@logfs.org>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit 55a9066fffd2f533e7ed434b072469ef09d6c476
Author: Joern Engel <joern@logfs.org>
Date:   Tue Sep 16 16:23:15 2014 -0400

    qla_target: make some global functions static
    
    Also removes the declarations from the header - including two
    declarations without function definitions or callers.
    
    Signed-off-by: Joern Engel <joern@logfs.org>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit c57010420654aca179c500f61e86315a337244ca
Author: Joern Engel <joern@logfs.org>
Date:   Tue Sep 16 16:23:14 2014 -0400

    qla_target: remove unused parameter
    
    Signed-off-by: Joern Engel <joern@logfs.org>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit f81ccb489a7a641c1bed41b49cf8d72c199c68d5
Author: Joern Engel <joern@logfs.org>
Date:   Tue Sep 16 16:23:13 2014 -0400

    target: simplify core_tmr_abort_task
    
    list_for_each_entry_safe is necessary if list objects are deleted from
    the list while traversing it.  Not the case here, so we can use the base
    list_for_each_entry variant.
    
    Signed-off-by: Joern Engel <joern@logfs.org>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit 33940d09937276cd3c81f2874faf43e37c2db0e2
Author: Joern Engel <joern@logfs.org>
Date:   Tue Sep 16 16:23:12 2014 -0400

    target: encapsulate smp_mb__after_atomic()
    
    The target code has a rather generous helping of smp_mb__after_atomic()
    throughout the code base.  Most atomic operations were followed by one
    and none were preceded by smp_mb__before_atomic(), nor accompanied by a
    comment explaining the need for a barrier.
    
    Instead of trying to prove for every case whether or not it is needed,
    this patch introduces atomic_inc_mb() and atomic_dec_mb(), which
    explicitly include the memory barriers before and after the atomic
    operation.  For now they are defined in a target header, although they
    could be of general use.
    
    Most of the existing atomic/mb combinations were replaced by the new
    helpers.  In a few cases the atomic was sandwiched in
    spin_lock/spin_unlock and I simply removed the barrier.
    
    I suspect that in most cases the correct conversion would have been to
    drop the barrier.  I also suspect that a few cases exist where a) the
    barrier was necessary and b) a second barrier before the atomic would
    have been necessary and got added by this patch.
    
    Signed-off-by: Joern Engel <joern@logfs.org>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit 74ed7e62289dc6d388996d7c8f89c2e7e95b9657
Author: Joern Engel <joern@logfs.org>
Date:   Tue Sep 16 16:23:11 2014 -0400

    target: remove some smp_mb__after_atomic()s
    
    atomic_inc_return() already does an implicit memory barrier and the
    second case was moved from an atomic to a plain flag operation.  If a
    barrier were needed in the second case, it would have to be smp_mb(),
    not a variant optimized away for x86 and other architectures.
    
    Signed-off-by: Joern Engel <joern@logfs.org>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit 8f83269048628d7b139dacbfc6cc97befcbdd2e9
Author: Joern Engel <joern@logfs.org>
Date:   Tue Sep 16 16:23:10 2014 -0400

    target: simplify core_tmr_release_req()
    
    And while at it, do minimal coding style fixes in the area.
    
    Signed-off-by: Joern Engel <joern@logfs.org>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit 9c7d6154bc4b9dfefd580490cdca5f7c72321464
Author: Andy Grover <agrover@redhat.com>
Date:   Mon Jun 30 16:39:46 2014 -0700

    target: Remove core_tpg_release_virtual_lun0 function
    
    Simple and just called from one place.
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Andy Grover <agrover@redhat.com>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit cd9d7cbaec8b622eee4edcd8bf481c4047f74915
Author: Andy Grover <agrover@redhat.com>
Date:   Mon Jun 30 16:39:44 2014 -0700

    target: Change core_dev_del_lun to take a se_lun instead of unpacked_lun
    
    Remove core_tpg_pre_dellun entirely, since we don't need to get/check
    a pointer we already have.
    
    Nothing else can return an error, so core_dev_del_lun can return void.
    
    Rename core_tpg_post_dellun to remove_lun - a clearer name, now that
    pre_dellun is gone.
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Andy Grover <agrover@redhat.com>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit cc83881f2c57caaf4b14adaffa65595640a59661
Author: Andy Grover <agrover@redhat.com>
Date:   Mon Jun 30 16:39:43 2014 -0700

    target: core_tpg_post_dellun can return void
    
    Nothing in it can raise an error.
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Andy Grover <agrover@redhat.com>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

commit 49be17235c0acd96f2ff0fe282867fe3a83f554c
Author: hayeswang <hayeswang@realtek.com>
Date:   Wed Oct 1 13:25:11 2014 +0800

    r8152: disable power cut for RTL8153
    
    The firmware would be clear when the power cut is enabled for
    RTL8153.
    
    Signed-off-by: Hayes Wang <hayeswang@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 204c8704128943bf3f8b605f4b40bdc2b6bd89dc
Author: hayeswang <hayeswang@realtek.com>
Date:   Wed Oct 1 13:25:10 2014 +0800

    r8152: remove clearing bp
    
    The xxx_clear_bp() is used to halt the firmware. It only necessary
    for updating the new firmware. Besides, depend on the version of
    the current firmware, it may have problem to halt the firmware
    directly. Finally, halt the firmware would let the firmware code
    useless, and the bugs which are fixed by the firmware would occur.
    
    Signed-off-by: Hayes Wang <hayeswang@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit aa55c8e2f7a395dfc9e67fc6637321e19ce9bfe1
Author: Masahiro Yamada <yamada.m@jp.panasonic.com>
Date:   Tue Sep 9 20:02:24 2014 +0900

    kbuild: handle C=... and M=... after entering into build directory
    
    This commit avoids processing C=... and M=... twice
    when O=... is also given.
    
    Besides, we can also remove KBUILD_EXTMOD="$(KBUILD_EXTMOD)"
    in the sub-make target.
    
    Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>
    Acked-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Michal Marek <mmarek@suse.cz>

commit 745a254322c898dadf019342cd7140f7867d2d0f
Author: Masahiro Yamada <yamada.m@jp.panasonic.com>
Date:   Tue Sep 9 20:02:23 2014 +0900

    kbuild: use $(Q) for sub-make target
    
    Since commit 066b7ed9558087a7957a1128f27d7a3462ff117f
    (kbuild: Do not print the build directory with make -s),
    "Q" is defined above the sub-make target.
    
    This commit takes advantage of that and replaces
    "$(if $(KBUILD_VERBOSE:1=),@)" with "$(Q)".
    
    Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>
    Acked-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Michal Marek <mmarek@suse.cz>

commit 7ff525712acf9325e9acdb27bbc93049ea2e850c
Author: Masahiro Yamada <yamada.m@jp.panasonic.com>
Date:   Tue Sep 9 20:02:22 2014 +0900

    kbuild: fake the "Entering directory ..." message more simply
    
    Commit c2e28dc975ea87feed84415006ae143424912ac7
    (kbuild: Print the name of the build directory)
    added a gimmick to show the "Entering directory ...".
    
    Instead of echoing the hard-coded message (that is, we need to know
    the exact message), moving --no-print-directory would be easier.
    
    Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>
    Acked-by: Peter Foley <pefoley2@pefoley.com>
    Signed-off-by: Michal Marek <mmarek@suse.cz>

commit 1b0ecb28b0cc216535ce6477d39aa610c3ff68a1
Author: Vlad Yasevich <vyasevich@gmail.com>
Date:   Tue Sep 30 19:39:37 2014 -0400

    bnx2: Correctly receive full sized 802.1ad fragmes
    
    This driver, similar to tg3, has a check that will
    cause full sized 802.1ad frames to be dropped.  The
    frame will be larger then the standard mtu due to the
    presense of vlan header that has not been stripped.
    The driver should not drop this frame and should process
    it just like it does for 802.1q.
    
    CC: Sony Chacko <sony.chacko@qlogic.com>
    CC: Dept-HSGLinuxNICDev@qlogic.com
    Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 7d3083ee36b51e425b6abd76778a2046906b0fd3
Author: Vlad Yasevich <vyasevich@gmail.com>
Date:   Tue Sep 30 19:39:36 2014 -0400

    tg3: Allow for recieve of full-size 8021AD frames
    
    When receiving a vlan-tagged frame that still contains
    a vlan header, the length of the packet will be greater
    then MTU+ETH_HLEN since it will account of the extra
    vlan header.  TG3 checks this for the case for 802.1Q,
    but not for 802.1ad.  As a result, full sized 802.1ad
    frames get dropped by the card.
    
    Add a check for 802.1ad protocol when receving full
    sized frames.
    
    Suggested-by: Prashant Sreedharan <prashant@broadcom.com>
    CC: Prashant Sreedharan <prashant@broadcom.com>
    CC: Michael Chan <mchan@broadcom.com>
    Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 1e918876853aa85435e0f17fd8b4a92dcfff53d6
Author: Florian Westphal <fw@strlen.de>
Date:   Wed Oct 1 13:38:03 2014 +0200

    r8169: add support for Byte Queue Limits
    
    tested on RTL8168d/8111d model using 'super_netperf 40' with TCP/UDP_STREAM.
    
    Output of
    while true; do
        for n in inflight limit; do
              echo -n $n\ ; cat $n;
        done;
        sleep 1;
    done
    
    during netperf run, 100mbit peer:
    
    inflight 0
    limit 3028
    inflight 6056
    limit 4542
    
    [ trimmed output for brevity, no limit/inflight changes during
      test steady-state ]
    
    limit 4542
    inflight 3028
    limit 6122
    inflight 0
    limit 6122
    [ changed cable to 1gbit peer, restart netperf ]
    inflight 37850
    limit 36336
    inflight 33308
    limit 31794
    inflight 33308
    limit 31794
    inflight 27252
    limit 25738
    [ again, no changes during test ]
    inflight 27252
    limit 25738
    inflight 0
    limit 28766
    [ change cable to 100mbit peer, restart netperf ]
    limit 28766
    inflight 27370
    limit 28766
    inflight 4542
    limit 5990
    inflight 6056
    limit 4542
    [ .. ]
    inflight 6056
    limit 4542
    inflight 0
    
    [end of test]
    
    Cc: Francois Romieu <romieu@fr.zoreil.com>
    Cc: Hayes Wang <hayeswang@realtek.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Tom Herbert <therbert@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit d0bf4a9e92b9a93ffeeacbd7b6cb83e0ee3dc2ef
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Sep 29 13:29:15 2014 -0700

    net: cleanup and document skb fclone layout
    
    Lets use a proper structure to clearly document and implement
    skb fast clones.
    
    Then, we might experiment more easily alternative layouts.
    
    This patch adds a new skb_fclone_busy() helper, used by tcp and xfrm,
    to stop leaking of implementation details.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 0f1ca65ee50df042051e8fa3a14f73b0c71d45b9
Author: Arianna Avanzini <avanzini.arianna@gmail.com>
Date:   Fri Aug 22 13:20:02 2014 +0200

    xen, blkfront: factor out flush-related checks from do_blkif_request()
    
    This commit factors out some checks related to the request insertion
    path, which can be done in an function instead of by itself.
    
    Reviewed-by: David Vrabel <david.vrabel@citrix.com>
    Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

commit 61cecca865280bef4f8a9748d0a9afa5df351ac2
Author: Roger Pau Monné <roger.pau@citrix.com>
Date:   Mon Sep 15 11:55:27 2014 +0200

    xen-blkback: fix leak on grant map error path
    
    Fix leaking a page when a grant mapping has failed.
    
    CC: stable@vger.kernel.org
    Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
    Reported-and-Tested-by: Tao Chen <boby.chen@huawei.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

commit 12ea729645ace01e08f9654df155622898d3aae6
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date:   Mon Sep 8 15:21:33 2014 +0200

    xen/blkback: unmap all persistent grants when frontend gets disconnected
    
    blkback does not unmap persistent grants when frontend goes to Closed
    state (e.g. when blkfront module is being removed). This leads to the
    following in guest's dmesg:
    
    [  343.243825] xen:grant_table: WARNING: g.e. 0x445 still in use!
    [  343.243825] xen:grant_table: WARNING: g.e. 0x42a still in use!
    ...
    
    When load module -> use device -> unload module sequence is performed multiple times
    it is possible to hit BUG() condition in blkfront module:
    
    [  343.243825] kernel BUG at drivers/block/xen-blkfront.c:954!
    [  343.243825] invalid opcode: 0000 [#1] SMP
    [  343.243825] Modules linked in: xen_blkfront(-) ata_generic pata_acpi [last unloaded: xen_blkfront]
    ...
    [  343.243825] Call Trace:
    [  343.243825]  [<ffffffff814111ef>] ? unregister_xenbus_watch+0x16f/0x1e0
    [  343.243825]  [<ffffffffa0016fbf>] blkfront_remove+0x3f/0x140 [xen_blkfront]
    ...
    [  343.243825] RIP  [<ffffffffa0016aae>] blkif_free+0x34e/0x360 [xen_blkfront]
    [  343.243825]  RSP <ffff88001eb8fdc0>
    
    We don't need to keep these grants if we're disconnecting as frontend might already
    forgot about them. Solve the issue by moving xen_blkbk_free_caches() call from
    xen_blkif_free() to xen_blkif_disconnect().
    
    Now we can see the following:
    [  928.590893] xen:grant_table: WARNING: g.e. 0x587 still in use!
    [  928.591861] xen:grant_table: WARNING: g.e. 0x372 still in use!
    ...
    [  929.592146] xen:grant_table: freeing g.e. 0x587
    [  929.597174] xen:grant_table: freeing g.e. 0x372
    ...
    
    Backend does not keep persistent grants any more, reconnect works fine.
    
    CC: stable@vger.kernel.org
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

commit b248230c34970a6c1c17c591d63b464e8d2cfc33
Author: Yuchung Cheng <ycheng@google.com>
Date:   Mon Sep 29 13:20:38 2014 -0700

    tcp: abort orphan sockets stalling on zero window probes
    
    Currently we have two different policies for orphan sockets
    that repeatedly stall on zero window ACKs. If a socket gets
    a zero window ACK when it is transmitting data, the RTO is
    used to probe the window. The socket is aborted after roughly
    tcp_orphan_retries() retries (as in tcp_write_timeout()).
    
    But if the socket was idle when it received the zero window ACK,
    and later wants to send more data, we use the probe timer to
    probe the window. If the receiver always returns zero window ACKs,
    icsk_probes keeps getting reset in tcp_ack() and the orphan socket
    can stall forever until the system reaches the orphan limit (as
    commented in tcp_probe_timer()). This opens up a simple attack
    to create lots of hanging orphan sockets to burn the memory
    and the CPU, as demonstrated in the recent netdev post "TCP
    connection will hang in FIN_WAIT1 after closing if zero window is
    advertised." http://www.spinics.net/lists/netdev/msg296539.html
    
    This patch follows the design in RTO-based probe: we abort an orphan
    socket stalling on zero window when the probe timer reaches both
    the maximum backoff and the maximum RTO. For example, an 100ms RTT
    connection will timeout after roughly 153 seconds (0.3 + 0.6 +
    .... + 76.8) if the receiver keeps the window shut. If the orphan
    socket passes this check, but the system already has too many orphans
    (as in tcp_out_of_resources()), we still abort it but we'll also
    send an RST packet as the connection may still be active.
    
    In addition, we change TCP_USER_TIMEOUT to cover (life or dead)
    sockets stalled on zero-window probes. This changes the semantics
    of TCP_USER_TIMEOUT slightly because it previously only applies
    when the socket has pending transmission.
    
    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Neal Cardwell <ncardwell@google.com>
    Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 3edfe0030bb7a82dab2a30a29ea6e1800e600c4b
Author: Helge Deller <deller@gmx.de>
Date:   Wed Oct 1 22:11:01 2014 +0200

    parisc: Fix serial console for machines with serial port on superio chip
    
    Fix the serial console on machines where the serial port is located on
    the SuperIO chip.
    
    Signed-off-by: Helge Deller <deller@gmx.de>
    Cc: Peter Hurley <peter@hurleysoftware.com>

commit baf378126b08474de2e2428b16e62a69df0339d9
Author: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Date:   Wed Oct 1 14:07:39 2014 -0600

    rsxx: Remove deprecated IRQF_DISABLED
    
    This removes the use of the IRQF_DISABLED flag
    from drivers/block/rsxx/core.c
    
    It's a NOOP since 2.6.35 and it will be removed one day.
    
    Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
    Acked-by Philip Kelleher <pjk1939@linux.vnet.ibm.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>

commit cb57659a15c6c0576493cc8a10474ce7ffd44eb3
Author: Fabian Frederick <fabf@skynet.be>
Date:   Wed Oct 1 19:30:03 2014 +0200

    cipso: add __init to cipso_v4_cache_init
    
    cipso_v4_cache_init is only called by __init cipso_v4_init
    
    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 57a02c39c1c20ed03a86f8014c11a8c18b94cac3
Author: Fabian Frederick <fabf@skynet.be>
Date:   Wed Oct 1 19:18:57 2014 +0200

    inet: frags: add __init to ip4_frags_ctl_register
    
    ip4_frags_ctl_register is only called by __init ipfrag_init
    
    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 47d7a88c188f06ffaea3a539f84fe10cb4e77787
Author: Fabian Frederick <fabf@skynet.be>
Date:   Wed Oct 1 18:27:50 2014 +0200

    tcp: add __init to tcp_init_mem
    
    tcp_init_mem is only called by __init tcp_init.
    
    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit ee7a1beb9759c94aea67dd887faf5e447a5c6710
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:21 2014 +0800

    r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled
    
    These two functions are used to inform dash firmware that driver is been
    brought up or brought down. So call these two functions only when hardware dash
    function is enabled.
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 2a9b4d9670e71784896d95c41c9b0acd50db1dbb
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:20 2014 +0800

    r8169:modify the behavior of function "rtl8168_oob_notify"
    
    In function "rtl8168_oob_notify", using function "rtl_eri_write" to access
    eri register 0xe8, instead of using MAC register "ERIDR" and "ERIAR" to
    access it.
    
    For using function "rtl_eri_write" in function "rtl8168_oob_notify", need to
    move down "rtl8168_oob_notify" related functions under the function
    "rtl_eri_write".
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 2f8c040ce6791ef0477e6d59768ee3d5fd0df0fd
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:19 2014 +0800

    r8169:change the name of function "r8168dp_check_dash" to "r8168_check_dash"
    
    DASH function not only RTL8168DP can support, but also RTL8168EP.
    So change the name of function "r8168dp_check_dash" to "r8168_check_dash".
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 706123d06c18b55da5e9da21e2d138ee789bf8f4
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:18 2014 +0800

    r8169:change the name of function"rtl_w1w0_eri"
    
    Change the name of function "rtl_w1w0_eri" to "rtl_w0w1_eri".
    
    In this function, the local variable "val" is "write zeros then write ones".
    Please see below code.
    
    (val & ~m) | p
    
    In this patch, change the function name from "xx_w1w0_xx" to "xx_w0w1_xx".
    The changed function name is more suitable for it's behavior.
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 7656442824f6174b56a19c664fe560972df56ad4
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:17 2014 +0800

    r8169:for function "rtl_w1w0_phy" change its name and behavior
    
    Change function name from "rtl_w1w0_phy" to "rtl_w0w1_phy".
    And its behavior from "write ones then write zeros" to
    "write zeros then write ones".
    
    In Realtek internal driver, bitwise operations are almost "write zeros then
    write ones". For easy to port hardware parameters from Realtek internal driver
    to Linux kernal driver "r8169", we would like to change this function's
    behavior and its name.
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit ac85bcdbc0ffd3903d6db4abcd769ecacf98605b
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:16 2014 +0800

    r8169:add more chips to support magic packet v2
    
    For RTL8168F RTL8168FB RTL8168G RTL8168GU RTL8411 RTL8411B RTL8402 RTL8107E,
    the magic packet enable bit is changed to eri 0xde bit0.
    
    In this patch, change magic packet enable bit of these chips to eri 0xde bit0.
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 89cceb2729c752e6ff9b3bc8650a70f29884f116
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:15 2014 +0800

    r8169:add support more chips to get mac address from backup mac address register
    
    RTL8168FB RTL8168G RTL8168GU RTL8411 RTL8411B RTL8106EUS RTL8402 can
    support get mac address from backup mac address register.
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 42fde7371035144037844f41bd16950de9912bdb
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:14 2014 +0800

    r8169:add disable/enable RTL8411B pll function
    
    RTL8411B can support disable/enable pll function.
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit b8e5e6ad7115befef13a4493f1d2b8e438abc058
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:13 2014 +0800

    r8169:add disable/enable RTL8168G pll function
    
    RTL8168G also can disable/enable pll function.
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 05b9687bb3606190304f08c2e4cd63de8717e30b
Author: Chun-Hao Lin <hau@realtek.com>
Date:   Wed Oct 1 23:17:12 2014 +0800

    r8169:change uppercase number to lowercase number
    
    Signed-off-by: Chun-Hao Lin <hau@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit a29c9c43bb633a9965909cd548879fee4aa789a4
Author: David L Stevens <david.stevens@oracle.com>
Date:   Wed Oct 1 11:05:27 2014 -0400

    sunvnet: fix potential NULL pointer dereference
    
    One of the error cases for vnet_start_xmit()'s "out_dropped" label
    is port == NULL, so only mess with port->clean_timer when port is not NULL.
    
    Signed-off-by: David L Stevens <david.stevens@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit e506d405ac7d34d03996c97ac68aa2ac010be64a
Author: Thierry Reding <treding@nvidia.com>
Date:   Wed Oct 1 13:59:00 2014 +0200

    net: dsa: Fix build warning for !PM_SLEEP
    
    The dsa_switch_suspend() and dsa_switch_resume() functions are only used
    when PM_SLEEP is enabled, so they need #ifdef CONFIG_PM_SLEEP protection
    to avoid a compiler warning.
    
    Signed-off-by: Thierry Reding <treding@nvidia.com>
    Acked-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 84ac1f2ca41f5888cc995944c073a5220f3ed549
Author: Tanmay Inamdar <tinamdar@apm.com>
Date:   Fri Sep 26 14:08:25 2014 -0700

    arm64: dts: Add APM X-Gene PCIe device tree nodes
    
    Add the device tree nodes for APM X-Gene PCIe host controller and PCIe
    clock interface.  Since X-Gene SOC supports maximum 5 ports, 5 dts nodes
    are added.
    
    Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 2896e4418b17363f211e084471b589e3c06a7248
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Oct 1 13:01:35 2014 -0600

    PCI: xgene: Add APM X-Gene PCIe driver
    
    Add the AppliedMicro X-Gene SOC PCIe host controller driver.  The X-Gene
    PCIe controller supports up to 8 lanes and GEN3 speed.  The X-Gene SOC
    supports up to 5 PCIe ports.
    
    [bhelgaas: folded in MAINTAINERS and bindings updates]
    Tested-by: Ming Lei <ming.lei@canonical.com>
    Tested-by: Dann Frazier <dann.frazier@canonical.com>
    Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Liviu Dudau <Liviu.Dudau@arm.com> (driver)

commit 3c87dcbfb36ce6d3d9087f0163c02ba5690d9a85
Author: Subbaraya Sundeep Bhatta <subbaraya.sundeep.bhatta@xilinx.com>
Date:   Wed Oct 1 11:01:17 2014 +0200

    net: ll_temac: Remove unnecessary ether_setup after alloc_etherdev
    
    Calling ether_setup is redundant since alloc_etherdev calls it.
    
    Signed-off-by: Subbaraya Sundeep Bhatta <sbhatta@xilinx.com>
    Signed-off-by: Michal Simek <michal.simek@xilinx.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 8493ecca74a7b4a66e19676de1a0f14194179941
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Wed Oct 1 11:59:47 2014 -0400

    HID: uHID: fix excepted report type
    
    When uhid_get_report() or uhid_set_report() are called, they emit on the
    char device a UHID_GET_REPORT or UHID_SET_REPORT message. Then, the
    protocol says that the user space asnwers with UHID_GET_REPORT_REPLY
    or UHID_SET_REPORT_REPLY.
    
    Unfortunatelly, the current code waits for an event of type UHID_GET_REPORT
    or UHID_SET_REPORT instead of the reply one.
    Add 1 to UHID_GET_REPORT or UHID_SET_REPORT to actually wait for the
    reply, and validate the reply.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>

commit c8df6ac9452e8f47a6f660993c526d13e858a6f3
Author: Lucas Stach <l.stach@pengutronix.de>
Date:   Tue Sep 30 18:36:27 2014 +0200

    PCI: designware: Remove open-coded bitmap operations
    
    Replace them by using the standard kernel bitmap ops.  No functional
    change, but makes the code a lot cleaner.
    
    Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Pratyush Anand <pratyush.anand@st.com>
    Acked-by: Jingoo Han <jg1.han@samsung.com>

commit 2199f0608864cf4e8c93d37842a5ee50c8d79843
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Fri Mar 28 15:51:56 2014 -0400

    dm crypt: sort writes
    
    Write requests are sorted in a red-black tree structure and are
    submitted in the sorted order.
    
    In theory the sorting should be performed by the underlying disk
    scheduler, however, in practice the disk scheduler only accepts and
    sorts a finite number of requests.  To allow the sorting of all
    requests, dm-crypt needs to implement its own own sorting.
    
    The overhead associated with rbtree-based sorting is considered
    negligible so it is not used conditionally.  Even on SSD sorting can be
    beneficial since in-order request dispatch promotes lower latency IO
    completion to the upper layers.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

commit 648fee35be4c75667aa18bf513f7e7e65c01640b
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Fri Mar 28 15:51:56 2014 -0400

    dm crypt: offload writes to thread
    
    Submitting write bios directly in the encryption thread caused serious
    performance degradation.  On a multiprocessor machine, encryption requests
    finish in a different order than they were submitted.  Consequently, write
    requests would be submitted in a different order and it could cause severe
    performance degradation.
    
    Move the submission of write requests to a separate thread so that the
    requests can be sorted before submitting.  But this commit improves
    dm-crypt performance even without having dm-crypt perform request
    sorting (in particular it enables IO schedulers like CFQ to sort more
    effectively).
    
    Note: it is required that a previous commit ("dm crypt: don't allocate
    pages for a partial request") be applied before applying this patch.
    Otherwise, this commit could introduce a crash.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

commit 4a0d7e0464226eee625a5b77484c339334453882
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Fri Mar 28 15:51:55 2014 -0400

    dm crypt: use unbound workqueue for request processing
    
    Use unbound workqueue so that work is automatically balanced between
    available CPUs.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

commit 72bfc40ca3b393cb0bc6b5e2ce364e6c6ce0f390
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Thu May 29 14:18:12 2014 -0400

    dm crypt: remove io_pending refcount member from dm_crypt_io
    
    Commit "dm crypt: don't allocate pages for a partial request" changed
    the code to allocate all pages for one request.  There is always just
    one pending request, so the io_pending refcount may be removed.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

commit 42196fec8945cc84c032b7f59deaffee82036245
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Fri Mar 28 15:51:56 2014 -0400

    dm crypt: remove unused io_pool and _crypt_io_pool
    
    The previous commits ("dm crypt: use per-bio data") and ("dm crypt:
    don't allocate pages for a partial request") stopped using the
    io_pool slab mempool and backing _crypt_io_pool kmem cache.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

commit ebfda24b1e1bf483accdb900f8625151d8f01383
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Fri Mar 28 15:51:56 2014 -0400

    dm crypt: avoid deadlock in mempools
    
    Fix a theoretical deadlock introduced in the previous commit ("dm crypt:
    don't allocate pages for a partial request").
    
    The function crypt_alloc_buffer may be called concurrently.  If we allocate
    from the mempool concurrently, there is a possibility of deadlock.  For
    example, if we have mempool of 256 pages, two processes, each wanting
    256, pages allocate from the mempool concurrently, it may deadlock in a
    situation where both processes have allocated 128 pages and the mempool
    is exhausted.
    
    In order to avoid such a scenario, we allocate the pages under a mutex.
    
    In order to not degrade performance with excessive locking, we try
    non-blocking allocations without a mutex first and if it fails, we
    fallback to a blocking allocation with a mutex.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

commit b9ea7cb3fb237078be400522880932008c630fb7
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Fri Mar 28 15:51:56 2014 -0400

    dm crypt: don't allocate pages for a partial request
    
    Change crypt_alloc_buffer so that it only ever allocates pages for a
    full request.
    
    This change is a prerequisite for the commit "dm crypt: offload writes
    to thread".  Which implies this change is effectively required for the
    upcoming cpu parallelization changes.
    
    But this change simplifies the dm-crypt code at the expense of reduced
    throughput in low memory conditions (where allocation for a partial
    request is most useful).
    
    This change also enables the removal of the io_pending refcount.
    
    Note: the next commit ("dm-crypt: avoid deadlock in mempools") is needed
    to fix a theoretical deadlock.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

commit 117cd3e12232afea97dd31489fbde8888ad22b3e
Author: Heinz Mauelshagen <heinzm@redhat.com>
Date:   Wed Sep 24 17:47:19 2014 +0200

    dm raid: add discard support for RAID levels 4, 5 and 6
    
    In case of RAID levels 4, 5 and 6 we have to verify each RAID members'
    ability to zero data on discards to avoid stripe data corruption -- if
    discard_zeroes_data is not set for each RAID member discard support must
    be disabled.
    
    Also add an 'ignore_discard' table argument to the target in order to
    ignore discard processing completely on a RAID array, hence not passing
    down discards to MD personalities.
    
    This 'ignore_discard' control provides the ability to:
    - prohibit discards in case of _potential_ data corruptions in RAID4/5/6
      (e.g. if ability to zero data on discard is flawed in a RAID member)
    - avoid discard processing overhead
    
    Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

commit 04c308f43a90a9b3b84c344b324d6af29288da05
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Wed Oct 1 13:29:48 2014 -0400

    dm bufio: when done scanning return from __scan immediately
    
    When __scan frees the required number of buffer entries that the
    shrinker requested (nr_to_scan becomes zero) it must return.  Before
    this fix the __scan code exited only the inner loop and continued in the
    outer loop.
    
    Also, move dm_bufio_cond_resched to __scan's inner loop, so that
    iterating the bufio client's lru lists doesn't result in scheduling
    latency.
    
    Reported-by: Joe Thornber <thornber@redhat.com>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    Cc: stable@vger.kernel.org # 3.2+

commit 5ec094057c7df5ff80f5e7fe282f47ad205fb976
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Tue Sep 23 14:38:28 2014 -0600

    PCI/MSI: Remove unnecessary temporary variable
    
    The only use of "status" is to hold a value which is immediately returned,
    so just return and remove the variable directly.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 56b72b40957947f7c08771f030102351d4c906df
Author: Yijing Wang <wangyijing@huawei.com>
Date:   Mon Sep 29 18:35:16 2014 -0600

    PCI/MSI: Use __write_msi_msg() instead of write_msi_msg()
    
    default_restore_msi_irq() already has the struct msi_desc pointer required
    by __write_msi_msg(), so call it directly instead of having write_msi_msg()
    look it up from the IRQ.
    
    No functional change.
    
    [bhelgaas: split into separate patch]
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 1e8f4cc82eded0c3c97ef6e2f119782e42deda35
Author: Yijing Wang <wangyijing@huawei.com>
Date:   Wed Sep 24 11:09:45 2014 +0800

    MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg()
    
    rtas_setup_msi_irqs() already has the struct msi_desc pointer required by
    __read_msi_msg(), so call it directly instead of having read_msi_msg() look
    it up from the IRQ.
    
    No functional change.
    
    [bhelgaas: changelog]
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au>
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    CC: linuxppc-dev@lists.ozlabs.org

commit 2b260085e466c345e78f23b1c9ad1d123d509ef8
Author: Yijing Wang <wangyijing@huawei.com>
Date:   Tue Sep 23 13:27:25 2014 +0800

    PCI/MSI: Use __get_cached_msi_msg() instead of get_cached_msi_msg()
    
    Both callers of get_cached_msi_msg() start with a struct irq_data pointer,
    look up the corresponding IRQ number, and pass it to get_cached_msi_msg(),
    which then uses irq_get_irq_data() to look up the struct irq_data again to
    call __get_cached_msi_msg().
    
    Since we already have the struct irq_data, call __get_cached_msi_msg()
    directly and skip the lookup work done by get_cached_msi_msg().
    
    No functional change.
    
    [bhelgaas: changelog]
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: Tony Luck <tony.luck@intel.com>
    CC: linux-ia64@vger.kernel.org

commit 468ff15a3ab98ed7153c29c68229ffb97f15a251
Author: Yijing Wang <wangyijing@huawei.com>
Date:   Tue Sep 23 13:27:24 2014 +0800

    PCI/MSI: Add "msi_bus" sysfs MSI/MSI-X control for endpoints
    
    The "msi_bus" sysfs file for bridges sets a bus flag to allow or disallow
    future driver requests for MSI or MSI-X.  Previously, the sysfs file
    existed for endpoints but did nothing.
    
    Add "msi_bus" support for endpoints, so an administrator can prevent the
    use of MSI and MSI-X for individual devices.
    
    Note that as for bridges, these changes only affect future driver requests
    for MSI or MSI-X, so drivers may need to be reloaded.
    
    Add documentation for the "msi_bus" sysfs file.
    
    [bhelgaas: changelog, comments, add "subordinate", add endpoint printk,
    rework bus_flags setting, make bus_flags printk unconditional]
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 48c3c38f003c25d50a09d3da558667c5ecd530aa
Author: Yijing Wang <wangyijing@huawei.com>
Date:   Tue Sep 23 11:02:42 2014 -0600

    PCI/MSI: Remove "pos" from the struct msi_desc msi_attrib
    
    "msi_attrib.pos" is only used for MSI (not MSI-X), and we already cache the
    MSI capability offset in "dev->msi_cap".
    
    Remove "pos" from the struct msi_attrib and use "dev->msi_cap" directly.
    
    [bhelgaas: changelog, fix whitespace]
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 81052769e48609525c452d8f078a5786b673e178
Author: Yijing Wang <wangyijing@huawei.com>
Date:   Tue Sep 23 13:27:22 2014 +0800

    PCI/MSI: Remove unused kobject from struct msi_desc
    
    After commit 1c51b50c2995 ("PCI/MSI: Export MSI mode using attributes, not
    kobjects"), the kobject in struct msi_desc is unused.
    
    Remove the unused struct kobject from struct msi_desc.
    
    [bhelgaas: changelog]
    Fixes: 1c51b50c2995 ("PCI/MSI: Export MSI mode using attributes, not kobjects")
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a06cd74cefe754341f747ddc4cf7b0058fa9bff8
Author: Alexander Gordeev <agordeev@redhat.com>
Date:   Tue Sep 23 12:45:58 2014 -0600

    PCI/MSI: Rename pci_msi_check_device() to pci_msi_supported()
    
    Rename pci_msi_check_device() to pci_msi_supported() for clarity.  Note
    that pci_msi_supported() returns true if MSI/MSI-X is supported, so code
    like:
    
      if (pci_msi_supported(...))
    
    reads naturally.
    
    [bhelgaas: changelog, split to separate patch, reverse sense]
    Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 27e20603c54ba633ed259284d006275f13c9f95b
Author: Alexander Gordeev <agordeev@redhat.com>
Date:   Tue Sep 23 14:25:11 2014 -0600

    PCI/MSI: Move D0 check into pci_msi_check_device()
    
    Both callers of pci_msi_check_device() check that the device is in D0
    state, so move the check from the callers into pci_msi_check_device()
    itself.
    
    In pci_enable_msi_range(), note that pci_msi_check_device() never returns a
    positive value any more, so the loop that called it until it returns zero
    or negative is no longer necessary.
    
    [bhelgaas: changelog, split to separate patch]
    Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit ad975ebad4c3ce8dcc7d0bb4db26ea5aca4cfc99
Author: Alexander Gordeev <agordeev@redhat.com>
Date:   Tue Sep 23 12:39:54 2014 -0600

    PCI/MSI: Remove arch_msi_check_device()
    
    No architectures implement arch_msi_check_device() or the struct msi_chip
    .check_device() method, so remove them.
    
    Remove the "type" parameter to pci_msi_check_device() because it was only
    used to call arch_msi_check_device() and is no longer needed.
    
    [bhelgaas: changelog, split to separate patch]
    Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 3930115e0dd67f61b3b1882c7a34d0baeff1bb4c
Author: Alexander Gordeev <agordeev@redhat.com>
Date:   Sun Sep 7 20:57:54 2014 +0200

    irqchip: armada-370-xp: Remove arch_msi_check_device()
    
    Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs().
    This makes the code more compact and allows removing
    arch_msi_check_device() from generic MSI code.
    
    Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
    Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Jason Cooper <jason@lakedaemon.net>
    CC: Thomas Gleixner <tglx@linutronix.de>

commit 6b2fd7efeb888fa781c1f767de6c36497ac1596b
Author: Alexander Gordeev <agordeev@redhat.com>
Date:   Sun Sep 7 20:57:53 2014 +0200

    PCI/MSI/PPC: Remove arch_msi_check_device()
    
    Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs().
    This makes the code more compact and allows removing
    arch_msi_check_device() from generic MSI code.
    
    Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au>

commit 977104ece1568f2e2ad3f5fd8e55bd640e8ab55a
Author: Mark Charlebois <charlebm@gmail.com>
Date:   Thu Sep 4 14:16:17 2014 -0700

    arm: LLVMLinux: Use global stack register variable for percpu
    
    Using global current_stack_pointer works on both clang and gcc.
    current_stack_pointer is an unsigned long and needs to be cast
    as a pointer to dereference.
    
    KernelVersion: 3.17.0-rc6
    Signed-off-by: Mark Charlebois <charlebm@gmail.com>
    Signed-off-by: Behan Webster <behanw@converseincode.com>

commit a35dc594542b29935cd3a92e53233ad4ba4e622f
Author: Behan Webster <behanw@converseincode.com>
Date:   Tue Sep 3 22:27:27 2013 -0400

    arm: LLVMLinux: Use current_stack_pointer in unwind_backtrace
    
    Use the global current_stack_pointer to get the value of the stack pointer.
    This change supports being able to compile the kernel with both gcc and clang.
    
    KernelVersion: 3.17.0-rc6
    Signed-off-by: Behan Webster <behanw@converseincode.com>
    Reviewed-by: Mark Charlebois <charlebm@gmail.com>
    Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>

commit 5c5da6724d8e1767405a3f4b611451a11ece99e2
Author: Behan Webster <behanw@converseincode.com>
Date:   Tue Sep 3 22:27:27 2013 -0400

    arm: LLVMLinux: Calculate current_thread_info from current_stack_pointer
    
    Use the global current_stack_pointer to get the value of the stack pointer.
    This change supports being able to compile the kernel with both gcc and clang.
    
    KernelVersion: 3.17.0-rc6
    Signed-off-by: Behan Webster <behanw@converseincode.com>
    Reviewed-by: Mark Charlebois <charlebm@gmail.com>
    Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>

commit f2b6d8c6c56c9a164a2d885ba34a09d613c959c9
Author: Behan Webster <behanw@converseincode.com>
Date:   Tue Sep 3 22:27:27 2013 -0400

    arm: LLVMLinux: Use current_stack_pointer in save_stack_trace_tsk
    
    Use the global current_stack_pointer to get the value of the stack pointer.
    This change supports being able to compile the kernel with both gcc and clang.
    
    KernelVersion: 3.17.0-rc6
    Signed-off-by: Behan Webster <behanw@converseincode.com>
    Reviewed-by: Mark Charlebois <charlebm@gmail.com>
    Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>

commit 40802b84566a3d9731a8fea43b144301d9ac450d
Author: Behan Webster <behanw@converseincode.com>
Date:   Tue Sep 3 22:27:27 2013 -0400

    arm: LLVMLinux: Use current_stack_pointer for return_address
    
    Use the global current_stack_pointer to get the value of the stack pointer.
    This change supports being able to compile the kernel with both gcc and Clang.
    
    KernelVersion: 3.17.0-rc6
    Signed-off-by: Behan Webster <behanw@converseincode.com>
    Reviewed-by: Mark Charlebois <charlebm@gmail.com>
    Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>

commit d80ced5236764b8c4ffda5545d5b357cf88c77c1
Author: Behan Webster <behanw@converseincode.com>
Date:   Tue Sep 3 22:27:27 2013 -0400

    arm: LLVMLinux: Use current_stack_pointer to calculate pt_regs address
    
    Use the global current_stack_pointer to calculate the end of the stack for
    current_pt_regs()
    
    KernelVersion: 3.17.0-rc6
    Signed-off-by: Behan Webster <behanw@converseincode.com>
    Reviewed-by: Mark Charlebois <charlebm@gmail.com>
    Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>

commit 9d0d6994806b36891453beb1e94b6253f853af61
Author: Behan Webster <behanw@converseincode.com>
Date:   Tue Sep 3 22:27:26 2013 -0400

    arm: LLVMLinux: Add global named register current_stack_pointer for ARM
    
    Define a global named register for current_stack_pointer. The use of this new
    variable guarantees that both gcc and clang can access this register in C code.
    
    KernelVersion: 3.17.0-rc6
    Signed-off-by: Behan Webster <behanw@converseincode.com>
    Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
    Reviewed-by: Mark Charlebois <charlebm@gmail.com>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>

commit 2c804d0f8fc7799981d9fdd8c88653541b28c1a7
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 30 22:12:05 2014 -0700

    ipv4: mentions skb_gro_postpull_rcsum() in inet_gro_receive()
    
    Proper CHECKSUM_COMPLETE support needs to adjust skb->csum
    when we remove one header. Its done using skb_gro_postpull_rcsum()
    
    In the case of IPv4, we know that the adjustment is not really needed,
    because the checksum over IPv4 header is 0. Lets add a comment to
    ease code comprehension and avoid copy/paste errors.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit eb51bbaf8dedf142a54a7ff58514a29b40d515bb
Author: Stephen Rothwell <sfr@canb.auug.org.au>
Date:   Wed Oct 1 17:00:49 2014 +1000

    fm10k: using vmalloc requires including linux/vmalloc.h
    
    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 078efae00ffc76381c3248006e9cf0988163488f
Author: Anish Bhatt <anish@chelsio.com>
Date:   Mon Sep 15 17:44:18 2014 -0700

    [SCSI] cxgb4i: avoid holding mutex in interrupt context
    
    cxgbi_inet6addr_handler() can be called in interrupt context, so use rcu
    protected list while finding netdev.  This is observed as a scheduling in
    atomic oops when running over ipv6.
    
    Fixes: fc8d0590d914 ("libcxgbi: Add ipv6 api to driver")
    Fixes: 759a0cc5a3e1 ("cxgb4i: Add ipv6 code to driver, call into libcxgbi ipv6 api")
    
    Signed-off-by: Anish Bhatt <anish@chelsio.com>
    Signed-off-by: Karen Xie <kxie@chelsio.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: James Bottomley <JBottomley@Parallels.com>

commit 34549ab09e62db9703811c6ed4715f2ffa1fd7fb
Author: Jeff Layton <jlayton@primarydata.com>
Date:   Wed Oct 1 08:05:22 2014 -0400

    nfsd: eliminate "to_delegation" define
    
    We now have cb_to_delegation and to_delegation, which do the same thing
    and are defined separately in different .c files. Move the
    cb_to_delegation definition into a header file and eliminate the
    redundant to_delegation definition.
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jeff Layton <jlayton@primarydata.com>

commit 4a0efdc933680d908de11712a774a2c9492c3d5a
Author: Hannes Reinecke <hare@suse.de>
Date:   Wed Oct 1 14:32:31 2014 +0200

    block: misplaced rq_complete tracepoint
    
    The rq_complete tracepoint was never issued for empty requests,
    causing the resulting blktrace information to never show any
    completion for those request.
    
    Signed-off-by: Hannes Reinecke <hare@suse.de>
    Acked-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Jens Axboe <axboe@fb.com>

commit fc2021fb9baf9ed375c8161b40b68e120e75c60e
Author: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Date:   Wed Oct 1 12:07:07 2014 +0200

    block: hd: remove deprecated IRQF_DISABLED
    
    This patch removes the use of the IRQF_DISABLED flag
    from drivers/block/hd.c
    
    It's a NOOP since 2.6.35 and it will be removed one day.
    
    This also removes a related comment which is obsolete too.
    
    Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>

commit 19aeb5a65f1a6504fc665466c188241e7393d66f
Author: Bob Peterson <rpeterso@redhat.com>
Date:   Mon Sep 29 08:52:04 2014 -0400

    GFS2: Make rename not save dirent location
    
    This patch fixes a regression in the patch "GFS2: Remember directory
    insert point", commit 2b47dad866d04f14c328f888ba5406057b8c7d33.
    The problem had to do with the rename function: The function found
    space for the new dirent, and remembered that location. But then the
    old dirent was removed, which often moved the eligible location for
    the renamed dirent. Putting the new dirent at the saved location
    caused file system corruption.
    
    This patch adds a new "save_loc" variable to struct gfs2_diradd.
    If 1, the dirent location is saved. If 0, the dirent location is not
    saved and the buffer_head is released as per previous behavior.
    
    Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

commit 5235166fbc332c8b5dcf49e3a498a8b510a77449
Author: Oliver Neukum <oneukum@suse.de>
Date:   Tue Sep 30 12:54:56 2014 +0200

    HID: usbhid: add another mouse that needs QUIRK_ALWAYS_POLL
    
    There is a second mouse sharing the same vendor strings but different IDs.
    
    Signed-off-by: Oliver Neukum <oneukum@suse.de>
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>

commit 2013add4ce73c93ae2148969a9ec3ecc8b1e26fa
Author: Gavin Shan <gwshan@linux.vnet.ibm.com>
Date:   Wed Oct 1 14:34:51 2014 +1000

    powerpc/eeh: Show hex prefix for PE state sysfs
    
    As Michael suggested, the hex prefix for the output of EEH PE
    state sysfs entry (/sys/bus/pci/devices/xxx/eeh_pe_state) is
    always informative to users.
    
    Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit 24c20f10583647e30afe87b6f6d5e14bc7b1cbc6
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Sep 30 16:43:46 2014 +0200

    scsi: add a CONFIG_SCSI_MQ_DEFAULT option
    
    Add a Kconfig option to enable the blk-mq path for SCSI by default
    to ease testing and deployment in setups that know they benefit
    from blk-mq.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    Reviewed-by: Robert Elliott <elliott@hp.com>
    Tested-by: Robert Elliott <elliott@hp.com>

commit e785060ea3a1c8e37a8bc1449c79e36bff2b5b13
Author: Dolev Raviv <draviv@codeaurora.org>
Date:   Thu Sep 25 15:32:36 2014 +0300

    ufs: definitions for phy interface
    
    - Adding some of the definitions missing in unipro.h, including power
      enumeration.
    - Read Modify Write Line helper function
    - Indication for the type of suspend
    
    Signed-off-by: Dolev Raviv <draviv@codeaurora.org>
    Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
    Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>

commit 374a246e4ebda1fc55d537877bf2412e511ecc7b
Author: Subhash Jadavani <subhashj@codeaurora.org>
Date:   Thu Sep 25 15:32:35 2014 +0300

    ufs: tune bkops while power managment events
    
    Add capability to control the auto bkops during suspend.
    If host explicitly enables the auto bkops (background operation) on device
    then only device would perform the bkops on its own. If auto bkops is not
    enabled explicitly and if the device reaches to state where it must do
    background operation, device would raise the urgent bkops exception event
    to host and then host will enable the auto bkops on device. This patch
    adds the option to choose whether auto bkops should be enabled during
    runtime suspend or not. Since we don't want to keep the device active to
    perform the non critical bkops, host will enable urgent bkops only.
    
    Keep auto-bkops enabled after resume if urgent bkops needed.
    If device bkops status shows that its in critical need of executing
    background operations, host should allow the device to continue doing
    background operations.
    
    Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
    Signed-off-by: Dolev Raviv <draviv@codeaurora.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>

commit 856b348305c98d4e0c8e5eafa97c61443197f8d3
Author: Sahitya Tummala <stummala@codeaurora.org>
Date:   Thu Sep 25 15:32:34 2014 +0300

    ufs: Add support for clock scaling using devfreq framework
    
    The clocks for UFS device will be managed by generic DVFS (Dynamic
    Voltage and Frequency Scaling) framework within kernel. This devfreq
    framework works with different governors to scale the clocks. By default,
    UFS devices uses simple_ondemand governor which scales the clocks up if
    the load is more than upthreshold and scales down if the load is less than
    downthreshold.
    
    Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
    Signed-off-by: Dolev Raviv <draviv@codeaurora.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>

commit 4cff6d991e4a291cf50fe2659da2ea9ad46620bf
Author: Sahitya Tummala <stummala@codeaurora.org>
Date:   Thu Sep 25 15:32:33 2014 +0300

    ufs: Add freq-table-hz property for UFS device
    
    Add freq-table-hz propery for UFS device to keep track of
    <min max> frequencies supported by UFS clocks.
    
    Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
    Signed-off-by: Dolev Raviv <draviv@codeaurora.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>

commit 1ab27c9cf8b63dd8dec9e17b5c17721c7f3b6cc7
Author: Sahitya Tummala <stummala@codeaurora.org>
Date:   Thu Sep 25 15:32:32 2014 +0300

    ufs: Add support for clock gating
    
    The UFS controller clocks can be gated after certain period of
    inactivity, which is typically less than runtime suspend timeout.
    In addition to clocks the link will also be put into Hibern8 mode
    to save more power.
    
    The clock gating can be turned on by enabling the capability
    UFSHCD_CAP_CLK_GATING. To enable entering into Hibern8 mode as part of
    clock gating, set the capability UFSHCD_CAP_HIBERN8_WITH_CLK_GATING.
    
    The tracing events for clock gating can be enabled through debugfs as:
    echo 1 > /sys/kernel/debug/tracing/events/ufs/ufshcd_clk_gating/enable
    cat /sys/kernel/debug/tracing/trace_pipe
    
    Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
    Signed-off-by: Dolev Raviv <draviv@codeaurora.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>

commit 7eb584db73bebbc9852a14341431ed6935419bec
Author: Dolev Raviv <draviv@codeaurora.org>
Date:   Thu Sep 25 15:32:31 2014 +0300

    ufs: refactor configuring power mode
    
    Sometimes, the device shall report its maximum power and speed
    capabilities, but we might not wish to configure it to use those
    maximum capabilities.
    This change adds support for the vendor specific host driver to
    implement power change notify callback.
    
    To enable configuring different power modes (number of lanes,
    gear number and fast/slow modes) it is necessary to split the
    configuration stage from the stage that reads the device max power mode.
    In addition, it is not required to read the configuration more than
    once, thus the configuration is stored after reading it once.
    
    Signed-off-by: Dolev Raviv <draviv@codeaurora.org>
    Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>

commit 57d104c153d3d6d7bea60089e80f37501851ed2c
Author: Subhash Jadavani <subhashj@codeaurora.org>
Date:   Thu Sep 25 15:32:30 2014 +0300

    ufs: add UFS power management support
    
    This patch adds support for UFS device and UniPro link power management
    during runtime/system PM.
    
    Main idea is to define multiple UFS low power levels based on UFS device
    and UFS link power states. This would allow any specific platform or pci
    driver to choose the best suited low power level during runtime and
    system suspend based on their power goals.
    
    bkops handlig:
    To put the UFS device in sleep state when bkops is disabled, first query
    the bkops status from the device and enable bkops on device only if
    device needs time to perform the bkops.
    
    START_STOP handling:
    Before sending START_STOP_UNIT to the device well-known logical unit
    (w-lun) to make sure that the device w-lun unit attention condition is
    cleared.
    
    Write protection:
    UFS device specification allows LUs to be write protected, either
    permanently or power on write protected. If any LU is power on write
    protected and if the card is power cycled (by powering off VCCQ and/or
    VCC rails), LU's write protect status would be lost. So this means those
    LUs can be written now. To ensures that UFS device is power cycled only
    if the power on protect is not set for any of the LUs, check if power on
    write protect is set and if device is in sleep/power-off state & link in
    inactive state (Hibern8 or OFF state).
    If none of the Logical Units on UFS device is power on write protected
    then all UFS device power rails (VCC, VCCQ & VCCQ2) can be turned off if
    UFS device is in power-off state and UFS link is in OFF state. But current
    implementation would disable all device power rails even if UFS link is
    not in OFF state.
    
    Low power mode:
    If UFS link is in OFF state then UFS host controller can be power collapsed
    to avoid leakage current from it. Note that if UFS host controller is power
    collapsed, full UFS reinitialization will be required on resume to
    re-establish the link between host and device.
    
    Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
    Signed-off-by: Dolev Raviv <draviv@codeaurora.org>
    Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>

commit 0ce147d48a3e3352859f0c185e98e8392bee7a25
Author: Subhash Jadavani <subhashj@codeaurora.org>
Date:   Thu Sep 25 15:32:29 2014 +0300

    ufs: introduce well known logical unit in ufs
    
    UFS device may have standard LUs and LUN id could be from 0x00 to 0x7F.
    UFS device specification use "Peripheral Device Addressing Format"
    (SCSI SAM-5) for standard LUs.
    
    UFS device may also have the Well Known LUs (also referred as W-LU) which
    again could be from 0x00 to 0x7F. For W-LUs, UFS device specification only
    allows the "Extended Addressing Format" (SCSI SAM-5) which means the W-LUNs
    would start from 0xC100 onwards.
    
    This means max. LUN number reported from UFS device could be 0xC17F hence
    this patch advertise the "max_lun" as 0xC17F which will allow SCSI mid
    layer to detect the W-LUs as well.
    
    But once the W-LUs are detected, UFSHCD driver may get the commands with
    SCSI LUN id upto 0xC17F but UPIU LUN id field is only 8-bit wide so it
    requires the mapping of SCSI LUN id to UPIU LUN id. This patch also add
    support for this mapping.
    
    Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
    Signed-off-by: Dolev Raviv <draviv@codeaurora.org>
    Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>

commit 2a8fa600445c45222632810a4811ce820279d106
Author: Subhash Jadavani <su…
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit e8b7445 ]

For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
waby38b pushed a commit to avolmat/linux that referenced this pull request Apr 20, 2023
For quite some time we were chasing a bug which looked like a sudden
permanent failure of networking and mmc on some of our devices.
The bug was very sensitive to any software changes and even more to
any kernel debug options.

Finally we got a setup where the problem was reproducible with
CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:

[   16.992082] ------------[ cut here ]------------
[   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
[   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
[   17.018977] Modules linked in: xxxxx
[   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 torvalds#28
[   17.045345] Hardware name: xxxxx
[   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
[   17.054322] pc : check_unmap+0x6a0/0x900
[   17.058243] lr : check_unmap+0x6a0/0x900
[   17.062163] sp : ffffffc010003c40
[   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
[   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
[   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
[   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
[   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
[   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
[   17.097343] x17: 0000000000000000 x16: 0000000000000000
[   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
[   17.107959] x13: 0720072007200720 x12: 0720072007200720
[   17.113261] x11: 0720072007200720 x10: 0720072007200720
[   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
[   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
[   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
[   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
[   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
[   17.145082] Call trace:
[   17.147524]  check_unmap+0x6a0/0x900
[   17.151091]  debug_dma_unmap_page+0x88/0x90
[   17.155266]  gem_rx+0x114/0x2f0
[   17.158396]  macb_poll+0x58/0x100
[   17.161705]  net_rx_action+0x118/0x400
[   17.165445]  __do_softirq+0x138/0x36c
[   17.169100]  irq_exit+0x98/0xc0
[   17.172234]  __handle_domain_irq+0x64/0xc0
[   17.176320]  gic_handle_irq+0x5c/0xc0
[   17.179974]  el1_irq+0xb8/0x140
[   17.183109]  xiic_process+0x5c/0xe30
[   17.186677]  irq_thread_fn+0x28/0x90
[   17.190244]  irq_thread+0x208/0x2a0
[   17.193724]  kthread+0x130/0x140
[   17.196945]  ret_from_fork+0x10/0x20
[   17.200510] ---[ end trace 7240980785f81d6f ]---

[  237.021490] ------------[ cut here ]------------
[  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
[  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
[  237.041802] Modules linked in: xxxxx
[  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 torvalds#28
[  237.068941] Hardware name: xxxxx
[  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[  237.077900] pc : add_dma_entry+0x214/0x240
[  237.081986] lr : add_dma_entry+0x214/0x240
[  237.086072] sp : ffffffc010003c30
[  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
[  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
[  237.099987] x25: 0000000000000002 x24: 0000000000000000
[  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
[  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
[  237.115897] x19: 00000000ffffffef x18: 0000000000000010
[  237.121201] x17: 0000000000000000 x16: 0000000000000000
[  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
[  237.131807] x13: 0720072007200720 x12: 0720072007200720
[  237.137111] x11: 0720072007200720 x10: 0720072007200720
[  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
[  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
[  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
[  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
[  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
[  237.168932] Call trace:
[  237.171373]  add_dma_entry+0x214/0x240
[  237.175115]  debug_dma_map_page+0xf8/0x120
[  237.179203]  gem_rx_refill+0x190/0x280
[  237.182942]  gem_rx+0x224/0x2f0
[  237.186075]  macb_poll+0x58/0x100
[  237.189384]  net_rx_action+0x118/0x400
[  237.193125]  __do_softirq+0x138/0x36c
[  237.196780]  irq_exit+0x98/0xc0
[  237.199914]  __handle_domain_irq+0x64/0xc0
[  237.204000]  gic_handle_irq+0x5c/0xc0
[  237.207654]  el1_irq+0xb8/0x140
[  237.210789]  arch_cpu_idle+0x40/0x200
[  237.214444]  default_idle_call+0x18/0x30
[  237.218359]  do_idle+0x200/0x280
[  237.221578]  cpu_startup_entry+0x20/0x30
[  237.225493]  rest_init+0xe4/0xf0
[  237.228713]  arch_call_rest_init+0xc/0x14
[  237.232714]  start_kernel+0x47c/0x4a8
[  237.236367] ---[ end trace 7240980785f81d70 ]---

Lars was fast to find an explanation: according to the datasheet
bit 2 of the rx buffer descriptor entry has a different meaning in the
extended mode:
  Address [2] of beginning of buffer, or
  in extended buffer descriptor mode (DMA configuration register [28] = 1),
  indicates a valid timestamp in the buffer descriptor entry.

The macb driver didn't mask this bit while getting an address and it
eventually caused a memory corruption and a dma failure.

The problem is resolved by explicitly clearing the problematic bit
if hw timestamping is used.

Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Apr 24, 2023
To avoid possible deadlock when allocating AGFL blocks, change the behaviour.
The orignal hehaviour for freeing extents in an EFI is that the extents in
the EFI were processed one by one in the same transaction. When the second
and subsequent extents are being processed, we have produced busy extents for
previous extents. If the second and subsequent extents are from the same AG
as the busy extents are, we have the risk of deadlock when allocating AGFL
blocks. A typical calltrace for the deadlock is like this:

	#0	context_switch() kernel/sched/core.c:3881
	#1	__schedule() kernel/sched/core.c:5111
	#2	schedule() kernel/sched/core.c:5186
	#3	xfs_extent_busy_flush() fs/xfs/xfs_extent_busy.c:598
	#4	xfs_alloc_ag_vextent_size() fs/xfs/libxfs/xfs_alloc.c:1641
	#5	xfs_alloc_ag_vextent() fs/xfs/libxfs/xfs_alloc.c:828
	torvalds#6	xfs_alloc_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:2362
	torvalds#7	xfs_free_extent_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:3029
	torvalds#8	__xfs_free_extent() fs/xfs/libxfs/xfs_alloc.c:3067
	torvalds#9	xfs_trans_free_extent() fs/xfs/xfs_extfree_item.c:370
	torvalds#10	xfs_efi_recover() fs/xfs/xfs_extfree_item.c:626
	torvalds#11	xlog_recover_process_efi() fs/xfs/xfs_log_recover.c:4605
	torvalds#12	xlog_recover_process_intents() fs/xfs/xfs_log_recover.c:4893
	torvalds#13	xlog_recover_finish() fs/xfs/xfs_log_recover.c:5824
	torvalds#14	xfs_log_mount_finish() fs/xfs/xfs_log.c:764
	torvalds#15	xfs_mountfs() fs/xfs/xfs_mount.c:978
	torvalds#16	xfs_fs_fill_super() fs/xfs/xfs_super.c:1908
	torvalds#17	mount_bdev() fs/super.c:1417
	torvalds#18	xfs_fs_mount() fs/xfs/xfs_super.c:1985
	torvalds#19	legacy_get_tree() fs/fs_context.c:647
	torvalds#20	vfs_get_tree() fs/super.c:1547
	torvalds#21	do_new_mount() fs/namespace.c:2843
	torvalds#22	do_mount() fs/namespace.c:3163
	torvalds#23	ksys_mount() fs/namespace.c:3372
	torvalds#24	__do_sys_mount() fs/namespace.c:3386
	torvalds#25	__se_sys_mount() fs/namespace.c:3383
	torvalds#26	__x64_sys_mount() fs/namespace.c:3383
	torvalds#27	do_syscall_64() arch/x86/entry/common.c:296
	torvalds#28	entry_SYSCALL_64() arch/x86/entry/entry_64.S:180

The deadlock could happen at both IO time and log recover time.

To avoid above deadlock, this patch changes the extent free procedure.
1) it always let the first extent from the EFI go (no change).
2) increase the (new) AG counter when it let a extent go.
3) for the 2nd+ extents, if the owning AGs ready have pending extents
   don't let the extent go with current transaction. Instead, move the
   extent in question and subsequent extents to a new EFI and try the new
   EFI again with new transaction (by rolling current transaction).
4) for the EFD to orginal EFI, fill it with all the extents from the original
   EFI.
5) though the new EFI is placed after original EFD, it's safe as they are in
   same in-memory transaction.
6) The new AG counter for pending extent freeings is decremented after the
   log items in in-memory transaction is committed to CIL.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
sirlucjan pushed a commit to CachyOS/linux that referenced this pull request Apr 26, 2023
To avoid possible deadlock when allocating AGFL blocks, change the behaviour.
The orignal hehaviour for freeing extents in an EFI is that the extents in
the EFI were processed one by one in the same transaction. When the second
and subsequent extents are being processed, we have produced busy extents for
previous extents. If the second and subsequent extents are from the same AG
as the busy extents are, we have the risk of deadlock when allocating AGFL
blocks. A typical calltrace for the deadlock is like this:

	#0	context_switch() kernel/sched/core.c:3881
	#1	__schedule() kernel/sched/core.c:5111
	#2	schedule() kernel/sched/core.c:5186
	#3	xfs_extent_busy_flush() fs/xfs/xfs_extent_busy.c:598
	#4	xfs_alloc_ag_vextent_size() fs/xfs/libxfs/xfs_alloc.c:1641
	#5	xfs_alloc_ag_vextent() fs/xfs/libxfs/xfs_alloc.c:828
	torvalds#6	xfs_alloc_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:2362
	torvalds#7	xfs_free_extent_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:3029
	torvalds#8	__xfs_free_extent() fs/xfs/libxfs/xfs_alloc.c:3067
	torvalds#9	xfs_trans_free_extent() fs/xfs/xfs_extfree_item.c:370
	torvalds#10	xfs_efi_recover() fs/xfs/xfs_extfree_item.c:626
	torvalds#11	xlog_recover_process_efi() fs/xfs/xfs_log_recover.c:4605
	torvalds#12	xlog_recover_process_intents() fs/xfs/xfs_log_recover.c:4893
	torvalds#13	xlog_recover_finish() fs/xfs/xfs_log_recover.c:5824
	torvalds#14	xfs_log_mount_finish() fs/xfs/xfs_log.c:764
	torvalds#15	xfs_mountfs() fs/xfs/xfs_mount.c:978
	torvalds#16	xfs_fs_fill_super() fs/xfs/xfs_super.c:1908
	torvalds#17	mount_bdev() fs/super.c:1417
	torvalds#18	xfs_fs_mount() fs/xfs/xfs_super.c:1985
	torvalds#19	legacy_get_tree() fs/fs_context.c:647
	torvalds#20	vfs_get_tree() fs/super.c:1547
	torvalds#21	do_new_mount() fs/namespace.c:2843
	torvalds#22	do_mount() fs/namespace.c:3163
	torvalds#23	ksys_mount() fs/namespace.c:3372
	torvalds#24	__do_sys_mount() fs/namespace.c:3386
	torvalds#25	__se_sys_mount() fs/namespace.c:3383
	torvalds#26	__x64_sys_mount() fs/namespace.c:3383
	torvalds#27	do_syscall_64() arch/x86/entry/common.c:296
	torvalds#28	entry_SYSCALL_64() arch/x86/entry/entry_64.S:180

The deadlock could happen at both IO time and log recover time.

To avoid above deadlock, this patch changes the extent free procedure.
1) it always let the first extent from the EFI go (no change).
2) increase the (new) AG counter when it let a extent go.
3) for the 2nd+ extents, if the owning AGs ready have pending extents
   don't let the extent go with current transaction. Instead, move the
   extent in question and subsequent extents to a new EFI and try the new
   EFI again with new transaction (by rolling current transaction).
4) for the EFD to orginal EFI, fill it with all the extents from the original
   EFI.
5) though the new EFI is placed after original EFD, it's safe as they are in
   same in-memory transaction.
6) The new AG counter for pending extent freeings is decremented after the
   log items in in-memory transaction is committed to CIL.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request May 19, 2023
The following calltrace is seen:
	#0	context_switch() kernel/sched/core.c:3881
	#1	__schedule() kernel/sched/core.c:5111
	#2	schedule() kernel/sched/core.c:5186
	#3	xfs_extent_busy_flush() fs/xfs/xfs_extent_busy.c:598
	#4	xfs_alloc_ag_vextent_size() fs/xfs/libxfs/xfs_alloc.c:1641
	#5	xfs_alloc_ag_vextent() fs/xfs/libxfs/xfs_alloc.c:828
	torvalds#6	xfs_alloc_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:2362
	torvalds#7	xfs_free_extent_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:3029
	torvalds#8	__xfs_free_extent() fs/xfs/libxfs/xfs_alloc.c:3067
	torvalds#9	xfs_trans_free_extent() fs/xfs/xfs_extfree_item.c:370
	torvalds#10	xfs_efi_recover() fs/xfs/xfs_extfree_item.c:626
	torvalds#11	xlog_recover_process_efi() fs/xfs/xfs_log_recover.c:4605
	torvalds#12	xlog_recover_process_intents() fs/xfs/xfs_log_recover.c:4893
	torvalds#13	xlog_recover_finish() fs/xfs/xfs_log_recover.c:5824
	torvalds#14	xfs_log_mount_finish() fs/xfs/xfs_log.c:764
	torvalds#15	xfs_mountfs() fs/xfs/xfs_mount.c:978
	torvalds#16	xfs_fs_fill_super() fs/xfs/xfs_super.c:1908
	torvalds#17	mount_bdev() fs/super.c:1417
	torvalds#18	xfs_fs_mount() fs/xfs/xfs_super.c:1985
	torvalds#19	legacy_get_tree() fs/fs_context.c:647
	torvalds#20	vfs_get_tree() fs/super.c:1547
	torvalds#21	do_new_mount() fs/namespace.c:2843
	torvalds#22	do_mount() fs/namespace.c:3163
	torvalds#23	ksys_mount() fs/namespace.c:3372
	torvalds#24	__do_sys_mount() fs/namespace.c:3386
	torvalds#25	__se_sys_mount() fs/namespace.c:3383
	torvalds#26	__x64_sys_mount() fs/namespace.c:3383
	torvalds#27	do_syscall_64() arch/x86/entry/common.c:296
	torvalds#28	entry_SYSCALL_64() arch/x86/entry/entry_64.S:180

During the process of the 2nd and subsequetial record in an EFI.
It is waiting for the busy blocks to be cleaned, but the only busy extent
is still hold in current xfs_trans->t_busy. That busy extent was added when
processing previous EFI record. And because that busy extent is not committed
yet, it can't be cleaned.

To avoid above deadlock, we don't block in xfs_extent_busy_flush() when
allocating AGFL blocks, instead it returns -EAGAIN. On receiving -EAGAIN
we are able to retry that EFI record with a new transaction after committing
the old transactin. With old transaction committed, the busy extent attached
to the old transaction get the change to be cleaned. On the retry, there is
no existing busy extents in the new transaction, thus no deadlock.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
sirlucjan pushed a commit to CachyOS/linux that referenced this pull request May 26, 2023
The following calltrace is seen:
	#0	context_switch() kernel/sched/core.c:3881
	#1	__schedule() kernel/sched/core.c:5111
	#2	schedule() kernel/sched/core.c:5186
	#3	xfs_extent_busy_flush() fs/xfs/xfs_extent_busy.c:598
	#4	xfs_alloc_ag_vextent_size() fs/xfs/libxfs/xfs_alloc.c:1641
	#5	xfs_alloc_ag_vextent() fs/xfs/libxfs/xfs_alloc.c:828
	torvalds#6	xfs_alloc_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:2362
	torvalds#7	xfs_free_extent_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:3029
	torvalds#8	__xfs_free_extent() fs/xfs/libxfs/xfs_alloc.c:3067
	torvalds#9	xfs_trans_free_extent() fs/xfs/xfs_extfree_item.c:370
	torvalds#10	xfs_efi_recover() fs/xfs/xfs_extfree_item.c:626
	torvalds#11	xlog_recover_process_efi() fs/xfs/xfs_log_recover.c:4605
	torvalds#12	xlog_recover_process_intents() fs/xfs/xfs_log_recover.c:4893
	torvalds#13	xlog_recover_finish() fs/xfs/xfs_log_recover.c:5824
	torvalds#14	xfs_log_mount_finish() fs/xfs/xfs_log.c:764
	torvalds#15	xfs_mountfs() fs/xfs/xfs_mount.c:978
	torvalds#16	xfs_fs_fill_super() fs/xfs/xfs_super.c:1908
	torvalds#17	mount_bdev() fs/super.c:1417
	torvalds#18	xfs_fs_mount() fs/xfs/xfs_super.c:1985
	torvalds#19	legacy_get_tree() fs/fs_context.c:647
	torvalds#20	vfs_get_tree() fs/super.c:1547
	torvalds#21	do_new_mount() fs/namespace.c:2843
	torvalds#22	do_mount() fs/namespace.c:3163
	torvalds#23	ksys_mount() fs/namespace.c:3372
	torvalds#24	__do_sys_mount() fs/namespace.c:3386
	torvalds#25	__se_sys_mount() fs/namespace.c:3383
	torvalds#26	__x64_sys_mount() fs/namespace.c:3383
	torvalds#27	do_syscall_64() arch/x86/entry/common.c:296
	torvalds#28	entry_SYSCALL_64() arch/x86/entry/entry_64.S:180

During the process of the 2nd and subsequetial record in an EFI.
It is waiting for the busy blocks to be cleaned, but the only busy extent
is still hold in current xfs_trans->t_busy. That busy extent was added when
processing previous EFI record. And because that busy extent is not committed
yet, it can't be cleaned.

To avoid above deadlock, we don't block in xfs_extent_busy_flush() when
allocating AGFL blocks, instead it returns -EAGAIN. On receiving -EAGAIN
we are able to retry that EFI record with a new transaction after committing
the old transactin. With old transaction committed, the busy extent attached
to the old transaction get the change to be cleaned. On the retry, there is
no existing busy extents in the new transaction, thus no deadlock.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Mimoja pushed a commit to Mimoja/linux that referenced this pull request Aug 26, 2023
The imx_sec_dsim_probe may be deferred (e.g. when uising ti,sn65dsi84).
After a deferral, of_reset_control_array_get will fail
because of the call in the previous imx_sec_dsim_probe.

To fix this, unwind by doing the following when component_add is deferred:
- Release reset controls by calling sec_dsim_of_put_resets
- Disable PM by calling pm_runtime_disable.

Tested using ti,sn65dsi84 on the following Variscite configurations:
- VAR-SOM-MX8M-NANO v1.1 + Symphony v1.6
- DART-MX8M-MINI v1.1 + DT8MCustomBoard v3.0

Fixes the following:

[    1.929043] imx_sec_dsim_drv 32e10000.dsi_controller: version number is 0x1060200
[    1.936603] [drm:drm_bridge_attach] *ERROR* failed to attach bridge /soc@0/bus@32c00000/dsi_controller@32e10000 to encoder DSI-34: -517
[    1.948815] imx_sec_dsim_drv 32e10000.dsi_controller: Failed to attach bridge: 32e10000.dsi_controller
[    1.958137] imx_sec_dsim_drv 32e10000.dsi_controller: failed to bind sec dsim bridge: -517
[    1.972737] pps pps0: new PPS source ptp0
[    2.083716] vddio: Bringing 1500000uV into 1800000-1800000uV
[    2.090807] fec 30be0000.ethernet eth0: registered PHC device 0
[    2.103846] imx-cpufreq-dt imx-cpufreq-dt: cpu speed grade 8 mkt segment 0 supported-hw 0x100 0x1
[    2.113964] Hot alarm is canceled. GPU3D clock will return to 64/64
[    2.123475] sdhci-esdhc-imx 30b50000.mmc: Got CD GPIO
[    2.128789] mxc-mipi-csi2-sam 32e30000.csi: supply mipi-phy not found, using dummy regulator
[    2.138023] mxc-mipi-csi2-sam 32e30000.csi: lanes: 2, hs_settle: 13, clk_settle: 2, wclk: 1, freq: 333000000
[    2.150900] ------------[ cut here ]------------
[    2.155536] WARNING: CPU: 2 PID: 8 at drivers/reset/core.c:765 __reset_control_get_internal+0x68/0x160
[    2.164859] Modules linked in:
[    2.165181] mmc1: SDHCI controller on 30b50000.mmc [30b50000.mmc] using ADMA
[    2.167919] CPU: 2 PID: 8 Comm: kworker/u8:0 Not tainted 5.15.60-05940-gd9b705ee8975 torvalds#28
[    2.167926] Hardware name: Variscite VAR-SOM-MX8M-NANO on Symphony-Board (DT)
[    2.167932] Workqueue: events_unbound deferred_probe_work_func
[    2.167944] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    2.202981] pc : __reset_control_get_internal+0x68/0x160
[    2.208295] lr : __of_reset_control_get+0x16c/0x1d0
[    2.213176] sp : ffff800009df3990
[    2.216488] x29: ffff800009df3990 x28: ffff000017dd1e80 x27: ffff00000516fb38
[    2.223629] x26: 0000000000000001 x25: 0000000000000000 x24: 0000000000000001
[    2.230767] x23: 0000000000000000 x22: ffff000004982880 x21: 0000000000000001
[    2.237907] x20: ffff0000049828a0 x19: ffff00000545bd80 x18: ffffffffffffffff
[    2.245046] x17: 7465735f6b6c6320 x16: 2c3331203a656c74 x15: ffff00000516f78a
[    2.252186] x14: ffffffffffffffff x13: 0000000000000018 x12: 0101010101010101
[    2.259326] x11: 0000000000000030 x10: 0101010101010101 x9 : 0000000000000000
[    2.266465] x8 : 7f7f7f7f7f7f7f7f x7 : 70ff726b6b64622c x6 : 0000000000008072
[    2.273604] x5 : fffffbfffdc01818 x4 : 0000000000000000 x3 : 0000000000000001
[    2.280743] x2 : 0000000000000001 x1 : 0000000000000001 x0 : 0000000000000000
[    2.287884] Call trace:
[    2.290329]  __reset_control_get_internal+0x68/0x160
[    2.295296]  __of_reset_control_get+0x16c/0x1d0
[    2.299828]  of_reset_control_array_get+0xac/0x230
[    2.304622]  imx_sec_dsim_probe+0x21c/0x30c
[    2.308809]  platform_probe+0x68/0xe0
[    2.312475]  really_probe.part.0+0x9c/0x30c
[    2.316660]  __driver_probe_device+0x98/0x144
[    2.321018]  driver_probe_device+0xc8/0x15c
[    2.325202]  __device_attach_driver+0xb4/0x120
[    2.329645]  bus_for_each_drv+0x78/0xd0
[    2.333483]  __device_attach+0xa8/0x1c0
[    2.337320]  device_initial_probe+0x14/0x20
[    2.341505]  bus_probe_device+0x9c/0xa4
[    2.345342]  deferred_probe_work_func+0x80/0xc0
[    2.349874]  process_one_work+0x1d0/0x354
[    2.353890]  worker_thread+0x2c0/0x470
[    2.357641]  kthread+0x150/0x160
[    2.360871]  ret_from_fork+0x10/0x20
[    2.364451] ---[ end trace 60172b2d7c2580e5 ]---
[    2.369212] ------------[ cut here ]------------
[    2.373828] WARNING: CPU: 2 PID: 8 at drivers/reset/core.c:765 __reset_control_get_internal+0x68/0x160
[    2.383141] Modules linked in:
[    2.386198] CPU: 2 PID: 8 Comm: kworker/u8:0 Tainted: G        W         5.15.60-05940-gd9b705ee8975 torvalds#28
[    2.395677] Hardware name: Variscite VAR-SOM-MX8M-NANO on Symphony-Board (DT)
[    2.402811] Workqueue: events_unbound deferred_probe_work_func
[    2.408648] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    2.415610] pc : __reset_control_get_internal+0x68/0x160
[    2.420924] lr : __of_reset_control_get+0x16c/0x1d0
[    2.425803] sp : ffff800009df3990
[    2.429114] x29: ffff800009df3990 x28: ffff000017dd20a8 x27: ffff00000516fb38
[    2.436255] x26: 0000000000000001 x25: 0000000000000000 x24: 0000000000000001
[    2.443393] x23: 0000000000000000 x22: ffff000004982f80 x21: 0000000000000001
[    2.450533] x20: ffff000004982fa0 x19: ffff00000545bf00 x18: ffffffffffffffff
[    2.457672] x17: 7465735f6b6c6320 x16: 2c3331203a656c74 x15: ffff00000516f78a
[    2.464812] x14: ffffffffffffffff x13: 0000000000000018 x12: 0101010101010101
[    2.471952] x11: 0000000000000030 x10: 0101010101010101 x9 : 0000000000000000
[    2.479091] x8 : 7f7f7f7f7f7f7f7f x7 : 70ff726b6b64622c x6 : 0000000000008072
[    2.486230] x5 : fffffbfffdc01868 x4 : 0000000000000000 x3 : 0000000000000001
[    2.493369] x2 : 0000000000000001 x1 : 0000000000000001 x0 : 0000000000000000
[    2.500509] Call trace:
[    2.502954]  __reset_control_get_internal+0x68/0x160
[    2.507920]  __of_reset_control_get+0x16c/0x1d0
[    2.512452]  of_reset_control_array_get+0xac/0x230
[    2.517245]  imx_sec_dsim_probe+0x21c/0x30c
[    2.521433]  platform_probe+0x68/0xe0
[    2.525097]  really_probe.part.0+0x9c/0x30c
[    2.529281]  __driver_probe_device+0x98/0x144
[    2.533640]  driver_probe_device+0xc8/0x15c
[    2.537824]  __device_attach_driver+0xb4/0x120
[    2.542269]  bus_for_each_drv+0x78/0xd0
[    2.546107]  __device_attach+0xa8/0x1c0
[    2.549945]  device_initial_probe+0x14/0x20
[    2.554129]  bus_probe_device+0x9c/0xa4
[    2.557965]  deferred_probe_work_func+0x80/0xc0
[    2.562497]  process_one_work+0x1d0/0x354
[    2.566509]  worker_thread+0x2c0/0x470
[    2.570260]  kthread+0x150/0x160
[    2.573490]  ret_from_fork+0x10/0x20
[    2.577067] ---[ end trace 60172b2d7c2580e6 ]---
[    2.581883] ------------[ cut here ]------------
[    2.586499] WARNING: CPU: 2 PID: 8 at drivers/reset/core.c:765 __reset_control_get_internal+0x68/0x160
[    2.595808] Modules linked in:
[    2.598862] CPU: 2 PID: 8 Comm: kworker/u8:0 Tainted: G        W         5.15.60-05940-gd9b705ee8975 torvalds#28
[    2.608341] Hardware name: Variscite VAR-SOM-MX8M-NANO on Symphony-Board (DT)
[    2.615474] Workqueue: events_unbound deferred_probe_work_func
[    2.621309] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    2.628271] pc : __reset_control_get_internal+0x68/0x160
[    2.633584] lr : __of_reset_control_get+0x16c/0x1d0
[    2.638464] sp : ffff800009df3990
[    2.641775] x29: ffff800009df3990 x28: ffff000017dd22d0 x27: ffff00000516fb38
[    2.648915] x26: 0000000000000001 x25: 0000000000000000 x24: 0000000000000001
[    2.656053] x23: 0000000000000000 x22: ffff000004983680 x21: 0000000000000001
[    2.663192] x20: ffff0000049836a0 x19: ffff000004909500 x18: ffffffffffffffff
[    2.670331] x17: 7465735f6b6c6320 x16: 2c3331203a656c74 x15: ffff00000516f78a
[    2.677469] x14: ffffffffffffffff x13: 0000000000000018 x12: 0101010101010101
[    2.684609] x11: 0000000000000030 x10: 0101010101010101 x9 : 0000000000000000
[    2.691747] x8 : 7f7f7f7f7f7f7f7f x7 : 70ff726b6b64622c x6 : 0000000000008072
[    2.698886] x5 : fffffbfffdc018b8 x4 : 0000000000000000 x3 : 0000000000000001
[    2.706024] x2 : 0000000000000001 x1 : 0000000000000001 x0 : 0000000000000000
[    2.713162] Call trace:
[    2.715606]  __reset_control_get_internal+0x68/0x160
[    2.720572]  __of_reset_control_get+0x16c/0x1d0
[    2.725103]  of_reset_control_array_get+0xac/0x230
[    2.729895]  imx_sec_dsim_probe+0x21c/0x30c
[    2.734080]  platform_probe+0x68/0xe0
[    2.737744]  really_probe.part.0+0x9c/0x30c
[    2.741928]  __driver_probe_device+0x98/0x144
[    2.746285]  driver_probe_device+0xc8/0x15c
[    2.750469]  __device_attach_driver+0xb4/0x120
[    2.754914]  bus_for_each_drv+0x78/0xd0
[    2.758750]  __device_attach+0xa8/0x1c0
[    2.762586]  device_initial_probe+0x14/0x20
[    2.766771]  bus_probe_device+0x9c/0xa4
[    2.770606]  deferred_probe_work_func+0x80/0xc0
[    2.775137]  process_one_work+0x1d0/0x354
[    2.779149]  worker_thread+0x2c0/0x470
[    2.782899]  kthread+0x150/0x160
[    2.786127]  ret_from_fork+0x10/0x20
[    2.789703] ---[ end trace 60172b2d7c2580e7 ]---
[    2.794359] imx_sec_dsim_drv 32e10000.dsi_controller: no invalid reset control exists
[    2.802553] imx_sec_dsim_drv: probe of 32e10000.dsi_controller failed with error -22

Signed-off-by: Nate Drude <nate.d@variscite.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Dec 7, 2023
It is dangerous to output warnings in task_call_func,
which may lead to the risk of deadlock. task_call_func
uses p->pi_lock, and the serial port output may call
try_to_wake_up to wake up the write buff, which also
uses p->pi_lock.

 WARNING: possible circular locking dependency detected
 6.7.0-rc4 torvalds#28 Not tainted
 ------------------------------------------------------
 sh/475 is trying to acquire lock:
 ffff800082b17f20 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x18/0x48

 but task is already holding lock:
 ffff0000c582dde0 (&p->pi_lock){-.-.}-{2:2}, at: task_call_func+0x40/0x124

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&p->pi_lock){-.-.}-{2:2}:
        _raw_spin_lock_irqsave+0x68/0xd0
        try_to_wake_up+0x5c/0x820
        wake_up_process+0x18/0x24
        __up.isra.0+0x40/0x4c
        up+0x5c/0x78
        console_unlock+0x124/0x138
        vt_move_to_console+0x48/0xb8
        pm_restore_console+0x44/0x5c
        pm_suspend+0x2f0/0x688
        state_store+0x8c/0x118
        kobj_attr_store+0x18/0x2c
        sysfs_kf_write+0x4c/0x78
        kernfs_fop_write_iter+0x120/0x1b4
        vfs_write+0x3b4/0x558
        ksys_write+0x6c/0xfc
        __arm64_sys_write+0x1c/0x28
        invoke_syscall+0x44/0x104
        el0_svc_common.constprop.0+0xc0/0xe0
        do_el0_svc+0x1c/0x28
        el0_svc+0x50/0xec
        el0t_64_sync_handler+0xc0/0xc4
        el0t_64_sync+0x190/0x194

 -> #0 ((console_sem).lock){-.-.}-{2:2}:
        __lock_acquire+0x1248/0x1ab4
        lock_acquire+0x120/0x308
        _raw_spin_lock_irqsave+0x68/0xd0
        down_trylock+0x18/0x48
        __down_trylock_console_sem+0x38/0xc4
        console_trylock+0x34/0x78
        vprintk_emit+0x124/0x3a4
        vprintk_default+0x38/0x44
        vprintk+0xb0/0xc0
        _printk+0x60/0x88
        report_bug+0x208/0x270
        bug_handler+0x24/0x6c
        brk_handler+0x70/0xd4
        do_debug_exception+0x9c/0x16c
        el1_dbg+0x74/0x90
        el1h_64_sync_handler+0xc8/0xe4
        el1h_64_sync+0x64/0x68
        __set_task_frozen+0x64/0xac
        task_call_func+0xa0/0x124
        freeze_task+0xb4/0x10c
        try_to_freeze_tasks+0xd8/0x3fc
        freeze_processes+0xd4/0xe4
        pm_suspend+0x21c/0x688
        state_store+0x8c/0x118
        kobj_attr_store+0x18/0x2c
        sysfs_kf_write+0x4c/0x78
        kernfs_fop_write_iter+0x120/0x1b4
        vfs_write+0x3b4/0x558
        ksys_write+0x6c/0xfc
        __arm64_sys_write+0x1c/0x28
        invoke_syscall+0x44/0x104
        el0_svc_common.constprop.0+0xc0/0xe0
        do_el0_svc+0x1c/0x28
        el0_svc+0x50/0xec
        el0t_64_sync_handler+0xc0/0xc4
        el0t_64_sync+0x190/0x194

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&p->pi_lock);
                                lock((console_sem).lock);
                                lock(&p->pi_lock);

  *** DEADLOCK ***

 7 locks held by sh/475:
  #0: ffff0000c7245420 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x6c/0xfc
  #1: ffff0000c313a890 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xf0/0x1b4
  #2: ffff0000c0d3e0b8 (kn->active#48){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xf8/0x1b4
  #3: ffff800082b11530 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0xb4/0x688
  #4: ffff800082ae6058 (tasklist_lock){.+.+}-{2:2}, at: try_to_freeze_tasks+0x88/0x3fc
  #5: ffff800082b93f10 (freezer_lock){....}-{2:2}, at: freeze_task+0x3c/0x10c
  torvalds#6: ffff0000c582dde0 (&p->pi_lock){-.-.}-{2:2}, at: task_call_func+0x40/0x124

Fixes: f5d39b0 ("freezer,sched: Rewrite core freezer logic")
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
logic10492 pushed a commit to logic10492/linux-amd-zen2 that referenced this pull request Jan 18, 2024
scx_pair uses the default stride value of nr_cpu_ids / 2, which matches most
x86 SMT configurations. However, it does allow specifying a custom stride
value with -S so that e.g. neighboring CPUs can be paired up. However, not
all stride values work and errors were not reported very well.

This patch improves error handling so that scx_pair fails with clear error
message if CPUs can't be paired up with the specified stride value. scx_pair
now also prints out how CPUs are paired on startup.

This should address issues torvalds#28 and torvalds#29.
gyroninja added a commit to gyroninja/linux that referenced this pull request Jan 28, 2024
KSAN calls into rcu code which then triggers a write that reenters into KSAN
getting the system stuck doing infinite recursion.

#0  kmsan_get_context () at mm/kmsan/kmsan.h:106
#1  __msan_get_context_state () at mm/kmsan/instrumentation.c:331
#2  0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42
#3  rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#4  __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#5  0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#6  pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#7  kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#8  virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#9  0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82
torvalds#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75
torvalds#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97
torvalds#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36
torvalds#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92
torvalds#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397
torvalds#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500
torvalds#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285
torvalds#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324
torvalds#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296
torvalds#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262
torvalds#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932
torvalds#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555
torvalds#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536
torvalds#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461
torvalds#93 0x0000000000000000 in ??
gyroninja added a commit to gyroninja/linux that referenced this pull request Jan 28, 2024
As of 5ec8e8e(mm/sparsemem: fix race in accessing memory_section->usage) KMSAN
now calls into RCU tree code during kmsan_get_metadata. This will trigger a
write that will reenter into KMSAN getting the system stuck doing infinite
recursion.

#0  kmsan_get_context () at mm/kmsan/kmsan.h:106
#1  __msan_get_context_state () at mm/kmsan/instrumentation.c:331
#2  0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42
#3  rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#4  __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#5  0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#6  pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#7  kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#8  virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#9  0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
torvalds#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
torvalds#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
torvalds#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
torvalds#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
torvalds#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
torvalds#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
torvalds#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
torvalds#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
torvalds#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82
torvalds#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75
torvalds#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97
torvalds#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36
torvalds#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92
torvalds#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397
torvalds#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500
torvalds#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285
torvalds#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324
torvalds#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296
torvalds#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262
torvalds#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932
torvalds#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555
torvalds#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536
torvalds#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461
torvalds#93 0x0000000000000000 in ??
gyohng pushed a commit to gyohng/linux-h616 that referenced this pull request May 15, 2024
A wait operation can be interrupted for the following reasons:

- a signal was received while waiting for the event.

- the caller got forcibly unblocked while waiting for the event.

- a signal was received or the caller got forcibly unblocked while
  trying to reacquire the lock once the event was successfully
  received.

Ensure these conditions are differentiated when returning to the
caller via the operation status (req.status), which enables the user
interface to decide whether it needs to restart the syscall
transparently after some fixups on signal receipt, or bail out on
error if forcibly unblocked for any other reason.

This change causes a bump of the current ABI version (torvalds#28). However,
it does not affect current and older libevl releases since none of
them check the operation status when an interrupted wait is detected.

Signed-off-by: Philippe Gerum <rpm@xenomai.org>
gyohng pushed a commit to gyohng/linux-h616 that referenced this pull request Jun 3, 2024
A wait operation can be interrupted for the following reasons:

- a signal was received while waiting for the event.

- the caller got forcibly unblocked while waiting for the event.

- a signal was received or the caller got forcibly unblocked while
  trying to reacquire the lock once the event was successfully
  received.

Ensure these conditions are differentiated when returning to the
caller via the operation status (req.status), which enables the user
interface to decide whether it needs to restart the syscall
transparently after some fixups on signal receipt, or bail out on
error if forcibly unblocked for any other reason.

This change causes a bump of the current ABI version (torvalds#28). However,
it does not affect current and older libevl releases since none of
them check the operation status when an interrupted wait is detected.

Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant