Corrected filter for write_same and added install/update scripts. #1

Merged
merged 2 commits into LIS:master on Apr 3, 2015
Conversation

nickme (Contributor) commented Apr 3, 2015

No description provided.

nickme added a commit that referenced this pull request Apr 3, 2015
Corrected filter for write_same and added install/update scripts.
nickme merged commit b321949 into LIS:master Apr 3, 2015
vyadavmsft added a commit that referenced this pull request Jul 24, 2015
get in sync with LIS master
vyadavmsft added a commit that referenced this pull request Feb 26, 2016
vyadavmsft pushed a commit that referenced this pull request Jun 9, 2016
vyadavmsft added a commit that referenced this pull request Jun 23, 2016
pull update from master
vyadavmsft pushed a commit that referenced this pull request Jul 19, 2017
Bring 4.2.3 and master in sync
chvalean pushed a commit that referenced this pull request Jan 3, 2018
 backport "hv: kvp: Avoid reading past allocated blocks from KVP file
chvalean pushed a commit that referenced this pull request Jan 9, 2018
Hv netsvc use reciprocal divide to speed up computing padding
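Reciprocal divide replaces a division in a hot path with a multiply and a shift against a precomputed fixed-point reciprocal. A minimal userspace C sketch of the idea follows; this is not the kernel's reciprocal_div.h API, and the divisor is illustrative:

#include <stdint.h>
#include <stdio.h>

/* Precompute ceil(2^32 / d) once; dividing by d then becomes a
 * 64-bit multiply and a shift.  This simple form is exact only for
 * bounded dividends (the kernel later grew a corrected variant),
 * but it illustrates the speed-up. */
static uint64_t reciprocal(uint32_t d)
{
    return ((1ULL << 32) + d - 1) / d;
}

static uint32_t div_fast(uint32_t a, uint64_t r)
{
    return (uint32_t)(((uint64_t)a * r) >> 32);
}

int main(void)
{
    uint32_t d = 48;                 /* hypothetical per-packet stride */
    uint64_t r = reciprocal(d);

    for (uint32_t a = 0; a < 100000; a++) {
        if (div_fast(a, r) != a / d) {
            printf("mismatch at %u\n", a);
            return 1;
        }
    }
    printf("multiply-and-shift matched hardware divide for all inputs\n");
    return 0;
}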
dcui added a commit to dcui/lis-next that referenced this pull request May 22, 2019
In the case of cpumask_equal(mask, cpu_online_mask) == false, "mask" may
be a superset of "cfg->domain", and the real affinity is still saved in
"cfg->domain" after __ioapic_set_affinity() returns. See the line
"cpumask_copy(cfg->domain, tmp_mask);" in RHEL 7.x's kernel function
__assign_irq_vector().

So we should always use "cfg->domain"; otherwise the NVMe driver may
fail to receive the expected interrupt, and later the buggy error
handling code in nvme_dev_disable() can cause the panic below:

[   71.695565] nvme nvme7: I/O 19 QID 0 timeout, disable controller
[   71.724221] ------------[ cut here ]------------
[   71.725067] WARNING: CPU: 4 PID: 11317 at kernel/irq/manage.c:1348 __free_irq+0xb3/0x280
[   71.725067] Trying to free already-free IRQ 226
[   71.725067] Modules linked in: ...
[   71.725067] CPU: 4 PID: 11317 Comm: kworker/4:1H Tainted: G OE  ------------ T 3.10.0-957.10.1.el7.x86_64 #1
[   71.725067] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  05/18/2018
[   71.725067] Workqueue: kblockd blk_mq_timeout_work
[   71.725067] Call Trace:
[   71.725067]  [<ffffffff8cf62e41>] dump_stack+0x19/0x1b
[   71.725067]  [<ffffffff8c897688>] __warn+0xd8/0x100
[   71.725067]  [<ffffffff8c89770f>] warn_slowpath_fmt+0x5f/0x80
[   71.725067]  [<ffffffff8c94ac83>] __free_irq+0xb3/0x280
[   71.725067]  [<ffffffff8c94aed9>] free_irq+0x39/0x90
[   71.725067]  [<ffffffffc046b33c>] nvme_dev_disable+0x11c/0x4b0 [nvme]
[   71.725067]  [<ffffffff8cca465c>] ? dev_warn+0x6c/0x90
[   71.725067]  [<ffffffffc046bb34>] nvme_timeout+0x204/0x2d0 [nvme]
[   71.725067]  [<ffffffff8cb55c6d>] ? blk_mq_do_dispatch_sched+0x9d/0x130
[   71.725067]  [<ffffffff8c8e015c>] ? update_curr+0x14c/0x1e0
[   71.725067]  [<ffffffff8cb505a2>] blk_mq_rq_timed_out+0x32/0x80
[   71.725067]  [<ffffffff8cb5064c>] blk_mq_check_expired+0x5c/0x60
[   71.725067]  [<ffffffff8cb53924>] bt_iter+0x54/0x60
[   71.725067]  [<ffffffff8cb5425b>] blk_mq_queue_tag_busy_iter+0x11b/0x290
[   71.725067]  [<ffffffff8cb505f0>] ? blk_mq_rq_timed_out+0x80/0x80
[   71.725067]  [<ffffffff8cb505f0>] ? blk_mq_rq_timed_out+0x80/0x80
[   71.725067]  [<ffffffff8cb4f1db>] blk_mq_timeout_work+0x8b/0x180
[   71.725067]  [<ffffffff8c8b9d8f>] process_one_work+0x17f/0x440
[   71.725067]  [<ffffffff8c8bae26>] worker_thread+0x126/0x3c0
[   71.725067]  [<ffffffff8c8bad00>] ? manage_workers.isra.25+0x2a0/0x2a0
[   71.725067]  [<ffffffff8c8c1c71>] kthread+0xd1/0xe0
[   71.725067]  [<ffffffff8c8c1ba0>] ? insert_kthread_work+0x40/0x40
[   71.725067]  [<ffffffff8cf75c24>] ret_from_fork_nospec_begin+0xe/0x21
[   71.725067]  [<ffffffff8c8c1ba0>] ? insert_kthread_work+0x40/0x40
[   71.725067] ---[ end trace b3257623bc50d02a ]---
[   72.196556] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[   72.211013] IP: [<ffffffff8c94aed9>] free_irq+0x39/0x90

It looks like the bug is more easily triggered when the VM has many
vCPUs, e.g. the L64v2 or L80v2 VM sizes. Presumably, in such a VM, the NVMe
driver can pass a "mask" that has multiple bits set but is not equal
to "cpu_online_mask". Previously we incorrectly assumed that "mask" either
has only a single bit set or equals "cpu_online_mask".

Fixes: 9c8bbae ("RH7: PCI: hv: respect the affinity setting")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
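
The fix described above boils down to deriving the interrupt's target vCPU set from "cfg->domain" (the effective affinity) instead of the caller-supplied "mask". A minimal userspace sketch of that reasoning follows, with plain uint64_t bitmasks standing in for struct cpumask; the values and variable names are illustrative, not taken from the lis-next sources:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t cpu_online_mask = 0xffull;  /* 8 online vCPUs */
    uint64_t mask            = 0x3cull;  /* caller asks for CPUs 2-5 */
    uint64_t cfg_domain      = 0x04ull;  /* what __assign_irq_vector()
                                          * really programmed: CPU 2 only */

    /* "mask" has several bits set and is not cpu_online_mask, yet it is
     * only a strict superset of the real affinity kept in cfg->domain. */
    printf("mask is a strict superset of cfg->domain: %s\n",
           ((mask & cfg_domain) == cfg_domain && mask != cfg_domain)
               ? "yes" : "no");

    /* Buggy choice: target the interrupt at "mask".  CPUs 3-5 are in the
     * target set but no vector was assigned there, so the interrupt can
     * go undelivered and the NVMe request times out. */
    uint64_t buggy_target = mask & cpu_online_mask;

    /* Fixed choice: always target cfg->domain. */
    uint64_t fixed_target = cfg_domain & cpu_online_mask;

    printf("buggy target set: 0x%llx, fixed target set: 0x%llx\n",
           (unsigned long long)buggy_target,
           (unsigned long long)fixed_target);
    return 0;
}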
johnsongeorge-w pushed a commit that referenced this pull request May 23, 2019
(same commit message as above)