Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-Wswitch in drivers/scsi/* #85

Closed
shenki opened this issue Sep 17, 2018 · 16 comments
Closed

-Wswitch in drivers/scsi/* #85

shenki opened this issue Sep 17, 2018 · 16 comments
Assignees
Labels
-Wswitch [BUG] linux A bug that should be fixed in the mainline kernel. [FIXED][LINUX] 5.1 This bug was fixed in Linux 5.1

Comments

@shenki
Copy link
Member

shenki commented Sep 17, 2018

drivers/scsi/cxlflash/superpipe.c:1948:7: warning: overflow converting case value to switch condition type
      (3228093062 to 18446744072642677382) [-Wswitch]
        case DK_CXLFLASH_MANAGE_LUN:
             ^
./include/uapi/scsi/cxlflash_ioctl.h:199:33: note: expanded from macro 'DK_CXLFLASH_MANAGE_LUN'
#define DK_CXLFLASH_MANAGE_LUN          CXL_IOWR(0x86, dk_cxlflash_manage_lun)
                                        ^
./include/uapi/scsi/cxlflash_ioctl.h:187:26: note: expanded from macro 'CXL_IOWR'
#define CXL_IOWR(_n, _s)        _IOWR(CXL_MAGIC, _n, struct _s)
                                ^
./include/uapi/asm-generic/ioctl.h:88:29: note: expanded from macro '_IOWR'
#define _IOWR(type,nr,size)     _IOC(_IOC_READ|_IOC_WRITE,(type),(nr),(_IOC_TYPECHECK(size)))
                                ^
./include/uapi/asm-generic/ioctl.h:70:2: note: expanded from macro '_IOC'
        (((dir)  << _IOC_DIRSHIFT) | \
        ^
@shenki shenki added help wanted good first issue Good for newcomers [BUG] linux A bug that should be fixed in the mainline kernel. labels Sep 17, 2018
@shenki
Copy link
Member Author

shenki commented Sep 17, 2018

Note this also warns in main.c:

drivers/scsi/cxlflash/main.c:3295:7: warning: overflow converting case value to switch condition type
      (3231763135 to 18446744072646347455) [-Wswitch]
        case HT_CXLFLASH_LUN_PROVISION:
             ^

@tpimh
Copy link

tpimh commented Sep 17, 2018

Is this only a warning? If it is nothing fatal, I suggest adding "low priority" label to it.

@shenki shenki added low priority This bug is not critical and not a priority and removed low priority This bug is not critical and not a priority labels Sep 18, 2018
@nickdesaulniers
Copy link
Member

All of our -Wswitch warnings go through _IOC. Where there's smoke, there's fire.

@nathanchance
Copy link
Member

The problem with the -Wswitch scsi warnings is they all come from ioctl functions, where the cmd parameter type is defined as int in include/scsi/scsi_host.h so fixing these warnings by turning the type to unsigned int is going to be a scsi wide change (and that's not taking into account if there are any ioctls that do allow a negative command).

Will probably need to brainstorm and think of other possible solutions.

@nathanchance
Copy link
Member

I went ahead and sent an email to the SCSI people for clarification: https://lore.kernel.org/lkml/20181004183047.GA1885@flashbox/

@nathanchance
Copy link
Member

Proposed diff: https://lore.kernel.org/lkml/20181019065104.GA27170@flashbox/

I'll clean it up and write a formal commit message tomorrow but it at least builds for both x86_64 and arm64 allyesconfig

@nathanchance nathanchance self-assigned this Oct 19, 2018
@nathanchance
Copy link
Member

@nathanchance nathanchance added [PATCH] Submitted A patch has been submitted for review and removed good first issue Good for newcomers labels Oct 19, 2018
fengguang pushed a commit to 0day-ci/linux that referenced this issue Oct 20, 2018
…late

Clang warns several times in the scsi subsystem (trimmed for brevity):

drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to
switch condition type (2147762695 to 18446744071562347015) [-Wswitch]
        case CCISS_GETBUSTYPES:
             ^
drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to
switch condition type (2147762694 to 18446744071562347014) [-Wswitch]
        case CCISS_GETHEARTBEAT:
             ^

The root cause is that the _IOC macro can generate really large numbers,
which don't find into type 'int', which is used for the cmd paremeter in
the ioctls in scsi_host_template. My research into how GCC and Clang are
handling this at a low level didn't prove fruitful. However, looking at
the rest of the kernel tree, all ioctls use an 'unsigned int' for the
cmd parameter, which will fit all of the _IOC values in the scsi/ata
subsystems.

Make that change because none of the ioctls expect to take a negative
value, it brings the ioctls inline with the reset of the kernel, and it
removes ambiguity, which is never good when dealing with compilers.

Link: ClangBuiltLinux#85
Link: ClangBuiltLinux#154
Link: ClangBuiltLinux#157
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
@nathanchance
Copy link
Member

nathanchance pushed a commit that referenced this issue Nov 6, 2018
The vport should be initialized to hdev->vport for each bp group,
otherwise it will cause out-of-bounds access and bp setting not
correct problem.

[   35.254124] BUG: KASAN: slab-out-of-bounds in hclge_pause_setup_hw+0x2a0/0x3f8 [hclge]
[   35.254126] Read of size 2 at addr ffff803b6651581a by task kworker/0:1/14

[   35.254132] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted 4.19.0-rc7-hulk+ #85
[   35.254133] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - B052 (V0.52) 09/14/2018
[   35.254141] Workqueue: events work_for_cpu_fn
[   35.254144] Call trace:
[   35.254147]  dump_backtrace+0x0/0x2f0
[   35.254149]  show_stack+0x24/0x30
[   35.254154]  dump_stack+0x110/0x184
[   35.254157]  print_address_description+0x168/0x2b0
[   35.254160]  kasan_report+0x184/0x310
[   35.254162]  __asan_load2+0x7c/0xa0
[   35.254170]  hclge_pause_setup_hw+0x2a0/0x3f8 [hclge]
[   35.254177]  hclge_tm_init_hw+0x794/0x9f0 [hclge]
[   35.254184]  hclge_tm_schd_init+0x48/0x58 [hclge]
[   35.254191]  hclge_init_ae_dev+0x778/0x1168 [hclge]
[   35.254196]  hnae3_register_ae_dev+0x14c/0x298 [hnae3]
[   35.254206]  hns3_probe+0x88/0xa8 [hns3]
[   35.254210]  local_pci_probe+0x7c/0xf0
[   35.254212]  work_for_cpu_fn+0x34/0x50
[   35.254214]  process_one_work+0x4d4/0xa38
[   35.254216]  worker_thread+0x55c/0x8d8
[   35.254219]  kthread+0x1b0/0x1b8
[   35.254222]  ret_from_fork+0x10/0x1c

[   35.254224] The buggy address belongs to the page:
[   35.254228] page:ffff7e00ed994400 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0
[   35.273835] flags: 0xfffff8000008000(head)
[   35.282007] raw: 0fffff8000008000 dead000000000100 dead000000000200 0000000000000000
[   35.282010] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[   35.282012] page dumped because: kasan: bad access detected

[   35.282014] Memory state around the buggy address:
[   35.282017]  ffff803b66515700: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[   35.282019]  ffff803b66515780: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[   35.282021] >ffff803b66515800: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[   35.282022]                             ^
[   35.282024]  ffff803b66515880: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[   35.282026]  ffff803b66515900: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[   35.282028] ==================================================================
[   35.282029] Disabling lock debugging due to kernel taint
[   35.282747] hclge driver initialization finished.

Fixes: 67bf254 ("net: hns3: Fixes the back pressure setting when sriov is enabled")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
@nathanchance
Copy link
Member

Patch v3 to fix the second comment's warning (which I missed because I wasn't building PPC locally, that's been fixed now): https://lore.kernel.org/lkml/20190114044206.6351-1-natechancellor@gmail.com/

fengguang pushed a commit to 0day-ci/linux that referenced this issue Jan 14, 2019
…late

Clang warns several times in the scsi subsystem (trimmed for brevity):

drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to
switch condition type (2147762695 to 18446744071562347015) [-Wswitch]
        case CCISS_GETBUSTYPES:
             ^
drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to
switch condition type (2147762694 to 18446744071562347014) [-Wswitch]
        case CCISS_GETHEARTBEAT:
             ^

The root cause is that the _IOC macro can generate really large numbers,
which don't find into type 'int', which is used for the cmd paremeter in
the ioctls in scsi_host_template. My research into how GCC and Clang are
handling this at a low level didn't prove fruitful. However, looking at
the rest of the kernel tree, all ioctls use an 'unsigned int' for the
cmd parameter, which will fit all of the _IOC values in the scsi/ata
subsystems.

Make that change because none of the ioctls expect to take a negative
value, it brings the ioctls inline with the reset of the kernel, and it
removes ambiguity, which is never good when dealing with compilers.

Link: ClangBuiltLinux#85
Link: ClangBuiltLinux#154
Link: ClangBuiltLinux#157
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
fengguang pushed a commit to 0day-ci/linux that referenced this issue Jan 14, 2019
…late

Clang warns several times in the scsi subsystem (trimmed for brevity):

drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to
switch condition type (2147762695 to 18446744071562347015) [-Wswitch]
        case CCISS_GETBUSTYPES:
             ^
drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to
switch condition type (2147762694 to 18446744071562347014) [-Wswitch]
        case CCISS_GETHEARTBEAT:
             ^

The root cause is that the _IOC macro can generate really large numbers,
which don't find into type 'int', which is used for the cmd paremeter in
the ioctls in scsi_host_template. My research into how GCC and Clang are
handling this at a low level didn't prove fruitful. However, looking at
the rest of the kernel tree, all ioctls use an 'unsigned int' for the
cmd parameter, which will fit all of the _IOC values in the scsi/ata
subsystems.

Make that change because none of the ioctls expect to take a negative
value, it brings the ioctls inline with the reset of the kernel, and it
removes ambiguity, which is never good when dealing with compilers.

Link: ClangBuiltLinux#85
Link: ClangBuiltLinux#154
Link: ClangBuiltLinux#157
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
@nickdesaulniers nickdesaulniers changed the title -Wswitch in drivers/scsi/cxlflash/superpipe.c -Wswitch in drivers/scsi/* Jan 14, 2019
@nickdesaulniers
Copy link
Member

nickdesaulniers commented Jan 26, 2019

Probably time to re-ping the thread (please put the maintainers in the To: field) @nathanchance .

@nathanchance
Copy link
Member

I've sent v4 with Bart's suggested change, his review, and a plea for a response.

https://lore.kernel.org/lkml/20190126075256.29608-1-natechancellor@gmail.com/

If you could review/test it yourself to try and get the ball rolling, I'd appreciate it :)

@nickdesaulniers
Copy link
Member

nickdesaulniers commented Jan 26, 2019

@nathanchance it's ok to be frustrated, but remember that maintaining FLOSS is a thankless job and the maintainers are only human. When I get frustrated, I find it helpful to type out a big email spelling out what I really think, then go for a walk, get a cup of coffee, then delete the email. I've done that for responses to Linus a few times now.

I will do my best to add a tested by tag sometime this week.

@nathanchance
Copy link
Member

Indeed. I tried my best to tone it down but it probably could have been even better (or just not there at all).

Thank you, hopefully we'll see some progress soon!

@nickdesaulniers
Copy link
Member

No worries; one thing to note as well is that FLOSS software development involves dealing with people and their politics; sometimes even more so than with code. This effort in particular strives to strike a balance and find compromise between two different, large, and import FLOSS code bases with different goals, interpretations, and developers.

While LKML is famous for being a shit show, do strive to hold yourself to a higher standard and try to do the right thing. Both require constant and active pursuit. And remind me when I need to remember this point; because it's easy to get frustrated.

@nickdesaulniers
Copy link
Member

nickdesaulniers commented Feb 8, 2019

https://lore.kernel.org/r/yq1pns1x64t.fsf@oracle.com/
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=5.1/scsi-queue&id=6f4e626fb0cc93d50b49b79c2ee33bd769ee57f0

@nickdesaulniers nickdesaulniers added [PATCH] Accepted A submitted patch has been accepted upstream and removed [PATCH] Submitted A patch has been submitted for review labels Feb 8, 2019
fengguang pushed a commit to 0day-ci/linux that referenced this issue Feb 10, 2019
…plate

Clang warns several times in the scsi subsystem (trimmed for brevity):

drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to
switch condition type (2147762695 to 18446744071562347015) [-Wswitch]
        case CCISS_GETBUSTYPES:
             ^
drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to
switch condition type (2147762694 to 18446744071562347014) [-Wswitch]
        case CCISS_GETHEARTBEAT:
             ^

The root cause is that the _IOC macro can generate really large numbers,
which don't fit into type 'int', which is used for the cmd parameter in
the ioctls in scsi_host_template. My research into how GCC and Clang are
handling this at a low level didn't prove fruitful. However, looking at
the rest of the kernel tree, all ioctls use an 'unsigned int' for the
cmd parameter, which will fit all of the _IOC values in the scsi/ata
subsystems.

Make that change because none of the ioctls expect a negative value for
any command, it brings the ioctls inline with the reset of the kernel,
and it removes ambiguity, which is never good when dealing with compilers.

Link: ClangBuiltLinux#85
Link: ClangBuiltLinux#154
Link: ClangBuiltLinux#157
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Bradley Grove <bgrove@attotech.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
@nathanchance
Copy link
Member

@nathanchance nathanchance added [FIXED][LINUX] 5.1 This bug was fixed in Linux 5.1 and removed [PATCH] Accepted A submitted patch has been accepted upstream low priority This bug is not critical and not a priority labels Mar 10, 2019
nathanchance pushed a commit that referenced this issue Sep 2, 2019
This patch fixes an issue that the following error is
possible to happen when ohci hardware causes an interruption
and the system is shutting down at the same time.

[   34.851754] usb 2-1: USB disconnect, device number 2
[   35.166658] irq 156: nobody cared (try booting with the "irqpoll" option)
[   35.173445] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.3.0-rc5 #85
[   35.179964] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[   35.187886] Workqueue: usb_hub_wq hub_event
[   35.192063] Call trace:
[   35.194509]  dump_backtrace+0x0/0x150
[   35.198165]  show_stack+0x14/0x20
[   35.201475]  dump_stack+0xa0/0xc4
[   35.204785]  __report_bad_irq+0x34/0xe8
[   35.208614]  note_interrupt+0x2cc/0x318
[   35.212446]  handle_irq_event_percpu+0x5c/0x88
[   35.216883]  handle_irq_event+0x48/0x78
[   35.220712]  handle_fasteoi_irq+0xb4/0x188
[   35.224802]  generic_handle_irq+0x24/0x38
[   35.228804]  __handle_domain_irq+0x5c/0xb0
[   35.232893]  gic_handle_irq+0x58/0xa8
[   35.236548]  el1_irq+0xb8/0x180
[   35.239681]  __do_softirq+0x94/0x23c
[   35.243253]  irq_exit+0xd0/0xd8
[   35.246387]  __handle_domain_irq+0x60/0xb0
[   35.250475]  gic_handle_irq+0x58/0xa8
[   35.254130]  el1_irq+0xb8/0x180
[   35.257268]  kernfs_find_ns+0x5c/0x120
[   35.261010]  kernfs_find_and_get_ns+0x3c/0x60
[   35.265361]  sysfs_unmerge_group+0x20/0x68
[   35.269454]  dpm_sysfs_remove+0x2c/0x68
[   35.273284]  device_del+0x80/0x370
[   35.276683]  hid_destroy_device+0x28/0x60
[   35.280686]  usbhid_disconnect+0x4c/0x80
[   35.284602]  usb_unbind_interface+0x6c/0x268
[   35.288867]  device_release_driver_internal+0xe4/0x1b0
[   35.293998]  device_release_driver+0x14/0x20
[   35.298261]  bus_remove_device+0x110/0x128
[   35.302350]  device_del+0x148/0x370
[   35.305832]  usb_disable_device+0x8c/0x1d0
[   35.309921]  usb_disconnect+0xc8/0x2d0
[   35.313663]  hub_event+0x6e0/0x1128
[   35.317146]  process_one_work+0x1e0/0x320
[   35.321148]  worker_thread+0x40/0x450
[   35.324805]  kthread+0x124/0x128
[   35.328027]  ret_from_fork+0x10/0x18
[   35.331594] handlers:
[   35.333862] [<0000000079300c1d>] usb_hcd_irq
[   35.338126] [<0000000079300c1d>] usb_hcd_irq
[   35.342389] Disabling IRQ #156

ohci_shutdown() disables all the interrupt and rh_state is set to
OHCI_RH_HALTED. In other hand, ohci_irq() is possible to enable
OHCI_INTR_SF and OHCI_INTR_MIE on ohci_irq(). Note that OHCI_INTR_SF
is possible to be set by start_ed_unlink() which is called:
 ohci_irq()
  -> process_done_list()
   -> takeback_td()
    -> start_ed_unlink()

So, ohci_irq() has the following condition, the issue happens by
&ohci->regs->intrenable = OHCI_INTR_MIE | OHCI_INTR_SF and
ohci->rh_state = OHCI_RH_HALTED:

	/* interrupt for some other device? */
	if (ints == 0 || unlikely(ohci->rh_state == OHCI_RH_HALTED))
		return IRQ_NOTMINE;

To fix the issue, ohci_shutdown() holds the spin lock while disabling
the interruption and changing the rh_state flag to prevent reenable
the OHCI_INTR_MIE unexpectedly. Note that io_watchdog_func() also
calls the ohci_shutdown() and it already held the spin lock, so that
the patch makes a new function as _ohci_shutdown().

This patch is inspired by a Renesas R-Car Gen3 BSP patch
from Tho Vu.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/1566877910-6020-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
nathanchance pushed a commit that referenced this issue Feb 21, 2020
The commit 54e53b2
  ("tty: serial: 8250: pass IRQ shared flag to UART ports")
nicely explained the problem:

---8<---8<---

On some systems IRQ lines between multiple UARTs might be shared. If so, the
irqflags have to be configured accordingly. The reason is: The 8250 port startup
code performs IRQ tests *before* the IRQ handler for that particular port is
registered. This is performed in serial8250_do_startup(). This function checks
whether IRQF_SHARED is configured and only then disables the IRQ line while
testing.

This test is performed upon each open() of the UART device. Imagine two UARTs
share the same IRQ line: On is already opened and the IRQ is active. When the
second UART is opened, the IRQ line has to be disabled while performing IRQ
tests. Otherwise an IRQ might handler might be invoked, but the IRQ itself
cannot be handled, because the corresponding handler isn't registered,
yet. That's because the 8250 code uses a chain-handler and invokes the
corresponding port's IRQ handling routines himself.

Unfortunately this IRQF_SHARED flag isn't configured for UARTs probed via device
tree even if the IRQs are shared. This way, the actual and shared IRQ line isn't
disabled while performing tests and the kernel correctly detects a spurious
IRQ. So, adding this flag to the DT probe solves the issue.

Note: The UPF_SHARE_IRQ flag is configured unconditionally. Therefore, the
IRQF_SHARED flag can be set unconditionally as well.

Example stack trace by performing `echo 1 > /dev/ttyS2` on a non-patched system:

|irq 85: nobody cared (try booting with the "irqpoll" option)
| [...]
|handlers:
|[<ffff0000080fc628>] irq_default_primary_handler threaded [<ffff00000855fbb8>] serial8250_interrupt
|Disabling IRQ #85

---8<---8<---

But unfortunately didn't fix the root cause. Let's try again here by moving
IRQ flag assignment from serial_link_irq_chain() to serial8250_do_startup().

This should fix the similar issue reported for 8250_pnp case.

Since this change we don't need to have custom solutions in 8250_aspeed_vuart
and 8250_of drivers, thus, drop them.

Fixes: 1c2f049 ("serial: 8250: add IRQ trigger support")
Reported-by: Li RongQing <lirongqing@baidu.com>
Cc: Kurt Kanzenbach <kurt@linutronix.de>
Cc: Vikram Pandita <vikram.pandita@ti.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20200211135559.85960-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-Wswitch [BUG] linux A bug that should be fixed in the mainline kernel. [FIXED][LINUX] 5.1 This bug was fixed in Linux 5.1
Projects
None yet
Development

No branches or pull requests

4 participants