Mike-Christie/…
Commits on Apr 16, 2021
-
scsi: qedi: always wake up if cmd_cleanup_req is set
If we got a response then we should always wake up the conn. In both the cmd_cleanup_req == 0 and cmd_cleanup_req > 0 cases, we shouldn't dig into iscsi_itt_to_task because we don't know what the upper layers are doing. We can also remove the qedi_clear_task_idx call here because once we signal success libiscsi will loop over the affected commands and end up calling the cleanup_task callout which will release it. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
-
scsi: qedi: complete TMF works before disconnect
We need to make sure that abort and reset completion work has completed before ep_disconnect returns. After ep_disconnect we can't manipulate cmds because libiscsi will call conn_stop and take ownership. We are trying to make sure abort work and reset completion work has completed before we do the cmd clean up in ep_disconnect. The problem is that: 1. the work function sets the QEDI_CONN_FW_CLEANUP bit, so if the work was still pending we would not see the bit set. We need to do this before the work is queued. 2. If we had multiple works queued then we could break from the loop in qedi_ep_disconnect early because when abort work 1 completes it could clear QEDI_CONN_FW_CLEANUP. qedi_ep_disconnect could then see that before work 2 has run. 3. A TMF reset completion work could run after ep_disconnect starts cleaning up cmds via qedi_clearsq. ep_disconnect's call to qedi_clearsq -> qedi_cleanup_all_io might think it's done cleaning up cmds, but the reset completion work could still be running. We then return from ep_disconnect while still doing cleanup. This replaces the bit with a counter to track the number of queued TMF works, and adds a bool to prevent new works from starting from the completion path once an ep_disconnect starts. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
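The counter-plus-flag scheme described in this commit can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not the qedi code: `struct conn_state`, `conn_init()`, `tmf_work_get()`, `tmf_work_put()`, `disconnect_start()` and `tmf_drained()` are hypothetical names, and C11 atomics stand in for whatever locking the driver actually uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-connection state; not the real qedi structures. */
struct conn_state {
	atomic_int tmf_works;       /* TMF works that are queued or running */
	atomic_bool disconnecting;  /* set once the disconnect path has started */
};

static void conn_init(struct conn_state *c)
{
	atomic_init(&c->tmf_works, 0);
	atomic_init(&c->disconnecting, false);
}

/*
 * Take a slot BEFORE queueing the work, so a pending work is already
 * visible to the disconnect path. The count is bumped first and rolled
 * back if a disconnect has started, which closes the window between
 * checking the flag and incrementing the counter.
 */
static bool tmf_work_get(struct conn_state *c)
{
	atomic_fetch_add(&c->tmf_works, 1);
	if (atomic_load(&c->disconnecting)) {
		atomic_fetch_sub(&c->tmf_works, 1);
		return false;
	}
	return true;
}

/* Called by the work function when it has finished. */
static void tmf_work_put(struct conn_state *c)
{
	atomic_fetch_sub(&c->tmf_works, 1);
}

/* Disconnect path: refuse new works from now on. */
static void disconnect_start(struct conn_state *c)
{
	atomic_store(&c->disconnecting, true);
}

/* Disconnect path: true only when every queued work has finished. */
static bool tmf_drained(struct conn_state *c)
{
	return atomic_load(&c->tmf_works) == 0;
}
```

Unlike a single bit, the counter cannot be cleared by the first of several works finishing, so the disconnect path keeps waiting until the last one calls `tmf_work_put()`.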
-
scsi: qedi: pass send_iscsi_tmf task to abort
qedi_abort_work knows what task to abort so just pass it to send_iscsi_tmf. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
-
scsi: qedi: fix cleanup session block/unblock use
Drivers shouldn't be calling block/unblock session for cmd cleanup because the functions can change the session state from under libiscsi. This adds a new driver-level bit so the driver can block all IO to the host while it drains the card. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
-
scsi: qedi: fix TMF session block/unblock use
Drivers shouldn't be calling block/unblock session for tmf handling because the functions can change the session state from under libiscsi. We now block the session for the drivers during tmf processing, so we can remove these calls. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
-
scsi: qedi: use GFP_NOIO for tmf allocation
We run from a workqueue with no locks held so use GFP_NOIO. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
-
scsi: qedi: fix TMF tid allocation
qedi_iscsi_abort_work and qedi_tmf_work allocate a tid and then call qedi_send_iscsi_tmf, which also allocates a tid. This removes the tid allocation from the callers. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
-
scsi: qedi: fix use after free during abort cleanup
This fixes two bugs: 1. The scsi cmd task could be completed and the abort could time out while we are running qedi_tmf_work so we need to get a ref to the task. 2. If qedi_tmf_work's qedi_wait_for_cleanup_request call times out we will also force the clean up of the qedi_work_map but qedi_process_cmd_cleanup_resp could still be accessing the qedi_cmd for the abort TMF. We can then race where qedi_process_cmd_cleanup_resp is still accessing the mtask's qedi_cmd but libiscsi has escalated to session level cleanup and is cleaning up the mtask while we are still accessing it. To fix this issue we extend where we hold the tmf_work_lock and back_lock so the qedi_process_cmd_cleanup_resp access is serialized with the cleanup done in qedi_tmf_work and any completion handling for the iscsi_task. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
-
scsi: qedi: fix race during abort timeouts
If the SCSI cmd completes after qedi_tmf_work calls iscsi_itt_to_task then the qedi_cmd->task_id could be freed and used for another cmd. If we then call qedi_iscsi_cleanup_task with that task_id we will be cleaning up the wrong cmd. This patch has us wait to release the task_id until the last put has been done on the iscsi_task. Because libiscsi grabs a ref to the task when sending the abort, we know that for the non abort timeout case that the task_id we are referencing is for the cmd that was supposed to be aborted. The next patch will fix the case where the abort times out while we are running qedi_tmf_work. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
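The idea of holding on to the task_id until the last reference is dropped can be sketched as follows. This is a simplified userspace model, not the driver code: `struct task`, `id_alloc()`, `task_init()`, `task_get()` and `task_put()` are hypothetical names, and the tiny fixed-size id table is an assumption made purely for illustration.

```c
#include <stdbool.h>

#define MAX_IDS 4

/* Hypothetical id table: true means the id is owned by some task. */
static bool id_in_use[MAX_IDS];

struct task {
	int refcount;
	int task_id;
};

/* Hand out the lowest free id, or -1 if none is left. */
static int id_alloc(void)
{
	for (int i = 0; i < MAX_IDS; i++) {
		if (!id_in_use[i]) {
			id_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* A new task starts with one reference and owns one id. */
static bool task_init(struct task *t)
{
	t->task_id = id_alloc();
	if (t->task_id < 0)
		return false;
	t->refcount = 1;
	return true;
}

static void task_get(struct task *t)
{
	t->refcount++;
}

/* The id goes back to the table only when the LAST reference is dropped. */
static void task_put(struct task *t)
{
	if (--t->refcount == 0) {
		id_in_use[t->task_id] = false;
		t->task_id = -1;
	}
}
```

While the abort path still holds its reference, a command that completes cannot hand its id to a new command, so a later cleanup by id cannot hit the wrong command.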
-
scsi: qedi: fix null ref during abort handling
If qedi_process_cmd_cleanup_resp finds the cmd it frees the work and sets list_tmf_work to NULL, so qedi_tmf_work should check if list_tmf_work is non-NULL when it wants to force cleanup. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
-
scsi: iscsi: move pool freeing
This doesn't fix any bugs, but it makes more sense to free the pool after we have removed the session. At that time we know nothing is touching any of the session fields, because all devices have been removed and scans are stopped. Signed-off-by: Mike Christie <michael.christie@oracle.com>
-
scsi: iscsi: fix conn use after free
If we haven't done an unbind target call we can race where iscsi_conn_teardown wakes up the eh/abort thread and then frees the conn while those threads are still accessing the conn ehwait. We can only do one TMF per session so this just moves the TMF fields from the conn to the session. We can then rely on the iscsi_session_teardown->iscsi_remove_session->__iscsi_unbind_session call to remove the target and its devices, and know after that point there is no device or scsi-ml callout trying to access the session. Signed-off-by: Mike Christie <michael.christie@oracle.com>
-
scsi: iscsi: wait on cmds before freeing conn
If we haven't done an unbind target call, we can race during conn destruction where iscsi_conn_teardown wakes up the eh/abort thread and it's still accessing a task while iscsi_conn_teardown is freeing the conn. This patch has us wait for all threads to drop their refs to outstanding tasks during conn destruction. There is also an issue where we could be accessing the conn directly via fields like conn->ehwait in the eh callbacks. The next patch will fix those. Signed-off-by: Mike Christie <michael.christie@oracle.com>
-
scsi: iscsi: drop suspend calls from ep_disconnect
libiscsi will now suspend the send/tx queue for the drivers so we can drop it from the drivers' ep_disconnect. Signed-off-by: Mike Christie <michael.christie@oracle.com>
-
scsi: iscsi: stop queueing during ep_disconnect
During ep_disconnect we have been doing iscsi_suspend_tx/queue to block new IO but every driver except cxgbi and iscsi_tcp can still get IO from __iscsi_conn_send_pdu if we haven't called iscsi_conn_failure before ep_disconnect. This could happen if we were terminating the session, and the logout timed out before it was even sent to libiscsi. This patch fixes the issue by adding a helper which reverses the bind_conn call that allows new IO to be queued. Drivers implementing ep_disconnect can use this to make sure new IO is not queued to them when handling the disconnect. Signed-off-by: Mike Christie <michael.christie@oracle.com>
-
scsi: iscsi: sync libiscsi and driver reset cleanup
If we are handling a sg io reset there is a small window where cmds can sneak into iscsi_queuecommand even though libiscsi has sent a TMF to the driver. This does not seem to be a problem for normal drivers because they know exactly what was sent to the FW. For libiscsi this is a problem because it knows what it has sent to the driver but not what the driver sent to the FW. When we go to clean up the running tasks, libiscsi might clean up the sneaky cmd but the driver/FW may not have seen the sneaky cmd and it's left running in FW. This has libiscsi just stop queueing cmds when it sends the TMF down to the driver. Both then know they see the same set of cmds. Signed-off-by: Mike Christie <michael.christie@oracle.com>
-
scsi: iscsi: add task completion helper
This adds a helper to detect if a cmd has completed but has not yet been freed. Signed-off-by: Mike Christie <michael.christie@oracle.com>
Commits on Apr 10, 2021
-
Merge branch 'misc' into for-next
Conflicts: drivers/scsi/ufs/ufshcd.c
James Bottomley authored and James Bottomley committed Apr 10, 2021
-
Merge branch 'fixes' into for-next
James Bottomley authored and James Bottomley committed Apr 10, 2021
Commits on Apr 8, 2021
-
scsi: libsas: Reset num_scatter if libata marks qc as NODATA
When the cache_type for the SCSI device is changed, the SCSI layer issues a MODE_SELECT command. The caching mode details are communicated via a request buffer associated with the SCSI command with data direction set as DMA_TO_DEVICE (scsi_mode_select()). When this command reaches the libata layer, as a part of generic initial setup, libata layer sets up the scatterlist for the command using the SCSI command (ata_scsi_qc_new()). This command is then translated by the libata layer into ATA_CMD_SET_FEATURES (ata_scsi_mode_select_xlat()). The libata layer treats this as a non-data command (ata_mselect_caching()), since it only needs an ATA taskfile to pass the caching on/off information to the device. It does not need the scatterlist that has been set up, so it does not perform dma_map_sg() on the scatterlist (ata_qc_issue()). Unfortunately, when this command reaches the libsas layer (sas_ata_qc_issue()), libsas layer sees it as a non-data command with a scatterlist. It cannot extract the correct DMA length since the scatterlist has not been mapped with dma_map_sg() for a DMA operation. When this partially constructed SAS task reaches pm80xx LLDD, it results in the following warning: "pm80xx_chip_sata_req 6058: The sg list address start_addr=0x0000000000000000 data_len=0x0end_addr_high=0xffffffff end_addr_low=0xffffffff has crossed 4G boundary" Update libsas to handle ATA non-data commands separately so num_scatter and total_xfer_len remain 0. Link: https://lore.kernel.org/r/20210318225632.2481291-1-jollys@google.com Fixes: 53de092 ("scsi: libsas: Set data_dir as DMA_NONE if libata marks qc as NODATA") Tested-by: Luo Jiaxing <luojiaxing@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Jolly Shah <jollys@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
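The shape of the fix, treating a non-data command separately so that its leftover scatterlist is never counted, can be sketched as follows. This is a standalone model and not the libsas code: `struct cmd_in`, `struct task_out`, `enum xfer_dir` and `build_task()` are hypothetical stand-ins for the queued command and the SAS task.

```c
#include <stdbool.h>
#include <stddef.h>

enum xfer_dir { DIR_NONE, DIR_TO_DEVICE, DIR_FROM_DEVICE };

/* Hypothetical view of the incoming command. */
struct cmd_in {
	bool nodata;                  /* lower layer marked it as non-data */
	enum xfer_dir dir;            /* direction set when it was created */
	const unsigned int *seg_len;  /* scatterlist segment lengths */
	size_t n_segs;
};

/* Hypothetical view of the task handed to the low-level driver. */
struct task_out {
	enum xfer_dir data_dir;
	unsigned int num_scatter;
	unsigned int total_xfer_len;
};

static void build_task(const struct cmd_in *in, struct task_out *out)
{
	out->data_dir = DIR_NONE;
	out->num_scatter = 0;
	out->total_xfer_len = 0;

	/*
	 * Non-data command: ignore any scatterlist that was set up earlier,
	 * so num_scatter and total_xfer_len stay 0.
	 */
	if (in->nodata)
		return;

	for (size_t i = 0; i < in->n_segs; i++)
		out->total_xfer_len += in->seg_len[i];
	out->num_scatter = (unsigned int)in->n_segs;
	out->data_dir = in->dir;
}
```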
-
scsi: iscsi: Fix iSCSI cls conn state
In commit 9e67600 ("scsi: iscsi: Fix race condition between login and sync thread") I missed that libiscsi was now setting the iSCSI class state, and that patch ended up resetting the state during conn stoppage and using the wrong state value during ep_disconnect. This patch moves the setting of the class state to the class module and then fixes the two issues above. Link: https://lore.kernel.org/r/20210406171746.5016-1-michael.christie@oracle.com Fixes: 9e67600 ("scsi: iscsi: Fix race condition between login and sync thread") Cc: Gulam Mohamed <gulam.mohamed@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commits on Apr 6, 2021
-
Merge branch 'fixes' into for-next
Conflicts: drivers/target/iscsi/iscsi_target.c
James Bottomley authored and James Bottomley committed Apr 6, 2021
-
scsi: libsas: Clean up whitespace
checkpatch reported several whitespace errors. Fix them all. Link: https://lore.kernel.org/r/1616675396-6108-3-git-send-email-luojiaxing@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
scsi: libsas: Correctly indent statements in sas_to_ata_err()
checkpatch reported an error in sas_to_ata_err(). switch and case statements are incorrectly indented. Link: https://lore.kernel.org/r/1616675396-6108-2-git-send-email-luojiaxing@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
scsi: qedf: Remove unused include of linux/version.h
Remove unnecessary include of linux/version.h. Link: https://lore.kernel.org/r/1617283618-19346-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Zhiqi Song <songzhiqi1@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
scsi: bfa: Remove unnecessary struct declarations
struct bfa_fcs_s is declared twice. It is first declared at line 50, so remove the duplicate. struct bfa_fcs_fabric_s is defined at line 175, so remove the unnecessary declaration. Link: https://lore.kernel.org/r/20210401063535.992487-1-wanjiabing@vivo.com Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
scsi: ufs: core: Fix out-of-bounds warnings in ufshcd_exec_raw_upiu_cmd()
Fix the following out-of-bounds warnings by enclosing some structure members into new structure objects upiu_req and upiu_rsp: include/linux/fortify-string.h:20:29: warning: '__builtin_memcpy' offset [29, 48] from the object at 'treq' is out of the bounds of referenced subobject 'req_header' with type 'struct utp_upiu_header' at offset 16 [-Warray-bounds] include/linux/fortify-string.h:20:29: warning: '__builtin_memcpy' offset [61, 80] from the object at 'treq' is out of the bounds of referenced subobject 'rsp_header' with type 'struct utp_upiu_header' at offset 48 [-Warray-bounds] arch/m68k/include/asm/string.h:72:25: warning: '__builtin_memcpy' offset [29, 48] from the object at 'treq' is out of the bounds of referenced subobject 'req_header' with type 'struct utp_upiu_header' at offset 16 [-Warray-bounds] arch/m68k/include/asm/string.h:72:25: warning: '__builtin_memcpy' offset [61, 80] from the object at 'treq' is out of the bounds of referenced subobject 'rsp_header' with type 'struct utp_upiu_header' at offset 48 [-Warray-bounds] Refactor the code by making it more structured. The problem is that the original code is trying to copy data into a bunch of struct members adjacent to each other in a single call to memcpy(). Now that a new struct upiu_req enclosing all those adjacent members is introduced, memcpy() doesn't overrun the length of &treq.req_header, because the address of the new struct object upiu_req is used as the destination, instead. The same problem is present when memcpy() overruns the length of the source &treq.rsp_header; in this case the address of the new struct object upiu_rsp is used, instead. Also, this helps with the ongoing efforts to enable -Warray-bounds and avoid confusing the compiler.
Link: KSPP#109 Link: https://lore.kernel.org/lkml/60640558.lsAxiK6otPwTo9rv%25lkp@intel.com/ Link: https://lore.kernel.org/r/20210331224338.GA347171@embeddedor Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Build-tested-by: kernel test robot <lkp@intel.com>
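The refactoring pattern this commit describes, wrapping adjacent members in an enclosing struct so that a single memcpy() targets an object of the right size, can be sketched as follows. The type and field names here are hypothetical and the layout is only chosen to echo the offsets in the warnings above (a 12-byte header at offset 16, 32 bytes copied); it is not the UFS definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 12-byte header, standing in for the real UPIU header. */
struct upiu_header {
	uint32_t dword[3];
};

/*
 * Enclosing struct: the header plus the members that follow it. Copying
 * into or out of this object is in bounds, whereas copying the same 32
 * bytes through the address of 'header' alone overruns that subobject.
 */
struct upiu_block {
	struct upiu_header header;
	uint32_t params[5];
};

struct task_req {
	uint32_t desc_header[4];     /* 16 bytes before the request block */
	struct upiu_block upiu_req;  /* offset 16 */
	struct upiu_block upiu_rsp;  /* offset 48 */
};

/* Destination is the enclosing struct, so the copy size matches it. */
static void set_request(struct task_req *treq, const struct upiu_block *src)
{
	memcpy(&treq->upiu_req, src, sizeof(treq->upiu_req));
}

/* Same idea on the read side: the source is the enclosing struct. */
static void get_response(struct upiu_block *dst, const struct task_req *treq)
{
	memcpy(dst, &treq->upiu_rsp, sizeof(treq->upiu_rsp));
}
```

The bytes copied are the same as before the refactoring; what changes is that the pointer passed to memcpy() now refers to an object large enough to hold them, which is what -Warray-bounds checks.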
-
scsi: fnic: Remove unnecessary spin_lock_init() and INIT_LIST_HEAD()
The spinlock and list head of fnic_list are initialized statically. It is unnecessary to initialize them. Link: https://lore.kernel.org/r/20210330125911.1050879-1-yangyingliang@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
scsi: message: fusion: Use BUG_ON instead of if condition followed by BUG
BUG_ON() wraps its condition in unlikely(), which lets the compiler optimize for the path where the condition is false. Link: https://lore.kernel.org/r/1617108361-6870-1-git-send-email-zhouchuangao@vivo.com Signed-off-by: zhouchuangao <zhouchuangao@vivo.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
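A userspace approximation of the pattern, for readers unfamiliar with it. The macros below are simplified stand-ins written for this sketch (the kernel's own definitions differ in detail), `__builtin_expect` is a GCC/Clang builtin, and `checked_index()` is a hypothetical helper.

```c
#include <stdlib.h>

/* Simplified stand-ins for the kernel macros; assumptions for this sketch. */
#define unlikely(x)   __builtin_expect(!!(x), 0)
#define BUG()         abort()
#define BUG_ON(cond)  do { if (unlikely(cond)) BUG(); } while (0)

/*
 * Before: if (idx < 0 || idx >= limit) BUG();
 * After:  one BUG_ON() carrying the branch hint.
 */
static int checked_index(int idx, int limit)
{
	BUG_ON(idx < 0 || idx >= limit);
	return idx;
}
```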
-
scsi: mpt3sas: Block PCI config access from userspace during reset
While diag reset is in progress there is a short duration where all access to the controller's PCI config space from the host needs to be blocked. This is due to a hardware limitation of the IOC controllers. Block all access to the controller's config space from userland applications by calling pci_cfg_access_lock() while diag reset is in progress and unlocking it again after the controller comes back to ready state. Link: https://lore.kernel.org/r/20210330105137.20728-1-sreekanth.reddy@broadcom.com Cc: stable@vger.kernel.org #v5.4.108+ Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
scsi: mpt3sas: Fix endianness for ActiveCablePowerRequirement
Convert ActiveCablePowerRequirement's value to the target CPU's endianness before displaying it. Link: https://lore.kernel.org/r/20210330105106.20569-1-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
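As a rough illustration of why the conversion matters: a little-endian value supplied by the controller has to be assembled byte by byte (or passed through a helper such as the kernel's le32_to_cpu()) before it is printed, otherwise a big-endian host shows a byte-swapped number. The sketch below is a portable userspace equivalent; `le32_to_host()` is a hypothetical name and the 32-bit width is an assumption, not taken from the commit.

```c
#include <stdint.h>

/*
 * Build a host-order value from four little-endian bytes. Works the same
 * on little- and big-endian hosts because it never reinterprets memory.
 */
static uint32_t le32_to_host(const uint8_t b[4])
{
	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}
```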
-
scsi: mpt3sas: Only one vSES is present even when IOC has multi vSES
Whenever the driver adds a vSES to the virtual-phys list it reinitializes the list head. Hence those vSES devices which were added previously are lost. Stop reinitializing the list every time a new vSES device is added. Link: https://lore.kernel.org/r/20210330105004.20413-1-sreekanth.reddy@broadcom.com Cc: stable@vger.kernel.org #v5.11.10+ Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
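The failure mode is easy to reproduce with a minimal circular doubly linked list modelled on the kernel's list_head. The code below is a userspace re-implementation for illustration only; `buggy_add()` and `fixed_add()` are hypothetical names and not the mpt3sas functions.

```c
#include <stddef.h>

/* Minimal userspace model of a kernel-style circular list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static int list_count(const struct list_head *h)
{
	int n = 0;

	for (const struct list_head *p = h->next; p != h; p = p->next)
		n++;
	return n;
}

/* Bug pattern: re-initializing the head on every add drops earlier entries. */
static void buggy_add(struct list_head *h, struct list_head *n)
{
	init_list_head(h);
	list_add_tail(n, h);
}

/* Fix pattern: the head is initialized once, elsewhere; adds only link. */
static void fixed_add(struct list_head *h, struct list_head *n)
{
	list_add_tail(n, h);
}
```

With the buggy pattern the list never holds more than the most recently added entry, which matches the symptom in the commit title.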
-
scsi: target: Make the virtual LUN 0 device
Create the device for the virtual LUN 0 using the DUMMY flag. This change makes it possible to remove some special-casing in the INQUIRY code. Link: https://lore.kernel.org/r/20210322200938.53300-3-k.shelekhin@yadro.com Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Konstantin Shelekhin <k.shelekhin@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
scsi: target: Add the DUMMY flag to rd_mcp
This commit adds the DUMMY flag to the rd_mcp backend that forces a logical unit to report itself as a not-connected device of an unknown type. Essentially this allows users to create devices identical to the device for the virtual LUN 0, making it possible to explicitly create a LUN 0 device and configure its WWNs (e.g. vendor or product name). Link: https://lore.kernel.org/r/20210322200938.53300-2-k.shelekhin@yadro.com Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Konstantin Shelekhin <k.shelekhin@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
scsi: scsi_dh_alua: Remove check for ASC 24h in alua_rtpg()
Some arrays return ILLEGAL_REQUEST with ASC 00h if they don't support the RTPG extended header so remove the check for INVALID FIELD IN CDB. Link: https://lore.kernel.org/r/20210331201154.20348-1-emilne@redhat.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>