Mike-Christie/…

Commits on Apr 16, 2021

  1. scsi: qedi: always wake up if cmd_cleanup_req is set

    If we got a response then we should always wake up the conn. For both
    the cmd_cleanup_req == 0 or cmd_cleanup_req > 0, we shouldn't dig into
    iscsi_itt_to_task because we don't know what the upper layers are doing.
    
    We can also remove the qedi_clear_task_idx call here because once we
    signal success libiscsi will loop over the affected commands and end
    up calling the cleanup_task callout which will release it.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  2. scsi: qedi: complete TMF works before disconnect

    We need to make sure that abort and reset completion work has completed
    before ep_disconnect returns. After ep_disconnect we can't manipulate
    cmds because libiscsi will call conn_stop and take ownership.
    
    We are trying to make sure abort work and reset completion work has
    completed before we do the cmd clean up in ep_disconnect. The problem is
    that:
    
    1. the work function sets the QEDI_CONN_FW_CLEANUP bit, so if the work was
    still pending we would not see the bit set. We need to do this before the
    work is queued.
    
    2. If we had multiple works queued then we could break from the loop in
    qedi_ep_disconnect early because when abort work 1 completes it could
    clear QEDI_CONN_FW_CLEANUP. qedi_ep_disconnect could then see that before
    work 2 has run.
    
    3. A TMF reset completion work could run after ep_disconnect starts
    cleaning up cmds via qedi_clearsq. ep_disconnect's call to qedi_clearsq
    -> qedi_cleanup_all_io might think it's done cleaning up cmds, but
    the reset completion work could still be running. We then return from
    ep_disconnect while still doing cleanup.
    
    This replaces the bit with a counter to track the number of queued TMF
    works, and adds a bool to prevent new works from starting from the
    completion path once an ep_disconnect starts.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
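
The counter-plus-flag scheme described in the commit above can be sketched in userspace C. This is a minimal, single-threaded illustration, not the driver's code: `conn_sketch`, its field names, and the three helpers are hypothetical, and a real driver would need a lock or atomics around every access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-connection state. */
struct conn_sketch {
	int  queued_tmf_works;  /* TMF works queued or still running */
	bool disconnecting;     /* set once ep_disconnect has started */
};

/* Completion path: count the work before queueing it, or refuse it. */
static bool tmf_work_try_queue(struct conn_sketch *c)
{
	if (c->disconnecting)
		return false;
	c->queued_tmf_works++;
	return true;
}

/* Called as the last step of each work function. */
static void tmf_work_done(struct conn_sketch *c)
{
	assert(c->queued_tmf_works > 0);
	c->queued_tmf_works--;
}

/* Disconnect path: stop new works, then report whether any are left. */
static bool disconnect_must_wait(struct conn_sketch *c)
{
	c->disconnecting = true;
	return c->queued_tmf_works > 0;
}
```

Because the count is raised before the work is queued, a pending work is visible to the disconnect path, and one work finishing cannot hide another that has not run yet.
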
  3. scsi: qedi: pass send_iscsi_tmf task to abort

    qedi_abort_work knows what task to abort so just pass it to send_iscsi_tmf.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  4. scsi: qedi: fix cleanup session block/unblock use

    Drivers shouldn't be calling block/unblock session for cmd cleanup
    because the functions can change the session state from under libiscsi.
    This adds a new driver level bit so it can block all IO to the host while
    it drains the card.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  5. scsi: qedi: fix TMF session block/unblock use

    Drivers shouldn't be calling block/unblock session for tmf handling
    because the functions can change the session state from under libiscsi. We
    now block the session for the drivers during tmf processing, so we can
    remove these calls.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  6. scsi: qedi: use GFP_NOIO for tmf allocation

    We run from a workqueue with no locks held so use GFP_NOIO.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  7. scsi: qedi: fix TMF tid allocation

    qedi_iscsi_abort_work and qedi_tmf_work allocate a tid then call
    qedi_send_iscsi_tmf which also allocates a tid. This removes the tid
    allocation from the callers.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  8. scsi: qedi: fix use after free during abort cleanup

    This fixes two bugs:
    
    1. The scsi cmd task could be completed and the abort could timeout while
    we are running qedi_tmf_work so we need to get a ref to the task.
    
    2. If qedi_tmf_work's qedi_wait_for_cleanup_request call times out we will
    also force the clean up of the qedi_work_map but
    qedi_process_cmd_cleanup_resp could still be accessing the qedi_cmd for the
    abort TMF. We can then race where qedi_process_cmd_cleanup_resp is still
    accessing the mtask's qedi_cmd but libiscsi has escalated to session level
    cleanup and is cleaning up the mtask while we are still accessing it.
    
    To fix this issue we extend where we hold the tmf_work_lock and back_lock
    so the qedi_process_cmd_cleanup_resp access is serialized with the cleanup
    done in qedi_tmf_work and any completion handling for the iscsi_task.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  9. scsi: qedi: fix race during abort timeouts

    If the SCSI cmd completes after qedi_tmf_work calls iscsi_itt_to_task then
    the qedi_cmd->task_id could be freed and used for another cmd. If we
    then call qedi_iscsi_cleanup_task with that task_id we will be cleaning up
    the wrong cmd.
    
    This patch has us wait to release the task_id until the last put has been
    done on the iscsi_task. Because libiscsi grabs a ref to the task when
    sending the abort, we know that for the non abort timeout case that the
    task_id we are referencing is for the cmd that was supposed to be aborted.
    
    The next patch will fix the case where the abort times out while we are
    running qedi_tmf_work.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
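
The idea of holding on to the task_id until the last put can be sketched with a plain reference count. This is a userspace illustration under stated assumptions: `task_sketch`, the tiny id pool, and all helper names are hypothetical and do not mirror the libiscsi or qedi structures.

```c
#include <assert.h>
#include <stdbool.h>

#define ID_POOL_SIZE 4
#define NO_TASK_ID   (-1)

static bool id_in_use[ID_POOL_SIZE];

struct task_sketch {
	int refcount;
	int task_id;
};

/* Return the lowest free id, or NO_TASK_ID if the pool is exhausted. */
static int task_id_alloc(void)
{
	for (int i = 0; i < ID_POOL_SIZE; i++) {
		if (!id_in_use[i]) {
			id_in_use[i] = true;
			return i;
		}
	}
	return NO_TASK_ID;
}

/* Start a task holding one reference; false if no id is available. */
static bool task_start(struct task_sketch *t)
{
	t->task_id = task_id_alloc();
	t->refcount = (t->task_id != NO_TASK_ID) ? 1 : 0;
	return t->task_id != NO_TASK_ID;
}

static void task_get(struct task_sketch *t)
{
	t->refcount++;
}

/* The id goes back to the pool only when the last reference is dropped. */
static void task_put(struct task_sketch *t)
{
	assert(t->refcount > 0);
	if (--t->refcount == 0) {
		id_in_use[t->task_id] = false;
		t->task_id = NO_TASK_ID;
	}
}
```

While the abort path still holds its reference, a completing command cannot hand its id to a new command, so cleanup by id targets the command that was meant to be aborted.
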
  10. scsi: qedi: fix null ref during abort handling

    If qedi_process_cmd_cleanup_resp finds the cmd it frees the work and sets
    list_tmf_work to NULL, so qedi_tmf_work should check if list_tmf_work is
    non-NULL when it wants to force cleanup.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
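
The check described in the commit above amounts to testing the pointer before the forced cleanup touches it. A minimal sketch, assuming hypothetical types `tmf_work_sketch` and `cmd_sketch` and ignoring the locking the real driver needs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct tmf_work_sketch {
	int tid;
};

struct cmd_sketch {
	struct tmf_work_sketch *list_tmf_work;  /* NULL once released */
};

/* Response path: the cmd was found, so detach the work. */
static void cleanup_resp(struct cmd_sketch *cmd)
{
	cmd->list_tmf_work = NULL;
}

/* Timeout path: force cleanup only if the work is still attached. */
static bool force_cleanup(struct cmd_sketch *cmd)
{
	assert(cmd != NULL);
	if (cmd->list_tmf_work == NULL)
		return false;  /* the response path already handled it */
	cmd->list_tmf_work = NULL;
	return true;
}
```
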
  11. scsi: iscsi: move pool freeing

    This doesn't fix any bugs, but it makes more sense to free the pool after
    we have removed the session. At that time we know nothing is touching any
    of the session fields, because all devices have been removed and scans are
    stopped.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  12. scsi: iscsi: fix use conn use after free

    If we haven't done an unbind target call we can race where
    iscsi_conn_teardown wakes up the eh/abort thread and then frees the conn
    while those threads are still accessing the conn ehwait.
    
    We can only do one TMF per session so this just moves the TMF fields from
    the conn to the session. We can then rely on the
    iscsi_session_teardown->iscsi_remove_session->__iscsi_unbind_session call
    to remove the target and its devices, and know after that point there is
    no device or scsi-ml callout trying to access the session.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  13. scsi: iscsi: wait on cmds before freeing conn

    If we haven't done an unbind target call, we can race during conn
    destruction where iscsi_conn_teardown wakes up the eh/abort thread and it's
    still accessing a task while iscsi_conn_teardown is freeing the conn. This
    patch has us wait for all threads to drop their refs to outstanding tasks
    during conn destruction.
    
    There is also an issue where we could be accessing the conn directly via
    fields like conn->ehwait in the eh callbacks. The next patch will fix
    those.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  14. scsi: iscsi: drop suspend calls from ep_disconnect

    libiscsi will now suspend the send/tx queue for the drivers so we can drop
    it from the drivers ep_disconnect.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  15. scsi: iscsi: stop queueing during ep_disconnect

    During ep_disconnect we have been doing iscsi_suspend_tx/queue to block
    new IO but every driver except cxgbi and iscsi_tcp can still get IO from
    __iscsi_conn_send_pdu if we haven't called iscsi_conn_failure before
    ep_disconnect. This could happen if we were terminating the session, and
    the logout timed out before it was even sent to libiscsi.
    
    This patch fixes the issue by adding a helper which reverses the bind_conn
    call that allows new IO to be queued. Drivers implementing ep_disconnect
    can use this to make sure new IO is not queued to them when handling the
    disconnect.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
  16. scsi: iscsi: sync libiscsi and driver reset cleanup

    If we are handling a sg io reset there is a small window where cmds can
    sneak into iscsi_queuecommand even though libiscsi has sent a TMF to the
    driver. This does not seem to be a problem for normal drivers because they
    know exactly what was sent to the FW. For libiscsi this is a problem
    because it knows what it has sent to the driver but not what the driver
    sent to the FW. When we go to clean up the running tasks, libiscsi might
    clean up the sneaky cmd but the driver/FW may not have seen the sneaky cmd
    and it's left running in FW.
    
    This has libiscsi just stop queueing cmds when it sends the TMF down to the
    driver. Both then know they see the same set of cmds.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021
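
The gate described in the commit above can be sketched as a flag that the queueing path checks. This is an illustration only: `session_sketch`, `queue_cmd`, and the result names are hypothetical, and the real code would also need to synchronize the flag with the queueing path.

```c
#include <assert.h>
#include <stdbool.h>

enum queue_result { QUEUED, REQUEUE_LATER };

struct session_sketch {
	bool tmf_in_progress;      /* set when the TMF goes to the driver */
	int  cmds_sent_to_driver;  /* what the upper layer believes is running */
};

static void send_tmf(struct session_sketch *s)
{
	s->tmf_in_progress = true;
}

static void tmf_done(struct session_sketch *s)
{
	assert(s->tmf_in_progress);
	s->tmf_in_progress = false;
}

/* No new cmd reaches the driver while a TMF is outstanding. */
static enum queue_result queue_cmd(struct session_sketch *s)
{
	if (s->tmf_in_progress)
		return REQUEUE_LATER;
	s->cmds_sent_to_driver++;
	return QUEUED;
}
```

With the gate closed from the moment the TMF is sent, the set of commands the upper layer cleans up matches the set the driver has actually seen.
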
  17. scsi: iscsi: add task completion helper

    This adds a helper to detect if a cmd has completed but has not yet been freed.
    
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    mikechristie authored and intel-lab-lkp committed Apr 16, 2021

Commits on Apr 10, 2021

  1. Merge branch 'misc' into for-next

     Conflicts:
    	drivers/scsi/ufs/ufshcd.c
    James Bottomley authored and James Bottomley committed Apr 10, 2021
  2. Merge branch 'fixes' into for-next

    James Bottomley authored and James Bottomley committed Apr 10, 2021

Commits on Apr 8, 2021

  1. scsi: libsas: Reset num_scatter if libata marks qc as NODATA

    When the cache_type for the SCSI device is changed, the SCSI layer issues a
    MODE_SELECT command. The caching mode details are communicated via a
    request buffer associated with the SCSI command with data direction set as
    DMA_TO_DEVICE (scsi_mode_select()). When this command reaches the libata
    layer, as a part of generic initial setup, libata layer sets up the
    scatterlist for the command using the SCSI command (ata_scsi_qc_new()).
    This command is then translated by the libata layer into
    ATA_CMD_SET_FEATURES (ata_scsi_mode_select_xlat()). The libata layer treats
    this as a non-data command (ata_mselect_caching()), since it only needs an
    ATA taskfile to pass the caching on/off information to the device. It does
    not need the scatterlist that has been setup, so it does not perform
    dma_map_sg() on the scatterlist (ata_qc_issue()). Unfortunately, when this
    command reaches the libsas layer (sas_ata_qc_issue()), libsas layer sees it
    as a non-data command with a scatterlist. It cannot extract the correct DMA
    length since the scatterlist has not been mapped with dma_map_sg() for a
    DMA operation. When this partially constructed SAS task reaches pm80xx
    LLDD, it results in the following warning:
    
    "pm80xx_chip_sata_req 6058: The sg list address
    start_addr=0x0000000000000000 data_len=0x0end_addr_high=0xffffffff
    end_addr_low=0xffffffff has crossed 4G boundary"
    
    Update libsas to handle ATA non-data commands separately so num_scatter and
    total_xfer_len remain 0.
    
    Link: https://lore.kernel.org/r/20210318225632.2481291-1-jollys@google.com
    Fixes: 53de092 ("scsi: libsas: Set data_dir as DMA_NONE if libata marks qc as NODATA")
    Tested-by: Luo Jiaxing <luojiaxing@huawei.com>
    Reviewed-by: John Garry <john.garry@huawei.com>
    Signed-off-by: Jolly Shah <jollys@google.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Jolly Shah authored and martinkpetersen committed Apr 8, 2021
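
The fix described above, treating non-data commands separately so the scatterlist is never counted, can be sketched as follows. The types and names here (`proto_sketch`, `sg_sketch`, `sas_task_sketch`, `task_set_xfer`) are hypothetical simplifications, not the libsas or libata definitions.

```c
#include <assert.h>
#include <stddef.h>

enum proto_sketch { PROTO_NODATA, PROTO_DMA };

struct sg_sketch {
	unsigned int length;
};

struct sas_task_sketch {
	unsigned int num_scatter;
	unsigned int total_xfer_len;
};

/* Fill in the transfer fields; a non-data command ignores any leftover
 * scatterlist so both fields stay 0. */
static void task_set_xfer(struct sas_task_sketch *t, enum proto_sketch proto,
			  const struct sg_sketch *sg, unsigned int n_elem)
{
	t->num_scatter = 0;
	t->total_xfer_len = 0;

	if (proto == PROTO_NODATA)
		return;

	assert(sg != NULL || n_elem == 0);
	for (unsigned int i = 0; i < n_elem; i++)
		t->total_xfer_len += sg[i].length;
	t->num_scatter = n_elem;
}
```
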
  2. scsi: iscsi: Fix iSCSI cls conn state

    In commit 9e67600 ("scsi: iscsi: Fix race condition between login and
    sync thread") I missed that libiscsi was now setting the iSCSI class state,
    and that patch ended up resetting the state during conn stoppage and using
    the wrong state value during ep_disconnect. This patch moves the setting of
    the class state to the class module and then fixes the two issues above.
    
    Link: https://lore.kernel.org/r/20210406171746.5016-1-michael.christie@oracle.com
    Fixes: 9e67600 ("scsi: iscsi: Fix race condition between login and sync thread")
    Cc: Gulam Mohamed <gulam.mohamed@oracle.com>
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    mikechristie authored and martinkpetersen committed Apr 8, 2021

Commits on Apr 6, 2021

  1. Merge branch 'fixes' into for-next

     Conflicts:
    	drivers/target/iscsi/iscsi_target.c
    James Bottomley authored and James Bottomley committed Apr 6, 2021
  2. scsi: libsas: Clean up whitespace

    checkpatch reported several whitespace errors. Fix them all.
    
    Link: https://lore.kernel.org/r/1616675396-6108-3-git-send-email-luojiaxing@huawei.com
    Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Luojiaxing1991 authored and martinkpetersen committed Apr 6, 2021
  3. scsi: libsas: Correctly indent statements in sas_to_ata_err()

    checkpatch reported an error in sas_to_ata_err(). switch and case
    statements are incorrectly indented.
    
    Link: https://lore.kernel.org/r/1616675396-6108-2-git-send-email-luojiaxing@huawei.com
    Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Luojiaxing1991 authored and martinkpetersen committed Apr 6, 2021
  4. scsi: qedf: Remove unused include of linux/version.h

    Remove unnecessary include of linux/version.h.
    
    Link: https://lore.kernel.org/r/1617283618-19346-1-git-send-email-tiantao6@hisilicon.com
    Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
    Signed-off-by: Zhiqi Song <songzhiqi1@huawei.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Tian Tao authored and martinkpetersen committed Apr 6, 2021
  5. scsi: bfa: Remove unnecessary struct declarations

    struct bfa_fcs_s is declared twice. The first declaration is at line 50;
    remove the duplicate.
    
    struct bfa_fcs_fabric_s is defined at line 175, remove unnecessary
    declaration.
    
    Link: https://lore.kernel.org/r/20210401063535.992487-1-wanjiabing@vivo.com
    Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Wan Jiabing authored and martinkpetersen committed Apr 6, 2021
  6. scsi: ufs: core: Fix out-of-bounds warnings in ufshcd_exec_raw_upiu_cmd()

    Fix the following out-of-bounds warnings by enclosing some structure
    members into new structure objects upiu_req and upiu_rsp:
    
    include/linux/fortify-string.h:20:29: warning: '__builtin_memcpy' offset [29, 48] from the object at 'treq' is out of the bounds of referenced subobject 'req_header' with type 'struct utp_upiu_header' at offset 16 [-Warray-bounds]
    include/linux/fortify-string.h:20:29: warning: '__builtin_memcpy' offset [61, 80] from the object at 'treq' is out of the bounds of referenced subobject 'rsp_header' with type 'struct utp_upiu_header' at offset 48 [-Warray-bounds]
    arch/m68k/include/asm/string.h:72:25: warning: '__builtin_memcpy' offset [29, 48] from the object at 'treq' is out of the bounds of referenced subobject 'req_header' with type 'struct utp_upiu_header' at offset 16 [-Warray-bounds]
    arch/m68k/include/asm/string.h:72:25: warning: '__builtin_memcpy' offset [61, 80] from the object at 'treq' is out of the bounds of referenced subobject 'rsp_header' with type 'struct utp_upiu_header' at offset 48 [-Warray-bounds]
    
    Refactor the code by making it more structured.
    
    The problem is that the original code is trying to copy data into a bunch
    of struct members adjacent to each other in a single call to memcpy(). Now
    that a new struct _upiu_req_ enclosing all those adjacent members is
    introduced, memcpy() doesn't overrun the length of &treq.req_header,
    because the address of the new struct object _upiu_req_ is used as the
    destination, instead. The same problem is present when memcpy() overruns
    the length of the source &treq.rsp_header; in this case the address of the
    new struct object _upiu_rsp_ is used, instead.
    
    Also, this helps with the ongoing efforts to enable -Warray-bounds and
    avoid confusing the compiler.
    
    Link: KSPP#109
    Link: https://lore.kernel.org/lkml/60640558.lsAxiK6otPwTo9rv%25lkp@intel.com/
    Link: https://lore.kernel.org/r/20210331224338.GA347171@embeddedor
    Reported-by: kernel test robot <lkp@intel.com>
    Reviewed-by: Avri Altman <avri.altman@wdc.com>
    Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Build-tested-by: kernel test robot <lkp@intel.com>
    GustavoARSilva authored and martinkpetersen committed Apr 6, 2021
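
The refactoring described above, wrapping adjacent members in one enclosing struct so a single memcpy() has a destination of the right size, can be illustrated with a simplified layout. `hdr_sketch`, `treq_sketch`, the member names, and `fill_request` are hypothetical; only the shape (a 12-byte header followed by further 32-bit fields) is meant to echo the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct hdr_sketch {
	uint32_t dword[3];  /* 12-byte header */
};

struct treq_sketch {
	uint32_t desc[4];   /* 16 bytes ahead of the request */
	struct {
		struct hdr_sketch req_header;
		uint32_t in1, in2, in3;
		uint32_t reserved[2];
	} upiu_req;         /* enclosing object: 32 bytes in total */
};

/* Copy into the enclosing struct, not into its first member, so the
 * destination object really is as large as the copy. */
static void fill_request(struct treq_sketch *treq, const void *src, size_t len)
{
	assert(len <= sizeof(treq->upiu_req));
	memcpy(&treq->upiu_req, src, len);
}
```

Passing `&treq->upiu_req.req_header` with the same length would write the same bytes, but the compiler would see a 12-byte destination and could warn under -Warray-bounds.
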
  7. scsi: fnic: Remove unnecessary spin_lock_init() and INIT_LIST_HEAD()

    The spinlock and list head of fnic_list are initialized statically.  It is
    unnecessary to initialize them.
    
    Link: https://lore.kernel.org/r/20210330125911.1050879-1-yangyingliang@huawei.com
    Reported-by: Hulk Robot <hulkci@huawei.com>
    Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Yang Yingliang authored and martinkpetersen committed Apr 6, 2021
  8. scsi: message: fusion: Use BUG_ON instead of if condition followed by BUG

    BUG_ON() uses unlikely in if() which can be optimized at compile time.
    
    Link: https://lore.kernel.org/r/1617108361-6870-1-git-send-email-zhouchuangao@vivo.com
    Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    zhouchuangao authored and martinkpetersen committed Apr 6, 2021
  9. scsi: mpt3sas: Block PCI config access from userspace during reset

    While diag reset is in progress there is short duration where all access to
    controller's PCI config space from the host needs to be blocked. This is
    due to a hardware limitation of the IOC controllers.
    
    Block all access to controller's config space from userland applications by
    calling pci_cfg_access_lock() while diag reset is in progress and unlocking
    it again after the controller comes back to ready state.
    
    Link: https://lore.kernel.org/r/20210330105137.20728-1-sreekanth.reddy@broadcom.com
    Cc: stable@vger.kernel.org #v5.4.108+
    Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    sreekanthbrcm authored and martinkpetersen committed Apr 6, 2021
  10. scsi: mpt3sas: Fix endianness for ActiveCablePowerRequirement

    Convert ActiveCablePowerRequirement's value to target CPU endian before
    displaying it.
    
    Link: https://lore.kernel.org/r/20210330105106.20569-1-sreekanth.reddy@broadcom.com
    Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    sreekanthbrcm authored and martinkpetersen committed Apr 6, 2021
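
In kernel code a conversion like the one above is normally written with helpers such as le32_to_cpu(). The userspace sketch below shows what such a conversion means for a little-endian value; it assumes a 32-bit field, which the commit message does not state, and `le32_decode` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 32-bit value from raw bytes. The result is the
 * same on little-endian and big-endian hosts. */
static uint32_t le32_decode(const uint8_t *bytes)
{
	assert(bytes != NULL);
	return (uint32_t)bytes[0] |
	       ((uint32_t)bytes[1] << 8) |
	       ((uint32_t)bytes[2] << 16) |
	       ((uint32_t)bytes[3] << 24);
}
```

Printing the raw field without conversion would show a byte-swapped number on a big-endian CPU.
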
  11. scsi: mpt3sas: Only one vSES is present even when IOC has multi vSES

    Whenever the driver is adding a vSES to virtual-phys list it is
    reinitializing the list head. Hence those vSES devices which were added
    previously are lost.
    
    Stop reinitializing the list every time a new vSES device is added.
    
    Link: https://lore.kernel.org/r/20210330105004.20413-1-sreekanth.reddy@broadcom.com
    Cc: stable@vger.kernel.org #v5.11.10+
    Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    sreekanthbrcm authored and martinkpetersen committed Apr 6, 2021
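
Why reinitializing a list head on every add loses earlier entries can be shown with a small circular doubly linked list. This is a userspace sketch, not the kernel's list.h: `node`, `list_init`, `list_add_tail_sketch`, and the two `add_vses_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *next, *prev;
};

static void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_sketch(struct node *n, struct node *head)
{
	assert(n != NULL && head != NULL);
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static size_t list_count(const struct node *head)
{
	size_t count = 0;

	for (const struct node *p = head->next; p != head; p = p->next)
		count++;
	return count;
}

/* Buggy pattern: the head is reset on every add, dropping older entries. */
static void add_vses_buggy(struct node *head, struct node *dev)
{
	list_init(head);
	list_add_tail_sketch(dev, head);
}

/* Fixed pattern: the head is initialized once elsewhere; adds only append. */
static void add_vses_fixed(struct node *head, struct node *dev)
{
	list_add_tail_sketch(dev, head);
}
```

With the buggy variant, adding two devices leaves only the second one reachable from the head; with the fixed variant both remain on the list.
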
  12. scsi: target: Make the virtual LUN 0 device

    Create the device for the virtual LUN 0 using the DUMMY flag. This change
    makes it possible to remove some special-casing in the INQUIRY code.
    
    Link: https://lore.kernel.org/r/20210322200938.53300-3-k.shelekhin@yadro.com
    Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
    Reviewed-by: Mike Christie <michael.christie@oracle.com>
    Signed-off-by: Konstantin Shelekhin <k.shelekhin@yadro.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Konstantin Shelekhin authored and martinkpetersen committed Apr 6, 2021
  13. scsi: target: Add the DUMMY flag to rd_mcp

    This commit adds the DUMMY flag to the rd_mcp backend that forces a logical
    unit to report itself as a not-connected device of an unknown type.
    Essentially this allows users to create devices identical to the device for
    the virtual LUN 0, making it possible to explicitly create a LUN 0 device
    and configure its WWNs (e.g. vendor or product name).
    
    Link: https://lore.kernel.org/r/20210322200938.53300-2-k.shelekhin@yadro.com
    Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
    Reviewed-by: Mike Christie <michael.christie@oracle.com>
    Signed-off-by: Konstantin Shelekhin <k.shelekhin@yadro.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Konstantin Shelekhin authored and martinkpetersen committed Apr 6, 2021
  14. scsi: scsi_dh_alua: Remove check for ASC 24h in alua_rtpg()

    Some arrays return ILLEGAL_REQUEST with ASC 00h if they don't support the
    RTPG extended header so remove the check for INVALID FIELD IN CDB.
    
    Link: https://lore.kernel.org/r/20210331201154.20348-1-emilne@redhat.com
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Signed-off-by: Ewan D. Milne <emilne@redhat.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Ewan D. Milne authored and martinkpetersen committed Apr 6, 2021