Skip to content

Commit

Permalink
net/mlx5: Fix SF health recovery flow
Browse files Browse the repository at this point in the history
[ Upstream commit 33de865 ]

SF do not directly control the PCI device. During recovery flow SF
should not be allowed to do pci disable or pci reset, its PF will do it.

It fixes the following kernel trace:
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:387:(pid 40948): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab175 after 1 iterations
mlx5_core.sf mlx5_core.sf.25: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.23: mlx5_health_try_recover:387:(pid 40946): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab193 after 1 iterations
mlx5_core.sf mlx5_core.sf.23: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.25: mlx5_cmd_check:813:(pid 40948): ENABLE_HCA(0x104) op_mod(0x0) failed,
status bad resource state(0x9), syndrome (0x658908)
mlx5_core.sf mlx5_core.sf.25: mlx5_function_setup:1292:(pid 40948): enable hca failed
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:389:(pid 40948): health recovery failed

Fixes: 1958fc2 ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
  • Loading branch information
mosheshemesh2 authored and gregkh committed Jan 5, 2022
1 parent c0cc069 commit 5da639d
Showing 1 changed file with 6 additions and 5 deletions.
11 changes: 6 additions & 5 deletions drivers/net/ethernet/mellanox/mlx5/core/main.c
Expand Up @@ -1775,12 +1775,13 @@ void mlx5_disable_device(struct mlx5_core_dev *dev)

int mlx5_recover_device(struct mlx5_core_dev *dev)
{
int ret = -EIO;
if (!mlx5_core_is_sf(dev)) {
mlx5_pci_disable_device(dev);
if (mlx5_pci_slot_reset(dev->pdev) != PCI_ERS_RESULT_RECOVERED)
return -EIO;
}

mlx5_pci_disable_device(dev);
if (mlx5_pci_slot_reset(dev->pdev) == PCI_ERS_RESULT_RECOVERED)
ret = mlx5_load_one(dev);
return ret;
return mlx5_load_one(dev);
}

static struct pci_driver mlx5_core_driver = {
Expand Down

0 comments on commit 5da639d

Please sign in to comment.