Commit a0e026e
fancer authored and davem330 committed
net: phy: Fix deadlocking in phy_error() invocation

Since commit 91a7cda ("net: phy: Fix race condition on link status
change") all the phy_error() method invocations have been causing the
nested-mutex-lock deadlock because it's normally done in the PHY-driver
threaded IRQ handlers which since that change have been called with the
phydev->lock mutex held. Here is the calls thread:

IRQ: phy_interrupt()
     +-> mutex_lock(&phydev->lock); <--------------------+
         drv->handle_interrupt()                         | Deadlock due
         +-> ERROR: phy_error()                          + to the nested
                    +-> phy_process_error()              | mutex lock
                        +-> mutex_lock(&phydev->lock); <-+
                            phydev->state = PHY_ERROR;
                            mutex_unlock(&phydev->lock);
         mutex_unlock(&phydev->lock);

The problem can be easily reproduced just by calling phy_error() from any
PHY-device threaded interrupt handler. Fix it by dropping the phydev->lock
mutex lock from the phy_process_error() method and printing a nasty error
message to the system log if the mutex isn't held in the caller execution
context.

Note for the fix to work correctly in the PHY-subsystem itself the
phydev->lock mutex locking must be added to the phy_error_precise()
function.

Link: https://lore.kernel.org/netdev/20230816180944.19262-1-fancer.lancer@gmail.com
Fixes: 91a7cda ("net: phy: Fix race condition on link status change")
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

1 parent db1a6ad commit a0e026e
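
The nested-lock chain described in the commit message can be modelled in user space. The sketch below is only an illustration under stated assumptions: `toy_mutex`, `toy_phy`, `toy_process_error_old()` and `toy_interrupt()` are hypothetical stand-ins, not kernel types or functions, and the toy lock returns `-EDEADLK` at the point where a real non-recursive mutex would block forever.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock: a single-threaded model, NOT the kernel's struct mutex. */
struct toy_mutex {
	bool locked;
};

/* Returns 0 on success, or -EDEADLK where a real mutex would block forever. */
static int toy_mutex_lock(struct toy_mutex *m)
{
	if (m->locked)
		return -EDEADLK;
	m->locked = true;
	return 0;
}

static void toy_mutex_unlock(struct toy_mutex *m)
{
	m->locked = false;
}

enum toy_state { TOY_RUNNING, TOY_ERROR };

/* Hypothetical stand-in for struct phy_device. */
struct toy_phy {
	struct toy_mutex lock;
	enum toy_state state;
};

/* Pre-fix shape: the error path takes the lock itself. */
static int toy_process_error_old(struct toy_phy *phy)
{
	int ret = toy_mutex_lock(&phy->lock);

	if (ret)
		return ret;	/* nested acquisition detected */
	phy->state = TOY_ERROR;
	toy_mutex_unlock(&phy->lock);
	return 0;
}

/* Interrupt-path shape: lock, run the handler, unlock. */
static int toy_interrupt(struct toy_phy *phy)
{
	int ret = toy_mutex_lock(&phy->lock);

	if (ret)
		return ret;
	ret = toy_process_error_old(phy);	/* handler reports an error */
	toy_mutex_unlock(&phy->lock);
	return ret;
}
```

In this model, calling `toy_interrupt()` on a freshly initialised `toy_phy` fails with `-EDEADLK` and leaves the state untouched, while calling `toy_process_error_old()` directly (lock not held) succeeds. That asymmetry mirrors why the old code only broke once the interrupt path started holding the lock.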
1 file changed: +7 -4 lines changed
drivers/net/phy/phy.c

Lines changed: 7 additions & 4 deletions
@@ -1184,9 +1184,11 @@ void phy_stop_machine(struct phy_device *phydev)
 
 static void phy_process_error(struct phy_device *phydev)
 {
-	mutex_lock(&phydev->lock);
+	/* phydev->lock must be held for the state change to be safe */
+	if (!mutex_is_locked(&phydev->lock))
+		phydev_err(phydev, "PHY-device data unsafe context\n");
+
 	phydev->state = PHY_ERROR;
-	mutex_unlock(&phydev->lock);
 
 	phy_trigger_machine(phydev);
 }
@@ -1195,7 +1197,9 @@ static void phy_error_precise(struct phy_device *phydev,
 				const void *func, int err)
 {
 	WARN(1, "%pS: returned: %d\n", func, err);
+	mutex_lock(&phydev->lock);
 	phy_process_error(phydev);
+	mutex_unlock(&phydev->lock);
 }
 
 /**
@@ -1204,8 +1208,7 @@ static void phy_error_precise(struct phy_device *phydev,
  *
  * Moves the PHY to the ERROR state in response to a read
  * or write error, and tells the controller the link is down.
- * Must not be called from interrupt context, or while the
- * phydev->lock is held.
+ * Must be called with phydev->lock held.
  */
 void phy_error(struct phy_device *phydev)
 {
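
The locking contract this diff introduces (the inner helper expects the lock to be held and only complains if it is not, while the wrapper takes the lock on behalf of callers that do not hold it) can be sketched in user space as follows. This is a hedged, single-threaded model: `toy_mutex`, `toy_phy`, `toy_process_error()`, `toy_error_precise()` and the `warnings` counter are hypothetical names introduced for illustration and are not part of the kernel.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy lock: a single-threaded model, NOT the kernel's struct mutex. */
struct toy_mutex {
	bool locked;
};

static void toy_mutex_lock(struct toy_mutex *m)
{
	m->locked = true;
}

static void toy_mutex_unlock(struct toy_mutex *m)
{
	m->locked = false;
}

static bool toy_mutex_is_locked(const struct toy_mutex *m)
{
	return m->locked;
}

enum toy_state { TOY_RUNNING, TOY_ERROR };

/* Hypothetical stand-in for struct phy_device. */
struct toy_phy {
	struct toy_mutex lock;
	enum toy_state state;
	int warnings;	/* counts "unsafe context" complaints (illustration only) */
};

/* Post-fix shape: the caller must hold the lock; complain if it does not. */
static void toy_process_error(struct toy_phy *phy)
{
	if (!toy_mutex_is_locked(&phy->lock)) {
		phy->warnings++;
		fprintf(stderr, "PHY-device data unsafe context\n");
	}
	phy->state = TOY_ERROR;
}

/* Wrapper for callers that do not hold the lock yet. */
static void toy_error_precise(struct toy_phy *phy)
{
	toy_mutex_lock(&phy->lock);
	toy_process_error(phy);
	toy_mutex_unlock(&phy->lock);
}
```

Note that the kernel's `mutex_is_locked()` reports whether the mutex is held by anyone, not specifically by the current caller, so the check in the diff is a diagnostic aid rather than a strict ownership assertion. The toy model has the same limitation.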