Fix transitions in CXQ state machine
transition() did not actually check the current state, which caused IO
handlers to open the queue prematurely, leading to crashes in a
multi-process setup.

A race condition probably still lurks here: the check and the assignment
are separate steps rather than a single atomic "LOCK CMPXCHG" operation,
as noted in the code comment.

The CXQ state machine is called from the push()/pull() methods, so the
added check could introduce branches (side traces) that might hurt
performance.
alexandergall committed Oct 27, 2020
1 parent adc7d3d commit 2219748
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions src/apps/mellanox/connectx4.lua
@@ -163,8 +163,12 @@ local DEAD = 3
 -- Returns true on successful transition, false if oldstate does not match.
 function transition (cxq, oldstate, newstate)
    -- XXX use atomic x86 "LOCK CMPXCHG" instruction. Have to teach DynASM.
-   cxq.state = newstate
-   return true
+   if cxq.state == oldstate then
+      cxq.state = newstate
+      return true
+   else
+      return false
+   end
 end

---------------------------------------------------------------
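
The behavior change in this hunk can be illustrated with a small standalone sketch. Only `DEAD = 3` and the body of `transition` come from the diff. The state names `INIT`, `READY`, and `OPEN`, their values, and the `try_open` helper are hypothetical and exist only to show how an unchecked transition lets a caller open a queue that is not yet ready. Like the patched code, the checked version is not atomic, so it does not address the race mentioned in the commit message.

```lua
-- Hypothetical state constants; only DEAD = 3 appears in the diff.
local INIT, READY, OPEN, DEAD = 0, 1, 2, 3

-- Pre-fix behavior: oldstate is ignored and the state is always overwritten.
local function transition_unchecked (cxq, oldstate, newstate)
   cxq.state = newstate
   return true
end

-- Post-fix behavior: transition only if the current state matches oldstate.
-- Note: check and assignment are still two separate, non-atomic steps.
local function transition_checked (cxq, oldstate, newstate)
   if cxq.state == oldstate then
      cxq.state = newstate
      return true
   else
      return false
   end
end

-- Hypothetical IO handler step: open the queue once it is READY.
local function try_open (cxq, transition)
   return transition(cxq, READY, OPEN)
end

local a = { state = INIT }
local b = { state = INIT }

-- Unchecked: the queue is "opened" although it never reached READY.
print(try_open(a, transition_unchecked), a.state)
-- Checked: the transition is refused and the state is left untouched.
print(try_open(b, transition_checked), b.state)
```

With the checked variant, a caller can use the boolean result to retry or back off instead of proceeding with a queue in the wrong state.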