AxiLiteCrossbarI2cMux: deselect TCA9548 channels between transactions #1420
Merged
Each AXIL transaction previously wrote a single byte to the I2C MUX
channel-select register and assumed the device would atomically replace
the prior channel. On the BittWare XUP-VV8 this leaves enough sticky
state on TCA9548 channels 5 and 7 (QSFP slots 1 and 3) that the second
I2C transaction targeting those slots returns RESP=2 (SLVERR), and every
subsequent access fails as well. Channels 4 and 6 (slots 0, 2) happen to
tolerate the same sequence, which made the failure look QSFP+-specific
when the actual differential is the TCA9548 channel mask.
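To make the channel masks concrete, here is a minimal sketch of how the affected slots map onto TCA9548 control-register bytes. The TCA9548 control register is one-hot (bit n enables channel n); the slot-to-channel table is an assumption inferred from the slot/channel pairs quoted above, and the helper name `tca9548_mask` is hypothetical, not taken from surf.

```python
# Assumption: QSFP slot n sits behind TCA9548 channel 4 + n, inferred from the
# pairs quoted in this PR (slots 0..3 <-> channels 4..7).
QSFP_SLOT_TO_CHANNEL = {0: 4, 1: 5, 2: 6, 3: 7}


def tca9548_mask(channel: int) -> int:
    """Return the one-hot control-register byte that enables `channel`."""
    if not 0 <= channel <= 7:
        raise ValueError("TCA9548 has channels 0..7")
    return 1 << channel


for slot, channel in QSFP_SLOT_TO_CHANNEL.items():
    print(f"slot {slot}: channel {channel}, mask 0x{tca9548_mask(channel):02X}")
```

Under that mapping the failing slots 1 and 3 use masks 0x20 and 0x80, and the working slots 0 and 2 use 0x10 and 0x40.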
Insert an explicit deselect-all write (0x00) before the channel-select
write. 0x00 is the documented "no channel selected" state for every
device in I2cMuxPkg.vhd (TCA9548, PCA9547, PCA9544A, PCA9546A, PCA9540B),
so the fix applies uniformly regardless of which decode map is used.
Two new states are added to the existing FSM:
  DESELECT_S - waits for the 0x00 write to ack, then loads the saved
               target channel mask and pulses i2cRstL low again.
  MUX_RST_S  - mirrors RST_S but transitions straight to MUX_S so the
               deselect path does not re-enter DESELECT_S.
The chanMask record field carries the target channel mask through the
deselect-write phase. Flow is now:
  IDLE_S -> RST_S -> DESELECT_S -> MUX_RST_S -> MUX_S -> XBAR_S -> IDLE_S
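The state flow above can be sketched as a small behavioral model. This is an illustration in Python, not the VHDL in the PR: the state names come from the commit message, while `run_transaction` and the choice to record each mux write when its waiting state exits (0x00 in DESELECT_S, the channel mask in MUX_S) are simplifying assumptions.

```python
from enum import Enum


class State(Enum):
    IDLE_S = "IDLE_S"
    RST_S = "RST_S"
    DESELECT_S = "DESELECT_S"
    MUX_RST_S = "MUX_RST_S"
    MUX_S = "MUX_S"
    XBAR_S = "XBAR_S"


# Linear state order quoted in the flow line above.
NEXT_STATE = {
    State.IDLE_S: State.RST_S,
    State.RST_S: State.DESELECT_S,
    State.DESELECT_S: State.MUX_RST_S,
    State.MUX_RST_S: State.MUX_S,
    State.MUX_S: State.XBAR_S,
    State.XBAR_S: State.IDLE_S,
}

DESELECT_ALL = 0x00  # "no channel selected" byte


def run_transaction(chan_mask: int):
    """Walk one AXIL transaction; return (state trace, mux register writes)."""
    trace = [State.IDLE_S]
    mux_writes = []
    state = NEXT_STATE[State.IDLE_S]
    while state is not State.IDLE_S:
        trace.append(state)
        if state is State.DESELECT_S:
            # Assumption: the 0x00 write has been acked when this state exits.
            mux_writes.append(DESELECT_ALL)
        elif state is State.MUX_S:
            # Assumption: the channel-select write has been acked here.
            mux_writes.append(chan_mask)
        state = NEXT_STATE[state]
    trace.append(State.IDLE_S)
    return trace, mux_writes


trace, writes = run_transaction(0x20)
print(" -> ".join(s.value for s in trace))
print([hex(w) for w in writes])
```

The point of the model is the write order: every transaction sends 0x00 to the mux before the target mask, and DESELECT_S is visited exactly once.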
Validated on a BittWare XUP-VV8 (VU13P) with QSFP28 in slot 0 and QSFP+
modules in slots 1, 2, 3:
  Before: slots 1 and 3 fail every transaction after the first one;
          Qsfp[1].ReadDevice() cascades into hundreds of SLVERRs.
  After:  300/300 ok on every slot in a sustained burst (~940 txn/s),
          Qsfp[i].ReadDevice() succeeds on all 4 slots, ErrorCount = 0.
The change is additive for existing surf users - the deselect step is a
single extra I2C byte write per AXIL transaction and gives every
MUX-fronted bus the same clean channel-select handshake regardless of
whether the previous downstream device left the bus in a marginal state.
I regression-tested this code on the KCU1500, which uses the i2cRstL port, and confirmed that it also works with this new patch for the AXI-Lite I2C MUX.
bengineerd
approved these changes
May 18, 2026
Summary
- Insert an explicit deselect-all write (0x00 to the TCA9548) before each channel-select write in AxiLiteCrossbarI2cMux.
- Add two FSM states: DESELECT_S (waits for the 0x00 write to ack, loads target channel mask, pulses i2cRstL again) and MUX_RST_S (mirrors RST_S but goes straight to MUX_S so the deselect path doesn't re-enter DESELECT_S).
- Flow is now: IDLE_S → RST_S → DESELECT_S → MUX_RST_S → MUX_S → XBAR_S → IDLE_S.
Why
On the BittWare XUP-VV8 (VU13P), every AXIL transaction after the first to TCA9548 channels 5 and 7 (QSFP slots 1 and 3) returned RESP=2 SLVERR and cascaded into hundreds of masked I2C failures, while channels 4 and 6 (slots 0 and 2) worked. The failure originally looked QSFP+-specific because slot 0 had a QSFP28 and slots 1/3 had QSFP+ modules, but a QSFP+ in slot 2 worked fine, proving the differential was the TCA9548 channel mask, not the module type.
The TCA9548 channel-select register is a single byte that should be atomically overwritten by each new mask. In practice the previous channel left enough sticky state on certain channel pairs that the next transaction NACK'd or returned an undefined response. Forcing a 0x00 write between transactions gives every channel a clean deassert before the new channel asserts.
Validation
Tested on a BittWare XUP-VV8 (VU13P) with:
- Qsfp[1].ReadDevice() (originally failing)
- Qsfp[i].ReadDevice()
Compatibility
The change is additive — every TCA9548-fronted AXIL transaction picks up one extra single-byte I2C write to deassert all channels before the new channel-select. Existing surf users get the same cleaner handshake whether they need it or not; cost is one byte (~90 µs at 100 kHz SCL) per AXIL transaction.
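As a rough check on the timing figure, the sketch below counts SCL periods per byte. Nine clocks per byte (8 data bits plus ACK) at 100 kHz gives the ~90 µs quoted above for one byte on the wire. Assuming a standard 7-bit-addressed I2C write, the address byte costs another nine clocks, so the full deselect transaction would be closer to 180 µs; START/STOP overhead and clock stretching are ignored. The function names are hypothetical.

```python
CLOCKS_PER_BYTE = 9  # 8 data bits + 1 ACK bit


def i2c_clocks(payload_bytes: int, include_address: bool) -> int:
    """SCL periods for a write, optionally counting the address byte."""
    bytes_on_wire = payload_bytes + (1 if include_address else 0)
    return CLOCKS_PER_BYTE * bytes_on_wire


def i2c_time_us(payload_bytes: int, scl_hz: float, include_address: bool) -> float:
    """Approximate bus time in microseconds (no START/STOP, no stretching)."""
    return i2c_clocks(payload_bytes, include_address) * 1e6 / scl_hz


print(i2c_time_us(1, 100_000, include_address=False))  # data byte only
print(i2c_time_us(1, 100_000, include_address=True))   # address + data byte
```

Either way the added cost stays well under a millisecond per AXIL transaction at 100 kHz SCL.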
Test plan
- updatePcieFpga.py, reboot, confirm BittWareXupVv8DmaLoopback Hardware Type 0x002 enumerates
- Qsfp[i].Identifier.get() × 30 per slot: zero failures
- Qsfp[i].ReadDevice() (recursive, exercises _UpperPageProxy) on all 4 slots: zero failures
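The repeated-read step in the test plan could be scripted along these lines. This is a minimal sketch, not the script used for the PR: `count_failures` is a hypothetical helper, the reader is injected as a callable so the block runs without hardware, and it assumes a failed read surfaces as a Python exception. On real hardware the callable would presumably wrap Qsfp[slot].Identifier.get().

```python
from typing import Callable, Dict, Iterable


def count_failures(
    read_slot: Callable[[int], object],
    slots: Iterable[int],
    repeats: int,
) -> Dict[int, int]:
    """Call read_slot(slot) `repeats` times per slot; count raised errors."""
    failures = {}
    for slot in slots:
        failed = 0
        for _ in range(repeats):
            try:
                read_slot(slot)
            except Exception:
                # Assumption: a failed bus access raises an exception.
                failed += 1
        failures[slot] = failed
    return failures


def fake_read(slot: int) -> int:
    """Stand-in reader for illustration; always succeeds."""
    return 0


result = count_failures(fake_read, range(4), repeats=30)
print(result)
```

A passing run reports zero failures for every slot, matching the "zero failures" criterion in the plan.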