Sayma PRBS errors when FPGA JESD transceiver clock is GTP_CLK2 #1080
Maybe this has something to do with the TXEN pin of the DAC? |
Maybe. But it happens on both DACs and we only altered the TXEN on DAC2... |
That's true, but this was the only modification I made. There is a 3.3V -> 1.8V conversion using a 200R resistor that injects current into the 1.8V port of the DAC and FPGA. Theoretically the FPGA has protection diodes, but the DAC may not like voltage peaks of roughly 2.5V (1.8V + 0.7V diode drop). I have no idea how this could affect the second DAC channel in such a bizarre way. |
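For a rough sense of the numbers in the comment above, here is a minimal sketch of the clamp arithmetic. It assumes an ideal 200R series resistor, a fixed 0.7V diode drop, and a stiff 1.8V rail; the function name and defaults are illustrative only and do not come from the board design files.

```python
def clamp_estimate(v_src=3.3, v_rail=1.8, v_diode=0.7, r_series=200.0):
    """Estimate the clamped pin voltage and the current injected into the rail.

    Assumptions (not from the schematic): ideal resistor, constant diode drop,
    and a rail that does not move when current is injected into it.
    """
    # The protection diode conducts once the pin exceeds rail + diode drop.
    v_pin = min(v_src, v_rail + v_diode)
    # The remaining voltage drops across the series resistor.
    i_inj = max(0.0, (v_src - v_pin) / r_series)
    return v_pin, i_inj


v_pin, i_inj = clamp_estimate()
print(f"pin voltage ~{v_pin:.2f} V, injected current ~{i_inj * 1e3:.1f} mA")
```

With the default values this gives a pin voltage of about 2.5V and an injected current of about 4 mA, which matches the 2.5V peak mentioned above.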
My guess was that it's due to one of the recent ARTIQ commits rather than the HW changes. But, I might be wrong -- I haven't given it too much thought yet. |
The funny thing is that I started seeing PRBS errors on one board a few days ago, while another was working well. The next day the second board also got the PRBS "sickness". |
Have you tried |
Okay. I'll try that and your blinker next to see if we can find some issue with a simpler logic block that we can focus on instead of debugging complex jesd/memory issues. |
One data point here: running with the SAWG held in reset, I don't see the "crash kernel" crash. But, I do see a bunch of errors during init (JESD PRBS, can't determine SYSREF margin at FPGA) |
Note to self: try this with a no-sawg build. It would be interesting to see if there is a difference between no SAWG and SAWG in reset. If there is, then this seems much more like a vivado issue than a hardware issue. |
I rebuilt the current master without SAWG and still see this: https://hastebin.com/fafokucoqa.sql I have never seen this until recently (around the time that slave loading was added), but it's been in 100% of my recent builds. @sbourdeauducq do you see this issue on your board? |
No I don't. |
One change that is likely to have exposed this bug is this: |
I'm currently building with: f9910ab |
de7d64d can be easily reverted on top of master. It's a very simple change (disable the other 7043 clock output - not required in theory but let's be paranoid - and change dac_refclk back to 0). The sysref phase doesn't have an impact on PRBS. |
Running current master with de7d64d reverted does indeed work. |
@gkasprow @enjoy-digital any idea why that happens? The GTH quad imbalance is the same in both cases, and the number of crossed quads is within spec. |
Do you have the same behaviour if you only keep de7d64d but disable the GTP_CLK1 output of HMC7043? Could it be related to the fact that we now have two reference clocks active and still using QPLLXREFCLKSEL=0b001? (Table 2-8 of UG576) |
The ARTIQ code is not using the QPLL. Should it? |
Ah sorry we are using CPLL. Then maybe check CPLLREFCLKSEL & https://github.com/m-labs/jesd204b/blob/master/jesd204b/phy/gth.py#L260. Should we connect GTREFCLK1 and set CPLLREFCLKSEL to 2? |
How would that help? |
"a single external reference clock with multiple transceivers connected to multiple Quads. The user design connects the IBUFDS_GTE3 output (O) to the GTREFCLK0 ports of the GTHE3/4_COMMON and GTHE3/4_CHANNEL primitives for the GTH transceiver." |
@sbourdeauducq do you still want me to do this? I'm a bit short on time atm... |
No, that's unlikely to help. The other clock is either unrouted or routed to the other quad for DRTIO. And GTREFCLK0 is the correct setting as per the transceiver user guide I quoted above. |
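The reference-clock selection discussed in the comments above can be summarised in a small sketch. Only the two encodings mentioned in this thread are listed (0b001 for GTREFCLK0, and 2 for GTREFCLK1); the class and function names are hypothetical, and the full table should be taken from UG576 rather than from this snippet.

```python
from enum import IntEnum


class CpllRefClkSel(IntEnum):
    """CPLLREFCLKSEL values mentioned in this thread (not the full UG576 table)."""
    GTREFCLK0 = 0b001  # the setting currently in use
    GTREFCLK1 = 0b010  # the alternative suggested above ("set CPLLREFCLKSEL to 2")


def describe_refclk_sel(value):
    """Return the reference clock port name for a CPLLREFCLKSEL value."""
    try:
        return CpllRefClkSel(value).name
    except ValueError:
        # Other encodings exist but are deliberately left out of this sketch.
        return "not covered by this sketch (see UG576)"


for sel in (0b001, 0b010, 0b111):
    print(f"CPLLREFCLKSEL={sel:#05b}: {describe_refclk_sel(sel)}")
```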
@gkasprow remind me, did you try turning the HMC7043 GTP_CLK{1,2} outputs back to LVPECL? Did that help the PRBS errors? |
@hartytp It did help with the clock amplitude but not with the PRBS errors. |
@marmeladapk that's what I thought, thanks for confirming. It was a long shot, but I wondered if this was some SI issue related to the low clock amplitudes using LVDS outputs in combination with 200R LVPECL bias resistors. If you've tested that then I won't bother looking at it again. |
I still see this error after fixing the Vccint supply (I measure 0.951V at the 0R power resistors): https://hastebin.com/hevawerodo.sql I believe that @marmeladapk also found that the Vccint rework did not help the PRBS errors... |
Yes, today I got these errors once. |
@gkasprow Can you check SI and jitter on GTP_CLK1 and GTP_CLK2 on a board that exhibits this problem? And generally investigate this? |
Okay, good! So, the question is how the RTM affects the PRBS. Clock SI? Some PI issue? An issue with the DACs themselves? |
Does that RTM work when using GTP_CLK1? We need to make sure this is the same issue. |
If you prepare a bitstream I will test immediately. |
PRBS errors occur on both DACs |
@hartytp I can do this, could you just point me in the general direction? I didn't follow this discussion closely. |
OK, I got it! |
How does this explain the behavior where CLK1 works but CLK2 doesn't? And why did you measure good clocks on both CLK1 and CLK2? |
If that looks good, look at the JESD lanes. And see if it all looks okay (trigger the scope from the JESD clock). You can try looking at the JESD lanes both with the PRBS pattern and with a square wave (using this patch #1080 (comment)) |
I didn't say that it works with CLK1. It was pure coincidence; maybe they broadcast something else on FM when we did the tests a few months ago :) |
@sbourdeauducq are you happy with that description of what needs to be done? #1080 (comment) |
@hartytp Will a --without-sawg build also have this problem? |
On my board, SAWG doesn't make any difference to this, so I've been testing --without-sawg to speed up my builds. |
Remember to check the UART for PRBS errors before testing, as there is no point testing on a working AMC/RTM pair. |
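One way to do the UART check described above is to capture the boot log to a text file and filter it. This is only a sketch: it assumes the relevant lines contain both "PRBS" and "error" in some order, which is a guess and not the exact ARTIQ message format, and the sample log lines below are made up for illustration.

```python
import re

# Assumed pattern: any line mentioning both "PRBS" and "error", case-insensitive.
PRBS_ERROR = re.compile(r"prbs.*error|error.*prbs", re.IGNORECASE)


def prbs_error_lines(log_text):
    """Return the lines of a captured UART log that look like PRBS errors."""
    return [line for line in log_text.splitlines() if PRBS_ERROR.search(line)]


# Hypothetical log excerpt, not real board output.
sample_log = """\
booting
DAC0 PRBS test: errors detected on lane 3
DAC1 PRBS test passed
startup complete
"""

hits = prbs_error_lines(sample_log)
if hits:
    print("PRBS errors present, this AMC/RTM pair is worth testing:")
    for line in hits:
        print("  " + line)
else:
    print("No PRBS errors found, this pair looks like a working one.")
```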
And since the SMA is grounded via 10pF, this presents enough impedance to pick up nearby RF. I did some tests:
then I removed the short circuit
then I disabled the generator
then I enabled the generator
The signal from the generator leaks via the non-ideal cable shield and the 10pF capacitor to the LTC chip. Its quality is poor, but it is enough for the HMC830 to lock. |
I cannot say I am "happy" about any of this, but yes, this procedure looks correct and hopefully will turn up something. |
@gkasprow that's posted on the wrong issue, I think. This is already hard to follow without cross-posting! |
I've noticed that, too many threads opened... |
@gkasprow to make sure we don't waste time, please can you send me the binaries you were using for your tests? I'd like to check that I can reproduce the PRBS issues on my board with your binaries. If I can then I'll post it back to you tomorrow morning. |
@marmeladapk please post here the binaries I used for the tests. They are on your computer, under your account. |
Thanks for sending me the binaries @marmeladapk. Using them, I see:
|
No idea why I'm seeing this new error on DAC0. But, anyway, this does show up the PRBS errors on at least one DAC, so it seems to be a fine binary for testing. |
@gkasprow can you remind me where on the AMC I can probe CLK2? I'll double check that before I return the board. |
@gkasprow the AMC + RTM have been delivered and signed for in WUT. |
@gkasprow What was the problem? Why was it not found when measuring the clocks? |
The board I got from @hartytp had one of the coupling capacitors missing (mechanically damaged). The board I got from @sbourdeauducq had the SMA input shorted to its shell. Both caused the PRBS issue. |
Do the hardware bugs discovered by @gkasprow explain all the reported PRBS errors related to this issue? |
Yes. |
Since doing the slave FPGA loading rework and upgrading to the latest master, I've started seeing PRBS errors roughly 100% of the time on boot. I never saw that before. Not sure if it's due to the rework or to changes in the code. IIRC @gkasprow saw this as well...
https://pastebin.com/2X0Y17B6