fix/simplify serwb clocking #967

Closed
sbourdeauducq opened this issue Mar 21, 2018 · 66 comments

@sbourdeauducq
Member

The serwb clocking is needlessly complicated (too many clock domains) and precludes the use of Xilinx-recommended clocking schemes for the IOSERDES:
https://www.xilinx.com/support/documentation/application_notes/xapp1324-design-selectio-component-primitives.pdf
https://www.xilinx.com/support/answers/67885.html

I suggest the following:

  • move all clocking (MMCMs etc.) out of serwb, make them part of the system CRG.
  • operate at 1Gbps (there is already a gearbox for non-integer ratios, so that should not be very complicated?).
  • reuse existing SDRAM and system clocks.

Note: for speed grade 1 (which is what we have), on HR banks, the SERDES in less-insane (as opposed to "native") mode is spec'd to 1000Mbps only.

@sbourdeauducq sbourdeauducq added this to the 4.0 milestone Mar 21, 2018
@hartytp
Collaborator

hartytp commented Mar 26, 2018

What's the timeline for this? If the mem test now works, the SER WB is the main source of unreliability on Sayma from the ARTIQ side of things AFAICT. It would be great to have this working more robustly.

@enjoy-digital
Contributor

@hartytp: i'll work on that on wednesday.

@enjoy-digital
Contributor

@sbourdeauducq: i just want to check things with you before implementing:

  • serwb_serdes_20x_clk would be sys4x_clk
  • serwb_serdes_5x_clk would be sys_clk
  • serwb_serdes would be removed, we would use sys_clk with stb/ack dataflow control.

If so, should i keep specific serwb_serdes clock domains (ie serwb_serdes_4x/serwb_serdes) or should i use sys4x_clk/sys_clk?

@sbourdeauducq
Member Author

Use sys4x_clk/sys_clk - the Wishbone side is in the sys_clk domain and synchronous to the SERDES.

@hartytp
Collaborator

hartytp commented Apr 2, 2018

What's the status of this? It's been a while since the SDRAM was fixed, and I'd love to have a version of Sayma I can actually use.

@enjoy-digital
Contributor

@hartytp: yes sorry, i started working on that but need to test and commit.

@enjoy-digital
Contributor

@sbourdeauducq
Member Author

> I was doing some tests with a simple design but crashed JTAG on sayma3 with CTRL-C...

Doesn't @whitequark's usbreset.c work around this bug?

@enjoy-digital
Contributor

@sbourdeauducq: i'm going to try, can you review the code to tell me if that's what you want?

@sbourdeauducq
Member Author

No, that's not what was described here: #967 (comment)
We should not need a 25MHz clock, or any additional clock for that matter.

What I was thinking was:

  • the SERDES is in 8:1 mode and operates at 1Gbps line rate.
  • a "gearbox", running at 125MHz, takes 8-bit words from the ISERDES and turns them into 10-bit words (with a strobe signal indicating valid words) for subsequent 8b10b decoding.
  • another gearbox (with an ack signal indicating word acceptance) does 10->8 for the OSERDES.
  • I don't think using the dataflow stream classes is necessary there since it's just a strobe signal in one case and a ack signal in the other (but that's a detail, the stream classes are OK if you really prefer them).
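
The two gearboxes described above can be modelled in a few lines of plain Python. This is only a behavioural sketch, not the serwb gateware: the class names, the LSB-first bit order and the absence of any word alignment (comma detection) are assumptions made for illustration.

```python
class Gearbox8to10:
    """RX side: takes one 8-bit ISERDES word per cycle, emits 10-bit words with a strobe."""

    def __init__(self):
        self.buffer = 0   # accumulated bits, oldest bit in the LSB (assumed bit order)
        self.level = 0    # number of valid bits currently held in buffer

    def cycle(self, word8):
        """Push 8 bits; return (stb, word10). word10 is only meaningful when stb is True."""
        self.buffer |= (word8 & 0xFF) << self.level
        self.level += 8
        if self.level >= 10:
            word10 = self.buffer & 0x3FF
            self.buffer >>= 10
            self.level -= 10
            return True, word10
        return False, 0


class Gearbox10to8:
    """TX side: emits one 8-bit OSERDES word per cycle, acks 10-bit words as it consumes them."""

    def __init__(self):
        self.buffer = 0
        self.level = 0

    def cycle(self, next_word10):
        """Return (ack, word8). ack is True when next_word10 was consumed this cycle."""
        ack = False
        if self.level < 8:
            # Not enough bits left for a full byte: load the next 10-bit word.
            self.buffer |= (next_word10 & 0x3FF) << self.level
            self.level += 10
            ack = True
        word8 = self.buffer & 0xFF
        self.buffer >>= 8
        self.level -= 8
        return ack, word8


def loopback(words10, cycles):
    """Connect TX gearbox to RX gearbox for a number of cycles and collect received words."""
    tx, rx = Gearbox10to8(), Gearbox8to10()
    pending, received = list(words10), []
    for _ in range(cycles):
        ack, byte = tx.cycle(pending[0] if pending else 0)
        if ack and pending:
            pending.pop(0)
        stb, word = rx.cycle(byte)
        if stb:
            received.append(word)
    return received
```

In this model both gearboxes pause one cycle in five (four 10-bit words per five 8-bit words), which is why a single strobe on one side and a single ack on the other is enough flow control.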

@sbourdeauducq
Member Author

Do we have a HP or HR bank?
HR is limited to 1.00Gbps. If we have HP, maybe we can instead keep 1.25Gbps, replace that stupid Ultrascale IOSERDES with IODDR (from my experience with RGMII, that one component wasn't lousy with bugs and misfeatures, though there may be surprises at high speeds), and do 5:10 in fabric.
Or even drop the special IOs completely and operate at a speed attainable by the registers (>500Mbps should still be attainable I believe, though the IODELAY range may then become an issue).

@enjoy-digital
Contributor

I've implemented something and will test it tomorrow.

@enjoy-digital
Contributor

I still have to do some debugging, but i should not be far from having something working using only the sys/sys4x clocks.

@enjoy-digital
Contributor

It's working on a simple design (Nexys Video / SERWB over an HDMI cable and the two HDMI TX/RX ports / a master and a slave in the same design). I'm doing more testing on that (P&R is only 30 seconds, so it's easier), cleaning up things, and i'll test/integrate in ARTIQ.

@enjoy-digital
Contributor

Integrated and tested on ARTIQ. Scrambling is not enabled yet. I'll try to enable it next week. It would be good if @hartytp (or someone else) could do more testing and give me some feedback now that it's fresh in my mind.

@hartytp
Collaborator

hartytp commented Apr 7, 2018

Great. Will look at that soon

@hartytp
Collaborator

hartytp commented Apr 9, 2018

@enjoy-digital Thanks for doing that!

I'll have a look at that now. Anything you want me to do beyond power cycling it a few times and looking for errors?

@enjoy-digital
Contributor

enjoy-digital commented Apr 9, 2018

@hartytp: yes you can do that. You also said you had SPI issues when trying to debug the hmc830. If that's something easy to reproduce, it would be interesting to test whether you still have issues.

@hartytp
Collaborator

hartytp commented Apr 9, 2018

I never got to the point of having a proper test that reproducibly crashed the ser-WB as there were other more pressing issues at the time.

I'll start using Sayma again and keep an eye out; if things are as bad as before, I'll see it soon enough.

@enjoy-digital
Contributor

ok fine, thanks.

@hartytp
Collaborator

hartytp commented Apr 13, 2018

Here are as many repeats as I could be bothered with of: artiq_flash -t sayma start followed by openocd -f sayma_new.cfg -c "pld load 0 rtm.bit; exit" https://hastebin.com/soxibaweka.sql

@hartytp
Collaborator

hartytp commented Apr 13, 2018

Thanks @enjoy-digital !

@sbourdeauducq
Member Author

sbourdeauducq commented Apr 13, 2018

@hartytp And does it go further (e.g. talk to the HMC7043), or crash somewhere?

@sbourdeauducq
Member Author

@enjoy-digital Are you implementing all the tricks that were found to be necessary for the Ultrascale IO garbage to behave with SDRAM into serwb?

@enjoy-digital
Contributor

@hartytp, @jbqubit: sorry, stupid question, but was the rtm gateware also regenerated? (i just want to be sure)

Note that with the old serwb, we were not able to run it at 1.25Gbps on all boards and reduced it to 625Mbps. It's now increased to 1Gbps, so it would be interesting to test on the other Sayma boards, as @sbourdeauducq is suggesting, since IIRC it was not working correctly at 1.25Gbps.

@sbourdeauducq: we should implement the tricks but i'll double check.

@hartytp
Collaborator

hartytp commented Apr 14, 2018

@enjoy_digital yes I rebuilt everything from scratch. But if you're worried then feel free to send me binaries to load.

@hartytp
Collaborator

hartytp commented Apr 14, 2018

What about adding a prbs check to verify the link/clocking?

@enjoy-digital
Contributor

enjoy-digital commented Apr 14, 2018

@hartytp: yes that's a good idea. I'll add a prbs check after initialization.

@hartytp
Collaborator

hartytp commented Apr 14, 2018

Great, thanks!

@sbourdeauducq
Member Author

You should be able to go to 500Mbps, and keep the same clocks, by putting the SERDES into 4:1 mode.
Speed for serwb is not very important, the priority is to get something to work at all and without intermittent or board-dependent bugs that waste the maximum amount of time.

@sbourdeauducq
Member Author

Though, the maximum range of IDELAY varies between 1.28ns and 7.68ns with PVT (not a typo, welcome to Ultrascale). So, anything below ~780Mbps has to deal with the IDELAY range being maxed out, otherwise we will stay in the above-mentioned bug hell. Maybe the OSERDES can be in 4:1 and the ISERDES in 1:8 and you switch to the other set of outputs when the IDELAY maxes out. Use EN_VTC, I don't trust the Ultrascale taps to be stable over any significant amount of time.
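
The numbers in the two comments above can be checked with some quick arithmetic. The sketch below only uses figures quoted in this thread (125MHz parallel clock, 4:1 and 8:1 SERDES ratios, 1.28ns to 7.68ns IDELAY range); the function and constant names are made up for illustration.

```python
SYS_CLK_MHZ = 125.0            # parallel-side clock quoted in this thread
IDELAY_RANGE_WORST_NS = 1.28   # smallest full-scale IDELAY range over PVT (from the comment above)
IDELAY_RANGE_BEST_NS = 7.68    # largest full-scale IDELAY range over PVT (from the comment above)


def line_rate_mbps(serdes_ratio, clk_mhz=SYS_CLK_MHZ):
    """Line rate of an N:1 SERDES whose parallel side runs at clk_mhz."""
    return serdes_ratio * clk_mhz


def unit_interval_ns(rate_mbps):
    """Duration of one bit (1 UI) in nanoseconds."""
    return 1000.0 / rate_mbps


def covers_full_ui(rate_mbps, delay_range_ns):
    """True if the delay line can sweep at least one whole UI."""
    return delay_range_ns >= unit_interval_ns(rate_mbps)


for ratio in (4, 8):
    rate = line_rate_mbps(ratio)
    print("%d:1 -> %.0f Mbps, UI = %.2f ns, worst-case IDELAY covers 1 UI: %s"
          % (ratio, rate, unit_interval_ns(rate),
             covers_full_ui(rate, IDELAY_RANGE_WORST_NS)))

# Lowest line rate whose UI still fits in the worst-case delay range.
print("threshold: %.2f Mbps" % (1000.0 / IDELAY_RANGE_WORST_NS))
```

With a 1.28ns worst-case range the threshold comes out at 781.25Mbps, which is where the "~780Mbps" figure comes from: at 500Mbps the UI is 2ns and the delay line alone cannot be relied on to span it.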

@enjoy-digital
Contributor

I've reduced serwb speed to 500Mbps, it will maybe improve things.

@hartytp: I've added PRBS on my simple design, but if we want to add PRBS to serwb in ARTIQ i have to think a bit about how things should be sequenced, since we are not able to control the RTM when in PRBS mode.
@jbqubit: your analysis is interesting, but i'm not sure i understand the difference between doing "artiq_flash -t sayma -f sayma.config storage start" and reloading the AMC, can you explain?

@jbqubit
Contributor

jbqubit commented Apr 16, 2018

By "reloading the AMC" I mean that I bypass the Flash by running this.

@sbourdeauducq
Member Author

sbourdeauducq commented Apr 17, 2018

> I've reduced serwb speed to 500Mbps, it will maybe improve things.

Not this way: the RX timing windows are still the same, you depend on the data being correct on an arbitrary hardcoded set of downsampled ISERDES bits. You require hardware capable of 1Gbps but use it at 500Mbps.
Instead, you need to switch between two sets when the IDELAY maxes out.

TX looks fine.
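
A rough sketch of the "switch between two sets" idea for the RX side, assuming 500Mbps data sampled by a 1:8 ISERDES at 1Gsample/s so that every data bit appears twice in each 8-bit word. Everything here is hypothetical: `link_ok` stands in for whatever check the gateware or firmware would run at a given setting, and the good taps are assumed to form one contiguous window.

```python
def downsample(word8, phase):
    """Keep every second bit of an 8-bit ISERDES word, starting at bit `phase` (0 or 1)."""
    return [(word8 >> i) & 1 for i in range(phase, 8, 2)]


def find_working_setting(link_ok, n_taps):
    """Scan all delay taps on sample set 0; if the delay maxes out with no good tap,
    repeat the scan on sample set 1 (equivalent to shifting by half a data bit).

    link_ok(phase, tap) is a hypothetical callback returning True when data is
    received correctly. Returns (phase, tap) or None if nothing works.
    """
    for phase in (0, 1):
        good = [tap for tap in range(n_taps) if link_ok(phase, tap)]
        if good:
            # Assumes the good taps are contiguous; pick the middle of the window.
            return phase, good[len(good) // 2]
    return None
```

For example, with a fake link that only works on the second sample set for taps 10 to 19, `find_working_setting(lambda p, t: p == 1 and 10 <= t < 20, 32)` returns `(1, 15)`.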

@sbourdeauducq
Member Author

sbourdeauducq commented Apr 17, 2018

> have to think a bit how things should sequenced, since we are not able to control the RTM when in PRBS mode.

Have the gateware send/receive exactly X megabytes of PRBS data (on an already established link) then go back to normal operation?
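
As an illustration of a fixed-length PRBS check, here is a small software model. It is a sketch under assumptions, not the serwb implementation: PRBS7 (x^7 + x^6 + 1) is chosen arbitrarily, and both ends are assumed to start from the same seed at a known point of the already established link.

```python
from itertools import islice


def prbs7(seed=0x7F):
    """Yield PRBS7 bits from a 7-bit LFSR (polynomial x^7 + x^6 + 1, period 127)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        newbit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | newbit) & 0x7F
        yield newbit


def count_errors(received_bits, seed=0x7F):
    """Compare a finite list of received bits against a local reference generator."""
    reference = prbs7(seed)
    return sum(rx != ref for rx, ref in zip(received_bits, reference))


# Send a fixed number of bits, corrupt three of them, then count mismatches.
sent = list(islice(prbs7(), 1000))
corrupted = list(sent)
for i in (3, 500, 999):
    corrupted[i] ^= 1
print(count_errors(sent), count_errors(corrupted))  # prints: 0 3
```

Because the test length is fixed in advance, both sides know when to return to normal operation without needing any control traffic during the test.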

@sbourdeauducq
Member Author

Also I don't see how the scan algorithm can possibly work correctly right now, the delay range is less than 1 UI.
I suggest keeping 1Gbps unless there is a clear reason not to; handling the delay range limit without bugs is not straightforward.

@enjoy-digital
Contributor

@sbourdeauducq: ok, i reverted linerate, moved some things, added a line test after initialization.
@jbqubit: i made some changes to the RTM reset and added some debug information, can you do a test on your board?

@jbqubit
Contributor

jbqubit commented Apr 17, 2018

@enjoy-digital I'm building --without-sawg now and will test shortly.

@enjoy-digital
Contributor

@jbqubit: thanks

@jbqubit
Contributor

jbqubit commented Apr 17, 2018

I built from master with --without-sawg. Using AMC+RTM. @enjoy-digital Looks like this was a step backward. memtest fails 10 of 10 times I tried on this board pair (AMC s/n -8).

 __  __ _ ____         ____
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |    
| |  | | |___) | (_) | |___ 
|_|  |_|_|____/ \___/ \____|

MiSoC Bootloader
Copyright (c) 2017-2018 M-Labs Limited

Bootloader CRC passed
Gateware ident 4.0.dev+838.gfe689ab4.dirty
Initializing SDRAM...
DQS initial delay: 94 taps
Write leveling scan:
Module 3:
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011111101000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 2:
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011010101000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 1:
0000000000000000000000000000000000000000000000000000000000000000000000000000000000001000111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011011101000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 0:
0000000000000000000000000000000000000000000000000000000000000000000000000000000000111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111101110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
DQS initial delay: 94 taps
Write leveling: 80 85 106 107 done
Read leveling scan:
Module 3:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 2:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 1:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 0:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Read leveling: Zero window: 3: 0-0 (0)
SDRAM initialization failed
Halting.

memtest passes on this same board (AMC s/n -8) for 20180413 build from master using --without-sawg. I checked again today with the 20180413 version and see that it works.


Trying the second AMC+RTM that's in the lab (AMC s/n -7): memtest and serwb pass, but AD9154 init fails 6 of 6 times I tried.

 __  __ _ ____         ____
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |    
| |  | | |___) | (_) | |___ 
|_|  |_|_|____/ \___/ \____|

MiSoC Bootloader
Copyright (c) 2017-2018 M-Labs Limited

Bootloader CRC passed
Gateware ident 4.0.dev+838.gfe689ab4.dirty
Initializing SDRAM...
DQS initial delay: 114 taps
Write leveling scan:
Module 3:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000110111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110000000000000000000000000000000000000000000000
Module 2:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111101111100100000000000000000000000000000000000
Module 1:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000010000000000000000000000000000000000000000000000000000000000000000000000000
Module 0:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000101011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110001000000000000000000000000000000000000000000000000000000000
DQS initial delay: 114 taps
Write leveling: 110 105 126 121 done
Read leveling scan:
Module 3:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 2:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100110011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100110000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 1:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010011011001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 0:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000110011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Read leveling: 251+-88 238+-95 218+-91 209+-89 done
SDRAM initialized
Memory test passed

Booting from flash...
Starting firmware.
[     0.000007s]  INFO(runtime): ARTIQ runtime starting...
[     0.003889s]  INFO(runtime): software version 4.0.dev+796.g5ca59467
[     0.010157s]  INFO(runtime): gateware version 4.0.dev+838.gfe689ab4.dirty
[     0.016980s]  INFO(runtime): log level set to INFO by default
[     0.022677s]  INFO(runtime): UART log level set to INFO by default
[     0.028809s]  INFO(board_artiq::serwb): waiting for AMC/RTM serwb bridge to be ready...
[     2.632375s]  INFO(board_artiq::serwb): done.
[     2.635478s]  INFO(board_artiq::serwb): RTM gateware version 4.0.dev+838.gfe689ab4.dirty
[     2.643494s]  INFO(runtime): press 'e' to erase startup and idle kernels...
[     3.643005s]  INFO(runtime): continuing boot
[     3.645995s]  INFO(board_artiq::hmc830_7043::hmc7043): HMC7043 found
[     3.652293s]  INFO(board_artiq::hmc830_7043::hmc7043): HMC7043 configuration...
[     3.670405s]  INFO(board_artiq::ad9154): AD9154-0 found
[     3.674330s]  INFO(board_artiq::ad9154): AD9154-0 configuration...
[     3.882010s]  WARN(board_artiq::ad9154): AD9154-0 config attempt #0 failed (AD9154 SERDES PLL lock timeout), retrying
[     3.901589s]  INFO(board_artiq::ad9154): AD9154-0 found
[     3.905505s]  INFO(board_artiq::ad9154): AD9154-0 configuration...
[     4.113008s]  WARN(board_artiq::ad9154): AD9154-0 config attempt #1 failed (AD9154 SERDES PLL lock timeout), retrying
[     4.132581s]  INFO(board_artiq::ad9154): AD9154-0 found
[     4.136497s]  INFO(board_artiq::ad9154): AD9154-0 configuration...
[     4.344009s]  WARN(board_artiq::ad9154): AD9154-0 config attempt #2 failed (AD9154 SERDES PLL lock timeout), retrying
[     4.363583s]  INFO(board_artiq::ad9154): AD9154-0 found
[     4.367499s]  INFO(board_artiq::ad9154): AD9154-0 configuration...
[     4.575011s]  WARN(board_artiq::ad9154): AD9154-0 config attempt #3 failed (AD9154 SERDES PLL lock timeout), retrying
[     4.594585s]  INFO(board_artiq::ad9154): AD9154-0 found
[     4.598501s]  INFO(board_artiq::ad9154): AD9154-0 configuration...
[     4.806013s]  WARN(board_artiq::ad9154): AD9154-0 config attempt #4 failed (AD9154 SERDES PLL lock timeout), retrying
[     4.825587s]  INFO(board_artiq::ad9154): AD9154-0 found
[     4.829503s]  INFO(board_artiq::ad9154): AD9154-0 configuration...

memtest passes on this same board (AMC s/n -7) for 20180413 build from master using --without-sawg.

@sbourdeauducq
Member Author

Please open two new issues for those two problems.

@sbourdeauducq
Member Author

[ 0.003889s] INFO(runtime): software version 4.0.dev+796.g5ca59467
[ 0.010157s] INFO(runtime): gateware version 4.0.dev+838.gfe689ab4.dirty

And address the version mismatch before reporting any issues. Keep amc bitstream, rtm bitstream, bootloader and runtime in sync at all times, otherwise you are wasting time.
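
A quick way to catch this kind of mismatch before filing a report is to scan the captured UART log for the version lines. The helper below is a hypothetical sketch (not an ARTIQ tool); it assumes the log line format shown in this thread and treats any difference in the version string, including a ".dirty" suffix, as a mismatch.

```python
import re

# Matches the runtime's software/gateware lines and the serwb RTM gateware line.
VERSION_RE = re.compile(
    r"INFO\([^)]*\): (software|gateware|RTM gateware) version (\S+)")


def versions_from_log(log_text):
    """Return a dict such as {'software': ..., 'gateware': ..., 'RTM gateware': ...}."""
    return {kind: version for kind, version in VERSION_RE.findall(log_text)}


def versions_consistent(log_text):
    """True if software and gateware versions are both present and all found versions agree."""
    found = versions_from_log(log_text)
    if "software" not in found or "gateware" not in found:
        return False
    return len(set(found.values())) == 1


log = """\
[     0.003889s]  INFO(runtime): software version 4.0.dev+796.g5ca59467
[     0.010157s]  INFO(runtime): gateware version 4.0.dev+838.gfe689ab4.dirty
"""
print(versions_from_log(log))
print(versions_consistent(log))  # prints: False
```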

@enjoy-digital
Contributor

@jbqubit: thanks for testing. You are indeed using old software, serwb should output more information.

@jbqubit
Contributor

jbqubit commented Apr 18, 2018

@sbourdeauducq You're right, I didn't roll back ARTIQ when trying to revert back to my archived build from 20180413. The tests with 838.gfe689ab4.dirty were with matching version of ARTIQ.

@jbqubit
Contributor

jbqubit commented Apr 19, 2018

I built from master last night with SAWG. Here's what I see on my two Sayma AMC+RTM setups, now with matching software and gateware versions.

board 1

 __  __ _ ____         ____ 
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |    
| |  | | |___) | (_) | |___ 
|_|  |_|_|____/ \___/ \____|

MiSoC Bootloader
Copyright (c) 2017-2018 M-Labs Limited

Bootloader CRC passed
Gateware ident 4.0.dev+840.ga4f18770
Initializing SDRAM...
DQS initial delay: 98 taps
Write leveling scan:
Module 3:
000000000000000000000000000000000000000000000000000000000000000000000000100001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100111111111
Module 2:
000000000000000000000000000000000000000000000000000000000000000000000000000000000111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100111111111
Module 1:
000000000000000000000000000000000000000000000000000000001000111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000101010111111111111111111111111
Module 0:
000000000000000000000000000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110101000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100011011111111111111111111111111111
DQS initial delay: 98 taps
Write leveling: 57 57 78 74 done
Read leveling scan:
Module 3:
00000000000000000000000000000000000111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110101010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 2:
00000000000000000000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111010100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 1:
00101011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111010101010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 0:
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110101000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Read leveling: 97+-61 88+-67 68+-60 62+-63 done
SDRAM initialized
Memory test passed

Booting from flash...
Starting firmware.
[     0.000006s]  INFO(runtime): ARTIQ runtime starting...
[     0.003888s]  INFO(runtime): software version 4.0.dev+840.ga4f18770
[     0.010154s]  INFO(runtime): gateware version 4.0.dev+840.ga4f18770
[     0.016456s]  INFO(runtime): log level set to INFO by default
[     0.022155s]  INFO(runtime): UART log level set to INFO by default
[     0.028288s]  INFO(board_artiq::serwb): waiting for AMC/RTM serwb bridge to be ready...
[     2.456337s]  INFO(board_artiq::serwb): done.
[     2.459373s]  INFO(board_artiq::serwb): RTM to AMC Link test
[     3.465020s]  INFO(board_artiq::serwb): 209716 errors
[     3.468768s]  INFO(board_artiq::serwb): AMC to RTM Link test

Running artiq_flash start several times in succession I see that the number of serwb errors is fixed at 209716.

board 2

 __  __ _ ____         ____ 
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |    
| |  | | |___) | (_) | |___ 
|_|  |_|_|____/ \___/ \____|

MiSoC Bootloader
Copyright (c) 2017-2018 M-Labs Limited

Bootloader CRC passed
Gateware ident 4.0.dev+840.ga4f18770
Initializing SDRAM...
DQS initial delay: 115 taps
Write leveling scan:
Module 3:
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011101011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110101101100001010000001000000000000000000000000000000000000000000000000000000000
Module 2:
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010010110111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001100000000000000000000000000000000000000000000000000000000000000
Module 1:
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011111000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 0:
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110111010000000000000000000000000000000000000000000000000000000000000000000000000000
DQS initial delay: 115 taps
Write leveling: 96 91 103 108 done
Read leveling scan:
Module 3:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000010100000100101110111111111110111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011111111111110110101100010000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 2:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000010000010001010001011111110110111101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100111110011111000000000000000100110100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 1:
00000000000000000000000000000000000000000000000000000000000000000000100000000001000000110000100111111111101101101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111101111111111111101110011011110111100000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 0:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011111011011111001111101011000011000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Read leveling: 216+-54 195+-53 177+-52 175+-53 done
SDRAM initialized
Memory test passed

Booting from flash...
Starting firmware.
[     0.000006s]  INFO(runtime): ARTIQ runtime starting...
[     0.003888s]  INFO(runtime): software version 4.0.dev+840.ga4f18770
[     0.010154s]  INFO(runtime): gateware version 4.0.dev+840.ga4f18770
[     0.016456s]  INFO(runtime): log level set to INFO by default
[     0.022155s]  INFO(runtime): UART log level set to INFO by default
[     0.028288s]  INFO(board_artiq::serwb): waiting for AMC/RTM serwb bridge to be ready...
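
As a reading aid for the read leveling lines in the log above: each `center+-halfwidth` pair can be understood as a summary of the passing window in the corresponding scan string. The Python sketch below is only an illustration under assumptions. The function names are hypothetical, it simply takes the longest unbroken run of `1`s, and the actual firmware may treat glitches at the window edges differently, so it is not claimed to reproduce the exact numbers printed above.

```python
def longest_window(scan):
    """Return (start, end), inclusive, of the longest run of '1' in a scan string."""
    best_start, best_len = 0, 0
    run_start = None
    # A trailing "0" sentinel closes a run that reaches the end of the scan.
    for i, bit in enumerate(scan + "0"):
        if bit == "1":
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    if best_len == 0:
        raise ValueError("no passing taps in scan")
    return best_start, best_start + best_len - 1


def summarize(scan):
    """Return (center, half_width) of the longest passing window, in taps."""
    lo, hi = longest_window(scan)
    return (lo + hi) // 2, (hi - lo) // 2


# Toy scan (not taken from the log): a glitch, a 10-tap window, then another glitch.
toy = "0000" + "1" + "0" + "1" * 10 + "0" + "11" + "000"
print(summarize(toy))  # (10, 4)
```

A scan with many isolated `1`s and `0`s around the window, as on the modules above, is the kind of eye that such a simple longest-run rule would handle poorly.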

@hartytp
Collaborator

hartytp commented Apr 19, 2018

Joe, do those bad mem test eyes go away when you unplug the RTM?

@hartytp
Collaborator

hartytp commented Apr 19, 2018

Even though the mem test passes on board 2, the eye scan looks like garbage, so I'd not be surprised if it crashes pretty quickly.

@sbourdeauducq
Member Author

sbourdeauducq commented Apr 20, 2018

No errors on the HKG board

[     0.003888s]  INFO(runtime): software version 4.0.dev+840.ga4f18770
[     0.010154s]  INFO(runtime): gateware version 4.0.dev+840.ga4f18770
[     0.016425s]  INFO(runtime): log level set to INFO by default
[     0.022136s]  INFO(runtime): UART log level set to INFO by default
[     0.028288s]  INFO(board_artiq::serwb): waiting for AMC/RTM serwb bridge to be ready...
[     0.323355s]  INFO(board_artiq::serwb): done.
[     0.326386s]  INFO(board_artiq::serwb): RTM to AMC Link test
[     1.332034s]  INFO(board_artiq::serwb): 0 errors
[     1.335337s]  INFO(board_artiq::serwb): AMC to RTM Link test
[     2.340984s]  INFO(board_artiq::serwb): 0 errors
[     2.344285s]  INFO(board_artiq::serwb): AMC serwb settings:
[     2.349840s]  INFO(board_artiq::serwb):   delay_min_found: 1
[     2.355484s]  INFO(board_artiq::serwb):   delay_min: 32
[     2.360694s]  INFO(board_artiq::serwb):   delay_max_found: 1
[     2.366339s]  INFO(board_artiq::serwb):   delay_max: 215
[     2.371635s]  INFO(board_artiq::serwb):   delay: 123
[     2.376584s]  INFO(board_artiq::serwb):   bitslip: 2
[     2.381534s]  INFO(board_artiq::serwb):   ready: 1
[     2.386310s]  INFO(board_artiq::serwb):   error: 0
[     2.391085s]  INFO(board_artiq::serwb): RTM serwb settings:
[     2.396644s]  INFO(board_artiq::serwb):   delay_min_found: 1
[     2.402288s]  INFO(board_artiq::serwb):   delay_min: 2
[     2.407410s]  INFO(board_artiq::serwb):   delay_max_found: 1
[     2.413054s]  INFO(board_artiq::serwb):   delay_max: 12
[     2.418264s]  INFO(board_artiq::serwb):   delay: 7
[     2.423040s]  INFO(board_artiq::serwb):   bitslip: 16
[     2.428076s]  INFO(board_artiq::serwb):   ready: 1
[     2.432852s]  INFO(board_artiq::serwb):   error: 0
[     2.437669s]  INFO(board_artiq::serwb): RTM gateware version 4.0.dev+840.ga4f18770

Note: this is with Vivado 2018.1.
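
For readers of the log above: the reported `delay` values are consistent with the midpoint of the valid delay window, i.e. halfway between `delay_min` and `delay_max`, rounded down. The Python sketch below only checks that arithmetic against the numbers in the log. The function name is hypothetical, and the midpoint rule is an inference from the printed values, not something confirmed from the serwb gateware or firmware source.

```python
def window_center(delay_min, delay_max):
    """Midpoint of a valid delay window, rounded down (assumed rule)."""
    if delay_max < delay_min:
        raise ValueError("delay_max must not be smaller than delay_min")
    return (delay_min + delay_max) // 2


# Values copied from the AMC and RTM serwb settings in the log above.
amc_delay = window_center(32, 215)
rtm_delay = window_center(2, 12)
print(amc_delay, rtm_delay)  # 123 7
```

Both results match the `delay: 123` (AMC) and `delay: 7` (RTM) lines in the log.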

@enjoy-digital
Contributor

@jbqubit, @sbourdeauducq: thanks for testing.

For information, I still have issues when enabling scrambling on the HKG board (which is not the case with my Artix-7 test design). I'll investigate to understand this and see whether it could be related to @jbqubit's issue. If we cannot reproduce the issue on the HKG board, I'll need access to a board that is not working.

@jbqubit
Contributor

jbqubit commented Apr 20, 2018

Joe, do those bad mem test eyes go away when you unplug the RTM?

I sent 2 AMC and 3 RTM back to WUT yesterday so I can’t do this test. They should arrive on Monday.

@enjoy-digital
Contributor

Clocking has been simplified. If you still have issues with serwb, please open a specific issue for each.
