the kc705 board memory bar may be broken #968

Closed
mingshenli opened this issue Mar 22, 2018 · 13 comments

Comments

@mingshenli

The KC705 board does not work, and we believe the memory bar is broken. However, we swapped in a new 2 GB memory bar and the board still cannot connect. @sbourdeauducq

The upper photo shows the new memory bar and the lower photo shows the original one.
[image: 477780476110960315]

[image: 538664690862815004]

@KaifengC
Contributor

KaifengC commented Apr 19, 2018

We have the same problem with some of our KC705 boards.

At first we guessed it was the same issue as #525, but even after following that discussion we could not solve it.

I was wondering whether it is a memory bar problem. Did your board work well before it broke?

@sbourdeauducq
Member

@KaifengC Can you post the full log?
Also please try with ARTIQ-4 which has improved memory support (plus prints more detailed logs).

@KaifengC
Contributor

KaifengC commented Apr 19, 2018

There is nothing new. The gateware is ARTIQ 2.x.
Rarely (<5% of restarts) the board gets through the memory initialization and works well.
But even in these cases, the board can stop responding at any time during operation.

MiSoC BIOS
(c) Copyright 2007-2016 M-Labs Limited
Built Oct  2 2017 13:03:47

BIOS CRC passed (5c1c54fe)
Initializing SDRAM...
Write leveling: 15* 17* 13  15*  9   8   5   7  completed
Read bitslip: 7 6 5 4 3 2
Read delays: 7:03-13  6:02-14  5:05-15  4:06-16  3:12-22  2:11-21  1:00-11  0:01-11  completed
Memtest failed: 29181/532736 words incorrect
Memory initialization failed
BIOS>

As for upgrading to ARTIQ-4, we are running into another problem: #984.

@mingshenli
Author

We sent the board back to Xilinx, but they said that the memory bar is OK. We are still trying to find the problem.

@KaifengC
Contributor

I tried it with ARTIQ 4.0.dev and got more information over the serial port:

 __  __ _ ____         ____                                                     
|  \/  (_) ___|  ___  / ___|                                                    
| |\/| | \___ \ / _ \| |                                                        
| |  | | |___) | (_) | |___                                                     
|_|  |_|_|____/ \___/ \____|                                                    
                                                                                
MiSoC Bootloader                                                                
Copyright (c) 2017-2018 M-Labs Limited                                          
                                                                                
Bootloader CRC passed                                                           
Gateware ident 4.0.dev+820.gbb90fb7d                                            
Initializing SDRAM...                                                           
Write leveling scan:                                                            
Module 7:                                                                       
00000001111111111111000000000000                                                
Module 6:                                                                       
00000111111111111110000000000000                                                
Module 5:                                                                       
00000000111111111111110000110000                                                
Module 4:                                                                       
00000000011111111111100000110000                                                
Module 3:                                                                       
10000000000000011111111111111111                                                
Module 2:                                                                       
00000000000001111111111111111111                                                
Module 1:                                                                       
11110000000000000111111111111111                                                
Module 0:                                                                       
11000000000000011111111111111111                                                
Write leveling: 15* 17* 13 15* 9 8 5 7 done                                     
Read bitslip: 7 6 5 4 3 2                                                       
Read leveling scan:                                                             
Module 7:                                                                       
00111111111110000000000000000000                                                
Module 6:                                                                       
00111111111111000000000000000000                                                
Module 5:                                                                       
00000111111111100000000000000000                                                
Module 4:                                                                       
00000111111111111000000000000000                                                
Module 3:                                                                       
00000000000111111111110000000000                                                
Module 2:                                                                       
00000000000111111111000000000000                                                
Module 1:                                                                       
11111111110000000000000000000001                                                
Module 0:                                                                       
01111111110000000000000000000000                                                
Read leveling: 6+-5 7+-5 9+-5 10+-5 16+-5 15+-4 4+-5 5+-4 done                  
SDRAM initialized                                                               
Memory test failed (2075780/4458496 words incorrect)                            
Halting. 

I will install a well-tested memory bar on it and try again.
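
To put the two memtest failures posted so far side by side, here is a minimal Python sketch that pulls the word counts out of each failure line and turns them into an error fraction. The regular expression and the function name `memtest_error_rate` are illustrative choices made for this sketch, not part of MiSoC or ARTIQ.

```python
import re

# Matches both failure formats that appear in the logs above:
#   "Memtest failed: 29181/532736 words incorrect"
#   "Memory test failed (2075780/4458496 words incorrect)"
MEMTEST_RE = re.compile(r"(\d+)/(\d+) words incorrect")


def memtest_error_rate(line):
    """Return (bad, total, fraction) for a memtest failure line, else None."""
    match = MEMTEST_RE.search(line)
    if match is None:
        return None
    bad, total = int(match.group(1)), int(match.group(2))
    return bad, total, bad / total


if __name__ == "__main__":
    lines = [
        "Memtest failed: 29181/532736 words incorrect",
        "Memory test failed (2075780/4458496 words incorrect)",
    ]
    for line in lines:
        bad, total, fraction = memtest_error_rate(line)
        print(f"{bad}/{total} = {fraction:.1%}")
```

The two runs tested different numbers of words with different firmware versions, so the fractions (roughly 5% and 47%) are not directly comparable.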

@sbourdeauducq
Member

The write leveling scans look unusual (but do not indicate broken hardware). Maybe the algo does not handle those corner cases correctly.
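
One way to inspect the write leveling scans from the log is to list every tap where a scan goes from 0 to 1. The Python sketch below does only that; it is not the ARTIQ calibration algorithm, and which features of the scans were considered unusual is not stated in the thread. The helper name `rising_edges` is made up for this sketch.

```python
# Write leveling scans copied from the ARTIQ 4.0.dev log above,
# keyed by the module number printed in the log.
WRITE_SCANS = {
    7: "00000001111111111111000000000000",
    6: "00000111111111111110000000000000",
    5: "00000000111111111111110000110000",
    4: "00000000011111111111100000110000",
    3: "10000000000000011111111111111111",
    2: "00000000000001111111111111111111",
    1: "11110000000000000111111111111111",
    0: "11000000000000011111111111111111",
}


def rising_edges(scan):
    """Return every tap index where the scan goes from '0' to '1'."""
    return [
        i for i in range(1, len(scan))
        if scan[i - 1] == "0" and scan[i] == "1"
    ]


if __name__ == "__main__":
    for module in sorted(WRITE_SCANS):
        scan = WRITE_SCANS[module]
        edges = rising_edges(scan)
        notes = []
        if scan[0] == "1":
            notes.append("starts high")
        if len(edges) > 1:
            notes.append("more than one rising edge")
        print(f"module {module}: edges at {edges} {', '.join(notes)}")
```

Read in module order 0 to 7, the first rising edges are 15, 17, 13, 15, 9, 8, 5, 7, the same numbers as on the "Write leveling:" line, and the three starred values line up with the three scans that start high. Modules 5 and 4 also show a second, two-tap-wide blip late in the scan. These correspondences are inferred from the numbers alone.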

@KaifengC
Contributor

Yes, it's the memory bar's problem.

I swapped the memory bar of this board with one taken from a well-working board.
The failing board then worked, and the "well-working board" can no longer pass the memory test.

It is strange, because both boards are almost new. I can't see any difference between the two memory bars by eye, except for the serial number.

@KaifengC
Contributor

By the way, it seems the artiq_flash command has changed in ARTIQ 4.0.dev.
So how do I set the IP/MAC address now?
The -m and proxy options no longer work.

@sbourdeauducq
Member

-m is renamed -V (see release notes) and proxy is automatic (you don't need to specify it manually anymore).

@sbourdeauducq
Member

Yes, it's the memory bar's problem.

DDR memory systems have a lot of board-to-board variation, which is why we have this calibration algorithm that runs at board startup. I suspect that the non-working memory module can be made to work by debugging and improving the algorithm:
https://github.com/m-labs/artiq/blob/master/artiq/firmware/libboard/sdram.rs
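
As a rough illustration of what such a calibration has to do, the Python sketch below takes the read leveling scans posted earlier in the thread, finds the longest run of passing taps for each module, and reports its center and margin. This is a simplified stand-in written for this thread, not the code in sdram.rs; `longest_window` and `center_and_margin` are hypothetical helper names.

```python
# Read leveling scans copied from the ARTIQ 4.0.dev log earlier in the thread.
READ_SCANS = {
    7: "00111111111110000000000000000000",
    6: "00111111111111000000000000000000",
    5: "00000111111111100000000000000000",
    4: "00000111111111111000000000000000",
    3: "00000000000111111111110000000000",
    2: "00000000000111111111000000000000",
    1: "11111111110000000000000000000001",
    0: "01111111110000000000000000000000",
}


def longest_window(scan):
    """Return (start, end), inclusive, of the longest run of '1', or None."""
    best = None
    start = None
    # The trailing "0" is a sentinel that closes a run reaching the last tap.
    for i, bit in enumerate(scan + "0"):
        if bit == "1":
            if start is None:
                start = i
        elif start is not None:
            end = i - 1
            if best is None or end - start > best[1] - best[0]:
                best = (start, end)
            start = None
    return best


def center_and_margin(scan):
    """Return (center tap, margin in taps) of the longest window, or None."""
    window = longest_window(scan)
    if window is None:
        return None
    start, end = window
    return (start + end) // 2, (end - start) // 2


if __name__ == "__main__":
    for module in sorted(READ_SCANS, reverse=True):
        center, margin = center_and_margin(READ_SCANS[module])
        print(f"module {module}: {center}+-{margin}")
```

For modules 3 and 2 this gives 16+-5 and 15+-4, the same as the "Read leveling:" line in the log; some other modules come out about one tap away from the logged values, so the sketch should not be read as a reproduction of the firmware's procedure.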

@KaifengC
Contributor

KaifengC commented Apr 26, 2018

Okay, shall I send you one of these non-working memory bars?

@gkasprow
Collaborator

Did you try to run xilinx reference design on this board?

@KaifengC
Contributor

Did you try to run xilinx reference design on this board?

Yes, an engineer from Xilinx came to my lab and tested the board. He told me that the board was completely normal.
