SDRAM initialization fails on some OrangeCrabs when LDM/UDM are used independently #174

Closed
Disasm opened this issue Dec 22, 2020 · 29 comments

Comments

@Disasm

Disasm commented Dec 22, 2020

I'm not sure where this bug belongs, but I found that a pre-built orangecrab image from #164 can't initialize SDRAM:

Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Read leveling:
  m0, b0: |11100000| delays: 01+-01
  m0, b1: |00000000| delays: -
  m0, b2: |00000000| delays: -
  m0, b3: |00000000| delays: -
  best: m0, b00 delays: 01+-01
  m1, b0: |11100000| delays: 02+-01
  m1, b1: |00000000| delays: -
  m1, b2: |00000000| delays: 00+-00
  m1, b3: |00000000| delays: -
  best: m1, b00 delays: 01+-01
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2MiB)...
  Write: 0x40000000-0x40200000 2MiB     
   Read: 0x40000000-0x40200000 2MiB     
  bus errors:  2/256
  addr errors: 32/8192
  data errors: 524279/524288
Memtest KO
Memory initialization failed

Same thing happens with a bitstream built with ./make.py --board=orangecrab --cpu-count=1 --build.
I tried with two different OrangeCrab boards, the outcome is the same. The orangecrab target from litex-boards works without problems.
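To compare logs like the one above across boards and bitstreams, it can help to turn them into structured data. The following is a minimal sketch, not part of LiteX: `parse_bios_log` and both regular expressions are hypothetical helpers written against the log format shown in this issue, and it assumes that `m` is the module index and `b` the bitslip index.

```python
import re

# Matches lines such as "  m0, b0: |11100000| delays: 01+-01".
# The delays field is "-" when no valid window was reported.
# Assumption: "m" is the module index and "b" the bitslip index.
LEVELING_RE = re.compile(
    r"m(?P<module>\d+), b(?P<bitslip>\d+): \|(?P<window>[01]+)\| delays: (?P<delays>\S+)"
)
# Matches lines such as "  data errors: 524279/524288".
ERRORS_RE = re.compile(r"(?P<kind>bus|addr|data) errors:\s*(?P<bad>\d+)/(?P<total>\d+)")


def parse_bios_log(text):
    """Return (leveling_rows, error_counts) extracted from a BIOS log."""
    leveling, errors = [], {}
    for line in text.splitlines():
        m = LEVELING_RE.search(line)
        if m:
            leveling.append({
                "module": int(m["module"]),
                "bitslip": int(m["bitslip"]),
                "window": m["window"],
                # None means the BIOS printed "-" (no usable delay found).
                "delays": None if m["delays"] == "-" else m["delays"],
            })
            continue
        m = ERRORS_RE.search(line)
        if m:
            errors[m["kind"]] = (int(m["bad"]), int(m["total"]))
    return leveling, errors
```

Feeding the failing log into `parse_bios_log` gives one row per `m/b` line plus a dictionary of `(errors, total)` pairs, which makes it easy to diff two runs or to compute an error ratio per category.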

@Disasm
Author

Disasm commented Dec 22, 2020

May be the same root cause as in #154

I tried changing parameters in the orangecrab target based on vexriscv (not -smp) and found that this behavior is triggered when I set l2_size to zero. However, changing it to 2048 in linux-on-litex-vexriscv doesn't help.

@enjoy-digital
Member

enjoy-digital commented Dec 22, 2020

@Disasm: thanks for the feedback. The bitstream from #164 has been tested on hardware. Can you also test this one that I just re-generated and tested? orange_crab_2020_12_22.zip For now I just want to see if there are variations between boards.

@Disasm
Author

Disasm commented Dec 22, 2020

Same error on both boards, but output is slightly different:

Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Read leveling:
  m0, b0: |11100000| delays: 01+-01
  m0, b1: |00000000| delays: -
  m0, b2: |00000000| delays: -
  m0, b3: |00000000| delays: -
  best: m0, b00 delays: 01+-01
  m1, b0: |11100000| delays: 02+-01
  m1, b1: |00000000| delays: -
  m1, b2: |00000000| delays: 00+-00
  m1, b3: |00000000| delays: -
  best: m1, b00 delays: 01+-01
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2MiB)...
  Write: 0x40000000-0x40200000 2MiB     
   Read: 0x40000000-0x40200000 2MiB     
  bus errors:  2/256
  addr errors: 32/8192
  data errors: 524279/524288
Memtest KO
Memory initialization failed
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Read leveling:
  m0, b0: |11100000| delays: 02+-01
  m0, b1: |00000000| delays: -
  m0, b2: |00000000| delays: 00+-00
  m0, b3: |00000000| delays: -
  best: m0, b00 delays: 01+-01
  m1, b0: |11100000| delays: 02+-01
  m1, b1: |00000000| delays: -
  m1, b2: |00000000| delays: 00+-00
  m1, b3: |00000000| delays: -
  best: m1, b00 delays: 01+-01
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2MiB)...
  Write: 0x40000000-0x40200000 2MiB     
   Read: 0x40000000-0x40200000 2MiB     
  bus errors:  2/256
  addr errors: 32/8192
  data errors: 524279/524288
Memtest KO
Memory initialization failed

@enjoy-digital
Member

@Disasm: I'm not able to reproduce the issue. Can you provide more information:

  • Was it working before? If yes can you provide more info on the version you tested? (git commits?)
  • Are you using a regular OrangeCrab from the groupgets campaign? (R0.2 + 25F + MT41K64M16)? If not can you describe your OrangeCrab config?

@Disasm
Author

Disasm commented Jan 4, 2021

I haven't tried older versions of linux-on-litex-vexriscv. I tested with commit 2758b5d (plus litex d90d3e043bc81b434d6424c8585c757d77853d2c, litedram 103072c68a2e3ec9c81f198e50e5427e5780580c), even tried building in a clean environment. All my attempts to use linux-on-litex-vexriscv failed in the same way. At the same time, the target from litex-boards works (unless I set l2_size to 0).

It's a regular OrangeCrab from the GroupGets campaign, with the parts you mentioned.

@wtfuzz
Contributor

wtfuzz commented Jan 5, 2021

I am also getting memory failures on my OrangeCrab r0.2. This is built from all latest Git masters (I perform a pristine build on a local Gitlab CI/CD setup - trellis included).

Pre built images are failing as well. I have not used LiteX on this board before, so I don't have any data if a previous revision worked or not.

The code on the BGA is D9SFT which Micron says is a MT41K64M16TW-107 https://www.micron.com/products/dram/ddr3-sdram/part-catalog/mt41k64m16tw-107


        __   _ __      _  __
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
   Build your hardware, easily!

 (c) Copyright 2012-2020 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on Jan  4 2021 19:58:31
 BIOS CRC passed (575ec431)

 Migen git sha1: d42aa6f
 LiteX git sha1: 16008d3f

--=============== SoC ==================--
CPU:		VexRiscv SMP-LINUX @ 64MHz
BUS:		WISHBONE 32-bit @ 4GiB
CSR:		32-bit data
ROM:		40KiB
SRAM:		8KiB
L2:		0KiB
SDRAM:		131072KiB 16-bit @ 256MT/s (CL-6 CWL-5)

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Read leveling:
  m0, b0: |11100000| delays: 01+-01
  m0, b1: |00000000| delays: -
  m0, b2: |00000000| delays: -
  m0, b3: |00000000| delays: -
  best: m0, b00 delays: 01+-01
  m1, b0: |11100000| delays: 02+-01
  m1, b1: |00000000| delays: -
  m1, b2: |00000000| delays: 00+-00
  m1, b3: |00000000| delays: -
  best: m1, b00 delays: 01+-01
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2MiB)...
  Write: 0x40000000-0x40200000 2MiB     
   Read: 0x40000000-0x40200000 2MiB     
  bus errors:  2/256
  addr errors: 32/8192
  data errors: 524279/524288
Memtest KO
Memory initialization failed

--============= Console ================--

litex>

@enjoy-digital
Member

Thanks @wtfuzz for the feedback. It seems the issue was already present before but hidden by the L2 cache that we no longer use here with VexRiscv-SMP. I've already fixed DM on the OrangeCrab that was inverted on the hardware (LDM/UDM) but there is probably something else. I've been able to reproduce the issue on a variant with a MT41K512M16 memory and will investigate.

@enjoy-digital enjoy-digital changed the title SDRAM initialization fails on OrangeCrab SDRAM initialization fails on some OrangeCrabs Jan 12, 2021
@enjoy-digital
Member

Here are some results from tests I just ran on the 3 OrangeCrabs I have (2 MT41K64M16 variants from the campaign and 1 MT41K512M16). The 2 MT41K64M16 boards always pass memtest here, the MT41K512M16 never does. The MT41K512M16 seems to pass memtest as soon as the DMs are not used (which is the case with the L2 cache, which always transfers 128 bits to the DRAM with the OrangeCrab config). When disabling the L2 cache or using VexRiscv-SMP (which is directly attached to the DRAM and uses DMs), this board always fails. So the issue was already present before but masked by the L2 cache, which is no longer used with VexRiscv-SMP. I've not been able to reproduce this issue on other ECP5 boards so far (tested on Trellisboard and ECPIX5).

On boards that are not working, we don't get complete garbage data, only misplaced data:

image

This will need to be investigated further.

@Disasm
Author

Disasm commented Jan 12, 2021

Hmm, on my board read results seem to be affected by the previous reads:
Screenshot at 2021-01-12 23-06-16
Also write to 0x4000000c writes to the next address:
Screenshot at 2021-01-12 22-43-59
Maybe both are caused by the same effect.

I found this on the board schematics:
Screenshot at 2021-01-12 23-27-25
I'm not sure if it's a mistake or a clever hack, but I can't "fix" it by swapping two pins in the board definition: I get an error I don't understand. Maybe @gregdavill can say something about this. UPD: this was intentional.

@enjoy-digital
Member

@Disasm: the LDQS/UDQS swap is explained here: orangecrab-fpga/orangecrab-hardware#24, but UDM/LDM should also have been swapped, which is not the case. I've already been able to add a workaround for the UDM/LDM swap in the gateware with enjoy-digital/litedram@33f3aa5 and litex-hub/litex-boards@00fc2c5, but this is just a workaround since UDM is still coupled to the LDQS/LDAT timings in the FPGA, and likewise LDM to UDQS/UDAT. This may explain why it works on some boards and not on others. We are currently not able to swap UDM/LDM directly because the ODDRX2DQA used to drive DM is linked to the byte group through i_DQSW270.
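To illustrate why a UDM/LDM swap only shows up when the data masks are used independently, here is a toy model in Python. It is a sketch under stated assumptions, not the LiteDRAM PHY: `write16` and `store_byte` are hypothetical names, the memory is a plain dictionary of 16-bit words, and only the logical swap is modeled (the timing coupling between DM and the DQS groups described above is not).

```python
def write16(mem, addr, data, ldm, udm, dm_swapped=False):
    """Write a 16-bit word with per-byte data masks (DM high = byte not written).

    dm_swapped models a board where the controller's LDM reaches the DRAM's
    UDM pin and vice versa, while the data lanes themselves are correct.
    """
    if dm_swapped:
        ldm, udm = udm, ldm
    old = mem.get(addr, 0)
    low = old & 0x00FF if ldm else data & 0x00FF
    high = old & 0xFF00 if udm else data & 0xFF00
    mem[addr] = high | low


def store_byte(mem, byte_addr, value, dm_swapped=False):
    """CPU-style byte store: exactly one byte lane is left unmasked."""
    word_addr, lane = divmod(byte_addr, 2)
    data = (value & 0xFF) << (8 * lane)
    write16(mem, word_addr, data,
            ldm=(lane != 0), udm=(lane != 1), dm_swapped=dm_swapped)
```

In this model a full-width write (both masks low) lands correctly whether or not the masks are swapped, which matches the observation that an L2 cache doing only full transfers hides the problem. A byte store with swapped masks leaves the target byte untouched and overwrites its neighbour instead.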

@enjoy-digital enjoy-digital changed the title SDRAM initialization fails on some OrangeCrabs SDRAM initialization fails on some OrangeCrabs when LDM/UDM are used independently Jan 13, 2021
@gregdavill
Contributor

Huh, maybe I'll have to dig into this a bit more on my end if this is related to the OC hardware.

When I swapped the UDQS/LDQS I had thought I also swapped UDM/DQ + LDM/DQ... This was mostly to assist with routing/layout.

But then all the pin swapping was done in the LiteX board platform file, so I wouldn't have thought anything needed to change in LiteDRAM.

@enjoy-digital
Member

@gregdavill: I've just been able to reproduce the issue on a Versa ECP5, so this is more likely related to the DM handling in the ECP5 LiteDRAM PHY than to the OC hardware (possibly the DM swap does not help here, but it does not seem to be the root cause). I'll try to investigate more (and it's easier on the 45F of the Versa, which has more room for LiteScope), but I have to work on other priorities for now.

@enjoy-digital
Member

@Disasm, @wtfuzz; would you mind testing this bitstream on your OrangeCrab?: orangecrab_2020_01_20_test.zip (and reset the board multiple times with the button of the OrangeCrab to verify it's calibrating correctly each time?) Thanks.

@wtfuzz
Contributor

wtfuzz commented Jan 20, 2021

@enjoy-digital no joy here.

(fpga_env) fuzz@ryzen:~/oc/test$ lxterm --speed 1e6 /dev/ttyACM1

        __   _ __      _  __
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
   Build your hardware, easily!

 (c) Copyright 2012-2020 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on Jan 20 2021 19:46:25
 BIOS CRC passed (b373203c)

 Migen git sha1: 40b1092
 LiteX git sha1: 57289dd4

--=============== SoC ==================--
CPU:		VexRiscv SMP-LINUX @ 64MHz
BUS:		WISHBONE 32-bit @ 4GiB
CSR:		32-bit data
ROM:		40KiB
SRAM:		8KiB
L2:		0KiB
SDRAM:		131072KiB 16-bit @ 256MT/s (CL-6 CWL-5)

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Read leveling:
  m0, b0: |00111000| delays: 03+-01
  m0, b1: |00000000| delays: -
  m0, b2: |00000000| delays: -
  m0, b3: |00000000| delays: -
  best: m0, b00 delays: 03+-01
  m1, b0: |00111000| delays: 03+-01
  m1, b1: |00000000| delays: -
  m1, b2: |00000000| delays: -
  m1, b3: |00000000| delays: -
  best: m1, b00 delays: 03+-01
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2MiB)...
  Write: 0x40000000-0x40200000 2MiB     
   Read: 0x40000000-0x40200000 2MiB     
  bus errors:  2/256
  addr errors: 32/8192
  data errors: 524279/524288
Memtest KO
Memory initialization failed

--============= Console ================--

@enjoy-digital
Member

Thanks for the test @wtfuzz, even if it's not working, it's useful to collect results. I think I'm starting to understand the issue and will probably have another bitstream to test soon.

@Disasm
Author

Disasm commented Jan 21, 2021

        __  __  __
        __   _ __      _  __
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
   Build your hardware, easily!

 (c) Copyright 2012-2020 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on Jan 20 2021 19:46:25
 BIOS CRC passed (b373203c)

 Migen git sha1: 40b1092
 LiteX git sha1: 57289dd4

--=============== SoC ==================--
CPU:		VexRiscv SMP-LINUX @ 64MHz
BUS:		WISHBONE 32-bit @ 4GiB
CSR:		32-bit data
ROM:		40KiB
SRAM:		8KiB
L2:		0KiB
SDRAM:		131072KiB 16-bit @ 256MT/s (CL-6 CWL-5)

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Read leveling:
  m0, b0: |00111000| delays: 03+-01
  m0, b1: |00000000| delays: -
  m0, b2: |00000000| delays: -
  m0, b3: |00000000| delays: -
  best: m0, b00 delays: 03+-01
  m1, b0: |00111000| delays: 03+-01
  m1, b1: |00000000| delays: -
  m1, b2: |00000000| delays: -
  m1, b3: |00000000| delays: -
  best: m1, b00 delays: 03+-01
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2MiB)...
  Write: 0x40000000-0x40200000 2MiB     
   Read: 0x40000000-0x40200000 2MiB     
  bus errors:  2/256
  addr errors: 32/8192
  data errors: 524279/524288
Memtest KO
Memory initialization failed

The output is consistent across different runs and two OrangeCrab boards.

@enjoy-digital
Member

Thanks @Disasm, while you are testing this, would you mind also testing this one?: orangecrab_2021_01_21.zip

@gregdavill
Contributor

I just gave this a spin. Same error.

        __   _ __      _  __
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
   Build your hardware, easily!

 (c) Copyright 2012-2020 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on Jan 21 2021 10:00:51
 BIOS CRC passed (e544b8bd)

 Migen git sha1: 40b1092
 LiteX git sha1: 57289dd4

--=============== SoC ==================--
CPU:            VexRiscv SMP-LINUX @ 64MHz
BUS:            WISHBONE 32-bit @ 4GiB
CSR:            32-bit data
ROM:            40KiB
SRAM:           8KiB
L2:             0KiB
SDRAM:          131072KiB 16-bit @ 256MT/s (CL-6 CWL-5)

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Read leveling:
  m0, b0: |00111000| delays: 04+-01
  m0, b1: |00000000| delays: -
  m0, b2: |00000000| delays: -
  m0, b3: |00000000| delays: -
  best: m0, b00 delays: 04+-01
  m1, b0: |00111000| delays: 04+-01
  m1, b1: |00000000| delays: -
  m1, b2: |00000000| delays: -
  m1, b3: |00000000| delays: -
  best: m1, b00 delays: 04+-01
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2MiB)...
  Write: 0x40000000-0x40200000 2MiB     
   Read: 0x40000000-0x40200000 2MiB     
  bus errors:  2/256
  addr errors: 32/8192
  data errors: 524279/524288
Memtest KO
Memory initialization failed

@enjoy-digital
Member

enjoy-digital commented Jan 21, 2021

@gregdavill: thanks, would you mind trying with CL=7?:

./ddr3_mr_gen.py --cl=7
DDR3 Timing Settings:
cl:  7
cwl: 5
DDR3 Electrical Settings:
rtt_nom: 60ohm
rtt_wr:  60ohm
ron:     34ohm
Commands to be used with LiteX BIOS:
sdram_mr_write 0 2608
sdram_mr_write 1 2054
sdram_mr_write 2 512

Just copy the sdram_mr_write commands into the BIOS console, then run sdram_cal and sdram_test.
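As a sanity check on the decimal values above, the latency fields can be decoded back out of the mode-register words. This is a minimal sketch that assumes the standard JEDEC DDR3 bit layout (CL in MR0 bits A6:A4 plus A2, CWL in MR2 bits A5:A3, Rtt_WR in MR2 bits A10:A9, RZQ = 240 ohm); `decode_cl` and `decode_mr2` are hypothetical helpers and are not part of `ddr3_mr_gen.py` or LiteX.

```python
# MR0 CAS latency field, assembled as (A6, A5, A4, A2).
# Assumption: standard JEDEC DDR3 encoding; only CL 5..11 are listed here.
CL_CODES = {
    0b0010: 5, 0b0100: 6, 0b0110: 7, 0b1000: 8,
    0b1010: 9, 0b1100: 10, 0b1110: 11,
}
# MR2 dynamic ODT field (A10:A9) in ohms; None means dynamic ODT off.
RTT_WR_OHMS = {0b00: None, 0b01: 60, 0b10: 120}


def decode_cl(mr0):
    """Return the CAS latency encoded in a DDR3 MR0 value."""
    code = (((mr0 >> 4) & 0b111) << 1) | ((mr0 >> 2) & 1)
    return CL_CODES[code]


def decode_mr2(mr2):
    """Return (CWL, Rtt_WR in ohms) encoded in a DDR3 MR2 value."""
    cwl = 5 + ((mr2 >> 3) & 0b111)
    rtt_wr = RTT_WR_OHMS[(mr2 >> 9) & 0b11]
    return cwl, rtt_wr
```

With the values from this comment, `decode_cl(2608)` gives 7 and `decode_mr2(512)` gives `(5, 60)`, in line with the `cl`, `cwl` and `rtt_wr` settings printed by the script.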

@gregdavill
Contributor

Looks like it's applied the CL value, because cal shows a different position.
But test still fails :/

litex> sdram_mr_write 0 2608

Switching SDRAM to software control.
Writing 0x0a30 to MR0
Switching SDRAM to hardware control.

litex> sdram_mr_write 1 2054

Switching SDRAM to software control.
Writing 0x0806 to MR1
Switching SDRAM to hardware control.

litex> sdram_mr_write 2 512

Switching SDRAM to software control.
Writing 0x0200 to MR2
Switching SDRAM to hardware control.

litex> sdram_cal

Switching SDRAM to software control.
Read leveling:
  m0, b0: |00000011| delays: 08+-01
  m0, b1: |00000000| delays: -
  m0, b2: |00000000| delays: -
  m0, b3: |00000000| delays: -
  best: m0, b00 delays: 08+-01
  m1, b0: |00000011| delays: 08+-01
  m1, b1: |00000000| delays: -
  m1, b2: |00000000| delays: 02+-00
  m1, b3: |00000000| delays: -
  best: m1, b00 delays: 08+-01
Switching SDRAM to hardware control.

litex> sdram_test

Memtest at 0x40000000 (4MiB)...
  Write: 0x40000000-0x40400000 4MiB     
   Read: 0x40000000-0x40400000 4MiB     
  bus errors:  9/256
  addr errors: 1419/8192
  data errors: 933264/1048576
Memtest KO

@Disasm
Author

Disasm commented Jan 21, 2021

> Thanks @Disasm, while you are testing this, would you mind also testing this one?: orangecrab_2021_01_21.zip

I got the same output as the one reported by @gregdavill.

@enjoy-digital
Member

The use of DMs is also causing issues on the Colorlight I5 (https://github.com/kazkojima/colorlight-i5-tips#sdram-issue-on-linux-on-litex-vexriscv-head), because there the DMs are connected to ground. So even if the issue is fixed on the OrangeCrab, we'll have to support a mode where we always do non-masked accesses to the DRAM (as before, with VexRiscv interfaced through Wishbone and L2 to LiteDRAM).

@enjoy-digital
Member

Since DMs cannot be used on all boards, we added a mode with @Dolu1990 to do the DRAM accesses through the peripheral bus/L2 cache for boards that require it (this is the behavior that was previously in place for VexRiscv Linux non-SMP). Here is the bitstream for the OrangeCrab re-generated with this: orangecrab_2021_01_24.zip @Disasm, @gregdavill, @wtfuzz would you mind testing it? If it also works on your boards, I'll merge the changes and enable this for the OrangeCrab.
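The idea behind routing accesses through a cache can be sketched as a read-modify-write: a narrow store is merged into a full line, and only whole lines are ever written to the DRAM, so the data masks are never needed. The snippet below is an illustrative model only, not the VexRiscv or LiteX implementation. `FullLineMemory` and `store_byte` are hypothetical names, and the 16-byte line size is an assumption based on the 128-bit transfers mentioned earlier in this thread.

```python
LINE_BYTES = 16  # assumed 128-bit line, as in the OrangeCrab L2 transfers


class FullLineMemory:
    """Backing store that only accepts whole-line writes (no data masks)."""

    def __init__(self):
        self.lines = {}
        self.writes = []  # (line_index, length) of every write, for inspection

    def read_line(self, index):
        # Unwritten lines read back as zeros in this model.
        return bytearray(self.lines.get(index, bytes(LINE_BYTES)))

    def write_line(self, index, data):
        if len(data) != LINE_BYTES:
            raise ValueError("partial writes are not supported")
        self.lines[index] = bytes(data)
        self.writes.append((index, len(data)))


def store_byte(mem, addr, value):
    """Byte store implemented as read-modify-write of the enclosing line."""
    index, offset = divmod(addr, LINE_BYTES)
    line = mem.read_line(index)
    line[offset] = value & 0xFF
    mem.write_line(index, line)
```

Every entry recorded in `mem.writes` has length `LINE_BYTES`, so a memory whose DM pins are tied to ground or mis-timed would still see only unmasked accesses. The cost is an extra line read for each narrow store that misses the cache.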

@Disasm
Author

Disasm commented Jan 24, 2021

@enjoy-digital Now all DRAM tests passed. One small issue: 0x0x40000000 in the output.

@wtfuzz
Contributor

wtfuzz commented Jan 24, 2021

Tests are passing here.

Memtest at 0x0x40000000 (2MiB)...
  Write: 0x40000000-0x40200000 2MiB     
   Read: 0x40000000-0x40200000 2MiB     
Memtest OK
Memspeed at 0x0x40000000 (2MiB)...
  Write speed: 16MiB/s
   Read speed: 12MiB/s

@enjoy-digital
Member

Thanks @Disasm, @wtfuzz. The double 0x0x was a typo in enjoy-digital/litex#776 that is fixed by enjoy-digital/litex@1a38d51.

@gregdavill
Contributor

Works fine here too! Thanks @enjoy-digital & @Dolu1990

@enjoy-digital
Member

The changes have been merged:

And the OrangeCrab bitstream has been updated in #164.

A git submodule update --init --recursive will be required in pythondata-cpu-vexriscv_smp to fully update it. Thanks for the feedback/tests in this issue.

@Disasm
Author

Disasm commented Jan 25, 2021

Thank you!

arturkow2 added a commit to Dasharo/TwPM_toplevel that referenced this issue Nov 17, 2023
DM swap is required for DRAM to work properly on OrangeCrab due to
swapped LDM/UDM lines on the board.

See litex-hub/linux-on-litex-vexriscv#174 (comment)