Can't boot linux with MAIN_RAM sizes above 512mb #1922

Open

JoyBed opened this issue Apr 10, 2024 · 71 comments

Comments

@JoyBed
Contributor

JoyBed commented Apr 10, 2024

I stumbled upon an interesting bug. Linux is unable to boot if the memory size is above 512 MB. In the LiteX BIOS the whole RAM passes the tests and works, but when Linux starts to boot it can't if I have a RAM size above 512 MB specified. Where does this limitation come from?
image

@JoyBed
Contributor Author

JoyBed commented Apr 20, 2024

Well... when I limit it in the DTS to 512 MB then Linux can boot, but the biggest problem I always had is NaxRiscv. The main_ram works on ANY softcore but not on NaxRiscv; here is the screenshot:
image
That's when it's connected to the main RAM through the peripheral bus. When I connect the main RAM directly to the AXI4 ports it gets stuck at the "memtest at 0x40000000" and doesn't even count.

@Dolu1990
Collaborator

Hi,

I just tested with:
litex_sim --cpu-type=naxriscv --with-sdram --sdram-module=MT41K128M16 --sdram-data-width=64

I got :

--=============== SoC ==================--
CPU:		NaxRiscv 32-bit @ 1MHz
BUS:		wishbone 32-bit @ 4GiB
CSR:		32-bit data
ROM:		128.0KiB
SRAM:		8.0KiB
L2:		8.0KiB
SDRAM:		1.0GiB 64-bit @ 8MT/s (CL-6 CWL-5)
MAIN-RAM:	1.0GiB

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Switching SDRAM to hardware control.
Memtest at 0x40000000 (8.0KiB)...
  Write: 0x40000000-0x40002000 8.0KiB   
   Read: 0x40000000-0x40002000 8.0KiB   
Memtest OK
Memspeed at 0x40000000 (Sequential, 8.0KiB)...
  Write speed: 2.9MiB/s
   Read speed: 3.4MiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
Cancelled

--============= Console ================--
litex> mem_test 0x40000000 0x1000
Memtest at 0x40000000 (4.0KiB)...
  Write: 0x40000000-0x40001000 4.0KiB   
   Read: 0x40000000-0x40001000 4.0KiB   
Memtest OK

litex> mem_test 0x50000000 0x1000
Memtest at 0x50000000 (4.0KiB)...
  Write: 0x50000000-0x50001000 4.0KiB   
   Read: 0x50000000-0x50001000 4.0KiB   
Memtest OK

litex> mem_test 0x60000000 0x1000
Memtest at 0x60000000 (4.0KiB)...
  Write: 0x60000000-0x60001000 4.0KiB   
   Read: 0x60000000-0x60001000 4.0KiB   
Memtest OK

litex> mem_test 0x70000000 0x1000
Memtest at 0x70000000 (4.0KiB)...
  Write: 0x70000000-0x70001000 4.0KiB   
   Read: 0x70000000-0x70001000 4.0KiB   
Memtest OK

What command line did you use?
One specific thing about NaxRiscv is that the CPU is generated with exact knowledge of "where is some RAM I can access".
So if there is a bug there, it could create your case.
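
(In LiteX terms, that knowledge presumably comes from the SoC's declared memory regions. A minimal sketch of the kind of declaration involved, using the SoCRegion helper that also appears in the target file later in this thread; the 1 GiB size and 0x40000000 origin are simply the values reported by the litex_sim run above:)

from litex.soc.integration.soc import SoCRegion

# main_ram as reported by the BIOS above: 1.0 GiB at 0x40000000
main_ram_region = SoCRegion(origin=0x4000_0000, size=0x4000_0000, mode="rwx")
# inside a SoC: self.bus.add_region("main_ram", main_ram_region)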

@JoyBed
Contributor Author

JoyBed commented Apr 22, 2024

I used this:
./xilinx_zybo_z7_20.py --variant=original --cpu-type=naxriscv --xlen=64 --scala-args='rvc=true,rvf=true,rvd=true,alu-count=2,decode-count=2,mmu=true' --with-fpu --with-rvc --with-ps7 --bus-standard=axi-lite --with-spi-sdcard --sys-clk-freq=125e6 --with-xadc --csr-json zybo.json --uart-baudrate=2000000 --build --update-repo=wipe+recommended --vivado-synth-directive=PerformanceOptimized --vivado-route-directive=AggressiveExplore --with-hdmi-video-framebuffer --l2-bytes=262144 --l2-ways=16 --with-jtag-tap
When I generate a NaxRiscv with the mbus connected to DRAM it freezes at the mem_test at 0x40000000; when I generate it with DRAM connected to the pbus I get those data errors as in the screenshot.

@Dolu1990
Collaborator

when I generate it with DRAM connected to pbus

Hmm, you should really not do that; the Nax SoC is really intended to use cacheable memory through the mbus. Things going through the pbus may not support atomic accesses and the like.

When I generate a NaxRiscv with the mbus connected to DRAM it freezes at the mem_test at 0x40000000,

Ahhh, that is one thing. Probably related to the Zynq nature of the FPGA. Not sure if there is a way to set up a simulation of the SoC / Zynq with Vivado?

@trabucayre
Collaborator

@JoyBed the DRAM present on the Zybo board is connected to the PS -> you can't use it from the PL

@JoyBed
Contributor Author

JoyBed commented Apr 22, 2024

when I generate it with DRAM connected to pbus

Hmm, you should really not do that; the Nax SoC is really intended to use cacheable memory through the mbus. Things going through the pbus may not support atomic accesses and the like.

When I generate a NaxRiscv with the mbus connected to DRAM it freezes at the mem_test at 0x40000000,

Ahhh, that is one thing. Probably related to the Zynq nature of the FPGA. Not sure if there is a way to set up a simulation of the SoC / Zynq with Vivado?

I don't know. I can launch a simulation within Vivado but that's not helping much as I can't connect to the UART. At least I don't know of a way to do that.

@JoyBed the DRAM present on the Zybo board is connected to the PS -> you can't use it from the PL

Actually you can, through the slave ports of the PS7 system; the HP slave ports are connected directly to the DRAM. I am using that. It's working with EVERY softcore in LiteX except NaxRiscv. Here you can see my target file:

#!/usr/bin/env python3

#
# This file is part of LiteX-Boards.
#
# Copyright (c) 2019-2020 Florent Kermarrec <florent@enjoy-digital.fr>,
# Copyright (c) 2022-2023 Oliver Szabo <16oliver16@gmail.com>
# SPDX-License-Identifier: BSD-2-Clause

import math
import os
import shutil  # used by shutil.copy() in finalize() below
from migen import *
from litex.gen import LiteXModule
from litex.build.tools import write_to_file  # used in finalize() below
from litex_boards.platforms import digilent_zybo_z7_20
from litex.soc.interconnect import axi
from litex.soc.interconnect import wishbone
from litex.soc.cores.clock import *
from litex.soc.integration.soc_core import *
from litex.soc.integration.builder import *
from litex.soc.cores.video import VideoVGAPHY
from litex.soc.cores.video import VideoS7HDMIPHY
from litex.soc.cores.usb_ohci import USBOHCI
from litex.soc.cores.led import LedChaser
from litex.soc.cores.xadc import XADC
from litex.soc.cores.dna  import DNA
from litex.soc.integration.soc import SoCRegion
from litex.soc.interconnect import csr_eventmanager
from litex.soc.interconnect.csr_eventmanager import EventManager, EventSourceLevel, EventSourcePulse
from litex.soc.interconnect.csr import AutoCSR
from litex.soc.cores import cpu


# CRG ----------------------------------------------------------------------------------------------

class _CRG(LiteXModule):
    def __init__(self, platform, sys_clk_freq, toolchain="vivado", use_ps7_clk=False, with_video_pll=False, with_usb_pll=False):
        self.rst    = Signal()
        self.cd_sys = ClockDomain()
        self.cd_vga = ClockDomain()
        self.cd_hdmi = ClockDomain()
        self.cd_hdmi5x = ClockDomain()
        self.cd_usb    = ClockDomain()
        # # #

	# Clk
        clk125 = platform.request("clk125")
        
        if use_ps7_clk:
            self.comb   +=  ClockSignal("sys").eq(ClockSignal("ps7"))
            self.comb   +=  ResetSignal("sys").eq(ResetSignal("ps7") | self.rst)
        else:
            # MMCM.
            #if toolchain == "vivado":
            #    self.mmcm = mmcm = S7MMCM(speedgrade=-2)
            #else:
            #    self.mmcm = mmcm = S7PLL(speedgrade=-2)
            self.mmcm = mmcm = S7MMCM(speedgrade=-2)
            #self.mmcm = mmcm = S7PLL(speedgrade=-1)
            #self.comb += mmcm.reset.eq(self.rst)
            mmcm.register_clkin(clk125, 125e6)
            mmcm.create_clkout(self.cd_sys, sys_clk_freq)
            platform.add_false_path_constraints(self.cd_sys.clk, mmcm.clkin) # Ignore sys_clk to mmcm.clkin path created by SoC's rst.
            mmcm.expose_drp()
            self.comb += mmcm.reset.eq(mmcm.drp_reset.re | self.rst)
            
        # Video PLL.
        if with_video_pll:
            self.video_pll = video_pll = S7PLL(speedgrade=-2)
            self.comb += video_pll.reset.eq(self.rst)
            video_pll.register_clkin(clk125, 125e6)
            #video_pll.create_clkout(self.cd_vga, 40e6)
            video_pll.create_clkout(self.cd_hdmi,   148.5e6)
            video_pll.create_clkout(self.cd_hdmi5x, 5*148.5e6)
            platform.add_false_path_constraints(self.cd_sys.clk, video_pll.clkin) # Ignore sys_clk to video_pll.clkin path created by SoC's rst.
            
        # USB PLL
        if with_usb_pll:
            mmcm.create_clkout(self.cd_usb, 48e6)
            
# BaseSoC ------------------------------------------------------------------------------------------

class BaseSoC(SoCCore):
    mem_map = {**SoCCore.mem_map, **{
        #"usb_ohci":     0xc0000000,
        "usb_ohci":	0x18000000,
    }}
    def __init__(self, sys_clk_freq=100e6, 
    	variant = "original",
    	toolchain="vivado", 
    	with_ps7 = False,
    	with_dna = False,
    	with_xadc = False,
    	with_usb_host=False, 
    	with_led_chaser = False,
    	with_video_terminal = False,
        with_video_framebuffer = False,
        with_hdmi_video_terminal = False,
        with_hdmi_video_framebuffer = False, 
    	**kwargs):

        self.interrupt_map = {
            "ps" : 2,
        }

        platform = digilent_zybo_z7_20.Platform(variant=variant)
        self.builder    = None
        self.with_ps7   = with_ps7
        
        # CRG --------------------------------------------------------------------------------------
        use_ps7_clk     = (kwargs.get("cpu_type", None) == "zynq7000")
        with_video_pll  = (with_hdmi_video_terminal or with_hdmi_video_framebuffer)
        with_usb_pll    = with_usb_host
        self.crg        = _CRG(platform, sys_clk_freq, use_ps7_clk, with_video_pll = with_hdmi_video_terminal or with_video_terminal or with_hdmi_video_framebuffer or with_video_framebuffer, with_usb_pll = with_usb_host)

        # SoCCore ----------------------------------------------------------------------------------
        if kwargs["uart_name"] == "serial":
            kwargs["uart_name"] = "usb_uart" # Use USB-UART Pmod on JB.
        if kwargs.get("cpu_type", None) == "zynq7000":
            kwargs["integrated_sram_size"] = 0x0
            kwargs["with_uart"] = False
            self.mem_map = {
                'csr': 0x4000_0000,  # Zynq GP0 default
            }
        SoCCore.__init__(self, platform, sys_clk_freq, ident="LiteX SoC on Zybo Z7/original Zybo", **kwargs)
        
        # USB Host ---------------------------------------------------------------------------------
        if with_usb_host:
            self.submodules.usb_ohci = USBOHCI(platform, platform.request("usb_host"), usb_clk_freq=int(48e6))
            self.bus.add_slave("usb_ohci_ctrl", self.usb_ohci.wb_ctrl, region=SoCRegion(origin=self.mem_map["usb_ohci"], size=0x100000, cached=False))
            #self.bus.add_slave("usb_ohci_ctrl", self.usb_ohci.wb_ctrl)
            self.dma_bus.add_master("usb_ohci_dma", master=self.usb_ohci.wb_dma)
            self.comb += self.cpu.interrupt[16].eq(self.usb_ohci.interrupt)
        
        # Zynq7000 Integration ---------------------------------------------------------------------
        if kwargs.get("cpu_type", None) == "zynq7000":
            self.cpu.use_rom = True
            if variant in ["z7-10", "z7-20", "original"]:
                # Get and set the pre-generated .xci FIXME: change location? add it to the repository? Make config
                os.makedirs("xci", exist_ok=True)
                os.system("wget https://github.com/litex-hub/litex-boards/files/8339591/zybo_z7_ps7.txt")
                os.system("mv zybo_z7_ps7.txt xci/zybo_z7_ps7.xci")
                self.cpu.set_ps7_xci("xci/zybo_z7_ps7.xci")
            else:
                self.cpu.set_ps7(name="ps", config = platform.ps7_config)

            # Connect AXI GP0 to the SoC with base address of 0x40000000 (default one)
            wb_gp0  = wishbone.Interface()
            self.submodules += axi.AXI2Wishbone(
                axi          = self.cpu.add_axi_gp_master(),
                wishbone     = wb_gp0,
                base_address = 0x40000000)
            self.bus.add_master(master=wb_gp0)
            #TODO memory size dependend on board variant
            self.bus.add_region("sram", SoCRegion(
                origin = self.cpu.mem_map["sram"],
                size   = 512 * 1024 * 1024 - self.cpu.mem_map["sram"])
            )
            self.bus.add_region("rom", SoCRegion(
                origin = self.cpu.mem_map["rom"],
                size   = 256 * 1024 * 1024 // 8,
                linker = True)
            )
            self.constants["CONFIG_CLOCK_FREQUENCY"] = 666666687
            self.bus.add_region("flash", SoCRegion(
                origin = 0xFC00_0000,
                size = 0x4_0000,
                mode = "rwx")
            )

        # PS7 as Slave Integration ---------------------------------------------------------------------
        elif with_ps7:
            cpu_cls = cpu.CPUS["zynq7000"]
            zynq = cpu_cls(self.platform, "standard") # zynq7000 has no variants
            zynq.set_ps7(name="ps", config = platform.ps7_config)
            #axi_M_GP0 = zynq.add_axi_gp_master()
            #self.bus.add_master(master=axi_M_GP0)
            axi_S_HP0     = zynq.add_axi_hp_slave(clock_domain = self.crg.cd_sys.name)
            axi_S_HP1     = zynq.add_axi_hp_slave(clock_domain = self.crg.cd_sys.name)
            axi_S_HP2     = zynq.add_axi_hp_slave(clock_domain = self.crg.cd_sys.name)
            axi_S_HP3     = zynq.add_axi_hp_slave(clock_domain = self.crg.cd_sys.name)
            axi_S_GP0     = zynq.add_axi_gp_slave(clock_domain = self.crg.cd_sys.name)
            hp_ports      = [axi_S_HP0, axi_S_HP1, axi_S_HP2, axi_S_HP3]

            # PS7 DDR3 Interface -----------------------------
            ddr_addr      = self.cpu.mem_map["main_ram"]
            #map_fct_ddr   = lambda sig : sig - ddr_addr + 0x0008_0000
            map_fct_ddr   = lambda sig : sig - ddr_addr + 0x0010_0000
            sdram_size = 0x4000_0000
            
            if hasattr(self.cpu, "add_memory_buses"):
                self.cpu.add_memory_buses(address_width = 32, data_width = 64)
            
            if len(self.cpu.memory_buses): # if CPU has dedicated memory bus
                print("--------Connecting DDR to direct RAM port of the softcore using HP bus.--------")
                i = 0
                for mem_bus in self.cpu.memory_buses:
                    axi_ddr = axi.AXIInterface(hp_ports[i].data_width, hp_ports[i].address_width, "byte", hp_ports[i].id_width)
                    self.comb += axi_ddr.connect_mapped(hp_ports[i], map_fct_ddr)
                    data_width_ratio = int(axi_ddr.data_width/mem_bus.data_width)
                    print("Connecting: ", str(mem_bus), " to ", str(axi_ddr))
                    print("CPU memory bus data width: ", mem_bus.data_width, " bits")
                    print("DDR bus data width: ", axi_ddr.data_width, " bits")
                    print("CPU memory bus address width: ", mem_bus.address_width, " bits")
                    print("DDR bus address width: ", axi_ddr.address_width, " bits")
                    print("CPU memory bus id width: ", mem_bus.id_width, " bits")
                    print("DDR bus id width: ", axi_ddr.id_width, " bits")
                    # Connect directly
                    if data_width_ratio == 1:
                        print("Direct connection")
                        self.comb += mem_bus.connect(axi_ddr)
                    # UpConvert
                    elif data_width_ratio > 1:
                        print("UpConversion")
                        axi_port = axi.AXIInterface(data_width = axi_ddr.data_width, addressing="byte", id_width = len(mem_bus.aw.id))
                        self.submodules += axi.AXIUpConverter(axi_from = mem_bus, axi_to = axi_port,)
                        self.comb += axi_port.connect(axi_ddr)
                    # DownConvert
                    else:
                        print("DownConversion")
                        axi_port = axi.AXIInterface(data_width = axi_ddr.data_width, addressing="byte", id_width = len(mem_bus.aw.id))
                        self.submodules += axi.AXIDownConverter(axi_from = mem_bus, axi_to = axi_port,)
                        self.comb += axi_port.connect(axi_ddr)
                    i = i + 1
                # Add SDRAM region
                origin = None
                main_ram_region = SoCRegion(
                    origin = self.mem_map.get("main_ram", origin),
                    size   = sdram_size,
                    mode   = "rwx")
                self.bus.add_region("main_ram", main_ram_region)
            else:
                print("--------Connecting DDR to general bus of the softcore using GP bus.--------")
                axi_ddr = axi.AXIInterface(axi_S_GP0.data_width, axi_S_GP0.address_width, "byte", axi_S_GP0.id_width)
                #axi_ddr = axi.AXIInterface(axi_S_HP0.data_width, axi_S_HP0.address_width, addressing="byte", axi_S_HP0.id_width)
                self.comb += axi_ddr.connect_mapped(axi_S_GP0, map_fct_ddr)
                #self.comb += axi_ddr.connect_mapped(axi_S_HP0, map_fct_ddr)
                
                self.bus.add_slave(
                   name="main_ram",slave=axi_ddr,
                    region=SoCRegion(
                        origin=ddr_addr,
                        size=sdram_size,
                        mode="rwx"
                    )
                )
            print("---------------------------- End ----------------------------------------------")
            
        
        # Video VGA ------------------------------------------------------------------------------------
        if with_video_terminal or with_video_framebuffer:
            if with_video_terminal:
                self.videophy = VideoVGAPHY(platform.request("vga"), clock_domain="vga")
                self.add_video_terminal(phy=self.videophy, timings="800x600@60Hz", clock_domain="vga")
            if with_video_framebuffer:
                #TODO
                print("Not implemented yet!")
                
        # Video HDMI ------------------------------------------------------------------------------------
        if with_hdmi_video_terminal or with_hdmi_video_framebuffer:
            if with_hdmi_video_terminal:
                self.videophy = VideoS7HDMIPHY(platform.request("hdmi_out"), clock_domain="hdmi")
                self.add_video_terminal(phy=self.videophy, timings="800x600@60Hz", clock_domain="hdmi")
            if with_hdmi_video_framebuffer:
                from my_modules import dvi_framebuffer
                platform.add_source("./my_modules/dvi_framebuffer.v")
                self.cfg_bus = cfg_bus = axi.AXILiteInterface(address_width=32, data_width=32, addressing="byte")
                axi_S_GP1 = zynq.add_axi_gp_slave(clock_domain = self.crg.cd_hdmi.name)
                self.out_bus = out_bus = axi.AXIInterface(axi_S_GP1.data_width, axi_S_GP1.address_width, "byte", axi_S_GP1.id_width)
                self.comb += out_bus.connect_mapped(axi_S_GP1, map_fct_ddr)
                self.submodules.hdmi_framebuffer = hdmi_framebuffer = dvi_framebuffer.dvi_framebuffer(self.crg.cd_hdmi.clk, self.crg.cd_hdmi5x.clk, self.crg.rst, Signal(), cfg_bus, out_bus, platform.request("hdmi_out"))
                self.bus.add_slave("framebuffer_ctrl", cfg_bus, region=SoCRegion(origin=0x87000000, size=0x10000, mode="rw", cached=False))
                
        #Leds -------------------------------------------------------------------------------------
        if with_led_chaser:
            self.leds = LedChaser(
                pads         = platform.request_all("user_led"),
                sys_clk_freq = sys_clk_freq)
        
        # XADC -------------------------------------------------------------------------------------
        if with_xadc:
            self.xadc = XADC()
            
        # DNA --------------------------------------------------------------------------------------
        if with_dna:
            self.dna = DNA()
            self.dna.add_timing_constraints(platform, sys_clk_freq, self.crg.cd_sys.clk)

    def finalize(self, *args, **kwargs):
        super(BaseSoC, self).finalize(*args, **kwargs)
        if self.cpu_type == "zynq7000":
            libxil_path = os.path.join(self.builder.software_dir, 'libxil')
            os.makedirs(os.path.realpath(libxil_path), exist_ok=True)
            lib = os.path.join(libxil_path, 'embeddedsw')
            if not os.path.exists(lib):
                os.system("git clone --depth 1 https://github.com/Xilinx/embeddedsw {}".format(lib))

            os.makedirs(os.path.realpath(self.builder.include_dir), exist_ok=True)
            for header in [
                'XilinxProcessorIPLib/drivers/uartps/src/xuartps_hw.h',
                'lib/bsp/standalone/src/common/xil_types.h',
                'lib/bsp/standalone/src/common/xil_assert.h',
                'lib/bsp/standalone/src/common/xil_io.h',
                'lib/bsp/standalone/src/common/xil_printf.h',
                'lib/bsp/standalone/src/common/xstatus.h',
                'lib/bsp/standalone/src/common/xdebug.h',
                'lib/bsp/standalone/src/arm/cortexa9/xpseudo_asm.h',
                'lib/bsp/standalone/src/arm/cortexa9/xreg_cortexa9.h',
                'lib/bsp/standalone/src/arm/cortexa9/xil_cache.h',
                'lib/bsp/standalone/src/arm/cortexa9/xparameters_ps.h',
                'lib/bsp/standalone/src/arm/cortexa9/xil_errata.h',
                'lib/bsp/standalone/src/arm/cortexa9/xtime_l.h',
                'lib/bsp/standalone/src/arm/common/xil_exception.h',
                'lib/bsp/standalone/src/arm/common/gcc/xpseudo_asm_gcc.h',
            ]:
                shutil.copy(os.path.join(lib, header), self.builder.include_dir)
            write_to_file(os.path.join(self.builder.include_dir, 'bspconfig.h'),
                        '#define FPU_HARD_FLOAT_ABI_ENABLED 1')
            write_to_file(os.path.join(self.builder.include_dir, 'xparameters.h'), '''
#ifndef __XPARAMETERS_H
#define __XPARAMETERS_H

#include "xparameters_ps.h"

#define STDOUT_BASEADDRESS            XPS_UART1_BASEADDR
#define XPAR_PS7_DDR_0_S_AXI_BASEADDR 0x00100000
#define XPAR_PS7_DDR_0_S_AXI_HIGHADDR 0x3FFFFFFF
#endif
''')

        elif self.with_ps7:
            libxil_path = os.path.join(self.builder.software_dir, 'libxil')
            os.makedirs(os.path.realpath(libxil_path), exist_ok=True)
            lib = os.path.join(libxil_path, 'embeddedsw')
            if not os.path.exists(lib):
                os.system("git clone --depth 1 https://github.com/Xilinx/embeddedsw {}".format(lib))

            os.makedirs(os.path.realpath(self.builder.include_dir), exist_ok=True)
            for header in [
                'XilinxProcessorIPLib/drivers/uartps/src/xuartps_hw.h',
                'XilinxProcessorIPLib/drivers/uartps/src/xuartps.h',
                'lib/bsp/standalone/src/common/xil_types.h',
                'lib/bsp/standalone/src/common/xil_assert.h',
                'lib/bsp/standalone/src/common/xil_io.h',
                'lib/bsp/standalone/src/common/xil_printf.h',
                'lib/bsp/standalone/src/common/xplatform_info.h',
                'lib/bsp/standalone/src/common/xstatus.h',
                'lib/bsp/standalone/src/common/xdebug.h'
            ]:
                shutil.copy(os.path.join(lib, header), self.builder.include_dir)
            write_to_file(os.path.join(self.builder.include_dir, 'uart_ps.h'), '''
#ifdef __cplusplus
extern "C" {
#endif
#include "xuartps_hw.h"
#include "system.h"
#define CSR_UART_BASE
#define UART_POLLING
static inline void uart_rxtx_write(char c) {
    XUartPs_WriteReg(STDOUT_BASEADDRESS, XUARTPS_FIFO_OFFSET, (uint32_t) c);
}
static inline uint8_t uart_rxtx_read(void) {
    return XUartPs_ReadReg(STDOUT_BASEADDRESS, XUARTPS_FIFO_OFFSET);
}
static inline uint8_t uart_txfull_read(void) {
    return XUartPs_IsTransmitFull(STDOUT_BASEADDRESS);
}
static inline uint8_t uart_rxempty_read(void) {
    return !XUartPs_IsReceiveData(STDOUT_BASEADDRESS);
}
static inline void uart_ev_pending_write(uint8_t x) { }
static inline uint8_t uart_ev_pending_read(void) {
    return 0;
}
static inline void uart_ev_enable_write(uint8_t x) { }
#ifdef __cplusplus
}
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xil_cache.h'), '''
#ifndef XIL_CACHE_H
#define XIL_CACHE_H
#include "xil_types.h"
#include "xparameters.h"
#include "system.h"
#ifdef __cplusplus
extern "C" {
#endif
void Xil_DCacheFlush(void);
void Xil_ICacheFlush(void);
void Xil_L2CacheFlush(void);
#ifdef __cplusplus
}
#endif
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xil_cache.c'), '''
#include "system.h"
void Xil_DCacheFlush(void){
    flush_cpu_dcache();
}
void Xil_ICacheFlush(void) {
    flush_cpu_icache();
}
void Xil_L2CacheFlush(void) {
    flush_l2_cache();
}
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xparameters.h'), '''
#ifndef __XPARAMETERS_H
#define __XPARAMETERS_H
#include "generated/mem.h"
#define STDOUT_BASEADDRESS            PS_IO_BASE + 0x1000
#define STDIN_BASEADDRESS             PS_IO_BASE + 0x1000
#define XPAR_PS7_DDR_0_S_AXI_BASEADDR MAIN_RAM_BASE
#define XPAR_PS7_DDR_0_S_AXI_HIGHADDR MAIN_RAM_BASE + MAIN_RAM_SIZE
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xpseudo_asm.h'), '''
#ifndef XPSEUDO_ASM_H
#define XPSEUDO_ASM_H
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'bspconfig.h'), '''
#ifndef XPSEUDO_ASM_H
#define XPSEUDO_ASM_H
#endif
''')


# Build --------------------------------------------------------------------------------------------

def main():
    from litex.build.parser import LiteXArgumentParser
    parser = LiteXArgumentParser(platform=digilent_zybo_z7_20.Platform, description="LiteX SoC on Zybo Z7/original Zybo")
    parser.add_target_argument("--sys-clk-freq",          default=125e6, type=float,     help="System clock frequency.")
    parser.add_target_argument("--variant",               default="original",            help="Board variant (z7-10, z7-20 or original).")
    parser.add_target_argument("--with-ps7",              action="store_true",           help="Add the PS7 as slave for soft CPUs.")
    parser.add_target_argument("--with-usb-host",         action="store_true",           help="Enable USB host support.(PMOD)")
    parser.add_target_argument("--with-xadc",             action="store_true",           help="Enable 7-Series XADC.")
    parser.add_target_argument("--with-dna",              action="store_true",           help="Enable 7-Series DNA.")
    sdopts = parser.target_group.add_mutually_exclusive_group()
    sdopts.add_argument("--with-spi-sdcard",              action="store_true",           help="Enable SPI-mode SDCard support.(PMOD)")
    sdopts.add_argument("--with-sdcard",      		  action="store_true",      	 help="Enable SDCard support.(PMOD)")
    viopts = parser.target_group.add_mutually_exclusive_group()
    viopts.add_argument("--with-video-terminal",          action="store_true",           help="Enable Video Terminal (VGA).")
    viopts.add_argument("--with-video-framebuffer",       action="store_true",           help="Enable Video Framebuffer (VGA).")
    viopts.add_argument("--with-hdmi-video-terminal",     action="store_true",           help="Enable Video Terminal (HDMI).")
    viopts.add_argument("--with-hdmi-video-framebuffer",  action="store_true",           help="Enable Video Framebuffer (HDMI).")
    args = parser.parse_args()

    soc = BaseSoC(
        sys_clk_freq = args.sys_clk_freq,
        variant = args.variant,
        with_ps7 = args.with_ps7,
        with_xadc = args.with_xadc,
        with_dna = args.with_dna,
        with_usb_host = args.with_usb_host,
        with_video_terminal = args.with_video_terminal,
        with_video_framebuffer = args.with_video_framebuffer,
        with_hdmi_video_terminal = args.with_hdmi_video_terminal,
        with_hdmi_video_framebuffer = args.with_hdmi_video_framebuffer,
        **soc_core_argdict(args)
    )
    
    if args.with_spi_sdcard:
        soc.platform.add_extension(digilent_zybo_z7_20._sd_card_pmod_io)
        soc.add_spi_sdcard(software_debug=True)
    if args.with_sdcard:
        soc.platform.add_extension(digilent_zybo_z7_20._sd_card_pmod_io)
        soc.add_sdcard(software_debug=True)
    
    builder = Builder(soc, **builder_argdict(args))
    
    if args.cpu_type == "zynq7000" or args.with_ps7:
        soc.builder = builder
        builder.add_software_package('libxil')
        builder.add_software_library('libxil')
    if args.build:
        builder.build(**parser.toolchain_argdict)
    if args.load:
        prog = soc.platform.create_programmer()
        prog.load_bitstream(builder.get_bitstream_filename(mode="sram"), device=1)

if __name__ == "__main__":
    main()

@Dolu1990
Collaborator

Maybe the reason why this isn't working is that if, let's say, we specify that the memory is on the AXI bus at 0x40000000, then NaxRiscv can access it, but the memory accesses on that AXI bus will be emitted without that 0x40000000 offset.

I don't know what the expected behaviour is from the Zynq / LiteX side.

@JoyBed
Contributor Author

JoyBed commented Apr 22, 2024

Maybe the reason why this isn't working is that if, let's say, we specify that the memory is on the AXI bus at 0x40000000, then NaxRiscv can access it, but the memory accesses on that AXI bus will be emitted without that 0x40000000 offset.

I don't know what the expected behaviour is from the Zynq / LiteX side.

I don't quite understand now. If it has it on the mbus, then accesses on the mbus are done without the 0x40000000 offset? So an access to 0x45000000 through the mbus emits address 0x05000000?

@Dolu1990
Collaborator

So an access to 0x45000000 through the mbus emits address 0x05000000?

Yes, I need to double check, but that is quite possible.
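
For reference, the subtractive remapping being discussed is essentially what the target file above already does with map_fct_ddr for the HP ports. A minimal standalone sketch (names assumed; it relies on the connect_mapped helper from the LiteX pull request referenced later in this thread, and is not the actual NaxRiscv generator code):

# Sketch: route a CPU memory bus (mbus) to a Zynq HP port while stripping the
# main_ram origin, so a CPU access to 0x4500_0000 reaches the port as 0x0500_0000.
from migen import Module
from litex.soc.interconnect import axi

class MbusToHpBridge(Module):
    def __init__(self, mem_bus, hp_port, main_ram_base=0x4000_0000):
        # Same idea as map_fct_ddr in the target file: subtract the main_ram origin.
        strip_offset = lambda addr: addr - main_ram_base
        axi_ddr = axi.AXIInterface(hp_port.data_width, hp_port.address_width, "byte", hp_port.id_width)
        self.comb += axi_ddr.connect_mapped(hp_port, strip_offset)  # remap towards the PS7
        self.comb += mem_bus.connect(axi_ddr)                       # CPU mbus -> remapped port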

@JoyBed
Contributor Author

JoyBed commented Apr 23, 2024

So an access to 0x45000000 through the mbus emits address 0x05000000?

Yes, I need to double check, but that is quite possible.

I tried it with the mbus connected to DRAM with the bus address offset (the 0x40000000) stripped; still, when the DRAM is connected to the mbus it locks up at the boot-up mem_test. I don't know what's wrong. Before experimenting with NaxRiscv I was using Rocket, which had its memory bus connected straight to the DRAM, and it worked like a charm. Before that I used VexRiscv, which had the same main_ram address 0x40000000 as NaxRiscv, and it was working too. Even Microwatt and Serv were able to work with it. NaxRiscv is the only one doing this. I'm out of ideas.

@Dolu1990
Collaborator

I'm looking at it. Trying to get the offset to be preserved.

Also, did you try the vexriscv_smp CPU?

@JoyBed
Contributor Author

JoyBed commented Apr 23, 2024

I'm looking at it. Trying to get the offset to be preserved.

Also, did you try the vexriscv_smp CPU?

Yes, and it was working and booting Linux just fine, but not with the memory bus to DRAM, as the memory bus of the VexRiscv SMP has a LiteDRAM interface, not AXI4 as the PS7 of the Zynq has. I also successfully booted Linux on Rocket and Microwatt, and I wanted to move to NaxRiscv for performance reasons; also, I can fit 2 NaxRiscv cores into my FPGA, while with the Linux variant of Rocket I can only fit one.

@JoyBed
Contributor Author

JoyBed commented Apr 24, 2024

Very strange behaviour: through the mbus it locks up at the mem_test even when I specify no L2 cache. I also tried it again with DRAM on the pbus without L2 to see if it makes a difference. Still the same behaviour.
image
EDIT:
Now I stumbled upon an interesting thing: it has problems with some addresses in the 0x40000000-0x41000000 region (it seems to be address 0x40c00000) and then on every 0xX0be0000; everything in between tests OK... strange.
image
image

@JoyBed
Contributor Author

JoyBed commented Apr 25, 2024

Also, if the DRAM is connected through the pbus and I just bypass the non-functional addresses, then it can even boot:
image
Even Linux booted when I specified in the device tree that those addresses are reserved memory regions.

@Dolu1990
Collaborator

Also, if the DRAM is connected through the pbus and I just bypass the non-functional addresses, then it can even boot:

I do not expect it to boot in that configuration. Also, performance will be very bad.

I pushed a potential fix with:
#1940

With this one, mBus accesses will preserve the full 32-bit address, instead of removing the 0x40000000 offset.

I don't have any Zynq board; let me know how it goes :)

Also, keep in mind that VexiiRiscv is very close to feature parity, with performance not too far away. (WIP)

@JoyBed
Contributor Author

JoyBed commented Apr 25, 2024

Also, if the DRAM is connected through the pbus and I just bypass the non-functional addresses, then it can even boot:

I do not expect it to boot in that configuration. Also, performance will be very bad.

I pushed a potential fix with: #1940

With this one, mBus accesses will preserve the full 32-bit address, instead of removing the 0x40000000 offset.

I don't have any Zynq board; let me know how it goes :)

Also, keep in mind that VexiiRiscv is very close to feature parity, with performance not too far away. (WIP)

I tried it, but it still locked up on the mem_test when the DRAM is connected to the mbus:
image
I don't understand this behaviour; it's very strange.

@Dolu1990
Collaborator

Did you check that it passes timing?

@JoyBed
Contributor Author

JoyBed commented Apr 25, 2024

Did you check that it passes timing?

Yes, it passes. The timings are all positive numbers: no negative setup or hold slack.
image

@Dolu1990
Collaborator

Weird, I just tested on a Digilent Nexys Video and it can run Debian just fine.
Did you delete the pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/NaxRiscvLitex_????.v files before retrying with the fixes?
Otherwise it will not regenerate the NaxRiscv SoC and will just reuse the cached one.

@JoyBed
Contributor Author

JoyBed commented Apr 25, 2024

Yes, I deleted the generated Verilog files. I don't understand this strange behavior at all. My goal is to boot Debian/Fedora on it as I did with Rocket. I am thinking of maybe adding a LiteScope or JTAGbone to the SoC to see the signals on the mbus<->DRAM bus, to see what is going on.
EDIT: Alternatively, I can give you remote access to my workstation so you can check out the code and the behaviour of the SoC faster.
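
For reference, wiring a LiteScope analyzer onto the mbus in the target file above could look roughly like the following sketch (assumptions: the litescope package is available, self.cpu.memory_buses[0] is the bus of interest, and the signal list and depth are illustrative only):

# Sketch: probe the CPU memory bus inside BaseSoC.__init__ of the target above.
from litescope import LiteScopeAnalyzer

mbus = self.cpu.memory_buses[0]  # assumed: first (and only) CPU memory bus
analyzer_signals = [
    mbus.aw.valid, mbus.aw.ready, mbus.aw.addr, mbus.aw.id,  # write address channel
    mbus.w.valid,  mbus.w.ready,  mbus.w.last,               # write data channel
    mbus.b.valid,  mbus.b.ready,  mbus.b.id,                 # write response channel
    mbus.ar.valid, mbus.ar.ready, mbus.ar.addr,              # read address channel
    mbus.r.valid,  mbus.r.ready,  mbus.r.last,               # read data channel
]
self.analyzer = LiteScopeAnalyzer(analyzer_signals,
    depth        = 512,
    clock_domain = "sys",
    csr_csv      = "analyzer.csv")
self.add_jtagbone()  # host access over JTAG (litex_server --jtag, then litescope_cli)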

@Dolu1990
Collaborator

to see the signals on the mbus<->DRAM bus, to see what is going on.

Yes, that would be the way to proceed:
probing the mbus, as well as probing the dbus that comes out of the CPU itself.

Ideally, instead of relying on hardware debug, we would run a simulation; that would give us full visibility into what is happening.

@JoyBed
Contributor Author

JoyBed commented Apr 25, 2024

to see the signals on the mbus<->DRAM bus, to see what is going on.

Yes, that would be the way to proceed: probing the mbus, as well as probing the dbus that comes out of the CPU itself.

Ideally, instead of relying on hardware debug, we would run a simulation; that would give us full visibility into what is happening.

I don't know if a simulation would do anything, as the mbus is connected to the PS7 block, which contains hardened components, not softcores. Also, you can't connect to the UART in a Vivado simulation. So the only way to somehow usefully debug it is the debug options in LiteX.

@JoyBed
Contributor Author

JoyBed commented Apr 28, 2024

Here it is with the Rocket softcore. The memory bus of Rocket is connected straight to the DRAM. Working, as with any other softcore except NaxRiscv.
image
Even the whole RAM is there and OK.
image

@Dolu1990
Collaborator

Can you send your custom board files, for me to recreate it? Maybe it is the memory region definition that is messed up.

@JoyBed
Contributor Author

JoyBed commented Apr 28, 2024

Can you send your custom board files, for me to recreate it? Maybe it is the memory region definition that is messed up.

The target and platform files?

@Dolu1990
Collaborator

yes

@JoyBed
Contributor Author

JoyBed commented Apr 28, 2024

yes

Here you go.
custom_zybo.zip
There are the platform and target files, and also the modified Zynq7000 core file so that the HP ports have their ACLK taken from the softcore bus; otherwise it will not work.
Also use this pull request, I'm using that function: https://github.com/enjoy-digital/litex/pull/1522

@JoyBed
Contributor Author

JoyBed commented Apr 29, 2024

Any news?

@JoyBed
Contributor Author

JoyBed commented May 16, 2024

@JoyBed #1923 is now good.

Still a few optimisations to do, but that takes time ^^

I'm using it to run Debian with, for instance:

python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=vexiiriscv  --with-jtag-tap  --bus-standard axi-lite --vexii-args=" \
--allow-bypass-from=0 --debug-privileged --with-mul --with-div --div-ipc --with-rva --with-supervisor --performance-counters 9 \
--regfile-async --xlen=64 --with-rvc --with-rvf --with-rvd --fma-reduced-accuracy \
--fetch-l1 --fetch-l1-ways=4 --fetch-l1-mem-data-width-min=64 \
--lsu-l1 --lsu-l1-ways=4  --lsu-l1-mem-data-width-min=64 --lsu-l1-store-buffer-ops=32 --lsu-l1-refill-count 2 --lsu-l1-writeback-count 2 --lsu-l1-store-buffer-slots=2  --with-lsu-bypass \
--with-btb --with-ras --with-gshare --relaxed-branch"  --cpu-count=4 --with-jtag-tap  --with-video-framebuffer --with-sdcard --with-ethernet --with-coherent-dma --l2-byte=262144 --update-repo=no  --sys-clk-freq 100000000 

How many LUTs does a dual-core VexiiRiscv take?

@Dolu1990
Collaborator

For which ISA?
RV64IMAFDC to run Debian?
Or something softcore-friendly like RV32IMA to just run Linux?

@Dolu1990
Collaborator

Dolu1990 commented May 16, 2024

For RV64IMAFDC, single-issue, everything enabled with memory coherency and a single core, it is around 12K LUTs, at nearly 100 MHz on Artix-7 -1 (slow speed grade).
The FPU takes a lot of space, around 5K LUTs per core. RVC is also a pain in the ass ^^

@JoyBed
Contributor Author

JoyBed commented May 16, 2024

For RV64IMAFDC, single-issue, everything enabled with memory coherency and a single core, it is around 12K LUTs, at nearly 100 MHz on Artix-7 -1 (slow speed grade).

The FPU takes a lot of space, around 5K LUTs per core. RVC is also a pain in the ass ^^

Wow, a Debian-capable core in only 12K LUTs? That's amazing! I can then comfortably fit even 4 of them in my FPGA!

@Dolu1990
Collaborator

Note: a recent change in LiteX broke things XD It works up to LiteX 86a43c9

@JoyBed
Contributor Author

JoyBed commented May 16, 2024

Note: a recent change in LiteX broke things XD It works up to LiteX 86a43c9

Will just reverting this fix things?

@Dolu1990
Collaborator

I mean, you can check out 86a43c9 and it will work, but later it will not.

Otherwise there are two commits to revert to get things to work:

@JoyBed
Contributor Author

JoyBed commented May 16, 2024

I don't have these in my local checkout so I don't need to worry about them. My local copy is about a month old.

@Dolu1990
Collaborator

I updated the Vexii with a fix. Now it works with upstream LiteX.

@JoyBed
Contributor Author

JoyBed commented May 16, 2024

I updated the Vexii with a fix. Now it works with upstream LiteX.

Funny, for some reason I can't check out your PR. Never mind though, I will manually add the needed files.

@JoyBed
Contributor Author

JoyBed commented May 17, 2024

Also a quick question: does VexiiRiscv keep the offset when talking through the memory bus or not?

@Dolu1990
Collaborator

@JoyBed Ahhh right, Vexii was using the old code which was removing the offset.
I just pushed a fix in this PR. It should be good now.

@JoyBed
Contributor Author

JoyBed commented May 17, 2024

@JoyBed Ahhh right, Vexii was using the old code which was removing the offset.

I just pushed a fix in this PR. It should be good now.

Either way, with VexiiRiscv I have the same problem as with NaxRiscv: a lockup when the DRAM is connected through the memory bus, but through the peripheral bus it works.

@Dolu1990
Collaborator

It locks up only if you try to use more than 512 MB of RAM, right?

@JoyBed
Contributor Author

JoyBed commented May 17, 2024

It locks up only if you try to use more than 512 MB of RAM, right?

No. At any amount of DRAM. I tried from 32 MB all the way up to 1 GB.

@JoyBed
Contributor Author

JoyBed commented May 17, 2024

Now, when I was checking it out, why are the args for the FPU and RVC commented out? I was wondering why Linux was not booting. Also, isn't it done like in the old VexRiscv, where the FPU can be configured to be shared between cores?

@Dolu1990
Collaborator

Now, when I was checking it out, why are the args for the FPU and RVC commented out?

Do you mean things around:
https://github.com/SpinalHDL/VexiiRiscv/blob/4d2ff4b29d04bf033239ace06fb0f61b3600362d/src/main/scala/vexiiriscv/Param.scala#L163 ?

It is only for debug, and isn't enabled for LiteX.
See

  //  Debug modifiers
  val debugParam = sys.env.getOrElse("VEXIIRISCV_DEBUG_PARAM", "0").toInt.toBoolean
  if(debugParam) {

As long as you feed LiteX with:

python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=vexiiriscv  --with-jtag-tap  --bus-standard axi-lite --vexii-args=" \
--allow-bypass-from=0 --debug-privileged --with-mul --with-div --div-ipc --with-rva --with-supervisor --performance-counters 9 \
--regfile-async --xlen=64 --with-rvc --with-rvf --with-rvd --fma-reduced-accuracy \
--fetch-l1 --fetch-l1-ways=4 --fetch-l1-mem-data-width-min=64 \
--lsu-l1 --lsu-l1-ways=4  --lsu-l1-mem-data-width-min=64 --lsu-l1-store-buffer-ops=32 --lsu-l1-refill-count 2 --lsu-l1-writeback-count 2 --lsu-l1-store-buffer-slots=2  --with-lsu-bypass \
--with-btb --with-ras --with-gshare --relaxed-branch"  --cpu-count=4 --with-jtag-tap  --with-video-framebuffer --with-sdcard --with-ethernet --with-coherent-dma --l2-byte=262144 --update-repo=no  --sys-clk-freq 100000000  --build

It should be Debian-ready.

Also, isn't it done like in the old VexRiscv, where the FPU can be configured to be shared between cores?

Right, it isn't.
Currently, the FPU is tightly integrated into the pipeline via some plugins:
https://github.com/SpinalHDL/VexiiRiscv/blob/4d2ff4b29d04bf033239ace06fb0f61b3600362d/src/main/scala/vexiiriscv/Param.scala#L660

It would be great to have a lighter alternative (with less performance).
I was thinking that maybe an FSM-like FPU (instead of a fully pipelined one) would allow reusing a lot of hardware.
Or FPU sharing like in VexRiscv, yes.
I implemented things the way they are now because I wanted to aim at a full-performance, tightly coupled FPU, which would also work great in an ASIC.

Through peripheral memory, did you get Linux to work?

On my side, quad-core Debian is very, very stable; I did a lot of tests with it. Also, things like USB host / Bluetooth / SD card / Ethernet are working well.

@JoyBed
Contributor Author

JoyBed commented May 17, 2024

Through peripheral memory, did you get Linux to work?

Yes, through the peripheral bus it's working. But the arguments --with-rvc --with-rvf --with-rvd gave an error as not recognised, so I checked core.py and they are commented out. Also, some lines weren't in the right order in the core.py file, so I reorganised it a bit. When I get home I will send you the core.py I modified from your original one in the PR.

@Dolu1990
Collaborator

--with-rvc --with-rvf --with-rvd

Not recognized by the Python LiteX itself? Or by the SpinalHDL generation?
Note that --with-rvc --with-rvf --with-rvd are part of the --vexii-args; they aren't fed directly to LiteX itself:
--other-args --vexii-args=" ... --with-rvc --with-rvf --with-rvd ... " --other-args

@JoyBed
Contributor Author

JoyBed commented May 18, 2024

--with-rvc --with-rvf --with-rvd

Not recognized by the Python LiteX itself? Or by the SpinalHDL generation? Note that --with-rvc --with-rvf --with-rvd are part of the --vexii-args; they aren't fed directly to LiteX itself: --other-args --vexii-args=" ... --with-rvc --with-rvf --with-rvd ... " --other-args

Yes, they are not recognised by the SpinalHDL generation.
image

EDIT: Never mind, I see the error. I revised core.py a bit so there is no need to call NaxRiscv for the repo update. Wanna see it?

@Dolu1990
Collaborator

Ahhh, I think you have an old version of VexiiRiscv then.
I gave you the command with --update-repo=no; I didn't notice it. Use:
--update-repo=recommended

That may explain a lot.

@Dolu1990
Collaborator

Then the pythondata-cpu-vexiiriscv/pythondata_cpu_vexiiriscv/verilog/ext/vexiiriscv should be on: "fpu_internal", "8a239d10" (after running the SoC generation)

@JoyBed
Contributor Author

JoyBed commented May 18, 2024

Yes, but with the core.py version in the PR the update_repo doesn't work; I fixed it and also removed the dependency on NaxRiscv in the process. Yes, the pythondata is now on "fpu_internal". Can we communicate on some other platform so we can get things done faster and this issue can be closed sooner?

@Dolu1990
Collaborator

Sure, here is my Discord: dolu1990
Would that work for you?

@JoyBed
Contributor Author

JoyBed commented May 18, 2024

Sure, here is my Discord: dolu1990 Would that work for you?

Yup, I sent you a friend request.

@enjoy-digital
Owner

Hi @Dolu1990, @JoyBed,

have you been able to fix/understand the issue/limitation when discussing directly?

@JoyBed
Contributor Author

JoyBed commented Jun 13, 2024

Hi @enjoy-digital! Actually, we pinned down what the problem is. It not only affects NaxRiscv's mbus but also other CPUs' memory buses; it's not a problem of the softcores, though. The problem is that the Zynq7000 has older AXI3 ports while basically everything here is AXI4. The issue is the lack of the WID signal from the AXI4 master, while AXI3 has it: the PS7 block locks up when receiving an AWID of any value other than 0, because WID on the PS7 is left unconnected and is therefore interpreted as 0. We are trying to make a bridge between the AXI4 master and the AXI3 slave. The only unaffected softcore is Rocket, as its TileLink-to-AXI4 bridge uses AWID = 0, so the lack of WID is not a problem.
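
For reference, the WID-recovery part of such an AXI4-to-AXI3 shim could look roughly like the following Migen sketch (assumptions as described above: one AWID is queued per accepted write command and replayed as WID for that burst; it also assumes AW is accepted before the corresponding W beats and omits the stall logic otherwise needed — names and FIFO depth are illustrative, not the bridge actually being developed):

# Sketch: regenerate the AXI3 WID that AXI4 masters no longer provide.
from migen import Module
from migen.genlib.fifo import SyncFIFO

class Axi4ToAxi3Wid(Module):
    def __init__(self, axi4, axi3_wid, depth=8):
        # Queue the AWID of every accepted write command...
        self.submodules.awid_fifo = fifo = SyncFIFO(width=len(axi4.aw.id), depth=depth)
        self.comb += [
            fifo.din.eq(axi4.aw.id),
            fifo.we.eq(axi4.aw.valid & axi4.aw.ready),
            # ...and present it as WID for the whole corresponding write burst
            # (AXI4 masters do not interleave write data, so W order follows AW order).
            axi3_wid.eq(fifo.dout),
            # Pop the entry once the last beat of the burst has been accepted.
            fifo.re.eq(axi4.w.valid & axi4.w.ready & axi4.w.last),
        ]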

@enjoy-digital
Owner

Thanks @JoyBed for the feedback. Regarding your application, do you think an improvement should be made to LiteX to at least prevent things from building? If you could provide more information about the Zynq7000 integration you are doing and a minimal repro, we could try to raise an error if this case is not supposed to be supported and shouldn't build.
