Block RAMs with big data width gives wrong results #2304

Open
syed-ahmed opened this issue Oct 13, 2021 · 0 comments

The following block RAM gives wrong results when compiled with SymbiFlow. The same design compiled with Vivado behaves correctly.

// ==============================================================
// Vitis HLS - High-Level Synthesis from C, C++ and OpenCL v2020.2 (64-bit)
// Copyright 1986-2020 Xilinx, Inc. All Rights Reserved.
// ==============================================================
`timescale 1 ns / 1 ps
module matmul_partition_A_V_0_ram (addr0, ce0, d0, we0, q0,  clk);

parameter DWIDTH = 256;
parameter AWIDTH = 6;
parameter MEM_SIZE = 64;

input[AWIDTH-1:0] addr0;
input ce0;
input[DWIDTH-1:0] d0;
input we0;
output reg[DWIDTH-1:0] q0;
input clk;

reg [DWIDTH-1:0] ram[0:MEM_SIZE-1];




always @(posedge clk)  
begin 
    if (ce0) begin
        if (we0) 
            ram[addr0] <= d0; 
        q0 <= ram[addr0];
    end
end


endmodule

`timescale 1 ns / 1 ps
module matmul_partition_A_V_0(
    reset,
    clk,
    address0,
    ce0,
    we0,
    d0,
    q0);

parameter DataWidth = 32'd256;
parameter AddressRange = 32'd64;
parameter AddressWidth = 32'd6;
input reset;
input clk;
input[AddressWidth - 1:0] address0;
input ce0;
input we0;
input[DataWidth - 1:0] d0;
output[DataWidth - 1:0] q0;



matmul_partition_A_V_0_ram matmul_partition_A_V_0_ram_U(
    .clk( clk ),
    .addr0( address0 ),
    .ce0( ce0 ),
    .we0( we0 ),
    .d0( d0 ),
    .q0( q0 ));

endmodule
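
For reference, here is a minimal Python sketch of the behaviour the RAM module above describes: a 64-entry, 256-bit single-port memory where `q0` receives the old contents of the addressed row on a write (the non-blocking assignments make it read-first). The class and helper names are made up for illustration, the memory starts at zero rather than undefined, and `pack_words` assumes word `i` sits at bits `[32*i +: 32]` of a row, which the report does not confirm.

```python
class SinglePortRam:
    """Software model of matmul_partition_A_V_0_ram (hypothetical helper)."""

    def __init__(self, depth=64, width=256):
        self.mask = (1 << width) - 1
        self.mem = [0] * depth  # assumption: zero-initialised, unlike real hardware
        self.q0 = 0

    def clock(self, addr0, ce0, we0=False, d0=0):
        """Model one rising clock edge and return the registered output q0."""
        if ce0:
            old = self.mem[addr0]            # value seen by q0 <= ram[addr0]
            if we0:
                self.mem[addr0] = d0 & self.mask
            self.q0 = old                    # read-first: q0 gets the pre-write value
        return self.q0                       # q0 holds its value when ce0 is low


def pack_words(words):
    """Pack eight uint32 values into one 256-bit row (assumed lane order)."""
    row = 0
    for lane, word in enumerate(words):
        row |= (word & 0xFFFFFFFF) << (32 * lane)
    return row


def unpack_words(row, count=8):
    """Split a 256-bit row back into uint32 values."""
    return [(row >> (32 * lane)) & 0xFFFFFFFF for lane in range(count)]


if __name__ == "__main__":
    ram = SinglePortRam()
    ram.clock(3, ce0=True, we0=True, d0=pack_words([0x02080860] * 8))
    print([hex(w) for w in unpack_words(ram.clock(3, ce0=True))])
```

A model like this can serve as the "expected" side when comparing values read back from hardware.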

In the application that uses this module, I write a set of uint32_t values into this RAM and then read them back. The readback shows the following mismatches (word index, value read, value expected):

5: We got 00080860 but expected 02080860
10: We got 7f01f801 but expected 7f01f803
13: We got 011008c0 but expected 031008c0
18: We got 6401e001 but expected 6401e003
29: We got 8c0e1870 but expected 8e0e1870
42: We got 7381fc05 but expected 7381fc07
45: We got 059c1e70 but expected 079c1e70
55: We got 2001e00f but expected 2003e00f
58: We got 23008801 but expected 23008803
61: We got 01180840 but expected 03180840
69: We got 04101860 but expected 06101860
82: We got 6701f801 but expected 6701f803
85: We got 041c1860 but expected 061c1860
93: We got 05841810 but expected 07841810
98: We got c383fc05 but expected c383fc07
101: We got 0c3c32b0 but expected 0e3c32b0
...
...
<more mismatches>
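
One way to narrow this down is to XOR each pair and look at which bit positions differ. The sketch below does that for the mismatches listed above. It assumes, without confirmation from the report, that eight consecutive uint32_t words share one 256-bit row and that word index `i` lands in lane `i % 8`; the function names are made up for illustration.

```python
# Mismatches copied from the list above: (word index, got, expected).
MISMATCHES = [
    (5, 0x00080860, 0x02080860), (10, 0x7F01F801, 0x7F01F803),
    (13, 0x011008C0, 0x031008C0), (18, 0x6401E001, 0x6401E003),
    (29, 0x8C0E1870, 0x8E0E1870), (42, 0x7381FC05, 0x7381FC07),
    (45, 0x059C1E70, 0x079C1E70), (55, 0x2001E00F, 0x2003E00F),
    (58, 0x23008801, 0x23008803), (61, 0x01180840, 0x03180840),
    (69, 0x04101860, 0x06101860), (82, 0x6701F801, 0x6701F803),
    (85, 0x041C1860, 0x061C1860), (93, 0x05841810, 0x07841810),
    (98, 0xC383FC05, 0xC383FC07), (101, 0x0C3C32B0, 0x0E3C32B0),
]

WORDS_PER_ROW = 256 // 32  # assumption: 8 words per 256-bit RAM row


def flipped_bits(got, expected):
    """Return the bit positions (0..31) where the two words differ."""
    diff = got ^ expected
    return [bit for bit in range(32) if (diff >> bit) & 1]


def row_bit_positions(mismatches):
    """Map every differing bit to its position in the assumed 256-bit row."""
    positions = set()
    for index, got, expected in mismatches:
        lane = index % WORDS_PER_ROW
        for bit in flipped_bits(got, expected):
            positions.add(lane * 32 + bit)
    return sorted(positions)


def all_dropped_ones(mismatches):
    """True if every differing bit was expected to be 1 but read back as 0."""
    return all((got ^ expected) & got == 0 for _, got, expected in mismatches)


if __name__ == "__main__":
    for index, got, expected in MISMATCHES:
        print(index, index % WORDS_PER_ROW, flipped_bits(got, expected))
    print(row_bit_positions(MISMATCHES))   # [65, 185, 241]
    print(all_dropped_ones(MISMATCHES))    # True
```

Under that packing assumption, the sixteen listed mismatches reduce to three row bit positions (65, 185 and 241), and in every case a bit that should be 1 reads back as 0. That pattern would be consistent with a few data bits being tied low rather than with random corruption, though this is only a hypothesis.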

I added a test under xc/xc7/tests/bram_test in SymbiFlow and can see differences in the generated FASM (to reproduce, run make bram_test_256_6_vivado_diff_fasm). The diff is:

01:13:40.389573176 -0400
@@ -2402,7 +2402,6 @@
 CLBLM_R_X7Y130.SLICEL_X1.CFF.ZRST
 CLBLM_R_X7Y130.SLICEL_X1.CFFMUX.O5
 CLBLM_R_X7Y130.SLICEL_X1.CLUT.INIT[63:0] = 64'b1111111100000000110011001100110010101010101010101010101010101010
-CLBLM_R_X7Y130.SLICEL_X1.COUTMUX.C5Q
 CLBLM_R_X7Y130.SLICEL_X1.D5FF.ZRST
 CLBLM_R_X7Y130.SLICEL_X1.D5FFMUX.IN_B
 CLBLM_R_X7Y130.SLICEL_X1.DLUT.INIT[63:0] = 64'b1111111110101010000000001010101000000000000000000000000000000000
@@ -2637,7 +2636,6 @@
 CLBLM_R_X11Y131.SLICEL_X1.DFF.ZRST
 CLBLM_R_X11Y131.SLICEL_X1.DFFMUX.O5
 CLBLM_R_X11Y131.SLICEL_X1.DLUT.INIT[31:0] = 32'b10101010101010101010101010101010
-CLBLM_R_X11Y131.SLICEL_X1.DOUTMUX.D5Q
 CLBLM_R_X11Y131.SLICEL_X1.FFSYNC
 CLBLM_R_X11Y131.SLICEL_X1.NOCLKINV
 CLBLM_R_X11Y131.SLICEL_X1.PRECYINIT.C0
@@ -5520,7 +5518,6 @@
 INT_L_X8Y131.CLK_L1.GCLK_L_B2
 INT_L_X8Y131.CTRL_L0.SR1END2
 INT_L_X8Y131.CTRL_L1.FAN_BOUNCE1
-INT_L_X8Y131.EE2BEG0.NE2END0
 INT_L_X8Y131.EE2BEG2.LOGIC_OUTS_L10
 INT_L_X8Y131.EE2BEG3.LOGIC_OUTS_L11
 INT_L_X8Y131.EL1BEG_N3.NL1END0
@@ -5747,7 +5744,6 @@
 INT_L_X10Y131.ER1BEG3.WW2END2
 INT_L_X10Y131.FAN_ALT3.EE2END3
 INT_L_X10Y131.FAN_ALT7.FAN_BOUNCE3
-INT_L_X10Y131.NE2BEG0.EE2END0
 INT_L_X10Y131.NN2BEG2.EE2END2
 INT_L_X10Y131.SL1BEG0.ER1END0
 INT_L_X10Y131.SW6BEG1.NW2END2
@@ -5922,6 +5918,7 @@
 INT_L_X12Y131.FAN_ALT6.WL1END1
 INT_L_X12Y131.FAN_ALT7.SL1END2
 INT_L_X12Y131.GFAN0.NR1END1
+INT_L_X12Y131.GFAN1.GND_WIRE
 INT_L_X12Y131.IMUX_L0.SL1END0
 INT_L_X12Y131.IMUX_L1.WW2END0
 INT_L_X12Y131.IMUX_L10.BYP_BOUNCE_N3_6
@@ -5946,10 +5943,10 @@
 INT_L_X12Y131.IMUX_L29.NL1BEG_N3
 INT_L_X12Y131.IMUX_L3.BYP_BOUNCE_N3_7
 INT_L_X12Y131.IMUX_L31.NR1END3
-INT_L_X12Y131.IMUX_L36.ER1END2
+INT_L_X12Y131.IMUX_L36.GFAN1
 INT_L_X12Y131.IMUX_L37.EL1END3
 INT_L_X12Y131.IMUX_L38.WR1END3
-INT_L_X12Y131.IMUX_L39.EL1END_S3_0
+INT_L_X12Y131.IMUX_L39.GFAN1
 INT_L_X12Y131.IMUX_L4.FAN_BOUNCE1
 INT_L_X12Y131.IMUX_L40.WR1END0
 INT_L_X12Y131.IMUX_L41.ER1END0
@@ -7414,7 +7411,6 @@
 INT_R_X5Y124.BYP_ALT5.LOGIC_OUTS23
 INT_R_X5Y124.BYP_ALT6.NW2END3
 INT_R_X5Y124.CLK1.GCLK_B2_EAST
-INT_R_X5Y124.EE2BEG2.LOGIC_OUTS6
 INT_R_X5Y124.FAN_ALT0.SR1END_N3_3
 INT_R_X5Y124.FAN_ALT1.NN2END3
 INT_R_X5Y124.FAN_ALT2.LOGIC_OUTS19
@@ -8394,7 +8390,6 @@
 INT_R_X7Y123.WW2BEG0.SL1END0
 INT_R_X7Y123.WW2BEG2.WR1END3
 
-INT_R_X7Y124.EE4BEG2.EE2END2
 INT_R_X7Y124.NN6BEG0.NE2END0
 INT_R_X7Y124.NR1BEG1.SS2END1
 INT_R_X7Y124.NR1BEG3.SS2END3
@@ -8871,7 +8866,6 @@
 INT_R_X7Y130.IMUX47.NW2END_S0_0
 INT_R_X7Y130.IMUX6.NW2END3
 INT_R_X7Y130.IMUX7.FAN_BOUNCE_S3_4
-INT_R_X7Y130.NE2BEG0.LOGIC_OUTS18
 INT_R_X7Y130.NE2BEG1.NE2END1
 INT_R_X7Y130.NL1BEG0.NE2END1
 INT_R_X7Y130.NL1BEG2.LOGIC_OUTS17
@@ -9341,7 +9335,6 @@
 INT_R_X9Y134.NW2BEG1.NN2END1
 
 INT_R_X11Y124.EL1BEG0.NW2END1
-INT_R_X11Y124.NN6BEG2.EE4END2
 
 INT_R_X11Y126.NL1BEG1.WW2END1
 INT_R_X11Y126.NN2BEG1.EE2END1
@@ -9385,6 +9378,7 @@
 INT_R_X11Y130.BYP_ALT2.EE2END2
 INT_R_X11Y130.BYP_ALT4.NN2END1
 INT_R_X11Y130.BYP_ALT5.EE2END1
+INT_R_X11Y130.BYP_ALT6.GFAN1
 INT_R_X11Y130.BYP_ALT7.NR1END3
 INT_R_X11Y130.CLK0.GCLK_B2_EAST
 INT_R_X11Y130.CLK1.GCLK_B2_EAST
@@ -9400,6 +9394,7 @@
 INT_R_X11Y130.FAN_ALT5.ER1END2
 INT_R_X11Y130.FAN_ALT6.SL1END1
 INT_R_X11Y130.FAN_ALT7.WL1END1
+INT_R_X11Y130.GFAN1.GND_WIRE
 INT_R_X11Y130.IMUX10.BYP_BOUNCE_N3_6
 INT_R_X11Y130.IMUX11.SS2END1
 INT_R_X11Y130.IMUX17.FAN_BOUNCE5
@@ -9413,7 +9408,6 @@
 INT_R_X11Y130.IMUX45.SL1END2
 INT_R_X11Y130.NE2BEG0.LOGIC_OUTS0
 INT_R_X11Y130.NE6BEG3.LOGIC_OUTS17
-INT_R_X11Y130.NL1BEG1.NN6END2
 INT_R_X11Y130.NL1BEG2.LOGIC_OUTS3
 INT_R_X11Y130.NL1BEG_N3.NE6END0
 INT_R_X11Y130.NR1BEG0.EE4END0
@@ -9438,26 +9432,24 @@
 INT_R_X11Y131.BYP_ALT1.NR1END0
 INT_R_X11Y131.BYP_ALT2.BYP_BOUNCE1
 INT_R_X11Y131.BYP_ALT5.SL1END1
-INT_R_X11Y131.BYP_ALT7.NL1END_S3_0
+INT_R_X11Y131.BYP_ALT7.GFAN1
 INT_R_X11Y131.CLK0.GCLK_B2_EAST
 INT_R_X11Y131.EE2BEG2.SS2END2
 INT_R_X11Y131.EL1BEG1.NL1END2
 INT_R_X11Y131.EL1BEG2.ER1END3
 INT_R_X11Y131.EL1BEG_N3.EL1END0
 INT_R_X11Y131.ER1BEG1.WW2END0
-INT_R_X11Y131.ER1BEG2.LOGIC_OUTS19
 INT_R_X11Y131.ER1BEG3.ER1END2
 INT_R_X11Y131.ER1BEG_S0.SW2END3
 INT_R_X11Y131.FAN_ALT6.SR1END1
 INT_R_X11Y131.FAN_ALT7.NR1END2
 INT_R_X11Y131.GFAN0.NR1END1
-INT_R_X11Y131.IMUX16.SL1END0
+INT_R_X11Y131.GFAN1.GND_WIRE
+INT_R_X11Y131.IMUX16.BYP_BOUNCE_N3_6
 INT_R_X11Y131.IMUX41.GFAN0
 INT_R_X11Y131.NE2BEG0.NW2END0
-INT_R_X11Y131.NL1BEG0.NL1END1
 INT_R_X11Y131.NN2BEG3.LOGIC_OUTS17
 INT_R_X11Y131.NR1BEG0.LOGIC_OUTS0
-INT_R_X11Y131.NR1BEG1.LOGIC_OUTS1
 INT_R_X11Y131.SL1BEG0.ER1END0
 INT_R_X11Y131.SL1BEG1.SR1END1
 INT_R_X11Y131.SL1BEG2.SW2END2
@@ -9465,14 +9457,12 @@
 INT_R_X11Y131.SW2BEG0.WW4END1
 INT_R_X11Y131.SW2BEG3.LOGIC_OUTS3
 
-INT_R_X11Y132.EL1BEG0.NR1END1
 INT_R_X11Y132.EL1BEG_N3.NR1END0
 INT_R_X11Y132.ER1BEG1.SR1BEG_S0
 INT_R_X11Y132.ER1BEG3.ER1END2
 INT_R_X11Y132.NL1BEG2.WW2END2
 INT_R_X11Y132.SE2BEG1.ER1END1
 INT_R_X11Y132.SE2BEG2.ER1END2
-INT_R_X11Y132.SL1BEG0.NE2END0
 INT_R_X11Y132.SL1BEG1.EE4END1
 INT_R_X11Y132.SR1BEG1.SW2END0
 INT_R_X11Y132.SR1BEG_S0.WW2END3
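
A diff of this size is easier to read when grouped by tile. The following is a rough sketch of a helper that does so for unified-diff text like the above; `summarize_fasm_diff` and `tiles_touching` are hypothetical names, and the sample input is a short excerpt copied from the diff above.

```python
from collections import defaultdict

# Short excerpt of the diff above, used as sample input.
SAMPLE_DIFF = """\
@@ -2402,7 +2402,6 @@
 CLBLM_R_X7Y130.SLICEL_X1.CFFMUX.O5
-CLBLM_R_X7Y130.SLICEL_X1.COUTMUX.C5Q
@@ -5922,6 +5918,7 @@
 INT_L_X12Y131.GFAN0.NR1END1
+INT_L_X12Y131.GFAN1.GND_WIRE
-INT_L_X12Y131.IMUX_L36.ER1END2
+INT_L_X12Y131.IMUX_L36.GFAN1
"""


def summarize_fasm_diff(diff_text):
    """Group added and removed FASM features by tile name."""
    summary = defaultdict(lambda: {"removed": [], "added": []})
    for line in diff_text.splitlines():
        # Skip hunk headers, file headers, context lines and blanks.
        if len(line) < 2 or line.startswith(("---", "+++", "@@")):
            continue
        if line[0] not in "+-":
            continue
        feature = line[1:].strip()
        if not feature:
            continue
        # The tile is everything before the first dot of the feature name.
        tile, _, rest = feature.partition(".")
        key = "added" if line[0] == "+" else "removed"
        summary[tile][key].append(rest)
    return dict(summary)


def tiles_touching(summary, needle):
    """Return the sorted tiles whose changed features mention `needle`."""
    return sorted(
        tile
        for tile, changes in summary.items()
        if any(needle in f for f in changes["added"] + changes["removed"])
    )


if __name__ == "__main__":
    result = summarize_fasm_diff(SAMPLE_DIFF)
    for tile, changes in sorted(result.items()):
        print(tile, changes)
    print(tiles_touching(result, "GND_WIRE"))
```

Filtering for `GND_WIRE` is one quick way to list the tiles where one side of the diff ties a signal to ground and the other side routes it.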

I also synthesized the Verilog with Yosys and then ran place and route in Vivado, which gave correct results. So maybe something is going wrong in the techmap? Any help is appreciated! My current workaround is to avoid large data widths, which is not ideal for performance.
