fsm_state changes mid cycle #439

Closed
ghost opened this issue Jul 19, 2020 · 40 comments

@ghost ghost commented Jul 19, 2020

This "minimal" example is kind of long, but you [whitequark] said you think you know what bug we're hitting.
I'll keep this example as is for now and try to trim it later.

"""Simple example of a FSM-based ALU

This demonstrates a design that follows the valid/ready protocol of the
ALU, but with a FSM implementation, instead of a pipeline.  It is also
intended to comply with both the CompALU API and the nmutil Pipeline API
(Liskov Substitution Principle)

The basic rules are:

1) p.ready_o is asserted on the initial ("Idle") state, otherwise it keeps low.
2) n.valid_o is asserted on the final ("Done") state, otherwise it keeps low.
3) The FSM stays in the Idle state while p.valid_i is low, otherwise
   it accepts the input data and moves on.
4) The FSM stays in the Done state while n.ready_i is low, otherwise
   it releases the output data and goes back to the Idle state.

"""

from nmigen import Elaboratable, Signal, Module, Cat
cxxsim = True
if cxxsim:
    from nmigen.sim.cxxsim import Simulator, Settle
else:
    from nmigen.back.pysim import Simulator, Settle
from nmigen.cli import rtlil
from math import log2


class Dummy:
    pass


class Shifter(Elaboratable):
    """Simple sequential shifter

    Prev port data:
    * p.data_i.data:  value to be shifted
    * p.data_i.shift: shift amount
    *                 When zero, no shift occurs.
    *                 On POWER, range is 0 to 63 for 32-bit,
    *                 and 0 to 127 for 64-bit.
    *                 Other values wrap around.

    Next port data:
    * n.data_o.data: shifted value
    """
    class PrevData:
        def __init__(self, width):
            self.data = Signal(width, name="p_data_i")
            self.shift = Signal(width, name="p_shift_i")

        def _get_data(self):
            return [self.data, self.shift]

    class NextData:
        def __init__(self, width):
            self.data = Signal(width, name="n_data_o")

        def _get_data(self):
            return [self.data]

    def __init__(self, width):
        self.width = width
        self.p = Dummy()
        self.n = Dummy()
        self.p.valid_i = Signal(name="p_valid_i")
        self.p.ready_o = Signal(name="p_ready_o")
        self.n.ready_i = Signal(name="n_ready_i")
        self.n.valid_o = Signal(name="n_valid_o")

        self.p.data_i = Shifter.PrevData(width)
        self.n.data_o = Shifter.NextData(width)

    def elaborate(self, platform):
        m = Module()

        # Note:
        # It is good practice to design a sequential circuit as
        # a data path and a control path.

        # Data path
        # ---------
        # The idea is to have a register that can be
        # loaded or shifted (left and right).

        # the control signals
        load = Signal()
        shift = Signal()
        # the data flow
        shift_in = Signal(self.width)
        shift_left_by_1 = Signal(self.width)
        next_shift = Signal(self.width)
        # the register
        shift_reg = Signal(self.width, reset_less=True)
        # build the data flow
        m.d.comb += [
            # connect input and output
            shift_in.eq(self.p.data_i.data),
            self.n.data_o.data.eq(shift_reg),
            # generate shifted views of the register
            shift_left_by_1.eq(Cat(0, shift_reg[:-1])),
        ]
        # choose the next value of the register according to the
        # control signals
        # default is no change
        m.d.comb += next_shift.eq(shift_reg)
        with m.If(load):
            m.d.comb += next_shift.eq(shift_in)
        with m.Elif(shift):
            m.d.comb += next_shift.eq(shift_left_by_1)

        # register the next value
        m.d.sync += shift_reg.eq(next_shift)

        # Control path
        # ------------
        # The idea is to have a SHIFT state where the shift register
        # is shifted every cycle, while a counter decrements.
        # This counter is loaded with shift amount in the initial state.
        # The SHIFT state is left when the counter goes to zero.

        # Shift counter
        shift_width = int(log2(self.width)) + 1
        next_count = Signal(shift_width)
        count = Signal(shift_width, reset_less=True)
        m.d.sync += count.eq(next_count)

        #m.d.comb += self.p.ready_o.eq(1)

        with m.FSM():
            with m.State("IDLE"):
                m.d.comb += [
                    # keep p.ready_o active on IDLE
                    self.p.ready_o.eq(1),
                    # keep loading the shift register and shift count
                    load.eq(1),
                    next_count.eq(self.p.data_i.shift),
                ]
                with m.If(self.p.valid_i):
                    # Leave IDLE when data arrives
                    with m.If(next_count == 0):
                        # short-circuit for zero shift
                        m.next = "DONE"
                    with m.Else():
                        m.next = "SHIFT"
            with m.State("SHIFT"):
                m.d.comb += [
                    # keep shifting, while counter is not zero
                    shift.eq(1),
                    # decrement the shift counter
                    next_count.eq(count - 1),
                ]
                with m.If(next_count == 0):
                    # exit when shift counter goes to zero
                    m.next = "DONE"
            with m.State("DONE"):
                # keep n.valid_o active while the data is not accepted
                m.d.comb += self.n.valid_o.eq(1)
                with m.If(self.n.ready_i):
                    # go back to IDLE when the data is accepted
                    m.next = "IDLE"

        return m

    def __iter__(self):
        yield self.p.data_i.data
        yield self.p.data_i.shift
        yield self.p.valid_i
        yield self.p.ready_o
        yield self.n.ready_i
        yield self.n.valid_o
        yield self.n.data_o.data

    def ports(self):
        return list(self)


def test_shifter():
    m = Module()
    m.submodules.shf = dut = Shifter(8)
    print("Shifter port names:")
    for port in dut:
        print("-", port.name)
    # generate RTLIL
    # try "proc; show" in yosys to check the data path
    il = rtlil.convert(dut, ports=dut.ports())
    with open("test_shifter.il", "w") as f:
        f.write(il)
    sim = Simulator(m)
    sim.add_clock(1e-6)

    def send(data, shift):
        yield
        yield
        yield
        yield
        yield
        yield
        yield
        # present input data and assert valid_i
        yield dut.p.data_i.data.eq(data)
        yield dut.p.data_i.shift.eq(shift)
        yield dut.p.valid_i.eq(1)
        yield
        print ("set up signals")
        # wait for p.ready_o to be asserted
        ready_o = yield dut.p.ready_o
        print ("ready_o", ready_o)
        while not (yield dut.p.ready_o):
            ready_o = yield dut.p.ready_o
            print ("ready_o", ready_o)
            yield
        print ("done ready check")
        # clear input data and negate p.valid_i
        yield dut.p.valid_i.eq(0)
        yield dut.p.data_i.data.eq(0)
        yield dut.p.data_i.shift.eq(0)
        print ("done send")

    def receive(expected):
        yield
        # signal readiness to receive data
        yield dut.n.ready_i.eq(1)
        yield
        # wait for n.valid_o to be asserted
        valid_o = yield dut.n.valid_o
        print ("        valid_o", valid_o)
        while not (yield dut.n.valid_o):
            valid_o = yield dut.n.valid_o
            print ("        valid_o", valid_o)
            yield

        # read result
        result = yield dut.n.data_o.data

        # "FIX" the problem with this line:
        #yield    # <---- remove this - pysim "works" but cxxsim does not

        # negate n.ready_i
        yield dut.n.ready_i.eq(0)
        # check result
        assert result == expected
        print ("        done receive")

    def producer():
        print ("start of producer")
        yield from send(3, 4)
        print ("end of producer")

    def consumer():
        yield
        # the consumer is not in step with the producer, but the
        # order of the results is preserved
        # 3 << 4 = 48
        print ("        start of receiver")
        yield from receive(48)
        print ("        end of receiver")

    sim.add_sync_process(producer)
    sim.add_sync_process(consumer)
    sim_writer = sim.write_vcd(
        "test_shifter.vcd",
    )
    with sim_writer:
        sim.run()


if __name__ == "__main__":
    test_shifter()
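
The four rules in the docstring can be checked without any simulator. Below is a minimal pure-Python sketch of the same IDLE/SHIFT/DONE control path, stepping one clock edge per call. `ShifterModel`, `outputs` and `step` are hypothetical names introduced here for illustration; they are not part of nmigen or of the test above, and the model ignores all simulator timing questions.

```python
class ShifterModel:
    """Cycle-level model of the IDLE/SHIFT/DONE shifter (illustrative only)."""

    def __init__(self, width=8):
        self.mask = (1 << width) - 1
        # same counter width as the design: int(log2(width)) + 1 bits
        self.count_mask = (1 << width.bit_length()) - 1
        self.state = "IDLE"
        self.shift_reg = 0
        self.count = 0

    def outputs(self):
        """Combinational outputs, which depend only on the current state."""
        return {
            "p_ready_o": int(self.state == "IDLE"),
            "n_valid_o": int(self.state == "DONE"),
            "n_data_o": self.shift_reg,
        }

    def step(self, p_valid_i=0, data=0, shift=0, n_ready_i=0):
        """Advance one clock edge, using inputs sampled before the edge."""
        if self.state == "IDLE":
            # load is asserted in IDLE, so both registers follow the inputs
            next_count = shift & self.count_mask
            self.shift_reg = data & self.mask
            self.count = next_count
            if p_valid_i:
                self.state = "DONE" if next_count == 0 else "SHIFT"
        elif self.state == "SHIFT":
            next_count = (self.count - 1) & self.count_mask
            self.shift_reg = (self.shift_reg << 1) & self.mask
            self.count = next_count
            if next_count == 0:
                self.state = "DONE"
        else:  # DONE
            self.count = 0  # next_count is not driven here, so it is 0
            if n_ready_i:
                self.state = "IDLE"


model = ShifterModel(8)
model.step(p_valid_i=1, data=3, shift=4)   # accepted while in IDLE
cycles = 0
while not model.outputs()["n_valid_o"]:
    model.step()
    cycles += 1
result = model.outputs()["n_data_o"]
print(result, cycles)  # 48 4
```

With a width of 8 the counter is 4 bits wide, so a shift amount of 4 takes four SHIFT cycles before `n_valid_o` rises, which matches the `3 << 4 = 48` expectation in the consumer.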

@cestrauss cestrauss commented Jul 21, 2020

Greetings. I'm Cesar, from libre-SOC.

I'd like to point to the previous discussion on https://bugs.libre-soc.org/show_bug.cgi?id=417#c13, where the issue first arose.

I managed to trim the test case by removing the shift state. Since only two states remain, the next logical step is to replace the FSM with a one-bit register.

"""Simple Handshake Test

1) p_ready_o is asserted on the initial ("Idle") state, otherwise it keeps low.
2) n_valid_o is asserted on the final ("Done") state, otherwise it keeps low.
3) The FSM stays in the Idle state while p_valid_i is low, otherwise
   it goes to Done.
4) The FSM stays in the Done state while n_ready_i is low, otherwise
   it goes back to Idle.
"""

from nmigen import Elaboratable, Signal, Module

cxxsim = True
if cxxsim:
    from nmigen.sim.cxxsim import Simulator, Settle
else:
    from nmigen.sim.pysim import Simulator, Settle


class Handshake(Elaboratable):

    def __init__(self):
        self.p_valid_i = Signal()
        self.p_ready_o = Signal()
        self.n_ready_i = Signal()
        self.n_valid_o = Signal()

    def elaborate(self, platform):
        m = Module()

        with m.FSM():
            with m.State("IDLE"):
                m.d.comb += self.p_ready_o.eq(1)
                with m.If(self.p_valid_i):
                    m.next = "DONE"
            with m.State("DONE"):
                m.d.comb += self.n_valid_o.eq(1)
                with m.If(self.n_ready_i):
                    m.next = "IDLE"

        return m

    def __iter__(self):
        yield self.p_valid_i
        yield self.p_ready_o
        yield self.n_ready_i
        yield self.n_valid_o

    def ports(self):
        return list(self)


def test_handshake():
    m = Module()
    m.submodules.hsk = dut = Handshake()
    sim = Simulator(m)
    sim.add_clock(1e-6)

    def send():
        yield
        yield
        yield
        yield
        yield
        yield
        yield
        # assert p_valid_i
        yield dut.p_valid_i.eq(1)
        yield
        print("set up signals")
        # wait for p_ready_o to be asserted
        ready_o = yield dut.p_ready_o
        print("ready_o", ready_o)
        while not (yield dut.p_ready_o):
            ready_o = yield dut.p_ready_o
            print("ready_o", ready_o)
            yield
        print("done ready check")
        # negate p_valid_i
        yield dut.p_valid_i.eq(0)
        print("done send")

    def receive():
        yield
        # signal readiness to receive data
        yield dut.n_ready_i.eq(1)
        yield
        # wait for n_valid_o to be asserted
        valid_o = yield dut.n_valid_o
        print("        valid_o", valid_o)
        while not (yield dut.n_valid_o):
            valid_o = yield dut.n_valid_o
            print("        valid_o", valid_o)
            yield
        # negate n_ready_i
        yield dut.n_ready_i.eq(0)
        print("        done receive")

    def producer():
        print("start of producer")
        yield from send()
        print("end of producer")

    def consumer():
        yield
        # the consumer is not in step with the producer
        print("        start of receiver")
        yield from receive()
        print("        end of receiver")

    sim.add_sync_process(producer)
    sim.add_sync_process(consumer)
    sim_writer = sim.write_vcd("handshake.vcd")
    with sim_writer:
        sim.run()


if __name__ == "__main__":
    test_handshake()
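
For reference, the one-bit register version mentioned above could look roughly like this when written as plain Python functions. This is only a sketch of the intended next-state logic; `next_done` and `outputs` are names made up for this illustration, and the stimulus list is a hypothetical one, not the exact timing of the test.

```python
def next_done(done, p_valid_i, n_ready_i):
    """Next value of the one-bit state register (0 = IDLE, 1 = DONE)."""
    if done:
        # stay in DONE until the consumer is ready
        return int(not n_ready_i)
    # stay in IDLE until the producer presents valid data
    return int(bool(p_valid_i))


def outputs(done):
    """p_ready_o is high in IDLE, n_valid_o is high in DONE."""
    return {"p_ready_o": int(not done), "n_valid_o": int(bool(done))}


# hypothetical stimulus: (p_valid_i, n_ready_i) sampled at each clock edge
done = 0
history = []
for p_valid_i, n_ready_i in [(0, 0), (1, 0), (0, 0), (0, 1), (0, 0)]:
    history.append(done)
    done = next_done(done, p_valid_i, n_ready_i)
history.append(done)
print(history)  # [0, 0, 1, 1, 0, 0]
```

The register only changes on a clock edge, so any output derived from it should also change only right after a rising edge.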

Author

@ghost ghost commented Jul 21, 2020

Thanks Cesar. BracketMaster is Yehowshua BTW.

Member

@whitequark whitequark commented Jul 21, 2020

This looks much better, thank you! Though it seems like it could be minimized a bit further without much effort?


@cestrauss cestrauss commented Jul 22, 2020

On https://bugs.libre-soc.org/show_bug.cgi?id=417#c15, Luke wrote:

                m.d.comb += self.n.valid_o.eq(1)
                with m.If(self.n.ready_i):
                    # go back to IDLE when the data is accepted
                    m.next = "IDLE"

this should set the FSM to "IDLE" on the next cycle, however
the fact that in the unit test ready_i is dropped immediately,
combined with the fact that valid_o is set combinatorially, what
happens instead is:

  • valid_o is set to 1 (combinatorially)
  • unit test (combinatorially) notices that (in the while yield loop)
  • unit test (combinatorially) sets ready_i to 0
  • DUT - combinatorially - notices that ready_i has been set to 0
  • DUT NO LONGER ASSERTS m.next=IDLE

To which, I replied:

Unit tests work sequentially, not combinatorially, unless Settle() is used.
This unit test does not use Settle().

On the other hand, the behavior of cxxsim is consistent with it doing an implicit Settle() after a yield.

The following test demonstrates it:

from nmigen import Elaboratable, Signal, Module
from nmigen.sim.cxxsim import Simulator as CxxSimulator
from nmigen.sim.pysim import Simulator as PySimulator


class SamplePoint(Elaboratable):
    """Just a simple one-bit register"""

    def __init__(self):
        self.data_i = Signal()
        self.data_o = Signal()

    def elaborate(self, _):
        m = Module()
        m.d.sync += self.data_o.eq(self.data_i)
        return m
        
    def ports(self):
        return [self.data_i, self.data_o]


def test_sample_point(sim_type):
    print("Testing", sim_type)

    if sim_type == "cxxsim":
        simulator = CxxSimulator
    else:
        simulator = PySimulator

    m = Module()
    m.submodules.sp = dut = SamplePoint()
    sim = simulator(m)
    sim.add_clock(1e-6)

    def process():
        # present data to register input, to be latched at the next clock
        # rising edge
        yield dut.data_i.eq(1)
        yield
        # at this point, just after the clock rising edge, the register
        # should still hold its previous (reset) value
        assert (yield dut.data_o) == 0

    sim.add_sync_process(process)
    sim_writer = sim.write_vcd(f"{sim_type}.vcd")
    with sim_writer:
        sim.run()
    
    print("PASS")


if __name__ == "__main__":
    test_sample_point("pysim")
    test_sample_point("cxxsim")
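
To make the "implicit Settle()" hypothesis concrete, here is a toy model in plain Python. It is not how pysim or cxxsim are implemented; `ToyRegister` and `run_testbench` are hypothetical names, and the model only assumes what the test above assumes: a value latched at a clock edge becomes visible to the testbench once the simulation has settled.

```python
class ToyRegister:
    """One-bit register with an explicit 'latched but not yet visible' value."""

    def __init__(self, reset=0):
        self.curr = reset      # value a testbench reads
        self.pending = reset   # value captured at the last clock edge

    def clock_edge(self, data_i):
        self.pending = data_i

    def settle(self):
        self.curr = self.pending


def run_testbench(implicit_settle):
    """Return what the testbench reads right after the edge, then after settling."""
    reg = ToyRegister()
    reg.clock_edge(data_i=1)
    if implicit_settle:
        # the behavior suspected here: settling before the process resumes
        reg.settle()
    seen_after_edge = reg.curr
    reg.settle()               # the explicit Settle() in a testbench
    seen_after_settle = reg.curr
    return seen_after_edge, seen_after_settle


print(run_testbench(implicit_settle=False))  # (0, 1)
print(run_testbench(implicit_settle=True))   # (1, 1)
```

In this picture the assertion `(yield dut.data_o) == 0` passes only in the first case, which is the difference the test is meant to expose.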

Member

@whitequark whitequark commented Jul 22, 2020

On the other hand, the behavior of cxxsim is consistent with it doing an implicit Settle() after a yield.

This narrows it down enough. Thanks for your effort, I'll fix this soon.

@whitequark whitequark added this to the 0.3 milestone Jul 22, 2020
whitequark added a commit that referenced this issue Jul 22, 2020
Member

@whitequark whitequark commented Jul 22, 2020

This issue is now fixed in the cxxsim branch.

@whitequark whitequark closed this Jul 22, 2020
whitequark added a commit that referenced this issue Jul 22, 2020

@cestrauss cestrauss commented Jul 22, 2020

Greetings.

After pulling the cxxsim branch (commit 1f8ba74), it looks like the register is not being updated at all, even after a Settle() or a yield.

I modified the previous test case, to check for this.

from nmigen import Elaboratable, Signal, Module
from nmigen.sim.cxxsim import Simulator as CxxSimulator
from nmigen.sim.cxxsim import Settle as CxxSettle
from nmigen.sim.pysim import Simulator as PySimulator
from nmigen.sim.pysim import Settle as PySettle


class SamplePoint(Elaboratable):
    """Just a simple one-bit register"""

    def __init__(self):
        self.data_i = Signal()
        self.data_o = Signal()

    def elaborate(self, _):
        m = Module()
        m.d.sync += self.data_o.eq(self.data_i)
        return m
        
    def ports(self):
        return [self.data_i, self.data_o]


def test_sample_point(sim_type):
    print("Testing", sim_type)

    if sim_type == "cxxsim":
        simulator = CxxSimulator
        settle = CxxSettle
    else:
        simulator = PySimulator
        settle = PySettle

    m = Module()
    m.submodules.sp = dut = SamplePoint()
    sim = simulator(m)
    sim.add_clock(1e-6)

    def process():
        # present data to register input, to be latched at the next clock
        # rising edge
        yield dut.data_i.eq(1)
        yield
        # at this point, just after the clock rising edge, the register
        # should still hold its previous (reset) value
        assert (yield dut.data_o) == 0
        print("Check reset value: PASS")
        yield settle()
        # now we should see it
        assert (yield dut.data_o) == 1
        print("Check latched value: PASS")

    sim.add_sync_process(process)
    sim_writer = sim.write_vcd(f"{sim_type}.vcd")
    with sim_writer:
        sim.run()
    

if __name__ == "__main__":
    test_sample_point("pysim")
    test_sample_point("cxxsim")


@cestrauss cestrauss commented Jul 22, 2020

Same result on the just rebased branch (060ad25).

@whitequark whitequark reopened this Jul 22, 2020
whitequark added a commit that referenced this issue Aug 27, 2020
Member

@whitequark whitequark commented Aug 27, 2020

Commit 7ca1477 fixes the problem where registers change mid-cycle. However, the second problem (as described in #455 (comment)) also arises here, so the code is just as broken.

whitequark added a commit that referenced this issue Aug 27, 2020
Member

@whitequark whitequark commented Aug 27, 2020

I've pushed a low-performance workaround to the cxxsim branch. Please take another look.


@cestrauss cestrauss commented Aug 27, 2020

Sorry, with the cxxsim branch checked out at 0caa57e, I still see the earlier issue.

It seems to me, that the behavior of cxxsim is consistent with it doing an implicit Settle() after the clock rises, but before the Python process is run.

I managed to further reduce the test case. See below:

from nmigen import Signal, Module
from nmigen.sim.cxxsim import Simulator

m = Module()
o = Signal()

m.d.sync += o.eq(1)


def process():
    assert (yield o) == 0


sim = Simulator(m)
sim.add_clock(1e-6)
sim.add_sync_process(process)
sim_writer = sim.write_vcd("bug-439.vcd")
with sim_writer:
    sim.run()

Member

@whitequark whitequark commented Aug 27, 2020

It seems to me, that the behavior of cxxsim is consistent with it doing an implicit Settle() after the clock rises, but before the Python process is run.

Thanks for the investigation! The reduced testcase is very helpful.

Regarding the implicit Settle(): I believe this is accurate with respect to the observable behavior, but, unfortunately, the behavior arises implicitly from faulty scheduling. I'll have to rethink my approach of integrating CXXRTL and the Python testbenches—I've designed them to closely match each other, but there are still opportunities for race conditions at the boundary, and you're hitting one of them.

Member

@whitequark whitequark commented Sep 3, 2020

@cestrauss Please take another look. I completely reworked the integration layer and I believe cxxsim should no longer exhibit this class of issue.

Member

@whitequark whitequark commented Sep 3, 2020

You'll need the very latest Yosys to use the current cxxsim branch btw.


@cestrauss cestrauss commented Sep 3, 2020

I installed Yosys master (commit c66d1dfad).
All test cases above now succeed, without assertions (after updating them to the latest Simulator API).
In the first, non-reduced test case, I do see one of the output signals (n_valid_o) changing mid-cycle.
The reduced ones do not show this.
Let me try to reduce it.


@cestrauss cestrauss commented Sep 6, 2020

Here you go.
Uncommenting any of the three commented-out lines makes the problem go away.

from nmigen import Signal, Module
from nmigen.sim import Simulator, Tick

m = Module()
r = Signal()
o = Signal()

m.d.sync += r.eq(1)
m.d.comb += o.eq(r & 1)
# m.d.comb += o.eq(r)


def process():
#    (yield r)
    yield Tick()
#    (yield r)
    yield Tick()


sim = Simulator(m, engine="cxxsim")
sim.add_clock(1e-6)
sim.add_process(process)
sim_writer = sim.write_vcd("bug-439.vcd")
with sim_writer:
    sim.run()

It generates the VCD below.
Notice at #10000, how "o" rises as "clk" falls.

$timescale 100 ps $end
$var wire 1 ! clk $end
$var wire 1 " o $end
$var reg 1 # r $end
$var wire 1 $ r$next $end
$var wire 1 % rst $end
$enddefinitions $end
#0
0!
0"
0#
1$
0%
#0
#5000
1!
1#
#10000
0!
1"
#15000
1!
#20000
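
Traces like this can also be checked mechanically instead of by eye. The following is a small sketch that scans a VCD dump for signals changing at the same timestamp as a falling clock edge. `changes_on_falling_edge` is a helper written for this comment (not part of nmigen), and it only understands single-bit value changes such as the ones above.

```python
VCD_TEXT = '''
$timescale 100 ps $end
$var wire 1 ! clk $end
$var wire 1 " o $end
$var reg 1 # r $end
$var wire 1 $ r$next $end
$var wire 1 % rst $end
$enddefinitions $end
#0
0!
0"
0#
1$
0%
#0
#5000
1!
1#
#10000
0!
1"
#15000
1!
#20000
'''


def changes_on_falling_edge(vcd_text, clock="clk"):
    """List (time, signal, value) for changes coinciding with a falling clock."""
    names = {}    # VCD identifier code -> signal name
    events = {}   # timestamp -> list of (signal name, new value)
    time = 0
    for line in vcd_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("$var"):
            # $var <type> <width> <identifier> <name> $end
            _, _, _, ident, name, *_ = line.split()
            names[ident] = name
        elif line.startswith("$"):
            continue                      # other header keywords
        elif line.startswith("#"):
            time = int(line[1:])          # timestamp
        else:
            value, ident = line[0], line[1:]
            events.setdefault(time, []).append((names[ident], value))
    hits = []
    for t, changes in sorted(events.items()):
        # skip t == 0, where the clock merely takes its initial value
        if t > 0 and (clock, "0") in changes:
            hits += [(t, n, v) for n, v in changes if n != clock]
    return hits


print(changes_on_falling_edge(VCD_TEXT))  # [(10000, 'o', '1')]
```

An empty list would mean that nothing but the clock itself changes on a falling edge, which is what a purely rising-edge design should produce.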

Member

@whitequark whitequark commented Sep 6, 2020

Here you go.

Thanks for reducing this. This is actually a very subtle issue in CXXRTL itself more so than the cxxsim integration code. This can be shown by eliminating everything but the clock process:

from nmigen import Signal, Module
from nmigen.sim import Simulator, Tick


m = Module()
r = Signal()
o = Signal()

m.d.sync += r.eq(1)
m.d.comb += o.eq(r & 1)


sim = Simulator(m, engine="cxxsim")
sim.add_clock(1e-6)
sim_writer = sim.write_vcd("bug-439.vcd")
with sim_writer:
    sim.run_until(2e-6, run_passive=True)

I'll need to investigate how to fix this. Most likely, it will be necessary to change static scheduling of combinatorial nodes connected to registers.

Member

@whitequark whitequark commented Sep 6, 2020

As a temporary workaround, try applying this patch:

diff --git a/nmigen/sim/cxxsim.py b/nmigen/sim/cxxsim.py
index 753c72d..1ad91f1 100644
--- a/nmigen/sim/cxxsim.py
+++ b/nmigen/sim/cxxsim.py
@@ -71,6 +71,7 @@ class _CxxRTLProcess(BaseProcess):
 
     def run(self):
         self.cxxlib.eval(self.handle)
+        self.runnable = True
 
 
 class _CxxSimulation(BaseSimulation):

I think it won't cover every case, but it might get you further with your overall design.
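
One way to picture why re-running evaluation can help is the toy two-phase model below (evaluate, then commit). To be clear, this is a simplification invented for this thread and not CXXRTL's actual code; `ToyDesign`, `step` and `trace` are hypothetical names. It mirrors the reduced design: `r` is a register whose input is tied to 1, and `o = r & 1` is combinatorial.

```python
class ToyDesign:
    """r is a register with its input tied to 1; o = r & 1 is combinatorial."""

    def __init__(self):
        self.r = 0
        self.r_next = 0
        self.o = 0

    def eval(self, posedge):
        # combinatorial logic reads the current (pre-commit) register value
        self.o = self.r & 1
        if posedge:
            self.r_next = 1

    def commit(self):
        changed = self.r != self.r_next
        self.r = self.r_next
        return changed


def step(design, posedge, rerun):
    """One clock edge; with rerun, evaluate again until nothing changes."""
    design.eval(posedge)
    changed = design.commit()
    while rerun and changed:
        design.eval(False)
        changed = design.commit()


def trace(rerun):
    design = ToyDesign()
    out = []
    for edge in ("rise", "fall", "rise"):
        step(design, posedge=(edge == "rise"), rerun=rerun)
        out.append((edge, design.r, design.o))
    return out


print(trace(rerun=False))  # [('rise', 1, 0), ('fall', 1, 1), ('rise', 1, 1)]
print(trace(rerun=True))   # [('rise', 1, 1), ('fall', 1, 1), ('rise', 1, 1)]
```

Without the extra pass, `o` is computed from the old value of `r` and only catches up at the next evaluation, which happens on the falling edge. That is the same shape as the VCD above, though the real cause inside CXXRTL may well differ in detail.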


@cestrauss cestrauss commented Sep 7, 2020

It works!

  1. There are no traces changing on the falling clock edge, on any of the test cases.
  2. No regressions either on their asserts.

Member

@whitequark whitequark commented Sep 7, 2020

Right. I did check that on my end. What I'm more interested in is you experimenting with your overall design, more so than with the extracted test cases.


@cestrauss cestrauss commented Sep 8, 2020

I ran an integration test, which should give good coverage of the whole project.
On pysim, it runs for about 8 minutes.
On cxxsim, I interrupted it after letting it run for an hour, at 100% CPU.

It did output the following:

/home/cstrauss/src/nmigen/nmigen/sim/cxxsim.py:170: YosysWarning: Design contains feedback wires, which require delta cycles during evaluation.
  cxx_source, name_map = cxxrtl.convert_fragment(fragment)

Then, it spent some time compiling an 11MB sim.cc file, generating a 15MB sim.so.

After one hour, with no further output, I interrupted it, giving the following stack trace:

^CTraceback (most recent call last):
  File "/home/cstrauss/src/soc/src/soc/simple/test/test_issuer.py", line 318, in <module>
    runner.run(suite)
  File "/usr/lib/python3.7/unittest/runner.py", line 176, in run
    test(result)
  File "/usr/lib/python3.7/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.7/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib/python3.7/unittest/case.py", line 663, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.7/unittest/case.py", line 615, in run
    testMethod()
  File "/home/cstrauss/src/soc/src/soc/simple/test/test_issuer.py", line 299, in run_all
    sim.run()
  File "/home/cstrauss/src/nmigen/nmigen/sim/core.py", line 168, in run
    while self.advance():
  File "/home/cstrauss/src/nmigen/nmigen/sim/core.py", line 159, in advance
    return self._engine.advance()
  File "/home/cstrauss/src/nmigen/nmigen/sim/cxxsim.py", line 228, in advance
    self._step()
  File "/home/cstrauss/src/nmigen/nmigen/sim/cxxsim.py", line 218, in _step
    process.run()
  File "/home/cstrauss/src/nmigen/nmigen/sim/cxxsim.py", line 73, in run
    self.cxxlib.eval(self.handle)
KeyboardInterrupt

real	75m38.540s
user	74m15.167s
sys	0m23.699s

I guess there is more reducing for me to do.


@cestrauss cestrauss commented Sep 27, 2020

Greetings.

I managed to reduce the cxxsim hang, above, to the following:

from nmigen import Signal, Module
from nmigen.sim import Simulator

m = Module()
s = Signal(33)


def process():
    yield s.eq(1)
    print("end of process")


sim = Simulator(m, engine="cxxsim")
sim.add_process(process)
sim.run()

I guess we tended to use 32 bits or less in our own experiments, which resulted in a lack of coverage for wider signals.
This was only caught in higher-level tests, which used the full 64 bits of our processor design.


@cestrauss cestrauss commented Sep 27, 2020

Just to clarify:

  1. Any non-zero update seems to trigger the issue, provided that the signal being updated has a width greater than 32 bits.
  2. Using wider signals in the code above (128, 256, 512 bits) also triggers the issue.
  3. If it helps, we would guess "ctypes" being related to the root of the issue, perhaps.
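
As an illustration of why 33 bits could be a natural boundary: if values cross the Python/C interface as arrays of 32-bit words (an assumption on my part, not something confirmed in this thread), then a 33-bit signal is the first one that needs two words. The helpers below, `to_chunks` and `from_chunks`, are hypothetical and only show the arithmetic.

```python
def to_chunks(value, width, chunk_bits=32):
    """Split an integer into little-endian words covering `width` bits."""
    n_chunks = (width + chunk_bits - 1) // chunk_bits
    mask = (1 << chunk_bits) - 1
    return [(value >> (i * chunk_bits)) & mask for i in range(n_chunks)]


def from_chunks(chunks, chunk_bits=32):
    """Reassemble the integer from its little-endian words."""
    return sum(chunk << (i * chunk_bits) for i, chunk in enumerate(chunks))


print(to_chunks(1, 32))        # [1]
print(to_chunks(1, 33))        # [1, 0]
print(to_chunks(1 << 32, 33))  # [0, 1]
```

Any code path that handles only the first word, or loops over words with a wrong termination condition, would go unnoticed with signals of 32 bits or fewer.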

Member

@whitequark whitequark commented Sep 28, 2020

@cestrauss I believe this is actually a logic bug, not a ctypes bug. Should be easy to fix.

Member

@whitequark whitequark commented Sep 28, 2020

Please try again.


@cestrauss cestrauss commented Sep 29, 2020

The hanging is gone, thanks.
But, now that the test no longer hangs, I can see that it actually fails.
It doesn't seem to be a regression; the above tests all still work.
I will resume investigating.


@cestrauss cestrauss commented Sep 29, 2020

Here you go:

from nmigen import Signal, Module
from nmigen.sim import Simulator

m = Module()
s = Signal(reset=1)


def process():
    assert((yield s) == 1)


sim = Simulator(m, engine="cxxsim")
sim.add_process(process)
sim.run()

On pysim, signal "s" is high, even when not driven.
On cxxsim, it stays low.

Member

@whitequark whitequark commented Sep 29, 2020

Please try again.


@cestrauss cestrauss commented Sep 30, 2020

Not quite there yet:

from nmigen import Signal, Module
from nmigen.sim import Simulator

m = Module()
s = Signal(reset=1)
t = Signal()
m.d.comb += t.eq(s)


def process():
    assert((yield s) == 1)


sim = Simulator(m, engine="cxxsim")
sim.add_process(process)
sim.run()

Member

@whitequark whitequark commented Oct 2, 2020

Not quite there yet:

Interesting--that's a completely unrelated bug. I'll have to look into it as the cause is clear but the fix is not obvious.

Member

@whitequark whitequark commented Dec 3, 2020

@cestrauss Please evaluate the following temporary solution:

diff --git a/nmigen/back/rtlil.py b/nmigen/back/rtlil.py
index 916605e..de9e5e2 100644
--- a/nmigen/back/rtlil.py
+++ b/nmigen/back/rtlil.py
@@ -333,6 +333,10 @@ class _ValueCompilerState:
             for value in signal._enum_class:
                 attrs["enum_value_{:0{}b}".format(value.value, signal.width)] = value.name
 
+        # XXX
+        if signal not in self.driven:
+            attrs["init"] = signal.reset
+
         wire_curr = self.rtlil.wire(width=signal.width, name=wire_name,
                                     port_id=port_id, port_kind=port_kind,
                                     attrs=attrs, src=src(signal.src_loc))

It cannot be applied as-is, but it should unblock you to do further verification.


@cestrauss cestrauss commented Dec 4, 2020

Indeed. Appreciated.
I will resume testing, and let you know if I find anything else.
Also, thanks for fixing the hierarchy in VCD files, much appreciated.


@cestrauss cestrauss commented Dec 6, 2020

With the above patch, I'm getting the following error on some designs:
nmigen._toolchain.yosys.YosysError: ERROR: Conflicting init values for signal \alu.pipe_middle_0.empty (\alu.pipe_middle_0.empty = 1'1 != 1'0).
It does not occur without the patch.
It is a bit rare. Many other designs do not produce this error, even with the patch.
The message comes from write_cxxrtl.
I attach the RTLIL files, with and without the patch: div.zip
To reproduce, open div-patched.il in yosys and call write_cxxrtl.
Would it help if I try to reduce it?

By the way, the patch interferes with bounded model check, which is a minor inconvenience. I get:
Checking assumptions in step 0.. Assumptions are unsatisfiable!

Member

@whitequark whitequark commented Dec 6, 2020

Would it help if I try to reduce it?

Not necessary; I approximately know the cause. I'll look into it.

Member

@whitequark whitequark commented Dec 7, 2020

@cestrauss Minimized with this patch and command bugpoint -command "write_cxxrtl /dev/null" -grep "1'1 != 1'0" to:

module \pipe_middle_0
  wire \empty
  attribute \init 0
  wire \p.p_ready_o
  wire \p_ready_o
  process $group_31
    assign \p_ready_o \empty
  end
  process $group_37
    sync init
      update \empty 1'1
  end
  connect \p.p_ready_o \p_ready_o
end
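
For readers who have not used a test-case reducer before, the general idea is to keep deleting parts of the input for as long as the failure of interest still reproduces. The sketch below shows that idea on a list of lines. It is a generic greedy reducer written for illustration, not how Yosys' bugpoint works internally (bugpoint operates on the design, not on text lines); `reduce_lines` and `still_fails` are hypothetical names.

```python
def reduce_lines(lines, still_fails):
    """Greedily drop lines while still_fails(candidate) remains true."""
    current = list(lines)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                # this line was not needed to reproduce the failure
                current = candidate
                changed = True
                break
    return current


# stand-in for "run the tool and grep the error message"
def still_fails(lines):
    return "init 0" in lines and "init 1" in lines


reduced = reduce_lines(["a", "init 0", "b", "init 1", "c"], still_fails)
print(reduced)  # ['init 0', 'init 1']
```

The predicate plays the role of the `-command` and `-grep` options: it decides whether a smaller candidate still shows the same error.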

Member

@whitequark whitequark commented Dec 7, 2020

@cestrauss Try this. Never mind, I think this requires a new approach...

Member

@whitequark whitequark commented Dec 7, 2020

@cestrauss Okay, should be fixed properly now.


@cestrauss cestrauss commented Dec 8, 2020

Confirmed fixed, thanks.
This bugpoint Yosys command seems very useful, good to know.

Member

@whitequark whitequark commented Dec 8, 2020

Confirmed fixed, thanks.

Closing this issue now; it's been three unrelated bugs already, so any new issues that arise should probably be reported on their own.

That said, does everything work for you now, or have you not tested the full design yet?

This bugpoint Yosys command seems very useful, good to know.

Yup, I wrote it exactly for cases like this.

@whitequark whitequark closed this Dec 8, 2020

@cestrauss cestrauss commented Dec 8, 2020

Sure.
I am indeed still seeing some difference in behavior with respect to pysim on some tests, which I'll proceed to investigate.
