Original Redmine Comment
This is probably a few months' effort; the big challenge is defining
I'll post more later or on request.
Original Redmine Comment
I gave this much thought some time ago in a CPU emulator I was working on. I'd like to take a stab at this in the near future. Let's make sure I don't completely misunderstand what's meant by "adding an event loop". The following is my understanding intermingled with the proposal.
The key ingredients are:
The @workunit@ is used by the event loop to pace the simulator so that it doesn't hold control for too long. The simulator can adjust its "loop count" or some equivalent to return control after a certain number of @workunit@s has been done.
The event loop mostly calls the simulator's @ProcessEvent@ API with various events. Each event includes the maximum desired number of @workunit@s to be done by the simulator while processing the event.
When a @SimulatedEvent@ arrives, the simulator may be either ahead of or behind the event's @SimulatedTime@. If the simulator is already past the event's time, it fails an assertion: it is the event loop's job to check the simulator's time and roll back its state should that time be past the event's time. When the event is ahead of the simulator's time, the simulator keeps running the simulation until its time matches the given @SimulatedTime@, then processes the event itself (e.g. a flip of a bit somewhere in its state), then resumes. In all cases, the simulator returns once it has done approximately the amount of work expected of it. If it didn't get the chance to consume the event, it keeps the event in its own queue. Those queued events are a part of the simulator's state.
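A minimal C++ sketch of the @ProcessEvent@ behaviour described above. Everything here is an assumption made for illustration and not existing Verilator API: the @Simulator@ class, the @processEvent@/@run@ names, the picosecond time unit, the "flip one bit" payload, and the simplification that one work unit equals one time step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

using SimulatedTime = std::uint64_t;  // assumed unit: picoseconds

struct SimulatedEvent {
    SimulatedTime time;  // when the event takes effect
    int bit;             // hypothetical payload: which state bit to flip
};

// Orders the pending queue so that the earliest event is on top.
struct LaterFirst {
    bool operator()(const SimulatedEvent& a, const SimulatedEvent& b) const {
        return a.time > b.time;
    }
};

class Simulator {
public:
    SimulatedTime now() const { return now_; }
    std::uint32_t state() const { return state_; }
    std::size_t pending() const { return queue_.size(); }

    // Accepts one event, then runs for at most maxWorkUnits.
    void processEvent(const SimulatedEvent& ev, unsigned maxWorkUnits) {
        // The event loop must roll the simulator back before delivering
        // an event that lies in the simulator's past.
        assert(ev.time >= now_ && "event is in the simulator's past");
        queue_.push(ev);  // queued events are part of the simulator state
        run(maxWorkUnits);
    }

    // Runs without a new event, e.g. to drain already queued events.
    void run(unsigned maxWorkUnits) {
        for (unsigned done = 0; done < maxWorkUnits; ++done) {
            // Apply every queued event whose time has been reached.
            while (!queue_.empty() && queue_.top().time == now_) {
                state_ ^= (1u << queue_.top().bit);
                queue_.pop();
            }
            ++now_;  // stand-in for one step of real simulation work
        }
    }

private:
    SimulatedTime now_ = 0;
    std::uint32_t state_ = 0;
    std::priority_queue<SimulatedEvent, std::vector<SimulatedEvent>,
                        LaterFirst> queue_;
};
```

With a budget of 3 work units and an event at time 5, the first call returns at time 3 with the event still queued; a later @run(3)@ reaches time 5, applies the event, and continues.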
The simulator can also call APIs of the event loop. This is mostly @postevent(SimulatedTime, ...)@. It's up to the event loop to decide how to consume those events, i.e. they need to be "connected" to some receiver(s) (e.g. a GUI, a log file, ...).
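The "connection" of posted events to receivers could look roughly like the sketch below. The @EventLoop@ class, the string channel names, the string payload, and the @hasSubscribers@ query are all hypothetical; the proposal only fixes that the simulator posts events stamped with a @SimulatedTime@ and that the event loop decides who consumes them.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using SimulatedTime = std::uint64_t;  // assumed unit: picoseconds

class EventLoop {
public:
    // A receiver might be a GUI widget, a log file writer, and so on.
    using Receiver = std::function<void(SimulatedTime, const std::string&)>;

    // Connects a receiver to a named output channel (hypothetical naming).
    void connect(const std::string& channel, Receiver r) {
        receivers_[channel].push_back(std::move(r));
    }

    // Lets a simulator ask whether anybody observes a channel at all.
    bool hasSubscribers(const std::string& channel) const {
        auto it = receivers_.find(channel);
        return it != receivers_.end() && !it->second.empty();
    }

    // Called by the simulator. Returns how many receivers got the event.
    std::size_t postEvent(SimulatedTime t, const std::string& channel,
                          const std::string& payload) {
        auto it = receivers_.find(channel);
        if (it == receivers_.end()) return 0;
        for (const Receiver& r : it->second) r(t, payload);
        return it->second.size();
    }

private:
    std::map<std::string, std::vector<Receiver>> receivers_;
};
```

The simulator never touches the GUI or the log file directly in this arrangement; it only sees @postEvent@.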
Note that the simulator can advance @SimulatedTime@ arbitrarily far - this is the equivalent of, say, waiting for a timer to expire. What's important is that the simulator do a certain amount of work and return. When the event loop detects that there are events for the simulator that happened at a time prior to the simulator's current time, it rewinds the simulator's state sufficiently far back, then issues the events to it.
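One way the event loop could find a state "sufficiently far back" is to keep snapshots keyed by simulated time and pick the latest one that is not later than the event. This is only a sketch under assumptions: @SnapshotStore@, @SimState@, and the decision to discard newer snapshots on rewind are illustrative choices, not part of the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

using SimulatedTime = std::uint64_t;  // assumed unit: picoseconds

// Hypothetical simulator state; a real one would hold the whole model.
struct SimState {
    SimulatedTime now;
    std::uint32_t bits;
};

class SnapshotStore {
public:
    void save(const SimState& s) { snaps_[s.now] = s; }
    std::size_t size() const { return snaps_.size(); }

    // Returns the latest snapshot taken at or before t. Snapshots taken
    // after t are discarded, since the late event may invalidate them.
    SimState rewindTo(SimulatedTime t) {
        auto it = snaps_.upper_bound(t);  // first snapshot strictly after t
        assert(it != snaps_.begin() && "no snapshot old enough");
        snaps_.erase(it, snaps_.end());
        return std::prev(snaps_.end())->second;
    }

private:
    std::map<SimulatedTime, SimState> snaps_;
};
```

After @rewindTo@, the event loop would restore the returned state into the simulator and then deliver the late event as an ordinary future event.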
Since @SimulatedTime@ is in real physical units, the event loop can correlate it with the timestamps of realtime events such as a UI interaction, or with timestamps of other simulations.
Since there's effectively no limit on how far into the future the events may be, it's possible to have fairly complex rendezvous scenarios. For example, suppose you have a Verilog simulation of two Hayes 9600 modems, with a serial interface on one end and an audio/phone line on the other. You want to test sending a file using ZModem, implemented in C++ in the bench that runs the two simulations. Since ZModem can have large windows, there'll be large chunks of simulated time where the serial line's "newByteToModem1" events will be provided quite a while into the future. Thus the simulator thread for the "file sender" modem can run largely uninterrupted even if the modem itself doesn't implement a large queue in its Verilog model: the simulator itself does the queueing. If the test bench/harness wants to, for whatever reason, issue a line signal loss - whether in the future or in the past - it can certainly do so by dropping/altering the "analogData" events travelling between the modems.
I've been using a scheme similar to this one to run multiprocessor embedded hardware simulations. The simulations were written in C++ rather than Verilog, but they were cycle-accurate nevertheless, and it wasn't hard to separate the peripherals from the processor core: they'd run independently as much as possible, and only rendezvous when needed. In my implementation, the event loop didn't hard-rollback the simulator when it had an event in the past: it'd instead provide the event and the handle to a snapshot that predated the event, and then the simulator could decide whether the event would alter the state. Suppose that a serial line input came, but the CPU happened to have that UART disabled between the rollback state and the current state - no need to roll back even though the event was in the past. The simulator would keep a state counter for each peripheral, incremented each time the externally visible state changed, and would know that no state change had occurred between state snapshots when the counter was the same.
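The state-counter check for the UART example might be sketched as follows. This is a reconstruction under assumptions, not the original implementation: the @Uart@ class, the @PeripheralSnapshot@ fields, and the @needsRollback@ name are hypothetical, and "externally visible state" is reduced to a single enable flag.

```cpp
#include <cassert>
#include <cstdint>

using SimulatedTime = std::uint64_t;  // assumed unit: picoseconds

// What the simulator remembers about a peripheral at snapshot time.
struct PeripheralSnapshot {
    SimulatedTime time;
    std::uint64_t stateCounter;  // version of the externally visible state
    bool enabled;
};

class Uart {
public:
    // Any change to externally visible state bumps the counter.
    void setEnabled(bool on) {
        if (on != enabled_) {
            enabled_ = on;
            ++stateCounter_;
        }
    }

    PeripheralSnapshot snapshot(SimulatedTime t) const {
        return PeripheralSnapshot{t, stateCounter_, enabled_};
    }

    // Decides whether a late serial input at eventTime forces a rollback
    // to snap, which must predate the event.
    bool needsRollback(const PeripheralSnapshot& snap,
                       SimulatedTime eventTime) const {
        assert(snap.time <= eventTime && "snapshot must predate the event");
        // Same counter: nothing visible changed since the snapshot.
        const bool unchanged = (snap.stateCounter == stateCounter_);
        // A UART that stayed disabled the whole time drops the input,
        // so the late event cannot have altered any state.
        return !(unchanged && !snap.enabled);
    }

private:
    bool enabled_ = false;
    std::uint64_t stateCounter_ = 0;
};
```

The check is deliberately conservative: if the counter moved at all, the sketch rolls back rather than trying to work out whether the UART was enabled at the exact time of the event.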
The simulators were also informed if any events had no subscribers: in such a case, even if the state of the UART might have changed, it was invisible, and thus the "state counter" of the peripheral could remain static. Those were micro-optimizations but had good results.
I would imagine that an approach that addresses these problems would be useful for Verilator, but I'm not necessarily suggesting doing it in this particular way. I do believe that performance demands the ability to minimize the synchronization between simulated entities, and to synchronize lazily, i.e. only when needed. This is facilitated by the ability to inform the simulator of events that happen in the future (in @SimulatedTime@ terms), as well as of events that happened in the past, where the simulator may need to roll back its state. I first had the rollback handled by the event loop and only later moved it to the simulator itself so that it could decide whether the rollback was necessary (if the event would not change the state, then there was no need to roll back, but only the simulator would know that).
But the driving design principle should be that the interface between the simulator thread and its event loop is abstract, so that the simulator has access to the "real world" only via the event loop. The key invariant would be that if a simulator snapshot is propagated forward (i.e. work units get done) with the same external events, the resulting simulator state at any chosen future @SimulatedTime@ is only a function of those events and of the snapshot state, and nothing else.
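The invariant can be phrased as a property of a pure "advance" function: the result depends only on the snapshot and the events, not on how the run was chopped into work units. The sketch below is a toy under assumptions; @State@, @Event@, @advance@, and the arithmetic standing in for model logic are all made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using SimulatedTime = std::uint64_t;  // assumed unit: picoseconds

struct Event {
    SimulatedTime time;
    std::uint32_t value;
};

struct State {
    SimulatedTime now;
    std::uint32_t acc;
};

inline bool operator==(const State& a, const State& b) {
    return a.now == b.now && a.acc == b.acc;
}

// Propagates a snapshot forward to `until`. It reads nothing but its
// arguments, so the result is a function of snapshot and events only.
State advance(State s, const std::vector<Event>& events,
              SimulatedTime until) {
    assert(until >= s.now && "cannot advance backwards");
    while (s.now < until) {
        for (const Event& e : events) {
            if (e.time == s.now) s.acc ^= e.value;  // consume the event
        }
        s.acc = s.acc * 1664525u + 1013904223u;  // stand-in model logic
        ++s.now;
    }
    return s;
}
```

A test harness could check the invariant by comparing one long run against the same run split at an arbitrary point, e.g. @advance(advance(snap, ev, 50), ev, 100)@ against @advance(snap, ev, 100)@.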
An event loop would exist for any thread where a simulator runs, and it'd need to handle cross-thread event propagation. Of course Verilog allows multiple threads of execution, and their fork-joins would need to be communicated via events. All threads that modify state they mutually depend on must be treated as a single entity of sorts as far as event propagation goes, since their states are intertwined and an event delivered to one thread, if not ignored, is visible via the shared state to the other threads. This increases the rendezvous costs somewhat, so in general the Verilog models being simulated should prefer to access shared state in the manner of Hoare's CSP. I've sidestepped this issue in my CPU emulations by designing the emulated software to only interact in the CSP manner (implemented in a copy-free fashion).