Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Illegal use of conditional assignment in process #151

Open
MayaPosch opened this issue Feb 20, 2020 · 15 comments
Open

Illegal use of conditional assignment in process #151

MayaPosch opened this issue Feb 20, 2020 · 15 comments
Assignees

Comments

@MayaPosch
Copy link

In multiple source files, a conditional signal assignment statement is used within a process, which is illegal in VHDL-2008. Instead an if statement or case statement should be used.

During compilation one will get errors such as:

Error (10500): VHDL syntax error at wishbone_arbiter.vhdl(41) near text "when"; expecting ";"

This is in process 'wishbone_muxes' (line 31of wishbone_arbiter.vhdl:

wishbone_muxes: process(selected, candidate, busy, wb_slave_in, wb_masters_in)
variable early_sel : wb_arb_master_t;
begin
early_sel := selected;
if busy = '0' then
early_sel := candidate;
end if;
wb_slave_out <= wb_masters_in(early_sel);
for i in 0 to NUM_MASTERS-1 loop
wb_masters_out(i).dat <= wb_slave_in.dat;
wb_masters_out(i).ack <= wb_slave_in.ack when early_sel = i else '0';
wb_masters_out(i).stall <= wb_slave_in.stall when early_sel = i else '1';
end loop;
end process;

Here both conditional assignments (lines 41 and 42) should be rewritten either as a case statement or if statement. The same error occurs in various other locations in other source files, which I can list here if needed.

@mikey
Copy link
Collaborator

mikey commented Jun 11, 2020

@MayaPosch which compiler are you using? We've been using GHDL and Vivado in VHDL-2008 mode and have not been hitting this.

We can fix them but without a way to test, we'll likely introduce similar issues in the future.

@ozbenh
Copy link
Contributor

ozbenh commented Jun 27, 2020

Right neither Vivado nor ghdl complains ... I'll fix it anyways

@eine
Copy link

eine commented Jul 2, 2020

Should lines 41 and 42 be the following:

wb_masters_out(i).ack <= wb_slave_in.ack when early_sel = i else '0';
wb_masters_out(i).stall <= wb_slave_in.stall when early_sel = i else '1';

I'd say that's precisely a feature which was NOT supported before VHDL 2008, but which is supported since then. Previously, it was only allowed outside of processes. Hence, I'd say the synthesizer/simulator used by @MayaPosch was configured for VHDL <2008 (I assume, the default in the tool). Can you please confirm?

In this case, I would use a generate instead of a process, because it feels more natural from a hardware point of view (less softwareish). But I believe that's just a style preference.

@ozbenh
Copy link
Contributor

ozbenh commented Jul 2, 2020

Allright, I'm confused :-) @eine what you say the lines should be is exactly what they are today .. or am I missing something ?

Is that construct inside of a process supported or not in VHDL 2008 ? Since none of the tools we use complains I assume it is.

Out of curiosity, why do you prefer a generate statement in that case ?

@eine
Copy link

eine commented Jul 2, 2020

Allright, I'm confused :-) @eine what you say the lines should be is exactly what they are today .. or am I missing something ?

Oh, I think this is because I'm not a native english speaker. Should I have used "Shall the OP mean the following with lines 41 and 42:"?

The lines are directly copied from @MayaPosch's comment above, and I believe that those are the same that you are using in the codebase today. The discussion is whether that's valid VHDL 2008 or not. It was said it is not. I argue that it is. Hence, I think that this issue is related to the OP's tool settings, and not something that needs to be fixed in microwatt.

Is that construct inside of a process supported or not in VHDL 2008 ? Since none of the tools we use complains I assume it is.

It is. However, if you try to analyze it with GHDL or Vivado and VHDL 1993 or 2002, it should complain. That's precisely my point.

Out of curiosity, why do you prefer a generate statement in that case ?

VHDL is a language which supports synthesizable and non-synthesizable features. When using non-synthesizable language features, it feels very much like Ada or C. Hence, it is common to use lots of processes and to describe many sequential snippets, as if we were writing "functions". In this context, advantages of VHDL as a modeling language rely on the strong typing and the inherent parallelism; which makes it much easier to describe concurrent threads and inter-process communication of otherwise sequential modules/functions/blocks. It is not a language for production code, because the runtime is slower than Ada or C. But I expect you understand the potential for "modelling" any kind of "complex highly concurrent system", including physical systems (absolutely unrelated to hardware or electronics).

However, when the synthesizable subset is used, VHDL is not a programming language anymore, but a description language for hardware (read RTL) designs. As a language that is describing hardware, everything is parallel/concurrent by nature. In any FPGA/ASIC, everything is running at the same time. In this context, sequential operations are awkward, very awkward, absolutely unnatural, something to run away from... There is hardly nothing sequential in RTL. All comes down to registers (FFs), logical gates (LUTs), muxes, DSPs, and few additional and ad-hoc hard-IP included in todays SoCs. All the sequences of events that need to be described in RTL can be done through FSMs, and a FSM is just a register and the "next state logic", which is combinatorial. So, I envision describing hardware in VHDL as being suprisingly similar to drawing the same design using Tikz (LaTeX) or any other similar tool for "drawing programmatically". I believe that a good hardware designer needs to foresee the RTL layout of the code that she/he is writing. The best way to do so is to take a library of synthesizable templates (such as the one provided by many vendor tools) and compose a structural description based on those. When you need to instantiate a component/entity multiple times, using a generate is the natural way to do it.

The point is that, in the end, all concurrent statements in VHDL have an equivalent implementation as a process. IIRC, that's actually how concurrent statements are defined in the standard. I believe it makes sense because tooling is written in software, so analyzers/simulators/synthesizers feel more comfortable with sequential definitions. That's why you can write VHDL using sequential statements only, ignoring the other sub-subset completely. Nevertheless, the fact that you can do it does not mean that you should do it.

Let's be specific:

wishbone_muxes: process(selected, candidate, busy, wb_slave_in, wb_masters_in)
  variable early_sel : wb_arb_master_t;
begin
  early_sel := selected;
  if busy = '0' then
    early_sel := candidate;
  end if;
  wb_slave_out <= wb_masters_in(early_sel);
  for i in 0 to NUM_MASTERS-1 loop
    wb_masters_out(i).dat <= wb_slave_in.dat;
    wb_masters_out(i).ack <= wb_slave_in.ack when early_sel = i else '0';
    wb_masters_out(i).stall <= wb_slave_in.stall when early_sel = i else '1';
  end loop;
end process;

There is no clock signal here. Hence, this is not a register nor any other construct which needs to take time into account. You are saying that early_sel, wb_slave_out and wb_masters_out (each of them) MUST be updated each time any of selected, candidate, busy, wb_slave_in OR wb_masters_in are updated (because of the sensitivity list). Is that true (i.e. the hardware you want to describe)? Let's see:

early_sel <= candidate when busy = '0' else selected;

wb_slave_out <= wb_masters_in(early_sel);

gen: for i in 0 to NUM_MASTERS-1 generate
  wb_masters_out(i).dat <= wb_slave_in.dat;
  wb_masters_out(i).ack <= wb_slave_in.ack when early_sel = i else '0';
  wb_masters_out(i).stall <= wb_slave_in.stall when early_sel = i else '1';
end generate;

It turns out that:

  • early_sel depends on candidate and selected ONLY. And all we need to implement it in hardware is a 2-to-1 MUX.
  • wb_slave_out depends on wb_masters_in and early_sel ONLY. And all we need to implement it is, also, an N-to-1 MUX, where N is the module of wb_arb_master_t.
  • wb_masters_out depends on wb_slave_in and early_sel ONLY. Regarding the implementation, there is a hardwired broadcast for dat, NUM_MASTERS 2-to-1 MUXes for each of ack and stall, and a decoder. Well, these last MUXes can be visualized as AND/OR gates with constant 0/1, respectively.

It can be argued that the original process is almost as readable as this version, and I wouldn't complain about that, should it come from an experienced user. However, for unexperienced users, I believe the latter is significantly easier to understand. For example, should the design need to be pipelined by say registering signal early_sel:

process(clk)
begin
  if rising_edge(clk) then
    early_sel <= candidate when busy = '0' else selected;
  end if;
end process;

wb_slave_out <= wb_masters_in(early_sel);

gen: for i in 0 to NUM_MASTERS-1 generate
  wb_masters_out(i).dat <= wb_slave_in.dat;
  wb_masters_out(i).ack <= wb_slave_in.ack when early_sel = i else '0';
  wb_masters_out(i).stall <= wb_slave_in.stall when early_sel = i else '1';
end generate;

It's "natural" to visualize which parts are registered and which are not. If you try to make this change above, you'll find that it is not so evident.

Last, but not least, I don't know how will synthesis tools behave in this case. I think the code is simple enough for them to guess that wb_masters_out does NOT depend on busy, candidate, selected or wb_masters_in and they should "optimize" non-needed "update triggers" (if this wording makes any sense at all...). However, strictly speaking, tools should not do that for you. You are describing something, and they should stick to it, even if it's wrong (i.e. it doesn't make sense). Hence, even if you use sequential statements, I would recommend to use three processes instead of a single one (you'll find that doing so is more verbose than using concurrent statements). Moreover, I invite you to try synthesizing all three versions and analyzing (visualizing) the output. Is exactly the same hardware?

Of course, I do understand that many of you are coming from a software engineering background and no one has explained this "formal" details of the language to you. Please do take the explanations above as strongly opinionated and VERY friendly. I don't mean to say you are doing it wrong, not at all. As long as the hardware you get generated is the one you mean to generate, it's ok. Just be aware that in hardware (as in software) "just compiling and running" does not mean it is ok. You need to read synthesis logs (or run lots of post-synthesis and post-implementation regression tests).

@ozbenh
Copy link
Contributor

ozbenh commented Jul 2, 2020

So I believe I understand what you mean. That said, it's my experience (but let me know if I'm wrong) that synthesizers generally ignore the sensitivity list of processes completely anyways. In fact in Microwatt we have bucketloads of processes without any. What we do try to do however is clearly separate synchronous processes (ie, registers) and combinational logic.

So in the specific case of the wishbone arbiter, early_sel is "early" specifically because it's combinational. A separate process latches it. This allows us to update the muxes on the same cycle as the cyc is asserted and thus avoid a cycle latency when going from idle to having a master selected.

In the same file you can see wishbone_candidate is another combinational process with a "all" sensitivity list (because doing an accurate one here is tricky) which is effectively an encoder.

And finally I have a 3rd process that is synchronous which represent the register of the selected master.

This is based on a lot of VHDL code we've seen out there, and it's written with generating HW in mind. The one thing maybe
I could have done is move early_sel generation to its own process to separate it from the mux itself but if you look at the rest of microwatt which is using the "2 processes" vhdl technique, this is really minor compared to our gigantic combinational processes :)

Fundamentally, we rely on the fact that the synthesizer turns a process into a large logic equation, which I assume it normalizes and simplifies, in order to generate a corresponding combinatorial circuit.

@eine
Copy link

eine commented Jul 2, 2020

So I believe I understand what you mean. That said, it's my experience (but let me know if I'm wrong) that synthesizers generally ignore the sensitivity list of processes completely anyways. In fact in Microwatt we have bucketloads of processes without any.

I'm calling @Paebbels and @JimLewis here for advice. It is my understanding that, since all was introduced in VHDL 2008, most synthesizers are expected to resolve the sensitivity list on their own. Maybe the need to be accurate when specifying the list of signals in the sensitivity list goes back to earlier versions of the standard (and the tools). However, I would expect a process with an empty sensitivity list to do effectively nothing at all. I.e., not to be "executed" even once, and removed from synthesis. I might be outdated here...

What we do try to do however is clearly separate synchronous processes (ie, registers) and combinational logic.

Yes, this is a perfectly valid approach. That's why I said that the preference to use concurrent statements only (mostly) is very opinionated.

So in the specific case of the wishbone arbiter, early_sel is "early" specifically because it's combinational. A separate process latches it.

Since there is no "standard style guide" for VHDL, this is also acceptable.

This allows us to update the muxes on the same cycle as the cyc is asserted and thus avoid a cycle latency when going from idle to having a master selected.

The style you use (a single or multiple processes, or not using processes) should not have any impact on you being able to update the muxes on the same cycle as the cyc is asserted. If tools are able to transform one representation into the other while preserving the same behaviour, humans can also do it. If there is some latency, it is because you are not doing it purely combinatorial. Which, BTW, you might want to do on purpose in other cases.

In the concurrent version above, all muxes are updated as soon as early_sel is updated. Well, there is a latency because of the routing and the gates, but it is immediate from a "synchronous logic" point of view.

Fundamentally, we rely on the fact that the synthesizer turns a process into a large logic equation, which I assume it normalizes and simplifies, in order to generate a corresponding combinatorial circuit.

I guess this is the point. From a hardware description perspective, the synthesizer does not need to normalize or simplify anything, unless told to do so. From a software perspective, the compiler is expected to provide the faster/smaller code that is behaviouraly correct, thus involving multiple opmitizations. So, the more sequential logic you use, the more you rely on the compiler doing the simplification. OTOH, when targeting FPGAs, the target is, in fact, a huge array of configurable/programmable bits. These details are probably more relevant when targeting ASICs.


I'm trying to build microwatt to check some code before posting it, but I get the following error:

# DOCKER=1 make
docker run --rm -v //t/microwatt://src:z -w //src ghdl/ghdl:buster-llvm-7 ghdl -c --std=08 -frelaxed -Psim-unisim -Wl,sim_vhpi_c.o -Wl,sim_bram_helpers_c.o -Wl,sim_console_c.o -Wl,sim_jtag_socket_c.o decode_types.vhdl common.vhdl wishbone_types.vhdl fetch1.vhdl utils.vhdl 
plru.vhdl cache_ram.vhdl icache.vhdl decode1.vhdl helpers.vhdl insn_helpers.vhdl gpr_hazard.vhdl cr_hazard.vhdl control.vhdl decode2.vhdl register_file.vhdl cr_file.vhdl crhelpers.vhdl ppc_fx_insns.vhdl rotator.vhdl logical.vhdl countzero.vhdl multiply.vhdl divider.vhdl execute1.vhdl loadstore1.vhdl mmu.vhdl dcache.vhdl writeback.vhdl core_debug.vhdl core.vhdl wishbone_arbiter.vhdl wishbone_bram_wrapper.vhdl sync_fifo.vhdl wishbone_debug_master.vhdl xics.vhdl syscon.vhdl soc.vhdl spi_rxtx.vhdl spi_flash_ctrl.vhdl sim_console.vhdl sim_pp_uart.vhdl sim_bram_helpers.vhdl sim_bram.vhdl sim_jtag_socket.vhdl sim_jtag.vhdl dmi_dtm_xilinx.vhdl sim_16550_uart.vhdl core_tb.vhdl -e core_tb
error: cannot load package "vcomponents"
ghdl: compilation error
make: *** [Makefile:101: core_tb] Error 1

@ozbenh
Copy link
Contributor

ozbenh commented Jul 2, 2020

Hrm, can you open a separate issue for this ? it probably works without docker, not sure why. I build unisim/vcomponents as a library first then import it, maybe it doesn't find the unisim-obj08.cf that should get generated in sim-unisim/

@JimLewis
Copy link

JimLewis commented Jul 2, 2020

Seems like there are two questions:

  1. First, the OP are conditional assignments allowed in processes in VHDL-2008.
    Yes. Absolutely. I would not change the code unless I was the original author.

  2. Process sensitivity lists.
    A process requires a mechanism to suspend - otherwise it is an infinite loop
    and simulation time will not advance (aka the simulator will hang).
    VHDL provides two mechanisms to suspend, either a sensitivity list or
    a wait statement- all wait statements suspend for at least a simulation (aka delta) cycle.
    The language does not permit both a sensitivity list and a wait statement.

Hence, it is bad if a process does not have a sensitivity list or wait statement.
Without these a tool will warn, possible infinite loop. Sometimes I get these
warnings in testbenches test sequencers that call procedures which have
wait statements in them - generally I avoid the warning by adding a "wait ;"
as the last line of the process.

@JimLewis
Copy link

JimLewis commented Jul 2, 2020

Continuing with 2:
A process without a sensitivity list immediately starts running
and does not suspend until it hits the first wait statement.

@ozbenh
Copy link
Contributor

ozbenh commented Jul 2, 2020

So a few things to clarify here.

Jim, you seem to confirm my earlier statement implicitly. A synthesizer ignores the sensitivity list (well, Vivado does warn if you get it wrong but that's about it). This isn't surprising, it uses the actual "code" to infer a logic equation and implements that. The sensitivity list is mostly useful for simulators.

Then when I say we don't always have a sensitivity list, what I mean here is that we often use "all". This is actually a sensitivity list, but in effect it means to run whenever any signal changes. It sounds wasteful, and maybe it is, but it wouldn't be particularly difficult for a simulator to build a more precise list on the first execution pass or at compile time and use that instead.

We use that a lot in Microwatt because Anton wanted to use the "2 process" VHDL coding technique which means you have a simple synchronous process to implement the pipeline register and a huge combinatorial one that is a function of that register and all inputs. And also I'm a tad lazy :)

That said, if ghdl doesn't do the above optimisation, then I suppose we might be able to speed things up by hand-crafting more precise sensitivity lists for various processes. I know I have myself been lax lately and tended to use either "all" for combinatorial processes or "clk" for synchronous ones.

@ozbenh
Copy link
Contributor

ozbenh commented Jul 2, 2020

Note: It surprises me that the concept of sensitivity list is still around. I might be missing something here but it sounds fairly trivial to me for anything that parses vhdl or verilog to infer a sensitivity list based on the signals used in the process (or always@ block).

I understand that in theory, one could purposefully write "code" that only interprets various signals when a specific signal changes, and this is in effect what we do in synchronous processes, however requiring an explicit list seems extremely error prone to me.

It would make sense to update the formal definition of HDLs such that sensitivity list become obsolete. They should be entirely derived from the actual code. Any input outside of a if *edge and any condition of such an if *edge.

@JimLewis
Copy link

JimLewis commented Jul 2, 2020

@ozbenh A property of combinational logic is that it is sensitive to all signals that are read in the process. Hence, in RTL for combinational logic, "all" is the right answer.

Yes with flip-flops, only clock and any asynchronous control is required. Maybe a good optimizer would do as well with "all" if we are careful about how we code (which we must be for RTL).

WRT sensitivity lists, you are forgetting about behavioral code and testbenches where I want to run only if a specific signal changes.

@JimLewis
Copy link

JimLewis commented Jul 2, 2020

@ozbenh WRT RTL coding guidelines, when they are too rigid I have seen them do bad things to code.

I tend to draw a block diagram of the code and then make the code a reflection of the block diagram. Today, since the tools are now generally good enough, the biggest thing we need to do is write code that is readable and maintainable - maybe with a few optimizations here and there. Generally this means I code a separate process for each separate item in the block diagram - perhaps grouping a couple of related things together.

To me readable RTL code means that a reviewer should be able to easily draw the block diagram of the code by reading the code - and quickly. If you can't do this then the coding guidelines need to be revised until you can.

A disciplined person can write readable code using only a single process. However, single process code leads people to temptation to mix things that break the spatial locality of code and make readability more challenging. Generally as the space increases between assignments to the same output, understanding decreases - of course there are exceptions, such as statemachines.

BTW, one of the thoughts about single process implementation is simulation optimization - which was done at a time when compilers were not as smart as they are now and computers were slow.

@eine
Copy link

eine commented Jul 2, 2020

That said, if ghdl doesn't do the above optimisation, then I suppose we might be able to speed things up by hand-crafting more precise sensitivity lists for various processes. I know I have myself been lax lately and tended to use either "all" for combinatorial processes or "clk" for synchronous ones.

I'd like to stress again that you are doing it correctly. None of the style issues you mentioned is wrong per se. Moreover, using all for combinatorial processes and clk only for synchronous ones (without any async reset) is the way to go since VHDL 2008 was published.

All the discussion above is about style preferences, readability and maintainability. GHDL today will accept and it should perform equally well with any of them.

I understand that in theory, one could purposefully write "code" that only interprets various signals when a specific signal changes, and this is in effect what we do in synchronous processes, however requiring an explicit list seems extremely error prone to me.

I see it as roughly equivalent to on event listeners when doing web frontend development in JavaScript (which is another highly concurrent ecosystem). You can register multiple listeners that will execute the same function, but not all the variables that are used in the functions need to trigger it. In VHDL, the sensitivity list is how you/we declare multiple event listeners at once.

I guess this is somehow conflicting with the message I tried to deliver before, because this is mostly useful for non-synthesizable code. For synthesis, all and clk should be used almost exclusively.


@JimLewis, thanks a lot for your explanations. I must say I feel a little embarrassed with your last comment because you explained the point much shorter and clear than I did...

To me readable RTL code means that a reviewer should be able to easily draw the block diagram of the code by reading the code - and quickly. If you can't do this then the coding guidelines need to be revised until you can.

In fact, my first contact with GHDL 4 years ago was about being able to draw that block diagram automatically from the VHDL code. Not a "technology diagram", neither an "RTL diagram", but a "diagram of the code". Meaning that any entity should be a black box, each process should also be a black box, and concurrent statements in each architecture can be grouped in a single or multiple black boxes. See ghdl/ghdl#111.

The TL;DR is that tools that do "elaboration for simulation" can provide a tree of the architecture, but cannot draw the lines between the blocks/modules/blackboxes. OTOH, "elaboration for synthesis" does have the information to draw the connections, but it does "too much" elaboration and some information about processes or how concurrent statements are described in the code might be lost.

The ecosystem changed quite a lot in the last 4 years. Back then GHDL didn't support synthesis and it was absolutely out of Tristan's scope. Now, ghdl + yosys allow to have "RTL diagrams" with open source tools (e.g. netlistsvg). Moreover ghdl --synth is an unoptimized, unflattened VHDL 1993 netlist which tries to preserve naming of the original sources.

Hence, today, it might be possible to use/extend ghdl --synth along with KiCAD for not only adding FPGA/CPLD devices to the schematics, but for drawing the VHDL sources that correspond to those. I.e., the same schematic layout/manipulation engine might be used to design the board and the chip (down to RTL VHDL). Of course, this is just daydreaming because I don't have the knowledge to implement it myself. I keep trying periodically, tho.

See f4pga/ideas#41, which is a recent discussion focused on Verilog.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants