## Agile Hardware Design
***
# Decoupling

<img src="./images/logo.svg" alt="agile hardware design logo" style="float:right"/>

## Prof. Scott Beamer 


## [CSE 228A](https://classes.soe.ucsc.edu/cse228a/Winter24/)
Note by Peter Hanping Chen:
1. Why did I modify this notebook?
I could not use Prof. Scott Beamer's configuration file (`chisel_deps.sc`), which fails to resolve its dependencies on my Ubuntu system.
2. Instead, I used the UCB Chisel Bootcamp configuration file (`load-ivy.sc`), loaded below.
Since the dependencies in `load-ivy.sc` are not 100% compatible with those in `chisel_deps.sc`, I made some modifications to make the examples work.
3. `source/load-ivy.sc` comes from https://github.com/freechipsproject/chisel-bootcamp

## Plan for Today

* Scala case classes
* Decoupling blocks in Chisel
* Chisel Queue demo

## Motivation for Decoupling Components

* _**Decoupled**_ - connection with time insensitivity
  * Some flexibility about when data is sent or received
* Decoupling can _simplify design_, since each component can be considered individually
* Decoupling can improve _reusability_, since a component is more flexible about its timing interactions

## Loading The Chisel Library Into a Notebook

In [45]:
// Loading chisel_deps.sc fails on this machine with:
//   CompilationError: Failed to resolve ivy dependencies:Error downloading edu.berkeley.cs:chiseltest_2.12:0.6.2
// Attempts that triggered the error:
//   interp.load.module(os.Path(s"${System.getProperty("user.dir")}/../resource/chisel_deps.sc"))
//   val path = System.getProperty("user.dir") + "/source/chisel_deps.sc"
// Workaround: load the Chisel Bootcamp's load-ivy.sc instead.
val path = System.getProperty("user.dir") + "/source/load-ivy.sc"
println("path: " + path)

path: /home/peter/AIU/AIU_CS800_Chisel/500_UCSC_HWD/007_Decoup/001_Code/source/load-ivy.sc


path: String = "/home/peter/AIU/AIU_CS800_Chisel/500_UCSC_HWD/007_Decoup/001_Code/source/load-ivy.sc"

In [46]:
interp.load.module(ammonite.ops.Path(java.nio.file.FileSystems.getDefault().getPath(path)))

In [47]:
import chisel3._
import chisel3.util._
import chiseltest._
import chiseltest.RawTester.test

import chisel3._
import chisel3.util._
import chiseltest._
import chiseltest.RawTester.test

In [48]:
// Test
class RegLand extends Module {
    val io = IO(new Bundle {
        val in  = Input(Bool())
        val en  = Input(Bool())
        val out = Output(Bool())
    })
    val r = Reg(Bool())
//    val r = RegInit(0.B)
    r := io.in
    io.out := r
//     io.out := RegNext(io.in, 0.B)
//     io.out := RegEnable(io.in, 0.B, io.en)
}
println (getVerilog(new RegLand))

Elaborating design...
Done elaborating.
module RegLand(
  input   clock,
  input   reset,
  input   io_in,
  input   io_en,
  output  io_out
);
`ifdef RANDOMIZE_REG_INIT
  reg [31:0] _RAND_0;
`endif // RANDOMIZE_REG_INIT
  reg  r; // @[cmd47.sc 7:16]
  assign io_out = r; // @[cmd47.sc 10:12]
  always @(posedge clock) begin
    r <= io_in; // @[cmd47.sc 9:7]
  end
// Register and memory initialization
`ifdef RANDOMIZE_GARBAGE_ASSIGN
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_INVALID_ASSIGN
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_REG_INIT
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_MEM_INIT
`define RANDOMIZE
`endif
`ifndef RANDOM
`define RANDOM $random
`endif
`ifdef RANDOMIZE_MEM_INIT
  integer initvar;
`endif
`ifndef SYNTHESIS
`ifdef FIRRTL_BEFORE_INITIAL
`FIRRTL_BEFORE_INITIAL
`endif
initial begin
  `ifdef RANDOMIZE
    `ifdef INIT_RANDOM
      `INIT_RANDOM
    `endif
    `ifndef VERILATOR
      `ifdef RANDOMIZE_DELAY
        #`RANDOMIZE_DELAY begin end
      `else
        #0.002 begin 

defined class RegLand

## Scala Case Classes

* Special type of class with additional features built in
  * Companion object with an `apply` factory method (don't need `new` to instantiate)
  * All constructor parameters are automatically public and immutable (don't need to mark them `val`)
  * Automatic implementations of `toString`, `equals`, `hashCode`, and `copy`
  * Great for pattern matching (future lecture)
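
As a rough illustration of the features listed above, the sketch below contrasts a plain class with a case class. `PlainPoint` and `Point` are hypothetical names invented for this example and do not appear elsewhere in the lecture.

```scala
// Hypothetical classes for illustration only.
class PlainPoint(val x: Int, val y: Int)   // plain class: fields need `val`, instances need `new`
case class Point(x: Int, y: Int)           // case class: fields are public, `new` is optional

val a = new PlainPoint(1, 2)
val b = new PlainPoint(1, 2)
println(a == b)          // false: plain classes compare by reference by default

val p = Point(1, 2)      // built via the companion object's apply method
val q = Point(1, 2)
println(p == q)          // true: case classes compare field by field
println(p)               // Point(1,2), from the generated toString
println(p.copy(y = 5))   // Point(1,5), p itself is unchanged
```

The field-by-field equality is what makes case classes convenient as parameter objects: two parameter sets with the same values are interchangeable.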


### Case Class with Parameters ###

In [9]:
// https://docs.scala-lang.org/tour/case-classes.html
case class Message(sender: String, recipient: String, body: String)
val message1 = Message("guillaume@quebec.ca", "jorge@catalonia.es", "Ça va ?")

println("message1.sender: " + message1.sender)  // prints guillaume@quebec.ca


message1.sender: guillaume@quebec.ca


defined class Message
message1: Message = Message(
  "guillaume@quebec.ca",
  "jorge@catalonia.es",
  "\u00c7a va ?"
)

### Case class with Comparison ###

In [10]:
// https://docs.scala-lang.org/tour/case-classes.html
case class Message(sender: String, recipient: String, body: String)

val message2 = Message("jorge@catalonia.es", "guillaume@quebec.ca", "Com va?")
val message3 = Message("jorge@catalonia.es", "guillaume@quebec.ca", "Com va?")
val messagesAreTheSame = message2 == message3
println ("messagesAreTheSame: " + messagesAreTheSame)  // true

messagesAreTheSame: true


defined class Message
message2: Message = Message(
  "jorge@catalonia.es",
  "guillaume@quebec.ca",
  "Com va?"
)
message3: Message = Message(
  "jorge@catalonia.es",
  "guillaume@quebec.ca",
  "Com va?"
)
messagesAreTheSame: Boolean = true

### Case Class Copy ###

In [13]:
// https://docs.scala-lang.org/tour/case-classes.html
// Create message5 as a copy of message4 with a different sender and recipient.
case class Message(sender: String, recipient: String, body: String)
val message4 = Message("julien@bretagne.fr", "travis@washington.us", "Me zo o komz gant ma amezeg")
val message5 = message4.copy(sender = message4.recipient, recipient = "claire@bourgogne.fr")
println ("message5.sender: " + message5.sender)   // travis@washington.us
println ("message5.recipient: " + message5.recipient) // claire@bourgogne.fr
println ("message5.body: " + message5.body)  // "Me zo o komz gant ma amezeg"

message5.sender: travis@washington.us
message5.recipient: claire@bourgogne.fr
message5.body: Me zo o komz gant ma amezeg


defined class Message
message4: Message = Message(
  "julien@bretagne.fr",
  "travis@washington.us",
  "Me zo o komz gant ma amezeg"
)
message5: Message = Message(
  "travis@washington.us",
  "claire@bourgogne.fr",
  "Me zo o komz gant ma amezeg"
)

### Case Class Example ###

In [3]:
case class Movie(name: String, year: Int, genre: String) {
    def decade(): String = (year - year%10) + "s"
}

val m1 = Movie("Gattaca", 1997, "drama")
println("m1.genre: " + m1.genre)
val m2 = Movie("The Avengers", 1998, "action")
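m2.copy(year=2012)  // returns a new Movie (shown as res2_4 below); m2 itself is unchanged
println("m2.genre: " + m2.genre)
println("m2.decade: " + m2.decade)  // still 1990s, because m2 was not modified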

m1.genre: drama
m2.genre: action
m2.decade: 1990s


defined class Movie
m1: Movie = Movie("Gattaca", 1997, "drama")
m2: Movie = Movie("The Avengers", 1998, "action")
res2_4: Movie = Movie("The Avengers", 2012, "action")

## Using `case class` for Parameters in Chisel

In [50]:
case class CounterParams(limit: Int, start: Int = 0) {
    val width = log2Ceil(limit + 1)
}

class MyCounter(cp: CounterParams) extends Module {
    val io = IO(new Bundle {
        val en  = Input(Bool())
        val out = Output(UInt(cp.width.W))
    })
    val count = RegInit(cp.start.U(cp.width.W))
    when (io.en) {
        when (count < cp.limit.U) {
            count := count + 1.U
        } .otherwise {
            count := cp.start.U
        }
    }
    io.out := count
}

//printVerilog(new MyCounter(CounterParams(14)))

defined class CounterParams
defined class MyCounter
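
One possible benefit of bundling parameters in a case class is that they can be validated and derived from one another in plain Scala, before any hardware is elaborated. The sketch below is a hypothetical variant (`CheckedCounterParams` is not part of the lecture code). To stay runnable without Chisel, it assumes `BigInt(limit).bitLength` as a stand-in for `log2Ceil(limit + 1)`; the two agree for non-negative limits.

```scala
// Hypothetical variant of CounterParams with sanity checks; not part of the lecture code.
case class CheckedCounterParams(limit: Int, start: Int = 0) {
  require(limit > 0, s"limit must be positive, got $limit")
  require(start >= 0 && start <= limit, s"start must be within [0, $limit], got $start")
  // Bits needed to hold `limit`; stand-in for log2Ceil(limit + 1).
  val width: Int = BigInt(limit).bitLength
}

val base = CheckedCounterParams(limit = 14)
val wide = base.copy(limit = 100)      // derive a variant, keeping start = 0
println(s"base width: ${base.width}")  // 4
println(s"wide width: ${wide.width}")  // 7

// Invalid parameters fail at construction time.
val rejected = scala.util.Try(CheckedCounterParams(limit = 4, start = 9))
println(s"rejected: ${rejected.isFailure}")  // true
```

Because `copy` goes through the constructor, the checks and the derived `width` are re-evaluated for every variant.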

In [51]:
println (getVerilog(new MyCounter(CounterParams(14))))

Elaborating design...
Done elaborating.
module MyCounter(
  input        clock,
  input        reset,
  input        io_en,
  output [3:0] io_out
);
`ifdef RANDOMIZE_REG_INIT
  reg [31:0] _RAND_0;
`endif // RANDOMIZE_REG_INIT
  reg [3:0] count; // @[cmd49.sc 10:24]
  wire [3:0] _T_2 = count + 4'h1; // @[cmd49.sc 13:28]
  assign io_out = count; // @[cmd49.sc 18:12]
  always @(posedge clock) begin
    if (reset) begin // @[cmd49.sc 10:24]
      count <= 4'h0; // @[cmd49.sc 10:24]
    end else if (io_en) begin // @[cmd49.sc 11:18]
      if (count < 4'he) begin // @[cmd49.sc 12:35]
        count <= _T_2; // @[cmd49.sc 13:19]
      end else begin
        count <= 4'h0; // @[cmd49.sc 15:19]
      end
    end
  end
// Register and memory initialization
`ifdef RANDOMIZE_GARBAGE_ASSIGN
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_INVALID_ASSIGN
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_REG_INIT
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_MEM_INIT
`define RANDOMIZE
`endif
`ifndef RANDOM
`define RANDO

## Motivation for Handshaking Protocol

* It can already be difficult to correctly implement a sequential component, but what about two sequential components interacting?

* For today, let's only focus on transferring data
  * A _producer_ sending data to a _consumer_

* _**Challenge:**_ recognize when a side is (or is not) able to send/receive data

<img src="images/producer.svg" alt="ready/valid schematic" style="width:75%;margin-left:auto;margin-right:auto"/>

## Best to Distribute Control

* When to use _centralized_ vs _distributed_ control?
  * Common tradeoff throughout systems
  * Centralized can be more efficient and easier to implement (for small scale)
  * Distributed (peer-to-peer) can scale to larger designs much more easily
  * _Common outcome:_ centralized within components and distributed between them
  * Thus, question: _"At what scale to switch from centralized to distributed?"_

* For data transfer between components, may need ...
  * Ability for producer to indicate no data is being sent
  * Ability for consumer to indicate inability to receive data (_back pressure_)

## Ready/Valid Protocol

* Common hardware design pattern for producer-consumer data transfer

<img src="images/handshake-wave.svg" alt="ready/valid waveform" style="width:35%;float:right"/>

* _**valid**_ - output from producer indicating sending data

* _**ready**_ - output from consumer indicating able to receive

* _**bits**_ - the payload producer is sending consumer

* Transfer occurs when both _ready & valid_ in same cycle

<img src="images/readyValid.svg" alt="ready/valid schematic" style="width:75%;margin-left:auto;margin-right:auto"/>

<!-- for waveform
https://wavedrom.com/editor.html
{signal: [
  {name: 'clock', wave: 'p....'},
  {name: 'valid', wave: '01..0'},
  {name: 'ready', wave: '0.1..'},
  {name: 'bits', wave: 'x3.4x', data: ['d0', 'd1']},
]} -->
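
To make the transfer rule concrete, here is a small software model (plain Scala, not hardware) of the five cycles in this section's waveform. `Cycle` and `trace` are hypothetical names introduced only for this sketch.

```scala
// One cycle of the interface as seen by an observer.
// bits is None when the payload is not driven (the 'x' in the waveform).
case class Cycle(valid: Boolean, ready: Boolean, bits: Option[String])

// The five cycles of the waveform: valid = 0,1,1,1,0 and ready = 0,0,1,1,1.
val trace = Seq(
  Cycle(valid = false, ready = false, bits = None),
  Cycle(valid = true,  ready = false, bits = Some("d0")),  // producer waits: consumer not ready
  Cycle(valid = true,  ready = true,  bits = Some("d0")),  // transfer of d0
  Cycle(valid = true,  ready = true,  bits = Some("d1")),  // transfer of d1
  Cycle(valid = false, ready = true,  bits = None)         // consumer ready, nothing offered
)

// A transfer happens exactly in the cycles where valid and ready are both high.
val transfers = trace.zipWithIndex.collect {
  case (Cycle(true, true, Some(d)), i) => (i, d)
}
println(transfers)  // List((2,d0), (3,d1))
```

Note that `d0` is held on `bits` for two cycles: the producer keeps offering it until the consumer raises `ready`.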

## Chisel Supports Ready/Valid

* Best to use standard library's support for these patterns
  * Less code to write, less chance of error, standardization improves readability
* To use, wrap data to transfer with desired protocol
  * Library will add additional signals & provide helper functions
  * By default, data is sent in the output direction; use `Flipped` to reverse

### [Valid](https://javadoc.io/doc/edu.berkeley.cs/chisel3_2.13/latest/chisel3/util/Valid.html) - only `valid`

* Consumer can't say no
  * Must consume when sent
* Indicates the existence of data
  * Almost like a hardware equivalent of Scala's `Option`
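
The `Option` analogy can be sketched in plain Scala. This is only a software analogy, and `receive` is a hypothetical helper: `Some(x)` plays the role of `valid` being true with `bits` equal to `x`, while `None` plays the role of `valid` being false.

```scala
// Software analogy only, not hardware.
def receive(in: Option[Int]): String = in match {
  case Some(bits) => s"received $bits"   // like valid = true
  case None       => "nothing sent"      // like valid = false; bits carry no meaning
}

println(receive(Some(4)))  // received 4
println(receive(None))     // nothing sent
```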

### [Decoupled](https://javadoc.io/doc/edu.berkeley.cs/chisel3_2.13/latest/chisel3/util/Decoupled$.html) - `ready & valid`

* Consumer can apply backpressure
* _**BEWARE**_ of _combinational loops_
  * Avoid using ready/valid input to combinationally create ready/valid output
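
The following is a minimal sketch of how both directions of `Decoupled` might look in one module, while following the advice above: every ready/valid output depends only on a register, never combinationally on a ready/valid input. `OneEntryBuffer` is a hypothetical module written for this note, and it assumes the same Chisel version as the rest of the notebook (where `fire` takes no parentheses).

```scala
import chisel3._
import chisel3.util._

// Hypothetical one-entry buffer for illustration.
class OneEntryBuffer(w: Int) extends Module {
  val io = IO(new Bundle {
    val in  = Flipped(Decoupled(UInt(w.W)))  // receiving side: valid/bits are inputs, ready is an output
    val out = Decoupled(UInt(w.W))           // sending side: valid/bits are outputs, ready is an input
  })
  val full = RegInit(false.B)   // does the buffer currently hold a value?
  val data = Reg(UInt(w.W))

  io.in.ready  := !full         // depends only on state, not on io.out.ready
  io.out.valid := full          // depends only on state, not on io.in.valid
  io.out.bits  := data

  when (io.out.fire) { full := false.B }                    // value handed to the consumer
  when (io.in.fire)  { full := true.B; data := io.in.bits } // value accepted from the producer
}
```

The price of avoiding the combinational paths is throughput: this buffer accepts a new value at most every other cycle, since it cannot fill and drain in the same cycle.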

## Combinational Loops

* (Uncontrolled) feedback paths that do NOT pass through state elements (registers or memories)
    * State elements provide _synchronization_ and thus control feedback
    * Generated hardware can have unpredictable values, or even get trapped in metastable state
* Generally want to avoid combinational loops (usually a mistake)
    * Can sometimes prove will converge, but should be very deliberate

In [52]:
class LoopyCounter(width: Int) extends Module {
    val io = IO(new Bundle {
        val count = Output(UInt(width.W))
    })
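    // The commented-out line creates a combinational loop (io.count feeds itself) and causes an error:
//    io.count := io.count + 1.U
    // Passing the feedback through a register breaks the loop:
    io.count := RegNext(io.count + 1.U)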
}
//printVerilog(new LoopyCounter(4))

defined class LoopyCounter

In [53]:
println (getVerilog(new LoopyCounter(4)))

Elaborating design...
Done elaborating.
module LoopyCounter(
  input        clock,
  input        reset,
  output [3:0] io_count
);
`ifdef RANDOMIZE_REG_INIT
  reg [31:0] _RAND_0;
`endif // RANDOMIZE_REG_INIT
  reg [3:0] REG; // @[cmd51.sc 7:24]
  assign io_count = REG; // @[cmd51.sc 7:14]
  always @(posedge clock) begin
    REG <= io_count + 4'h1; // @[cmd51.sc 7:34]
  end
// Register and memory initialization
`ifdef RANDOMIZE_GARBAGE_ASSIGN
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_INVALID_ASSIGN
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_REG_INIT
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_MEM_INIT
`define RANDOMIZE
`endif
`ifndef RANDOM
`define RANDOM $random
`endif
`ifdef RANDOMIZE_MEM_INIT
  integer initvar;
`endif
`ifndef SYNTHESIS
`ifdef FIRRTL_BEFORE_INITIAL
`FIRRTL_BEFORE_INITIAL
`endif
initial begin
  `ifdef RANDOMIZE
    `ifdef INIT_RANDOM
      `INIT_RANDOM
    `endif
    `ifndef VERILATOR
      `ifdef RANDOMIZE_DELAY
        #`RANDOMIZE_DELAY begin end
      `else
        #0

<img src="images/combo.svg" alt="combinational loop example" style="width:60%;margin-left:auto;margin-right:auto"/>

## Example: Using Chisel `Valid` (1/2)

In [54]:
class MakeValid(w: Int) extends Module {
    val io = IO(new Bundle {
        val en  = Input(Bool())
        val in  = Input(UInt(w.W))
        val out = Valid(UInt(w.W))
    })
    io.out.valid := io.en
    io.out.bits := io.in
}

//printVerilog(new MakeValid(4))

defined class MakeValid

In [55]:
println (getVerilog(new MakeValid(4)))

Elaborating design...
Done elaborating.
module MakeValid(
  input        clock,
  input        reset,
  input        io_en,
  input  [3:0] io_in,
  output       io_out_valid,
  output [3:0] io_out_bits
);
  assign io_out_valid = io_en; // @[cmd53.sc 7:18]
  assign io_out_bits = io_in; // @[cmd53.sc 8:17]
endmodule



## Example: Using Chisel `Valid` (2/2)

In [56]:
class ValidReceiver(w: Int) extends Module {
    val io = IO(new Bundle {
        val in = Flipped(Valid(UInt(w.W)))
    })
    when (io.in.valid) {
        printf("  received %d\n", io.in.bits)
    }
}

// printVerilog(new ValidReceiver(4))
test(new ValidReceiver(4)) { c =>
    for (cycle <- 0 until 8) {
        c.io.in.bits.poke(cycle.U)
        println(s"cycle: $cycle")
        c.io.in.valid.poke((cycle%2 == 0).B)
        c.clock.step()
    }
}

Elaborating design...
Done elaborating.
cycle: 0
  received   0
cycle: 1
cycle: 2
  received   2
cycle: 3
cycle: 4
  received   4
cycle: 5
cycle: 6
  received   6
cycle: 7
test ValidReceiver Success: 0 tests passed in 10 cycles in 0.015906 seconds 628.70 Hz


defined class ValidReceiver

In [57]:
println (getVerilog(new ValidReceiver(4)))

Elaborating design...
Done elaborating.
module ValidReceiver(
  input        clock,
  input        reset,
  input        io_in_valid,
  input  [3:0] io_in_bits
);
  always @(posedge clock) begin
    `ifndef SYNTHESIS
    `ifdef PRINTF_COND
      if (`PRINTF_COND) begin
    `endif
        if (io_in_valid & ~reset) begin
          $fwrite(32'h80000002,"  received %d\n",io_in_bits); // @[cmd55.sc 6:15]
        end
    `ifdef PRINTF_COND
      end
    `endif
    `endif // SYNTHESIS
  end
endmodule



## Example: Using Chisel `Decoupled` (1/2)

In [58]:
class CountWhenReady(maxVal: Int) extends Module {
    val io = IO(new Bundle {
        val en  = Input(Bool())
        val out = Decoupled(UInt())
    })
    val advanceCounter = io.en && io.out.ready
    val (count, wrap) = Counter(advanceCounter, maxVal)
    io.out.bits := count
    io.out.valid := io.en
}

//printVerilog(new CountWhenReady(4))

defined class CountWhenReady

In [59]:
println (getVerilog(new CountWhenReady(4)))

Elaborating design...
Done elaborating.
module CountWhenReady(
  input        clock,
  input        reset,
  input        io_en,
  input        io_out_ready,
  output       io_out_valid,
  output [1:0] io_out_bits
);
`ifdef RANDOMIZE_REG_INIT
  reg [31:0] _RAND_0;
`endif // RANDOMIZE_REG_INIT
  wire  advanceCounter = io_en & io_out_ready; // @[cmd57.sc 6:32]
  reg [1:0] count; // @[Counter.scala 60:40]
  wire [1:0] _wrap_value_T_1 = count + 2'h1; // @[Counter.scala 76:24]
  assign io_out_valid = io_en; // @[cmd57.sc 9:18]
  assign io_out_bits = count; // @[cmd57.sc 8:17]
  always @(posedge clock) begin
    if (reset) begin // @[Counter.scala 60:40]
      count <= 2'h0; // @[Counter.scala 60:40]
    end else if (advanceCounter) begin // @[Counter.scala 118:17]
      count <= _wrap_value_T_1; // @[Counter.scala 76:15]
    end
  end
// Register and memory initialization
`ifdef RANDOMIZE_GARBAGE_ASSIGN
`define RANDOMIZE
`endif
`ifdef RANDOMIZE_INVALID_ASSIGN
`define RANDOMIZE
`endif
`ifdef

## `Decoupled` Helper Functions

* Convenience functions that wrap up functionality & improve readability ([code](https://github.com/chipsalliance/chisel/blob/0600e1e1875311fbe6ad8584f24def61594f79cc/src/main/scala/chisel3/util/Decoupled.scala#L48))
    * Internally, they are Scala functions working on Chisel things
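* `fire` - Bool that is true if and only if both ready & valid are true
* `enq(data)` - Drives bits with data and sets valid to true (doesn't check ready)
* `noenq()` - Sets valid to false (and marks bits as `DontCare`)
* `deq()`/`nodeq()` - Receiver-side counterparts: `deq()` sets ready to true and returns bits, `nodeq()` sets ready to false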
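
For the receiver side, a sketch along the following lines may help show `deq()` and `nodeq()` in use. `Accumulator` is a hypothetical module invented for this note; it adds up every value it accepts while `en` is high.

```scala
import chisel3._
import chisel3.util._

// Hypothetical receiver for illustration.
class Accumulator(w: Int) extends Module {
  val io = IO(new Bundle {
    val en  = Input(Bool())
    val in  = Flipped(Decoupled(UInt(w.W)))
    val sum = Output(UInt(w.W))
  })
  val acc = RegInit(0.U(w.W))
  when (io.en) {
    val data = io.in.deq()     // sets ready to true and returns bits
    when (io.in.valid) {       // deq() does not check valid, so check it here
      acc := acc + data        // addition wraps around at w bits
    }
  } .otherwise {
    io.in.nodeq()              // sets ready to false
  }
  io.sum := acc
}
```

As with `enq`, the helper only drives the signal on its own side of the handshake; deciding whether a transfer actually happened is still up to the surrounding logic.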

## Example: Using Chisel `Decoupled` (2/2)

In [60]:
class CountWhenReady(maxVal: Int) extends Module {
    val io = IO(new Bundle {
        val en  = Input(Bool())
        val out = Decoupled(UInt())
    })
    val (count, wrap) = Counter(io.out.fire, maxVal)
    when (io.en) {
        io.out.enq(count)
//         io.out.bits := count
//         io.out.valid := true.B
    } .otherwise {
        io.out.noenq()
//         io.out.bits := DontCare
//         io.out.valid := false.B
    }
}

// printVerilog(new CountWhenReady(3))

test(new CountWhenReady(3)) { c =>
    c.io.en.poke(true.B)
    for (cycle <- 0 until 7) {
        c.io.out.ready.poke((cycle%2 == 1).B)
        println(s"cycle: $cycle, count: ${c.io.out.bits.peek()}")
        c.clock.step()
    }
}

Elaborating design...
Done elaborating.
cycle: 0, count: UInt<1>(0)
cycle: 1, count: UInt<1>(0)
cycle: 2, count: UInt<1>(1)
cycle: 3, count: UInt<1>(1)
cycle: 4, count: UInt<2>(2)
cycle: 5, count: UInt<2>(2)
cycle: 6, count: UInt<1>(0)
test CountWhenReady Success: 0 tests passed in 9 cycles in 0.009036 seconds 995.96 Hz


defined class CountWhenReady

In [61]:
println (getVerilog(new CountWhenReady(3)))

Elaborating design...
Done elaborating.
module CountWhenReady(
  input        clock,
  input        reset,
  input        io_en,
  input        io_out_ready,
  output       io_out_valid,
  output [1:0] io_out_bits
);
`ifdef RANDOMIZE_REG_INIT
  reg [31:0] _RAND_0;
`endif // RANDOMIZE_REG_INIT
  wire  _T = io_out_ready & io_out_valid; // @[Decoupled.scala 40:37]
  reg [1:0] count; // @[Counter.scala 60:40]
  wire  wrap_wrap = count == 2'h2; // @[Counter.scala 72:24]
  wire [1:0] _wrap_value_T_1 = count + 2'h1; // @[Counter.scala 76:24]
  assign io_out_valid = io_en; // @[cmd59.sc 7:18 Decoupled.scala 47:20 Decoupled.scala 56:20]
  assign io_out_bits = count; // @[cmd59.sc 7:18 Decoupled.scala 48:19]
  always @(posedge clock) begin
    if (reset) begin // @[Counter.scala 60:40]
      count <= 2'h0; // @[Counter.scala 60:40]
    end else if (_T) begin // @[Counter.scala 118:17]
      if (wrap_wrap) begin // @[Counter.scala 86:20]
        count <= 2'h0; // @[Counter.scala 86:28]
      end 

## Using Queues to Handle Backpressure

* If traffic is bursty, can use a _queue_ to smooth traffic rate
  * Queue fills up when too much demand
  * When demand wanes, can drain queue
* A queue can't solve a throughput mismatch
  * If _production rate_ > _consumption rate_ at all times, the queue eventually fills and can't help
* A queue is a great place to use _decoupled_ interfaces
* Chisel's util provides `Queue` generator

<img src="images/queue.svg" alt="ready/valid schematic" style="width:65%;margin-left:auto;margin-right:auto"/>
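
The difference between bursty traffic and a sustained throughput mismatch can be illustrated with an idealized software model (plain Scala, not hardware). `occupancy` is a hypothetical helper; it assumes an unbounded queue and a consumer that removes up to `drainPerCycle` items per cycle, including items that arrived in that same cycle.

```scala
// Returns the queue occupancy after each cycle, given how many items arrive per
// cycle and a consumer that removes at most `drainPerCycle` items per cycle.
def occupancy(arrivals: Seq[Int], drainPerCycle: Int): Seq[Int] =
  arrivals.scanLeft(0) { (backlog, in) => math.max(0, backlog + in - drainPerCycle) }.tail

val bursty    = Seq(3, 0, 0, 3, 0, 0)   // average rate of 1 per cycle, arriving in bursts
val sustained = Seq(2, 2, 2, 2, 2, 2)   // always faster than the consumer

println(occupancy(bursty, 1))     // List(2, 1, 0, 2, 1, 0): bounded, a small queue suffices
println(occupancy(sustained, 1))  // List(1, 2, 3, 4, 5, 6): keeps growing
```

In the bursty case a queue with two entries absorbs the traffic completely. In the sustained case any finite queue eventually fills, after which the producer sees back pressure anyway.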

## Using Chisel's `Queue`

* Part of `util` ([docs](https://javadoc.io/doc/edu.berkeley.cs/chisel3_2.13/latest/chisel3/util/Queue.html))
* Uses `Decoupled` for both input and output
* Specify type and number of entries, e.g. `new Queue(UInt(4.W), 8)`
* Additional optional arguments
  * `pipe` - if full, allow an enqueue in the same cycle as a dequeue
  * `flow` - if empty, enqueued value is available for dequeue in the same cycle

<img src="images/queueReady.svg" alt="ready/valid schematic" style="width:85%;margin-left:auto;margin-right:auto"/>
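
Besides instantiating `new Queue(...)` as a module, the `Queue` companion object offers a factory that takes the producer-side interface directly and returns the dequeue interface. A minimal sketch, with `BufferedLink` as a hypothetical wrapper module and the option values chosen arbitrarily:

```scala
import chisel3._
import chisel3.util._

// Hypothetical wrapper module for illustration.
class BufferedLink(w: Int) extends Module {
  val io = IO(new Bundle {
    val in  = Flipped(Decoupled(UInt(w.W)))
    val out = Decoupled(UInt(w.W))
  })
  // Queue(...) instantiates a queue, connects io.in to its enq port,
  // and returns its deq port.
  io.out <> Queue(io.in, entries = 2, pipe = false, flow = true)
}
```

With `flow = true`, a value offered on `io.in` while the queue is empty shows up on `io.out` in the same cycle. That adds a combinational path from `io.in.valid` to `io.out.valid`, so it is worth keeping the earlier warning about combinational loops in mind when chaining such blocks.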

## Chisel `Queue` Demo (1/2)

In [62]:
class CountIntoQueue(maxVal: Int, numEntries: Int, pipe: Boolean, flow: Boolean) extends Module {
    val io = IO(new Bundle {
        val en  = Input(Bool())
        val out = Decoupled(UInt())
        val count = Output(UInt())
    })
    val q = Module(new Queue(UInt(), numEntries, pipe=pipe, flow=flow))
    val (count, wrap) = Counter(q.io.enq.fire, maxVal)
    q.io.enq.valid := io.en
    q.io.enq.bits := count
    io.out <> q.io.deq
    io.count := count // for visibility
}

// printVerilog(new CountIntoQueue(3,1,false,false))

defined class CountIntoQueue

In [63]:
println (getVerilog(new CountIntoQueue(3,1,false,false)))

Elaborating design...
Done elaborating.
module Queue(
  input        clock,
  input        reset,
  output       io_enq_ready,
  input        io_enq_valid,
  input  [1:0] io_enq_bits,
  input        io_deq_ready,
  output       io_deq_valid,
  output [1:0] io_deq_bits
);
`ifdef RANDOMIZE_MEM_INIT
  reg [31:0] _RAND_0;
`endif // RANDOMIZE_MEM_INIT
`ifdef RANDOMIZE_REG_INIT
  reg [31:0] _RAND_1;
`endif // RANDOMIZE_REG_INIT
  reg [1:0] ram [0:0]; // @[Decoupled.scala 218:16]
  wire [1:0] ram_io_deq_bits_MPORT_data; // @[Decoupled.scala 218:16]
  wire  ram_io_deq_bits_MPORT_addr; // @[Decoupled.scala 218:16]
  wire [1:0] ram_MPORT_data; // @[Decoupled.scala 218:16]
  wire  ram_MPORT_addr; // @[Decoupled.scala 218:16]
  wire  ram_MPORT_mask; // @[Decoupled.scala 218:16]
  wire  ram_MPORT_en; // @[Decoupled.scala 218:16]
  reg  maybe_full; // @[Decoupled.scala 221:27]
  wire  empty = ~maybe_full; // @[Decoupled.scala 224:28]
  wire  do_enq = io_enq_ready & io_enq_valid; // @[Decoupled.scala

## Chisel `Queue` Demo (2/2)

In [64]:
test(new CountIntoQueue(4,3,pipe=false,flow=false)) { c =>
    c.io.en.poke(true.B)
    c.io.out.ready.poke(false.B)
    for (cycle <- 0 until 4) {   // Fill up queue
        println(s"f count:${c.io.count.peek()} out:${c.io.out.bits.peek()} v:${c.io.out.valid.peek()}")
        c.clock.step()
    }
    println()
    c.io.en.poke(false.B)
    c.io.out.ready.poke(true.B)
    for (cycle <- 0 until 4) {   // Drain queue
        println(s"d count:${c.io.count.peek()} out:${c.io.out.bits.peek()} v:${c.io.out.valid.peek()}")
        c.clock.step()
    }
    println()
    c.io.en.poke(true.B)
    for (cycle <- 0 until 4) {   // Simultaneous
        println(s"s count:${c.io.count.peek()} out:${c.io.out.bits.peek()} v:${c.io.out.valid.peek()}")
        c.clock.step()
    }
}

Elaborating design...
Done elaborating.
f count:UInt<1>(0) out:UInt<1>(0) v:Bool(false)
f count:UInt<1>(1) out:UInt<1>(0) v:Bool(true)
f count:UInt<2>(2) out:UInt<1>(0) v:Bool(true)
f count:UInt<2>(3) out:UInt<1>(0) v:Bool(true)

d count:UInt<2>(3) out:UInt<1>(0) v:Bool(true)
d count:UInt<2>(3) out:UInt<1>(1) v:Bool(true)
d count:UInt<2>(3) out:UInt<2>(2) v:Bool(true)
d count:UInt<2>(3) out:UInt<1>(0) v:Bool(false)

s count:UInt<2>(3) out:UInt<1>(0) v:Bool(false)
s count:UInt<1>(0) out:UInt<2>(3) v:Bool(true)
s count:UInt<1>(1) out:UInt<1>(0) v:Bool(true)
s count:UInt<2>(2) out:UInt<1>(1) v:Bool(true)
test CountIntoQueue Success: 0 tests passed in 14 cycles in 0.020419 seconds 685.63 Hz
