# Advanced Chisel

### Introduction

Chisel is a framework that helps users write hardware generators.
The idea is to encode a designer's methodology in a program that can produce many variants of a circuit.
Some generators are narrow in scope and produce a small set of designs, for example an adder parameterized by the width of its operands.
Other generators are broad in scope and can produce circuits with a wide range of architectures, for example a rocket core that can be either in-order or out-of-order.

Most popular HDLs have some mechanism for writing generators, but limitations of the language often make sophisticated generators difficult to write.
Chisel makes writing sophisticated generators much easier because it is hosted in Scala.
This lets generator writers use Scala's powerful language features, and the software development practices they enable, which are not available in traditional HDLs.

In [2]:
import $ivy.`edu.berkeley.cs::chisel3:3.0-SNAPSHOT_2017-07-19` 
import $ivy.`edu.berkeley.cs::chisel-iotesters:1.1-SNAPSHOT_2017-07-19`
import chisel3._
import chisel3.iotesters.{ChiselFlatSpec, Driver, PeekPokeTester}
import chisel3.util._


## Module Parameterization

### Simple Parameterization

An important building block to writing hardware generators is writing a parameterized module.
Chisel `Module`s are implemented as Scala classes, and any Scala objects can be used as parameters to a `Module`.

Providing widths and vector sizes is the simplest style of parameterization and is commonly done in Verilog.
The following code block gives examples of this style of parameterization in Chisel.

Notice the use of `require()`.
Some values of a parameter may be nonsensical or unsupported by the generator.
`require()` lets the generator author make an elaboration-time assertion (checked while the Scala generator runs, before any hardware is emitted) with a message explaining what went wrong.
Note what happens when you change the values of the parameters in the last two lines.
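
To get a feel for how `require()` behaves without elaborating any hardware, here is a minimal plain-Scala sketch. `AdderParams` and `RequireDemo` are hypothetical names used only for illustration; the two checks are copied from the `Adder` module, and the sketch assumes nothing beyond the Scala standard library.

```scala
import scala.util.Try

// Hypothetical stand-in for a parameterized generator; no Chisel needed.
final case class AdderParams(inWidth: Int, outWidth: Int) {
  require(inWidth > 0 && outWidth > 0, s"Widths should be positive, got $inWidth and $outWidth")
  require(outWidth >= inWidth, s"Output width should not be smaller than input width ($outWidth < $inWidth)")
}

object RequireDemo {
  // Returns the error message if the parameters are rejected, None if they are accepted.
  def check(inWidth: Int, outWidth: Int): Option[String] =
    Try(AdderParams(inWidth, outWidth)).failed.toOption.map(_.getMessage)
}
```

A failed `require()` throws an `IllegalArgumentException` whose message carries the text you supplied, so `RequireDemo.check(4, 3)` returns the message about the output width while `RequireDemo.check(3, 4)` returns `None`.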

In [2]:
class Adder(inWidth: Int, outWidth: Int) extends Module {
    require(inWidth > 0 && outWidth > 0, s"Widths should be positive, got $inWidth and $outWidth")
    require (outWidth >= inWidth, s"Output width should not be smaller than input width ($outWidth < $inWidth)")
    
    val io = IO(new Bundle {
        val in0 = Input(UInt(inWidth.W))
        val in1 = Input(UInt(inWidth.W))
        val out = Output(UInt(outWidth.W))
    })
    
    io.out := io.in0 + io.in1
}

class VecAdder(inWidth: Int, outWidth: Int, vecSize: Int) extends Module {
    require (vecSize > 0, "Vector length should be positive")
    require(inWidth > 0 && outWidth > 0, "Widths should be positive")
    require (outWidth >= inWidth, "Output width should not be smaller than input width")

    
    val io = IO(new Bundle {
        val in0 = Input(Vec(vecSize, UInt(inWidth.W)))
        val in1 = Input(Vec(vecSize, UInt(inWidth.W)))
        val out = Output(Vec(vecSize, UInt(outWidth.W)))
    })
    
    for (i <- 0 until vecSize) {
        io.out(i) := io.in0(i) + io.in1(i)
    }
}

class AdderTester(c: Adder) extends PeekPokeTester(c) {
    poke(c.io.in0, 3)
    poke(c.io.in1, 4)
    step(1)
    expect(c.io.out, 7)
}

class VecAdderTester(c: VecAdder) extends PeekPokeTester(c) {
    for (i <- 0 until c.io.in0.length) {
        poke(c.io.in0(i), 3 + i)
        poke(c.io.in1(i), 4 + i)
        expect(c.io.out(i), 7 + 2 * i)
    }
}

Driver(() => new Adder(3, 4), "firrtl") { c => new AdderTester(c) }
Driver(() => new VecAdder(7, 8, 5), "firrtl") { c => new VecAdderTester(c) }

[info] [0.002] Elaborating design...
[info] [0.111] Done elaborating.
Total FIRRTL Compile Time: 259.0 ms
Total FIRRTL Compile Time: 30.2 ms
End of dependency graph
Circuit state created
[info] [0.002] SEED 1502320703820
test cmd1WrapperHelperAdder Success: 1 tests passed in 6 cycles taking 0.017796 seconds
[info] [0.004] RAN 1 CYCLES PASSED
[info] [0.000] Elaborating design...
[info] [0.024] Done elaborating.
Total FIRRTL Compile Time: 55.4 ms
Total FIRRTL Compile Time: 51.2 ms
End of dependency graph
Circuit state created
[info] [0.000] SEED 1502320704603
test cmd1WrapperHelperVecAdder Success: 5 tests passed in 5 cycles taking 0.013948 seconds
[info] [0.008] RAN 0 CYCLES PASSED

defined class Adder
defined class VecAdder
defined class AdderTester
defined class VecAdderTester
res1_4: Boolean = true
res1_5: Boolean = true

### More Advanced Parameterization

The kind of parameterization shown in `Adder` and `VecAdder` is very basic.
Chisel `Module`s are Scala classes, so anything that can be used as an argument to a Scala class constructor can be a parameter for a Chisel `Module`.

Here are a few LFSR implementations that are parameterized differently.
Note that the `io` for the `LFSR` is implemented as a separate class.
This makes it easy to write multiple versions of the `LFSR` with the same IO interface.

The first example is perhaps somewhat similar to how you would write this generator in a language like Verilog.
The module has two parameters: number of state bits and an integer representing the feedback polynomial.
If the `i`th LSB of `feedback` is high, the `i`th bit of state is included in the feedback.

In the second example, instead of representing `feedback` as an integer, we represent it as a function.
`feedback` takes `UInt` as an argument and produces a `Bool`.
This is possible because Scala is a functional programming language that treats functions as first-class objects (you can pass them around as arguments and treat them like any other object).

Is the second example better than the first?
In this case, it made the code shorter (although defining `feedback` as a function may take more lines of code than defining it as an integer).
Using a function in this case eliminates some bit manipulation code which can be hard to read or debug.
One potential downside to having feedback defined as a function is that you could pass a function that has state or isn't linear, which would mean this is no longer an LFSR.
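
The downside can be seen in a small software model. This is a plain-Scala sketch, not Chisel: `FeedbackFunctionDemo`, `Feedback`, `xorTaps`, and `andTaps` are hypothetical names, and the state is modelled as a `BigInt` instead of a `UInt`. Both functions have the same type, so the type checker accepts either, even though only the XOR version is linear.

```scala
object FeedbackFunctionDemo {
  // Software analogue of `UInt => Bool`: takes the current state, returns the bit to shift in.
  type Feedback = BigInt => Boolean

  // One step of an n-bit shift register driven by an arbitrary feedback function.
  def step(n: Int, feedback: Feedback)(state: BigInt): BigInt = {
    val mask = (BigInt(1) << n) - 1
    val newBit = if (feedback(state)) BigInt(1) else BigInt(0)
    ((state << 1) | newBit) & mask
  }

  // Linear feedback: XOR of bits 3 and 0.
  val xorTaps: Feedback = s => s.testBit(3) ^ s.testBit(0)
  // Not linear (AND), yet it has exactly the same type and is accepted just the same.
  val andTaps: Feedback = s => s.testBit(3) && s.testBit(0)
}
```

Starting from the all-ones state `15`, `step(4, xorTaps)` moves to `14`, whereas `step(4, andTaps)` stays stuck at `15`, so the second "LFSR" never leaves its reset state.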

The third example has one parameter: a list of `Boolean`s that indicate whether the bit in the corresponding position is included in the feedback polynomial.
This avoids the bit manipulation code of the first example while still enforcing that you are actually building an LFSR.

One thing to notice about the third example is that `n` is no longer a parameter.
The number of bits of state is set by the length of the list being passed in.
Also note that it is written using some functional programming constructs.
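
As a rough cross-check of the polynomial-list style, the following plain-Scala sketch models the same update rule in software: XOR the tapped bits, shift left, and insert the feedback bit. `LfsrModel` is a hypothetical helper, the state is a `BigInt` rather than a register, and the size limit of 20 bits is an arbitrary assumption that keeps the search small.

```scala
object LfsrModel {
  // One step: XOR the state bits selected by the polynomial, shift left, insert the result.
  def step(polynomial: Seq[Boolean], state: BigInt): BigInt = {
    val n = polynomial.length
    val mask = (BigInt(1) << n) - 1
    val feedback = polynomial.zipWithIndex.collect {
      case (sel, idx) if sel => state.testBit(idx)
    }.reduce(_ ^ _)
    ((state << 1) | (if (feedback) BigInt(1) else BigInt(0))) & mask
  }

  // Number of steps until the all-ones start state reappears, if it does within 2^n steps.
  def period(polynomial: Seq[Boolean]): Option[Int] = {
    val n = polynomial.length
    require(n > 1 && n <= 20, "This sketch only handles 2 to 20 bits of state")
    val start = (BigInt(1) << n) - 1
    val idx = Iterator.iterate(step(polynomial, start))(step(polynomial, _))
      .take(1 << n)
      .indexWhere(_ == start)
    if (idx >= 0) Some(idx + 1) else None
  }
}
```

For the taps `Seq(true, false, false, true)`, `LfsrModel.period` returns `Some(15)`, the maximal length for four bits of state.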

Which style of parameterization presented here is best?


In [16]:
class LFSRIO extends Bundle {
    val en  = Input(Bool())
    val out = Output(Bool())
    val state = Output(UInt())
}

class LFSRwithIntParams(n: Int, feedback: Int) extends Module {
    require(n > 1, "State must be at least 2 bits")
    
    val io = IO(new LFSRIO)
    
    val allOnes = (BigInt(1) << n) - 1 // n may be larger than the word size
    val state = RegInit(allOnes.U(n.W))
    val nextStates = Wire(Vec(n, Bool()))
    
    nextStates(0) := state(0)
    for (i <- 1 until n) {
        val sel = (feedback >> i) & 1
        if (sel != 0) {
            // this is a tap!
            nextStates(i) := state(i) ^ nextStates(i-1)
        } else {
            // not a tap, just pass through
            nextStates(i) := nextStates(i-1)
        }
    }
    
    io.out := state(0)
    when (io.en) {
        state := nextStates(n-1)
    }
    io.state := state
}

// Functions in scala are first class objects
// UInt => Bool is the type signature for a function that takes a UInt as an argument
// and returns a Bool.
// The input will be the state of the lfsr and the return value will be the new
// bit to shift in.
class LFSRwithFuncParam(n: Int, feedback: UInt => Bool) extends Module {
    require(n > 1, "State must be at least 2 bits")
    
    val io = IO(new LFSRIO)
    
    val allOnes = (BigInt(1) << n) - 1 // BigInt: n may be larger than the word size
    val state = RegInit(allOnes.U(n.W))
    val nextState = (state << 1) | feedback(state)
    
    io.out := state(0)
    when (io.en) {
        state := nextState
    }
    io.state := state
}

class LFSRwithPolynomialParam(polynomial: Seq[Boolean]) extends Module {
    require (polynomial.length > 1, "State must be at least 2 bits")
    
    val io = IO(new LFSRIO)
    
    val n = polynomial.length
    val allOnes = (BigInt(1) << n) - 1 // n may be larger than the word size
    val state = RegInit(allOnes.U(n.W))
    // e.g. Seq(1, 0, 1) -> Seq( (1,0), (0,1), (1,2) )
    val polyWithIdxs = polynomial.zipWithIndex
    // e.g. Seq( (1,0), (0,1), (1,2) ) -> Seq( (1,0), (1,2) )
    val polyWithIdxsFiltered = polyWithIdxs.filter( x => x._1 )
    // e.g. Seq( (1,0), (1,2) ) -> Seq(0, 2)
    val feedback = polyWithIdxsFiltered.map ( x => state(x._2) ).reduce( _ ^ _ )
    // the last three lines could be combined into one step with
    //val feedback = polynomial.zipWithIndex.collect {
    //  case (sel, idx) if sel => state(idx)
    //}.reduce(_ ^ _)
    val nextState = (state << 1) | feedback
    
    io.out := state(0)
    when (io.en) {
        state := nextState
    }
    io.state := state
}

println (chisel3.Driver.emit  (() => new LFSRwithPolynomialParam(Seq(1, 0, 0, 1).map(_!=0))))

[info] [0.000] Elaborating design...
[info] [0.009] Done elaborating.
;buildInfoPackage: chisel3, version: 3.0-SNAPSHOT_2017-07-19, scalaVersion: 2.11.11, sbtVersion: 0.13.15, builtAtString: 2017-07-19 18:56:34.453, builtAtMillis: 1500490594453
circuit cmd15WrapperHelperLFSRwithPolynomialParam : 
  module cmd15WrapperHelperLFSRwithPolynomialParam : 
    input clock : Clock
    input reset : UInt<1>
    output io : {flip en : UInt<1>, out : UInt<1>, state : UInt}
    
    clock is invalid
    reset is invalid
    io is invalid
    reg state : UInt<4>, clock with : (reset => (reset, UInt<4>("h0f"))) @[cmd15.sc 63:24]
    node _T_7 = bits(state, 0, 0) @[cmd15.sc 69:57]
    node _T_8 = bits(state, 3, 3) @[cmd15.sc 69:57]
    node feedback = xor(_T_7, _T_8) @[cmd15.sc 69:76]
    node _T_9 = shl(state, 1) @[cmd15.sc 74:28]
    node nextState = or(_T_9, feedback) @[cmd15.sc 74:34]
    node _T_10 = bits(state, 0, 0) @[cmd15.sc 76:20]
    io.out <= _T_10 @[cmd15.sc 76:12]
    

defined class LFSRIO
defined class LFSRwithIntParams
defined class LFSRwithFuncParam
defined class LFSRwithPolynomialParam

The ability to have more sophisticated objects as parameters to our `Module`s is very powerful.
Combined with the fact that we can write arbitrary Scala code alongside our Chisel code, this means we can write programs that generate low-level parameters from high-level requirements.

In the following example, we write an `MSequence` `Module` that generates its own polynomial parameter.
It uses pure Scala to find a generator polynomial that gives a maximal-length LFSR and then passes the polynomial to the `LFSRwithPolynomialParam` generator.
Don't worry too much about the details of how it finds the generator polynomial (which is done inside `object Galois { ... }`).

In [17]:
// Based on Saxena & McCluskey, "Primitive Polynomial Generation Algorithms: Implementation and Performance Analysis" (2004)
// http://crc.stanford.edu/crc_papers/CRC-TR-04-03.pdf
object Galois {
    def maxForDegree(n: Int): Long = {
        var max: Long = 1
        for (i <- 1 to n) {
            max *= 2
        }
        max - 1
    }
    def gp(degree: Int, l: Option[Int] = None, d: Option[Seq[Int]] = None): Seq[Int] = {
        val myL = l.getOrElse(degree - 1)
        val myD = d.getOrElse(scala.collection.mutable.ArrayBuffer.fill(degree + 1)(1))
        
        if (myL == 0) visit(myD) match {
            case Some(d) => d
            case _ => Seq()
        } else {
            val d0 = myD.updated(myL, 0)
            val d1 = myD.updated(myL, 1)
            val try0 = gp(degree, Some(myL - 1), Some(d0))
            if (try0.length > 0) return try0
            val try1 = gp(degree, Some(myL - 1), Some(d1))
            return try1
        }
        
    }
    def visit(d: Seq[Int]): Option[Seq[Int]] = {
        // println(s"visit() called on ${d.toString}")
        val n = d.length
        val max = maxForDegree(n)
        var f: Boolean = true
        var c: Long = 0
        var t: Int = 0
        val s = scala.collection.mutable.ArrayBuffer.fill(n)(1)
        do {
            c += 1
            t = 0
            for (i <- 0 until n) {
                t = (t ^ (s(i) & d(i)))
            }
            for (i <- 0 until n - 1) {
                s.update(i, s(i+1))
            }
            s.update(n-1, t)
            f = s.exists(_ == 0)
        } while (f)
        if (c == max) {
            Some(d)
        } else {
            None
        }
    }
}

class MSequence(nBits: Int) extends Module {
    val io = IO(new LFSRIO)
    
    // find polynomial corresponding to m-sequence with nBits of state
    val poly = Galois.gp(nBits - 1).map(_ != 0)
    
    val lfsr = Module(new LFSRwithPolynomialParam(poly))
    io <> lfsr.io
}

defined object Galois
res16_1: Seq[Int] = ArrayBuffer(1, 0, 0, 1)
defined class MSequence

# TODO keep working on this

In [3]:
case class JupyterListenerCallbacks(
    step: Option[() => Unit] = None,
    poke: Option[(String, BigInt) => Unit] = None,
    peek: Option[String => BigInt] = None,
    reset: Option[() => Unit] = None,
    finish: Option[() => Unit] = None
)

object JupyterListeners {
    private var listener: Option[JupyterListenerCallbacks] = None
    
    def register(l: JupyterListenerCallbacks): Unit = {
        require(listener.isEmpty, "Already have a registered listener!")
        listener = Some(l)
    }
    
    def finish(): Unit = {
        listener.get.finish.map { case finish => finish() }
        listener = None
    }
    
    def step(): Unit = listener.get.step.map { case step => step() }
    
    def peek(name: String): Option[BigInt] = {
        listener.get.peek.map { case peek => peek(name) }
    }
}

class JupyterTester[T <: Module](c: T) extends PeekPokeTester(c) {
    JupyterListeners.register(JupyterListenerCallbacks(
        step = Some( () => this.step(1) ),
        poke = Some( (s: String, b: BigInt) => { this.poke(s, b) }),
        peek = Some( (s: String) => this.peek(s)),
        reset = Some( () => this.reset()),
        finish = Some( () => this.finish )
    ))
}

defined class JupyterListenerCallbacks
defined object JupyterListeners
defined class JupyterTester

In [25]:
// println (chisel3.Driver.emit  (() => new MSequence(5)))

publish.html(
"""
<script>
function myFunction() {
  function output(out_type, out) {
      data = out.data["text/plain"];
      alert("Here it comes: " + data);
  }
  var kernel = IPython.notebook.kernel;
  var a = kernel.execute("val a = 5; return \"a\"", {'output': output})
}
</script>
<button onclick="myFunction()">Step</button>
""")


In [20]:
a

res19: Int = 4

### Type Parameterization

In the previous tutorial, we wrote a shift register. Unfortunately, it wasn't very flexible in what kind of inputs it could handle. If instead of a `Bool` we wanted a shift register for `SInt`, we would have to rewrite the shift register module.

In Scala, objects and functions aren't the only things we can treat as parameters. We can also treat types as parameters.

We usually need to provide a type constraint.
In this case, we want to be able to put objects in a bundle, connect (`:=`) them, and create registers with them (`RegNext`).
These operations cannot be done on arbitrary objects; for example `wire := 3` is illegal because `3` is a Scala `Int` rather than a Chisel `Data`, and Scala's static type checker rejects it.
If we use a type constraint to say that type `T` is a subclass of `Data`, then we can use `:=` on any objects of type `T` because `:=` is defined for all `Data`.
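
The effect of an upper type bound can be sketched in plain Scala. Everything here (`TypeBoundDemo`, `Signal`, `Unsigned`, `Signed`, `widest`) is a hypothetical stand-in, not part of Chisel; only the `T <: ...` bound is the point.

```scala
object TypeBoundDemo {
  // Hypothetical stand-ins for hardware types; only the bound matters here.
  trait Signal { def width: Int }
  final case class Unsigned(width: Int) extends Signal
  final case class Signed(width: Int) extends Signal

  // `T <: Signal` lets the body call `width` on any T and still return the precise type T.
  // Calling widest("a", "b") would be rejected at compile time: String is not a Signal.
  def widest[T <: Signal](a: T, b: T): T = if (a.width >= b.width) a else b
}
```

Because the result type is `T` rather than `Signal`, `widest(Unsigned(4), Unsigned(8))` is still known to be an `Unsigned`, which is the same reason a type-parameterized shift register can hand back outputs of the element type it was given.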

Here are two implementations of a simple shift register that take types as a parameter.
The first is written using a loop, and the second is written using `foldLeft`.
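
If the `foldLeft` formulation looks unfamiliar, this plain-Scala sketch shows the same pattern on ordinary values. `ShiftRegisterModel` is a hypothetical software model, not Chisel: `regs` holds what each register currently stores, and the fold accumulator carries the value arriving at the next stage, much as the hardware fold carries the signal that feeds the next register.

```scala
object ShiftRegisterModel {
  // Values visible on the taps in the current cycle: the input followed by every register.
  def taps[T](in: T, regs: Vector[T]): Vector[T] = in +: regs

  // One clock edge: each register captures the value that was arriving at its input.
  def step[T](in: T, regs: Vector[T]): Vector[T] = {
    val (_, next) = regs.foldLeft((in, Vector.empty[T])) {
      case ((arriving, captured), held) =>
        // This stage captures `arriving`; the value it held feeds the following stage.
        (held, captured :+ arriving)
    }
    next
  }
}
```

For example, `ShiftRegisterModel.step(9, Vector(1, 2, 3))` gives `Vector(9, 1, 2)`: the new input enters the first register and the oldest value falls off the end. The model is generic in `T`, so it works for any element type.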

Notice that the tester is also type parameterized!
We created a trait called `HasShiftRegisterIO[T]` that says `io` is of type `ShiftRegisterIO[T]`.
Both shift register implementations mix in this trait, so the tester can require `T <: Module with HasShiftRegisterIO[V]` and work with either `ShiftRegisterWithLoop[V]` or `ShiftRegister[V]`.
The tester has a second type parameter `V`, which is the argument to the `HasShiftRegisterIO` trait.
Unfortunately, the Scala compiler then has trouble inferring all the types, so we have to write them out explicitly, e.g. `new ShiftRegisterTester[ShiftRegisterWithLoop[SInt], SInt]`.

In [42]:
trait HasShiftRegisterIO[T <: Data] {
    def io: ShiftRegisterIO[T]
}

class ShiftRegisterIO[T <: Data](gen: T, n: Int) extends Bundle {
    require (n >= 0, "Shift register must have non-negative shift")
    
    val in = Input(gen.cloneType)
    val out = Output(Vec(n + 1, gen.cloneType)) // + 1 because in is included in out
}

class ShiftRegisterWithLoop[T <: Data](gen: T, n: Int) extends Module with HasShiftRegisterIO[T] {
    val io = IO(new ShiftRegisterIO(gen, n))
    
    io.out(0) := io.in
    for (i <- 0 until n) {
        io.out(i+1) := RegNext(io.out(i))
    }
}

class ShiftRegister[T <: Data](gen: T, n: Int) extends Module with HasShiftRegisterIO[T] {
    val io = IO(new ShiftRegisterIO(gen, n))
    
    io.out.foldLeft(io.in) { case (in, out) =>
        out := in
        RegNext(in)
    }
}

class ShiftRegisterTester[T <: Module with HasShiftRegisterIO[V], V <: Bits](c: T) extends PeekPokeTester(c) {
    println(s"Testing ShiftRegister of type ${c.io.in} and depth ${c.io.out.length}")
    for (i <- 0 until 10) {
        poke(c.io.in, i)
        println(s"$i: ${peek(c.io.out)}")
        step(1)
    }
}

println (chisel3.Driver.emit  (() => new ShiftRegister(UInt(5.W), 4)))
println (chisel3.Driver.emit  (() => new ShiftRegister(SInt(3.W), 5)))

Driver(() => new ShiftRegister(UInt(4.W), 5), "firrtl") { c => new ShiftRegisterTester[ShiftRegister[UInt], UInt](c) }
Driver(() => new ShiftRegister(SInt(6.W), 3), "firrtl") { c => new ShiftRegisterTester[ShiftRegister[SInt], SInt](c) }
Driver(() => new ShiftRegisterWithLoop(UInt(4.W), 5), "firrtl") { c => new ShiftRegisterTester[ShiftRegisterWithLoop[UInt], UInt](c) }
Driver(() => new ShiftRegisterWithLoop(SInt(6.W), 3), "firrtl") { c => new ShiftRegisterTester[ShiftRegisterWithLoop[SInt], SInt](c) }

[info] [0.000] Elaborating design...
[info] [0.004] Done elaborating.
;buildInfoPackage: chisel3, version: 3.0-SNAPSHOT_2017-07-19, scalaVersion: 2.11.11, sbtVersion: 0.13.15, builtAtString: 2017-07-19 18:56:34.453, builtAtMillis: 1500490594453
circuit cmd41WrapperHelperShiftRegister : 
  module cmd41WrapperHelperShiftRegister : 
    input clock : Clock
    input reset : UInt<1>
    output io : {flip in : UInt<5>, out : UInt<5>[5]}
    
    clock is invalid
    reset is invalid
    io is invalid
    io.out[0] <= io.in @[cmd41.sc 25:13]
    reg _T_14 : UInt, clock @[cmd41.sc 26:16]
    _T_14 <= io.in @[cmd41.sc 26:16]
    io.out[1] <= _T_14 @[cmd41.sc 25:13]
    reg _T_16 : UInt, clock @[cmd41.sc 26:16]
    _T_16 <= _T_14 @[cmd41.sc 26:16]
    io.out[2] <= _T_16 @[cmd41.sc 25:13]
    reg _T_18 : UInt, clock @[cmd41.sc 26:16]
    _T_18 <= _T_16 @[cmd41.sc 26:16]
    io.out[3] <= _T_18 @[cmd41.sc 25:13]
    reg _T_20 : UInt, clock @[cmd41.sc 26:16]
    _T_20 <= _T_18 @[c

defined trait HasShiftRegisterIO
defined class ShiftRegisterIO
defined class ShiftRegisterWithLoop
defined class ShiftRegister
defined class ShiftRegisterTester
res41_7: Boolean = true
res41_8: Boolean = true
res41_9: Boolean = true
res41_10: Boolean = true

## Advanced Bundles
So far we've talked about writing code that can generate the contents of a module.
Generators also need to be able to programmatically generate IOs.
The next few sections will talk about some more sophisticated things you can do with `Bundle`s in Chisel.

### DecoupledIO
Ready/valid handshakes are a very common interface.
Rather than creating new ready and valid signals in an ad-hoc way each time they are needed, Chisel provides helpers that make them easier to work with.
`Decoupled` is one such helper.
Wrapping a type with a call to `Decoupled(gen)` returns a bundle of type `DecoupledIO` with three fields:

  - `ready` (Input)
  - `valid` (Output)
  - `bits`  (Output of the type of `gen`)

The outputs and inputs can be reversed with a call to `Flipped()` if needed.
`DecoupledIO` also defines `fire()`, which returns a `Bool` indicating when a valid transaction is occurring (i.e. `valid && ready`).
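
The handshake rule can be illustrated with a small software model. This is a plain-Scala sketch with hypothetical names (`HandshakeModel`, `Cycle`, `transferred`); it only models the `valid && ready` condition and does not use Chisel's `DecoupledIO`.

```scala
object HandshakeModel {
  // One clock cycle as seen on a ready/valid interface carrying an Int payload.
  final case class Cycle(valid: Boolean, ready: Boolean, bits: Int) {
    // A transfer happens only in cycles where both sides agree.
    def fire: Boolean = valid && ready
  }

  // The payloads that were actually transferred over a trace of cycles.
  def transferred(trace: Seq[Cycle]): Seq[Int] =
    trace.collect { case c if c.fire => c.bits }
}
```

In a trace where the producer holds `1` valid for two cycles and the consumer only becomes ready in the second, the value `1` is transferred exactly once, in the cycle where both `valid` and `ready` are high.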

Chisel provides some other helpers, like `Valid()` (similar to `Decoupled` but with no `ready` signal, only `valid`) and `Irrevocable()` (same fields as `Decoupled`, but `valid` cannot go from 1 -> 0 unless `ready` is asserted).

The following code is an example of how to replace the somewhat ad-hoc `en` signal in `LFSRIO` with a `Decoupled` interface on `out`.

In [63]:
class SimpleLFSRIO extends Bundle {
    val out   = Decoupled(Bool())
    val state = Output(UInt())
}

class DecoupledLFSR(n: Int, feedback: UInt => Bool) extends Module {
    val io = IO(new SimpleLFSRIO)
    
    val allOnes    = (BigInt(1) << n) - 1 // BigInt: n may be larger than the word size
    val state      = RegInit(allOnes.U(n.W))
    val nextState  = (state << 1) | feedback(state)
    io.out.valid  := true.B // LFSR can always output valid data
    io.out.bits   := state(n-1)
    io.state      := state

    when (io.out.fire()) { // io.out.fire() = io.out.ready && io.out.valid for Decoupled
        state := nextState
    }
}

println(chisel3.Driver.emit( () => new DecoupledLFSR(4, {u: UInt => u(3) ^ u(0)}) ))

[info] [0.000] Elaborating design...
[info] [0.007] Done elaborating.
;buildInfoPackage: chisel3, version: 3.0-SNAPSHOT_2017-07-19, scalaVersion: 2.11.11, sbtVersion: 0.13.15, builtAtString: 2017-07-19 18:56:34.453, builtAtMillis: 1500490594453
circuit cmd62WrapperHelperDecoupledLFSR : 
  module cmd62WrapperHelperDecoupledLFSR : 
    input clock : Clock
    input reset : UInt<1>
    output io : {out : {flip ready : UInt<1>, valid : UInt<1>, bits : UInt<1>}, state : UInt}
    
    clock is invalid
    reset is invalid
    io is invalid
    reg state : UInt<4>, clock with : (reset => (reset, UInt<4>("h0f"))) @[cmd62.sc 10:29]
    node _T_9 = shl(state, 1) @[cmd62.sc 11:29]
    node _T_10 = bits(state, 3, 3) @[cmd62.sc 21:84]
    node _T_11 = bits(state, 0, 0) @[cmd62.sc 21:91]
    node _T_12 = xor(_T_10, _T_11) @[cmd62.sc 21:88]
    node nextState = or(_T_9, _T_12) @[cmd62.sc 11:35]
    io.out.valid <= UInt<1>("h01") @[cmd62.sc 12:19]
    node _T_14 = bits(state, 3, 3) 

defined class SimpleLFSRIO
defined class DecoupledLFSR

### Parameterized Bundles

We have already used parameterized bundles in previous sections, but they deserve a dedicated discussion.
Like `Module`s, Chisel `Bundle`s are classes that can take any valid Scala object as an argument.
Parameterized bundles can cause problems in some instances, usually with `cloneType`.
The following code gives a somewhat strange error unless you uncomment the commented-out `cloneType` implementation.

In [76]:
class ParamBundle(a: Int) extends Bundle {
    val in1 = Output(SInt(a.W))
    val in2 = Output(SInt(a.W))
    // override def cloneType = new ParamBundle(a).asInstanceOf[this.type]
}

println(chisel3.Driver.emit( () => new ShiftRegister(new ParamBundle(3), 4) ))

[info] [0.000] Elaborating design...

The error says a `cloneType` method is needed.
What is going on?
Every Chisel object is either a bound "hardware" object or an unbound "type" object.
Bound hardware objects actually exist in the circuit, like a register or a wire.
Unbound type objects are things like `UInt(4.W)`: they don't exist in the circuit, they just describe a type.
`cloneType` is a method used heavily inside Chisel to get an unbound type object from any object, including a bound hardware object.
Normally, Chisel can figure out how to do this automatically, but parameterized bundles sometimes confuse this process because Chisel has trouble figuring out where the constructor parameters come from.
Overriding `cloneType` and filling in the parameters manually solves the problem, as shown above.

### Optional Bundle Fields

Sometimes we want IOs to be optionally included or excluded.
Maybe there's some internal state that's nice to be able to look at for debugging, but you want to hide it when the generator is being used in a system.
Maybe some of your generator's inputs don't need to be connected all the time because there is a sensible default.

Optional bundle fields are one way to get this functionality.
An `Option` in Scala is either `Some`, in which case it holds a value, or `None`, in which case it is empty.
If the option is `Some`, calling `get` on it returns the object it contains.
If it is `None`, calling `get` on it raises an error.

In the following example, we show an LFSR where the state output is optional.
If you are debugging the LFSR, it could be nice to look at the state and see what's going on.
If you're using the LFSR as a PRBS generator, you don't have to see the state, just the output.
If the state output exists, the generator assigns to it, but if it doesn't it does nothing.

If the optional field were an input rather than an output, `getOrElse(...)` would be a useful thing to call on it.
If the option is `Some()`, calling `getOrElse(...)` on it returns the value of the `Some()`.
If the option is `None`, calling `getOrElse(default)` returns default.
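
The `get` and `getOrElse` behaviour described above can be tried out in plain Scala. This is only a sketch: `OptionalFieldDemo`, `Ports`, `effectiveGain`, and the default of `1` are hypothetical, and `Ports` merely stands in for a bundle with an optional input.

```scala
import scala.util.Try

object OptionalFieldDemo {
  // Hypothetical record standing in for a bundle with an optional input field.
  final case class Ports(gain: Option[Int])

  // Use the field when it is present, otherwise fall back to a default.
  def effectiveGain(p: Ports, default: Int = 1): Int = p.gain.getOrElse(default)

  // Calling `get` on an empty Option throws, so this returns true for None.
  def getFails(p: Ports): Boolean = Try(p.gain.get).isFailure
}
```

With `Ports(Some(5))` the effective gain is `5`; with `Ports(None)` it falls back to `1`, and a direct `get` would throw a `NoSuchElementException`.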

In [58]:
class OptionalLFSRIO(includeState: Boolean = true) extends Bundle {
    val out   = Output(Bool())
    val state = if (includeState) Some(Output(UInt())) else None
}

class OptionalStateLFSR(includeState: Boolean = true) extends Module {
    val io = IO(new OptionalLFSRIO(includeState))
    
    // simple 4-bit LFSR
    val state = RegInit(15.U(4.W))
    val nextState = (state << 1) | (state(3) ^ state(0))
    state := nextState
    io.out := state(0)
    // map can be used to conditionally connect state
    // an equivalent way would be
    // if (!io.state.isEmpty) io.state.get := state
    io.state.map { case s => s := state }
}

println(chisel3.Driver.emit( () => new OptionalStateLFSR(true) ))
println(chisel3.Driver.emit( () => new OptionalStateLFSR(false) ))

[info] [0.000] Elaborating design...
[info] [0.014] Done elaborating.
;buildInfoPackage: chisel3, version: 3.0-SNAPSHOT_2017-07-19, scalaVersion: 2.11.11, sbtVersion: 0.13.15, builtAtString: 2017-07-19 18:56:34.453, builtAtMillis: 1500490594453
circuit cmd57WrapperHelperOptionalStateLFSR : 
  module cmd57WrapperHelperOptionalStateLFSR : 
    input clock : Clock
    input reset : UInt<1>
    output io : {out : UInt<1>, state : UInt}
    
    clock is invalid
    reset is invalid
    io is invalid
    reg state : UInt<4>, clock with : (reset => (reset, UInt<4>("h0f"))) @[cmd57.sc 10:24]
    node _T_6 = shl(state, 1) @[cmd57.sc 11:28]
    node _T_7 = bits(state, 3, 3) @[cmd57.sc 11:42]
    node _T_8 = bits(state, 0, 0) @[cmd57.sc 11:53]
    node _T_9 = xor(_T_7, _T_8) @[cmd57.sc 11:46]
    node nextState = or(_T_6, _T_9) @[cmd57.sc 11:34]
    state <= nextState @[cmd57.sc 12:11]
    node _T_10 = bits(state, 0, 0) @[cmd57.sc 13:20]
    io.out <= _T_10 @[cmd57.sc 13:12]
  

defined class OptionalLFSRIO
defined class OptionalStateLFSR

### Zero-Width Wires

Types with width 0 are legal in Chisel, and this is frequently useful.
They are more or less equivalent to a literal 0 when used in operations, and they are not emitted in IOs.
When are they useful?

Frequently, widths are computed from other widths.
One very common case is that the width of one field is the base-2 logarithm of the size of another field, as in the following example, where the select width is `log2Ceil(n)` for a vector of `n` elements.
Rather than special-casing these situations, zero-width wires let your generator stay clean while still emitting the right Verilog.
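
To see where the zero width comes from, here is a plain-Scala sketch of the select-width calculation. `SelWidthDemo.selWidth` is a hypothetical stand-in written for this illustration; it is meant to mirror what `log2Ceil` returns for small positive `Int` arguments, not to reproduce Chisel's implementation.

```scala
object SelWidthDemo {
  // Number of bits needed to index n elements (ceiling of log2(n)) for a positive Int n.
  def selWidth(n: Int): Int = {
    require(n > 0, "Need at least one element")
    // n - 1 is the largest index; count the bits needed to represent it.
    32 - Integer.numberOfLeadingZeros(n - 1)
  }
}
```

A vector with four elements needs a 2-bit select, but a vector with a single element needs no select bits at all: `selWidth(1)` is `0`, which is exactly the case where a zero-width wire appears.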

In [56]:
class VectorSelectIO(n: Int) extends Bundle {
    val vecIn = Input(Vec(n, UInt(4.W))) // size first, then element type, as in VecAdder
    val sel   = Input(UInt(log2Ceil(n).W))
    val out   = Output(UInt(4.W))
}

class VectorSelect(n: Int) extends Module {
    val io = IO(new VectorSelectIO(n))
    io.out := io.vecIn(io.sel)
}

println(chisel3.Driver.emit( () => new VectorSelect(4) ))
println(chisel3.Driver.emit( () => new VectorSelect(1) ))

[info] [0.000] Elaborating design...
[info] [0.003] Done elaborating.
;buildInfoPackage: chisel3, version: 3.0-SNAPSHOT_2017-07-19, scalaVersion: 2.11.11, sbtVersion: 0.13.15, builtAtString: 2017-07-19 18:56:34.453, builtAtMillis: 1500490594453
circuit cmd55WrapperHelperVectorSelect : 
  module cmd55WrapperHelperVectorSelect : 
    input clock : Clock
    input reset : UInt<1>
    output io : {flip vecIn : UInt<4>[4], flip sel : UInt<2>, out : UInt<4>}
    
    clock is invalid
    reset is invalid
    io is invalid
    io.out <= io.vecIn[io.sel] @[cmd55.sc 9:12]
    

[info] [0.000] Elaborating design...
[info] [0.002] Done elaborating.
;buildInfoPackage: chisel3, version: 3.0-SNAPSHOT_2017-07-19, scalaVersion: 2.11.11, sbtVersion: 0.13.15, builtAtString: 2017-07-19 18:56:34.453, builtAtMillis: 1500490594453
circuit cmd55WrapperHelperVectorSelect : 
  module cmd55WrapperHelperVectorSelect : 
    input clock : Clock
    input reset : UInt<1>
    outp

defined class VectorSelectIO
defined class VectorSelect

### Multiple Clocks

So far, all of our modules have used Chisel's implicit clock and reset.
You can override either or both of them.
Clocks have their own type (`Clock`), whereas resets are synchronous and can be any `Bool`.
Here is an example of how to add new clocks and resets and what the resulting FIRRTL looks like.

In [79]:
import chisel3.experimental.{withClockAndReset, withClock, withReset}

class MultiClockExample extends Module {
    val io = IO(new Bundle {
        val clk1 = Input(Clock())
        val clk2 = Output(Clock())
        val rst = Input(Bool())
        val data = Input(UInt(4.W))
    })
    
    // use the implicit clock and reset
    val reg1 = RegNext(io.data)
    // use the clock in the bundle and the implicit reset
    val reg2 = withClock(io.clk1) { RegNext(io.data) }
    // use the clock and reset (inverted) in the bundle
    withClockAndReset(io.clk1, !io.rst) {
        val reg3 = RegInit(0.U)
        reg3 := io.data
    }
    // use the reset in the bundle
    val reg5 = withReset(io.rst) { RegInit(0.U) }
    reg5 := reg2
}

println(chisel3.Driver.emit( () => new MultiClockExample ))

[info] [0.000] Elaborating design...
[info] [0.012] Done elaborating.
;buildInfoPackage: chisel3, version: 3.0-SNAPSHOT_2017-07-19, scalaVersion: 2.11.11, sbtVersion: 0.13.15, builtAtString: 2017-07-19 18:56:34.453, builtAtMillis: 1500490594453
circuit cmd78WrapperHelperMultiClockExample : 
  module cmd78WrapperHelperMultiClockExample : 
    input clock : Clock
    input reset : UInt<1>
    output io : {flip clk1 : Clock, clk2 : Clock, flip rst : UInt<1>, flip data : UInt<4>}
    
    clock is invalid
    reset is invalid
    io is invalid
    reg reg1 : UInt, clock @[cmd78.sc 12:23]
    reg1 <= io.data @[cmd78.sc 12:23]
    reg reg2 : UInt, io.clk1 @[cmd78.sc 14:44]
    reg2 <= io.data @[cmd78.sc 14:44]
    node _T_9 = eq(io.rst, UInt<1>("h00")) @[cmd78.sc 16:32]
    reg _T_12 : UInt, io.clk1 with : (reset => (_T_9, UInt<1>("h00"))) @[cmd78.sc 17:27]
    _T_12 <= io.data @[cmd78.sc 18:14]
    reg reg5 : UInt, clock with : (reset => (io.rst, UInt<1>("h00"))) @[cmd78.s

import chisel3.experimental.{withClockAndReset, withClock, withReset}
defined class MultiClockExample
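
The two ideas in the FIRRTL above (each register is tied to one clock, and reset is sampled synchronously on that clock) can be illustrated with a small behavioural model. This is a plain-Scala sketch for intuition only, not Chisel code; `SyncResetReg` and its `tick` method are hypothetical names invented for the illustration.

```scala
// Hypothetical software model of a register declared in FIRRTL as
// `reg ... with : (reset => (rst, init))`.
final class SyncResetReg(init: Int) {
  // Real hardware powers up in an unknown state; starting at init keeps the model simple.
  private var state: Int = init
  def value: Int = state

  // Call once per rising edge of the clock this register is attached to.
  // Reset is sampled on that edge and takes priority over the data input.
  def tick(d: Int, reset: Boolean): Unit = {
    state = if (reset) init else d
  }
}

// Two registers in two clock domains: each changes only when its own clock ticks.
val regA = new SyncResetReg(init = 0) // imagine this on the implicit clock
val regB = new SyncResetReg(init = 0) // imagine this on io.clk1

regA.tick(d = 5, reset = false)       // edge on the implicit clock only
println(s"regA = ${regA.value}, regB = ${regB.value}")

regB.tick(d = 9, reset = false)       // edge on clk1 only
regB.tick(d = 3, reset = true)        // reset seen at the next clk1 edge wins over d
println(s"regA = ${regA.value}, regB = ${regB.value}")
```

In the model, asserting reset has no effect until the next `tick`, which is what "synchronous" means here.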

## Verilog Black Boxes

In [26]:
class Inverter extends BlackBox with HasBlackBoxInline {
    override def desiredName = "Inverter"
    val io = IO(new Bundle {
        val in  = Input(Bool())
        val out = Output(Bool())
    })
    
    setInline("Inverter.v", 
"""module Inverter(
in,
out
);
input in;
output out;
assign out = ~in;
endmodule
""")
}


class Negate extends Module {
    override def desiredName = "Negate"
    val io = IO(new Bundle {
        val in = Input(SInt(4.W))
        val out = Output(SInt(4.W))
    })
    val bools = io.in.toBools
    val negated = Vec(bools.map { case b =>
        //val inverter = Module(new Inverter)
        //inverter.io.in := b
        //inverter.io.out
        ~b
    })
    io.out := negated.asTypeOf(SInt())
}

class NegateTester(c: Negate) extends PeekPokeTester(c) {
    // Negate as written inverts each bit, so out = ~in, which is -in - 1 in two's complement
    poke(c.io.in, 1)
    expect(c.io.out, -2)
    poke(c.io.in, 0)
    expect(c.io.out, -1)
}

chisel3.Driver.execute(Array("-X", "verilog"), () => new Negate)

Driver(() => new Negate, "verilator") { c => new NegateTester(c) }


[info] [0.001] Elaborating design...
[info] [0.109] Done elaborating.
Total FIRRTL Compile Time: 314.3 ms
[info] [0.000] Elaborating design...
[info] [0.004] Done elaborating.
Total FIRRTL Compile Time: 54.3 ms
verilator --cc Negate.v --assert -Wno-fatal -Wno-WIDTH -Wno-STMTDLY --trace -O1 --top-module Negate +define+TOP_TYPE=VNegate +define+PRINTF_COND=!Negate.reset +define+STOP_COND=!Negate.reset -CFLAGS -Wno-undefined-bool-conversion -O1 -DTOP_TYPE=VNegate -DVL_USER_FINISH -include VNegate.h -Mdir /Users/rigge/Downloads/test_run_dir/$sess.cmd25Wrapper$Helper1739786175 --exe /Users/rigge/Downloads/test_run_dir/$sess.cmd25Wrapper$Helper1739786175/Negate-harness.cpp


make: *** No rule to make target '/Users/rigge/Downloads/test_run_dir/ess.cmd25Wrapperelper1739786175/Negate-harness.cpp', needed by 'Negate-harness.o'.  Stop.


make: Entering directory '/Users/rigge/Downloads/test_run_dir/$sess.cmd25Wrapper$Helper1739786175'
make: Leaving directory '/Users/rigge/Downloads/test_run_dir/$sess.cmd25Wrapper$Helper1739786175'


## Exercises
### 1. Shift Register Test with Bundles

The shift register implementation given earlier is templated for all `[T <: Data]`.
`Bundle`s are subtypes of `Data`.
However, the given tester was templated for `Bits` (which includes things like `UInt` and `SInt`, but not `Bundle`).
Also, the test only printed out the values; it didn't actually check that they were correct.

The following code defines a bundle type for complex numbers.
Write a tester to check that the shift register works correctly for complex numbers.
Test that it works for a variety of depths!
To begin with, test that it works on `depth=4` and `width=3`, but then uncomment code to test that it works for more values.

In [72]:
class ComplexBundle(w: Int) extends Bundle {
    val real = Output(SInt(w.W))
    val imag = Output(SInt(w.W))
    override def cloneType = new ComplexBundle(w).asInstanceOf[this.type]
}

// Show the emitted firrtl for an instance of ShiftRegister with Complex
println(chisel3.Driver.emit( () => new ShiftRegister(new ComplexBundle(4), 0) ))

class ComplexShiftRegisterTester(c: ShiftRegister[ComplexBundle]) extends PeekPokeTester(c) {
    // TODO fill me in and remove fail
    fail
}

// See what happens when you try to compile this
// Why won't it compile?
Driver( () => new ShiftRegister(new ComplexBundle(4), 5), "firrtl") { c=>
        new ShiftRegisterTester[ShiftRegister[ComplexBundle], ComplexBundle](c) }

val depths = List(4) // List(0, 1, 2, 5, 10, 100)
val widths = List(3) // List(3, 16)

for (w <- widths) {
    for (d <- depths) {
        Driver( () => new ShiftRegister(new ComplexBundle(w), d), "firrtl") { c=>
        new ComplexShiftRegisterTester(c) }
    }
}

cmd72.sc:18: type arguments [cmd72Wrapper.this.cmd41.wrapper.ShiftRegister[Helper.this.ComplexBundle],Helper.this.ComplexBundle] do not conform to class ShiftRegisterTester's type parameter bounds [T <: chisel3.Module with Helper.this.HasShiftRegisterIO[V],V <: chisel3.Bits]
        new ShiftRegisterTester[ShiftRegister[ComplexBundle], ComplexBundle](c) }
            ^
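
The bound mismatch reported above can be reproduced in isolation. The following plain-Scala sketch uses hypothetical stand-in classes (`DataLike`, `BitsLike`, `UIntLike`, `BundleLike`, and the two tester classes) rather than the real Chisel types, so it only illustrates the shape of the problem, not the tester you are asked to write.

```scala
// Hypothetical stand-ins for the relevant part of Chisel's type hierarchy.
class DataLike
class BitsLike extends DataLike
class UIntLike extends BitsLike
class BundleLike extends DataLike // a DataLike, but not a BitsLike

// The design accepts any DataLike...
class GenericShiftRegister[T <: DataLike](val gen: T)
// ...but this tester insists on the narrower BitsLike bound.
class BitsOnlyTester[V <: BitsLike](val dut: GenericShiftRegister[V])
// A tester with the wider bound accepts bundles as well.
class AnyDataTester[V <: DataLike](val dut: GenericShiftRegister[V])

val uintTester   = new BitsOnlyTester(new GenericShiftRegister(new UIntLike))
val bundleTester = new AnyDataTester(new GenericShiftRegister(new BundleLike))

// The next line is rejected by the compiler because BundleLike is not a BitsLike:
// new BitsOnlyTester(new GenericShiftRegister(new BundleLike))
```

Uncommenting the last line produces the same kind of "do not conform to ... type parameter bounds" error as the cell above.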

### 2. Decoupled Shift Register

Write an implementation of a shift register that has decoupled inputs and outputs.
The output shouldn't be valid until the shift register has filled up initially.

In [83]:
class DecoupledShiftRegisterIO[T <: Data](gen: T, n: Int) extends Bundle {
    require (n >= 0, "Shift register must have non-negative shift")
    
    val in = Flipped(Decoupled(gen))
    val out = Decoupled(Vec(n + 1, gen.cloneType)) // + 1 because in is included in out
}

class DecoupledShiftRegister[T <: Data](gen: T, n: Int) extends Module {
    val io = IO(new DecoupledShiftRegisterIO(gen, n))
    
}

class DecoupledShiftRegisterTester[T <: DecoupledShiftRegister[UInt]](c: T) extends PeekPokeTester(c) {
    fail
}

println(chisel3.Driver.emit( () => new DecoupledShiftRegister(UInt(4.W), 5)))
Driver( () => new DecoupledShiftRegister(UInt(4.W), 5), "firrtl") { c => new DecoupledShiftRegisterTester(c)}

[info] [0.000] Elaborating design...
[info] [0.005] Done elaborating.
;buildInfoPackage: chisel3, version: 3.0-SNAPSHOT_2017-07-19, scalaVersion: 2.11.11, sbtVersion: 0.13.15, builtAtString: 2017-07-19 18:56:34.453, builtAtMillis: 1500490594453
circuit cmd82WrapperHelperDecoupledShiftRegister : 
  module cmd82WrapperHelperDecoupledShiftRegister : 
    input clock : Clock
    input reset : UInt<1>
    output io : {flip in : {flip ready : UInt<1>, valid : UInt<1>, bits : UInt<4>}, out : {flip ready : UInt<1>, valid : UInt<1>, bits : UInt<4>[6]}}
    
    clock is invalid
    reset is invalid
    io is invalid
    

[info] [0.000] Elaborating design...
[info] [0.002] Done elaborating.
Total FIRRTL Compile Time: 4.1 ms
Total FIRRTL Compile Time: 6.6 ms
End of dependency graph
Circuit state created
[info] [0.000] SEED 1502909159479
test cmd82WrapperHelperDecoupledShiftRegister Success: 0 tests passed in 5 cycles taking 0.000842 seconds
[info

defined class DecoupledShiftRegisterIO
defined class DecoupledShiftRegister
defined class DecoupledShiftRegisterTester
res82_4: Boolean = false

### Bonus! Verilog Black Boxes with Handlebars