## Agile Hardware Design
***
# Network Design Case Study

## Prof. Scott Beamer
### sbeamer@ucsc.edu

## [CSE 293](https://classes.soe.ucsc.edu/cse293/Spring21/)

## Plan for Today

* Sketch of progressive development plan
* Starting from a crossbar
* Ending with a parameterized network generator

## Loading The Chisel Library Into a Notebook

In [35]:
val path = System.getProperty("user.dir") + "/../resource/chisel_deps.sc"
interp.load.module(ammonite.ops.Path(java.nio.file.FileSystems.getDefault().getPath(path)))

[36mpath[39m: [32mString[39m = [32m"/Users/sbeamer/Spring 2021/CSE 293/lectures/15-network/../resource/chisel_deps.sc"[39m

In [36]:
import chisel3._
import chisel3.util._
import chisel3.tester._
import chisel3.tester.RawTester.test

[32mimport [39m[36mchisel3._
[39m
[32mimport [39m[36mchisel3.util._
[39m
[32mimport [39m[36mchisel3.tester._
[39m
[32mimport [39m[36mchisel3.tester.RawTester.test[39m

## Goals for Today

* Demonstrate progressive/iterative development of a generator for a _on-chip network_
  * Focus on process over polished end result
* Design abstractions and apply _inheritance_ to reuse code
* Caveats - today's design is a network generator in spirit, but lacks:
  * support for many messages in flight
  * reasonable test infrastructure
  * comprehensive flow control, multi-beat transfers
  * deadlock avoidance, quality-of-service (QoS) guarantees

## Our Crossbar (`XBar`) Revised from Prior Lectures (1/2)

In [37]:
class Message(numDests: Int, length: Int) extends Bundle {
    val addr = UInt(log2Ceil(numDests+1).W)
    val data = UInt(length.W)
    override def cloneType = (new Message(numDests, length)).asInstanceOf[this.type]
}

class XBarIO(numIns: Int, numOuts: Int, length: Int) extends Bundle {
    val in  = Vec(numIns, Flipped(Decoupled(new Message(numOuts, length))))
    val out = Vec(numOuts, Decoupled(new Message(numOuts, length)))
    override def cloneType = (new XBarIO(numIns, numOuts, length)).asInstanceOf[this.type]
}

defined [32mclass[39m [36mMessage[39m
defined [32mclass[39m [36mXBarIO[39m

## Our Crossbar (`XBar`) Revised from Prior Lectures (2/2)

In [39]:
class XBar(numIns: Int, numOuts: Int, length: Int) extends Module {
    val io = IO(new XBarIO(numIns, numOuts, length))
    val arbs = Seq.fill(numOuts)(Module(new RRArbiter(new Message(numOuts, length), numIns)))
    for (ip <- 0 until numIns) {
        io.in(ip).ready := arbs.map{ _.io.in(ip).ready }.reduce{ _ || _ }
    }
    for (op <- 0 until numOuts) {
        arbs(op).io.in.zip(io.in).foreach { case (arbIn, ioIn) =>
            arbIn.bits <> ioIn.bits
            arbIn.valid := ioIn.valid && (ioIn.bits.addr === op.U)
        }
        io.out(op) <> arbs(op).io.out
    }
}

// declaration example: new XBar(4,4,64)

defined [32mclass[39m [36mXBar[39m

## Refactor Parameters with Case Classes (1/2)

In [42]:
case class XBarParams(numHosts: Int, payloadSize: Int) {
    def addrBitW() = log2Ceil(numHosts + 1)
}

class Message(p: XBarParams) extends Bundle {
    val addr = UInt(p.addrBitW.W)
    val data = UInt(p.payloadSize.W)
    override def cloneType = (new Message(p)).asInstanceOf[this.type]
}

class PortIO(p: XBarParams) extends Bundle {
    val in = Flipped(Decoupled(new Message(p)))
    val out = Decoupled(new Message(p))
    override def cloneType = (new PortIO(p)).asInstanceOf[this.type]
}

defined [32mclass[39m [36mXBarParams[39m
defined [32mclass[39m [36mMessage[39m
defined [32mclass[39m [36mPortIO[39m

## Refactor Parameters with Case Classes (2/2)

In [43]:
class XBar(p: XBarParams) extends Module {
    val io = IO(new Bundle {
        val ports = Vec(p.numHosts, new PortIO(p))
    })
    val arbs = Seq.fill(p.numHosts)(Module(new RRArbiter(new Message(p), p.numHosts)))
    for (ip <- 0 until p.numHosts) {
        io.ports(ip).in.ready := arbs.map{ _.io.in(ip).ready }.reduce{ _ || _ }
    }
    for (op <- 0 until p.numHosts) {
        arbs(op).io.in.zip(io.ports).foreach { case (arbIn, port) =>
            arbIn.bits <> port.in.bits
            arbIn.valid := port.in.valid && (port.in.bits.addr === op.U)
        }
        io.ports(op).out <> arbs(op).io.out
    }
}

// declaration example: new XBar(XBarParams(4,64))

defined [32mclass[39m [36mXBar[39m

## Template Payload Data Type (1/2)

In [45]:
case class XBarParams[T <: chisel3.Data](numHosts: Int, payloadT: T) {
    def addrBitW() = log2Ceil(numHosts + 1)
}

class Message[T <: chisel3.Data](p: XBarParams[T]) extends Bundle {
    val addr = UInt(p.addrBitW.W)
    val data = p.payloadT
    override def cloneType = (new Message[T](p)).asInstanceOf[this.type]
}

class PortIO[T <: chisel3.Data](p: XBarParams[T]) extends Bundle {
    val in = Flipped(Decoupled(new Message(p)))
    val out = Decoupled(new Message(p))
    override def cloneType = (new PortIO[T](p)).asInstanceOf[this.type]
}

defined [32mclass[39m [36mXBarParams[39m
defined [32mclass[39m [36mMessage[39m
defined [32mclass[39m [36mPortIO[39m

## Template Payload Data Type (2/2)

In [46]:
class XBar[T <: chisel3.Data](p: XBarParams[T]) extends Module {
    val io = IO(new Bundle {
        val ports = Vec(p.numHosts, new PortIO(p))
    })
    val arbs = Seq.fill(p.numHosts)(Module(new RRArbiter(new Message(p), p.numHosts)))
    for (ip <- 0 until p.numHosts) {
        io.ports(ip).in.ready := arbs.map{ _.io.in(ip).ready }.reduce{ _ || _ }
    }
    for (op <- 0 until p.numHosts) {
        arbs(op).io.in.zip(io.ports).foreach { case (arbIn, port) =>
            arbIn.bits <> port.in.bits
            arbIn.valid := port.in.valid && (port.in.bits.addr === op.U)
        }
        io.ports(op).out <> arbs(op).io.out
    }
}

// declaration example: new XBar(XBarParams(4,UInt(64.W)))

defined [32mclass[39m [36mXBar[39m

## Need for Multi-hop Networks

* Can only make a crossbar so big, at some point will need a _multi-hop_ interconnect
* Sending messages over multiple hops requires _routing_ messages to right next hop

### Moving to a Ring Network
* A _ring network_ is a simple topology in 1-dimension
* _Routing:_ (for now) if not at destination, send to next hop
* _Plan:_ will develop independently first, then will look for commonality with `XBar`

<img src="images/ring1.svg" alt="1-way ring network" style="width:70%;margin-left:auto;margin-right:auto"/>

## First Implementation of a Ring Network

In [49]:
class RingRouter[T <: chisel3.Data](p: XBarParams[T], id: Int) extends Module {
    val io = IO(new Bundle{
        val in = Flipped(Decoupled(new Message(p)))
        val out = Decoupled(new Message(p))
        val host = new PortIO(p)
    })
    val forMe = (io.in.bits.addr === id.U) && io.in.valid
    // INCOMPLETE, but gives spirit
    io.host.in.ready := io.out.ready
    io.host.out.valid := forMe
    io.host.out.bits := io.in.bits
    io.in.ready := io.host.out.ready && io.out.ready
    io.out.valid := (io.in.fire && !forMe) || io.host.in.fire
    io.out.bits := Mux(io.host.in.fire, io.host.in.bits, io.in.bits)
}

class RingNetwork[T <: chisel3.Data](p: XBarParams[T]) extends Module {
    val io = IO(new Bundle {
        val ports = Vec(p.numHosts, new PortIO(p))
    })
    val routers = Seq.tabulate(p.numHosts){ id => new RingRouter(p, id)}
    routers.foldLeft(routers.last){ (prev, curr) => prev.io.out <> curr.io.in; curr}
    routers.zip(io.ports).foreach { case (router, port) => router.io.host <> port}
}

defined [32mclass[39m [36mRingRouter[39m
defined [32mclass[39m [36mRingNetwork[39m

## Looking for Commonality between `XBar` & `RingNetwork`

* For users, choosing one or the other requires some code changes
* _Commonality:_ both provide abstraction of network with decoupled bidirectional ports (interface)

In [50]:
case class NetworkParams[T <: chisel3.Data](numHosts: Int, payloadT: T) {
    def addrBitW() = log2Ceil(numHosts + 1)
}

class Message[T <: chisel3.Data](p: NetworkParams[T]) extends Bundle {
    val addr = UInt(p.addrBitW.W)
    val data = p.payloadT
    override def cloneType = (new Message[T](p)).asInstanceOf[this.type]
}

class PortIO[T <: chisel3.Data](p: NetworkParams[T]) extends Bundle {
    val in = Flipped(Decoupled(new Message(p)))
    val out = Decoupled(new Message(p))
    override def cloneType = (new PortIO(p)).asInstanceOf[this.type]
}

abstract class Network[T <: chisel3.Data](p: NetworkParams[T]) extends Module {
    val io = IO(new Bundle {
        val ports = Vec(p.numHosts, new PortIO(p))
    })
}

defined [32mclass[39m [36mNetworkParams[39m
defined [32mclass[39m [36mMessage[39m
defined [32mclass[39m [36mPortIO[39m
defined [32mclass[39m [36mNetwork[39m

## `XBar` Redone with Inherited Interface

In [51]:
class XBar[T <: chisel3.Data](p: NetworkParams[T]) extends Network[T](p) {
    val arbs = Seq.fill(p.numHosts)(Module(new RRArbiter(new Message(p), p.numHosts)))
    for (ip <- 0 until p.numHosts) {
        io.ports(ip).in.ready := arbs.map{ _.io.in(ip).ready }.reduce{ _ || _ }
    }
    for (op <- 0 until p.numHosts) {
        arbs(op).io.in.zip(io.ports).foreach { case (arbIn, port) =>
            arbIn.bits <> port.in.bits
            arbIn.valid := port.in.valid && (port.in.bits.addr === op.U)
        }
        io.ports(op).out <> arbs(op).io.out
    }
}

// declaration example: new XBar(NetworkParams(4,UInt(64.W)))

defined [32mclass[39m [36mXBar[39m

## `RingNetwork` Redone with Inherited Interface

In [28]:
class RingRouter[T <: chisel3.Data](p: NetworkParams[T], id: Int) extends Module {
    val io = IO(new Bundle{
        val in = Flipped(Decoupled(new Message(p)))
        val out = Decoupled(new Message(p))
        val host = new PortIO(p)
    })
    val forMe = (io.in.bits.addr === id.U) && io.in.valid
    // INCOMPLETE, but gives spirit
    io.host.in.ready := io.out.ready
    io.host.out.valid := forMe
    io.host.out.bits := io.in.bits
    io.in.ready := io.host.out.ready && io.out.ready
    io.out.valid := (io.in.fire && !forMe) || io.host.in.fire
    io.out.bits := Mux(io.host.in.fire, io.host.in.bits, io.in.bits)
}

class RingNetwork[T <: chisel3.Data](p: NetworkParams[T]) extends Network[T](p) {
    val routers = Seq.tabulate(p.numHosts){ id => new RingRouter(p, id)}
    routers.foldLeft(routers.last){ (prev, curr) => prev.io.out <> curr.io.in; curr}
    routers.zip(io.ports).foreach { case (router, port) => router.io.host <> port}
}

defined [32mclass[39m [36mRingRouter[39m
defined [32mclass[39m [36mRingNetwork[39m

## Improve Ring by Sending Message in Shorter Direction

* Make links between routers _bidirectional_ and send message to closer one
  * Reduces number of hops
  * Will complicate deadlocks and such, but will overlook that for today
* Recognize opportunity for _reuse_
  * Router (internally) is basically a crossbar (switch) with routing logic
  * _Routing logic:_ current router & destination address -> next port

<img src="images/ring2.svg" alt="2-way ring network" style="width:70%;margin-left:auto;margin-right:auto"/>

## `RingRouter` Revised for Bidirectional & Use of `XBar`

In [29]:
class RingRouter[T <: chisel3.Data](p: NetworkParams[T], id: Int) extends Module {
    val io = IO(new Bundle{
        val ports = Vec(3, new PortIO(p)) // port(2) for host
    })

    val xbarParams = NetworkParams(3, new Message(p))
    val xbar = new XBar(xbarParams)

    def nextHop(destAddr: UInt): UInt = { // routing logic
        val distTowards0 = Mux(destAddr < id.U, id.U - destAddr, id.U + (p.numHosts.U - destAddr))
        val distTowards1 = Mux(destAddr > id.U, destAddr - id.U, (p.numHosts.U - id.U) + destAddr)
        Mux(destAddr === id.U, 2.U, Mux(distTowards0 < distTowards1, 0.U, 1.U))
    }
    val portsRouted = io.ports map { port =>  
        val routed = Wire(new PortIO(xbarParams))
        // INCOMPLETE, need to connect ready & valids
        routed.in.bits.addr := nextHop(port.in.bits.addr)
        routed.in.bits.data := port.in.bits
        port.out.bits := routed.out.bits.data
        routed
    }

    portsRouted.zip(xbar.io.ports).foreach{ case (extPort, xbarPort) => extPort <> xbarPort }
}

class RingNetwork[T <: chisel3.Data](p: NetworkParams[T]) extends Network[T](p) {
    val routers = Seq.tabulate(p.numHosts){ id => new RingRouter(p, id)}
    routers.foldLeft(routers.last){ (prev, curr) => prev.io.ports(1) <> curr.io.ports(0); curr }
    routers.zip(io.ports).foreach { case (router, port) => router.io.ports(2) <> port}
}

defined [32mclass[39m [36mRingRouter[39m
defined [32mclass[39m [36mRingNetwork[39m

## Assessing Revised `RingNetwork`

* Parameterized number of hosts √
* Parameterized data type √
* Sends messages in shorter direction √
* _Missing:_ graceful interchangability with `XBar`

## Making a `Network` Factory (1/2)

* Can pattern match on params for type

In [52]:
abstract class NetworkParams[T <: chisel3.Data] {
    def numHosts: Int
    def payloadT: T
    def addrBitW() = log2Ceil(numHosts + 1)
}

case class XBarParams[T <: chisel3.Data](numHosts: Int, payloadT: T) extends NetworkParams[T]

case class RingParams[T <: chisel3.Data](numHosts: Int, payloadT: T) extends NetworkParams[T]

defined [32mclass[39m [36mNetworkParams[39m
defined [32mclass[39m [36mXBarParams[39m
defined [32mclass[39m [36mRingParams[39m

In [54]:
// need to eval next slide first

object Network {
    def apply[T <: chisel3.Data](p: NetworkParams[T]): Network[T] = p match {
        case xp: XBarParams[T] => new XBar(xp)
        case rp: RingParams[T] => new RingNetwork(rp)
    }
}

// Network(XBarParams(...))

defined [32mobject[39m [36mNetwork[39m

## Making a `Network` Factory (2/2)

In [53]:
class Message[T <: chisel3.Data](p: NetworkParams[T]) extends Bundle {
    val addr = UInt(p.addrBitW.W)
    val data = p.payloadT
    override def cloneType = (new Message[T](p)).asInstanceOf[this.type]
}

class PortIO[T <: chisel3.Data](p: NetworkParams[T]) extends Bundle {
    val in = Flipped(Decoupled(new Message(p)))
    val out = Decoupled(new Message(p))
    override def cloneType = (new PortIO(p)).asInstanceOf[this.type]
}

abstract class Network[T <: chisel3.Data](p: NetworkParams[T]) extends Module {
    val io = IO(new Bundle {
        val ports = Vec(p.numHosts, new PortIO(p))
    })
}

class XBar[T <: chisel3.Data](p: XBarParams[T]) extends Network[T](p) {
    val arbs = Seq.fill(p.numHosts)(Module(new RRArbiter(new Message(p), p.numHosts)))
    for (ip <- 0 until p.numHosts) {
        io.ports(ip).in.ready := arbs.map{ _.io.in(ip).ready }.reduce{ _ || _ }
    }
    for (op <- 0 until p.numHosts) {
        arbs(op).io.in.zip(io.ports).foreach { case (arbIn, port) =>
            arbIn.bits <> port.in.bits
            arbIn.valid := port.in.valid && (port.in.bits.addr === op.U)
        }
        io.ports(op).out <> arbs(op).io.out
    }
}

class RingRouter[T <: chisel3.Data](p: RingParams[T], id: Int) extends Module {
    val io = IO(new Bundle{
        val ports = Vec(3, new PortIO(p)) // port(2) for host
    })

    val xbarParams = XBarParams(3, new Message(p))
    val xbar = new XBar(xbarParams)

    def nextHop(destAddr: UInt): UInt = {
        val distTowards0 = Mux(destAddr < id.U, id.U - destAddr, id.U + (p.numHosts.U - destAddr))
        val distTowards1 = Mux(destAddr > id.U, destAddr - id.U, (p.numHosts.U - id.U) + destAddr)
        Mux(destAddr === id.U, 2.U, Mux(distTowards0 < distTowards1, 0.U, 1.U))
    }
    val portsRouted = io.ports map { port =>  
        val routed = Wire(new PortIO(xbarParams))
        // INCOMPLETE, need to connect ready & valids
        routed.in.bits.addr := nextHop(port.in.bits.addr)
        routed.in.bits.data := port.in.bits
        port.out.bits := routed.out.bits.data
        routed
    }

    portsRouted.zip(xbar.io.ports).foreach{ case (extPort, xbarPort) => extPort <> xbarPort }
}

class RingNetwork[T <: chisel3.Data](p: RingParams[T]) extends Network[T](p) {
    val routers = Seq.tabulate(p.numHosts){ id => new RingRouter(p, id)}
    routers.foldLeft(routers.last){ (prev, curr) => prev.io.ports(1) <> curr.io.ports(0); curr }
    routers.zip(io.ports).foreach { case (router, port) => router.io.ports(2) <> port}
}

defined [32mclass[39m [36mMessage[39m
defined [32mclass[39m [36mPortIO[39m
defined [32mclass[39m [36mNetwork[39m
defined [32mclass[39m [36mXBar[39m
defined [32mclass[39m [36mRingRouter[39m
defined [32mclass[39m [36mRingNetwork[39m

## Let's Add More Network Topologies

* What about a _mesh_ or a _torus_ instead of just a ring?
* Can we share components between these networks?
* Common abstractions:
  * Router (including routing logic)
  * Router interconnections

In [55]:
abstract class Router[T <: chisel3.Data] (p: NetworkParams[T], numPorts: Int, id: Int) extends Module {
    val io = IO(new Bundle{
        val ports = Vec(numPorts, new PortIO(p))
        // convention: last port is for attached host
    })

    val xbarParams = XBarParams(numPorts, new Message(p))
    val xbar = new XBar(xbarParams)

    def nextHop(destAddr: UInt): UInt
    val portsRouted = io.ports map { port =>  
        val routed = Wire(new PortIO(xbarParams))
        // INCOMPLETE, need to connect ready & valids
        routed.in.bits.addr := nextHop(port.in.bits.addr)
        routed.in.bits.data := port.in.bits
        port.out.bits := routed.out.bits.data
        routed
    }

    portsRouted.zip(xbar.io.ports).foreach{ case (extPort, xbarPort) => extPort <> xbarPort }
}

abstract class MultiHopNetwork[T <: chisel3.Data](p: NetworkParams[T]) extends Network[T](p) {
    val routers: Seq[Router[T]]
    def connectRouters()
    connectRouters()
    routers.zip(io.ports).foreach { case (router, port) => router.io.ports.last <> port}
}

defined [32mclass[39m [36mRouter[39m
defined [32mclass[39m [36mMultiHopNetwork[39m

## `RingNetwork` Revised with `MultiHopNetwork`

In [56]:

class RingRouter[T <: chisel3.Data](p: RingParams[T], id: Int) extends Router[T](p,3,id) {
    def nextHop(destAddr: UInt): UInt = {
        val distTowards0 = Mux(destAddr < id.U, id.U - destAddr, id.U + (p.numHosts.U - destAddr))
        val distTowards1 = Mux(destAddr > id.U, destAddr - id.U, (p.numHosts.U - id.U) + destAddr)
        Mux(destAddr === id.U, 2.U, Mux(distTowards0 < distTowards1, 0.U, 1.U))
    }
}

class RingNetwork[T <: chisel3.Data](p: RingParams[T]) extends Network[T](p) {
    val routers = Seq.tabulate(p.numHosts){ id => new RingRouter(p, id)}
    def connectRouters() {
        routers.foldLeft(routers.last){ (prev, curr) => prev.io.ports(1) <> curr.io.ports(0); curr }
    }
}

defined [32mclass[39m [36mRingRouter[39m
defined [32mclass[39m [36mRingNetwork[39m

## What About a 2D Torus?

In [57]:
case class TorusParams[T <: chisel3.Data](numHosts: Int, payloadT: T, numRows: Int) extends NetworkParams[T] {
    require(numHosts % numRows == 0)
    val numCols = numHosts / numRows
}

class TorusRouter[T <: chisel3.Data](p: TorusParams[T], id: Int) extends Router[T](p,5,id) {
    def nextHop(destAddr: UInt): UInt = {
        // FILL IN routing logic, e.g. dimension-ordered routing
        destAddr // INCORRECT, but will allow to compile 
    }
}

class TorusNetwork[T <: chisel3.Data](p: TorusParams[T]) extends MultiHopNetwork[T](p) {
    val routers = Seq.tabulate(p.numHosts){ id => new TorusRouter(p, id)}
    def connectRouters() {
        // FILL IN 2D connectivity
    }
}

defined [32mclass[39m [36mTorusParams[39m
defined [32mclass[39m [36mTorusRouter[39m
defined [32mclass[39m [36mTorusNetwork[39m

## We Did It!

* Reused common components between network types via _inheritance_
  * Inherited interfaces as well as standard connections
  * Each network focuses on what makes it unique
  * Used case classes to pass around parameters
* Can even integrate behind a factory

In [58]:
object Network {
    def apply[T <: chisel3.Data](p: NetworkParams[T]): Network[T] = p match {
        case xp: XBarParams[T] => new XBar(xp)
        case rp: RingParams[T] => new RingNetwork(rp)
        case tp: TorusParams[T] => new TorusNetwork(tp)
    }
}

defined [32mobject[39m [36mNetwork[39m

## Takeaways

* With progressive design, don't be afraid to make specific/concrete at first
  * Generalize when there is more than one instance
* Keep an eye out for _reuse_ opportunities
  * Copying & pasting (to start a module) is a sign there may be significant overlap
* _Inheritance_ is a powerful tool to reuse implementations and interfaces
* Can apply generics (templating) to increase flexibility