Dsptools is a Chisel library that aids in writing custom signal processing accelerators. It provides:

* Types and helpers that let you express mathematical operations more directly.
* Typeclasses that let you write polymorphic generators, for example an FIR filter generator that works for both real- and complex-valued filters.
* Structures for packaging DSP blocks and integrating them into a rocketchip-based SoC.
* Test harnesses for testing DSP circuits, as well as VIP-style drivers and monitors for DSP blocks.

The Dsptools repository has more documentation.

Dsptools Blocks

A DspBlock is the basic unit of signal processing functionality that can be integrated into an SoC. It has an AXI4-stream interface and an optional memory interface. The idea is that these DspBlocks can be easily designed, unit tested, and assembled lego-style to build complex functionality. A DspChain is one example of how to assemble DspBlocks: the streaming interfaces are connected serially into a pipeline, and a bus is instantiated and connected to every block that has a memory interface.

Chipyard has example designs that integrate a DspBlock into a rocketchip-based SoC as an MMIO peripheral. The custom DspBlock has a ReadQueue before it and a WriteQueue after it, which allow memory-mapped access to the streaming interfaces so the rocket core can interact with the DspBlock [1]. This section will primarily focus on designing TileLink-based peripherals. However, through the resources provided in Dsptools, one could also define an AXI4-based peripheral by following similar steps. Furthermore, the examples here are simple, but can be extended to implement more complex accelerators, for example an OFDM baseband or a spectrometer.

For this example, we will show you how to connect a simple FIR filter created using Dsptools as an MMIO peripheral as shown in the figure above. The full code can be found in generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala. That being said, one could substitute any module with a ready-valid interface in place of the FIR and achieve the same results. As long as the ready and valid signals of the module are attached to those of a corresponding DspBlock wrapper, and that wrapper is placed in a chain with a ReadQueue and a WriteQueue, following the general outline established by these steps will allow you to interact with that block as a memory-mapped IO device.

The module GenericFIR is the overall wrapper of our FIR module. It links together a variable number of GenericFIRDirectCell submodules, each of which performs the computations for one coefficient in an FIR direct-form architecture. Both modules are type-generic: they can be instantiated for any datatype T that implements Ring operations (e.g. addition, multiplication, identities).

../../generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala

../../generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala
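To make the direct-form structure concrete, here is a minimal software sketch in C of the arithmetic that a chain of per-coefficient cells performs. It is not the Chisel implementation: the names fir_model_t, fir_init, fir_step, and MAX_TAPS are hypothetical, int32_t stands in for the generic type T, and the ready-valid handshaking and any pipeline latency of the hardware are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TAPS 16 /* hypothetical upper bound chosen for this sketch */

/* Software model of a direct-form FIR: one delay slot per coefficient. */
typedef struct {
    size_t n_taps;
    int32_t coeffs[MAX_TAPS];
    int32_t delay[MAX_TAPS]; /* after a step, delay[k] holds x[n-k] */
} fir_model_t;

/* Copy the coefficients and clear the delay line. */
void fir_init(fir_model_t *f, const int32_t *coeffs, size_t n_taps)
{
    assert(n_taps > 0 && n_taps <= MAX_TAPS);
    f->n_taps = n_taps;
    for (size_t k = 0; k < n_taps; k++) {
        f->coeffs[k] = coeffs[k];
        f->delay[k] = 0;
    }
}

/* Push one sample and return y[n] = sum over k of coeffs[k] * x[n-k]. */
int32_t fir_step(fir_model_t *f, int32_t x)
{
    /* Shift the delay line; the oldest sample falls off the end. */
    for (size_t k = f->n_taps - 1; k > 0; k--) {
        f->delay[k] = f->delay[k - 1];
    }
    f->delay[0] = x;

    /* Each "cell" adds its own product to the carry from the previous cells. */
    int32_t carry = 0;
    for (size_t k = 0; k < f->n_taps; k++) {
        carry += f->coeffs[k] * f->delay[k];
    }
    return carry;
}
```

A model like this only needs addition, multiplication, and a zero value, which is the same set of operations the Ring constraint asks of T.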

Creating a DspBlock

The first step in attaching the FIR filter as an MMIO peripheral is to create an abstract subclass of DspBlock that wraps around the GenericFIR module. Streaming outputs and inputs are packed and unpacked into UInts. If there were control signals, this is where they'd go from raw IOs to memory mapped. The main steps of this process are as follows.

  1. Instantiate a GenericFIR within GenericFIRBlock.
  2. Attach the ready and valid signals from the in and out connections.
  3. Cast the module input data to the input type of GenericFIR (GenericFIRBundle) and attach.
  4. Cast the output of GenericFIR to UInt and attach to the module output.

../../generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala
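From the software side, the packing and unpacking in steps 3 and 4 amounts to reinterpreting the low bits of a raw stream word as a typed sample and back. The C sketch below illustrates that idea under loudly stated assumptions: an 8-bit signed fixed-point sample with 3 fractional bits is an illustrative choice, not something this section specifies, and the names unpack_sample, pack_sample, and sample_to_real are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_BITS 8 /* assumed sample width, for illustration only */
#define FRAC_BITS 3   /* assumed number of fractional bits, for illustration only */

/* Interpret the low SAMPLE_BITS of a raw stream word as a two's-complement integer. */
int32_t unpack_sample(uint64_t word)
{
    int32_t v = (int32_t)(word & ((1u << SAMPLE_BITS) - 1u));
    if (v >= (1 << (SAMPLE_BITS - 1))) {
        v -= (1 << SAMPLE_BITS); /* sign-extend by subtracting 2^SAMPLE_BITS */
    }
    return v;
}

/* Place a signed sample into the low SAMPLE_BITS of a raw stream word. */
uint64_t pack_sample(int32_t sample)
{
    assert(sample >= -(1 << (SAMPLE_BITS - 1)) && sample < (1 << (SAMPLE_BITS - 1)));
    return (uint64_t)((uint32_t)sample & ((1u << SAMPLE_BITS) - 1u));
}

/* Convert the raw fixed-point integer to the real value it represents. */
double sample_to_real(int32_t sample)
{
    return (double)sample / (double)(1 << FRAC_BITS);
}
```

With these assumptions, the raw word 0xF8 unpacks to the integer -8, which represents the real value -1.0.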

Note that at this point the GenericFIRBlock does not have a type of memory interface specified. This abstract class can be used to create different flavors that use AXI4, TileLink, AHB, or whatever other memory interface you like.

With these classes implemented, you can begin to construct the chain by extending GenericFIRBlock while using the TLDspBlock trait via mixin.

../../generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala

We can then construct the final chain by utilizing the TLWriteQueue and TLReadQueue modules found in generators/chipyard/src/main/scala/example/dsptools/DspBlocks.scala. The chain is created by passing a list of factory functions to the constructor of TLChain. The constructor then automatically instantiates these DspBlocks, connects their stream nodes in order, creates a bus, and connects any DspBlocks that have memory interfaces to the bus.

../../generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala

Top Level Traits

As in the previous MMIO example, we use a cake pattern to hook up our module to our SoC.

../../generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala

Note that this is the point at which we decide the datatype for our FIR. You could create different configs that use different types for the FIR, for example a config that instantiates a complex-valued FIR filter.

Constructing the Top and Config

Once again following the path of the previous MMIO example, we now want to mix our traits into the system as a whole. The code is from generators/chipyard/src/main/scala/DigitalTop.scala.

../../generators/chipyard/src/main/scala/DigitalTop.scala

Finally, we create the configuration class in generators/chipyard/src/main/scala/config/RocketConfigs.scala that uses the WithFIR mixin defined in generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala.

../../generators/chipyard/src/main/scala/example/dsptools/GenericFIR.scala

../../generators/chipyard/src/main/scala/config/RocketConfigs.scala

FIR Testing

We can now test that the FIR is working. The test program is found in tests/streaming-fir.c.

../../tests/streaming-fir.c

The test feeds a series of values into the FIR and compares the output to a golden model of computation. The base of the module's MMIO write region is at 0x2000 and the base of the read region is at 0x2100 by default.

Compiling this program with make produces a streaming-fir.riscv executable.

Now we can run our simulation.

cd sims/verilator
make CONFIG=StreamingFIRRocketConfig BINARY=../../tests/streaming-fir.riscv run-binary
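The access pattern behind such a test can be sketched in C as follows. This is not the contents of tests/streaming-fir.c. The helper names reg_store64, reg_load64, and fir_stream_check are hypothetical, and the 64-bit access width and the use of offset 0 within each region are assumptions. The register addresses are passed in as pointers so that the same function can be exercised on a host against mock variables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default region bases quoted in the text above. */
#define FIR_WRITE_BASE 0x2000UL
#define FIR_READ_BASE  0x2100UL

/* Hypothetical helpers: a 64-bit store to and load from a device register. */
static inline void reg_store64(volatile uint64_t *reg, uint64_t value)
{
    *reg = value;
}

static inline uint64_t reg_load64(volatile uint64_t *reg)
{
    return *reg;
}

/*
 * Write n samples to the write register, read n results back from the read
 * register, and return how many results differ from the golden values.
 */
size_t fir_stream_check(volatile uint64_t *write_reg, volatile uint64_t *read_reg,
                        const uint64_t *samples, const uint64_t *golden, size_t n)
{
    assert(write_reg != NULL && read_reg != NULL);

    for (size_t i = 0; i < n; i++) {
        reg_store64(write_reg, samples[i]);
    }

    size_t mismatches = 0;
    for (size_t i = 0; i < n; i++) {
        if (reg_load64(read_reg) != golden[i]) {
            mismatches++;
        }
    }
    return mismatches;
}
```

On the target, the two pointers would be (volatile uint64_t *)FIR_WRITE_BASE and (volatile uint64_t *)FIR_READ_BASE. Because this sketch writes every sample before reading any result, n should stay small enough to fit in the queues; otherwise the processor stalls on a full queue.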

[1] ReadQueue and WriteQueue are good illustrations of how to write a DspBlock and how they can be integrated into rocket, but in a real design a DMA engine would be preferred. ReadQueue will stall the processor if you try to read an empty queue, and WriteQueue will stall if you try to write to a full queue, which a DMA engine can more elegantly avoid. Furthermore, a DMA engine can do the work of moving data, freeing the processor to do other useful work (or sleep).