Merge remote-tracking branch 'origin/reworkInstructionCache'
Dolu1990 committed Feb 18, 2018
2 parents 3853e03 + 8ac4d72 commit 0270ee2
Showing 27 changed files with 579 additions and 364 deletions.
56 changes: 37 additions & 19 deletions README.md
@@ -92,10 +92,14 @@ VexRiscv smallest (RV32I, 0.52 DMIPS/Mhz, no datapath bypass) ->
Cyclone II -> 149 Mhz 780 LUT 578 FF
VexRiscv small and productive (RV32I, 0.82 DMIPS/Mhz) ->
Artix 7 -> 309 Mhz 703 LUT 557 FF
Cyclone V -> 152 Mhz 502 ALMs
Cyclone IV -> 147 Mhz 1,062 LUT 552 FF
Cyclone II -> 120 Mhz 1,072 LUT 551 FF
Artix 7 -> 327 Mhz 698 LUT 558 FF
Cyclone V -> 158 Mhz 524 ALMs
Cyclone IV -> 146 Mhz 1,061 LUT 552 FF
VexRiscv small and productive with I$ (RV32I, 0.72 DMIPS/Mhz, 4KB-I$) ->
Artix 7 -> 331 Mhz 727 LUT 600 FF
Cyclone V -> 152 Mhz 536 ALMs
Cyclone IV -> 156 Mhz 1,075 LUT 565 FF
VexRiscv full no cache (RV32IM, 1.22 DMIPS/Mhz, single cycle barrel shifter, debug module, catch exceptions, static branch) ->
Artix 7 -> 310 Mhz 1391 LUT 934 FF
@@ -104,21 +108,19 @@ VexRiscv full no cache (RV32IM, 1.22 DMIPS/Mhz, single cycle barrel shifter, deb
Cyclone II -> 108 Mhz 1,939 LUT 959 FF
VexRiscv full (RV32IM, 1.21 DMIPS/Mhz with cache trashing, 4KB-I$,4KB-D$, single cycle barrel shifter, debug module, catch exceptions, static branch) ->
Artix 7 -> 250 Mhz 1911 LUT 1501 FF
Cyclone V -> 132 Mhz 1,266 ALMs
Cyclone IV -> 127 Mhz 2,733 LUT 1,762 FF
Cyclone II -> 103 Mhz 2,791 LUT 1,760 FF
Artix 7 -> 249 Mhz 1822 LUT 1362 FF
Cyclone V -> 128 Mhz 1,187 ALMs
Cyclone IV -> 107 Mhz 2,560 LUT 1,671 FF
VexRiscv full max perf -> (RV32IM, 1.44 DMIPS/Mhz, 16KB-I$,16KB-D$, single cycle barrel shifter, debug module, catch exceptions, dynamic branch prediction in the fetch stage, branch and shift operations done in the Execute stage) ->
Artix 7 -> 198 Mhz 1920 LUT 1528 FF
Cyclone V -> 90 Mhz 1,261 ALMs
Cyclone IV -> 88 Mhz 2,780 LUT 1,788 FF
Artix 7 -> 192 Mhz 1858 LUT 1392 FF
Cyclone V -> 89 Mhz 1,246 ALMs
Cyclone IV -> 85 Mhz 2,673 LUT 1,679 FF
VexRiscv full with MMU (RV32IM, 1.26 DMIPS/Mhz with cache trashing, 4KB-I$, 4KB-D$, single cycle barrel shifter, debug module, catch exceptions, dynamic branch, MMU) ->
Artix 7 -> 223 Mhz 2085 LUT 2020 FF
Cyclone V -> 110 Mhz 1,503 ALMs
Cyclone IV -> 108 Mhz 3,153 LUT 2,281 FF
Cyclone II -> 94 Mhz 3,187 LUT 2,281 FF
Artix 7 -> 208 Mhz 2092 LUT 1881 FF
Cyclone V -> 112 Mhz 1,435 ALMs
Cyclone IV -> 94 Mhz 2,980 LUT 2,169 FF
```
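
As a rough way to read the figures above, absolute Dhrystone throughput is the DMIPS/MHz figure multiplied by the achieved clock frequency. The following is a minimal plain-Scala sketch of that arithmetic, using the Fmax values of the "small and productive with I$" rows. The `DmipsEstimate` object and its `dmips` helper are illustrative names only and are not part of VexRiscv.

```scala
object DmipsEstimate {
  // Absolute Dhrystone throughput = (DMIPS per MHz) * (clock frequency in MHz)
  def dmips(dmipsPerMhz: Double, fmaxMhz: Double): Double = dmipsPerMhz * fmaxMhz

  def main(args: Array[String]): Unit = {
    // Fmax values copied from the "small and productive with I$" rows (0.72 DMIPS/MHz)
    val fmaxByTarget = Seq("Artix 7" -> 331.0, "Cyclone V" -> 152.0, "Cyclone IV" -> 156.0)
    for ((target, fmax) <- fmaxByTarget)
      println(f"$target%-10s ${dmips(0.72, fmax)}%.1f DMIPS")
  }
}
```

The result is only an estimate: the DMIPS/MHz numbers come from simulation, while the frequencies are synthesis results for each FPGA family.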

Here is a summary of the configuration which produces 1.44 DMIPS:
@@ -293,9 +295,9 @@ You can find some FPGA project which instantiate the Briey SoC there (DE1-SoC, D
Here are some measurements of the Briey SoC timings and area:

```
Artix 7 -> 231 Mhz 3339 LUT 3533 FF
Cyclone V -> 124 Mhz 2,264 ALMs
Cyclone IV -> 124 Mhz 4,709 LUT 3,716 FF
Artix 7 -> 239 Mhz 3227 LUT 3410 FF
Cyclone V -> 125 Mhz 2,207 ALMs
Cyclone IV -> 112 Mhz 4,594 LUT 3,620 FF
```

## Murax SoC
@@ -695,7 +697,23 @@ This plugin fit in the fetch stage

#### IBusCachedPlugin

Single way cache implementation, documentation WIP
Simple and light multi way instruction cache.

| Parameters | type | description |
| ------ | ----------- | ------ |
| cacheSize | Int | Total storage capacity of the cache |
| bytePerLine | Int | Number of bytes per cache line |
| wayCount | Int | Number of cache ways |
| twoCycleRam | Boolean | Check the tag values in the decode stage instead of the fetch stage to relax timings |
| asyncTagMemory | Boolean | Read the cache tags asynchronously instead of synchronously |
| addressWidth | Int | Address width, should be 32 |
| cpuDataWidth | Int | CPU data width, should be 32 |
| memDataWidth | Int | Memory data width, could potentially be something other than 32, but only 32 is currently tested |
| catchIllegalAccess | Boolean | Catch memory accesses made to an invalid memory address (MMU) |
| catchAccessFault | Boolean | Catch when the memory bus responds with an error |
| catchMemoryTranslationMiss | Boolean | Catch when the MMU misses in the TLB |

Note: If you enable twoCycleRam and wayCount is greater than one, then the register file plugin should be configured to read the regFile asynchronously.
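
To make the relationship between these parameters concrete, here is a small plain-Scala sketch. `ICacheParams` is a hypothetical stand-in that mirrors only four of the parameters above. It is not the real `InstructionCacheConfig` class and needs no SpinalHDL dependency. It derives the number of cache lines and encodes the constraint from the note as a check.

```scala
// Hypothetical plain-Scala mirror of a few IBusCachedPlugin parameters.
// This is NOT the real vexriscv.ip.InstructionCacheConfig class.
final case class ICacheParams(
  cacheSize: Int,       // total storage capacity, in bytes
  bytePerLine: Int,     // bytes per cache line
  wayCount: Int,        // number of ways
  twoCycleRam: Boolean  // tags checked in decode instead of fetch
) {
  require(cacheSize > 0 && bytePerLine > 0 && wayCount > 0, "sizes must be positive")
  require(cacheSize % (bytePerLine * wayCount) == 0,
    "cacheSize must split evenly into ways and lines")

  // Total number of lines held by the cache
  def lineCount: Int = cacheSize / bytePerLine

  // Number of lines held by each way
  def linesPerWay: Int = lineCount / wayCount

  // Encodes the note: twoCycleRam with several ways needs an asynchronous regfile read
  def needsAsyncRegFile: Boolean = twoCycleRam && wayCount > 1
}

object ICacheParamsDemo {
  def main(args: Array[String]): Unit = {
    val direct = ICacheParams(cacheSize = 4096, bytePerLine = 32, wayCount = 1, twoCycleRam = true)
    val twoWay = direct.copy(wayCount = 2)
    println(s"direct mapped: ${direct.lineCount} lines, async regfile needed: ${direct.needsAsyncRegFile}")
    println(s"two way: ${twoWay.linesPerWay} lines per way, async regfile needed: ${twoWay.needsAsyncRegFile}")
  }
}
```

With the 4 KB, 32 bytes per line configuration used by the demos, this gives 128 lines for a single way and 64 lines per way for two ways.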

#### DecoderSimplePlugin

1 change: 1 addition & 0 deletions src/main/scala/vexriscv/Pipeline.scala
@@ -119,6 +119,7 @@ trait Pipeline {
for(stageIndex <- 0 until stages.length; stage = stages(stageIndex)){
stage.arbitration.isStuckByOthers := stage.arbitration.haltByOther || stages.takeRight(stages.length - stageIndex - 1).map(s => s.arbitration.haltItself/* && !s.arbitration.removeIt*/).foldLeft(False)(_ || _)
stage.arbitration.isStuck := stage.arbitration.haltItself || stage.arbitration.isStuckByOthers
stage.arbitration.isMoving := !stage.arbitration.isStuck && !stage.arbitration.removeIt
stage.arbitration.isFiring := stage.arbitration.isValid && !stage.arbitration.isStuck && !stage.arbitration.removeIt
}
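
The arbitration equations in this hunk can be checked without SpinalHDL. The sketch below is a plain-Scala software model, assuming stages are ordered from the first pipeline stage to the last. `StageIn`, `StageOut` and `ArbitrationModel` are illustrative names that do not exist in VexRiscv, and the model ignores everything other than the four assignments shown above.

```scala
// Plain-Scala model (no SpinalHDL) of the per-stage arbitration equations.
final case class StageIn(
  isValid: Boolean,
  haltItself: Boolean = false,
  haltByOther: Boolean = false,
  removeIt: Boolean = false
)

final case class StageOut(
  isStuckByOthers: Boolean,
  isStuck: Boolean,
  isMoving: Boolean,
  isFiring: Boolean
)

object ArbitrationModel {
  // stages(0) is the earliest stage, stages.last the latest one
  def evaluate(stages: Seq[StageIn]): Seq[StageOut] =
    stages.indices.map { i =>
      val s = stages(i)
      // A stage is blocked when any later stage halts itself
      val laterHalts = stages.drop(i + 1).exists(_.haltItself)
      val isStuckByOthers = s.haltByOther || laterHalts
      val isStuck = s.haltItself || isStuckByOthers
      // isMoving does not depend on isValid, isFiring does
      val isMoving = !isStuck && !s.removeIt
      val isFiring = s.isValid && !isStuck && !s.removeIt
      StageOut(isStuckByOthers, isStuck, isMoving, isFiring)
    }

  def main(args: Array[String]): Unit = {
    val out = evaluate(Seq(
      StageIn(isValid = true),
      StageIn(isValid = false),
      StageIn(isValid = true, haltItself = true)
    ))
    out.zipWithIndex.foreach { case (o, i) => println(s"stage $i: $o") }
  }
}
```

The only difference between the new `isMoving` and the existing `isFiring` is the `isValid` term, so an empty stage that is neither stuck nor removed reports `isMoving` without `isFiring`.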

2 changes: 1 addition & 1 deletion src/main/scala/vexriscv/Services.scala
@@ -8,7 +8,7 @@ import spinal.lib._
import scala.beans.BeanProperty

trait JumpService{
def createJumpInterface(stage : Stage) : Flow[UInt]
def createJumpInterface(stage : Stage, priority : Int = 0) : Flow[UInt]
}

trait DecoderService{
2 changes: 2 additions & 0 deletions src/main/scala/vexriscv/Stage.scala
@@ -48,11 +48,13 @@ class Stage() extends Area{
val haltByOther = False //When settable, stuck the instruction, should only be set by something else than the stucked instruction
val removeIt = False //When settable, unschedule the instruction as if it was never executed (no side effect)
val flushAll = False //When settable, unschedule instructions in the current stage and all prior ones
val redoIt = False //Allow to notify that a given instruction in a pipeline is rescheduled
val isValid = RegInit(False) //Inform if an instruction is in the current stage
val isStuck = Bool //Inform if the instruction is stuck (haltItself || haltByOther)
val isStuckByOthers = Bool //Inform if the instruction is stuck by somebody else
def isRemoved = removeIt //Inform if the instruction is going to be unscheduled in the current cycle
val isFlushed = Bool //Inform if the instruction is flushed (flushAll set in the current or subsequent stages)
val isMoving = Bool //Inform if the instruction is going somewhere else (next stage or unscheduled)
val isFiring = Bool //Inform if the current instruction will go to the next stage the next cycle (isValid && !isStuck && !removeIt)
}

13 changes: 6 additions & 7 deletions src/main/scala/vexriscv/TestsWorkspace.scala
@@ -41,21 +41,20 @@ object TestsWorkspace {
// ),
new IBusCachedPlugin(
config = InstructionCacheConfig(
cacheSize = 4096*4,
bytePerLine =32,
cacheSize = 2048,
bytePerLine = 32,
wayCount = 1,
wrappedMemAccess = true,
addressWidth = 32,
cpuDataWidth = 32,
memDataWidth = 32,
catchIllegalAccess = true,
catchAccessFault = true,
catchMemoryTranslationMiss = true,
asyncTagMemory = false,
twoStageLogic = true
twoCycleRam = false
),
askMemoryTranslation = true,
memoryTranslatorPortConfig = MemoryTranslatorPortConfig(
memoryTranslatorPortConfig = MemoryTranslatorPortConfig(
portTlbSize = 4
)
),
@@ -95,7 +94,7 @@ object TestsWorkspace {
catchIllegalInstruction = true
),
new RegFilePlugin(
regFileReadyKind = plugin.SYNC,
regFileReadyKind = plugin.ASYNC,
zeroBoot = false
),
new IntAluPlugin,
@@ -117,7 +116,7 @@ object TestsWorkspace {
// new HazardSimplePlugin(false, false, false, false),
new MulPlugin,
new DivPlugin,
new CsrPlugin(CsrPluginConfig.all(0x80000020l)),
new CsrPlugin(CsrPluginConfig.all(0x80000020l).copy(deterministicInteruptionEntry = false)),
new DebugPlugin(ClockDomain.current.clone(reset = Bool().setName("debugReset"))),
new BranchPlugin(
earlyBranch = true,
1 change: 1 addition & 0 deletions src/main/scala/vexriscv/VexRiscv.scala
@@ -66,6 +66,7 @@ class VexRiscv(val config : VexRiscvConfig) extends Component with Pipeline{
decode.input(config.INSTRUCTION).addAttribute(Verilator.public)
decode.input(config.PC).addAttribute(Verilator.public)
decode.arbitration.isValid.addAttribute(Verilator.public)
decode.arbitration.flushAll.addAttribute(Verilator.public)
decode.arbitration.haltItself.addAttribute(Verilator.public)
writeBack.input(config.INSTRUCTION) keep() addAttribute(Verilator.public)
writeBack.input(config.PC) keep() addAttribute(Verilator.public)
3 changes: 1 addition & 2 deletions src/main/scala/vexriscv/demo/Briey.scala
@@ -57,15 +57,14 @@ object BrieyConfig{
cacheSize = 4096,
bytePerLine =32,
wayCount = 1,
wrappedMemAccess = true,
addressWidth = 32,
cpuDataWidth = 32,
memDataWidth = 32,
catchIllegalAccess = true,
catchAccessFault = true,
catchMemoryTranslationMiss = true,
asyncTagMemory = false,
twoStageLogic = true
twoCycleRam = true
)
// askMemoryTranslation = true,
// memoryTranslatorPortConfig = MemoryTranslatorPortConfig(
6 changes: 6 additions & 0 deletions src/main/scala/vexriscv/demo/DhrystoneBench.scala
@@ -46,6 +46,12 @@ object DhrystoneBench extends App{
test = "make clean run REDO=10 IBUS=SIMPLE DBUS=SIMPLE CSR=no MMU=no DEBUG_PLUGIN=no MUL=no DIV=no"
)

getDmips(
name = "GenSmallAndProductiveWithICache",
gen = GenSmallAndProductiveICache.main(null),
test = "make clean run REDO=10 IBUS=CACHED DBUS=SIMPLE CSR=no MMU=no DEBUG_PLUGIN=no MUL=no DIV=no"
)


getDmips(
name = "GenFullNoMmuNoCache",
2 changes: 1 addition & 1 deletion src/main/scala/vexriscv/demo/FormalSimple.scala
@@ -11,7 +11,7 @@ object FormalSimple extends App{
def cpu() = new VexRiscv(
config = VexRiscvConfig(
plugins = List(
new FomalPlugin,
new FormalPlugin,
new HaltOnExceptionPlugin,
new PcManagerSimplePlugin(
resetVector = 0x00000000l,
3 changes: 1 addition & 2 deletions src/main/scala/vexriscv/demo/GenFull.scala
@@ -21,15 +21,14 @@ object GenFull extends App{
cacheSize = 4096,
bytePerLine =32,
wayCount = 1,
wrappedMemAccess = true,
addressWidth = 32,
cpuDataWidth = 32,
memDataWidth = 32,
catchIllegalAccess = true,
catchAccessFault = true,
catchMemoryTranslationMiss = true,
asyncTagMemory = false,
twoStageLogic = true
twoCycleRam = true
),
askMemoryTranslation = true,
memoryTranslatorPortConfig = MemoryTranslatorPortConfig(
3 changes: 1 addition & 2 deletions src/main/scala/vexriscv/demo/GenFullNoMmu.scala
@@ -21,15 +21,14 @@ object GenFullNoMmu extends App{
cacheSize = 4096,
bytePerLine =32,
wayCount = 1,
wrappedMemAccess = true,
addressWidth = 32,
cpuDataWidth = 32,
memDataWidth = 32,
catchIllegalAccess = true,
catchAccessFault = true,
catchMemoryTranslationMiss = true,
asyncTagMemory = false,
twoStageLogic = true
twoCycleRam = true
)
),
new DBusCachedPlugin(
3 changes: 1 addition & 2 deletions src/main/scala/vexriscv/demo/GenFullNoMmuMaxPerf.scala
@@ -21,15 +21,14 @@ object GenFullNoMmuMaxPerf extends App{
cacheSize = 4096*4,
bytePerLine =32,
wayCount = 1,
wrappedMemAccess = true,
addressWidth = 32,
cpuDataWidth = 32,
memDataWidth = 32,
catchIllegalAccess = true,
catchAccessFault = true,
catchMemoryTranslationMiss = false,
asyncTagMemory = false,
twoStageLogic = true
twoCycleRam = true
)
),
new DBusCachedPlugin(
73 changes: 73 additions & 0 deletions src/main/scala/vexriscv/demo/GenSmallAndPerformantICache.scala
@@ -0,0 +1,73 @@
package vexriscv.demo

import vexriscv.plugin._
import vexriscv.{VexRiscv, VexRiscvConfig, plugin}
import spinal.core._
import vexriscv.ip.InstructionCacheConfig

/**
* Created by spinalvm on 15.06.17.
*/
object GenSmallAndProductiveICache extends App{
def cpu() = new VexRiscv(
config = VexRiscvConfig(
plugins = List(
new PcManagerSimplePlugin(
resetVector = 0x00000000l,
relaxedPcCalculation = false
),
new IBusCachedPlugin(
config = InstructionCacheConfig(
cacheSize = 4096,
bytePerLine = 32,
wayCount = 1,
addressWidth = 32,
cpuDataWidth = 32,
memDataWidth = 32,
catchIllegalAccess = false,
catchAccessFault = false,
catchMemoryTranslationMiss = false,
asyncTagMemory = false,
twoCycleRam = false
),
askMemoryTranslation = false
),
new DBusSimplePlugin(
catchAddressMisaligned = false,
catchAccessFault = false
),
new CsrPlugin(CsrPluginConfig.smallest),
new DecoderSimplePlugin(
catchIllegalInstruction = false
),
new RegFilePlugin(
regFileReadyKind = plugin.SYNC,
zeroBoot = false
),
new IntAluPlugin,
new SrcPlugin(
separatedAddSub = false,
executeInsertion = true
),
new LightShifterPlugin,
new HazardSimplePlugin(
bypassExecute = true,
bypassMemory = true,
bypassWriteBack = true,
bypassWriteBackBuffer = true,
pessimisticUseSrc = false,
pessimisticWriteRegFile = false,
pessimisticAddressMatch = false
),
new BranchPlugin(
earlyBranch = false,
catchAddressMisaligned = false,
prediction = NONE
),
new YamlPlugin("cpu0.yaml")
)
)
)

SpinalVerilog(cpu())
}
