Perceptron predictor #374

ABenC377 · 2024-01-19T12:27:32Z

This PR introduces a new PerceptronPredictor class as an alternative to the GenericPredictor currently used in all the config files. It improves the branch prediction rate for every benchmark except some of the serial STREAM runs, which the GenericPredictor was already predicting well.

The PerceptronPredictor does more work per prediction, so it is marginally slower per prediction than the GenericPredictor. Its improved accuracy compensates for this in most cases: SimEng was faster with it for all but 15 of the 72 benchmarks. Those 15 are generally already very quick, and the worst regression I observed was 3.6%, compared to improvements of up to 35% for the other benchmarks. On average, runtime changed by -5% (relative) and the misprediction rate by -8.5% (raw, i.e. percentage points).
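For readers unfamiliar with the technique, here is a minimal sketch of a perceptron branch predictor in the style of Jiménez and Lin's scheme. It is not the PerceptronPredictor added in this PR: the class name `PerceptronSketch`, the table indexing, the 8-bit saturating weights and the training threshold are all assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Illustrative only: NOT SimEng's PerceptronPredictor. All names, sizes and
// the indexing scheme are assumptions made for this sketch.
class PerceptronSketch {
 public:
  PerceptronSketch(std::size_t tableBits, std::size_t historyLength)
      : historyLength_(historyLength),
        // Commonly quoted training threshold heuristic (assumed here).
        threshold_(static_cast<int>(1.93 * historyLength + 14)),
        history_(historyLength, -1),
        table_(std::size_t{1} << tableBits,
               std::vector<std::int8_t>(historyLength + 1, 0)) {}

  // Predict taken when the weighted sum of the history is non-negative.
  bool predict(std::uint64_t address) const {
    return dotProduct(table_[index(address)]) >= 0;
  }

  // Train on the real outcome, then record it in the global history.
  void update(std::uint64_t address, bool taken) {
    std::vector<std::int8_t>& weights = table_[index(address)];
    const int output = dotProduct(weights);
    const int target = taken ? 1 : -1;
    const bool mispredicted = (output >= 0) != taken;
    // Adjust weights on a misprediction or when confidence is low.
    if (mispredicted || std::abs(output) <= threshold_) {
      for (std::size_t i = 0; i < historyLength_; i++) {
        weights[i] = saturate(weights[i] + target * history_[i]);
      }
      weights[historyLength_] = saturate(weights[historyLength_] + target);
    }
    // Shift the newest outcome into the global history.
    for (std::size_t i = 0; i + 1 < historyLength_; i++) {
      history_[i] = history_[i + 1];
    }
    if (historyLength_ > 0) history_[historyLength_ - 1] = target;
  }

 private:
  // Assumes 4-byte instructions, so the low two address bits are dropped.
  std::size_t index(std::uint64_t address) const {
    return static_cast<std::size_t>(address >> 2) & (table_.size() - 1);
  }

  // Bias weight (stored last) plus the sum of weight * history (+1 or -1).
  int dotProduct(const std::vector<std::int8_t>& weights) const {
    int sum = weights[historyLength_];
    for (std::size_t i = 0; i < historyLength_; i++) {
      sum += weights[i] * history_[i];
    }
    return sum;
  }

  // Clamp to a symmetric range that fits in a signed 8-bit weight.
  static std::int8_t saturate(int value) {
    if (value > 127) return 127;
    if (value < -127) return -127;
    return static_cast<std::int8_t>(value);
  }

  std::size_t historyLength_;
  int threshold_;
  std::vector<int> history_;  // +1 for taken, -1 for not taken
  std::vector<std::vector<std::int8_t>> table_;
};
```

With `PerceptronSketch p(4, 8)`, an untrained entry predicts taken because its weighted sum is zero; a single `p.update(0x400, false)` is enough to flip the prediction for that address to not taken. The dot product loop is the extra per-prediction work mentioned above.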

Benchmark	DEV time	DEV mispredict	PERCEPTRON time	PERCEPTRON mispredict	Runtime change (relative %)	Mispredict change (percentage points)
CloverLeaf serial gcc8.3.0 armv8.4	251655ms	35.7%	161811ms	25%	-35% ✅	-10.7% ✅
CloverLeaf serial gcc9.3.0 armv8.4	171576ms	34.5%	158353ms	23%	-7.7% ✅	-11.5% ✅
CloverLeaf serial gcc10.3.0 armv8.4	173001ms	36.7%	163963ms	24.8%	-5.2% ✅	-11.9% ✅
CloverLeaf serial armclang20 armv8.4	142201ms	31.6%	134777ms	22.5%	-5.2% ✅	-9.1% ✅
CloverLeaf openmp gcc8.3.0 armv8.4	227772ms	34.6%	205901ms	24.1%	-9.6% ✅	-10.5% ✅
CloverLeaf openmp gcc9.3.0 armv8.4	225624ms	33.5%	199719ms	22.6%	-11.5% ✅	-10.9% ✅
CloverLeaf openmp gcc10.3.0 armv8.4	226401ms	34.5%	200722ms	23.5%	-11.3% ✅	-11% ✅
CloverLeaf openmp armclang20 armv8.4	193129ms	31.6%	170551ms	20.5%	-11.7% ✅	-11.1% ✅
miniBUDE openmp gcc8.3.0 armv8.4	201411ms	9.96%	203321ms	8.64%	+0.9% ❌	-1.3% ✅
miniBUDE openmp gcc9.3.0 armv8.4	201666ms	9.93%	202440ms	8.59%	+0.4% ❌	-1.3% ✅
miniBUDE openmp gcc10.3.0 armv8.4	201324ms	10%	202448ms	8.61%	+0.6% ❌	-1.4% ✅
miniBUDE openmp armclang20 armv8.4	183828ms	11.6%	185367ms	11.4%	+0.8% ❌	-0.2% ✅
STREAM serial gcc8.3.0 armv8.4	74849ms	0.619%	77580ms	0.601%	+3.6% ❌	-0.1% ✅
STREAM serial gcc9.3.0 armv8.4	76025ms	0.942%	78402ms	0.774%	+3.1% ❌	-0.2% ✅
STREAM serial gcc10.3.0 armv8.4	76237ms	0.654%	78152ms	0.838%	+2.5% ❌	+0.2% ❌
STREAM serial armclang20 armv8.4	84461ms	1.16%	87023ms	1.24%	+3.0% ❌	+0.1% ❌
STREAM openmp gcc8.3.0 armv8.4	129391ms	11.5%	113201ms	2.76%	-12.5% ✅	-8.7% ✅
STREAM openmp gcc9.3.0 armv8.4	127899ms	11%	115083ms	2.4%	-10% ✅	-8.6% ✅
STREAM openmp gcc10.3.0 armv8.4	126097ms	11.4%	112675ms	2.75%	-10.6% ✅	-8.6% ✅
STREAM openmp armclang20 armv8.4	131196ms	14.4%	123741ms	5.07%	-5.7% ✅	-9.3% ✅
TeaLeaf 2D serial gcc8.3.0 armv8.4	128641ms	24.9%	126449ms	20.7%	-1.7% ✅	-4.2% ✅
TeaLeaf 2D serial gcc9.3.0 armv8.4	127964ms	25%	126731ms	21.6%	-1.0% ✅	-3.6% ✅
TeaLeaf 2D serial gcc10.3.0 armv8.4	129052ms	24.9%	128624ms	20.8%	-0.3% ✅	-4.1% ✅
TeaLeaf 2D serial armclang20 armv8.4	233988ms	13.4%	236103ms	10.9%	+0.9% ❌	-2.5% ✅
TeaLeaf 2D openmp gcc8.3.0 armv8.4	217511ms	27.6%	182287ms	11.2%	-16.2% ✅	-16.4% ✅
TeaLeaf 2D openmp gcc9.3.0 armv8.4	214477ms	25.4%	183910ms	11.7%	-14.3% ✅	-11.1% ✅
TeaLeaf 2D openmp gcc10.3.0 armv8.4	216781ms	27.6%	183236ms	11.5%	-15.5% ✅	-16.1% ✅
TeaLeaf 2D openmp armclang20 armv8.4	598907ms	16.5%	585522ms	9.29%	-2.2% ✅	-7.2% ✅
TeaLeaf 3D serial gcc8.3.0 armv8.4	153938ms	17.3%	146189ms	11.2%	-5.0% ✅	-6.1% ✅
TeaLeaf 3D serial gcc9.3.0 armv8.4	158193ms	18.6%	152422ms	12.5%	-3.6% ✅	-6.1% ✅
TeaLeaf 3D serial gcc10.3.0 armv8.4	156685ms	17.9%	152368ms	12.9%	-2.8% ✅	-5.0% ✅
TeaLeaf 3D serial armclang20 armv8.4	220240ms	29.2%	216672ms	22.4%	-1.6% ✅	-6.8% ✅
TeaLeaf 3D openmp gcc8.3.0 armv8.4	297986ms	27.9%	240845ms	9.66%	-19.2% ✅	-18.2% ✅
TeaLeaf 3D openmp gcc9.3.0 armv8.4	307134ms	28.5%	248632ms	11.1%	-19.0% ✅	-17.4% ✅
TeaLeaf 3D openmp gcc10.3.0 armv8.4	300711ms	28%	242736ms	10.5%	-19.3% ✅	-17.5% ✅
TeaLeaf 3D openmp armclang20 armv8.4	489231ms	28.8%	449169ms	17.3%	-8.2% ✅	-11.5% ✅
CloverLeaf serial gcc8.3.0 armv8.4+sve	169930ms	35.2%	150743ms	24.3%	-11.3% ✅	-10.9% ✅
CloverLeaf serial gcc9.3.0 armv8.4+sve	166786ms	34.2%	149215ms	22.9%	-10.5% ✅	-11.3% ✅
CloverLeaf serial gcc10.3.0 armv8.4+sve	171339ms	36.5%	149683ms	25.2%	-12.6% ✅	-11.3% ✅
CloverLeaf serial armclang20 armv8.4+sve	159721ms	33%	137543ms	21.7%	-13.9% ✅	-11.3% ✅
CloverLeaf openmp gcc8.3.0 armv8.4+sve	225588ms	34.3%	198457ms	23.6%	-12.0% ✅	-10.7% ✅
CloverLeaf openmp gcc9.3.0 armv8.4+sve	221890ms	33.4%	194197ms	22.3%	-12.5% ✅	-11.1% ✅
CloverLeaf openmp gcc10.3.0 armv8.4+sve	223602ms	34.1%	195536ms	23.4%	-12.6% ✅	-10.7% ✅
CloverLeaf openmp armclang20 armv8.4+sve	206803ms	32.5%	179017ms	21.6%	-13.4% ✅	-10.9% ✅
miniBUDE openmp gcc8.3.0 armv8.4+sve	84149ms	23.6%	80624ms	16.5%	-4.2% ✅	-7.1% ✅
miniBUDE openmp gcc9.3.0 armv8.4+sve	81961ms	25.1%	77745ms	16.7%	-5.1% ✅	-8.4% ✅
miniBUDE openmp gcc10.3.0 armv8.4+sve	80809ms	24%	77152ms	16.5%	-4.5% ✅	-7.5% ✅
miniBUDE openmp armclang20 armv8.4+sve	80084ms	22.9%	80877ms	22.6%	+1.0% ❌	-0.3% ✅
STREAM serial gcc8.3.0 armv8.4+sve	40075ms	1.68%	40545ms	1.9%	+1.2% ❌	+0.2% ❌
STREAM serial gcc9.3.0 armv8.4+sve	40034ms	1.84%	41154ms	2.1%	+2.8% ❌	+0.3% ❌
STREAM serial gcc10.3.0 armv8.4+sve	39628ms	2.04%	40867ms	2.02%	+3.1% ❌	-0.0% ✅
STREAM serial armclang20 armv8.4+sve	24085ms	2.13%	24842ms	1.88%	+3.1% ❌	-0.2% ✅
STREAM openmp gcc8.3.0 armv8.4+sve	91458ms	19.5%	77394ms	5.01%	-15.4% ✅	-14.4% ✅
STREAM openmp gcc9.3.0 armv8.4+sve	91222ms	19.7%	78769ms	6.66%	-13.7% ✅	-13% ✅
STREAM openmp gcc10.3.0 armv8.4+sve	89951ms	19.5%	77005ms	5.41%	-14.4% ✅	-15.1% ✅
STREAM openmp armclang20 armv8.4+sve	76133ms	18.3%	65194ms	5.69%	-14.4% ✅	-12.6% ✅
TeaLeaf 2D serial gcc8.3.0 armv8.4+sve	130373ms	25%	128608ms	21.3%	-1.4% ✅	-3.7% ✅
TeaLeaf 2D serial gcc9.3.0 armv8.4+sve	129521ms	24.9%	127291ms	21.7%	-1.7% ✅	-3.2% ✅
TeaLeaf 2D serial gcc10.3.0 armv8.4+sve	131156ms	25%	128176ms	20.7%	-2.3% ✅	-4.3% ✅
TeaLeaf 2D serial armclang20 armv8.4+sve	99518ms	19.5%	95118ms	14.2%	-4.4% ✅	-5.3% ✅
TeaLeaf 2D openmp gcc8.3.0 armv8.4+sve	217552ms	27.4%	185782ms	11.6%	-14.6% ✅	-15.6% ✅
TeaLeaf 2D openmp gcc9.3.0 armv8.4+sve	214994ms	26.5%	192862ms	11.7%	-10.3% ✅	-14.8% ✅
TeaLeaf 2D openmp gcc10.3.0 armv8.4+sve	213683ms	27.5%	180638ms	11.9%	-15.5% ✅	-15.7% ✅
TeaLeaf 2D openmp armclang20 armv8.4+sve	822471ms	16.4%	576071ms	8.79%	-30.0% ✅	-7.6% ✅
TeaLeaf 3D serial gcc8.3.0 armv8.4+sve	130973ms	26.4%	130850ms	20.6%	-0.1% ✅	-5.8% ✅
TeaLeaf 3D serial gcc9.3.0 armv8.4+sve	131887ms	26.4%	131509ms	20.8%	-0.3% ✅	-5.6% ✅
TeaLeaf 3D serial gcc10.3.0 armv8.4+sve	132279ms	26.1%	130915ms	21.7%	-1.0% ✅	-4.4% ✅
TeaLeaf 3D serial armclang20 armv8.4+sve	219685ms	29.9%	212249ms	23.1%	-3.4% ✅	-6.8% ✅
TeaLeaf 3D openmp gcc8.3.0 armv8.4+sve	269587ms	31.3%	228886ms	14.1%	-15.1% ✅	-16.2% ✅
TeaLeaf 3D openmp gcc9.3.0 armv8.4+sve	273317ms	30.7%	234772ms	14.3%	-14.1% ✅	-16.4% ✅
TeaLeaf 3D openmp gcc10.3.0 armv8.4+sve	273946ms	32.1%	226325ms	15.5%	-17.4% ✅	-16.6% ✅
TeaLeaf 3D openmp armclang20 armv8.4+sve	530223ms	30.8%	481883ms	17.4%	-9.1% ✅	-13.4% ✅

dANW34V3R

The implementation looks great on the whole. Just some comments on clarity and naming.

configs/DEMO_RISCV.yaml

src/lib/config/ModelConfig.cc

src/include/simeng/PerceptronPredictor.hh

src/lib/PerceptronPredictor.cc

FinnWilkinson · 2024-01-19T17:34:23Z

Please can you update the documentation with the new config options and add a small section about how the new predictor works?

rahahahat · 2024-01-22T13:05:02Z

I was just wondering if it would be possible to add the sources (research papers/websites) you used for the implementation.

This could help reviewers and is a good piece of documentation for future changes.

JosephMoore25 · 2024-01-22T15:58:45Z

I suspect these tests were run in debug mode. It's great that you have lots of results, but it may be worth doing a couple of comparative spot checks in release mode to ensure that the speedup is consistent between the two (the misprediction rate should remain the same), as this is the mode in which performance matters most. No need to rerun all the tests unless differences are noticed, or if this is already in release.

configs/DEMO_RISCV.yaml

src/include/simeng/PerceptronPredictor.hh

src/lib/PerceptronPredictor.cc

test/unit/PerceptronPredictorTest.cc

docs/sphinx/developer/components/branchPred.rst

src/include/simeng/PerceptronPredictor.hh

JosephMoore25

Looks good overall, nice work.

A couple small comments, although others' reviews mostly encapsulate changes that need to happen before approval.

src/lib/PerceptronPredictor.cc

src/include/simeng/PerceptronPredictor.hh

test/regression/RegressionTest.cc

Adding tests for perceptron predictor comparable to generic predictor tests, and amending existing tests which are effected by the changes to the BP config structure

FinnWilkinson

All looks great

docs/sphinx/developer/components/branchPred.rst

docs/sphinx/user/configuring_simeng.rst

src/include/simeng/PerceptronPredictor.hh

docs/sphinx/developer/components/branchPred.rst

JosephMoore25

Looks all good now. Nice work, and nice performance improvements 👍

ABenC377 requested review from dANW34V3R, jj16791, FinnWilkinson and JosephMoore25 January 19, 2024 12:27

ABenC377 linked an issue Jan 19, 2024 that may be closed by this pull request

Update generic branch predictor #354

Closed

FinnWilkinson assigned ABenC377 Jan 19, 2024

FinnWilkinson added the enhancement label Jan 19, 2024

dANW34V3R requested changes Jan 19, 2024

FinnWilkinson requested changes Jan 23, 2024

jj16791 reviewed Jan 23, 2024

docs/sphinx/developer/components/branchPred.rst (outdated, resolved)

src/include/simeng/PerceptronPredictor.hh (resolved)

JosephMoore25 requested changes Jan 23, 2024

src/lib/PerceptronPredictor.cc (outdated, resolved)

src/lib/PerceptronPredictor.cc (outdated, resolved)

src/lib/PerceptronPredictor.cc (outdated, resolved)

src/lib/PerceptronPredictor.cc (outdated, resolved)

dANW34V3R mentioned this pull request Jan 26, 2024

Config Docs Range and Core Type Clarifications #376

Open

FinnWilkinson requested changes Feb 5, 2024

src/include/simeng/PerceptronPredictor.hh (outdated, resolved)

test/regression/RegressionTest.cc (outdated, resolved)

ABenC377 force-pushed the perceptron_predictor branch from 6f71505 to 7cf73cd Compare February 7, 2024 13:25

ABenC377 added 12 commits February 7, 2024 15:39

Adding PerceptronPredictor class and rebasing

52f6f96

Updating default config options to include Branch Predictor Type

fec45b7

Shuffling order of declarations in PerceptronPRedictor.hh

b0afe42

Shuffling order of declarations in PerceptronPRedictor.hh

68387e0

debugging -- squash this commit

bc95bce

Sorting out tests

8f54745

Adding tests for perceptron predictor comparable to generic predictor tests, and amending existing tests which are effected by the changes to the BP config structure

tidying

0aa88a1

Adding to documentation

c1a3cd2

Adding to documentation

ede562d

Adding comments, adding if to config

8a042c3

Adjusting tests in view of config changes

13e79e2

clang format and finessing

4351315

ABenC377 force-pushed the perceptron_predictor branch from 43d94b0 to 4351315 Compare February 7, 2024 15:42

passing perceptron by reference in getDotProduct()

2e6e041

ABenC377 dismissed jj16791’s stale review via 2e6e041 February 9, 2024 10:36

FinnWilkinson previously approved these changes Feb 9, 2024

jj16791 previously approved these changes Feb 9, 2024

dANW34V3R requested changes Feb 9, 2024

doc changes

1b194a7

ABenC377 dismissed stale reviews from jj16791 and FinnWilkinson via 1b194a7 February 9, 2024 14:07

ABenC377 requested review from FinnWilkinson, jj16791 and dANW34V3R February 9, 2024 14:29

doc changes

5413b63

jj16791 previously approved these changes Feb 9, 2024

FinnWilkinson previously approved these changes Feb 12, 2024

JosephMoore25 previously approved these changes Feb 12, 2024

Changes docs

5f2b5b8

ABenC377 dismissed stale reviews from JosephMoore25, FinnWilkinson, and jj16791 via 5f2b5b8 February 13, 2024 20:20

Changes docs

526f509

ABenC377 requested review from jj16791, FinnWilkinson and JosephMoore25 February 13, 2024 20:45

dANW34V3R approved these changes Feb 13, 2024

jj16791 approved these changes Feb 14, 2024

JosephMoore25 approved these changes Feb 14, 2024

FinnWilkinson approved these changes Feb 14, 2024

ABenC377 merged commit 7c9ed78 into dev Feb 14, 2024

ABenC377 deleted the perceptron_predictor branch February 14, 2024 15:27

FinnWilkinson mentioned this pull request Feb 16, 2024

Update generic branch predictor #354

Closed
