BPU clock gating #2733

Merged
merged 5 commits into from
Apr 17, 2024

Conversation

eastonman
Member

No description provided.

@chenguokai
Member

CI failed

@eastonman
Member Author

CI seems to have failed for no apparent reason; just re-running for now.

@XiangShanRobot

[Generated by IPC robot]
commit: 6cd72ef

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
6cd72ef 1.871 0.461 2.093 1.197 2.908 2.178 1.398 0.919 1.430 1.123 3.612 2.656 2.301 3.149

master branch:

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
a42a7ff 1.878 0.461 2.100 1.213 2.914 2.181 1.395 0.920 1.421 1.126 3.617 2.659 2.302 3.165
b15e4c0 1.878 0.461 2.100 1.213 2.895 2.181 1.395 0.919 1.421 1.128 3.617 2.659 2.302 3.165
3c5d56a 1.878 0.461 2.100 1.209 2.913 2.182 1.395 0.920 1.421 1.123 3.617 2.660 2.302 3.165
8abe181 1.878 0.461 2.100 1.213 2.895 2.179 1.395 0.919 1.421 1.125 3.617 2.666 2.302 3.165
13156de 1.878 0.461 2.100 1.213 2.895 2.182 1.397 0.919 1.421 1.126 3.617 2.660 2.302 3.165
1fcb3bc 1.878 0.461 2.100 1.211 2.895 2.181 1.395 0.920 1.421 1.123 3.617 2.659 2.302 3.165
f3c16e1 1.878 0.461 2.100 1.213 2.895 2.182 1.396 0.920 1.421 1.126 3.617 2.659 2.302 3.165
45f43e6 1.878 0.461 2.100 1.211 2.895 2.182 1.395 0.920 1.421 1.127 3.617 2.659 2.302 3.165
8fae59b 1.878 0.461 2.100 1.213 2.913 2.182 1.394 0.920 1.421 1.123 3.617 2.647 2.302 3.165
a61a35e 1.879 0.460 2.092 1.200 2.895 2.176 1.394 0.917 1.436 1.125 3.617 2.663 2.298 3.134
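
For a quick read of the table above, here is a minimal sketch (not part of the IPC robot) that computes the per-benchmark relative IPC change of 6cd72ef against the first listed master row, a42a7ff. The values are copied from the table; the helper name `relative_change` is hypothetical, and picking a42a7ff as the baseline is an assumption.

```python
BENCHMARKS = ("astar copy_and_run coremark gcc gromacs lbm linux mcf "
              "microbench milc namd povray wrf xalancbmk").split()

# IPC values copied from the robot table above.
PR_6CD72EF = [1.871, 0.461, 2.093, 1.197, 2.908, 2.178, 1.398,
              0.919, 1.430, 1.123, 3.612, 2.656, 2.301, 3.149]
# Baseline choice (first listed master row) is an assumption.
MASTER_A42A7FF = [1.878, 0.461, 2.100, 1.213, 2.914, 2.181, 1.395,
                  0.920, 1.421, 1.126, 3.617, 2.659, 2.302, 3.165]


def relative_change(pr, base):
    """Return the per-benchmark IPC change of `pr` over `base`, in percent."""
    return [(p - b) / b * 100.0 for p, b in zip(pr, base)]


changes = dict(zip(BENCHMARKS, relative_change(PR_6CD72EF, MASTER_A42A7FF)))
worst = min(changes, key=changes.get)  # benchmark with the largest IPC drop

for name, pct in changes.items():
    print(f"{name:>12}: {pct:+.2f}%")
print(f"largest drop: {worst} ({changes[worst]:+.2f}%)")
```

A single baseline row hides run-to-run noise on master, so comparing against several master rows gives a fairer picture.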

@eastonman added the "do not merge" label Mar 4, 2024
@eastonman
Member Author

Behavior change is expected, but I will split the changes affecting performance into another PR.

@eastonman removed the "do not merge" label Mar 5, 2024
@eastonman
Member Author

eastonman commented Mar 5, 2024

I believe the performance fluctuation is due to a performance bug fixed in e6b2fa7, which has only a minor influence.

Of the 14 microbenchmarks, 6 show decreased MPKI, 1 is unchanged, and 7 show increased MPKI. I think the fluctuation is acceptable.

astar.log
copy_and_run.log
coremark.log
gcc.log
gromacs.log
lbm.log
linux.log
mcf.log
microbench.log
milc.log
namd.log
povray.log
wrf.log
xalancbmk.log
(Ubuntu)  manyang@open05 6445 $> cd ../6453
(Ubuntu)  manyang@open05 6453 $> ls | sort |xargs cat | grep BpBRight
[PERF ][time=             2672567] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               633722
[PERF ][time=              882819] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,                62733
[PERF ][time=             1503050] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               599581
[PERF ][time=             4178172] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               695115
[PERF ][time=             1719508] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,                92771
[PERF ][time=             2295462] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,                56652
[PERF ][time=             2961840] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               534243
[PERF ][time=             5443099] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               592129
[PERF ][time=              228655] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,                54216
[PERF ][time=             4453785] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               313615
[PERF ][time=             1384306] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               501060
[PERF ][time=             1882376] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               456550
[PERF ][time=             2172840] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               254440
[PERF ][time=             1587779] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,              1103260
(Ubuntu)  manyang@open05 6453 $> cd ../6445
(Ubuntu)  manyang@open05 6445 $> ls | sort |xargs cat | grep BpBRight
[PERF ][time=             2661985] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               633704
[PERF ][time=              882163] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,                62734
[PERF ][time=             1498152] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               599656
[PERF ][time=             4123637] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               695197
[PERF ][time=             1715795] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,                92805
[PERF ][time=             2292792] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,                56652
[PERF ][time=             2967776] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               534057
[PERF ][time=             5435407] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               591821
[PERF ][time=              230140] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,                54165
[PERF ][time=             4441659] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               313613
[PERF ][time=             1382280] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               501060
[PERF ][time=             1880573] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               456503
[PERF ][time=             2172362] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,               254455
[PERF ][time=             1579787] TOP.SimTop.l_soc.core_with_l2.core.frontend.ftq: BpBRight,              1103272
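
The two listings above can be compared position by position. This is a minimal sketch, assuming both directories list their logs in the same order (both were produced with `ls | sort`). The counts are copied from the output above. `BpBRight` is a different counter from MPKI, so this does not reproduce the MPKI tally directly. The helper name `summarize` is hypothetical.

```python
# BpBRight counts copied from the two listings above, in printed order.
RUN_6453 = [633722, 62733, 599581, 695115, 92771, 56652, 534243,
            592129, 54216, 313615, 501060, 456550, 254440, 1103260]
RUN_6445 = [633704, 62734, 599656, 695197, 92805, 56652, 534057,
            591821, 54165, 313613, 501060, 456503, 254455, 1103272]


def summarize(a, b):
    """Count positions where a > b, a == b, a < b, and the net difference."""
    higher = sum(x > y for x, y in zip(a, b))
    equal = sum(x == y for x, y in zip(a, b))
    lower = sum(x < y for x, y in zip(a, b))
    net = sum(a) - sum(b)
    return higher, equal, lower, net


higher, equal, lower, net = summarize(RUN_6453, RUN_6445)
print(f"6453 higher: {higher}, equal: {equal}, 6453 lower: {lower}, net: {net:+d}")
```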

@eastonman
Member Author

Re-running CI with the LFSR changes reverted.

@XiangShanRobot

[Generated by IPC robot]
commit: 37153f8

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
37153f8 1.878 0.461 2.100 1.210 2.895 2.181 1.394 0.919 1.421 1.123 3.617 2.628 2.302 3.165

master branch:

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
a42a7ff 1.878 0.461 2.100 1.213 2.914 2.181 1.395 0.920 1.421 1.126 3.617 2.659 2.302 3.165
b15e4c0 1.878 0.461 2.100 1.213 2.895 2.181 1.395 0.919 1.421 1.128 3.617 2.659 2.302 3.165
3c5d56a 1.878 0.461 2.100 1.209 2.913 2.182 1.395 0.920 1.421 1.123 3.617 2.660 2.302 3.165
8abe181 1.878 0.461 2.100 1.213 2.895 2.179 1.395 0.919 1.421 1.125 3.617 2.666 2.302 3.165
13156de 1.878 0.461 2.100 1.213 2.895 2.182 1.397 0.919 1.421 1.126 3.617 2.660 2.302 3.165
1fcb3bc 1.878 0.461 2.100 1.211 2.895 2.181 1.395 0.920 1.421 1.123 3.617 2.659 2.302 3.165
f3c16e1 1.878 0.461 2.100 1.213 2.895 2.182 1.396 0.920 1.421 1.126 3.617 2.659 2.302 3.165
45f43e6 1.878 0.461 2.100 1.211 2.895 2.182 1.395 0.920 1.421 1.127 3.617 2.659 2.302 3.165
8fae59b 1.878 0.461 2.100 1.213 2.913 2.182 1.394 0.920 1.421 1.123 3.617 2.647 2.302 3.165
a61a35e 1.879 0.460 2.092 1.200 2.895 2.176 1.394 0.917 1.436 1.125 3.617 2.663 2.298 3.134

@XiangShanRobot

[Generated by IPC robot]
commit: 5e916e7

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
5e916e7 1.878 0.461 2.100 1.211 2.913 2.179 1.395 0.919 1.421 1.123 3.617 2.662 2.302 3.165

master branch:

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
a42a7ff 1.878 0.461 2.100 1.213 2.914 2.181 1.395 0.920 1.421 1.126 3.617 2.659 2.302 3.165
b15e4c0 1.878 0.461 2.100 1.213 2.895 2.181 1.395 0.919 1.421 1.128 3.617 2.659 2.302 3.165
3c5d56a 1.878 0.461 2.100 1.209 2.913 2.182 1.395 0.920 1.421 1.123 3.617 2.660 2.302 3.165
8abe181 1.878 0.461 2.100 1.213 2.895 2.179 1.395 0.919 1.421 1.125 3.617 2.666 2.302 3.165
13156de 1.878 0.461 2.100 1.213 2.895 2.182 1.397 0.919 1.421 1.126 3.617 2.660 2.302 3.165
1fcb3bc 1.878 0.461 2.100 1.211 2.895 2.181 1.395 0.920 1.421 1.123 3.617 2.659 2.302 3.165
f3c16e1 1.878 0.461 2.100 1.213 2.895 2.182 1.396 0.920 1.421 1.126 3.617 2.659 2.302 3.165
45f43e6 1.878 0.461 2.100 1.211 2.895 2.182 1.395 0.920 1.421 1.127 3.617 2.659 2.302 3.165
8fae59b 1.878 0.461 2.100 1.213 2.913 2.182 1.394 0.920 1.421 1.123 3.617 2.647 2.302 3.165
a61a35e 1.879 0.460 2.092 1.200 2.895 2.176 1.394 0.917 1.436 1.125 3.617 2.663 2.298 3.134

@eastonman added the "do not merge" label Mar 6, 2024
@eastonman
Member Author

I suggest documenting these clock-gating PRs with comments before merging:
#2733 #2745 #2734

@eastonman removed the "do not merge" label Apr 1, 2024
This commit ports the clock-gating optimizations from Nanhu where applicable;
most ungated registers are optimized.

Co-authored-by: Liang Sen <liangsen20z@ict.ac.cn>

Some control signals inside Tage use data signals from io;
this commit adds io control to these signals.
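
As a rough illustration of the idea behind these commits, here is a toy behavioral model in Python. It is not the Chisel code from this PR, and the class and function names are hypothetical. It contrasts a register that is only clocked when an enable (for example a valid signal) is high with one that is clocked every cycle and holds its value through a mux. Both end up with the same value, but the gated one sees far fewer clock events.

```python
class GatedRegister:
    """Behavioral model of a register whose update is gated by an enable."""

    def __init__(self, init=0):
        self.value = init
        self.clocked_cycles = 0  # cycles in which the flop actually updated

    def tick(self, next_value, enable):
        # With enable low, the flop sees no clock event and keeps its value.
        if enable:
            self.value = next_value
            self.clocked_cycles += 1


def simulate(stream):
    """Feed (valid, data) pairs into a gated and an ungated register."""
    gated, ungated = GatedRegister(), GatedRegister()
    for valid, data in stream:
        # Gated: the control (valid) signal decides whether the flop is clocked.
        gated.tick(data, enable=valid)
        # Ungated: clocked every cycle; a mux re-feeds the old value when idle.
        ungated.tick(data if valid else ungated.value, enable=True)
    return gated, ungated


stream = [(True, 5), (False, 9), (False, 1), (True, 7), (False, 3)]
gated, ungated = simulate(stream)
print(gated.value, ungated.value)                    # same architectural state
print(gated.clocked_cycles, ungated.clocked_cycles)  # different clock activity
```

The model only captures functional equivalence and update counts; real savings depend on how the synthesis flow maps the enables to clock-gating cells.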
@XiangShanRobot

[Generated by IPC robot]
commit: 7f85bd1

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
7f85bd1 1.707 0.449 2.112 1.153 1.643 1.183 2.331 0.908 1.398 0.832 2.349 2.136 1.783 2.911

master branch:

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
23761fd
0c00289 1.707 0.449 2.112 1.155 1.643 1.183 2.332 0.908 1.383 0.832 2.349 2.142 1.783 2.911
eef81af 1.862 0.460 2.087 1.204 2.903 2.177 2.332 0.929 1.427 1.138 3.610 2.663 2.297 3.183
875ae3b 1.862 0.460 2.087 1.204 2.903 2.177 2.332 0.928 1.427 1.138 3.610 2.659 2.297 3.183
f6916db 1.862 0.460 2.087 1.204 2.903 2.177 2.332 0.929 1.427 1.138 3.610 2.659 2.297 3.183
4d931b7 1.862 0.460 2.087 1.204 2.903 2.177 2.332 0.928 1.427 1.138 3.610 2.657 2.297 3.183
9afa8a4 1.862 0.460 2.087 1.204 2.903 2.177 2.332 0.928 1.427 1.138 3.610 2.657 2.297 3.183
ef6723f 1.862 0.460 2.087 1.204 2.903 2.177 2.331 0.929 1.427 1.138 3.610 2.648 2.297 3.183
58a9a40 1.862 0.460 2.087 1.204 2.903 2.177 2.332 0.929 1.427 1.138 3.610 2.657 2.297 3.183
8f62644 1.862 0.460 2.087 1.204 2.903 2.177 2.331 0.929 1.427 1.138 3.610 2.645 2.297 3.183
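
In this table the master rows fall into two groups (0c00289 versus the older commits), and 23761fd has no results. The sketch below is hedged and not part of the IPC robot: it parses a few rows copied from the table, skips incomplete ones, and picks the master commit whose IPC values are closest to 7f85bd1. The helper names and the mean-absolute-difference metric are assumptions chosen for illustration.

```python
# A subset of the table above, copied verbatim (23761fd has no data).
TABLE = """\
commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
23761fd
0c00289 1.707 0.449 2.112 1.155 1.643 1.183 2.332 0.908 1.383 0.832 2.349 2.142 1.783 2.911
eef81af 1.862 0.460 2.087 1.204 2.903 2.177 2.332 0.929 1.427 1.138 3.610 2.663 2.297 3.183
"""
PR_ROW = "7f85bd1 1.707 0.449 2.112 1.153 1.643 1.183 2.331 0.908 1.398 0.832 2.349 2.136 1.783 2.911"


def parse_rows(text):
    """Return (benchmark names, {commit: [ipc, ...]}), skipping incomplete rows."""
    header, *rows = text.strip().splitlines()
    names = header.split()[1:]
    parsed = {}
    for row in rows:
        commit, *vals = row.split()
        if len(vals) != len(names):
            continue  # row without a full set of results
        parsed[commit] = [float(v) for v in vals]
    return names, parsed


def mean_abs_diff(a, b):
    """Average absolute IPC difference between two rows."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


names, master = parse_rows(TABLE)
pr = [float(v) for v in PR_ROW.split()[1:]]
closest = min(master, key=lambda c: mean_abs_diff(pr, master[c]))
print(f"closest master row: {closest}")
```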

@eastonman
Member Author

@Tang-Haojin I believe this can be merged?

@Tang-Haojin
Member

Need squash or rebase?

@eastonman
Member Author

@Tang-Haojin I prefer squash; it's not a big PR.

@Tang-Haojin merged commit 7af6acb into master Apr 17, 2024
4 checks passed
@Tang-Haojin deleted the bpu-clock-gating branch April 17, 2024 03:43