v1: Initial version
v2: implement review comments
v3: adapt to new API
The current sleep timer implementation offers two variants: either wait for
the specified time with a condition variable (host setting), or combine that
wait with a thread-yielding busy loop afterwards (usleep setting). The second
one is very precise, but it burns CPU cycles on every wait call below 50us.
Games like Bomberman Ultra spam 30us waits, so the emulator hogs low-power
CPUs. Switching to the host setting reduces CPU consumption but adds a ~50us
penalty to each wait call, so a 30us wait ends up taking more than twice as
long as requested.
This fix tries to improve the system timer on Linux by using Linux native
timers for short wait calls below 1ms. This has two effects:
- Host wait setting has much less wait overhead
- usleep wait setting produces lower CPU overhead
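As an illustration of what a native-timer path for sub-millisecond waits can look like, here is a minimal sketch. It assumes timerfd as the Linux timer API, which the text above does not name (clock_nanosleep would be another candidate), and the helper names short_sleep_timerfd and wait_usec are hypothetical, not taken from the patch.

```cpp
#include <sys/timerfd.h>
#include <unistd.h>

#include <cerrno>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper, not the patch's code: block for `usec` microseconds
// on a per-thread timerfd. Returns false if the timer could not be used.
bool short_sleep_timerfd(std::uint64_t usec)
{
    if (usec == 0)
        return true; // a zero it_value would disarm the timer, so return early

    // Created once per thread and reused; this sketch never closes it.
    thread_local int fd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);
    if (fd < 0)
        return false;

    itimerspec spec{}; // it_interval stays zero: one-shot timer
    spec.it_value.tv_sec = static_cast<time_t>(usec / 1000000);
    spec.it_value.tv_nsec = static_cast<long>((usec % 1000000) * 1000);
    if (timerfd_settime(fd, 0, &spec, nullptr) != 0)
        return false;

    // read() blocks until the timer expires and yields the expiration count.
    std::uint64_t expirations = 0;
    ssize_t n;
    do
    {
        n = read(fd, &expirations, sizeof(expirations));
    } while (n < 0 && errno == EINTR);
    return n == static_cast<ssize_t>(sizeof(expirations));
}

// Hypothetical dispatcher: native timer below 1ms, generic sleep otherwise
// (std::this_thread::sleep_for stands in for the existing wait path).
void wait_usec(std::uint64_t usec)
{
    if (usec < 1000 && short_sleep_timerfd(usec))
        return;
    std::this_thread::sleep_for(std::chrono::microseconds(usec));
}
```

The timer is armed relative to CLOCK_MONOTONIC, so it should never return early; how far it overshoots depends on the kernel's timer slack and scheduling, which is what the measurements below quantify.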
Some numbers for the host timer setting from my tests on a Pentium G5600
with UHD 630, waiting in the Bomberman welcome screen. I shortened/lengthened
the game timer inside the emulator to get a better picture for different wait
lengths. As you can see, the current implementation produces an overhead of
50us or more for most calls, while the new implementation mostly stays below
10us. The us(er), sy(stem) and id(le) columns were taken from vmstat during
the tests.
sleeps of 70usec
Run              Calls  >=120us   <120us    <95us    <80us    <73us  us  sy  id
Master run 1:  1000000   708599   144933   114607    27954     3906  44  12  15
Master run 2:  1000000   707853   145802   114613    27757     3975  45  12  43
Patch run 1:   1000000    24478    37779   122771   679292   135679  46  13  41
Patch run 2:   1000000    27544    38647   120150   676306   137353  45  13  42
sleeps of 60usec
Run              Calls  >=110us   <110us    <85us    <70us    <63us  us  sy  id
Master run 1:  1000000   695187   167665   107111    26767     3269  42  11  47
Master run 2:  1000000   698397   166151   106322    25889     3241  42  11  46
Patch run 1:   1000000    23266    36454   131397   651232   157650  44  12  44
Patch run 2:   1000000    27780    41361   141313   636585   152961  45  12  42
sleeps of 50usec
Run              Calls  >=100us   <100us    <75us    <60us    <53us  us  sy  id
Master run 1:  1000000   690729   183766    97207    25160     3137  43  12  46
Master run 2:  1000000   689518   184570    97716    25131     3065  42  11  47
Patch run 1:   1000000    21068    34504   124814   646399   173214  45  13  42
Patch run 2:   1000000    22531    36852   130585   638397   171635  44  12  44
sleeps of 40usec
Run              Calls   >=90us    <90us    <65us    <50us    <43us  us  sy  id
Master run 1:  1000000   688084   176572   111680    20357     3306  45  12  44
Master run 2:  1000000   687553   177216   111599    20409     3223  46  12  42
Patch run 1:   1000000    18164    31248   113778   643851   192958  44  12  44
Patch run 2:   1000000    20985    34841   120508   633031   190635  45  12  43
sleeps of 30usec
Run              Calls   >=80us    <80us    <55us    <40us    <33us  us  sy  id
Master run 1:  1000000   721705   205084    60793    12060      357  44  12  45
Master run 2:  1000000   720323   205960    61524    11884      309  43  11  46
Patch run 1:   1000000    15139    16863   101604   629094   227299  44  12  44
Patch run 2:   1000000    18560    30207   110159   617093   223981  45  12  43
sleeps of 20usec
Run              Calls   >=70us    <70us    <45us    <30us    <23us  us  sy  id
Master run 1:  1000000   813648   144746    36458     5111       36  43  12  45
Master run 2:  1000000   813322   144917    36618     5097       46  45  12  43
Patch run 1:   1000000    14073    23076    83921   635412   243517  45  13  42
Patch run 2:   1000000    13769    23460    86245   632826   243700  44  13  43
sleeps of 10usec
Run              Calls   >=60us    <60us    <35us    <20us    <13us  us  sy  id
Master run 1:  1000000   864216   101101    29002     5651       29  43  12  45
Master run 2:  1000000   864896   100595    28941     5550       18  42  11  47
Patch run 1:   1000000     7613    13301    52335   640861   285889  46  13  41
Patch run 2:   1000000     7223    13280    52123   644643   282731  47  13  40
Comparison between the host and usleep settings at the game's default 30us
waits:
Run                 fps  us  sy  id
Master run host:     53  43  11  46
Patch run host:      52  44  12  44
Master run usleep:   49  51  18  31
Patch run usleep:    51  48  15  37