Optimisations. #4

Nebuleon · 2014-01-18T07:31:16Z

I attach some commits that optimise some rather common cases in interpreted opcodes:

- floating-point operations that set the rounding mode when that is not needed;
- inter-block branches (J*_OUT and B*_OUT) targeting the physical addresses (0x80000000 inclusive to 0xC0000000 exclusive).
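For the second item, the address range can be tested with a single mask and compare. This is an illustration only: `in_direct_mapped_range` and `to_physical` are hypothetical names rather than functions from the commits, and the patch itself may perform the check differently.

```c
#include <stdint.h>

/* Hypothetical helper: true when addr lies in 0x80000000 (inclusive)
 * to 0xC0000000 (exclusive). Every address in that range has the top
 * two bits set to binary 10, so one mask and one compare is enough. */
static int in_direct_mapped_range(uint32_t addr)
{
    return (addr & UINT32_C(0xC0000000)) == UINT32_C(0x80000000);
}

/* Hypothetical helper: on MIPS, addresses in that range (KSEG0/KSEG1)
 * map to physical memory by clearing the top three bits. */
static uint32_t to_physical(uint32_t addr)
{
    return addr & UINT32_C(0x1FFFFFFF);
}
```

A branch target that passes the first check needs no further address translation beyond the mask in the second helper.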

Setting the rounding mode on some host architectures (including x86, according to <http://www.mega-nerd.com/FPcast/>) flushes the floating-point unit's pipeline, which hurts performance. Skipping the rounding-mode change before exact operations recovers some of that performance. Affected opcodes are CVT.W.S, CVT.W.D, CVT.L.S, CVT.L.D, ABS.S, MOV.S, NEG.S, ABS.D, MOV.D and NEG.D.

The rounding mode is completely unneeded when converting from W (32-bit integer) or S (32-bit floating-point) to D (64-bit floating-point), and MIPS processors do not actually use it in that case. Quoting from the MIPS Programmer's Manual, volume 2, for CVT.D.fmt: "The value in FPR fs, in format fmt, is converted to a value in double floating point format and rounded according to the current rounding mode in FCSR. The result is placed in FPR fd. If fmt is S or W, then the operation is always exact."

richard42 · 2014-01-21T06:11:25Z

Hi Nebuleon, thanks for this patch. Having reviewed the changes, I think that the following instructions still need to keep the call to the function that sets the rounding mode:

cvt_w_s
cvt_w_d
cvt_l_s
cvt_l_d

These instructions perform a conversion from floating-point to integer, which loses precision. As such, I think that their results will be affected by the current rounding mode. Aside from this, the other changes look good.

Nebuleon · 2014-01-21T09:23:27Z

The cvt_[int]_[float] opcodes call into one of sixteen functions depending on the value of the N64 FCR31. Each of these in turn calls one of eight C standard library functions (truncf, roundf, floorf, ceilf, trunc, round, floor, and ceil), which apply the rounding implied by their names on their own.

Because none of the sixteen functions needs the rounding mode to be set beforehand, the issue is not that the rounding mode is unneeded; it is that the rounding mode is needlessly set twice. I am sorry for the confusion and for lumping these changes into the same commit.

Reference: mupen64plus-core/src/r4300/fpu.h, line 162 in f2247c5:

M64P_FPU_INLINE void cvt_w_s(float *source,int *dest)


richard42 · 2014-01-23T05:52:58Z

I see, thanks for the explanation.


Nebuleon added 3 commits January 18, 2014 07:09

Remove a native test and branch from inter-block jumps.

1ef073e

richard42 added a commit that referenced this pull request Jan 23, 2014

Merge pull request #4 from Nebuleon/master

0356e78

Optimisations.

richard42 merged commit 0356e78 into mupen64plus:master Jan 23, 2014

