Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modified VU0MixVec. Corrected clobbers. #272

Merged
merged 1 commit into from
May 24, 2020

Conversation

astrelsky
Copy link
Contributor

@astrelsky astrelsky commented May 23, 2020

Pull Request checklist

Note: these are not necessarily requirements

  • I reformatted the code with clang-format
  • I checked to make sure my submission worked
  • I am the author of submission or have permission from the original author
  • Requires update of the PS2SDK
  • Requires update of the gsKit

Pull Request description

For readability the operands have also been changed to symbolic names. These changes were made primarily to enable functionality on both the old and new toolchains while using the same code.

Prior to these changes the following was produced by the old toolchain:

VU0MixVec:
    addiu      sp,sp,-0x10
    swc1       mix,0x0(sp)
    lqc2       y1,a->x(a0)
    lqc2       y2,b->x(a1)
    lw         v0,0x0(sp)
    qmtc2      v0,vf3
    vaddw.x    vf5,vf0,vf0w
    vsub.x     vf4,vf5,vf3
    vmulax.x   ACC,y1,vf3x
    vmaddx.x   y1,y2,vf4x
    sqc2       y1,res->x(a2)
    jr         ra
    _addiu     sp,sp,0x10

With these changes the following is produced by the old toolchain:

VU0MixVec:
    mfc1       v0,mix
    lqc2       vf1,a->x(a0)
    lqc2       vf2,b->x(a1)
    qmtc2      v0,vf3
    vaddw.x    vf5,vf0,vf0w
    vsub.x     vf4,vf5,vf3
    vmulax.x   ACC,vf1,vf3x
    vmaddx.x   vf1,vf2,vf4x
    jr         ra
    _sqc2      vf1,res->x(a2)

With these changes the following is produced by the new toolchain:

VU0MixVec:
    mfc1       v0,mix
    lqc2       vf1,a->x(a0)
    lqc2       vf2,b->x(a1)
    qmtc2      v0,vf3
    vaddw.x    vf5,vf0,vf0w
    vsub.x     vf4,vf5,vf3
    vmulax.x   ACC,vf1,vf3x
    vmaddx.x   vf1,vf2,vf4x
    sqc2       vf1,res->x(a3)
    jr         ra
    nop

@rickgaiser
Copy link
Member

Nice work @astrelsky ! I've tested with gcc3.2.3 and gcc11 toolchains:

  • gcc 3.2.3 works the same, so good!
  • gcc 11 works a lot better that before, but still not as it should be:
    Schermafdruk van 2020-05-23 22-35-40

The perlin noise seems to be mostly working, but there's still a '+' sign in the perlin noise that fades in and out. The c code for the same function does not show this artifact:

res->x = mix(a->x, b->x, 1.0 - t);
res->y = mix(a->y, b->y, 1.0 - t);
res->z = mix(a->z, b->z, 1.0 - t);
res->w = mix(a->w, b->w, 1.0 - t);

@astrelsky
Copy link
Contributor Author

Nice work @astrelsky ! I've tested with gcc3.2.3 and gcc11 toolchains:

  • gcc 3.2.3 works the same, so good!
  • gcc 11 works a lot better that before, but still not as it should be:
    Schermafdruk van 2020-05-23 22-35-40

The perlin noise seems to be mostly working, but there's still a '+' sign in the perlin noise that fades in and out. The c code for the same function does not show this artifact:

res->x = mix(a->x, b->x, 1.0 - t);
res->y = mix(a->y, b->y, 1.0 - t);
res->z = mix(a->z, b->z, 1.0 - t);
res->w = mix(a->w, b->w, 1.0 - t);

Hmm that is strange. I was testing with the function in a file by itself so I didn't notice there was still an issue. I'm working directly with opl now so I can check through calls to the function and ensure everything is flowing correctly.

@astrelsky
Copy link
Contributor Author

astrelsky commented May 23, 2020

@rickgaiser It should be good now. It turns out that gcc was changing instructions when inlining it. Adding the noinline attribute prevents this. Since inlining never worked on the old toolchain it won't make a difference.

Also, for the new toolchain, make sure mix is changed to t. It took me forever to notice but gcc will silently accept it and link to the mix function address instead of the t parameter.

It is also possible that inlining was the problem the whole time but I haven't bothered to check.

@rickgaiser
Copy link
Member

It turns out that gcc was changing instructions when inlining it. Adding the noinline attribute prevents this.

Inlining should be possible. So it seems like there is another bug. I tested on gcc11, and your solution does work, but it's a workaround for the real problem right? Do you know what other instructions it's wrongfully mixing with, and why?

Also, for the new toolchain, make sure mix is changed to t. It took me forever to notice but gcc will silently accept it and link to the mix function address instead of the t parameter.

I've noticed, the hard way ;-).

For readability the operands have also been changed to symbolic
names. The noinline attribute is required for functionality on
the new toolchain. It should have virtually no effect on the old
toolchain.
@astrelsky
Copy link
Contributor Author

It turns out that gcc was changing instructions when inlining it. Adding the noinline attribute prevents this.

Inlining should be possible. So it seems like there is another bug. I tested on gcc11, and your solution does work, but it's a workaround for the real problem right? Do you know what other instructions it's wrongfully mixing with, and why?

Also, for the new toolchain, make sure mix is changed to t. It took me forever to notice but gcc will silently accept it and link to the mix function address instead of the t parameter.

I've noticed, the hard way ;-).

It appears I was wrong, the old toolchain will inline functions. Just not as aggressively as newer versions of gcc. I previously had a situation where I used inline __attribute__((__always_inline__)) and it still was not inlined. That is why I was under the impression that it would never inline.

As for the fix here I was missing the "memory" clobber. Without it the compiler wasn't correctly preserving the value of one of the input pointers.

@rickgaiser
Copy link
Member

I've tested using both the old and new toolchains. It's a real fix for the real bug, great! Merging.

@rickgaiser rickgaiser merged commit e2fb60c into ps2homebrew:master May 24, 2020
AKuHAK pushed a commit that referenced this pull request Sep 30, 2021
Modified VU0MixVec. Corrected clobbers.
citronalco pushed a commit to citronalco/OPL-Daily-Builds that referenced this pull request Sep 10, 2023
Modified VU0MixVec. Corrected clobbers.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants