Fix missing fmod variants #1643

lgritz · 2023-02-03T21:24:27Z

Fixes #1526 (maybe?)

Signed-off-by: Larry Gritz lg@larrygritz.com

lgritz · 2023-02-03T21:25:36Z

@MasterZap Are you able to apply this patch on your end and see if it clears up the problem? I'm still having trouble directly reproducing, I think it may only be a problem on Windows.

AlexMWells · 2023-02-03T21:40:16Z

If if that fixes it, it doesn't answer the question of how/why the "float" is not being properly promoted?

lgritz · 2023-02-03T22:22:27Z

why should it?

lgritz · 2023-02-03T22:25:08Z

stdosl.h:118 is why

lgritz · 2023-02-03T22:25:53Z

I should fix the docs to make it more clear that fmod (like pow!) allows the second argument to be float even if the first arg is a triple.

AlexMWells · 2023-02-03T22:41:30Z

That can be traced back to
ca0c6df
Fix slightly broken overload resolution
which added handling of float mixing with Vec3 based types. But no documentation change.

AlexMWells · 2023-02-03T23:28:41Z

Can we get some testsuite entries added to test these combinations and try them in batched as well?

I see a good example of testing the combinations for pow in
testsuite/transcendental-reg

which exercises
"../common/shaders/test_binary_mixed_triple_float_xmacro.h"

whereas the fmod tests in
testsuite/arithmetic-reg
only exercises
#include "../common/shaders/test_binary_xmacro.h"

which does not include the Vec3%float variants.

If you update those tests to be on par with the pow tests, I think you will have your reproducer.
Then will be able to add the necessary implementations to support it.

In particular src/liboslexec/wide/wide_opalgebraic.cpp will need to have more variations added for fmod

MasterZap · 2023-02-04T19:04:43Z

@lgritz I will try to test this on Monday.

lgritz · 2023-02-04T20:06:12Z

As Alex points out, this is not a complete solution. But if it changes the symptoms, it definitely means I'm on the right track.

MasterZap · 2023-02-07T20:12:38Z

Tested it - seems to work!!

Thanks so much for this ♥️

/Z

lgritz · 2023-02-07T20:36:47Z

Cool, you can try that patch on your end, I guess it can't make it any worse. But as Alex said, there's more work to make it fully complete for the batch case.

And also, I think Chris has a point. Looking at the implementation, I think there is no real runtime computation to be saved by having the (triple,float) special cases. I believe just allowing oslc to promote the (triple,float) case to (triple,triple) has no real penalty and is a simple solution, the only being that previously-compiled shaders would need to be recompiled. But as far as I know, you're the only person to ever report encountering the bug in practice, so maybe that's adequate. It's something I'm mulling over today.

lgritz · 2023-02-08T22:27:52Z

@MasterZap Would you mind doing another experiment for me?

In llvm_gen.cpp, circa line 756, you'll see

#ifdef OSL_LLVM_NO_BITCODE
    // On Windows 32 bit this calls an unknown instruction, probably need to
    // link with LLVM compiler-rt to fix, for now just fall back to op
    if (is_float)
        return llvm_gen_generic(rop, opnum);
#endif

I'm kind of suspecting that these lines, which are 10 years old, may be totally irrelevant and reflects something that has long since been fixed on the LLVM side. Some things can easily be simplified if this is no longer needed.

Can you please simply comment out just these lines entirely? I don't have a good way of testing on Windows, but if things seem fine for you with this clause gone, that will change how I want to approach the full solution to this.

lgritz · 2023-02-08T23:41:32Z

Updated with a full working solution (including batch) based on Alex's suggestion, including full regression tests. Passes now.

I'm awaiting Zap's test that I suggested above, and pending that I'll probably have one more cleanup/simplification round.

MasterZap · 2023-02-09T12:14:03Z

Those lines, are that specifically on windows 32 BIT as you wrote? We are all 64 bit, and really have no way to test 32 bit. And if it really only is for 32 bit then... you can probably ignore it :)

lgritz · 2023-02-09T16:31:08Z

Under the paranoid assumption that all comments are not necessarily 100% correct, I guess I wanted to verify empirically that this was not a concern with today's LLVM on Windows in practice before I ripped it out completely.

MasterZap · 2023-02-09T17:17:51Z

I have not had time to test deleting this code yet, hope to soon.

Apparently some varieties of fmod(triple,float) -- which were not documented in the spec but were intended to work and did allow that combination in oslc -- were not fully implemented in the runtime. * Add all these tests to the arithmetic-reg test, to fully test all possible argument type and unif/vary/const combinations once and for all for fmod. Also add some more rudimentary testing to miscmath. * Implement the missing bits and/or avoid the code paths that rely on the unimplemented parts. The implementation of fmod for funny -- both for batch and non-batch, it's got one code path that just directly generates code and doesn't need any functional implementation in llvm_ops at all, and a second code path that uses the functional implementations to work around (different, in each case!) what appears to be bugs or limitations in long-ago LLVM versions we don't need any more. I'm trying to assess whether those are still needed at all in the oldest versions of LLVM we are using, and if not, I may amend even more simplification to this whole thing. Signed-off-by: Larry Gritz <lg@larrygritz.com>

lgritz · 2023-02-27T22:49:23Z

In the interest of having a working solution in time for the 1.12 patch release this week, I'm inclined to merge and backport this now, and then later I can follow up with a separate PR that removes the parts that we think are only needed for the old-LLVM versions we no longer support (perhaps putting it only in main and not backporting, for additional safety).

AlexMWells · 2023-02-28T17:54:04Z

src/liboslexec/batched_llvm_gen.cpp

@@ -1677,6 +1677,7 @@ LLVMGEN(llvm_gen_modulus)

    int num_components = type.aggregate;

+#if 0


Suggest just deleting the #if 0 -> #endif
Maybe comment
// Handle all combinations of constant, uniform, varying parameters here in code gen.

Perhaps similar approach would be good for the scalar version vs. the osl_fmod_vvf, osl_fmod_dvdvdf, etc. variations.

Apparently some varieties of fmod(triple,float) -- which were not documented in the spec but were intended to work and did allow that combination in oslc -- were not fully implemented in the runtime. * Add all these tests to the arithmetic-reg test, to fully test all possible argument type and unif/vary/const combinations once and for all for fmod. Also add some more rudimentary testing to miscmath. * Implement the missing bits and/or avoid the code paths that rely on the unimplemented parts. The implementation of fmod for funny -- both for batch and non-batch, it's got one code path that just directly generates code and doesn't need any functional implementation in llvm_ops at all, and a second code path that uses the functional implementations to work around (different, in each case!) what appears to be bugs or limitations in long-ago LLVM versions we don't need any more. I'm trying to assess whether those are still needed at all in the oldest versions of LLVM we are using, and if not, I may amend even more simplification to this whole thing. Signed-off-by: Larry Gritz <lg@larrygritz.com>

lgritz force-pushed the lg-fmod branch from b3b3175 to c7d64e4 Compare February 8, 2023 23:39

lgritz force-pushed the lg-fmod branch from c7d64e4 to 56245e6 Compare February 27, 2023 22:47

lgritz merged commit fbe141b into AcademySoftwareFoundation:main Feb 28, 2023

AlexMWells reviewed Feb 28, 2023

View reviewed changes

lgritz deleted the lg-fmod branch March 16, 2023 05:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix missing fmod variants #1643

Fix missing fmod variants #1643

lgritz commented Feb 3, 2023

lgritz commented Feb 3, 2023

AlexMWells commented Feb 3, 2023

lgritz commented Feb 3, 2023

lgritz commented Feb 3, 2023

lgritz commented Feb 3, 2023

AlexMWells commented Feb 3, 2023

AlexMWells commented Feb 3, 2023 •

edited

MasterZap commented Feb 4, 2023

lgritz commented Feb 4, 2023

MasterZap commented Feb 7, 2023

lgritz commented Feb 7, 2023

lgritz commented Feb 8, 2023

lgritz commented Feb 8, 2023

MasterZap commented Feb 9, 2023

lgritz commented Feb 9, 2023

MasterZap commented Feb 9, 2023

lgritz commented Feb 27, 2023

AlexMWells Feb 28, 2023

		@@ -1677,6 +1677,7 @@ LLVMGEN(llvm_gen_modulus)

		int num_components = type.aggregate;

		#if 0

Fix missing fmod variants #1643

Fix missing fmod variants #1643

Conversation

lgritz commented Feb 3, 2023

lgritz commented Feb 3, 2023

AlexMWells commented Feb 3, 2023

lgritz commented Feb 3, 2023

lgritz commented Feb 3, 2023

lgritz commented Feb 3, 2023

AlexMWells commented Feb 3, 2023

AlexMWells commented Feb 3, 2023 • edited

MasterZap commented Feb 4, 2023

lgritz commented Feb 4, 2023

MasterZap commented Feb 7, 2023

lgritz commented Feb 7, 2023

lgritz commented Feb 8, 2023

lgritz commented Feb 8, 2023

MasterZap commented Feb 9, 2023

lgritz commented Feb 9, 2023

MasterZap commented Feb 9, 2023

lgritz commented Feb 27, 2023

AlexMWells Feb 28, 2023

Choose a reason for hiding this comment

AlexMWells commented Feb 3, 2023 •

edited