Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix missing fmod variants #1643

Merged
merged 1 commit into from
Feb 28, 2023
Merged

Conversation

lgritz
Copy link
Collaborator

@lgritz lgritz commented Feb 3, 2023

Fixes #1526 (maybe?)

Signed-off-by: Larry Gritz lg@larrygritz.com

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 3, 2023

@MasterZap Are you able to apply this patch on your end and see if it clears up the problem? I'm still having trouble directly reproducing, I think it may only be a problem on Windows.

@AlexMWells
Copy link
Contributor

If if that fixes it, it doesn't answer the question of how/why the "float" is not being properly promoted?

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 3, 2023

why should it?

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 3, 2023

stdosl.h:118 is why

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 3, 2023

I should fix the docs to make it more clear that fmod (like pow!) allows the second argument to be float even if the first arg is a triple.

@AlexMWells
Copy link
Contributor

That can be traced back to
ca0c6df
Fix slightly broken overload resolution
which added handling of float mixing with Vec3 based types. But no documentation change.

@AlexMWells
Copy link
Contributor

AlexMWells commented Feb 3, 2023

Can we get some testsuite entries added to test these combinations and try them in batched as well?

I see a good example of testing the combinations for pow in
testsuite/transcendental-reg

which exercises
"../common/shaders/test_binary_mixed_triple_float_xmacro.h"

whereas the fmod tests in
testsuite/arithmetic-reg
only exercises
#include "../common/shaders/test_binary_xmacro.h"

which does not include the Vec3%float variants.

If you update those tests to be on par with the pow tests, I think you will have your reproducer.
Then will be able to add the necessary implementations to support it.

In particular src/liboslexec/wide/wide_opalgebraic.cpp will need to have more variations added for fmod

@MasterZap
Copy link

@lgritz I will try to test this on Monday.

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 4, 2023

As Alex points out, this is not a complete solution. But if it changes the symptoms, it definitely means I'm on the right track.

@MasterZap
Copy link

Tested it - seems to work!!

Thanks so much for this ♥️

/Z

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 7, 2023

Cool, you can try that patch on your end, I guess it can't make it any worse. But as Alex said, there's more work to make it fully complete for the batch case.

And also, I think Chris has a point. Looking at the implementation, I think there is no real runtime computation to be saved by having the (triple,float) special cases. I believe just allowing oslc to promote the (triple,float) case to (triple,triple) has no real penalty and is a simple solution, the only being that previously-compiled shaders would need to be recompiled. But as far as I know, you're the only person to ever report encountering the bug in practice, so maybe that's adequate. It's something I'm mulling over today.

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 8, 2023

@MasterZap Would you mind doing another experiment for me?

In llvm_gen.cpp, circa line 756, you'll see

#ifdef OSL_LLVM_NO_BITCODE
    // On Windows 32 bit this calls an unknown instruction, probably need to
    // link with LLVM compiler-rt to fix, for now just fall back to op
    if (is_float)
        return llvm_gen_generic(rop, opnum);
#endif

I'm kind of suspecting that these lines, which are 10 years old, may be totally irrelevant and reflects something that has long since been fixed on the LLVM side. Some things can easily be simplified if this is no longer needed.

Can you please simply comment out just these lines entirely? I don't have a good way of testing on Windows, but if things seem fine for you with this clause gone, that will change how I want to approach the full solution to this.

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 8, 2023

Updated with a full working solution (including batch) based on Alex's suggestion, including full regression tests. Passes now.

I'm awaiting Zap's test that I suggested above, and pending that I'll probably have one more cleanup/simplification round.

@MasterZap
Copy link

Those lines, are that specifically on windows 32 BIT as you wrote? We are all 64 bit, and really have no way to test 32 bit. And if it really only is for 32 bit then... you can probably ignore it :)

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 9, 2023

Under the paranoid assumption that all comments are not necessarily 100% correct, I guess I wanted to verify empirically that this was not a concern with today's LLVM on Windows in practice before I ripped it out completely.

@MasterZap
Copy link

I have not had time to test deleting this code yet, hope to soon.

Apparently some varieties of fmod(triple,float) -- which were not
documented in the spec but were intended to work and did allow that
combination in oslc -- were not fully implemented in the runtime.

* Add all these tests to the arithmetic-reg test, to fully test all
  possible argument type and unif/vary/const combinations once and for
  all for fmod. Also add some more rudimentary testing to miscmath.

* Implement the missing bits and/or avoid the code paths that rely
  on the unimplemented parts.

The implementation of fmod for funny -- both for batch and non-batch,
it's got one code path that just directly generates code and doesn't
need any functional implementation in llvm_ops at all, and a second
code path that uses the functional implementations to work around
(different, in each case!) what appears to be bugs or limitations in
long-ago LLVM versions we don't need any more.  I'm trying to assess
whether those are still needed at all in the oldest versions of LLVM
we are using, and if not, I may amend even more simplification to this
whole thing.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
@lgritz
Copy link
Collaborator Author

lgritz commented Feb 27, 2023

In the interest of having a working solution in time for the 1.12 patch release this week, I'm inclined to merge and backport this now, and then later I can follow up with a separate PR that removes the parts that we think are only needed for the old-LLVM versions we no longer support (perhaps putting it only in main and not backporting, for additional safety).

@lgritz lgritz merged commit fbe141b into AcademySoftwareFoundation:main Feb 28, 2023
@@ -1677,6 +1677,7 @@ LLVMGEN(llvm_gen_modulus)

int num_components = type.aggregate;

#if 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest just deleting the #if 0 -> #endif
Maybe comment
// Handle all combinations of constant, uniform, varying parameters here in code gen.

Perhaps similar approach would be good for the scalar version vs. the osl_fmod_vvf, osl_fmod_dvdvdf, etc. variations.

lgritz added a commit to lgritz/OpenShadingLanguage that referenced this pull request Mar 1, 2023
Apparently some varieties of fmod(triple,float) -- which were not
documented in the spec but were intended to work and did allow that
combination in oslc -- were not fully implemented in the runtime.

* Add all these tests to the arithmetic-reg test, to fully test all
  possible argument type and unif/vary/const combinations once and for
  all for fmod. Also add some more rudimentary testing to miscmath.

* Implement the missing bits and/or avoid the code paths that rely
  on the unimplemented parts.

The implementation of fmod for funny -- both for batch and non-batch,
it's got one code path that just directly generates code and doesn't
need any functional implementation in llvm_ops at all, and a second
code path that uses the functional implementations to work around
(different, in each case!) what appears to be bugs or limitations in
long-ago LLVM versions we don't need any more.  I'm trying to assess
whether those are still needed at all in the oldest versions of LLVM
we are using, and if not, I may amend even more simplification to this
whole thing.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
@lgritz lgritz deleted the lg-fmod branch March 16, 2023 05:59
lgritz added a commit to lgritz/OpenShadingLanguage that referenced this pull request Mar 16, 2023
Apparently some varieties of fmod(triple,float) -- which were not
documented in the spec but were intended to work and did allow that
combination in oslc -- were not fully implemented in the runtime.

* Add all these tests to the arithmetic-reg test, to fully test all
  possible argument type and unif/vary/const combinations once and for
  all for fmod. Also add some more rudimentary testing to miscmath.

* Implement the missing bits and/or avoid the code paths that rely
  on the unimplemented parts.

The implementation of fmod for funny -- both for batch and non-batch,
it's got one code path that just directly generates code and doesn't
need any functional implementation in llvm_ops at all, and a second
code path that uses the functional implementations to work around
(different, in each case!) what appears to be bugs or limitations in
long-ago LLVM versions we don't need any more.  I'm trying to assess
whether those are still needed at all in the oldest versions of LLVM
we are using, and if not, I may amend even more simplification to this
whole thing.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Executing shader using fmod() sometimes crashes on windows
3 participants