Conversation

modiking
Contributor

@modiking modiking commented Oct 3, 2025

NVCC inlines more aggressively than Clang/GCC. As a result, the exported functions in extrema.cpp and findloc.cpp become extremely large from function specializations, which leads to compilation timeouts. Marking the 2 functions in this change as noinline for NVCC alleviates the problem because it removes the worst of the cross-matrix argument specializations.
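
A minimal sketch of the pattern described above, not the actual flang-rt code. The macro `EXAMPLE_DEVICE_NOINLINE` and the functions `ExtremumIndex` and `MaxLoc` are hypothetical names invented for illustration. The assumption is that the attribute only needs to take effect when compiling with NVCC (detected through `__CUDACC__`), so other compilers keep their normal inlining behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro, for illustration only: expands to CUDA's __noinline__
// qualifier under NVCC and to nothing for other compilers.
#if defined(__CUDACC__)
#define EXAMPLE_DEVICE_NOINLINE __noinline__
#else
#define EXAMPLE_DEVICE_NOINLINE
#endif

// Worker template specialized on element type and search direction.
// Keeping it out of line means each specialization is emitted once,
// instead of being expanded into every call site of the entry point.
template <typename T, bool IsMax>
EXAMPLE_DEVICE_NOINLINE std::size_t ExtremumIndex(const T *data,
                                                  std::size_t n) {
  assert(n > 0 && "expects a non-empty array");
  std::size_t best = 0;
  for (std::size_t i = 1; i < n; ++i) {
    if (IsMax ? data[i] > data[best] : data[i] < data[best]) {
      best = i;
    }
  }
  return best;
}

// Hypothetical runtime type tag used by the entry point below.
enum class Kind { Int, Double };

// Exported-style entry point: dispatches on a runtime type tag to the
// matching specialization. With aggressive inlining, every branch would
// otherwise be expanded into this one function body.
std::size_t MaxLoc(Kind kind, const void *data, std::size_t n) {
  switch (kind) {
  case Kind::Int:
    return ExtremumIndex<int, true>(static_cast<const int *>(data), n);
  case Kind::Double:
    return ExtremumIndex<double, true>(static_cast<const double *>(data), n);
  }
  return 0;
}
```

With only two element types the effect is negligible. The real runtime dispatches over many type and kind combinations, which is where keeping the workers out of line would be expected to limit the size of each exported function.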

Also remove the workaround from #156542 that excluded findloc.cpp from the CUDA flang-rt build.

Testing:
ninja flang-rt builds in ~30 minutes; these 2 files build in ~3 minutes.

@modiking modiking requested a review from clementval October 3, 2025 01:35
@clementval clementval requested a review from klausler October 3, 2025 01:37
@clementval
Contributor

clementval commented Oct 3, 2025

Thanks for the workaround @modiking. I added @klausler who is the main runtime developer as reviewer. Give him some time to chime in if he has anything to say.

github-actions bot commented Oct 3, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@clementval clementval requested a review from vzakhari October 3, 2025 01:41
@vzakhari
Contributor

vzakhari commented Oct 3, 2025

Thanks for the changes! Please use the RT_DEVICE_NOINLINE macro instead. It is defined in flang/include/flang/Common/api-attrs.h.

Contributor

@clementval clementval left a comment

LGTM. Thanks

Contributor

@klausler klausler left a comment

Please get Slava's approval before merging.

Contributor

@vzakhari vzakhari left a comment

Thank you!

@modiking
Contributor Author

modiking commented Oct 3, 2025

Appreciate the quick reviews!

@modiking modiking merged commit 74180eb into llvm:main Oct 3, 2025
9 checks passed

4 participants