Use forward declarations of extended floating point types instead of including the headers by miscco · Pull Request #5846 · NVIDIA/cccl

miscco · 2025-09-10T10:48:10Z

The CUDA floating point headers carry a lot of weight.

We only really need those for cmath and complex, so try to get away with only forward declaring the types, where available, in other places.
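For readers unfamiliar with the technique, here is a minimal sketch of why a forward declaration is often enough. The names `demo_half`, `is_extended_fp`, `to_float`, and `is_null` are hypothetical stand-ins and not part of CCCL or the CUDA headers. An incomplete type is sufficient for trait specializations and for signatures that take the type by pointer or reference. The full definition is only needed where the type is constructed, copied, or has its members accessed.

```cpp
#include <type_traits>

// Hypothetical stand-in for a type whose full definition lives in a heavy header.
// Only the name is introduced here; the type stays incomplete in this file.
struct demo_half;

// Trait specializations only need the name of the type, not its definition.
template <class T>
struct is_extended_fp : std::false_type {};

template <>
struct is_extended_fp<demo_half> : std::true_type {};

// Declarations taking the type by reference or pointer are also fine.
// A definition of to_float would live where the full header is included.
float to_float(const demo_half& value);

// Comparing a pointer to an incomplete type against nullptr needs no definition.
constexpr bool is_null(const demo_half* p)
{
  return p == nullptr;
}
```

Code that needs `sizeof(demo_half)` or its members would still have to include the real header, which matches the split described above: full headers where the arithmetic lives, forward declarations everywhere else.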

davebayer · 2025-09-10T10:58:28Z

libcudacxx/include/cuda/std/__cccl/extended_data_types.h

+struct __nv_fp8_e8m0;
+struct __nv_fp8x2_e8m0;
+struct __nv_fp8x4_e8m0;


This should be guarded by _CCCL_HAS_NVFP8_E8M0(). Maybe we can move the forward declarations lower and guard all of them by their corresponding _CCCL_HAS_NVFPX macro.

This is the header that sets all the _CCCL_HAS_NVFPX macros.

I added the CTK check, so we should be fine.
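A rough sketch of the guarding pattern discussed in this thread, under stated assumptions: every `DEMO_`/`demo_` name and the version numbers below are made up for illustration and do not reflect the actual CCCL macros or CTK thresholds. The idea is that the header first derives a function-like availability macro from a toolkit version check and only afterwards emits the forward declarations guarded by it.

```cpp
// Hypothetical toolkit version. Real code would take this from the toolkit itself.
#define DEMO_TOOLKIT_VERSION 2

// Availability macro derived from the version check.
// The threshold of 2 is an arbitrary value chosen for this sketch.
#if DEMO_TOOLKIT_VERSION >= 2
#  define DEMO_HAS_FP8_E8M0() 1
#else
#  define DEMO_HAS_FP8_E8M0() 0
#endif

// Forward declarations come after the macro is set and are guarded by it,
// so the names are never introduced for a toolkit that lacks the types.
#if DEMO_HAS_FP8_E8M0()
struct demo_fp8_e8m0;
struct demo_fp8x2_e8m0;
struct demo_fp8x4_e8m0;
#endif

// Small helper so the result of the check can be inspected from C++ code.
constexpr bool demo_has_fp8_e8m0()
{
  return DEMO_HAS_FP8_E8M0() != 0;
}
```

Placing the declarations below the macro definitions is what makes the guard usable inside the same header that defines it.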


github-actions · 2025-09-10T18:51:57Z

🥳 CI Workflow Results

🟩 Finished in 3h 36m: Pass: 100%/229 | Total: 8d 12h | Max: 3h 35m | Hits: 44%/348747


…including the headers (#5846)

* Use forward declarations of extended floating point types instead of including the headers

  The cuda floating point headers carry a lot of weight. We only really need those for `cmath` and `complex`, so try and get away with only forward declaring the types if available for in other places

* Also do not include the full headers when we only need forward declarations for CUB / thrust
* Guard availability of `__nv_fp8_e8m0` on CTK version
* Include header for floating_point tests

…pes instead of including the headers (#5846) (#5978)

* Use forward declarations of extended floating point types instead of including the headers (#5846)
* Use forward declarations of extended floating point types instead of including the headers

  The cuda floating point headers carry a lot of weight. We only really need those for `cmath` and `complex`, so try and get away with only forward declaring the types if available for in other places

* Also do not include the full headers when we only need forward declarations for CUB / thrust
* Guard availability of `__nv_fp8_e8m0` on CTK version
* Include header for floating_point tests
* Fix python builds?


miscco requested a review from a team as a code owner September 10, 2025 10:48

miscco requested a review from davebayer September 10, 2025 10:48

github-project-automation bot moved this to Todo in CCCL Sep 10, 2025

github-project-automation bot added this to CCCL Sep 10, 2025

cccl-authenticator-app bot moved this from Todo to In Review in CCCL Sep 10, 2025

davebayer requested changes Sep 10, 2025


github-project-automation bot moved this from In Review to In Progress in CCCL Sep 10, 2025

miscco requested review from a team as code owners September 10, 2025 11:19

miscco requested review from bernhardmgruber and fbusato September 10, 2025 11:19

davebayer approved these changes Sep 10, 2025


bernhardmgruber approved these changes Sep 10, 2025


github-project-automation bot moved this from In Progress to In Review in CCCL Sep 10, 2025

miscco added 4 commits September 10, 2025 17:12

Also do not include the full headers when we only need forward declarations for CUB / thrust

2f369d0

Guard availability of __nv_fp8_e8m0 on CTK version

0c171ce

Include header for floating_point tests

c6c9edd

miscco force-pushed the forward_declare_cuda_fp_types branch from d1d9245 to c6c9edd Compare September 10, 2025 15:13

miscco merged commit 0b700f3 into NVIDIA:main Sep 10, 2025
239 checks passed

github-project-automation bot moved this from In Review to Done in CCCL Sep 10, 2025

miscco deleted the forward_declare_cuda_fp_types branch September 10, 2025 19:21
