Add tests/parallel/examples/scan/scan_applications.py #5634

Merged
shwina merged 2 commits into NVIDIA:main from oleksandr-pavlyk:add-scan-applications-example
Aug 26, 2025

Conversation

@oleksandr-pavlyk
Contributor

@oleksandr-pavlyk oleksandr-pavlyk commented Aug 22, 2025

Description

This file contains two applications.

  1. inclusive_segmented_scan_example

This example demonstrates how to compute a segmented scan using an ordinary scan with a crafted scan operation over a flag-value struct.

This example uses ZipIterator to efficiently read values and head flags.
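For readers unfamiliar with the trick, here is a minimal host-side sketch of the idea, not the code from this PR. It assumes the usual segmented-scan operator over (head flag, value) pairs, with `itertools.accumulate` standing in for the device-side inclusive scan and `zip` standing in for ZipIterator. The names `segmented_op` and `inclusive_segmented_scan` are hypothetical.

```python
from itertools import accumulate


def segmented_op(left, right):
    """Combine two (head_flag, value) pairs; associative, not commutative."""
    left_flag, left_value = left
    right_flag, right_value = right
    # A head flag on the right operand starts a new segment, so the running
    # value restarts there; otherwise the values keep accumulating.
    value = right_value if right_flag else left_value + right_value
    return (left_flag | right_flag, value)


def inclusive_segmented_scan(values, head_flags):
    # zip pairs each value with its head flag, much like a zipped input.
    pairs = zip(head_flags, values)
    return [value for _, value in accumulate(pairs, segmented_op)]


values = [1, 2, 3, 4, 5, 6]
head_flags = [1, 0, 0, 1, 0, 1]
result = inclusive_segmented_scan(values, head_flags)
print(result)  # [1, 3, 6, 4, 9, 6]
```

Because the operator is associative, the same combination rule can be handed to a parallel scan; the sequential fold above only illustrates the semantics.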

  2. logcdf_from_logpdf_example

This example uses inclusive_scan with a logaddsum operation: $(v_1, v_2) \mapsto \log(\exp(v_1) + \exp(v_2))$

Applied to a sequence of log-probabilities (computed in this example for the binomial distribution using cupyx), the scan should produce a strictly increasing sequence that ends with 0.0 == log(1.0).
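A small CPU-only sketch of this step, under stated assumptions: it uses `math.lgamma` to build binomial log-probabilities instead of cupyx, and `itertools.accumulate` instead of the device-side inclusive_scan. The helper names `logaddexp` and `binom_logpmf` and the parameters `n = 20`, `p = 0.3` are chosen for illustration only.

```python
import math
from itertools import accumulate


def logaddexp(a, b):
    """Return log(exp(a) + exp(b)) without overflowing for large inputs."""
    hi, lo = max(a, b), min(a, b)
    if hi == -math.inf:
        return -math.inf  # both terms are log(0)
    # Factor out the larger term so the exponent is never positive.
    return hi + math.log1p(math.exp(lo - hi))


def binom_logpmf(k, n, p):
    """Log of the binomial probability mass at k, via log-gamma."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log1p(-p)


n, p = 20, 0.3
logpmf = [binom_logpmf(k, n, p) for k in range(n + 1)]

# An inclusive scan with logaddexp turns log-probabilities into a log-CDF.
logcdf = list(accumulate(logpmf, logaddexp))
print(logcdf[-1])  # expected to be very close to 0.0 == log(1.0)
```

The last element is only approximately zero in floating point, which is the kind of rounding effect the description goes on to mention.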

Due to numerical accuracy issues (?) this is not always true, so we follow up with another application of inclusive_scan with the maximum operator, producing a non-decreasing sequence.
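As an illustration of the follow-up pass, here is a sketch of a running-maximum scan on the host. The input `noisy` is made-up data with two small dips, not output from the example in this PR, and `enforce_non_decreasing` is a hypothetical name.

```python
from itertools import accumulate


def enforce_non_decreasing(seq):
    """Inclusive scan with max: each output is the largest value seen so far."""
    return list(accumulate(seq, max))


# Made-up log-CDF-like values with two small dips caused by rounding.
noisy = [-3.0, -1.5, -1.5000001, -0.4, -0.1, -0.1000002, 0.0]
fixed = enforce_non_decreasing(noisy)
print(fixed)  # [-3.0, -1.5, -1.5, -0.4, -0.1, -0.1, 0.0]
```

The result may contain repeated values, so it is non-decreasing rather than strictly increasing, which is sufficient for a sorted search.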

cp.searchsorted is used to compute quantiles of the binomial distribution. The result is compared with the reference scipy.stats.distributions.binom.isf.
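To show how a sorted search over a log-CDF yields quantiles, here is a host-side sketch in which `bisect.bisect_left` plays the role that a left-sided searchsorted plays for arrays. It assumes a binomial distribution with n = 4 and p = 0.5, whose probabilities are exactly representable, and the helper names `ppf` and `isf` are hypothetical. The exact mapping used in the PR's example file is not shown in this conversation, so treat this as one plausible formulation.

```python
import math
from bisect import bisect_left
from itertools import accumulate

# Binomial(n=4, p=0.5): pmf is [1, 4, 6, 4, 1] / 16, exact in floating point.
pmf = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]
logcdf = [math.log(c) for c in accumulate(pmf)]


def ppf(q):
    """Smallest k with CDF(k) >= q, found by a sorted search in log space."""
    return bisect_left(logcdf, math.log(q))


def isf(q):
    """Inverse survival function: smallest k with CDF(k) >= 1 - q."""
    return bisect_left(logcdf, math.log1p(-q))


print(ppf(0.5))  # 2
print(isf(0.1))  # 3
```

Working with log(1 - q) via `log1p` keeps the comparison in the same log domain as the scanned values.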

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@oleksandr-pavlyk oleksandr-pavlyk requested a review from a team as a code owner August 22, 2025 20:02
@github-project-automation github-project-automation bot moved this to Todo in CCCL Aug 22, 2025
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Aug 22, 2025
@github-actions
Contributor

🟩 CI finished in 33m 25s: Pass: 100%/22 | Total: 3h 51m | Avg: 10m 31s | Max: 19m 39s
  • 🟩 python: Pass: 100%/22 | Total: 3h 51m | Avg: 10m 31s | Max: 19m 39s

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  3h 51m | Avg: 10m 31s | Max: 19m 39s
    🟩 ctk
      🟩 12.5               Pass: 100%/6   | Total: 46m 36s | Avg:  7m 46s | Max: 15m 34s
      🟩 12.8               Pass: 100%/2   | Total: 39m 01s | Avg: 19m 30s | Max: 19m 38s
      🟩 12.9               Pass: 100%/14  | Total:  2h 26m | Avg: 10m 25s | Max: 19m 39s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/6   | Total: 46m 36s | Avg:  7m 46s | Max: 15m 34s
      🟩 nvcc12.8           Pass: 100%/2   | Total: 39m 01s | Avg: 19m 30s | Max: 19m 38s
      🟩 nvcc12.9           Pass: 100%/14  | Total:  2h 26m | Avg: 10m 25s | Max: 19m 39s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  3h 51m | Avg: 10m 31s | Max: 19m 39s
    🟩 cxx
      🟩 GCC13              Pass: 100%/22  | Total:  3h 51m | Avg: 10m 31s | Max: 19m 39s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/22  | Total:  3h 51m | Avg: 10m 31s | Max: 19m 39s
    🟩 gpu
      🟩 h100               Pass: 100%/4   | Total: 41m 21s | Avg: 10m 20s | Max: 17m 49s
      🟩 l4                 Pass: 100%/18  | Total:  3h 10m | Avg: 10m 34s | Max: 19m 39s
    🟩 jobs
      🟩 Build cuda.cccl    Pass: 100%/2   | Total: 19m 59s | Avg:  9m 59s | Max: 10m 10s
      🟩 Test cuda.cccl.cooperative Pass: 100%/5   | Total:  1h 10m | Avg: 14m 01s | Max: 15m 34s
      🟩 Test cuda.cccl.examples Pass: 100%/5   | Total: 24m 13s | Avg:  4m 50s | Max:  5m 56s
      🟩 Test cuda.cccl.headers Pass: 100%/5   | Total: 21m 25s | Avg:  4m 17s | Max:  5m 14s
      🟩 Test cuda.cccl.parallel Pass: 100%/5   | Total:  1h 35m | Avg: 19m 11s | Max: 19m 39s
    🟩 py_version
      🟩 3.10               Pass: 100%/9   | Total:  1h 36m | Avg: 10m 40s | Max: 19m 39s
      🟩 3.13               Pass: 100%/13  | Total:  2h 15m | Avg: 10m 25s | Max: 19m 38s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
CCCL Packaging
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
CCCL Packaging
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 22)

# Runner
16 linux-amd64-gpu-l4-latest-1
4 linux-amd64-gpu-h100-latest-1
2 linux-amd64-cpu16

@leofang
Member

leofang commented Aug 22, 2025

Very cool!

Contributor

@shwina shwina left a comment

Very cool indeed!

@github-actions
Contributor

🟩 CI finished in 31m 40s: Pass: 100%/22 | Total: 3h 59m | Avg: 10m 52s | Max: 20m 43s
  • 🟩 python: Pass: 100%/22 | Total: 3h 59m | Avg: 10m 52s | Max: 20m 43s

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  3h 59m | Avg: 10m 52s | Max: 20m 43s
    🟩 ctk
      🟩 12.5               Pass: 100%/6   | Total: 46m 18s | Avg:  7m 43s | Max: 13m 42s
      🟩 12.8               Pass: 100%/2   | Total: 40m 45s | Avg: 20m 22s | Max: 20m 35s
      🟩 12.9               Pass: 100%/14  | Total:  2h 32m | Avg: 10m 52s | Max: 20m 43s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/6   | Total: 46m 18s | Avg:  7m 43s | Max: 13m 42s
      🟩 nvcc12.8           Pass: 100%/2   | Total: 40m 45s | Avg: 20m 22s | Max: 20m 35s
      🟩 nvcc12.9           Pass: 100%/14  | Total:  2h 32m | Avg: 10m 52s | Max: 20m 43s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  3h 59m | Avg: 10m 52s | Max: 20m 43s
    🟩 cxx
      🟩 GCC13              Pass: 100%/22  | Total:  3h 59m | Avg: 10m 52s | Max: 20m 43s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/22  | Total:  3h 59m | Avg: 10m 52s | Max: 20m 43s
    🟩 gpu
      🟩 h100               Pass: 100%/4   | Total: 45m 10s | Avg: 11m 17s | Max: 17m 00s
      🟩 l4                 Pass: 100%/18  | Total:  3h 14m | Avg: 10m 47s | Max: 20m 43s
    🟩 jobs
      🟩 Build cuda.cccl    Pass: 100%/2   | Total: 19m 18s | Avg:  9m 39s | Max:  9m 40s
      🟩 Test cuda.cccl.cooperative Pass: 100%/5   | Total:  1h 11m | Avg: 14m 18s | Max: 16m 25s
      🟩 Test cuda.cccl.examples Pass: 100%/5   | Total: 26m 11s | Avg:  5m 14s | Max:  6m 18s
      🟩 Test cuda.cccl.headers Pass: 100%/5   | Total: 23m 14s | Avg:  4m 38s | Max:  5m 27s
      🟩 Test cuda.cccl.parallel Pass: 100%/5   | Total:  1h 39m | Avg: 19m 49s | Max: 20m 43s
    🟩 py_version
      🟩 3.10               Pass: 100%/9   | Total:  1h 37m | Avg: 10m 48s | Max: 20m 39s
      🟩 3.13               Pass: 100%/13  | Total:  2h 22m | Avg: 10m 56s | Max: 20m 43s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
CCCL Packaging
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
CCCL Packaging
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 22)

# Runner
16 linux-amd64-gpu-l4-latest-1
4 linux-amd64-gpu-h100-latest-1
2 linux-amd64-cpu16

@shwina shwina merged commit f69b41b into NVIDIA:main Aug 26, 2025
36 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Aug 26, 2025
@oleksandr-pavlyk oleksandr-pavlyk deleted the add-scan-applications-example branch August 27, 2025 17:12
davebayer pushed a commit to davebayer/cccl that referenced this pull request Sep 23, 2025
* Add tests/parallel/examples/scan/scan_applications.py
* Add example of computing exponential moving average using inclusive_scan
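
The conversation does not show how the exponential moving average example is implemented. One common way to express it as a scan, sketched here on the host as an assumption rather than as the code from this commit, is to treat each step as an affine map y -> a*y + b and scan with map composition. The names `compose`, `ema_via_scan`, and `ema_sequential` are hypothetical.

```python
from itertools import accumulate


def compose(left, right):
    """Compose affine maps: apply left (a1, b1) first, then right (a2, b2)."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def ema_via_scan(xs, alpha):
    decay = 1.0 - alpha
    # Step t is y -> decay * y + alpha * x_t; the first step seeds y_0 = x_0.
    pairs = [(decay, x if i == 0 else alpha * x) for i, x in enumerate(xs)]
    # The offset of each prefix composition is the moving average at that step.
    return [b for _, b in accumulate(pairs, compose)]


def ema_sequential(xs, alpha):
    """Reference recurrence: y_t = alpha * x_t + (1 - alpha) * y_{t-1}."""
    out = []
    for i, x in enumerate(xs):
        out.append(x if i == 0 else alpha * x + (1.0 - alpha) * out[-1])
    return out


xs = [1.0, 2.0, 3.0, 4.0]
print(ema_via_scan(xs, 0.5))  # [1.0, 1.5, 2.25, 3.125]
```

Composition of affine maps is associative, which is what allows the recurrence to be evaluated by a parallel inclusive scan.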