# CUDA Parallel Tutorial - High-Level GPU Programming
## Table of Contents

9. Exercises

In [1]:
import numpy as np
import cupy as cp

import cuda.cccl.parallel.experimental as parallel

## 9. Exercises

Now it's time to practice. Here are some hands-on exercises to reinforce your learning:

### Exercise 1: Compute the minimum value of a sequence
Use `reduce_into()` to compute the minimum value of the sequence in `d_input` and write it to `d_output`.

In [2]:
# Prepare the input and output arrays.
d_input = cp.array([-2, 3, 5, 1, 7, -6, 8, -4], dtype=np.int32)
d_output = cp.empty(1, dtype=np.int32)


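# begin TODO
# The initial value must be the identity for min: the largest int32,
# so that it can never be smaller than any element of the input.
MAX_INT = np.iinfo(np.int32).max
h_init = np.asarray([MAX_INT], dtype=np.int32)
parallel.reduce_into(d_input, d_output, parallel.OpKind.MINIMUM, len(d_input), h_init)
# end TODO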

expected_output = -6
assert (d_output == expected_output).all()
result = d_output[0]
print(f"Min reduction result: {result}")

Min reduction result: -6
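
The choice of `h_init` in the solution above can be checked on the host. The following is a minimal CPU-only sketch, not part of the `cuda.cccl.parallel` API: it uses `functools.reduce` with a hypothetical helper `min_op` to show that the largest `int32` acts as the identity for a minimum reduction, so it never affects the result of a non-empty input and is what comes back for an empty one.

```python
from functools import reduce

import numpy as np

# Same values as d_input in Exercise 1, kept on the host.
values = np.array([-2, 3, 5, 1, 7, -6, 8, -4], dtype=np.int32)

# Largest representable int32: no element can exceed it,
# so min(init, x) == x for every x in the input.
init = int(np.iinfo(np.int32).max)


def min_op(a, b):
    """Binary minimum (hypothetical helper for this sketch)."""
    return a if a < b else b


# Fold the operator over the sequence, starting from the initial value.
host_min = reduce(min_op, values.tolist(), init)

# With an empty input, the reduction simply returns the initial value.
empty_min = reduce(min_op, [], init)

print(f"Host min reduction: {host_min}")
print(f"Empty input yields the initial value: {empty_min}")
```

Comparing `host_min` against `d_output.get()[0]` is one way to validate the GPU result without hard-coding the expected value.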


### Exercise 2: Sort by the last digit
Use `merge_sort()` with a custom comparator function to sort elements by their last digit. For example, `[9, 23, 1001, 802]` -> `[1001, 802, 23, 9]`.

In [3]:
# Prepare the input array. The sort below is done in place, so it also holds the output.
d_in_keys = cp.asarray([29, 9, 136, 1001, 72, 24, 32, 1], dtype="int32")

# Define the custom comparator: lhs comes first if its last digit is smaller.
def comparison_op(lhs, rhs):
    return lhs % 10 < rhs % 10

# Perform the merge sort in place. Only keys are sorted here,
# so the two item (value) arguments are None.
parallel.merge_sort(
    d_in_keys,
    None,
    d_in_keys,
    None,
    comparison_op,
    d_in_keys.size,
)

print(f"Result: {d_in_keys}")
expected = np.asarray([1001, 1, 72, 32, 24, 136, 29, 9], dtype=np.int32)
assert (d_in_keys.get() == expected).all()

Result: [1001    1   72   32   24  136   29    9]
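
The expected ordering above relies on the sort being stable: keys with the same last digit (such as 29 and 9) keep their input order. Here is a minimal host-side sketch that reproduces it with Python's built-in stable `sorted()`. The adapter `as_cmp` is a hypothetical helper written for this sketch; it converts a boolean "less than" comparator like `comparison_op` into the three-way comparison that `functools.cmp_to_key` expects.

```python
from functools import cmp_to_key


def comparison_op(lhs, rhs):
    # Same comparator as in Exercise 2: order by last digit.
    return lhs % 10 < rhs % 10


def as_cmp(less):
    """Turn a boolean less-than predicate into a -1/0/1 comparison."""
    def cmp(a, b):
        if less(a, b):
            return -1
        if less(b, a):
            return 1
        return 0  # equal last digits: leave the relative order alone
    return cmp


keys = [29, 9, 136, 1001, 72, 24, 32, 1]

# sorted() is stable, so ties keep their original relative order.
host_sorted = sorted(keys, key=cmp_to_key(as_cmp(comparison_op)))

print(f"Host reference: {host_sorted}")
```

A reference like this can stand in for the hard-coded `expected` array when experimenting with other inputs or comparators.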


## Resources
API Reference: https://nvidia.github.io/cccl/python/parallel_api.html#module-cuda.cccl.parallel.experimental.algorithms