# Advanced HPX Algorithms

HPXPy exposes HPX's parallel algorithms directly, giving you fine-grained control over execution policies and access to advanced operations.

This tutorial covers:
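- Scan algorithms (inclusive/exclusive scan)
- Transform-reduce operations
- Grouped reductions (reduce_by_key)
- Set operations (union, intersection, difference, symmetric difference, includes)
- Selection algorithms (nth_element, median, percentile)
- Search algorithms (searchsorted, isin, nonzero)
- Array manipulation (flip, roll, array_equal, merge_sorted, stable_sort)
- Deterministic reductions (reduce_deterministic)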

In [None]:
import hpxpy as hpx
import numpy as np

# Initialize HPX runtime
hpx.init()
print(f"Running with {hpx.num_threads()} threads")

## 1. Scan Algorithms

Scan algorithms compute running accumulations across an array. HPXPy exposes both inclusive and exclusive scan with custom operations.

In [None]:
# Inclusive scan: result[i] = op(arr[0], ..., arr[i])
arr = hpx.array([1.0, 2.0, 3.0, 4.0, 5.0])

print("Original:", arr.to_numpy())
print("Inclusive scan (add):", hpx.inclusive_scan(arr, "add").to_numpy())  # cumsum
print("Inclusive scan (mul):", hpx.inclusive_scan(arr, "mul").to_numpy())  # cumprod

In [None]:
# Exclusive scan: result[i] = op(init, arr[0], ..., arr[i-1])
arr = hpx.array([1.0, 2.0, 3.0, 4.0, 5.0])

print("Original:", arr.to_numpy())
print("Exclusive scan (add, init=0):", hpx.exclusive_scan(arr, 0, "add").to_numpy())
print("Exclusive scan (add, init=10):", hpx.exclusive_scan(arr, 10, "add").to_numpy())
print("Exclusive scan (mul, init=1):", hpx.exclusive_scan(arr, 1, "mul").to_numpy())
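
The two scan definitions in the comments above can be cross-checked without HPXPy. The following is a minimal pure-Python sketch built on `itertools.accumulate`; the helper names `inclusive_scan_ref` and `exclusive_scan_ref` are hypothetical and are not part of HPXPy.

```python
from itertools import accumulate
import operator

def inclusive_scan_ref(values, op):
    """result[i] = op(values[0], ..., values[i])"""
    return list(accumulate(values, op))

def exclusive_scan_ref(values, init, op):
    """result[i] = op(init, values[0], ..., values[i-1]); same length as the input."""
    # accumulate(..., initial=init) yields len(values) + 1 items; drop the last one.
    return list(accumulate(values, op, initial=init))[:-1]

data = [1.0, 2.0, 3.0, 4.0, 5.0]
inc_add = inclusive_scan_ref(data, operator.add)          # running sum
inc_mul = inclusive_scan_ref(data, operator.mul)          # running product
exc_add0 = exclusive_scan_ref(data, 0.0, operator.add)
exc_add10 = exclusive_scan_ref(data, 10.0, operator.add)
exc_mul1 = exclusive_scan_ref(data, 1.0, operator.mul)

print(inc_add)    # [1.0, 3.0, 6.0, 10.0, 15.0]
print(exc_add10)  # [10.0, 11.0, 13.0, 16.0, 20.0]
```

Note that the exclusive scan never includes `arr[i]` in `result[i]`, so the last input element does not contribute to any output.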

## 2. Transform-Reduce

The `transform_reduce` operation fuses a transform and a reduction into a single pass over the data. This avoids materializing the intermediate transformed array, so it is typically more efficient than running the two steps separately.

In [None]:
arr = hpx.array([1.0, 2.0, 3.0, 4.0, 5.0])

print("Array:", arr.to_numpy())

# Sum of squares in one pass
sum_sq = hpx.transform_reduce(arr, "square", "add")
print(f"Sum of squares: {sum_sq}")
print(f"  Verification: {np.sum(arr.to_numpy()**2)}")

# Sum of absolute values
arr_signed = hpx.array([-1.0, 2.0, -3.0, 4.0, -5.0])
sum_abs = hpx.transform_reduce(arr_signed, "abs", "add")
print(f"\nSigned array: {arr_signed.to_numpy()}")
print(f"Sum of absolute values: {sum_abs}")

# Product using identity transform
product = hpx.transform_reduce(arr, "identity", "mul")
print(f"\nProduct of elements: {product}")
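
To make the "single pass" idea concrete, here is a small sequential sketch in plain Python. It applies the transform to each element and folds the result into the accumulator immediately, so no intermediate list is built. The function name `transform_reduce_ref` and its explicit `init` argument are assumptions made for illustration; they do not mirror HPXPy's signature.

```python
import operator

def transform_reduce_ref(values, transform, op, init):
    """Fold op over transform(v) for each v, without storing the transformed values."""
    acc = init
    for v in values:
        acc = op(acc, transform(v))
    return acc

data = [1.0, 2.0, 3.0, 4.0, 5.0]
signed = [-1.0, 2.0, -3.0, 4.0, -5.0]

sum_sq = transform_reduce_ref(data, lambda x: x * x, operator.add, 0.0)
sum_abs = transform_reduce_ref(signed, abs, operator.add, 0.0)
product = transform_reduce_ref(data, lambda x: x, operator.mul, 1.0)

print(sum_sq, sum_abs, product)  # 55.0 15.0 120.0
```

The `init` value should be the identity of the reduction (0 for addition, 1 for multiplication) so that it does not change the result.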

## 3. Reduce by Key

Group values by key and reduce each group. This is useful for computing grouped statistics, histograms, and aggregations.

In [None]:
# Simple grouped sum
keys = hpx.array([1.0, 2.0, 1.0, 2.0, 1.0, 3.0, 3.0])
values = hpx.array([10.0, 20.0, 30.0, 40.0, 50.0, 5.0, 15.0])

print("Keys:", keys.to_numpy())
print("Values:", values.to_numpy())

unique_keys, sums = hpx.reduce_by_key(keys, values, "add")
print("\nGrouped sums:")
for k, v in zip(unique_keys.to_numpy(), sums.to_numpy()):
    print(f"  Key {int(k)}: sum = {v}")

In [None]:
# Different reduction operations
keys = hpx.array([1.0, 1.0, 1.0, 2.0, 2.0, 2.0])
values = hpx.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])

print("Keys:", keys.to_numpy())
print("Values:", values.to_numpy())

_, group_max = hpx.reduce_by_key(keys, values, "max")
_, group_min = hpx.reduce_by_key(keys, values, "min")
_, group_sum = hpx.reduce_by_key(keys, values, "add")

print("\nGroup 1: max={}, min={}, sum={}".format(
    group_max.to_numpy()[0], group_min.to_numpy()[0], group_sum.to_numpy()[0]))
print("Group 2: max={}, min={}, sum={}".format(
    group_max.to_numpy()[1], group_min.to_numpy()[1], group_sum.to_numpy()[1]))
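
The grouping semantics can be sketched in plain Python with a dictionary. This sketch assumes that equal keys are grouped regardless of their position in the input and that the output is ordered by key, which is what the examples above suggest; HPXPy's actual behavior for unsorted keys should be checked against its documentation. `reduce_by_key_ref` is a hypothetical helper name.

```python
import operator

def reduce_by_key_ref(keys, values, op):
    """Reduce all values sharing a key with op; return (sorted unique keys, reduced values)."""
    groups = {}
    for k, v in zip(keys, values):
        # First value for a key seeds the group; later values are folded in.
        groups[k] = op(groups[k], v) if k in groups else v
    unique_keys = sorted(groups)
    return unique_keys, [groups[k] for k in unique_keys]

keys = [1.0, 2.0, 1.0, 2.0, 1.0, 3.0, 3.0]
values = [10.0, 20.0, 30.0, 40.0, 50.0, 5.0, 15.0]

ref_keys, ref_sums = reduce_by_key_ref(keys, values, operator.add)
_, ref_max = reduce_by_key_ref(keys, values, max)

print(ref_keys)  # [1.0, 2.0, 3.0]
print(ref_sums)  # [90.0, 60.0, 20.0]
```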

## 4. Set Operations

HPXPy provides efficient parallel set operations on sorted arrays.

In [None]:
a = hpx.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = hpx.array([3.0, 4.0, 5.0, 6.0, 7.0])

print("a:", a.to_numpy())
print("b:", b.to_numpy())
print()

# Union: elements in either set
print("union1d(a, b):", hpx.union1d(a, b).to_numpy())

# Intersection: elements in both sets
print("intersect1d(a, b):", hpx.intersect1d(a, b).to_numpy())

# Difference: elements in a but not in b
print("setdiff1d(a, b):", hpx.setdiff1d(a, b).to_numpy())

# Symmetric difference: elements in exactly one set
print("setxor1d(a, b):", hpx.setxor1d(a, b).to_numpy())

In [None]:
# Check if one set contains all elements of another
superset = hpx.array([1.0, 2.0, 3.0, 4.0, 5.0])
subset1 = hpx.array([2.0, 4.0])
subset2 = hpx.array([2.0, 6.0])  # 6 not in superset

print("Superset:", superset.to_numpy())
print("Subset 1:", subset1.to_numpy())
print("Subset 2:", subset2.to_numpy())
print()
print(f"Superset includes subset1? {hpx.includes(superset, subset1)}")
print(f"Superset includes subset2? {hpx.includes(superset, subset2)}")
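
The expected results of the set operations above can be cross-checked with Python's built-in `set` type. This is only a reference for inputs without duplicates: built-in sets discard repeated elements, and how HPXPy treats duplicates is not covered here. The helper names are hypothetical.

```python
def set_ops_ref(a, b):
    """Return sorted results of the four set operations for duplicate-free inputs."""
    sa, sb = set(a), set(b)
    return {
        "union1d": sorted(sa | sb),      # in either input
        "intersect1d": sorted(sa & sb),  # in both inputs
        "setdiff1d": sorted(sa - sb),    # in a but not in b
        "setxor1d": sorted(sa ^ sb),     # in exactly one input
    }

def includes_ref(superset, subset):
    """True if every element of subset also appears in superset."""
    return set(subset) <= set(superset)

a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [3.0, 4.0, 5.0, 6.0, 7.0]
ops = set_ops_ref(a, b)

for name, result in ops.items():
    print(name, result)
print(includes_ref(a, [2.0, 4.0]), includes_ref(a, [2.0, 6.0]))  # True False
```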

## 5. Selection Algorithms

Selection algorithms find specific order statistics (such as the median or a percentile) in O(n) time on average, without fully sorting the array, which would cost O(n log n).

In [None]:
# nth_element: Find the nth smallest element in O(n) time
arr = hpx.array([7.0, 2.0, 9.0, 1.0, 5.0, 8.0, 3.0, 6.0, 4.0])

print("Array:", arr.to_numpy())
print("Sorted would be: [1, 2, 3, 4, 5, 6, 7, 8, 9]")
print()
print(f"nth_element(arr, 0) = {hpx.nth_element(arr, 0)} (minimum)")
print(f"nth_element(arr, 4) = {hpx.nth_element(arr, 4)} (5th smallest = median)")
print(f"nth_element(arr, -1) = {hpx.nth_element(arr, -1)} (maximum)")
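
For intuition about why selection can beat sorting, here is a sketch of quickselect, a classic selection algorithm with O(n) average running time. It is shown purely as an illustration of the idea; it is not a description of how HPXPy or HPX implements `nth_element`. The function name and the `seed` parameter are hypothetical.

```python
import random

def quickselect(values, n, seed=0):
    """Return the n-th smallest element (0-based; negative n counts from the end)."""
    rng = random.Random(seed)
    items = list(values)
    if n < 0:
        n += len(items)
    if not 0 <= n < len(items):
        raise IndexError("n out of range")
    while True:
        pivot = rng.choice(items)
        lows = [x for x in items if x < pivot]
        n_equal = sum(1 for x in items if x == pivot)
        if n < len(lows):
            items = lows                      # answer is among the smaller elements
        elif n < len(lows) + n_equal:
            return pivot                      # answer is the pivot itself
        else:
            n -= len(lows) + n_equal          # skip everything <= pivot
            items = [x for x in items if x > pivot]

data = [7.0, 2.0, 9.0, 1.0, 5.0, 8.0, 3.0, 6.0, 4.0]
smallest = quickselect(data, 0)
middle = quickselect(data, 4)
largest = quickselect(data, -1)
print(smallest, middle, largest)  # 1.0 5.0 9.0
```

Each round discards the part of the data that cannot contain the answer, so on average the work shrinks geometrically instead of sorting every element.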

In [None]:
# Median and percentiles
arr = hpx.arange(100, dtype='float64')

print(f"Median: {hpx.median(arr)}")
print(f"25th percentile: {hpx.percentile(arr, 25)}")
print(f"75th percentile: {hpx.percentile(arr, 75)}")
print(f"90th percentile: {hpx.percentile(arr, 90)}")

# Verify against NumPy
np_arr = arr.to_numpy()
print(f"\nNumPy median: {np.median(np_arr)}")
print(f"NumPy 25th percentile: {np.percentile(np_arr, 25)}")
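
When comparing against NumPy, it helps to know how a percentile that falls between two elements is computed. The sketch below assumes linear interpolation between neighboring order statistics, which is NumPy's default method for `np.percentile`; whether HPXPy uses the same rule is an assumption to verify. `percentile_linear` is a hypothetical helper, and it sorts for simplicity rather than using selection.

```python
import math

def percentile_linear(values, q):
    """q-th percentile (0 <= q <= 100) using linear interpolation between neighbors."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("empty input")
    pos = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac

data = [float(i) for i in range(100)]
p25 = percentile_linear(data, 25)
p50 = percentile_linear(data, 50)
p75 = percentile_linear(data, 75)
print(p25, p50, p75)  # 24.75 49.5 74.25
```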

## 6. Search Algorithms

Efficient algorithms for searching sorted arrays and testing membership.

In [None]:
# searchsorted: Binary search for insertion points
sorted_arr = hpx.array([10.0, 20.0, 30.0, 40.0, 50.0])
values = hpx.array([15.0, 25.0, 35.0, 5.0, 55.0])

print("Sorted array:", sorted_arr.to_numpy())
print("Values to search:", values.to_numpy())

indices = hpx.searchsorted(sorted_arr, values)
print("\nInsertion indices (left):")
for v, i in zip(values.to_numpy(), indices.to_numpy()):
    print(f"  {v} -> index {i}")
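
The "left" insertion point is the first index at which the value could be inserted while keeping the array sorted. Python's standard `bisect` module implements the same binary search, so it can serve as a reference for the expected indices. The comparison with `bisect_right` shows how the two sides differ when the value is already present.

```python
from bisect import bisect_left, bisect_right

sorted_vals = [10.0, 20.0, 30.0, 40.0, 50.0]
queries = [15.0, 25.0, 35.0, 5.0, 55.0]

# One binary search per query value.
left_indices = [bisect_left(sorted_vals, q) for q in queries]
print(left_indices)  # [1, 2, 3, 0, 5]

# For a value that already exists, "left" lands before it and "right" after it.
tie_left = bisect_left(sorted_vals, 20.0)
tie_right = bisect_right(sorted_vals, 20.0)
print(tie_left, tie_right)  # 1 2
```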

In [None]:
# isin: Test membership
arr = hpx.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
test_values = hpx.array([2.0, 4.0, 6.0, 8.0])  # Even numbers

print("Array:", arr.to_numpy())
print("Test values:", test_values.to_numpy())

mask = hpx.isin(arr, test_values)
print("Is in test values:", mask.to_numpy())

In [None]:
# nonzero: Find indices of non-zero elements
arr = hpx.array([0.0, 1.0, 0.0, 2.0, 0.0, 0.0, 3.0, 0.0])

print("Array:", arr.to_numpy())
nz_indices = hpx.nonzero(arr)
print("Non-zero indices:", nz_indices.to_numpy())

# Use NumPy for fancy indexing (not yet supported in HPXPy)
np_arr = arr.to_numpy()
print("Non-zero values:", np_arr[nz_indices.to_numpy()])

## 7. Array Manipulation

Additional algorithms for manipulating and comparing arrays.

In [None]:
# flip: Reverse array
arr = hpx.array([1.0, 2.0, 3.0, 4.0, 5.0])
print("Original:", arr.to_numpy())
print("Flipped:", hpx.flip(arr).to_numpy())

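# roll: Rotate array elements (assuming NumPy's convention: a positive shift
# moves elements toward higher indices, wrapping around at the end)
print("\nRoll examples:")
print("Roll by 2 (right):", hpx.roll(arr, 2).to_numpy())
print("Roll by -2 (left):", hpx.roll(arr, -2).to_numpy())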
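
The direction of a roll is easy to get backwards. Under NumPy's convention for `np.roll`, a positive shift moves elements toward higher indices and wraps the tail around to the front. The pure-Python sketch below reproduces that convention; it assumes `hpx.roll` follows NumPy here, which is worth confirming on your installation. `roll_ref` is a hypothetical helper.

```python
def roll_ref(values, shift):
    """Rotate values by shift positions, NumPy-style (positive = toward higher indices)."""
    items = list(values)
    if not items:
        return items
    k = shift % len(items)          # normalize negative and oversized shifts
    return items[-k:] + items[:-k] if k else items

data = [1.0, 2.0, 3.0, 4.0, 5.0]
rolled_pos = roll_ref(data, 2)
rolled_neg = roll_ref(data, -2)
flipped = data[::-1]

print(rolled_pos)  # [4.0, 5.0, 1.0, 2.0, 3.0]
print(rolled_neg)  # [3.0, 4.0, 5.0, 1.0, 2.0]
```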

In [None]:
# array_equal: Compare arrays
a = hpx.array([1.0, 2.0, 3.0])
b = hpx.array([1.0, 2.0, 3.0])
c = hpx.array([1.0, 2.0, 4.0])

print("a:", a.to_numpy())
print("b:", b.to_numpy())
print("c:", c.to_numpy())
print()
print(f"array_equal(a, b): {hpx.array_equal(a, b)}")
print(f"array_equal(a, c): {hpx.array_equal(a, c)}")

In [None]:
# merge_sorted: Merge two sorted arrays
sorted1 = hpx.array([1.0, 3.0, 5.0, 7.0, 9.0])
sorted2 = hpx.array([2.0, 4.0, 6.0, 8.0, 10.0])

print("Sorted array 1:", sorted1.to_numpy())
print("Sorted array 2:", sorted2.to_numpy())
print("Merged:", hpx.merge_sorted(sorted1, sorted2).to_numpy())

In [None]:
# stable_sort: Preserves relative order of equal elements
arr = hpx.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

print("Original:", arr.to_numpy())
print("Stable sorted:", hpx.stable_sort(arr).to_numpy())
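
With a plain array of floats, stability cannot be observed, because equal values are indistinguishable. It matters when each value carries extra data, as in the sketch below, which uses Python's built-in `sorted` (guaranteed stable) on hypothetical `(label, value)` records.

```python
# Each record is (label, value); we sort by value only.
records = [("pear", 3.0), ("fig", 1.0), ("kiwi", 4.0), ("lime", 1.0), ("plum", 3.0)]

by_value = sorted(records, key=lambda r: r[1])

# Records with equal values keep their original relative order:
# "fig" stays before "lime", and "pear" stays before "plum".
labels = [label for label, _ in by_value]
print(labels)  # ['fig', 'lime', 'pear', 'plum', 'kiwi']
```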

## 8. Deterministic Reductions

Floating-point addition is not associative, so a parallel reduction can return slightly different results depending on the order in which partial results are combined. Use `reduce_deterministic` when reproducibility is required.

In [None]:
# Demonstrate potential floating-point variance
arr = hpx.array([1e20, 1.0, -1e20, 2.0, 1e20, 3.0, -1e20])

print("Array with large magnitude differences:")
print(arr.to_numpy())
print()

# Parallel reduction (may vary)
par_result = hpx.reduce(arr, "add", policy="par")
print(f"Parallel sum: {par_result}")

# Deterministic reduction (always same result)
det_result = hpx.reduce_deterministic(arr, "add")
print(f"Deterministic sum: {det_result}")

# Compare to NumPy
np_result = np.sum(arr.to_numpy())
print(f"NumPy sum: {np_result}")
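
The order dependence can be reproduced in plain Python with the same values. The sketch below sums the data in two different orders using `functools.reduce` (a strict left-to-right fold) and compares both with `math.fsum`, which returns the correctly rounded sum. It illustrates the non-associativity only; it does not show how `reduce_deterministic` works internally.

```python
import math
import operator
from functools import reduce

data = [1e20, 1.0, -1e20, 2.0, 1e20, 3.0, -1e20]

# Strict left-to-right fold: the small terms are absorbed by 1e20 and lost.
left_to_right = reduce(operator.add, data, 0.0)

# Same values, different order: the large terms cancel first, so nothing is lost.
reordered = reduce(operator.add, [1e20, -1e20, 1e20, -1e20, 1.0, 2.0, 3.0], 0.0)

# Correctly rounded sum, independent of input order.
exact = math.fsum(data)

print(left_to_right, reordered, exact)  # 0.0 6.0 6.0
```

A parallel reduction splits the input into chunks and combines partial sums in an order that can vary between runs, which is how the same input can produce either kind of result.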

## 9. Performance Example

Let's benchmark some algorithms on larger data.

In [None]:
import time

n = 1_000_000
arr = hpx.random.uniform(0, 1000, size=n)

print(f"Array size: {n:,} elements")
print()

def benchmark(name, func):
    # perf_counter is a monotonic, high-resolution clock intended for timing
    start = time.perf_counter()
    result = func()
    elapsed = (time.perf_counter() - start) * 1000
    print(f"{name:25s}: {elapsed:7.2f} ms")
    return result

# Benchmark various algorithms
benchmark("sum (parallel)", lambda: hpx.sum(arr, policy="par"))
benchmark("transform_reduce (sum sq)", lambda: hpx.transform_reduce(arr, "square", "add"))
benchmark("inclusive_scan (cumsum)", lambda: hpx.inclusive_scan(arr, "add"))
benchmark("median (O(n) selection)", lambda: hpx.median(arr))
benchmark("unique", lambda: hpx.unique(arr))
benchmark("sort (parallel)", lambda: hpx.sort(arr, policy="par"))
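
A single timed call can be dominated by one-off costs such as warm-up or memory allocation. One common refinement, sketched below with the standard library only, is to run an untimed warm-up call and then report the best of several timed repeats. The helper `benchmark_best_of` and its parameters are hypothetical, and the workload here is plain Python sorting rather than an HPXPy call.

```python
import random
import time

def benchmark_best_of(func, repeats=5, warmup=1):
    """Return (last result, best elapsed time in ms) over several timed runs."""
    for _ in range(warmup):
        func()                                  # untimed warm-up run
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func()
        best = min(best, time.perf_counter() - start)
    return result, best * 1000.0

rng = random.Random(42)
sample = [rng.uniform(0, 1000) for _ in range(100_000)]

sorted_sample, best_ms = benchmark_best_of(lambda: sorted(sample))
print(f"{'sorted (pure Python)':25s}: {best_ms:7.2f} ms")
```

Taking the minimum rather than the mean reduces the influence of unrelated system activity on the reported time.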

In [None]:
# Clean up
hpx.finalize()
print("Runtime finalized")

## Summary

In this tutorial, you learned about HPXPy's advanced algorithms:

1. **Scan Algorithms**: `inclusive_scan`, `exclusive_scan` - running accumulations with custom operations
2. **Transform-Reduce**: `transform_reduce` - fused transform + reduce for efficiency
3. **Grouped Reductions**: `reduce_by_key` - aggregate values by group
4. **Set Operations**: `union1d`, `intersect1d`, `setdiff1d`, `setxor1d`, `includes`
5. **Selection Algorithms**: `nth_element`, `median`, `percentile` - O(n) order statistics
6. **Search Algorithms**: `searchsorted`, `isin`, `nonzero`
7. **Array Manipulation**: `flip`, `roll`, `array_equal`, `merge_sorted`, `stable_sort`
8. **Deterministic Reductions**: `reduce`, `reduce_deterministic` - control over reproducibility

### Key Takeaways

- Use **execution policies** (`policy="seq"` or `policy="par"`) to control parallelism
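- **Transform-reduce** is typically more efficient than a separate transform followed by a reduce, because it avoids the intermediate array
- **Selection algorithms** (nth_element, median) run in O(n) time on average, faster than sorting when you only need a few order statistics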
- **Set operations** work on sorted arrays and are parallelized
- Use **reduce_deterministic** when you need reproducible floating-point results

These algorithms directly expose HPX's parallel capabilities, giving you fine-grained control over execution while maintaining NumPy-like ergonomics.