Description
Add wp.copysign(x, y), which returns a value with the magnitude of x and the sign of y. This matches C's copysign and NumPy's np.copysign.
Motivation
The headline use case (and motivation for landing this alongside the work in GH-1376) is forcing a specific sign on a result whose signed-zero behavior is otherwise implementation-defined, e.g. wp.min(-0.0, +0.0) returning either zero per C99's fmin/fmax allowance. With wp.copysign users can pin the sign deterministically:
result = wp.copysign(wp.min(a, b), -1.0) # always returns -|min|
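To illustrate why pinning the sign matters, here is a minimal host-side sketch that uses Python's standard `math.copysign` as a stand-in for `wp.copysign`. The helper name `pin_negative` is hypothetical and exists only for this illustration; it is not part of Warp.

```python
import math


def pin_negative(v: float) -> float:
    """Return v with its sign bit forced on (hypothetical helper, not a Warp API)."""
    return math.copysign(v, -1.0)


# +0.0 and -0.0 compare equal, so a min/max is free to return either one.
assert 0.0 == -0.0

for candidate in (0.0, -0.0, 2.5, -2.5):
    pinned = pin_negative(candidate)
    # copysign(1.0, v) exposes the sign bit, including for signed zeros.
    sign = math.copysign(1.0, pinned)
    print(f"{candidate!r:>5} -> {pinned!r:>5} (sign {sign:+.0f})")
```

Whichever zero the min/max happens to return, the pinned result carries a negative sign bit, which is the determinism the motivation above is after.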
Requirements
The builtin should be supported for wp.float16 / wp.float32 / wp.float64 / wp.bfloat16. It lowers to a single instruction on every supported target:
- CUDA: libdevice __nv_copysign (one PTX instruction).
- CPU JIT (Clang/LLVM): __builtin_copysignf / __builtin_copysign (compiler intrinsic, no CRT dependency).
- Host build (MSVC fallback): ::copysign from <math.h> via crt.h.
The adjoint adj_copysign scales the incoming gradient on x by +1 when the sign bits of x and y agree and by -1 otherwise. The gradient on y is 0 almost everywhere, because copysign(x, y) = |x| * sign(y) is locally constant in y. The sign comparison goes through copysign(T(1), .), so signed-zero inputs are classified by the IEEE 754 sign bit rather than by a < 0 test (which would misclassify -0 as non-negative).
The builtin is differentiable. The gradient on x is +1 when the signs of x and y agree, -1 otherwise; the gradient on y is 0. Sign agreement is determined by the IEEE 754 sign bit, so ±0 inputs are classified correctly.
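The gradient rule above can be sketched as a scalar reference in plain Python. This is an illustration under stated assumptions, not the adj_copysign implementation: the function names `copysign_grad_ref` and `naive_grad_ref` are hypothetical, and `math.copysign(1.0, v)` stands in for the `copysign(T(1), .)` sign classification.

```python
import math


def copysign_grad_ref(x: float, y: float, adj_out: float = 1.0):
    """Reference gradients of copysign(x, y) w.r.t. (x, y); hypothetical helper."""
    # Classify signs by the IEEE 754 sign bit: copysign(1.0, -0.0) is -1.0.
    sign_x = math.copysign(1.0, x)
    sign_y = math.copysign(1.0, y)
    adj_x = adj_out if sign_x == sign_y else -adj_out
    adj_y = 0.0  # copysign is locally constant in y
    return adj_x, adj_y


def naive_grad_ref(x: float, y: float, adj_out: float = 1.0):
    """Same rule using `< 0`, shown only to expose the signed-zero pitfall."""
    adj_x = adj_out if (x < 0) == (y < 0) else -adj_out
    return adj_x, 0.0


print(copysign_grad_ref(3.0, -1.0))   # signs differ
print(copysign_grad_ref(-0.0, -2.0))  # sign bits agree
print(naive_grad_ref(-0.0, -2.0))     # `< 0` treats -0.0 as non-negative
```

For x = -0.0 and y = -2.0 the sign-bit version reports agreement, while the `< 0` version reports disagreement and flips the gradient, which is the misclassification the sign-bit comparison avoids.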
Example
import warp as wp


@wp.kernel
def k(
    x: wp.array(dtype=wp.float32),
    y: wp.array(dtype=wp.float32),
    out: wp.array(dtype=wp.float32),
):
    i = wp.tid()
    out[i] = wp.copysign(x[i], y[i])


x = wp.array([3.0, 3.0, -0.0, float("nan")], dtype=wp.float32)
y = wp.array([1.0, -1.0, 1.0, -1.0], dtype=wp.float32)
out = wp.zeros(4, dtype=wp.float32)
wp.launch(k, dim=4, inputs=[x, y], outputs=[out])
print(out.numpy())  # [3.0, -3.0, 0.0, nan] -- magnitude of x with sign of y