`cuda.bench` implements procedural way of registering benchmark functions. This PR is to implement more Python decorator-based syntax.
cuda.benchimplements procedural way of registering benchmark functions.This PR is to implement more Python decorator-based syntax.