Description
🐛 Describe the bug
Building the ROI Pool CUDA kernel on Windows with MSVC and CUDA fails when AT_DISPATCH_FLOATING_TYPES_AND_HALF instantiates the kernel with T=half.
The comparison operator (>) becomes ambiguous and MSVC reports a build error.
Minimal Repro Example
// roi_pool_forward_kernel_impl excerpt
if (offset_input[input_index] > maxval) {
maxval = offset_input[input_index];
maxidx = input_index;
}
- With T=half, the comparison in the if condition fails to compile on MSVC.
Environment
OS: Windows 10/11
Compiler: MSVC 19.x (Visual Studio 2022)
CUDA: 12.x
PyTorch / torchvision: built from source (latest main)
Proposed Fix
Following PyTorch conventions, it may be preferable to compare in an accumulation type:
using acc_t = at::acc_type<T, /*is_cuda=*/true>;
acc_t v = static_cast<acc_t>(offset_input[input_index]);
acc_t mv = static_cast<acc_t>(maxval);
if (v > mv) {
maxval = offset_input[input_index];
maxidx = input_index;
}
Also, initialize maxval in a type-consistent way:
T maxval = is_empty ? T(0) : std::numeric_limits<T>::lowest();
This avoids the MSVC ambiguity and preserves precision across float/double/half.
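To make the idea concrete outside of torchvision, here is a minimal host-only C++ sketch of the same pattern. Everything in it is hypothetical and only for illustration: ToyHalf is a stand-in for a half type (it is not c10::Half and does not reproduce the MSVC ambiguity itself), AccType mimics the role of at::acc_type, and lowest_value and argmax_scan are names I made up. The point it shows is that the max search never applies operator> to T directly; it compares in the wider type and stores the result back as T.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for a 16-bit float type. NOT c10::Half.
// It deliberately has no operator>, so comparisons must go through float.
struct ToyHalf {
  float v;  // stored as float purely to keep the sketch short
  ToyHalf() : v(0.0f) {}
  explicit ToyHalf(float f) : v(f) {}
  explicit operator float() const { return v; }
};

// Hypothetical trait playing the role of at::acc_type:
// maps a storage type to the type used for arithmetic and comparisons.
template <typename T>
struct AccType { using type = T; };
template <>
struct AccType<ToyHalf> { using type = float; };

// Type-consistent "lowest" value. For ToyHalf this assumes the
// IEEE binary16 range, whose most negative finite value is -65504.
template <typename T>
T lowest_value() { return std::numeric_limits<T>::lowest(); }
template <>
inline ToyHalf lowest_value<ToyHalf>() { return ToyHalf(-65504.0f); }

// Scans input[0..n) and returns the index of the maximum, or -1 if empty.
// The maximum itself is written to *out_max (T(0) when empty).
template <typename T>
int argmax_scan(const T* input, int n, T* out_max) {
  using acc_t = typename AccType<T>::type;
  assert(n >= 0);
  const bool is_empty = (n == 0);
  T maxval = is_empty ? T(0) : lowest_value<T>();
  int maxidx = -1;
  for (int i = 0; i < n; ++i) {
    // Compare in the accumulation type instead of relying on T's operator>.
    const acc_t v = static_cast<acc_t>(input[i]);
    const acc_t mv = static_cast<acc_t>(maxval);
    if (v > mv) {
      maxval = input[i];
      maxidx = i;
    }
  }
  *out_max = maxval;
  return maxidx;
}
```

Calling argmax_scan on an array of ToyHalf compiles even though ToyHalf defines no comparison operators, because both operands are cast to float first. In the real kernel the same role would be played by at::acc_type, as in the snippet above.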
Suggestion
Would it make sense to update the ROI Pool kernel accordingly?
I am happy to prepare a PR if maintainers agree with this direction.
Note
I’m a junior-level software engineer and this is my first time submitting a report here. If I missed any guidelines or phrased things imperfectly, please kindly let me know. Thank you for your understanding.
Versions
https://github.com/pytorch/vision/blob/main/torchvision/csrc/ops/cuda/roi_pool_kernel.cu