23 changes: 17 additions & 6 deletions src/torchcodec/decoders/_core/CudaDevice.cpp
@@ -223,13 +223,24 @@ void convertAVFrameToDecodedOutputOnCuda(
Npp8u* input[2] = {src->data[0], src->data[1]};

auto start = std::chrono::high_resolution_clock::now();
-  NppStatus status = nppiNV12ToRGB_8u_P2C3R(
-      input,
-      src->linesize[0],
-      static_cast<Npp8u*>(dst.data_ptr()),
-      dst.stride(0),
-      oSizeROI);
+  NppStatus status;
+  if (src->colorspace == AVColorSpace::AVCOL_SPC_BT709) {
+    status = nppiNV12ToRGB_709HDTV_8u_P2C3R(
+        input,
+        src->linesize[0],
+        static_cast<Npp8u*>(dst.data_ptr()),
+        dst.stride(0),
+        oSizeROI);
+  } else {
+    status = nppiNV12ToRGB_8u_P2C3R(
+        input,
+        src->linesize[0],
+        static_cast<Npp8u*>(dst.data_ptr()),
+        dst.stride(0),
+        oSizeROI);
+  }
+  TORCH_CHECK(status == NPP_SUCCESS, "Failed to convert NV12 frame.");

// Make the pytorch stream wait for the npp kernel to finish before using the
// output.
at::cuda::CUDAEvent nppDoneEvent;
6 changes: 4 additions & 2 deletions test/decoders/test_video_decoder_ops.py
@@ -118,8 +118,10 @@ def test_get_frame_at_pts(self, device):
# return the next frame since the right boundary of the interval is
# open.
next_frame, _, _ = get_frame_at_pts(decoder, 6.039367)
-        with pytest.raises(AssertionError):
-            frame_compare_function(next_frame, reference_frame6.to(device))
+        if device == "cpu":
+            # We can only compare exact equality on CPU.
+            with pytest.raises(AssertionError):
+                frame_compare_function(next_frame, reference_frame6.to(device))

@pytest.mark.parametrize("device", cpu_and_cuda())
def test_get_frame_at_index(self, device):
2 changes: 1 addition & 1 deletion test/utils.py
@@ -44,7 +44,7 @@ def assert_tensor_equal(*args, **kwargs):

# Asserts that at least `percentage`% of the values are within the absolute tolerance.
def assert_tensor_close_on_at_least(
-    actual_tensor, ref_tensor, percentage=90, abs_tolerance=20
+    actual_tensor, ref_tensor, percentage=90, abs_tolerance=19

I think this tolerance is still a bit high, so it might be good to check whether it is due to limited color range vs. full color range.

I had some good results with limited color range more closely matching the output of some videos.

The code I used was in https://github.com/fairinternal/amaia/blob/fmassa/video_reader/amaia_video/nv12_to_rgb.py


pjs102793 commented on Nov 28, 2024


@fmassa

In the torchvision decoder, this function was used. According to the NPP documentation, the HDTV conversion assumes a full color range of 0-255; the CSC version should be used for limited-range color.

Using this function might significantly reduce the color discrepancies.


When I directly replaced the function with the CSC function, the difference between the CPU and CUDA results was only 2 across all values. Further validation is necessary, but this seems much more accurate than a difference of around 20. I will provide more details in the issue.

):
assert (
actual_tensor.device == ref_tensor.device
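
For readers unfamiliar with this helper, the check described in its comment ("at least `percentage`% of the values are within the absolute tolerance") can be sketched in pure Python on flat lists of pixel values. This is a hedged illustration, not the body of `assert_tensor_close_on_at_least`: the function names below are hypothetical, and treating the tolerance boundary as inclusive is an assumption.

```python
# Illustrative sketch only; operates on flat lists rather than tensors.
def count_within_tolerance(actual, reference, abs_tolerance):
    """Return (number of values within tolerance, total number of values)."""
    if len(actual) != len(reference):
        raise ValueError("inputs must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    within = sum(
        1 for a, r in zip(actual, reference) if abs(a - r) <= abs_tolerance
    )
    return within, len(actual)


def assert_close_on_at_least(actual, reference, percentage=90, abs_tolerance=19):
    """Raise AssertionError unless enough values are within the tolerance."""
    within, total = count_within_tolerance(actual, reference, abs_tolerance)
    # Integer comparison avoids floating-point rounding at the threshold.
    if within * 100 < percentage * total:
        raise AssertionError(
            f"only {within} of {total} values are within {abs_tolerance}; "
            f"expected at least {percentage}%"
        )


if __name__ == "__main__":
    reference = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
    actual = list(reference)
    actual[0] += 25  # one outlier out of ten: 90% still within tolerance
    assert_close_on_at_least(actual, reference)
    print("check passed")
```

A percentage-based check like this tolerates a few large outliers, so lowering `abs_tolerance` tightens the bound on most pixels without requiring exact equality.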