Contributor

@NicolasHug NicolasHug commented Oct 4, 2025

This PR is based on #930, so to review it, look at 68878f2 (#931)

The PR looks scary but it's mostly just moving code around, and deleting / simplifying some parts. I added comments below to guide the review.


Currently in main, the BETA CUDA interface holds a default CUDA interface member, and uses it for the color-conversion step convertAVFrameToFrameOutput. This is problematic, because:

  • It forces the BetaCudaDeviceInterface to do a dummy initialization of the CudaDeviceInterface class.
  • More importantly, it forces the CudaDeviceInterface to know that it may be called by BetaCudaDeviceInterface. This led to a bunch of confusing if/else logic, which is a clear anti-pattern. The interfaces should be independent of each other.

This PR resolves this problem by:

  • Removing the dependency of BetaCudaDeviceInterface on CudaDeviceInterface
  • Removing the ad hoc if/else logic in CudaDeviceInterface. CudaDeviceInterface doesn't need to worry about the fact that it might be called by a BetaCudaDeviceInterface.
  • Moving the parts common to both interfaces into a new CUDACommon.cpp file.
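
The resulting structure can be sketched roughly as follows. This is a minimal illustration, not the actual TorchCodec code: `NppContextSketch`, `FrameSketch`, `convertFrameCommon`, and the two `...InterfaceSketch` classes are hypothetical stand-ins. Each interface owns its own context and calls a shared free function, and neither holds a reference to the other.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; none of these are real TorchCodec or NPP types.
struct NppContextSketch {
  int deviceIndex;
};

struct FrameSketch {
  std::string pixelFormat;
};

// Shared helper, in the spirit of what the PR places in CUDACommon.cpp.
// It takes everything it needs as arguments and knows nothing about its caller.
inline std::string convertFrameCommon(
    const FrameSketch& frame,
    const NppContextSketch& ctx) {
  assert(frame.pixelFormat == "nv12");
  return "rgb@device" + std::to_string(ctx.deviceIndex);
}

class CudaInterfaceSketch {
 public:
  explicit CudaInterfaceSketch(int deviceIndex) : nppCtx_{deviceIndex} {}

  std::string convert(const FrameSketch& frame) const {
    return convertFrameCommon(frame, nppCtx_);
  }

 private:
  NppContextSketch nppCtx_;
};

class BetaCudaInterfaceSketch {
 public:
  explicit BetaCudaInterfaceSketch(int deviceIndex) : nppCtx_{deviceIndex} {}

  std::string convert(const FrameSketch& frame) const {
    return convertFrameCommon(frame, nppCtx_);
  }

 private:
  // Owns its context directly; there is no CudaInterfaceSketch member.
  NppContextSketch nppCtx_;
};
```

Because the shared helper only sees its arguments, it needs no branch on which interface called it.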

@meta-cla meta-cla bot added the CLA Signed label Oct 4, 2025
torch::Tensor dummyTensorForCudaInitialization = torch::empty(
    {1}, torch::TensorOptions().dtype(torch::kUInt8).device(device_));

nppCtx_ = getNppStreamContext(device_);
Contributor Author

Above: moved the dummy CUDA context initialization to the constructor, like how it was already done for the CudaDeviceInterface.

Also, the BetaCudaDeviceInterface now needs its own NPP context. Previously, it relied on its CudaDeviceInterface member (now removed).

at::cuda::getCurrentCUDAStream(device_.index());

frameOutput.data = convertNV12FrameToRGB(
    avFrame, device_, nppCtx_, nvdecStream, preAllocatedOutputTensor);
Contributor Author

Above:

  • we now call convertNV12FrameToRGB() for color conversion, instead of calling defaultCudaInterface_->convertAVFrameToFrameOutput. convertNV12FrameToRGB() is now common to both interfaces.
  • previously, the stream synchronization between NVDEC and NPP was done within defaultCudaInterface_->convertAVFrameToFrameOutput(), which therefore had to know which stream NVDEC used: 0 for the CudaDeviceInterface, or the default stream for BetaCudaDeviceInterface. Now, each interface explicitly specifies what the NVDEC stream is by passing it down to convertNV12FrameToRGB, and this is where the stream synchronization now happens.
  • On the TODONVDEC P2: I am doing further investigations but my current understanding is that the new BetaCudaDeviceInterface will never need to call maybeConvertAVFrameToNV12OrRGB24 - which is a GOOD thing!
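
To illustrate the second point, here is a rough sketch of passing the NVDEC stream down explicitly. It uses plain integers and a log instead of real CUDA streams and events, so `StreamIdSketch`, `SyncLogSketch`, and `convertWithExplicitStreams` are all hypothetical names, and the logged strings only stand in for the ordering operations a real implementation would issue.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a CUDA stream handle; not a real cudaStream_t.
using StreamIdSketch = int;

// Records, in order, the operations a real implementation would issue.
// This log is purely illustrative.
struct SyncLogSketch {
  std::vector<std::string> ops;
};

// The shared conversion receives the NVDEC stream from its caller and makes
// the NPP stream wait on it before converting. It never has to guess which
// stream the decoder used.
inline void convertWithExplicitStreams(
    StreamIdSketch nvdecStream,
    StreamIdSketch nppStream,
    SyncLogSketch& log) {
  log.ops.push_back("record event on stream " + std::to_string(nvdecStream));
  log.ops.push_back("stream " + std::to_string(nppStream) + " waits on event");
  log.ops.push_back("convert on stream " + std::to_string(nppStream));
}
```

In this shape, each caller supplies whatever stream its decoder used, and the synchronization lives in exactly one place.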

Contributor Author

99% of this file is copy/pasted from what previously was inside of CudaDeviceInterface.

#include <libavutil/hwcontext_cuda.h>
#include <libavutil/pixdesc.h>
}

Contributor Author

Below: the stuff that was removed from this file was either:

  • logic that was specific for the BetaCudaDeviceInterface: not needed anymore now that we have removed that dependency
  • moved into CUDACommon.cpp.

// Above we checked that the AVFrame was on GPU, but that's not enough, we
// also need to check that the AVFrame is in AV_PIX_FMT_NV12 format (8 bits),
// because this is what the NPP color conversion routines expect. This SHOULD
// be enforced by our call to maybeConvertAVFrameToNV12OrRGB24() above.
Contributor Author

@NicolasHug NicolasHug Oct 5, 2025

Below:

Note how we removed the if (avFrame->hw_frames_ctx != nullptr) ... which was needed because this could be called from the Beta interface, where avFrame->hw_frames_ctx would be null. Now we don't need to check for that anymore.

: "unknown"),
", but we expected AV_PIX_FMT_NV12. "
"That's unexpected, please report this to the TorchCodec repo.");
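
For readers unfamiliar with this kind of guard, the format check can be sketched in isolation as below. This is an assumption-laden illustration: `PixelFormatSketch`, `nameOf`, and `checkIsNV12` are hypothetical and replace the FFmpeg pixel format enum and the check macro used in the real code.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for FFmpeg's pixel format enum.
enum class PixelFormatSketch { NV12, P010, Unknown };

inline std::string nameOf(PixelFormatSketch fmt) {
  switch (fmt) {
    case PixelFormatSketch::NV12:
      return "nv12";
    case PixelFormatSketch::P010:
      return "p010";
    default:
      return "unknown";
  }
}

// Rejects any frame that is not NV12, since the color conversion step
// described in the surrounding comments only handles that layout.
inline void checkIsNV12(PixelFormatSketch fmt) {
  if (fmt != PixelFormatSketch::NV12) {
    throw std::runtime_error(
        "The AVFrame is " + nameOf(fmt) + ", but we expected AV_PIX_FMT_NV12.");
  }
}
```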

Contributor Author

Below: we would previously do the NVDEC / NPP stream synchronization here. Now, it's done a bit later inside of convertNV12FrameToRGB. So this portion just becomes about figuring out what the NVDEC stream is so we can pass it down to convertNV12FrameToRGB.

Comment on lines -422 to -427
torch::Tensor& dst = frameOutput.data;
if (preAllocatedOutputTensor.has_value()) {
  dst = preAllocatedOutputTensor.value();
} else {
  dst = allocateEmptyHWCTensor(frameDims, device_);
}
Contributor Author

@NicolasHug NicolasHug Oct 5, 2025

The output tensor selection above (pre-allocated or newly allocated) was moved inside of convertNV12FrameToRGB.
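
As a rough sketch of what that moved step amounts to, here is the same choice written with standard-library types only. `BufferSketch` and `selectOutputBuffer` are hypothetical names, and a `std::vector` stands in for the tensor; unlike a real tensor handle, the vector is moved instead of shared.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an output tensor.
using BufferSketch = std::vector<std::uint8_t>;

// Returns the caller's buffer when one is supplied. Otherwise allocates a
// zero-filled height * width * 3 buffer (HWC layout, 3 channels for RGB).
inline BufferSketch selectOutputBuffer(
    std::size_t height,
    std::size_t width,
    std::optional<BufferSketch> preAllocated) {
  if (preAllocated.has_value()) {
    return std::move(preAllocated.value());
  }
  return BufferSketch(height * width * 3);
}
```

Keeping this choice inside the shared conversion function means both interfaces get the same behavior without duplicating it.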


#pragma once

#include <npp.h>
Contributor

This include isn't strictly needed here as CUDACommon.h includes it.


#include "src/torchcodec/_core/BetaCudaDeviceInterface.h"

#include "src/torchcodec/_core/CUDACommon.h"
Contributor

This include isn't strictly needed as BetaCudaDeviceInterface.h includes it.

@NicolasHug NicolasHug merged commit 6377dfc into meta-pytorch:main Oct 6, 2025
50 checks passed
@NicolasHug NicolasHug deleted the nvdec-separate-interface branch October 6, 2025 14:34