Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

G-API: VPL source to BGR kernel #21737

Closed
wants to merge 1 commit into from

Conversation

mpashchenkov
Copy link
Contributor

@mpashchenkov mpashchenkov commented Mar 17, 2022

Added OCL version for BGR kernel. OCL version depends on DX11 and VPL and use OCV's DX11 interop.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Build commands

force_builders=Custom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
build_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f

Xbuild_image:Custom=centos:7
Xbuildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f

build_image:Custom=ubuntu-openvino-2021.4.1:20.04
build_image:Custom Win=openvino-2021.4.1
build_image:Custom Mac=openvino-2021.4.1

test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java

buildworker:Custom=linux-1
# disabled due high memory usage: test_opencl:Custom=ON
test_opencl:Custom=OFF
test_bigdata:Custom=1
test_filter:Custom=*

});
GAPI_Assert(it != devices.end() && "Vector of devices doesn't contain associated with D3D11Device device");

cv::directx::convertFromD3D11Texture2D(texture, out, &devices.at(std::distance(it, devices.begin())).second);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it the same as 'it->second', isn't it?

GAPI_Assert(d.fmt == cv::MediaFormat::NV12);

auto params = in.get<cv::gapi::wip::onevpl::VPLMediaFrameDX11Adapter>()->blobParams();
auto handle = cv::util::any_cast<mfxHDLPair>(params);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid we cannot use blobParam is such way anymore because changes had been done in giebacked in different PR ( which require two maps) and Vpl adapter must have been changed to support giebackend modifications. So We need to reuse approach from latest giebackend... but it incorporates a lot of IE stuff...

I think your current solution is well-enough as prototyping phase(and could be merged at least), but we need to discover a better format for blobParams better, to satisfy common approach

auto params = in.get<cv::gapi::wip::onevpl::VPLMediaFrameDX11Adapter>()->blobParams();
auto handle = cv::util::any_cast<mfxHDLPair>(params);
ID3D11Texture2D* texture = reinterpret_cast<ID3D11Texture2D*>(handle.first);
GAPI_Assert(texture != nullptr && "D3D11Texture2D from mfxHDLPair is nullptr");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GAPI_LOG_DEBUG(nullptr, "texture: " << texture << ", subresource: " << handle.second)

GAPI_Assert(pD3D11Device != nullptr && "D3D11Texture2D::GetDevice returns pD3D11Device that is nullptr");

auto devices = cv::directx::getDeviceIDsByD3D11Device(pD3D11Device);
GAPI_Assert(!devices.empty() && "getDeviceIDsByD3D11Device returns empty vector of devices");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GAPI_LOG_DEBUG(, "CL devices obtained: " << devices.size())

});
GAPI_Assert(it != devices.end() && "Vector of devices doesn't contain associated with D3D11Device device");

cv::directx::convertFromD3D11Texture2D(texture, out, &devices.at(std::distance(it, devices.begin())).second);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after...
GAPI_LOG_DEBUG(nullptr, "UMAt created: " << out.handle() )

CV_Assert(textureType >= 0);

u.create(Size(desc.Width, desc.Height), textureType);
cl_mem clBuffer = (cl_mem)u.handle(ACCESS_READ);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we supposed to use WRITE still we convert TO this buffer

int textureType = getTypeFromDXGI_FORMAT(desc.Format);
CV_Assert(textureType >= 0);

u.create(Size(desc.Width, desc.Height), textureType);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it the same UMat which is rendered now in main loop?
i means: is it possible to create and fiil this UMat while we still drawing it in main loop?

@@ -523,6 +539,45 @@ GAPI_OCL_KERNEL(GOCLTranspose, cv::gapi::core::GTranspose)
}
};

GAPI_OCL_KERNEL(GOCLVPL2BGR, cv::gapi::streaming::VPL2BGR)
{
static void run(const cv::MediaFrame& in, cv::UMat& out)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you point me out who create this out initially?

#ifdef HAVE_DIRECTX_NV12
cl_mem clImageUV = 0;
if(DXGI_FORMAT_NV12 == desc.Format) {
clImageUV = impl->clCreateFromD3D11Texture2DKHR((cl_context)context.ptr(), CL_MEM_READ_ONLY, pD3D11Texture2D, 1, &status);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.khronos.org/registry/OpenCL/sdk/2.2/docs/man/html/clCreateFromD3D11Texture2DKHR.html

it is incorrect that you set 1 as subresource argument in general way
I have idea why it has worked before latest rebase:
VPL had used TextureArray before for decoding surfaces: It meant that we had only one Texture pointer which represented multiple subtextures (subresources) inside it. So in this case Texture pointer was a pointer on array of textures-subresources.
And in this case when use pass parameter 0 and 1 for this clCreateFromD3D11Texture2DKHR then you just extract the current and the next decoding surfaces (current and the next MediaFrame implicitly) which was incorrect too.
But for some reasons it worked( may be because subresources in TextureArray initialized by zero or something other)

But after some events (Integrating to openVINO) decoding routine in VPL was changed and i applied model Array of Textures instead of TexturesArray. It means at now that instead of single Texture pointer with multiple subresources for multiple MediaFrames we have multiple Textures with 0 subresources for Multiple Frames
So clCreateFromD3D11Texture2DKHR with argument 1 as subresource has no meaning now because the MAX number of subresource id in a single texture is 0 at now and represent texture by itself. So you MUST NOT use subresource 1 for extract UV component but should do it in another way.

But because, in your latter implementation you use 0 & 1: which meant single NV12 texture for current Frame and single NV12 Texture for the next Frame - different surfaces but not different planes on the same surface then i suggest you to pass the same Image twice in ocl nv12-to-grb converting function, just for experiment (lulz :) )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct way is to use subbuffer from cl_mem which represents correct data region for U and UV separatedly:
https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_API.html#clCreateSubBuffer

so we need something like that:

cl_mem y = clCreateSubBuffer( image, 0, h*w/2); 
cl_mem uv = clCreateSubBuffer( image, h*2/w,  h*w/2); 

ocl::ocl_convert_nv12_to_bgr(y, uv, clBuffer, (int)u.step[0], u.cols, u.rows)) {

@mpashchenkov mpashchenkov marked this pull request as ready for review March 24, 2022 08:07
Copy link
Contributor

@sivanov-work sivanov-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major stuff which must be resolved or discussed at least:

  1. We should apply OCL kernel in existing onevpl_infer_single_roi to avoid code duplications and apply conversion for real use-case scenario with IE
    Is it possible to using CPU || OCL kernel combinations in order to use CPU fallback if gapi_core compiled without OCL?
  2. OCLBRG kernel uses blobParam in DX11FrameAdapter which has different sematics when IE is enabled or not. In this case Kernel behavior would be changed when IE is available or not. And it would be broken in real use-case scenario onevpl_infer_single_roi when IE is enabled and DXMediaFrameAdapter acts in different way
    So in my perspective we must loose coupling between GIEbackend, DX11MediaFrameAdapter & OCL kernel in this PR before continue
    Let's discuss it

#ifdef HAVE_DIRECTX
#ifdef HAVE_D3D11
bool initializeContext(ID3D11Device* pD3D11Device) {
const static auto context_is_initialized = [&]() -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it initialized by lambda?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should has thread_local modifier according to opencv policy: one ctx per thread


cv::Ptr<cv::gapi::wip::IStreamSource> cap;
std::vector<cv::gapi::wip::onevpl::CfgParam> source_cfgs;
source_cfgs.push_back(cfg::create_from_string(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can drop create_from_string at all and reuse source_cfgs.push_back(CfgParam::create_acceleration_mode("MFX_ACCEL_MODE_VIA_D3D11"));

{
static void run(const cv::MediaFrame& in, cv::UMat& out)
{
(void)in; (void)out;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cv::util::suppress_unused_warning is a cross-platform way

@@ -978,7 +978,7 @@ static void __convertToD3D11Texture2DKHR(InputArray src, ID3D11Texture2D* pD3D11
cl_context context = (cl_context)ctx.ptr();
OpenCL_D3D11* impl = ctx.getUserContext<OpenCL_D3D11>().get();
if (nullptr == impl)
CV_Error(cv::Error::OpenCLApiCallError, "OpenCL: Context initilized without DirectX interoperability");
CV_Error(cv::Error::OpenCLApiCallError, "OpenCL: Context initialized without DirectX interoperability");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i suggest rollback changes here, because it is not related to our functionality

#include <opencv2/gapi/streaming/onevpl/source.hpp>
#include <opencv2/gapi/streaming/onevpl/data_provider_interface.hpp>
#include <opencv2/highgui.hpp> // CommandLineParser
#include <opencv2/gapi/infer/parsers.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is redundant


// Now build the graph
cv::GFrame in; // input frame from VPL source
auto bgr_gmat = cv::gapi::streaming::BGR(in); // conversion from VPL source frame to BGR UMat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don we not apply it in vpl source by itself?

@mpashchenkov mpashchenkov marked this pull request as draft April 5, 2022 07:42
@smirnov-alexey
Copy link
Contributor

Updated version: #22559

@mpashchenkov
Copy link
Contributor Author

It has more relevant version

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants