This repository has been archived by the owner on May 17, 2023. It is now read-only.

[MFX] [HWVPP] HEVC 10 bit hwdownload VPP issue (unmatched result ) #1027

Closed
fulinjie opened this issue Dec 18, 2018 · 15 comments

Comments

@fulinjie
Contributor

CMD Line:
$ ffmpeg -v debug -y -hwaccel qsv -c:v hevc_qsv -load_plugin hevc_hw -i $src_file -vf "hwdownload,format=p010" qsv.yuv
$(ffmpeg-3.4.1) ffmpeg -v verbose -y -c:v hevc_qsv -load_plugin hevc_hw -i $src_file -pix_fmt p010le qsv_341_no_hw.yuv
$ ffmpeg -hwaccel vaapi -v verbose -y -i $src_file -pix_fmt p010le vaapi.yuv
$ ffmpeg -v verbose -y -i $src_file -pix_fmt p010le -vframes 1 ref.yuv

Compare the outputs:

md5sum *.yuv

653e42290412ebc17c3681d7fea1ab90 qsv_341_no_hw.yuv
b9c9436b729f2a1bb0e0ab0d5c25ccdc qsv.yuv
653e42290412ebc17c3681d7fea1ab90 ref.yuv
653e42290412ebc17c3681d7fea1ab90 vaapi.yuv
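
The same comparison can be scripted so that matching outputs are grouped together. This is a minimal sketch, assuming Python 3 is available on the test machine; the helper names `md5_of` and `group_by_md5` are illustrative and not part of any tool used in this issue.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_md5(paths):
    """Map each digest to the sorted list of file names that share it."""
    groups = defaultdict(list)
    for path in paths:
        groups[md5_of(path)].append(Path(path).name)
    return {digest: sorted(names) for digest, names in groups.items()}


if __name__ == "__main__":
    # Groups every *.yuv file in the current directory by digest.
    for digest, names in group_by_md5(sorted(Path(".").glob("*.yuv"))).items():
        print(digest, " ".join(names))
```

Any file that lands in a group of its own is an output that diverges from the rest.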

Apply the patch in libva to enable va_TraceSurface support for P010, and dump the surface to check the decode result:
intel/libva#260
export LIBVA_TRACE_SURFACE=qsvdec

The libva surface dumps from the vaapi, qsv, and qsv_without_HWVPP runs are all identical.
I think this means decoding itself is correct.

Tracing into MSDK and the media driver, I dumped the data by surface ID to check where the issue occurs:

va_TraceSurface // same Y data as ref.yuv

VAAPIVideoCORE::DoFastCopyWrapper:
sts = GetFrameHDL(srcMemId, &srcHandle); // Y data is already different here

Dump [postALL].p010 and [preALL].p010 from the surface after VPP and before VPP:

$ od -A x -t x1z -v 'surfdump_loc[postALL]_lyr[0]_f[000]_w[3840]_h[2176]_p[7680].p010' | head -n 2
000000 40 b2 80 b2 c0 b2 00 b3 80 b3 00 b4 80 b4 00 b5 >@...............<
000010 c0 b5 40 b6 c0 b6 40 b7 80 b7 c0 b7 00 b8 00 b8 >..@...@.........<

$ od -A x -t x1z -v 'surfdump_loc[preALL]_lyr[0]_f[000]_w[3840]_h[2176]_p[7680].p010' | head -n 2
000000 80 b2 c0 b2 00 b3 40 b3 c0 b3 40 b4 c0 b4 40 b5 >......@...@...@.<
000010 00 b6 80 b6 00 b7 80 b7 c0 b7 00 b8 40 b8 40 b8 >............@.@.<

$ od -A x -t x1z -v ref.yuv | head -n 2
000000 80 b2 c0 b2 00 b3 40 b3 c0 b3 40 b4 c0 b4 40 b5 >......@...@...@.<
000010 00 b6 80 b6 00 b7 80 b7 c0 b7 00 b8 40 b8 40 b8 >............@.@.<

The result shows that after VPP, Data.Y differs from both ref.yuv and vaapi.yuv.
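
To quantify the difference, the bytes printed by `od` can be decoded as P010 samples. The sketch below assumes the usual P010 layout (little-endian 16-bit words with the 10-bit value in the upper bits and 6 zero padding bits below) and covers only the 32 bytes shown above, so it says nothing about the rest of the frame.

```python
import struct

# First 32 bytes of each dump, copied from the od output above.
POST_VPP = bytes.fromhex(
    "40b280b2c0b200b380b300b480b400b5"
    "c0b540b6c0b640b780b7c0b700b800b8"
)
PRE_VPP = bytes.fromhex(
    "80b2c0b200b340b3c0b340b4c0b440b5"
    "00b680b600b780b7c0b700b840b840b8"
)


def p010_samples(raw):
    """Unpack little-endian 16-bit words and drop the 6 low padding bits."""
    words = struct.unpack("<%dH" % (len(raw) // 2), raw)
    return [word >> 6 for word in words]


pre = p010_samples(PRE_VPP)
post = p010_samples(POST_VPP)

# Per-sample difference in 10-bit code values (pre-VPP minus post-VPP).
deltas = [before - after for before, after in zip(pre, post)]
print(deltas)  # prints a list of sixteen 1s
```

On these 16 luma samples, every post-VPP value is exactly one 10-bit code value (0x40 in the 16-bit word) below the pre-VPP value.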

@fulinjie
Contributor Author

Found something more:
It seems that P010 is not supported as a VPP output format in media_driver for gen 9 and gen 10.

LOG:
[LIBVA]:ENTER - DdiMedia_UnmapBuffer
[LIBVA]:ENTER - DdiMedia_DeriveImage
[LIBVA]:ENTER - DdiMedia_MapBufferInternal
[LIBVA]:ENTER - DdiMedia_EndPicture
[VP]: ENTER - DdiVp_EndPicture
[VP]: NORMAL - IsOutputFormatSupported:68: Unsupported Render Target Format '0x00000053' for SFC Pipe.
[VP]: NORMAL - GetOutputPipe:299: Feature or surface format not supported by SFC Pipe.
[VP]: NORMAL - VpHal_RndrRenderVebox:3971: VPOutputPipe = 0, VEFeatureInUse = 0
[VP]: NORMAL - Render:2132: enter CComposite::Render()
[VP]: ENTER - KernelDll_GetCombinedKernel
[VP]: CRITICAL - DumpSurfaceToFile:507: Invalid (nullptr) Pointer.
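
When the driver log is long, it can help to filter out the `ENTER` traces and keep only the lines that carry a message. The following is a rough sketch whose regular expressions are inferred from the few log lines above; the real log format may contain variants it does not handle.

```python
import re

# Patterns inferred from the sample lines in this issue, not from driver sources.
LOG_LINE = re.compile(
    r"^\[(?P<component>[^\]]+)\]:\s*(?P<level>[A-Z]+)\s+-\s+(?P<rest>.*)$"
)
LOCATION = re.compile(r"^(?P<function>\w+):(?P<line>\d+):\s*(?P<message>.*)$")


def parse_line(text):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_LINE.match(text.strip())
    if match is None:
        return None
    record = {
        "component": match["component"],
        "level": match["level"],
        "function": None,
        "line": None,
        "message": match["rest"],
    }
    located = LOCATION.match(match["rest"])
    if located:
        record["function"] = located["function"]
        record["line"] = int(located["line"])
        record["message"] = located["message"]
    return record


SAMPLE = """\
[LIBVA]:ENTER - DdiMedia_EndPicture
[VP]: ENTER - DdiVp_EndPicture
[VP]: NORMAL - IsOutputFormatSupported:68: Unsupported Render Target Format '0x00000053' for SFC Pipe.
[VP]: CRITICAL - DumpSurfaceToFile:507: Invalid (nullptr) Pointer.
"""

records = [parse_line(line) for line in SAMPLE.splitlines()]
for record in records:
    if record and record["level"] != "ENTER":
        print(record["level"], record["function"], record["line"], record["message"])
```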

Also tested through vavpp:

$ ./vavpp process.cfg
libva info: Open new log file ./libva.085644.thd-0x00000ea5 for the thread 0x00000ea5
libva info: LIBVA_TRACE is on, save log into ./libva.085644.thd-0x00000ea5
libva info: VA-API version 1.4.0
libva info: va_getDriverName() returns 0
libva info: User requested driver 'iHD'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri//iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_4
libva info: va_openDriver() returns 0
[LIBVA]:ENTER - DdiMedia_QueryConfigEntrypoints
[LIBVA]:ENTER - DdiMedia_GetConfigAttributes
RT format 256 is not supported by VPP !
vavpp: vavpp.cpp:1482: VAStatus vpp_context_create(): Assertion `0' failed.
Aborted (core dumped)
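
The number in "RT format 256" is a bitmask. A small sketch for decoding it follows; the flag values are assumed from libva's `va.h` (only a few are listed) and should be checked against the header actually in use.

```python
# Bit values assumed from libva's va.h; verify against the header you build with.
VA_RT_FORMAT_NAMES = {
    0x00000001: "VA_RT_FORMAT_YUV420",
    0x00000002: "VA_RT_FORMAT_YUV422",
    0x00000004: "VA_RT_FORMAT_YUV444",
    0x00000100: "VA_RT_FORMAT_YUV420_10",
}


def decode_rt_format(value):
    """Split an RT format bitmask into known flag names plus any leftover bits."""
    names = [name for bit, name in sorted(VA_RT_FORMAT_NAMES.items()) if value & bit]
    unknown = value & ~sum(VA_RT_FORMAT_NAMES)
    if unknown:
        names.append(hex(unknown))
    return names


print(decode_rt_format(256))
```

Under that assumption, 256 (0x100) is the 10-bit 4:2:0 render target format, which is the one a P010 surface needs.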

Source Code:
https://github.com/intel/media-driver/blob/273ef13413a1c1f3527831df9f2ad962361dd97c/media_driver/agnostic/gen9/vp/hal/vphal_render_sfc_g9_base.cpp#L56

MSDK seems to use the VPP pipeline in all cases (vaapi doesn't use VPP in this situation).
I think it may be better to add a bypass for the direct-map case, when no post-processing is needed, regardless of the VPP output format. Such a bypass could be available on all gen platforms.
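
The bypass idea can be illustrated with a small decision function. This is only a sketch of the logic: `FrameInfo` and `can_bypass_vpp` are hypothetical names, not MSDK types or functions, and the field set is an assumption about which parameters would have to match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameInfo:
    """Hypothetical stand-in for the parameters a pass-through check would compare."""
    fourcc: str
    width: int
    height: int
    crop_x: int
    crop_y: int
    crop_w: int
    crop_h: int


def can_bypass_vpp(src, dst, enabled_filters=()):
    """True when the output is a plain copy of the input and no filter is requested."""
    return src == dst and len(enabled_filters) == 0


# Illustrative values only; the crop rectangle here is an assumption.
src = FrameInfo("P010", 3840, 2176, 0, 0, 3840, 2160)
same = FrameInfo("P010", 3840, 2176, 0, 0, 3840, 2160)
converted = FrameInfo("NV12", 3840, 2176, 0, 0, 3840, 2160)

print(can_bypass_vpp(src, same))                 # True: direct copy is enough
print(can_bypass_vpp(src, converted))            # False: format conversion needed
print(can_bypass_vpp(src, same, ("denoise",)))   # False: a filter was requested
```

With a check of this kind, the driver's VP path would only be entered when some parameter actually changes.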

@dvrogozh
Contributor

Please provide the device IDs on which you ran your experiments. Support differs between Gen9 platforms such as SKL and CFL, so we need to know your platform.

@fulinjie
Contributor Author

Tested on KBL.

@fulinjie
Contributor Author

Any comments, @dvrogozh?
If more information is required, I'll be glad to provide it; this is a high-priority issue on my side.

@dvrogozh
Contributor

Looking into this:

VAAPIVideoCORE::DoFastCopyWrapper:
sts = GetFrameHDL(srcMemId, &srcHandle); // Y data is already different here

Dump [postALL].p010 and [preALL].p010 from the surface after VPP and before VPP:

$ od -A x -t x1z -v 'surfdump_loc[postALL]_lyr[0]_f[000]_w[3840]_h[2176]_p[7680].p010' | head -n 2
000000 40 b2 80 b2 c0 b2 00 b3 80 b3 00 b4 80 b4 00 b5 >@...............<
000010 c0 b5 40 b6 c0 b6 40 b7 80 b7 c0 b7 00 b8 00 b8 >..@...@.........<

$ od -A x -t x1z -v 'surfdump_loc[preALL]_lyr[0]_f[000]_w[3840]_h[2176]_p[7680].p010' | head -n 2
000000 80 b2 c0 b2 00 b3 40 b3 c0 b3 40 b4 c0 b4 40 b5 >......@...@...@.<
000010 00 b6 80 b6 00 b7 80 b7 c0 b7 00 b8 40 b8 40 b8 >............@.@.<

$ od -A x -t x1z -v ref.yuv | head -n 2
000000 80 b2 c0 b2 00 b3 40 b3 c0 b3 40 b4 c0 b4 40 b5 >......@...@...@.<
000010 00 b6 80 b6 00 b7 80 b7 c0 b7 00 b8 40 b8 40 b8 >............@.@.<

It seems that the correct data is actually there inside postALL, but at some offset: if you skip the initial 40 b2, you will see the correct data. It seems that the mediasdk copy wrapper treats the color format incorrectly... @onabiull: who can comment on this?
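
One way to test the offset hypothesis on the quoted bytes is to skip the first post-VPP sample and count how many of the following samples line up with the pre-VPP data. This is a sketch limited to the 32 bytes shown in the dumps; `matching_prefix` is an illustrative helper.

```python
import struct

# First 32 bytes of each dump, copied from the od output quoted above.
POST_VPP = bytes.fromhex(
    "40b280b2c0b200b380b300b480b400b5"
    "c0b540b6c0b640b780b7c0b700b800b8"
)
PRE_VPP = bytes.fromhex(
    "80b2c0b200b340b3c0b340b4c0b440b5"
    "00b680b600b780b7c0b700b840b840b8"
)


def words(raw):
    """Unpack the bytes as little-endian 16-bit words."""
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


def matching_prefix(left, right):
    """Count how many leading elements of the two sequences are equal."""
    count = 0
    for a, b in zip(left, right):
        if a != b:
            break
        count += 1
    return count


pre = words(PRE_VPP)
post = words(POST_VPP)

# Skip the initial "40 b2" word of the post-VPP dump and compare.
aligned = matching_prefix(post[1:], pre)
print(aligned)  # 3
```

On this excerpt, the shifted data agrees for only three samples and diverges at the fourth, so a one-sample offset alone does not account for the mismatch in these bytes.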

@dvrogozh
Contributor

@fulinjie: why do you use p010 in the ffmpeg-qsv command line and not p010le? Is there any difference? (I don't see a p010 description in ffmpeg, just p010le or p010be.)

@fulinjie
Contributor Author

@dvrogozh: I tried p010le, and it gave the same result as p010.
The outputs still do not match bit for bit.

@artem-shaporenko
Contributor

I've looked at the result, and it does not look like a wrong shift or a memory operation gone wrong. @fulinjie, can you please check whether the VPP input and output parameters are the same, especially crop and resolution (width/height)? I suspect a resolution change is happening, which results in the difference in the output.

@dvrogozh
Contributor

Well... That's interesting. First of all, I've reproduced the issue and looked into the surface right before VP and right after, i.e. here: https://github.com/Intel-Media-SDK/MediaSDK/blob/master/_studio/shared/src/mfx_vpp_vaapi.cpp#L532

Indeed, the surface before VPP differs from the surface after VPP, and before VPP it matches the reference from the SW decoder (at least the first 36 bytes :)). The obvious assumption would be that mediasdk enables some VP algorithm, as @artem-shaporenko suggested: scaling, deinterlacing, denoise, whatever. The problem is that I don't see that! The input/output parameters look matching. I'm starting to think that this is a driver VP bug which applies some algorithm when the surface should just be copied, and that we need to report it and step into the driver.

@fulinjie
Contributor Author

I checked the parameters and agree that they match.
As for the assumption of a driver issue, is there any connection with the information above?
P010 is not supported as a VPP output format in media_driver for gen 9 and gen 10.

@dvrogozh
Contributor

P010 is not supported as a VPP output format in media_driver for gen 9 and gen 10.

That's one of the things I am trying to check actually. Not sure where this is being set. So, if mediasdk did not enable any algorithm via vaapi vp pipeline setup, then media driver itself could enable some algorithm. For example, it could make some strange color conversion if it thinks P010 is not supported on the output.

By the way, this is another issue: why ffmpeg-qsv tries to use VP for hwdownload.... This does not make sense actually since I believe there is additional memory copy operation: one from decoding video surface to vpp output video surface, then from video surface to system memory. I don't say that's necessarily ffmpeg-qsv issue, it could be mediasdk doing something strange here...

Anyhow, the VP problem should probably be addressed regardless.

@lizhong1008
Contributor

this is another issue: why ffmpeg-qsv tries to use VP for hwdownload.... This does not make sense actually since I believe there is additional memory copy operation: one from decoding video surface to vpp output video surface, then from video surface to system memory. I don't say that's necessarily ffmpeg-qsv issue, it could be mediasdk doing something strange here...

For a qsv transcoding pipeline, it is not necessary to download from video memory to system memory.
This is just for our decoding conformance testing, regardless of performance.

@artem-shaporenko
Contributor

There can be different pipelines: someone may want to use their own CPU-based filters and thus use system memory as the decoder output. It is wrong to use VPP for copying purposes; system memory can be fed directly to the decoder.
But the VPP issue still needs to be fixed.

@artem-shaporenko
Contributor

That's one of the things I am trying to check actually. Not sure where this is being set. So, if mediasdk did not enable any algorithm via vaapi vp pipeline setup, then media driver itself could enable some algorithm. For example, it could make some strange color conversion if it thinks P010 is not supported on the output.

The main problem: why are we even calling the driver if no parameters change? We just need a copy.

Alexandr-Konovalov pushed a commit to Alexandr-Konovalov/MediaSDK that referenced this issue Mar 15, 2019
Alexandr-Konovalov pushed a commit to Alexandr-Konovalov/MediaSDK that referenced this issue Mar 18, 2019
Alexandr-Konovalov pushed a commit to Alexandr-Konovalov/MediaSDK that referenced this issue Mar 18, 2019
Alexandr-Konovalov pushed a commit to Alexandr-Konovalov/MediaSDK that referenced this issue Mar 19, 2019
Alexandr-Konovalov pushed a commit to Alexandr-Konovalov/MediaSDK that referenced this issue Mar 20, 2019
Alexandr-Konovalov pushed a commit to Alexandr-Konovalov/MediaSDK that referenced this issue Mar 20, 2019
Alexandr-Konovalov pushed a commit to Alexandr-Konovalov/MediaSDK that referenced this issue Mar 20, 2019
@fulinjie
Contributor Author

@lizhong1008 @artem-shaporenko Verified: this patch fixes the bit-match issue for HEVC 10 bit.
Thanks.

However, it does not resolve the issue in #916 after applying the resolution change patch:
the garbage is gone, but the md5 still does not match after applying the copy pass-through patch.
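
When only the md5 differs, it can help to find where two outputs first diverge. The sketch below assumes raw P010 files with tightly packed frames (no row padding, even width and height), which may not hold for every dump; `first_difference` and `locate_p010` are illustrative helpers.

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first mismatch, or None if the files are identical."""
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            a = file_a.read(chunk_size)
            b = file_b.read(chunk_size)
            if a != b:
                for index, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + index
                # No differing byte in the overlap: one file is a prefix of the other.
                return offset + min(len(a), len(b))
            if not a:
                return None
            offset += len(a)


def locate_p010(offset, width, height):
    """Map a byte offset to (frame, plane, row, sample column) for packed P010 frames."""
    luma_bytes = width * height * 2        # one 16-bit word per luma sample
    frame_bytes = luma_bytes + luma_bytes // 2  # plus the interleaved UV plane
    frame, within = divmod(offset, frame_bytes)
    plane = "Y" if within < luma_bytes else "UV"
    if plane == "UV":
        within -= luma_bytes
    row, byte_in_row = divmod(within, width * 2)
    return frame, plane, row, byte_in_row // 2
```

For example, `locate_p010(first_difference("qsv.yuv", "ref.yuv"), width, height)` would report the frame, plane, and position of the first mismatching sample, provided `first_difference` did not return `None`.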

onabiull pushed a commit that referenced this issue Mar 21, 2019
gkrivor pushed a commit to gkrivor/MediaSDK that referenced this issue Mar 25, 2019
fzhar pushed a commit to fzhar/MediaSDK that referenced this issue Mar 29, 2019

4 participants