inaccurate conversion from YCbCr Narrow range to RGB Full range #109

Closed
cepesh opened this issue Jul 24, 2019 · 12 comments

Comments

@cepesh

cepesh commented Jul 24, 2019

Hi, I've been advised to raise a ticket here rather than on the ffmpeg trac system. Original issue description, source and output files can be found here: https://trac.ffmpeg.org/ticket/8034

The problem is that when I convert a v210 YCbCr BT.2020 NCL narrow-range file to a full-range RGB TIFF, some code values are incorrect. For simplicity of reproduction I've focused on just three colours, but that doesn't mean there are no other value errors present.

Here's a copy of the ffmpeg bug text:

Summary of the bug: inaccurate conversion from BT.2020 NCL YCbCr Narrow range to BT.2020 RGB Full range using zscale filter
How to reproduce (on the latest github windows build ffmpeg-20190722-3883c9d-win64-static):

ffmpeg started on 2019-07-24 at 11:10:42
Report written to "ffmpeg-20190724-111042.log"
Command line:
"C:\tools\ffmpeg-20190722-3883c9d-win64-static\bin\ffmpeg.exe" -loglevel debug -report -s 3840x2160 -vcodec v210 -i YUV_pattern-2fr.v210 -vf "zscale=dither=none:matrix=bt2020nc:matrixin=bt2020nc:in_range=limited:out_range=full" -vframes 1 ffmpeg_3883c9d_zscale.TIFF -y
ffmpeg version N-94383-g3883c9d147 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 9.1.1 (GCC) 20190716
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
libavutil 56. 32.100 / 56. 32.100
libavcodec 58. 55.100 / 58. 55.100
libavformat 58. 30.100 / 58. 30.100
libavdevice 58. 9.100 / 58. 9.100
libavfilter 7. 58.100 / 7. 58.100
libswscale 5. 6.100 / 5. 6.100
libswresample 3. 6.100 / 3. 6.100
libpostproc 55. 6.100 / 55. 6.100
Splitting the commandline.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument 'debug'.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Reading option '-s' ... matched as option 's' (set frame size (WxH or abbreviation)) with argument '3840x2160'.
Reading option '-vcodec' ... matched as option 'vcodec' (force video codec ('copy' to copy stream)) with argument 'v210'.
Reading option '-i' ... matched as input url with argument 'YUV_pattern-2fr.v210'.
Reading option '-vf' ... matched as option 'vf' (set video filters) with argument 'zscale=dither=none:matrix=bt2020nc:matrixin=bt2020nc:in_range=limited:out_range=full'.
Reading option '-vframes' ... matched as option 'vframes' (set the number of video frames to output) with argument '1'.
Reading option 'ffmpeg_3883c9d_zscale.TIFF' ... matched as output url.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option loglevel (set logging level) with argument debug.
Applying option report (generate a report) with argument 1.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url YUV_pattern-2fr.v210.
Applying option s (set frame size (WxH or abbreviation)) with argument 3840x2160.
Applying option vcodec (force video codec ('copy' to copy stream)) with argument v210.
Successfully parsed a group of options.
Opening an input file: YUV_pattern-2fr.v210.
[NULL @ 000001ed6f27c780] Opening 'YUV_pattern-2fr.v210' for reading
[file @ 000001ed6f27cfc0] Setting default whitelist 'file,crypto'
[v210 @ 000001ed6f27c780] Format v210 probed with size=2048 and score=50
[v210 @ 000001ed6f27c780] Before avformat_find_stream_info() pos: 0 bytes read:32768 seeks:0 nb_streams:1
[v210 @ 000001ed6f27c780] All info found
[v210 @ 000001ed6f27c780] Estimating duration from bitrate, this may be inaccurate
[v210 @ 000001ed6f27c780] After avformat_find_stream_info() pos: 22118400 bytes read:22118400 seeks:0 frames:1
Input #0, v210, from 'YUV_pattern-2fr.v210':
Duration: 00:00:00.08, start: 0.000000, bitrate: 4423680 kb/s
Stream #0:0, 1, 1/25: Video: v210, 1 reference frame, yuv422p10le, 3840x2160, 0/1, 4423680 kb/s, 25 tbr, 25 tbn, 25 tbc
Successfully opened the file.
Parsing a group of options: output url ffmpeg_3883c9d_zscale.TIFF.
Applying option vf (set video filters) with argument zscale=dither=none:matrix=bt2020nc:matrixin=bt2020nc:in_range=limited:out_range=full.
Applying option vframes (set the number of video frames to output) with argument 1.
Successfully parsed a group of options.
Opening an output file: ffmpeg_3883c9d_zscale.TIFF.
Successfully opened the file.
Stream mapping:
Stream #0:0 -> #0:0 (v210 (native) -> tiff (native))
Press [q] to stop, [?] for help
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
detected 32 logical cores
[Parsed_zscale_0 @ 000001ed6f284f80] Setting 'dither' to value 'none'
[Parsed_zscale_0 @ 000001ed6f284f80] Setting 'matrix' to value 'bt2020nc'
[Parsed_zscale_0 @ 000001ed6f284f80] Setting 'matrixin' to value 'bt2020nc'
[Parsed_zscale_0 @ 000001ed6f284f80] Setting 'in_range' to value 'limited'
[Parsed_zscale_0 @ 000001ed6f284f80] Setting 'out_range' to value 'full'
[graph 0 input from stream 0:0 @ 000001ed6f285100] Setting 'video_size' to value '3840x2160'
[graph 0 input from stream 0:0 @ 000001ed6f285100] Setting 'pix_fmt' to value '66'
[graph 0 input from stream 0:0 @ 000001ed6f285100] Setting 'time_base' to value '1/25'
[graph 0 input from stream 0:0 @ 000001ed6f285100] Setting 'pixel_aspect' to value '0/1'
[graph 0 input from stream 0:0 @ 000001ed6f285100] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:0 @ 000001ed6f285100] Setting 'frame_rate' to value '25/1'
[graph 0 input from stream 0:0 @ 000001ed6f285100] w:3840 h:2160 pixfmt:yuv422p10le tb:1/25 fr:25/1 sar:0/1 sws_param:flags=2
[format @ 000001ed6f2973c0] Setting 'pix_fmts' to value 'rgb24|rgb48le|pal8|rgba|rgba64le|gray|ya8|gray16le|ya16le|monob|monow|yuv420p|yuv422p|yuv440p|yuv444p|yuv410p|yuv411p'
[auto_scaler_0 @ 000001ed6f297980] Setting 'flags' to value 'bicubic'
[auto_scaler_0 @ 000001ed6f297980] w:iw h:ih flags:'bicubic' interl:0
[format @ 000001ed6f2973c0] auto-inserting filter 'auto_scaler_0' between the filter 'Parsed_zscale_0' and the filter 'format'
[AVFilterGraph @ 000001ed6f2825c0] query_formats: 4 queried, 2 merged, 1 already done, 0 delayed
[auto_scaler_0 @ 000001ed6f297980] picking rgb48le out of 17 ref:yuv422p10le alpha:0
[Parsed_zscale_0 @ 000001ed6f284f80] w:3840 h:2160 fmt:yuv422p10le sar:0/1 -> w:3840 h:2160 fmt:yuv422p10le sar:0/1
[auto_scaler_0 @ 000001ed6f297980] w:3840 h:2160 fmt:yuv422p10le sar:0/1 -> w:3840 h:2160 fmt:rgb48le sar:0/1 flags:0x4
[Parsed_zscale_0 @ 000001ed6f284f80] w:3840 h:2160 fmt:yuv422p10le sar:0/1 -> w:3840 h:2160 fmt:yuv422p10le sar:0/1
Output #0, image2, to 'ffmpeg_3883c9d_zscale.TIFF':
Metadata:
encoder : Lavf58.30.100
Stream #0:0, 0, 1/25: Video: tiff, 1 reference frame, rgb48le, 3840x2160, 0/1, q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc
Metadata:
encoder : Lavc58.55.100 tiff
Clipping frame in rate conversion by 0.000008
No more output streams to write to, finishing.
[image2 @ 000001ed6f28eb40] Opening 'ffmpeg_3883c9d_zscale.TIFF' for writing
[file @ 000001ed6f2d4e80] Setting default whitelist 'file,crypto'
[AVIOContext @ 000001ed6f2d4f80] Statistics: 0 seeks, 13 writeouts
frame= 1 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.04 bitrate=N/A speed=0.0883x
video:3195kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Input file #0 (YUV_pattern-2fr.v210):
Input stream #0:0 (video): 1 packets read (22118400 bytes); 1 frames decoded;
Total: 1 packets (22118400 bytes) demuxed
Output file #0 (ffmpeg_3883c9d_zscale.TIFF):
Output stream #0:0 (video): 1 frames encoded; 1 packets muxed (3271738 bytes);
Total: 1 packets (3271738 bytes) muxed
1 frames successfully decoded, 0 decoding errors
[AVIOContext @ 000001ed6f285200] Statistics: 22118400 bytes read, 0 seeks

100% White - CORRECT
Source YCbCr value = 0x03ac, 0x0200, 0x0200 (940, 512, 512)
Observed RGB value = 65535, 65535, 65535
Expected RGB value = 65535, 65535, 65535

58% White - INCORRECT
Source YCbCr value = 0x023c, 0x0200, 0x0200 (572, 512, 512)
Observed RGB value = 39423, 39423, 39423
Expected RGB value = 38004, 38004, 38004

100% Magenta - ALMOST CORRECT
Source YCbCr value = 0x015a, 0x0343, 0x039c (346, 835, 924)
Observed RGB value = 65535, 0, 65535
Expected RGB value = 65533, 0, 65535
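
The expected values for the two grey patches can be checked by hand, because with Cb = Cr = 512 the chroma terms vanish and R = G = B depends only on luma. The following is a minimal Python sketch of that arithmetic, assuming the usual 10-bit narrow-range luma limits (64 to 940) and round-to-nearest output; the function name is hypothetical and this is not how ffmpeg or zimg implement the conversion.

```python
def narrow10_luma_to_full16(y_code: int) -> int:
    """Map a 10-bit narrow-range luma code to a 16-bit full-range code."""
    # Normalise so that code 64 maps to 0.0 and code 940 maps to 1.0.
    normalized = (y_code - 64) / (940 - 64)
    # Clip values outside the nominal range before scaling.
    normalized = min(max(normalized, 0.0), 1.0)
    return round(normalized * 65535)


for name, y_code in [("100% White", 940), ("58% White", 572)]:
    value = narrow10_luma_to_full16(y_code)
    print(f"{name}: R = G = B = {value}")
```

For Y = 572 this gives 38004, the expected value quoted above, rather than the observed 39423.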

@sekrit-twc
Owner

sekrit-twc commented Jul 24, 2019

I can not reproduce this issue:

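import vapoursynth as vs

core = vs.get_core()
# Solid 10-bit YUV frame holding the "58% White" code values (Y=572, Cb=Cr=512).
c = core.std.BlankClip(format=vs.YUV444P10, color=[572, 512, 512])
# Convert to 16-bit RGB through the resizer, using the BT.2020 NCL matrix.
c = core.resize.Bicubic(c, format=vs.RGB48, matrix_in_s='2020ncl')
# Compute plane statistics (min/max/average) and overlay the frame properties as text.
c = core.std.PlaneStats(c)
c = core.text.FrameProps(c)
# Convert to a packed 8-bit format for on-screen preview only.
c = core.resize.Bicubic(c, format=vs.COMPATBGR32)
c.set_output()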

This prints 38004. For your 100% magenta question, your 10-bit inputs are already quantized, so why would you expect it to produce an exact result? The input vector (16-bit) [22149, 53433, 59134] produces bit-exact 16-bit output.

@cepesh
Author

cepesh commented Jul 24, 2019

Thank you for looking into this so promptly. Unfortunately, I've not worked with vapoursynth before and I can't yet run your reproduction steps myself. But assuming zimg itself is working correctly, would it be too much to ask you to see if you can reproduce the issue using the ffmpeg command I am using?

> For your 100% magenta question, your 10-bit inputs are already quantized, so why would you expect it to produce an exact result?

I am expecting an exact result because there is no good reason to not have an exact result and the fact my input is quantised already shouldn't matter that much.

> The input vector (16-bit) [22149, 53433, 59134] produces bit-exact 16-bit output.

Can you please clarify what this input vector represents and what output does it produce?

Thanks!

@richardpl

Perhaps the error comes from the fact that zscale overrides the matrix to the RGB one for RGB output pixel formats?

@sekrit-twc
Owner

sekrit-twc commented Jul 24, 2019

> > For your 100% magenta question, your 10-bit inputs are already quantized, so why would you expect it to produce an exact result?
>
> I am expecting an exact result because there is no good reason to not have an exact result and the fact my input is quantised already shouldn't matter that much.

Of course it matters. If your input is inexact, the correct result is for the output to be inexact. You can work the math out yourself.

> > The input vector (16-bit) [22149, 53433, 59134] produces bit-exact 16-bit output.
>
> Can you please clarify what this input vector represents and what output does it produce?
>
> Thanks!

I used the Rec.2020 matrix to compute the real-valued YUV corresponding to R=1, G=0, B=1, and then I rounded it to 16-bit. From ITU-R BT.2020-2:

Y' = 0.2627*R' + 0.6780*G' + 0.0593*B' = 0.322
C'b = (B' - Y') / 1.8814 = 0.36036993728074834
C'r = (R' - Y') / 1.4746 = 0.459785704597857

Quantizing it to 16-bit values:

Y'' = Y' * (60160-4096) + 4096 = 22148.6 = 22149
C''b = (C'b + 0.5) * (61440-4096) + 4096 = 53433.05368342723 = 53433
C''r = (C'r + 0.5) * (61440-4096) + 4096 = 59133.95144445951 = 59134
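
The forward conversion and quantization above can be reproduced with a short Python sketch. It assumes the BT.2020 non-constant-luminance coefficients quoted from ITU-R BT.2020-2 and narrow-range scaling of 219 (luma) and 224 (chroma) steps at 8 bits, scaled up by the bit depth; the function names are hypothetical and not part of z.lib.

```python
KR, KB = 0.2627, 0.0593
KG = 1.0 - KR - KB  # 0.6780


def rgb_to_ycbcr(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Real-valued BT.2020 NCL Y'CbCr from normalised R'G'B' in [0, 1]."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2 * (1 - KB))  # divisor is 1.8814
    cr = (r - y) / (2 * (1 - KR))  # divisor is 1.4746
    return y, cb, cr


def quantize_narrow(y: float, cb: float, cr: float, bits: int) -> tuple[int, int, int]:
    """Quantize to narrow-range integer codes at the given bit depth."""
    scale = 1 << (bits - 8)
    return (
        round((219 * y + 16) * scale),
        round((224 * cb + 128) * scale),
        round((224 * cr + 128) * scale),
    )


magenta = rgb_to_ycbcr(1.0, 0.0, 1.0)
print("16-bit:", quantize_narrow(*magenta, bits=16))
print("10-bit:", quantize_narrow(*magenta, bits=10))
```

At 16 bits this yields (22149, 53433, 59134), and at 10 bits it yields (346, 835, 924), i.e. the source triplet 0x015a, 0x0343, 0x039c.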

Your 10-bit input (346, 835, 924) maps to:

Y' = (346-64) / (940-64) = 0.3219178082191781
C'b = (835-64) / (960-64) - 0.5 = 0.3604910714285714
C'r = (924-64) / (960-64) - 0.5 = 0.4598214285714286

This is correct only to the 3rd decimal place, whereas a 16-bit number requires around 5 decimal places of precision. It is therefore not surprising that your output is off by 2/65535, which is on the order of the 5th decimal place.
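
To put numbers on that precision argument, the sketch below compares the dequantized 10-bit input against the ideal real-valued magenta and expresses each error in units of one full-range 16-bit step. It is an illustrative calculation only, using the constants already quoted in this thread; the helper name is hypothetical.

```python
# Ideal real-valued Y'CbCr for R'=1, G'=0, B'=1 (BT.2020 NCL).
IDEAL = (0.2627 + 0.0593, (1 - 0.322) / 1.8814, (1 - 0.322) / 1.4746)


def dequantize_narrow10(y: int, cb: int, cr: int) -> tuple[float, float, float]:
    """Map 10-bit narrow-range codes back to normalised Y', C'b, C'r."""
    return (
        (y - 64) / (940 - 64),
        (cb - 64) / (960 - 64) - 0.5,
        (cr - 64) / (960 - 64) - 0.5,
    )


actual = dequantize_narrow10(346, 835, 924)
errors = [abs(a - i) for a, i in zip(actual, IDEAL)]
STEP_16BIT = 1 / 65535  # size of one full-range 16-bit code step

for name, err in zip(("Y'", "C'b", "C'r"), errors):
    print(f"{name}: error {err:.2e} = {err / STEP_16BIT:.1f} 16-bit steps")
```

Each component's error is larger than one 16-bit step, so a deviation of a couple of codes in the 16-bit output is within what the 10-bit quantization alone can cause.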

@cepesh
Author

cepesh commented Jul 25, 2019

> If your input is inexact, the correct result is for the output to be inexact. You can work the math out yourself.

Here you are second-guessing what the original intent behind the input code value was. When I supply code value 346 on input to the matrix conversion I expect it to be treated as 346.000 rather than 346.078125. I don't really want to focus on the magenta situation: as you say, it's only 2/65535 off, and there is a bigger issue that would be good to get to the bottom of.

Have you got a view regarding this comment above?
#109 (comment)

> Perhaps the error comes from the fact that zscale overrides the matrix to the RGB one for RGB output pixel formats?

@sekrit-twc
Owner

> Here you are second-guessing what the original intent behind the input code value was.

Getting offended by maths is a new one. Different strokes, different folks perhaps...

Y' = (346-64) / (940-64) = 0.3219178082191781
C'b = (835-64) / (960-64) - 0.5 = 0.3604910714285714
C'r = (924-64) / (960-64) - 0.5 = 0.4598214285714286

B' = C'b * 1.8814 + Y' = 1.0001457100048923
R' = C'r * 1.4746 + Y' = 0.9999704867906067
G' = (Y' - 0.2627*R' - 0.0593*B')/0.6780 = -0.00012253578761713176

R'' = R'*65535 = 65533.065851822415 = 65533
G'' = G'*65535 = -8.030382841488729 = -8
B'' = B'*65535 = 65544.54910517062 = 65545

So we see that your "original intent" is to produce the color (65533, -8, 65545), and it is only clipping that is allowing you to have the delusion that you are close to (65535, 0, 65535).
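
The inverse conversion worked through above can be checked with a short Python sketch that prints the unclipped 16-bit values next to the clipped result. It assumes BT.2020 NCL coefficients, 10-bit narrow-range input and full-range 16-bit output with round-to-nearest; the function names are hypothetical and this is not z.lib's actual code path.

```python
KR, KB = 0.2627, 0.0593
KG = 1.0 - KR - KB  # 0.6780


def ycbcr10_to_rgb(y_code: int, cb_code: int, cr_code: int) -> tuple[float, float, float]:
    """10-bit narrow-range BT.2020 NCL Y'CbCr to normalised, unclipped R'G'B'."""
    y = (y_code - 64) / 876
    cb = (cb_code - 512) / 896
    cr = (cr_code - 512) / 896
    r = y + 2 * (1 - KR) * cr  # multiplier is 1.4746
    b = y + 2 * (1 - KB) * cb  # multiplier is 1.8814
    g = (y - KR * r - KB * b) / KG
    return r, g, b


def to_full16(value: float) -> int:
    """Clip to [0, 1] and scale to a full-range 16-bit code."""
    return round(min(max(value, 0.0), 1.0) * 65535)


rgb = ycbcr10_to_rgb(346, 835, 924)
print("unclipped:", [f"{v * 65535:.2f}" for v in rgb])
print("clipped:  ", tuple(to_full16(v) for v in rgb))
```

The unclipped green is slightly negative and the unclipped blue slightly above full scale; after clipping the triplet becomes (65533, 0, 65535).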

@cepesh
Author

cepesh commented Jul 25, 2019

Trust me, I had done my maths on these colour chips before submitting the original ffmpeg bug report. I know there is clipping required for the YCbCr triplet (0x015a, 0x343, 0x039c).
I don't think we are in a mathematical disagreement here and I hope you are not offended.

Sorry for repeating myself, but have you got a view on the more serious discrepancy 58% White when ffmpeg zscale is used?

@sekrit-twc
Owner

I have not been able to set up a FFmpeg debug environment, as I have no involvement w/ vf_zscale. It will probably take a few more days to investigate.

@sekrit-twc
Owner

I looked at the log output in the first post, which appears to indicate that z.lib is not involved in converting to RGB.

[Parsed_zscale_0 @ 000001ed6f284f80] w:3840 h:2160 fmt:yuv422p10le sar:0/1 -> w:3840 h:2160 fmt:yuv422p10le sar:0/1
[auto_scaler_0 @ 000001ed6f297980] w:3840 h:2160 fmt:yuv422p10le sar:0/1 -> w:3840 h:2160 fmt:rgb48le sar:0/1 flags:0x4
[Parsed_zscale_0 @ 000001ed6f284f80] w:3840 h:2160 fmt:yuv422p10le sar:0/1 -> w:3840 h:2160 fmt:yuv422p10le sar:0/1
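
One way to read these lines mechanically is to extract, for each filter, the pixel format before and after the arrow. The Python sketch below does that for the two distinct log lines quoted above; the regular expression is an assumption fitted to this particular log excerpt, not a documented ffmpeg log format.

```python
import re

LOG = """\
[Parsed_zscale_0 @ 000001ed6f284f80] w:3840 h:2160 fmt:yuv422p10le sar:0/1 -> w:3840 h:2160 fmt:yuv422p10le sar:0/1
[auto_scaler_0 @ 000001ed6f297980] w:3840 h:2160 fmt:yuv422p10le sar:0/1 -> w:3840 h:2160 fmt:rgb48le sar:0/1 flags:0x4
"""

# Filter name in brackets, then the first fmt: on each side of the arrow.
PATTERN = re.compile(
    r"\[(?P<filter>\S+) @ \w+\].*?fmt:(?P<src>\S+).*?->.*?fmt:(?P<dst>\S+)"
)


def format_changes(log: str) -> list[tuple[str, str, str]]:
    """Return (filter, input format, output format) for each matching line."""
    changes = []
    for line in log.splitlines():
        match = PATTERN.search(line)
        if match:
            changes.append((match["filter"], match["src"], match["dst"]))
    return changes


for name, src, dst in format_changes(LOG):
    note = "no format change" if src == dst else "converts pixel format"
    print(f"{name}: {src} -> {dst} ({note})")
```

Here zscale passes yuv422p10le through unchanged, and the auto-inserted scaler is the filter that goes from yuv422p10le to rgb48le.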

I tried to reproduce this with an FFmpeg build that I compiled myself, but I could not determine how to convert to RGB by reading the documentation (https://ffmpeg.org/ffmpeg-filters.html#zscale-1).

@richardpl

There is a user-supplied command above...

@richardpl

richardpl commented Jul 26, 2019

@cepesh Could you use 'zscale=(other parameters as usual),format=rgb48le' instead of what you are currently using?

@sekrit-twc
Owner

sekrit-twc commented Jul 26, 2019

@cepesh I did some Googling and found that you must add 'format=gbrp16le' after 'zscale'. Otherwise, FFmpeg instructs z.lib to perform a no-op conversion from YUV to YUV, and instead uses its internal code to convert to RGB. After I did this, I verified that I got '38004' when converting from '572'. This bug(?) exists in FFmpeg, not z.lib.
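
For reference, the adjusted invocation might be assembled as in the Python sketch below. It reuses the options from the command in the first post and only appends 'format=gbrp16le' to the filter chain; the helper name is hypothetical, the command is built but not executed, and whether the remaining zscale options should stay exactly as they were is not confirmed in this thread.

```python
import shlex


def build_ffmpeg_args(src: str, dst: str) -> list[str]:
    """Build the ffmpeg argument list with an explicit RGB format after zscale."""
    filters = ",".join([
        # zscale options copied from the original report.
        "zscale=dither=none:matrix=bt2020nc:matrixin=bt2020nc"
        ":in_range=limited:out_range=full",
        # Request planar 16-bit RGB so the conversion to RGB happens in zscale.
        "format=gbrp16le",
    ])
    return [
        "ffmpeg", "-s", "3840x2160", "-vcodec", "v210", "-i", src,
        "-vf", filters, "-vframes", "1", "-y", dst,
    ]


args = build_ffmpeg_args("YUV_pattern-2fr.v210", "ffmpeg_zscale_gbrp16le.TIFF")
print(shlex.join(args))
```

The output file name is a placeholder; the printed line can be pasted into a shell once the paths are adjusted.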
