Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libobs: Separate textures for YUV output, fix chroma #1995

Merged
merged 1 commit into from
Aug 10, 2019

Conversation

jpark37
Copy link
Collaborator

@jpark37 jpark37 commented Jul 27, 2019

Description

The shaders to pack YUV information into the same texture were rather
complicated and suffering precision issues. Breaking them up into
separate textures makes the shaders much simpler and avoids having to
compute large integer offsets. Unfortunately, the code to handle
multiple textures is not as pleasant, but at least the NV12 rendering
path is no longer separate.

In addition, write chroma samples to "standard" offsets. For I444,
there's no difference, but I420/NV12 formats now have chroma shifted to
the left as 4:2:0 is shown in the H.264 specification.

Motivation and Context

YUV shaders are not operating as quickly as they should, and we can out knock two precision-related bugs in the process.
Fixes https://obsproject.com/mantis/view.php?id=624
Fixes https://obsproject.com/mantis/view.php?id=1512

Also nice to do chroma samples the H.264 way although it should probably be a user setting in a perfect world.

How Has This Been Tested?

  • Verified output video correct, and correct shaders used in D3D.
  • Verified output video correct, and correct shaders used in GL.
  • Verify Mantis issues are fixed.

Intel GPA, SetStablePowerState, Intel HD Graphics 530

Expect speed incrase:
I420: 844 us -> 493 us (254 us + 190 us + 274 us)
I444: 837 us -> 747 us (258 us + 276 us + 272 us)
NV12: 450 us -> 368 us (319 us + 168 us)

Expect no change:
NV12 (HW): 580 (481 us + 166 us) us -> 588 us (468 us + 247 us)
RGB: 359 us -> 387 us

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code has been run through clang-format.
  • I have read the contributing document.
  • My code is not on the master branch.
  • The code has been tested.
  • All commit messages are properly formatted and commits squashed where appropriate.

@energizerfellow
Copy link

i422?

@jpark37
Copy link
Collaborator Author

jpark37 commented Jul 27, 2019

I422 is not supported by OBS as an output format.

@energizerfellow
Copy link

True, but should it be added at some point with 4:2:2's prevalence in the pro video world?

@jpark37
Copy link
Collaborator Author

jpark37 commented Jul 27, 2019

That sounds like a fine idea, but it's not on my personal hit list.

The shaders to pack YUV information into the same texture were rather
complicated and suffering precision issues. Breaking them up into
separate textures makes the shaders much simpler and avoids having to
compute large integer offsets. Unfortunately, the code to handle
multiple textures is not as pleasant, but at least the NV12 rendering
path is no longer separate.

In addition, write chroma samples to "standard" offsets. For I444,
there's no difference, but I420/NV12 formats now have chroma shifted to
the left as 4:2:0 is shown in the H.264 specification.

Intel GPA, SetStablePowerState, Intel HD Graphics 530

Expect speed incrase:
I420: 844 us -> 493 us (254 us + 190 us + 274 us)
I444: 837 us -> 747 us (258 us + 276 us + 272 us)
NV12: 450 us -> 368 us (319 us + 168 us)

Expect no change:
NV12 (HW): 580 (481 us + 166 us) us -> 588 us (468 us + 247 us)
RGB: 359 us -> 387 us

Fixes https://obsproject.com/mantis/view.php?id=624
Fixes https://obsproject.com/mantis/view.php?id=1512
@jpark37
Copy link
Collaborator Author

jpark37 commented Jul 27, 2019

Finished testing, and I verified both Mantis bugs are fixed.

@jpark37 jpark37 marked this pull request as ready for review July 27, 2019 15:09
@Xaymar
Copy link
Contributor

Xaymar commented Jul 27, 2019

I'm pretty sure there are 4:2:2 options for use in encoders:

	/* packed 422 formats */
	VIDEO_FORMAT_YVYU,
	VIDEO_FORMAT_YUY2, /* YUYV */
	VIDEO_FORMAT_UYVY,

I'm using these in obs-ffmpeg-encoder and enc-vfw.

Edit: Not saying these are required to be supported, just that they exist and currently likely fall back to software conversion.

@jpark37
Copy link
Collaborator Author

jpark37 commented Jul 27, 2019

OBS has shaders to convert 4:2:2 YCbCr to RGB, but not the reverse. There are only 4 options in the output settings: NV12, I420, I444, and RGB.

@jpark37
Copy link
Collaborator Author

jpark37 commented Jul 28, 2019

This would also make #1655 unnecessary because those shader functions have been deleted.

@jp9000 jp9000 merged commit 164f731 into obsproject:master Aug 10, 2019
@jpark37 jpark37 deleted the yuv-simplify branch August 10, 2019 04:50
@SuslikV
Copy link
Contributor

SuslikV commented Sep 21, 2019

... I420/NV12 formats now have chroma shifted to
the left as 4:2:0 is shown in the H.264 specification

Document needs clarification with year of the last redaction used as base source for the implementation.

@WizardCM WizardCM added Bug Fix Non-breaking change which fixes an issue Code Cleanup Non-breaking change which makes code smaller or more readable Enhancement Improvement to existing functionality labels Oct 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Fix Non-breaking change which fixes an issue Code Cleanup Non-breaking change which makes code smaller or more readable Enhancement Improvement to existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants