Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

banding and color differences in GPU path vs. CPU path #394

Closed
mjmvisser opened this issue Jun 8, 2015 · 15 comments
Closed

banding and color differences in GPU path vs. CPU path #394

mjmvisser opened this issue Jun 8, 2015 · 15 comments

Comments

@mjmvisser
Copy link

Hi, we have found in some cases a big difference between images viewed in RV and images viewed in Nuke (both using the respective apps' native OCIO support) as well as terrible banding. I was able to reproduce this issue in Nuke using @slooper's patch to OCIODisplay to include a GPU path (PR #328), so it is definitely a difference between the GPU and CPU paths in OCIO, and not an RV bug.

Here's a repro kit with compiled OCIO nodes (with @slooper's patch) for Nuke 8.0, 64-bit Linux: ocio-gpu-color.zip | uploaded via ZenHub

To see this problem in action:

  1. unpack the zip and cd to ocio-gpu-color
  2. export OCIO=/path/to/OpenColorIO-Configs/aces_0.7.1/config.ocio
  3. export NUKE_PATH=$PWD/nuke8.0
  4. LD_PRELOAD=$PWD/nuke8.0/libOpenColorIO.so Nuke8.0 ocio_gpu_color.nk
  5. mouse over the viewer and press the 1 / 2 / 3 keys:

1 = emulating what is happening in RV: input transform from logc to aces, then display transform from aces to rrt/srgb, both via the GPU path:
image
Note the banding.

2 = going directly from logc to rrt/srgb, also via GPU path:
image

3 = (control) logc to rrt/srgb via CPU path (using the OCIOColorSpace node, which does not have GPU path support)
image

The banding is most apparent when using an IDT and and ODT (as in the first example). However, even in the second example the color difference from the control is apparent.

To build @slooper's patch:

  1. check out the latest master from OpenColorIO as a new branch
  2. git remote add slooper git@github.com:slooper/OpenColorIO.git
  3. git fetch slooper
  4. git cherry-pick 85bbfc61d2e442fbe8ea34764dca93b145642738
  5. edit CMakeLists.txt and add the line ADD_DEFINITIONS("OCIO_NUKE_GPU_ENABLE=1")
  6. mkdir ../OpenColorIO-build-nuke
  7. make sure Nuke8.0 is your default Nuke
  8. CC=gcc412 CXX=g++412 cmake ../OpenColorIO
  9. make sure you LD_PRELOAD the libOpenColorIO.so you just built, otherwise Nuke will use the one it ships with.

Building for Nuke 9:

  1. change the line m_textureName = uniqueGPUShaderId("$$lut"); to m_textureName = "$$lut";
  2. CC=gcc412 CXX=g++412 cmake ../OpenColorIO
    I wasn't able to use the LD_PRELOAD trick to use these custom-built OCIO nodes in Nuke9 successfully.
@sobotka
Copy link
Contributor

sobotka commented Jun 8, 2015

Have you tried reducing the stops allocated in the allocation vars?

With respect,
TJS

On Mon, Jun 8, 2015, 9:04 AM Mark Visser notifications@github.com wrote:

Hi, we have found in some cases a big difference between images viewed in
RV and images viewed in Nuke (both using the respective apps' native OCIO
support) as well as terrible banding. I was able to reproduce this issue in
Nuke using @slooper https://github.com/slooper's patch to OCIODisplay
to include a GPU path (PR #328
#328), so it is
definitely a difference between the GPU and CPU paths in OCIO, and not an
RV bug.

Here's a repro kit with compiled OCIO nodes (with @slooper
https://github.com/slooper's patch) for Nuke 8.0, 64-bit Linux: ocio-gpu-color.zip
| uploaded via ZenHub https://files.zenhub.io/5575b4c9a2be94cc31723cb2

To see this problem in action:

  1. unpack the zip and cd to ocio-gpu-color
  2. export OCIO=/path/to/OpenColorIO-Configs/aces_0.7.1/config.ocio
  3. export NUKE_PATH=$PWD/nuke8.0
  4. LD_PRELOAD=$PWD/nuke8.0/libOpenColorIO.so Nuke8.0 ocio_gpu_color.nk
  5. mouse over the viewer and press the 1 / 2 / 3 keys:

1 = emulating what is happening in RV: input transform from logc to aces,
then display transform from aces to rrt/srgb, both via the GPU path:
[image: image]
https://cloud.githubusercontent.com/assets/351739/8038961/a85ff64c-0dd4-11e5-86c1-6b53adf07f8f.png
Note the banding.

2 = going directly from logc to rrt/srgb, also via GPU path:
[image: image]
https://cloud.githubusercontent.com/assets/351739/8038990/d19400e4-0dd4-11e5-8424-fefb150ec6b8.png

3 = (control) logc to rrt/srgb via CPU path (using the OCIOColorSpace
node, which does not have GPU path support)
[image: image]
https://cloud.githubusercontent.com/assets/351739/8038996/de5a1f84-0dd4-11e5-89d4-12a72647a5a8.png

The banding is most apparent when using an IDT and and ODT (as in the
first example). However, even in the second example the color difference
from the control is apparent.

To build @slooper https://github.com/slooper's patch:

  1. check out the latest master from OpenColorIO as a new branch
  2. git remote add slooper git@github.com:slooper/OpenColorIO.git
  3. git fetch slooper
  4. git cherry-pick 85bbfc61d2e442fbe8ea34764dca93b145642738
  5. edit CMakeLists.txt and add the line
    ADD_DEFINITIONS("OCIO_NUKE_GPU_ENABLE=1")
  6. mkdir ../OpenColorIO-build-nuke
  7. make sure Nuke8.0 is your default Nuke
  8. CC=gcc412 CXX=g++412 cmake ../OpenColorIO
  9. make sure you LD_PRELOAD the libOpenColorIO.so you just built,
    otherwise Nuke will use the one it ships with.

Building for Nuke 9:

  1. change the line m_textureName = uniqueGPUShaderId("$$lut"); to m_textureName
    = "$$lut";
  2. CC=gcc412 CXX=g++412 cmake ../OpenColorIO
    I wasn't able to use the LD_PRELOAD trick to use these custom-built OCIO
    nodes in Nuke9 successfully.


Reply to this email directly or view it on GitHub
#394.

@mjmvisser
Copy link
Author

Hi, thanks for your reply.

Yes, I've tried using a narrower allocation for the aces space. The banding is less, but the color difference is still there. Note that I'm using the aces_0.7.1 config for my example above, and it already has a tighter range ([-8.5, 5]) than the [-16, 6] discussed in the docs.

I'm open to this being a configuration problem, but I've been unable to find a combination of allocation space and vars that fixes it. Our workaround is to bypass the GPU path in RV completely by dynamically baking LUTs for input space->display space instead. It works, but it's not ideal. :-/

cheers,
-Mark

@chadrik
Copy link

chadrik commented Sep 18, 2015

Hi all, I want to add a +1 to this as it is a critical issue for us. The banding makes OCIO unusable for final review in any playback application that uses the GPU (for us this is RV, but I assume this affects other apps like Hiero). Honestly, I'm a bit surprised there is not more of a clamor about this as I assume it affects many studios. How are others dealing with this?
I'd like to get a sense of whether this problem is a fixable bug in OCIO, or a lower level problem which will be difficult or impossible to workaround. Has there been any further investigation into the problem? Anyone have any insight to share?

@mjmvisser, any chance you could share some code from your solution?

@mjmvisser
Copy link
Author

Hi Chad,

Here's our modified RV ocio_source_setup package. Note that this hasn't been tested on RV 6 (we're still on RV 4).

https://drive.google.com/file/d/0B880qC-5oTaTNGRMcDFFME5kZVE/view?usp=sharing

edit: the version I originally uploaded had a cube size of 17. We bumped this to 65 today after noticing some color differences between Nuke (OCIO CPU path) and RV (OCIO baked LUTs). I updated the package.

@Shootfast
Copy link
Contributor

Hi folks,

Whilst I don't have a solution for the problem just now, I am currently looking at improving the overall unit testing in OpenColorIO so we can catch these problems. I'll add some tests in to see if I can catch this issue and stamp it out for good.

@mjmvisser
Copy link
Author

Great idea! I haven't had time to even look at this, but I was thinking that a test that compared CPU path to GPU path over a bunch of test cases would be very useful.

@KevinJW
Copy link
Contributor

KevinJW commented Sep 28, 2015

whilst not necessarily the case in this example, I've run across a number of things that cause differences - so somethings to consider for the tests.

power of two vs non-power of two 3D luts meaning apps re-sample to powers of two (quality LUTs use 2^n+1 node points, e.g. 17^3, 33^3, 65^3)
limits on the size of LUTs/textures
Lack of tetrahedral interpolation on the GPU path
On older cards the interpolators don't have the precision
problems with shaper implementation (typically not logarithmic or unable to map the desired linear range correctly/fully)
extrapolation problems (not likely the case in OCIO currently)
mixing up co-ordinates of pixel centers (+/- 0.5)
trying to represent a function with high curvature with insufficient node counts when trilinear interpolating can result in over/under shoot (use tetrahedral instead - not on GPU path currently).
intermediate buffers being lower precision than required

Kevin

@RachelKLowe
Copy link

Has there been any change with this ticket? I am experiencing problems here as well. We built a pipeline around using RV, Nuke, and oiiotool to either apply LUTs at display time or bake them in as needed. In testing everything seemed to match well enough, but now that we are hitting production we are finding images that contain a lot of skin tones look very different depending on if RV is used or Nuke/oiiotool which do match each other.

I'm going to continue digging into the suggestions in this thread and try to make RV match our other tools. Unfortunately at our studio people preferred the look of the GPU applied LUT over the CPU applied one. Do you know which one is "correct"? Is there a fix in progress? Thank you!

Kenny

@mjmvisser
Copy link
Author

Unfortunately no. I "solved" the problem at my studio by rewriting the ocio_source_setup package to use dynamically baked and loaded .csp LUTs. So we're not even using OCIO directly in RV anymore -- at least not the GPU path.

I asked about the stalled public development on the dev list back in Novembers and got this reply from Mark Boorer (who I assume works at spi):

Hi Mark,

We have a number of large changes about to land in the public repository, and whilst I know it's frustrating to see the repo stagnant, by no means is the project dead.
My current goal is to put our humungous patch out into the public domain, seek feedback, and then assess how that affects the state of the existing pull requests / issues.
This patch should tackle a number of outstanding issues, primarily the plugin arcitecture, differences between the GPU and CPU code-paths, better unit testing, improved CMake code for cross platform builds, documentation, and repository structure. There are others working on new file-format plugins whose work I am hoping to integrate soon.

In addition, there is also a large amount of organisation happening behind-the-scenes to do with our official ACES implementation on the config side of things.

I am working towards having this ready for feedback by the new year, and once this is done we will hopefully see a lot more activity in the open again.

Cheers,
Mark

I posted a follow-up asking if there was an update to the timeline, but no reply. :-/

edit: I think @Shootfast is Mark Boorer, hopefully he sees this and jumps in with an update.

@RachelKLowe
Copy link

Thanks Mark! I saw your example and ended up doing roughly the same thing. Rather than baking the LUT dynamically in Python I just used the ociobakelut command and stored the resulting .csp file alongside our ocio config. Same thing in the end though, we are just working around it by not using OCIO inside of RV.

@nrusch
Copy link
Contributor

nrusch commented Sep 6, 2016

Pinging this thread once again to see if there are any updates.

@dbr
Copy link
Contributor

dbr commented Sep 7, 2016

Could someone check if this is just caused by tetrahedral interpolation not being implemented in the GPU code path? It sounds very much like it..

Simplest way to check would be to replace any interpolation: tetrahedral with interpolation: linear which should produce the same banding in GPU and CPU

@mjmvisser
Copy link
Author

mjmvisser commented Sep 13, 2016

I just tried this (had to dig out Nuke8.0v4 to test with the binaries I built). Changing the interpolation from tetrahedral to linear doesn't make a noticeable difference in either the GPU or CPU path. Same banding and difference between GPU and CPU.

What I did:

  • updated my test nk script to use OpenColorIO-Configs aces_1.0.1
  • duplicated "Output - sRGB (D60 sim.)" as "Output - sRGB (D60 sim.) linear", changed interpolation from tetrahedral to linear
  • duplicated "sRGB D60 sim." view as "sRGB D60 sim. linear", pointed at "Output - sRGB (D60 sim.) linear"

(the ARRI IDTs are 1D luts, so they're already linear)

edit: Just saw the thread on the mailing list, replying there.

@bsloan666
Copy link

bsloan666 commented Nov 22, 2016

I don't object to OCIO providing a GPU implementation of the transforms but I question RV's decision to use opencolorio's GPU-, rather than the CPU-path. I believe RV's and Nuke's display color management has made use of OpenGL optimizations (pixel shaders, 3D textures etc.) since before their integration of OCIO.

Shouldn't graphical clients of opencolorio (Nuke, RV, etc.) rely on opencolorio to provide the precise computation of the color transforms as defined in the config when generating their own internal 3D lookups. Should this step be a candidate for optimizations that are potentially lossy/imprecise?

@scoopxyz
Copy link
Contributor

Addressed by #539

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests