Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to archive a config and its LUTs #1627

Closed
doug-walker opened this issue Apr 6, 2022 · 9 comments
Closed

Provide a way to archive a config and its LUTs #1627

doug-walker opened this issue Apr 6, 2022 · 9 comments
Labels
Feature Request New addition to OCIO functionality. Needs Discussion Needs discussion before implmentation, which could result in changes, or a decision not to proceed.
Milestone

Comments

@doug-walker
Copy link
Collaborator

An OCIO configuration consists of the config file YAML document along with an optional set of external LUT files. This information plays a critical role in the interpretation of many deliverables (e.g scene or project files created by a DCC). In other words, if the config goes missing or cannot be found or located, it is not possible for an application to correctly interpret any scene or project files that reference it. This is a current area of fragility for workflows involving OCIO.

Examples of common use-cases that are impacted by this fragility are:

  • Packaging an OCIO config to share across multiple facilities
  • Archiving a project into a self-contained file
  • Collaboration between workstations that are not on the same network

One feature that would be useful in this regard is if there were an API call that could package an OCIO configuration into a single file (and an inverse function to unpack one). We could call the result a "config archive".

The possible locations of the LUT files is specified in a Unix-style search path found in the config file (i.e., "search_path"). When OCIO encounters a FileTransform elsewhere in the config, it needs to search for the named file by walking the search path in the specified order until the file is located.

It is not uncommon to have several files of a given name in various locations along the path. For example, there may be a default look transform for a show that is used as a fall-back if there is no look transform for a specific scene.

Furthermore, it is possible to use OCIO context variables to make the search_path a function of the current environment variables. This is very flexible but also greatly complicates any effort to know what files will actually be used. Because of this complication, the proposal is that the new API call would fail if the config uses context variables. Furthermore, because the API does not have a good way to search the config for context variable usage, the exact requirement is that the "environment" line of the config must be present and empty (i.e., "{}"), indicating that no context variables are used.

Another complication is that there may be files in the search directories that are not actually used by OCIO, for example image files. These could greatly increase the size of the config archive so these files should be excluded. The proposed way to do that is to only include files that have one of the extensions supported by OCIO. Note: My understanding is that some users have a workflow where they create "LUT" files without an extension as a way to be agnostic of the actual file format (in other words, it might sometimes be a Lut3D and other times be a CDL). This is not technically required since the OCIO LUT parser will try all the formats if the parsing of the first format fails. If you think it is necessary to include files without any extension, please let us know in the comments.

As long as the LUT files are being gathered together into an archive file, the function may as well compress them too (e.g., using zip or some similar algorithm). LUT files tend to compress quite well.

Issue #1621 proposes building-in one or more default configs to the OCIO library. This would be straight-forward for configs such as the CG config that have no external LUT files but would be more challenging for the Studio config, which will require external LUTs. By providing a way to package the Studio config into a single file, this feature might enable more easily adding other configs such as that to the built-in set.

@doug-walker doug-walker added Feature Request New addition to OCIO functionality. Needs Discussion Needs discussion before implmentation, which could result in changes, or a decision not to proceed. labels Apr 6, 2022
@doug-walker doug-walker added this to the OCIO 2.2.0 milestone Apr 6, 2022
@SonyDennisAdams
Copy link

Interesting.

We had a use case for embedding the config and LUTs into a library so that it could be a stand-alone file. Our solution was to extend OCIO so the host can register a stream handler that OCIO uses to request streams for the config and LUTs. Then the host uses embedded resources to satisfy those requests.

A similar mechanism could be used inside OCIO, so if a ZIP is passed to CreateFromFile (or CreateFromStream) then the config and LUTs would be read from it instead of the file system.

@zachlewis
Copy link
Contributor

Re: context vars:

  • What if the config only uses context variables with ColorSpaceTransforms, and never with FileTransforms or in the Search Paths?
  • What if the config only uses context variables for the CCCID key in a FileTransform, but never anywhere else?

In both cases, the exact LUTs used are known ahead of time. I hear what you're saying about not having a great way to ascertain the config's use of context variables; but I think it should be possible to support the above use cases IFF defaults are provided in the "environment" section, and context variables aren't used in any part of the search path or any FileTransform's src parameter.

Re: only archiving LUTs with supported extensions:

Huh. For our OCIOv1 pipeline, we actually don't allow extensions for our published file transforms for exactly that reason (being that a referenced file might be any kind of transform, and we feel it's better to have no information than wrong information). That's interesting, I had no idea other users and facilities were doing the same thing.

This said, we only strip extensions for published files whose contents are potentially ambiguous, and that potential ambiguity is invariably a function of using context variables to reference the path to said file. In other words, whenever we're pointing to an "immutable" path (i.e., no context variables involved), there's no reason for us not to keep the extension. So I guess we're good either way. Plus, the plan is to use CTF for everything as soon as we can, so we can put this no-extension silliness to bed once and for all.

In any case, in a worst case scenario, a slow archival process really wouldn't be the end of the world. If there's demand, maybe users could override permitted extensions with an environment variable.

@doug-walker doug-walker added this to config management in v2.2 planning Apr 13, 2022
@remia
Copy link
Collaborator

remia commented Jul 12, 2022

Agree this would be a great feature to improve sharing OCIO configs, here are a few remarks / thoughts:

  • If we are allowing unpacking, how are we dealing with the LUTs file if they come from a network share, for example, that is not available on the other end? Do we need to rewrite all the FileTransform paths to a single LUT folder for unpacking?
  • If we don't need unpacking and OCIO can use the packed representation directly, it looks like this could be considered as analogous in a way to Python freezing? Maybe worth thinking in terms of using the same terminology. This is probably a simplified view on what packing / unpacking could be used for though.
  • The proposal from @zachlewis to allow environment variables that have default in the environment section would make sense to me, you could say freeze the config for this shot. I'm not sure how much extra works that represent though. I can also confirm we tend to rely more and more on environment variables and not only for FileTransform but also for ColorSpaceTransform.
  • Probably missing something obvious but if you parse and resolve (context variables with default values) all the FileTransform in the config, could you just archive only the LUTs that are actually used in the config instead of copying every files in those folder filtering by extension? We also tend to use files without extension but I agree this is not a big deal as there is no need to do that with OCIO.

Some of these points may be just for future iteration of the feature.

@doug-walker
Copy link
Collaborator Author

Thanks for the feedback everyone! Cedrik will be working on this feature soon and I'd like to provide a few updates on our thinking:

  • I've reached out to @SonyDennisAdams to see if it makes sense to reuse any of his team's work.
  • I wrote above that we would not try to handle context variables, but I agree with the comments above and we will try to support them as best as possible. My current thinking is that we would try to make the archive preserve the full context variable functionality rather than "baking" the archive to a specific set of context variables. (If baking to a specific set is needed, that could potentially be done as a follow-up option.)
  • The main restriction looks like it will be the location of the LUT files. We're thinking we will only support configs where the search paths and FileTransform sources are all children of the directory that contains the config file. Context variables could be used in the paths, but only if they follow that restriction. And no absolute paths, only paths relative to the config file. Do you think that is too restrictive? If so, how would you propose re-pathing LUT files? Or do we leave them at their original path, but then it is essentially a tar-bomb that could be problematic if it were ever decompressed.
  • It will not be necessary to unpack the entire archive to use it. LUTs will only be decompressed as needed. So if people feel it is essential to support LUTs in arbitrary locations, perhaps we never support unarchiving back to files and modify the header so standard zip tools won't read it? I would rather not go this route, but we're open to suggestions.
  • We are leaning towards zlib to do the compression. Speed is not really an issue for this feature. Good compression is desirable and zlib likely gets us about 4:1 on LUT files. There are other options that get about 7:1, which would be nice. But we are leaning towards zlib since it is already used so widely (e.g. OpenEXR). Open to suggestions!
  • We are leaning towards minizip-ng to create the zip archive. The archive files could then be decompressible using standard zip tools, if needed, however we will provide an ocioarchive command-line tool for packing, unpacking, and listing contents.

@SonyDennisAdams
Copy link

I'm working on packaging up our stream handler changes described above and I'll send them to Doug.

cedrik-fuoco-adsk added a commit to autodesk-forks/OpenColorIO that referenced this issue Sep 30, 2022
Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>
@doug-walker
Copy link
Collaborator Author

@SonyDennisAdams, thank you for the work you shared. We just posted PR #1696 to implement this feature. It contains a new class called ConfigIOProxy which we think you should be able to use to implement your stream handler technique going forward. We'd appreciate any feedback.

Regarding the discussion in the thread about context variables, we are supporting them. The only limitation (and it's a significant but probably unavoidable one) is that they may only refer to files that are located somewhere below the working directory.

There was some discussion above and at our working group meeting at Siggraph about baking an archive for just a specific setting of the context variables. This PR does not implement that feature, but we feel that it would be straight forward to add as a follow-up. As it stands now, the archive should work with all the same context variable settings as the original config.

Please review the PR and let us know what you think. Thanks!

@SonyDennisAdams
Copy link

@doug-walker

I did not build it but I did look over the changes. They look good and thank you and your team for your work on this.

Some differences from our v1 internal implementation:

  1. It returns a byte array instead of stream. That's fine, and probably works better with your compressed objects.
  2. It handles the config too (we just used Config::CreateFromStream in v1)
  3. It has getLutData but does it also handle .spimtx files? (of course these can be embedded in the config, but sometimes they are external).

@doug-walker
Copy link
Collaborator Author

Thank you for taking a look @SonyDennisAdams ! Regarding your points:

  1. We returned a byte array rather than a stream because it lends itself better to having a Python binding.
  2. Yes, the config may be provided by the proxy too.
  3. Yes, getLutData just returns an array of bytes which then get passed into the existing parsers. So it will handle any OCIO file formats, including spimtx.

doug-walker added a commit that referenced this issue Oct 25, 2022
* OCIOZ archive feature (#1627)

Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>

* Add OCIOZ extension if not present.

Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>

* - Changing getLutData (return vector instead of passing by reference) in order to respect C++ idiom and letting compiler optimize with NRVO and Copy elision
- Improving how cmake find zlib (renamed getZLIB to Findzlib)
- Make sure that ZLIB is statically linked on all platform (it wasn't on windows)
- Fix issue with Python Unit test when using Python 2 (problem discovered on OSX)
- Change how OpenEXR finds ZLIB by using our custom Findzlib
- Patching Minizip-ng makefile to change cmake minimum version

Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>

* - Remove the minizip-ng patch as it doesn't work and further discussion needs to happend.
- Responding to Remi's comments.
  Notable changes:
  - Comments typo and styles fix.
  - Unit tests fixes.
  - Removing the usage of std::ifstream in getLutData and getConfig since it is not needed at all - It was slowing down getLutData.
  - Implicitly converting python list to std::vector<uint8_t> (for getLutData)
  - When archiving a config, the file won't added (overwridden) multiple times if the file matches multiple supported format.

Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>

* - Added implicit conversion from bytearray to std::vector<uint8_t>
- In ConfigIOProxy Python Unit test, changed getLutData to return a bytearray instead of a list. I think this makes more sense and it works with Python 2 and 3.

Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>

* Update comments to be in line with the current implementations

Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>

* - Changing ci workflow in order to install cmake version 3.13.3 for Linux (matrix.vfx-cy == '2019').

Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>

* - Fixing issue with the install location of zlib library
- Removing unused CMAKE variables from ZLIB_CMAKE_ARGS

Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>

Signed-off-by: Cedrik Fuoco <cedrik.fuoco@autodesk.com>
Co-authored-by: Doug Walker <doug.walker@autodesk.com>
@doug-walker
Copy link
Collaborator Author

Closing this as implemented via PR #1696.

@doug-walker doug-walker moved this from config management to Done in v2.2 planning Nov 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature Request New addition to OCIO functionality. Needs Discussion Needs discussion before implmentation, which could result in changes, or a decision not to proceed.
Projects
Development

No branches or pull requests

4 participants