Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to embedding libdeflate into EXRCore #1387

Merged
merged 1 commit into from
Apr 25, 2023

Conversation

kdt3rd
Copy link
Contributor

@kdt3rd kdt3rd commented Apr 23, 2023

Switches to embedding upstream (MIT license) libdeflate vs linking against external library, functions should be private / hidden by default

use EXRCore library in main C++ library

decompression, but ZIP compression is significantly slower. Photographic
images tend to shrink to between 45 and 55 percent of their uncompressed
size.
open source defulate / zlib library. ZIP decompression is faster than PIZ
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

@lgritz
Copy link
Contributor

lgritz commented Apr 23, 2023

Do you want to entertain a discussion of the merits of FetchContent, which needs internet access at "cmake configure" time, versus git submodules, which need internet access only at clone time, versus at least supporting the option to get libdeflate from a previously built local or system install, from the point of view of those users who may build with very restricted internet access?

@meshula
Copy link
Contributor

meshula commented Apr 23, 2023

I'm extremely excited for this one, thanks for taking it on!

@kdt3rd
Copy link
Contributor Author

kdt3rd commented Apr 23, 2023

Do you want to entertain a discussion of the merits of FetchContent, which needs internet access at "cmake configure" time, versus git submodules, which need internet access only at clone time, versus at least supporting the option to get libdeflate from a previously built local or system install, from the point of view of those users who may build with very restricted internet access?

absolutely open to the discussion, I just control with the option and point it to a pre-cloned repo (OPENEXR_DEFLATE_REPO / TAG variables) usually, but submodule would work great as well, and have the bonus of not cloning into the build folder if you do a make clean (I would like to change the images to do the same thing that seem to download every time I wipe the build folder if that is what we want to do)...

images tend to shrink to between 45 and 55 percent of their uncompressed
size.
open source deflate library for zlib compression. ZIP
decompression is faster than PIZ decompression, but ZIP may be
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that ‘compression’ was deleted after ‘ZIP’ and needs to be re-added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how so? I re-wrapped the paragraph, but all the words are there...

@lgritz
Copy link
Contributor

lgritz commented Apr 23, 2023

@kdt3rd I don't have a strong opinion on the right way to embed. I always wonder myself, am indecisive enough that I usually just punt and assume it's already installed somewhere (or maybe write a short install_blah.sh). So I think you should interpret my comment as "Is that the right way to do it? If so, I should do it in other projects, too. But if you didn't have a clear methodology in mind when you wrote it, make sure you are considering the different permutations of internet availability that people may be subject to."

Copy link
Contributor

@meshula meshula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this methodology of embedding libdeflate meets the requirements I have. Thanks so much for taking this on!

cmake/OpenEXRSetup.cmake Outdated Show resolved Hide resolved
docs/TechnicalIntroduction.rst Outdated Show resolved Hide resolved
docs/TechnicalIntroduction.rst Outdated Show resolved Hide resolved
@meshula
Copy link
Contributor

meshula commented Apr 23, 2023

If necessary we can tweak the methodology to allow external libdeflate sources to be provided if it turns out to be an issue. This is how we deal with it in OpenTimelineIO ~ if you override a variable pointing to rapidjson, we use it, otherwise we FetchContent into __deps. Same for our other dependencies.

This meets my needs which are to not introduce a zlib dependency when using only OpenEXRCore, to use deflate for performance, and accomodate the full interring of deflate so that it's symbols are invisible to the thing linking OpenEXRCore.

@kdt3rd
Copy link
Contributor Author

kdt3rd commented Apr 23, 2023

@kdt3rd I don't have a strong opinion on the right way to embed. I always wonder myself, am indecisive enough that I usually just punt and assume it's already installed somewhere (or maybe write a short install_blah.sh). So I think you should interpret my comment as "Is that the right way to do it? If so, I should do it in other projects, too. But if you didn't have a clear methodology in mind when you wrote it, make sure you are considering the different permutations of internet availability that people may be subject to."

Need to do an experiment of what happens when there's a submodule and you package up a release tar.gz - that might be the deciding factor to me, if that pulls in and includes in the tarball or not. I was just following what was already in the setup for zlib / Imath (other than not actually recursing and compiling it). It would arguably be simpler to have a submodule, other than having to teach people to do git submodule --init --update ...

Although would it be weird to do submodule for one (deflate) and fetchcontent for Imath? I think fetchcontent is the right thing to do for Imath, given people might want to be poking around in that source. Does that add to the cognitive load, or simplify it?

bazel/third_party/libdeflate.BUILD Outdated Show resolved Hide resolved
BUILD.bazel Outdated
@@ -141,6 +141,144 @@ cc_library(
deps = [":Iex"],
)

_DEFLATE_SRCS = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved to libdeflate.BUILD - see my other comment

"https://mirror.bazel.build/zlib.net/zlib-1.2.13.tar.gz",
"https://zlib.net/zlib-1.2.13.tar.gz",
],
http_archive(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use maybe again:

maybe(
        http_archive, 
        name = "libdeflate",
        build_file = "@com_openexr//:bazel/third_party/libdeflate.BUILD",
        sha256 = "225d982bcaf553221c76726358d2ea139bb34913180b20823c782cede060affd",
        strip_prefix = "libdeflate-1.18",
        urls = ["https://github.com/ebiggers/libdeflate/archive/refs/tags/v1.18.tar.gz"],
)

The whole point of maybe is that it checks if a dependency with the name libdeflate was already added - and if so it skips it - can happen in large project that use the same transitive dependencies

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I debated removing the maybe, but not being a bazelite, I didn't know if maybe gave it a chance to be overridden by the user (much like is possible in cmake where I can specify the (potentially local) repo...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the maybe is now consistent with the others. Where I was embedding before, there wasn't a choice of maybe given the hijinks, now that it is a normal library, it's fine

@meshula
Copy link
Contributor

meshula commented Apr 23, 2023

Submoduling is problematic when a project itself is one that is likely to be submoduled, which is the case for OpenEXR.

If my project, foo, submodules both libdeflate, and OpenEXR, you've got a problem. When people fetch foo, I've instructed them to do so recursively. That recursiveness is transitive to OpenEXR, so I'll end up with two libdeflates, possibly at different commits.

libdeflate itself is a "leaf" but the problem frequently arises that somewhat large dependencies are roped in by submodules, such as glfw, various test frameworks, and so on. It's yet another variant of a dependency hell.

Sticking to FetchContent, in the case that libdeflate was not otherwise detected, quarantines the libdeflate artifact and source trees to the build directory under __deps/libdeflate and never infects the source tree.

I would only ever recommend submodules when the project in question is "outermost" and unlikely to be itself submoduled.

@kmilos
Copy link

kmilos commented Apr 24, 2023

If necessary we can tweak the methodology to allow external libdeflate sources to be provided if it turns out to be an issue.

Yes, please! The first thing we do as distro packagers is unbundle stuff like this and use shared system libraries.

Edit: libjxl perhaps has the right balance of having these as submodules (optional through scripts), or using system libraries.

@aras-p
Copy link
Contributor

aras-p commented Apr 24, 2023

This looks great! Last time I checked specifically on EXR files, libdeflate was significantly faster than zlib (https://aras-p.info/blog/2021/08/09/EXR-libdeflate-is-great/).

@kdt3rd
Copy link
Contributor Author

kdt3rd commented Apr 24, 2023

If necessary we can tweak the methodology to allow external libdeflate sources to be provided if it turns out to be an issue.

Yes, please! The first thing we do as distro packagers is unbundle stuff like this and use shared system libraries.

Edit: libjxl perhaps has the right balance of having these as submodules (optional through scripts), or using system libraries.

this you can do with specifying the cache variable OPENEXR_LIBDEFLATE_REPO in the cmake chunklet in OpenEXRSetup.cmake (or the branch variable) to override what is used....

@kdt3rd
Copy link
Contributor Author

kdt3rd commented Apr 24, 2023

This looks great! Last time I checked specifically on EXR files, libdeflate was significantly faster than zlib (https://aras-p.info/blog/2021/08/09/EXR-libdeflate-is-great/).

@aras-p what I didn't know is if in switching to libdeflate if I should switch the default compression level from a perf / size tradeoff standpoint - did you do any tests with that when you were looking at the zlib compression level? Either from your suggested switch from 6 -> 4, or from the blog post you've written?

@kmilos
Copy link

kmilos commented Apr 24, 2023

this you can do with specifying the cache variable OPENEXR_LIBDEFLATE_REPO in the cmake chunklet in OpenEXRSetup.cmake (or the branch variable) to override what is used....

For the source, yes. But I cannot unbundle this change and choose a system library instead, like e.g. libjxl lets me do for most of its dependencies. I have 3 options there:

  1. Check out the project w/ git - all the submodules get pulled in and third_party is populated, builds OOTB
  2. I get a source tarball - submodules are not distributed w/ it, and third_party is not populated
    2a. I run deps.sh and get my third_party populated
    2b. I don't care about bundling sources and and want to use system libraries (static or shared)

Most, if not all, packagers wanting to distribute OpenEXR will choose 2b:

Arch: https://github.com/archlinux/svntogit-community/blob/20bd0349291a2fb11d49f6326804aef63a8599c4/trunk/PKGBUILD#L59-L62
Debian/Ubuntu: https://salsa.debian.org/debian-phototools-team/libjxl/-/blob/d5e9a5f19ead9dd70756fc0451a3309eee1cee08/debian/rules#L66-69
Fedora: https://src.fedoraproject.org/rpms/jpegxl/blob/f38/f/jpegxl.spec#_163
MacPorts: https://github.com/macports/macports-ports/blob/3f6602d14eb8b4cb484ace0a4d43e56f540dd8a7/graphics/libjxl/Portfile#L67-L68
MSYS2 MinGW: https://github.com/msys2/MINGW-packages/blob/d839b496700a44e355a1c54e7a190c322b4e6c60/mingw-w64-libjxl/PKGBUILD#L68-L70
openSUSE: https://build.opensuse.org/package/view_file/openSUSE:Leap:15.5:Update/libjxl/libjxl.spec
vcpkg: https://github.com/microsoft/vcpkg/blob/b3e0a1e625a0cc00073efb763e39088e5b75939e/ports/libjxl/portfile.cmake#L31-L33

etc.

Basically, this could be handled identically to Imath (w/ added complexity of local CMake find module since not all libdeflate versions ship CMake config files).

@aras-p
Copy link
Contributor

aras-p commented Apr 24, 2023

what I didn't know is if in switching to libdeflate if I should switch the default compression level from a perf / size tradeoff standpoint - did you do any tests with that when you were looking at the zlib compression level? Either from your suggested switch from 6 -> 4, or from the blog post you've written?

@kdt3rd in my tests level 4 with libdeflate made the most sense too, just like with zlib. There's a tiny loss of comp. ratio, but writing the files is much faster at level 4 compared to level 6.

Switches to embedding upstream (MIT license) libdeflate vs linking
against external library, functions should be private / hidden by
default.

Like Imath, enables one to use a system version or force using an
internally embedded version (OPENEXR_FORCE_INTERNAL_DEFLATE).
Further, the repository used can be controlled with
OPENEXR_DEFLATE_REPO and OPENEXR_DEFLATE_TAG cmake variables.

This centralizes all IETF RFC 1950 (zlib) compression into
the EXRCore library. This implies that the main C++ library
depends on EXRCore.

Signed-off-by: Kimball Thurston <kdt3rd@gmail.com>
@kdt3rd
Copy link
Contributor Author

kdt3rd commented Apr 25, 2023

this you can do with specifying the cache variable OPENEXR_LIBDEFLATE_REPO in the cmake chunklet in OpenEXRSetup.cmake (or the branch variable) to override what is used....

For the source, yes. But I cannot unbundle this change and choose a system library instead, like e.g. libjxl lets me do for most of its dependencies. I have 3 options there:

I have switched to enabling the system version to be used if available, or embedding if not. Further, fixed such that symbols should be hidden if openexr is then used within another project that also uses deflate somehow.

@kdt3rd kdt3rd merged commit ef4b710 into AcademySoftwareFoundation:main Apr 25, 2023
@kdt3rd kdt3rd deleted the switch_to_deflate branch April 25, 2023 05:42
@kmilos
Copy link

kmilos commented Apr 25, 2023

I have switched to enabling the system version to be used if available, or embedding if not. Further, fixed such that symbols should be hidden if openexr is then used within another project that also uses deflate somehow.

Great, thanks!

@cary-ilm cary-ilm added the v3.2.0 label Jul 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants