fix: Fix UB in exif parsing of corrupt data #5113
Merged
lgritz merged 2 commits into AcademySoftwareFoundation:main on Mar 28, 2026
Conversation
Corrupted exif data could put a value in a "tiff data type" field that is not one of the valid enum values. That's UB. Identified by running the sanitizer with a newer clang than we usually do.
Signed-off-by: Larry Gritz <lg@larrygritz.com>
jessey-git
reviewed
Mar 28, 2026
src/libOpenImageIO/exif.cpp
Outdated
inline bool
validate_TIFFDataType(TIFFDataType tifftype)
Contributor
No one calls this new function?
Collaborator
Author
Ah, you're right. That was from an earlier edit, before I changed exactly where I did the check. Will remove.
TypeDesc
tiff_datatype_to_typedesc(const TIFFDirEntry& dir)
Contributor
Might as well check for values <= TIFF_NOTYPE too?
Collaborator
Author
The field we're checking is unsigned and TIFF_NOTYPE is 0.
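To illustrate the point about the unsigned field, here is a minimal sketch of a range check done on the raw value before it is converted to an enum. Everything in it is hypothetical: `SketchTiffType`, `sketch_raw_type_is_valid`, and `sketch_checked_type` are stand-in names, and the enumerators are a shortened illustration, not the real libtiff definitions or the code from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in enum for illustration; NOT the real libtiff TIFFDataType.
enum SketchTiffType {
    SKETCH_NOTYPE = 0,
    SKETCH_BYTE   = 1,
    SKETCH_ASCII  = 2,
    SKETCH_SHORT  = 3,
    SKETCH_LONG   = 4,
    SKETCH_LAST   = SKETCH_LONG  // highest enumerator this sketch knows about
};

// Validate the raw 16-bit field before it is ever converted to the enum.
// The field is unsigned, so it can never be below SKETCH_NOTYPE (0);
// only the upper bound needs a test.
constexpr bool sketch_raw_type_is_valid(std::uint16_t raw)
{
    return raw <= static_cast<std::uint16_t>(SKETCH_LAST);
}

// Convert only after validation; unrecognized values map to SKETCH_NOTYPE.
constexpr SketchTiffType sketch_checked_type(std::uint16_t raw)
{
    return sketch_raw_type_is_valid(raw) ? static_cast<SketchTiffType>(raw)
                                         : SKETCH_NOTYPE;
}

// Small usage demo (hypothetical helper, only here to show the calls).
inline void sketch_demo()
{
    assert(sketch_checked_type(3) == SKETCH_SHORT);
    assert(sketch_checked_type(999) == SKETCH_NOTYPE);  // corrupt value
}
```

A check like this only helps when the raw integer is still available at the call site, which is why it has no lower-bound test and why it cannot cover values that arrive already typed as the enum.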
Signed-off-by: Larry Gritz <lg@larrygritz.com>
lgritz
added a commit
that referenced
this pull request
Mar 30, 2026
#5120)
PR #5113 didn't fix all the problems after all. Revert it, because it's not needed after the real solution.

The real solution is to mark `tiff_datatype_to_typedesc()` with OIIO_NO_SANITIZE_UNDEFINED to have the sanitizer skip it.

What's happening is that corrupted files end up with a value in the data type field that is not a valid value of the TIFFDataType enum, and ubsan is flagging that. BUT... `tiff_datatype_to_typedesc()` itself is actually safe in that circumstance, because it's got a `default` clause that handles the unknown enum values just fine.

I've been running around in circles trying to find something to do *within* that function to make it "safe" (by the sanitizer's reckoning). Trying to check for valid values prior to converting to a TIFFDataType is just too hard, partly because there are places where that conversion happens unchecked inside libtiff (C language, just does a cast), so the value comes to us as a TIFFDataType that is already invalid. It's easier to just mark the whole function as being ignored by ubsan, given that we can see by inspection that it has totally deterministic and desired behavior for the illegal values.

Making this spurious ubsan error disappear allows us to upgrade the container version we are using for the "sanitizer" CI test variant, so we also push that all the way up to 2026.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
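The approach described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual OpenImageIO code: `DEMO_NO_SANITIZE_UNDEFINED` is a hypothetical macro standing in for `OIIO_NO_SANITIZE_UNDEFINED` (whose real definition is not shown in this conversation), and `DemoTiffType`, `DemoBaseType`, and `demo_datatype_to_basetype` are made-up names. The macro is assumed to expand to the `no_sanitize("undefined")` function attribute on Clang and GCC 8 or newer, and to nothing elsewhere.

```cpp
#include <cassert>

// Hypothetical macro for this sketch; the real OIIO_NO_SANITIZE_UNDEFINED
// may be defined differently.
#if defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 8)
#    define DEMO_NO_SANITIZE_UNDEFINED __attribute__((no_sanitize("undefined")))
#else
#    define DEMO_NO_SANITIZE_UNDEFINED
#endif

// Hypothetical stand-in enums; NOT the real libtiff or OpenImageIO types.
enum DemoTiffType {
    DEMO_NOTYPE = 0,
    DEMO_BYTE   = 1,
    DEMO_ASCII  = 2,
    DEMO_SHORT  = 3,
    DEMO_LONG   = 4
};

enum class DemoBaseType { Unknown, UInt8, String, UInt16, UInt32 };

// The switch has a default clause, so any value it does not recognize
// falls through to Unknown. The attribute asks the sanitizer not to
// instrument this one function.
DEMO_NO_SANITIZE_UNDEFINED
inline DemoBaseType demo_datatype_to_basetype(DemoTiffType t)
{
    switch (t) {
    case DEMO_BYTE: return DemoBaseType::UInt8;
    case DEMO_ASCII: return DemoBaseType::String;
    case DEMO_SHORT: return DemoBaseType::UInt16;
    case DEMO_LONG: return DemoBaseType::UInt32;
    default: return DemoBaseType::Unknown;
    }
}

// Small usage demo (hypothetical helper, only here to show the calls).
inline void demo_usage()
{
    assert(demo_datatype_to_basetype(DEMO_SHORT) == DemoBaseType::UInt16);
    assert(demo_datatype_to_basetype(DEMO_NOTYPE) == DemoBaseType::Unknown);
}
```

The demo deliberately passes only valid enumerators, since producing an out-of-range enum value in the caller would itself be flagged. The annotation only silences the check inside the annotated function.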
lgritz
added a commit
to lgritz/OpenImageIO
that referenced
this pull request
Mar 30, 2026
…n#5113)
lgritz
added a commit
to lgritz/OpenImageIO
that referenced
this pull request
Mar 30, 2026
…ove sanitizer test to 2026 (AcademySoftwareFoundation#5120)
ssh4net
pushed a commit
to ssh4net/OpenImageIO
that referenced
this pull request
Apr 2, 2026
…n#5113)
Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
ssh4net
pushed a commit
to ssh4net/OpenImageIO
that referenced
this pull request
Apr 2, 2026
…ove sanitizer test to 2026 (AcademySoftwareFoundation#5120)
Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
ssh4net
pushed a commit
to ssh4net/OpenImageIO
that referenced
this pull request
Apr 2, 2026
…n#5113)
Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad <shaamaan@gmail.com>
ssh4net
pushed a commit
to ssh4net/OpenImageIO
that referenced
this pull request
Apr 2, 2026
…ove sanitizer test to 2026 (AcademySoftwareFoundation#5120)
Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad <shaamaan@gmail.com>