Parsing IPTC data in JPEG's APP13 segment - possible 8BIM collision #7040

Open
seruss opened this issue Jan 18, 2024 · 2 comments

seruss commented Jan 18, 2024

ImageMagick version

7.1.1-26 Q16-HDRI x64 83eefaf:20240107

Operating system

Windows

Operating system, version and so on

Windows 10 Enterprise v2009 10.0.19041.3636

Description

We've identified a potential issue in the parsing of IPTC data within the APP13 segment of JPEG files. It appears that the software might be misinterpreting certain sequences as the start of IPTC data, leading to inconsistent or incorrect metadata extraction.

The sequence in question is FF ED C7 30, found at the beginning of an APP13 segment in a JPEG file. This sequence signifies the start of the segment (FF ED) and its length (C7 30, which translates to 50,992 bytes in decimal).
The IPTC data is commonly stored within the APP13 segment, especially when processed with software like Adobe Photoshop.
A specific sequence within this segment, 1C 01 DA C7 0F BD, was initially thought to be an IPTC start header. However, upon closer inspection, it might be part of Photoshop-specific metadata (possibly 8BIM format) and not standard IPTC.
IrfanView successfully reads the IPTC tags from the same files without issues, suggesting that the problem might lie in how the APP13 segment is parsed.
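The length arithmetic above can be checked with a short sketch. It assumes the APP13 payload starts with the usual 14-byte `Photoshop 3.0\0` signature, which the report does not show explicitly; the function name `app13_payload_sizes` is hypothetical and used only for illustration.

```python
import struct

# Assumption: the APP13 payload begins with this 14-byte Photoshop signature.
PHOTOSHOP_SIGNATURE = b"Photoshop 3.0\x00"


def app13_payload_sizes(header: bytes) -> dict:
    """Decode the 4-byte APP13 marker + length and derive payload sizes."""
    marker, length = struct.unpack(">HH", header[:4])
    if marker != 0xFFED:
        raise ValueError("not an APP13 marker")
    # The JPEG segment length counts its own two length bytes.
    payload = length - 2
    # What remains after the signature is the run of 8BIM resource blocks.
    resource_blocks = payload - len(PHOTOSHOP_SIGNATURE)
    return {
        "segment_length": length,
        "payload": payload,
        "resource_blocks": resource_blocks,
    }


sizes = app13_payload_sizes(bytes.fromhex("FFEDC730"))
print(sizes)
```

Under that assumption the resource-block area comes to 50,976 bytes, which equals the `Profile-8bim` size in the identify output. Subtracting a further 12 bytes (a single 8BIM block header with an empty name) gives 50,964, the reported `Profile-iptc` size, so the sizes are at least consistent with one IPTC resource wrapped in one 8BIM block.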

magick identify result:

Profiles:
Profile-8bim: 50976 bytes
Profile-exif: 7985 bytes
Profile-icc: 456 bytes
Profile-iptc: 50964 bytes
Custom Field 19[1,218]: 0x00000000: 3015ffff ffff69ff ffff0dff ffffffff ff06ffff -0-ý-ľúičąň-Î-ĆÖę‹-Ń
0x00000014: ffff37ff ff7d3fff ff09ffff 4affffff 6116ff58 üŕő7żé}?§ţ ężJÍŔ¬a-“
0x00000028: 6b6f6e43 0cff56ff ff6dffff ff371fff ffffffff XkonC--Výďm¶łé7-ÚßÓł
0x0000003c: 15ff70ff 00 ô-Ąp-

IrfanView screenshot as well as original image attached below.

Steps to Reproduce

magick identify -verbose iptc.jpeg

Images

[Two image attachments: "iptc" and "image"]

@urban-warrior
Member

Not sure what the issue is. ImageMagick extracts the IPTC profile but does not extract meta-data from the profile, instead preferring EXIF. You can view the IPTC profile with this command:

$ magick iptc_image.jpg iptcData.iptc
$ cat iptcData.iptc
x
John (Felix von Jascheroff) und Laura (Chryssanthi Kavazi, r.) sind �berrumpelt,
 als Lydia "Janani" (Michaela Hanser) �ber Weihnachten zu Besuch kommt.

+++ Die Verwendung des sendungsbezogenen Materials ist nur mit dem Hinweis und V
erlinkung auf RTL+ gestattet. +++tFoto: RTL / Rolf BaumgartnerP
Folge 7924<10254520231026(�Die Verwendung des Materials von RTL Deutschland ist 
nur zur redaktionellen Berichterstattung im Zusammenhang mit der Sendung unter A
ngabe der Credits/Quellenangabe und Beachtung der unter media.rtl.com genannten 
AGB erlaubt.Gute Zeiten, schlechte Zeiten

Author

seruss commented Jan 22, 2024

@urban-warrior I didn't mention that the problem originally occurred while using Magick.NET. With the command you suggested, magick did save a proper IPTC profile. However, when calling the C API function GetImageProfile, the returned data contains what I assume is Photoshop-specific information in addition to the IPTC values. That could be causing problems for Magick.NET, which parses IPTC tags and values internally. I've attached the result of the API call.
GetProfileFromApi.txt
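
If the blob returned for the profile really is wrapped in Photoshop resource blocks, a consumer could unwrap it before parsing IPTC datasets. The following is a minimal sketch of that idea, not what ImageMagick or Magick.NET actually do. It relies on the documented 8BIM block layout (signature, 2-byte resource ID, even-padded Pascal-string name, 4-byte size, even-padded data) and on resource ID 0x0404 for IPTC-NAA; the function names `iter_8bim_blocks` and `extract_iptc` are hypothetical.

```python
import struct

IPTC_RESOURCE_ID = 0x0404  # Photoshop image resource ID for IPTC-NAA data


def iter_8bim_blocks(data: bytes):
    """Yield (resource_id, payload) for each well-formed 8BIM block."""
    pos = 0
    while pos + 12 <= len(data) and data[pos:pos + 4] == b"8BIM":
        (resource_id,) = struct.unpack_from(">H", data, pos + 4)
        # Pascal string: one length byte plus the name, padded to even size.
        name_field = 1 + data[pos + 6]
        name_field += name_field % 2
        size_pos = pos + 6 + name_field
        if size_pos + 4 > len(data):
            break  # truncated header, stop instead of reading past the end
        (size,) = struct.unpack_from(">I", data, size_pos)
        start = size_pos + 4
        yield resource_id, data[start:start + size]
        # Resource data is also padded to an even length.
        pos = start + size + (size % 2)


def extract_iptc(blob: bytes):
    """Return raw IPTC bytes whether or not the blob is 8BIM-wrapped."""
    if blob[:1] == b"\x1c":  # already starts with an IPTC dataset marker
        return blob
    for resource_id, payload in iter_8bim_blocks(blob):
        if resource_id == IPTC_RESOURCE_ID:
            return payload
    return None


# Synthetic example: one IPTC dataset (record 2, dataset 120) in one block.
iptc = b"\x1c\x02\x78\x00\x05hello"
wrapped = b"8BIM" + struct.pack(">H", IPTC_RESOURCE_ID) + b"\x00\x00" \
    + struct.pack(">I", len(iptc)) + iptc
print(extract_iptc(wrapped) == iptc)
```

Checking the first byte for 0x1C before looking for `8BIM` lets the same helper accept both the bare profile written by the `magick` command and a wrapped blob, so the caller does not need to know which form it received.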
