Support non-8BIM signatures in Photoshop directory #125

Closed
ArjohnKampman opened this issue Oct 19, 2015 · 7 comments

@ArjohnKampman
Contributor

The code in PhotoshopReader mentions that IRB signatures "Should always be 8BIM". However, we frequently find JPEG files that have "PHUT" signatures mixed in. Some sites also report "AgHg" and "DCSR" signatures, for example: http://dev.exiv2.org/issues/800. Although the signature is not what the code expects, the block's data layout is still the same as for 8BIM. Please consider either parsing such blocks or at least skipping them with a warning. Currently, the code throws an exception and misses all blocks after that point. I'm attaching a sample image from the Enron Corpus that has two of these PHUT resource blocks.

friends
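
For illustration, the requested behaviour could be sketched roughly as below. This is not the actual PhotoshopReader code: the class `IrbSketch`, its `Block` type, and the `parse`/`block` helpers are hypothetical names made up for this example, and the resource IDs in `main` are arbitrary. It assumes the usual image resource block layout (4-byte signature, 2-byte resource ID, Pascal-style name padded to an even length, 4-byte big-endian data size, data padded to an even length) and that a non-8BIM block shares that layout, as described above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class IrbSketch {
    /** One parsed resource block (hypothetical type, for illustration only). */
    public static final class Block {
        public final String signature;
        public final int id;
        public final byte[] data;

        Block(String signature, int id, byte[] data) {
            this.signature = signature;
            this.id = id;
            this.data = data;
        }
    }

    /** Returns the 8BIM blocks; other signatures are skipped with a warning. */
    public static List<Block> parse(byte[] segment, List<String> warnings) {
        List<Block> blocks = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(segment); // ByteBuffer is big-endian by default
        // Smallest possible block: 4 (signature) + 2 (id) + 2 (empty name) + 4 (size).
        while (buf.remaining() >= 12) {
            byte[] sig = new byte[4];
            buf.get(sig);
            String signature = new String(sig, StandardCharsets.US_ASCII);
            int id = buf.getShort() & 0xFFFF;

            // Pascal string: length byte plus name, padded so the total is even.
            int nameLength = buf.get() & 0xFF;
            int nameBytes = nameLength + (nameLength % 2 == 0 ? 1 : 0);
            if (nameBytes + 4 > buf.remaining()) {
                warnings.add("Truncated resource name in block " + signature);
                break;
            }
            buf.position(buf.position() + nameBytes);

            int size = buf.getInt();
            if (size < 0 || size > buf.remaining()) {
                warnings.add("Invalid data size in block " + signature);
                break;
            }
            byte[] data = new byte[size];
            buf.get(data);
            if (size % 2 != 0 && buf.hasRemaining()) {
                buf.get(); // consume the pad byte that follows odd-sized data
            }

            if (!"8BIM".equals(signature)) {
                // The whole block has already been consumed, so parsing can continue.
                warnings.add("Skipping resource block with signature " + signature);
                continue;
            }
            blocks.add(new Block(signature, id, data));
        }
        return blocks;
    }

    /** Builds a block with an empty name; only used to make demo input. */
    public static byte[] block(String signature, int id, byte[] data) {
        int padded = data.length + (data.length % 2);
        ByteBuffer b = ByteBuffer.allocate(4 + 2 + 2 + 4 + padded);
        b.put(signature.getBytes(StandardCharsets.US_ASCII));
        b.putShort((short) id);
        b.put((byte) 0).put((byte) 0); // zero-length name plus its pad byte
        b.putInt(data.length);
        b.put(data);
        return b.array(); // any trailing pad byte is already zero
    }

    public static void main(String[] args) {
        byte[] a = block("PHUT", 0x0BB8, new byte[] {1, 2, 3});
        byte[] b = block("8BIM", 0x0404, new byte[] {9, 9});
        byte[] segment = ByteBuffer.allocate(a.length + b.length).put(a).put(b).array();

        List<String> warnings = new ArrayList<>();
        List<Block> blocks = parse(segment, warnings);
        System.out.println(blocks.size() + " block(s) parsed, warnings: " + warnings);
    }
}
```

The point of the sketch is that the signature check happens only after the whole block has been read, so an unexpected signature costs one warning rather than the rest of the directory.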

@drewnoakes
Owner

Thanks, this is a very interesting report. Do you permit that sample image to be added to the sample images repo? That'd be very helpful, if you're in a position to allow that.

The fix should be relatively straightforward. Want to put a PR together for this?

@ArjohnKampman
Contributor Author

The image has been extracted from the original EDRM Enron data set, which was distributed under a Creative Commons license, so it should be OK to include it in the sample images repository.

I don't have time to learn git well enough to come up with a pull request, but I may be able to send a patch later.

@drewnoakes
Owner

I'll take a look at those images as they sound like a useful resource for
the project.

A patch would be fine. No need to let the tools be a barrier to
contributing.

@ArjohnKampman
Contributor Author

FYI: the Enron data set is not a simple set of images. It's a collection of emails from the Enron company. This image was likely an attachment to one such email.

@ArjohnKampman
Contributor Author

I've put together my first git pull request with a fix for this issue; I hope I've done it correctly. GitHub's issue tracker doesn't allow one to attach patch files :-(

drewnoakes mentioned this issue Oct 28, 2015
@drewnoakes
Owner

@ArjohnKampman thanks for the PR and for explaining the Enron data set. Sounds like it might be worth a trawl just for exceptions as I'm guessing there are a lot of images there.

Let's leave this issue open until an image exhibiting this problem ends up in the sample library to prevent regressions.

Will also apply this patch to the .NET implementation.

drewnoakes added a commit to drewnoakes/metadata-extractor-dotnet that referenced this issue Nov 12, 2015
Some images contain IRBs that are not prefixed with "8BIM", and this change allows skipping such blocks instead of registering errors.

This commit gives .NET parity to the Java PR in drewnoakes/metadata-extractor#125
drewnoakes added a commit to drewnoakes/metadata-extractor-images that referenced this issue Nov 15, 2015
@drewnoakes
Owner

The image you provided is now in the image library. The fix has also been cross-ported to the .NET implementation in drewnoakes/metadata-extractor-dotnet@1bc7422.