Unit tests for images from metadata-extractor-images #34

Open
RicardoBochnia opened this issue Dec 17, 2014 · 4 comments

@RicardoBochnia

This issue is intended as a super issue that spawns multiple sub-issues, because doing all of this at once would be quite time-consuming.

I think it would be great to have unit tests for all database images. These tests should check that all tags are recognized and contain correct values. The correct values should be obtained by using at least two other metadata extraction tools, if possible.

It would be sufficient to run these test cases manually a few days before a new version is released. I think it would be best to use a separate package for these test cases. Also, the file path should not be hardcoded, since the database is not part of the repository.
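
As a rough sketch of how the database location could be supplied from outside instead of being hardcoded, a launcher script might read it from an environment variable. The variable name `IMAGE_DB_DIR` and the function name `resolve_image_db` are made up for illustration; nothing in the project defines them.

```shell
#!/usr/bin/env bash
# Sketch only: IMAGE_DB_DIR is a hypothetical variable name, not something
# metadata-extractor defines.

# Prints the image database directory, or returns 1 when the variable is
# unset or does not point at a directory.
resolve_image_db() {
    local dir="${IMAGE_DB_DIR:-}"
    if [ -z "$dir" ]; then
        echo "IMAGE_DB_DIR is not set; skipping image database tests" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "IMAGE_DB_DIR does not point at a directory: $dir" >&2
        return 1
    fi
    printf '%s\n' "$dir"
}
```

A test runner could then start with `db="$(resolve_image_db)" || exit 0`, so that machines without the database skip these tests instead of failing.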

What do you think? See also drewnoakes/metadata-extractor#28 for the reason why I think these test cases may be useful to detect/prevent bugs – especially the more subtle ones.

@drewnoakes
Owner

I think this is a good idea, yes.

Currently I use the image database and run

java com.drew.tools.ProcessAllImagesInFolderUtility -text /path/to/image/database

This updates the files in the metadata subfolder, which is under version control. I regularly run this command and check the repo for diffs. This currently serves as a decent approach to regression testing, although the breadth of the database should really be increased. (Can you think of any creative ways to expand this, by the way?)

An extension to this would be to also run other metadata tools (such as the excellent exiftool) and compare the output files.

To support such automated comparisons will require some kind of adaptation between the formats to produce comparable data. I'm not totally clear on how to do this however. There are unlikely to be unambiguously joinable identifiers between the data sets.
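
One possible starting point, purely as a sketch: reduce each tool's output to `tag name<TAB>value` lines, normalize the tag names, and join on the result. The function names below are hypothetical, and the tab-separated input is an assumption; the real output of either tool would need its own parsing first. Tags that appear in several directories would collide under this scheme, which is exactly the ambiguity mentioned above.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes each tool's output has already been reduced to
# "tag name<TAB>value" lines; the real output formats differ.

# Lower-cases the tag name and strips everything except letters and digits,
# so names such as "Exposure Time" and "ExposureTime" both become
# "exposuretime". Output is sorted by that key, ready for join.
normalize_tags() {
    awk -F '\t' -v OFS='\t' '{
        key = tolower($1)
        gsub(/[^a-z0-9]/, "", key)
        print key, $2
    }' | LC_ALL=C sort -t "$(printf '\t')" -k1,1
}

# Prints "key<TAB>value from $1<TAB>value from $2" for every tag that is
# present in both files but whose values differ as text.
compare_tags() {
    local tab a b
    tab="$(printf '\t')"
    a="$(mktemp)" || return 1
    b="$(mktemp)" || return 1
    normalize_tags < "$1" > "$a"
    normalize_tags < "$2" > "$b"
    LC_ALL=C join -t "$tab" "$a" "$b" | awk -F '\t' '($2 "") != ($3 "")'
    rm -f "$a" "$b"
}
```

Tags found by only one tool are dropped by the join here; listing them separately would be a natural next step.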

We should not require all developers to download the image database to their machine. These tests would probably be manual and optional as you say.

What's more, it's not clear what a failure would mean. Are we saying that metadata-extractor should produce exactly the same output as other tools? It may produce more or less. Also many values are equivalent but may have different textual representations when displayed on screen. The *Descriptor classes do loads of formatting on values to be displayed.

For now, it may be interesting to produce a /exiftool folder which contains the equivalent .txt files. We as developers can use this to check for disagreements over the images. No doubt looking through these will bring to light whether a suitable technique for comparing output exists.

@RicardoBochnia
Author

although the breadth of the database should really be increased.

We could use public domain/Creative Commons licensed images. The only problem is that most of them require us to credit the creator, so we would need a credits text file in the database repo. By the way, if you are using Chrome, there is a plugin called "EXIF Viewer" that shows Exif data on mouse hover. This would make it easier to search for an image from a particular camera model if we used image collections with public domain/Creative Commons licensed images.

To support such automated comparisons will require some kind of adaptation between the formats to produce comparable data. I'm not totally clear on how to do this however. There are unlikely to be unambiguously joinable identifiers between the data sets.

I have a vague idea of how this could work. I will look into this issue after New Year's Eve.

For now, it may be interesting to produce a /exiftool folder which contains the equivalent .txt files. We as developers can use this to check for disagreements over the images. No doubt looking through these will bring to light whether a suitable technique for comparing output exists.

I agree.

@drewnoakes
Owner

if you are using Chrome, there is a plugin called "EXIF Viewer" that shows Exif data on mouse hover

Fantastic!

Let's put together a bash script that makes an /exiftool folder alongside the /metadata one in the sample image library. I also won't have time to look at this until the new year.
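
A first rough sketch of such a script, to be refined later. It assumes `exiftool` is on the PATH and that the image files sit directly in the given folder, next to its metadata subfolder. The function name and the `<image name>.txt` naming are guesses, not an agreed convention.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes exiftool is on the PATH and that image files sit
# directly in the given folder, next to its metadata subfolder.

# Writes exiftool's text output for every regular file in folder $1
# to $1/exiftool/<file name>.txt.
write_exiftool_folder() {
    local dir="$1" img
    mkdir -p "$dir/exiftool" || return 1
    for img in "$dir"/*; do
        # Only regular files are processed, so the metadata and exiftool
        # subfolders themselves are skipped.
        [ -f "$img" ] || continue
        # -a keeps duplicate tags, -G1 prefixes each tag with its group name.
        exiftool -a -G1 "$img" > "$dir/exiftool/$(basename "$img").txt"
    done
}
```

Calling `write_exiftool_folder` once for each folder that has a metadata subfolder would then give us text files to read side by side.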

@drewnoakes transferred this issue from drewnoakes/metadata-extractor on May 11, 2020
@drewnoakes
Owner

I've transferred this issue to the metadata-extractor-images repo where it's a better fit. See also #20.
