Unit tests for images from metadata-extractor-images #34

RicardoBochnia · 2014-12-17T18:45:42Z

This is intended as an umbrella issue that spawns multiple sub-issues, because doing all of this at once would be quite time-consuming.

I think it would be great to have unit tests for all database images. These tests should check that all tags are recognized and contain correct values. The correct values should be obtained by using at least two other metadata extraction tools, if possible.

It would be sufficient to run these test cases manually a few days before a new version is released. I think it would be best to use a separate package for these test cases. Also, the file path should not be hardcoded, since the database is not part of the repository.
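One way to avoid a hardcoded path is to resolve the database location at run time. The following is only a minimal sketch: the class name, the system property `metadata.extractor.images` and the environment variable `METADATA_EXTRACTOR_IMAGES` are hypothetical names chosen for illustration, not something metadata-extractor defines.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical helper: finds the image database without a hardcoded path. */
final class ImageDatabaseLocator {
    // Assumed names for illustration only; not part of metadata-extractor.
    static final String PROPERTY = "metadata.extractor.images";
    static final String ENV_VAR = "METADATA_EXTRACTOR_IMAGES";

    /** Returns the database directory, or empty if it is not configured or does not exist. */
    static Optional<Path> locate() {
        // The system property takes precedence over the environment variable.
        String value = System.getProperty(PROPERTY);
        if (value == null || value.isEmpty()) {
            value = System.getenv(ENV_VAR);
        }
        if (value == null || value.isEmpty()) {
            return Optional.empty();
        }
        Path dir = Paths.get(value);
        return Files.isDirectory(dir) ? Optional.of(dir) : Optional.empty();
    }
}
```

A test could then skip itself when `locate()` returns empty (for example via JUnit 4's `Assume.assumeTrue`), so developers without the database are not affected.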

What do you think? See also drewnoakes/metadata-extractor#28 for the reason why I think these test cases may be useful to detect/prevent bugs – especially the more subtle ones.

drewnoakes · 2014-12-17T19:41:16Z

I think this is a good idea, yes.

Currently I use the image database and run

java com.drew.tools.ProcessAllImagesInFolderUtility -text /path/to/image/database

This updates the files in the metadata subfolder, which is under version control. I regularly run this command and check the repo for diffs. This currently serves as a decent approach to regression testing, although the breadth of the database should really be increased. (Can you think of any creative ways to expand this, by the way?)

An extension to this would be to also run other metadata tools (such as the excellent exiftool) and compare the output files.

To support such automated comparisons will require some kind of adaptation between the formats to produce comparable data. I'm not totally clear on how to do this however. There are unlikely to be unambiguously joinable identifiers between the data sets.
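As a rough illustration of what such an adaptation might look like, the sketch below joins two tag sets on a normalized name. It assumes the only difference between the tools' names is case, spacing and punctuation (for example "Exposure Time" versus "ExposureTime"), which is a heuristic and will not hold for every tag. The class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: joins tag sets from two tools on a normalized tag name. */
final class TagSetComparison {
    /** Keeps only lower-case letters and digits, so "Exposure Time" and "ExposureTime" match. */
    static String normalizeName(String name) {
        return name.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }

    /** Returns a copy of the map keyed by normalized tag name. */
    static Map<String, String> normalizeKeys(Map<String, String> tags) {
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : tags.entrySet()) {
            out.put(normalizeName(e.getKey()), e.getValue());
        }
        return out;
    }

    /** Normalized names present in left but absent from right. */
    static Set<String> onlyIn(Map<String, String> left, Map<String, String> right) {
        Set<String> names = new TreeSet<>(normalizeKeys(left).keySet());
        names.removeAll(normalizeKeys(right).keySet());
        return names;
    }
}
```

Calling `onlyIn` in both directions gives the tags each tool reports that the other does not. Tags whose names differ by more than formatting would need an explicit mapping table.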

We should not require all developers to download the image database to their machine. These tests would probably be manual and optional as you say.

What's more, it's not clear what a failure would mean. Are we saying that metadata-extractor should produce exactly the same output as other tools? It may produce more or less. Also many values are equivalent but may have different textual representations when displayed on screen. The *Descriptor classes do loads of formatting on values to be displayed.
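To make the point about equivalent values concrete, here is a small sketch of a tolerant comparison. It treats two strings as equal if both parse to the same number (including simple rationals such as "1/100") and otherwise falls back to a case-insensitive text comparison. The class is hypothetical, and it deliberately does not handle units or other descriptor formatting.

```java
import java.util.OptionalDouble;

/** Hypothetical sketch: decides whether two textual tag values mean the same thing. */
final class ValueEquivalence {
    /** Parses "0.01", "10" or a rational like "1/100"; empty if the text is not a finite number. */
    static OptionalDouble parseNumber(String text) {
        String s = text.trim();
        try {
            int slash = s.indexOf('/');
            double value = slash < 0
                    ? Double.parseDouble(s)
                    : Double.parseDouble(s.substring(0, slash))
                            / Double.parseDouble(s.substring(slash + 1));
            return Double.isFinite(value) ? OptionalDouble.of(value) : OptionalDouble.empty();
        } catch (NumberFormatException e) {
            // Text with units, dates and so on ends up here.
            return OptionalDouble.empty();
        }
    }

    /** Numeric comparison with a small relative tolerance, else case-insensitive text comparison. */
    static boolean equivalent(String a, String b) {
        OptionalDouble x = parseNumber(a);
        OptionalDouble y = parseNumber(b);
        if (x.isPresent() && y.isPresent()) {
            double scale = Math.max(1.0, Math.max(Math.abs(x.getAsDouble()), Math.abs(y.getAsDouble())));
            return Math.abs(x.getAsDouble() - y.getAsDouble()) <= 1e-9 * scale;
        }
        return a.trim().equalsIgnoreCase(b.trim());
    }
}
```

With this, "1/100" and "0.01" compare as equal, while "1/100 sec" and "0.01" do not, which shows how quickly formatted output defeats a naive comparison.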

For now, it may be interesting to produce a /exiftool folder which contains the equivalent .txt files. We as developers can use this to check for disagreements over the images. No doubt looking through these will bring to light whether a suitable technique for comparing output exists.

RicardoBochnia · 2014-12-17T22:22:38Z

> although the breadth of the database should really be increased.

We could use public domain or Creative Commons licensed images. The only problem is that most of them would require us to credit the creator, so we would need a credits text file in the database repo. By the way, if you are using Chrome, there is a plugin called "EXIF Viewer" which shows Exif data on mouse hover. This would make it easier to find an image from a particular camera model when browsing collections of public domain or Creative Commons licensed images.

> To support such automated comparisons will require some kind of adaptation between the formats to produce comparable data. I'm not totally clear on how to do this however. There are unlikely to be unambiguously joinable identifiers between the data sets.

I have a vague idea of how this could work. I will look into this issue after New Year's Eve.

> For now, it may be interesting to produce a /exiftool folder which contains the equivalent .txt files. We as developers can use this to check for disagreements over the images. No doubt looking through these will bring to light whether a suitable technique for comparing output exists.

I agree.

drewnoakes · 2014-12-18T13:07:44Z

> if you are using Chrome there exists a plugin called "EXIF Viewer" which shows Exif on mouse hover

Fantastic!

Let's put together a bash script that makes an /exiftool folder alongside the /metadata one in the sample image library. I also won't have time to look at this until the new year.
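The comment above proposes a bash script. The same idea is sketched here in Java, the project's language, purely as an illustration. It assumes `exiftool` is on the PATH, that each image's output belongs in an `exiftool` folder next to the image (mirroring what is described for the metadata folder), and that the options `-a -G1 -s` give a useful dump. The class name, the layout and the choice of options are all assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: writes exiftool's text output for an image into an exiftool folder. */
final class ExiftoolDump {
    /** Maps dir/photo.jpg to dir/exiftool/photo.jpg.txt (assumed layout). */
    static Path outputFor(Path image) {
        Path dir = image.toAbsolutePath().getParent();
        return dir.resolve("exiftool").resolve(image.getFileName() + ".txt");
    }

    /** Runs exiftool on one image and returns its exit code; requires exiftool on the PATH. */
    static int dump(Path image) throws IOException, InterruptedException {
        Path out = outputFor(image);
        Files.createDirectories(out.getParent());
        // -a: keep duplicate tags, -G1: print group names, -s: short tag names
        Process process = new ProcessBuilder("exiftool", "-a", "-G1", "-s", image.toString())
                .redirectOutput(out.toFile())
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        return process.waitFor();
    }
}
```

Calling `dump` for every image found under the database root would produce files that can be versioned and diffed in the same way as the existing metadata output.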

drewnoakes · 2020-05-11T04:31:03Z

I've transferred this issue to the metadata-extractor-images repo where it's a better fit. See also #20.

drewnoakes transferred this issue from drewnoakes/metadata-extractor May 11, 2020
