JPEG conflict in 1.1.0 #140

Closed
jcoyne opened this issue May 31, 2017 · 11 comments

Comments

@jcoyne

jcoyne commented May 31, 2017

$ fits.sh -v
Picked up JAVA_TOOL_OPTIONS: -Xmx128m
1.1.0
[ec2-user@ip-10-0-3-116 ~]$ fits.sh -i dog.jpeg
Picked up JAVA_TOOL_OPTIONS: -Xmx128m
log4j:WARN No appenders could be found for logger (edu.harvard.hul.ois.fits.Fits).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
<?xml version="1.0" encoding="UTF-8"?>
<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/fits/fits_output http://hul.harvard.edu/ois/xml/xsd/fits/fits_output.xsd" version="1.1.0" timestamp="5/31/17 2:59 PM">
  <identification status="CONFLICT">
    <identity format="JPEG File Interchange Format" mimetype="image/jpeg" toolname="FITS" toolversion="1.1.0">
      <tool toolname="Droid" toolversion="6.1.5" />
      <tool toolname="file utility" toolversion="5.22" />
      <version toolname="Droid" toolversion="6.1.5">1.01</version>
      <externalIdentifier toolname="Droid" toolversion="6.1.5" type="puid">fmt/43</externalIdentifier>
    </identity>
    <identity format="JPEG EXIF" mimetype="image/jpeg" toolname="FITS" toolversion="1.1.0">
      <tool toolname="Exiftool" toolversion="10.00" />
      <tool toolname="NLNZ Metadata Extractor" toolversion="3.6GA" />
      <version toolname="Exiftool" toolversion="10.00">1.01</version>
    </identity>
  </identification>
  <fileinfo>
    <size toolname="Jhove" toolversion="1.16">1245975</size>
    <creatingApplicationName toolname="Exiftool" toolversion="10.00">Canon PowerShot SD870 IS</creatingApplicationName>
    <lastmodified toolname="Exiftool" toolversion="10.00" status="CONFLICT">2012:04:08 09:50:57</lastmodified>
    <lastmodified toolname="Tika" toolversion="1.10" status="CONFLICT">2012-04-08T09:50:57</lastmodified>
    <created toolname="Exiftool" toolversion="10.00">2012:04:08 09:50:57</created>
    <filepath toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">/home/ec2-user/dog.jpeg</filepath>
    <filename toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">dog.jpeg</filename>
    <md5checksum toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">ce2204e9ca94f1c80f124dfea460c254</md5checksum>
    <fslastmodified toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">1496242740000</fslastmodified>
  </fileinfo>
  <filestatus />
  <metadata />
  <statistics fitsExecutionTime="935">
    <tool toolname="MediaInfo" toolversion="0.7.75" status="did not run" />
    <tool toolname="OIS Audio Information" toolversion="0.1" status="did not run" />
    <tool toolname="ADL Tool" toolversion="0.1" status="did not run" />
    <tool toolname="VTT Tool" toolversion="0.1" status="did not run" />
    <tool toolname="Droid" toolversion="6.1.5" executionTime="203" />
    <tool toolname="Jhove" toolversion="1.16" executionTime="850" />
    <tool toolname="file utility" toolversion="5.22" executionTime="770" />
    <tool toolname="Exiftool" toolversion="10.00" executionTime="874" />
    <tool toolname="NLNZ Metadata Extractor" toolversion="3.6GA" executionTime="802" />
    <tool toolname="OIS File Information" toolversion="0.2" executionTime="184" />
    <tool toolname="OIS XML Metadata" toolversion="0.2" status="did not run" />
    <tool toolname="ffident" toolversion="0.2" executionTime="628" />
    <tool toolname="Tika" toolversion="1.10" executionTime="822" />
  </statistics>
</fits>

Let me know if you want me to send you a test file somewhere.

@daveneiman
Contributor

Hi,
I'm wondering whether you saw the same conflict with earlier versions of FITS.
It looks like two tools in FITS (Droid and file utility) report the format as "JPEG File Interchange Format", while two others (Exiftool and NLNZ Metadata Extractor) report "JPEG EXIF" for the same JPEG. Providing a file might be helpful, but please first run it through earlier versions of FITS to see whether this output appeared there too.
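
For anyone who wants to see this split without reading the raw XML, here is a minimal sketch that groups the reporting tools by the format they agreed on. It assumes only the element and attribute names visible in the output pasted above; the function name `identities_by_format` and the trimmed `SAMPLE` document are illustrative, not part of FITS.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the FITS output pasted in this issue.
FITS_NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}


def identities_by_format(fits_xml):
    """Map each reported format name to the tools that agreed on it."""
    root = ET.fromstring(fits_xml)
    result = {}
    for identity in root.findall("fits:identification/fits:identity", FITS_NS):
        tools = [t.get("toolname") for t in identity.findall("fits:tool", FITS_NS)]
        result[identity.get("format")] = tools
    return result


# Trimmed copy of the 1.1.0 identification section from this issue.
SAMPLE = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <identification status="CONFLICT">
    <identity format="JPEG File Interchange Format" mimetype="image/jpeg">
      <tool toolname="Droid" toolversion="6.1.5" />
      <tool toolname="file utility" toolversion="5.22" />
    </identity>
    <identity format="JPEG EXIF" mimetype="image/jpeg">
      <tool toolname="Exiftool" toolversion="10.00" />
      <tool toolname="NLNZ Metadata Extractor" toolversion="3.6GA" />
    </identity>
  </identification>
</fits>"""

if __name__ == "__main__":
    for fmt, tools in identities_by_format(SAMPLE).items():
        print(f"{fmt}: {', '.join(tools)}")
```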

@jcoyne
Author

jcoyne commented Jun 1, 2017

@daveneiman it works correctly on 1.0.5. I have not had a chance to test on 1.0.6/7

@daveneiman
Contributor

Most likely, file format normalization was improved in some of the tools embedded within FITS. This is not necessarily an error. Take a look at the format reported by the earlier versions of FITS.

@jcoyne
Author

jcoyne commented Jun 1, 2017

@daveneiman It seems like I should be able to get the height and width from a jpeg image. In 1.0.5 I could, but in 1.1.0 I can't.

@jcoyne
Author

jcoyne commented Jun 1, 2017

Here's the output in 1.0.5. Note the complete <metadata> section, which is missing from the 1.1.0 output:

$ fits.sh -i ~/Downloads/dog.jpeg
Jun 01, 2017 10:36:49 AM edu.harvard.hul.ois.jhove.JhoveBase init
SEVERE: Testing SEVERE level
<?xml version="1.0" encoding="UTF-8"?>
<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/fits/fits_output http://hul.harvard.edu/ois/xml/xsd/fits/fits_output.xsd" version="1.0.5" timestamp="6/1/17 10:36 AM">
  <identification>
    <identity format="JPEG File Interchange Format" mimetype="image/jpeg" toolname="FITS" toolversion="1.0.5">
      <tool toolname="Droid" toolversion="6.1.5" />
      <tool toolname="file utility" toolversion="5.25" />
      <tool toolname="Exiftool" toolversion="10.00" />
      <tool toolname="NLNZ Metadata Extractor" toolversion="3.6GA" />
      <version toolname="Droid" toolversion="6.1.5">1.01</version>
      <externalIdentifier toolname="Droid" toolversion="6.1.5" type="puid">fmt/43</externalIdentifier>
    </identity>
  </identification>
  <fileinfo>
    <size toolname="Jhove" toolversion="1.11">1245975</size>
    <creatingApplicationName toolname="Exiftool" toolversion="10.00">Canon PowerShot SD870 IS</creatingApplicationName>
    <lastmodified toolname="Exiftool" toolversion="10.00" status="CONFLICT">2012:04:08 09:50:57</lastmodified>
    <lastmodified toolname="Tika" toolversion="1.10" status="CONFLICT">2012-04-08T09:50:57</lastmodified>
    <created toolname="Exiftool" toolversion="10.00">2012:04:08 09:50:57</created>
    <filepath toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">/Users/jcoyne/Downloads/dog.jpeg</filepath>
    <filename toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">dog.jpeg</filename>
    <md5checksum toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">ce2204e9ca94f1c80f124dfea460c254</md5checksum>
    <fslastmodified toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">1496241873000</fslastmodified>
  </fileinfo>
  <filestatus />
  <metadata>
    <image>
      <imageWidth toolname="Exiftool" toolversion="10.00">3264</imageWidth>
      <imageHeight toolname="Exiftool" toolversion="10.00">2448</imageHeight>
      <iccProfileName toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">sRGB IEC61966-2.1</iccProfileName>
      <iccProfileVersion toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">2.1.0</iccProfileVersion>
      <YCbCrSubSampling toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">2 2</YCbCrSubSampling>
      <orientation toolname="Exiftool" toolversion="10.00">normal*</orientation>
      <samplingFrequencyUnit toolname="NLNZ Metadata Extractor" toolversion="3.6GA" status="SINGLE_RESULT">in.</samplingFrequencyUnit>
      <xSamplingFrequency toolname="Exiftool" toolversion="10.00">180</xSamplingFrequency>
      <ySamplingFrequency toolname="Exiftool" toolversion="10.00">180</ySamplingFrequency>
      <bitsPerSample toolname="Exiftool" toolversion="10.00">8 8 8</bitsPerSample>
      <digitalCameraManufacturer toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">Canon</digitalCameraManufacturer>
      <digitalCameraModelName toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">Canon PowerShot SD870 IS</digitalCameraModelName>
      <fNumber toolname="Exiftool" toolversion="10.00">2.8</fNumber>
      <exposureTime toolname="Exiftool" toolversion="10.00" status="CONFLICT">0.0008</exposureTime>
      <exposureTime toolname="NLNZ Metadata Extractor" toolversion="3.6GA" status="CONFLICT">8.0E-4</exposureTime>
      <isoSpeedRating toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">80</isoSpeedRating>
      <exifVersion toolname="Exiftool" toolversion="10.00">0220</exifVersion>
      <shutterSpeedValue toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">1/1244</shutterSpeedValue>
      <apertureValue toolname="Exiftool" toolversion="10.00" status="CONFLICT">2.8</apertureValue>
      <apertureValue toolname="NLNZ Metadata Extractor" toolversion="3.6GA" status="CONFLICT">2.96875</apertureValue>
      <exposureBiasValue toolname="Exiftool" toolversion="10.00" status="SINGLE_RESULT">0</exposureBiasValue>
      <maxApertureValue toolname="Exiftool" toolversion="10.00" status="CONFLICT">2.8</maxApertureValue>
      <maxApertureValue toolname="NLNZ Metadata Extractor" toolversion="3.6GA" status="CONFLICT">2.96875</maxApertureValue>
      <meteringMode toolname="Exiftool" toolversion="10.00">Pattern</meteringMode>
      <lightSource toolname="NLNZ Metadata Extractor" toolversion="3.6GA" status="SINGLE_RESULT">unknown</lightSource>
      <flash toolname="Exiftool" toolversion="10.00">Flash did not fire, auto mode</flash>
      <focalLength toolname="Exiftool" toolversion="10.00">4.6</focalLength>
      <sensingMethod toolname="Exiftool" toolversion="10.00">One-chip color area sensor</sensingMethod>
    </image>
  </metadata>
  <statistics fitsExecutionTime="555">
    <tool toolname="MediaInfo" toolversion="0.7.75" status="did not run" />
    <tool toolname="OIS Audio Information" toolversion="0.1" status="did not run" />
    <tool toolname="ADL Tool" toolversion="0.1" status="did not run" />
    <tool toolname="VTT Tool" toolversion="0.1" status="did not run" />
    <tool toolname="Droid" toolversion="6.1.5" executionTime="112" />
    <tool toolname="Jhove" toolversion="1.11" executionTime="473" />
    <tool toolname="file utility" toolversion="5.25" executionTime="430" />
    <tool toolname="Exiftool" toolversion="10.00" executionTime="493" />
    <tool toolname="NLNZ Metadata Extractor" toolversion="3.6GA" executionTime="459" />
    <tool toolname="OIS File Information" toolversion="0.2" executionTime="107" />
    <tool toolname="OIS XML Metadata" toolversion="0.2" status="did not run" />
    <tool toolname="ffident" toolversion="0.2" executionTime="404" />
    <tool toolname="Tika" toolversion="1.10" executionTime="430" />
  </statistics>
</fits>
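
The difference between the two outputs can also be checked programmatically. The following is a rough sketch, assuming the `<metadata>`/`<image>` layout shown in the 1.0.5 output above; `image_dimensions` and the two trimmed sample documents are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the FITS output pasted in this issue.
FITS_NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}


def image_dimensions(fits_xml):
    """Return (width, height) from a FITS document, or None if absent."""
    root = ET.fromstring(fits_xml)
    image = root.find("fits:metadata/fits:image", FITS_NS)
    if image is None:
        # This is the 1.1.0 case: <metadata /> is present but empty.
        return None
    width = image.findtext("fits:imageWidth", namespaces=FITS_NS)
    height = image.findtext("fits:imageHeight", namespaces=FITS_NS)
    if width is None or height is None:
        return None
    return int(width), int(height)


# Trimmed from the 1.0.5 output above.
WITH_METADATA = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <metadata>
    <image>
      <imageWidth toolname="Exiftool" toolversion="10.00">3264</imageWidth>
      <imageHeight toolname="Exiftool" toolversion="10.00">2448</imageHeight>
    </image>
  </metadata>
</fits>"""

# Trimmed from the 1.1.0 output at the top of this issue.
EMPTY_METADATA = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <metadata />
</fits>"""

if __name__ == "__main__":
    print(image_dimensions(WITH_METADATA))
    print(image_dimensions(EMPTY_METADATA))
```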

@daveneiman
Contributor

You are getting no metadata because of the conflict in the identification section. If you either 1) comment out the Droid and file utility tools in the fits.xml config file or 2) add 'jpg' or 'jpeg' to the exclusions list in those tool entries in the same fits.xml, then those tools won't report, the conflict should go away, and you should see metadata again. Looking at the 1.0.5 output you pasted above, most of your metadata appears to come from Exiftool, so that metadata would still appear if you try either suggestion. Curious to hear your outcome.
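
Option 2 could be scripted along these lines. This is only a sketch under assumptions: the attribute name `exclude-exts`, the `<tool class="...">` layout, and the class names in `SAMPLE_CONFIG` are placeholders that should be checked against the fits.xml in your own deployment before use. The helper `add_excluded_extensions` is hypothetical as well.

```python
import xml.etree.ElementTree as ET


def add_excluded_extensions(config_xml, class_suffixes, extensions,
                            attr="exclude-exts"):
    """Add extensions to the exclusion attribute of the matching tool entries.

    `attr` defaults to an ASSUMED attribute name; verify it in your fits.xml.
    """
    root = ET.fromstring(config_xml)
    for el in root.iter():
        # Compare on the local name so a namespaced config also matches.
        if el.tag.rsplit("}", 1)[-1] != "tool":
            continue
        cls = el.get("class", "")
        if not any(cls.endswith(suffix) for suffix in class_suffixes):
            continue
        current = [e for e in el.get(attr, "").split(",") if e]
        for ext in extensions:
            if ext not in current:
                current.append(ext)
        el.set(attr, ",".join(current))
    return ET.tostring(root, encoding="unicode")


# Hypothetical config fragment; real class names and root element will differ.
SAMPLE_CONFIG = """<fits_configuration>
  <tools>
    <tool class="example.tools.Droid" exclude-exts="dng" />
    <tool class="example.tools.FileUtility" />
    <tool class="example.tools.Exiftool" />
  </tools>
</fits_configuration>"""

if __name__ == "__main__":
    print(add_excluded_extensions(
        SAMPLE_CONFIG, ["Droid", "FileUtility"], ["jpg", "jpeg"]))
```

Working on a copy of the config file and comparing it with the original before swapping it in keeps the change easy to revert.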

@jcoyne
Author

jcoyne commented Jun 1, 2017

I'm proposing that FITS should know that "JPEG EXIF" and "JPEG File Interchange Format" are not in conflict. They are the same type, and each tool just has a different name for it. If I have to configure FITS to support my content, it's not as useful to me as a tool for figuring out what my content is.

@daveneiman
Contributor

After consulting with our digital preservation specialist, we agreed that you are correct. We have changed the "format tree" so that JPEG EXIF is now recognized as a more specific type than "JFIF", which eliminates the conflict. In your FITS deployment, find the file xml/fits_format_tree.xml and change the second stanza to the following. This change will go into the next release of FITS. Thank you for bringing this to our attention.

	<branch format="Raw JPEG Stream">
		<branch format="jpeg">
			<branch format="JPEG File Interchange Format">
				<branch format="JPEG EXIF"/>
			</branch>
		</branch>
	</branch>
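
To illustrate why nesting "JPEG EXIF" under "JPEG File Interchange Format" removes the conflict, here is a small sketch of one way a format tree can be used to pick the most specific of several reported formats. It is not FITS's actual consolidation logic, which this thread does not show; `ancestors` and `most_specific` are hypothetical helpers, and only the stanza itself comes from the comment above.

```python
import xml.etree.ElementTree as ET

# The stanza from xml/fits_format_tree.xml as given above.
TREE = """<branch format="Raw JPEG Stream">
  <branch format="jpeg">
    <branch format="JPEG File Interchange Format">
      <branch format="JPEG EXIF"/>
    </branch>
  </branch>
</branch>"""


def ancestors(tree_xml):
    """Map each format in the tree to the list of its ancestors, root first."""
    result = {}

    def walk(node, path):
        fmt = node.get("format")
        result[fmt] = list(path)
        for child in node.findall("branch"):
            walk(child, path + [fmt])

    walk(ET.fromstring(tree_xml), [])
    return result


def most_specific(formats, tree_xml=TREE):
    """Return the deepest format if all formats lie on one branch, else None."""
    if not formats:
        return None
    anc = ancestors(tree_xml)
    if any(f not in anc for f in formats):
        return None  # an unknown format cannot be reconciled by this tree
    deepest = max(formats, key=lambda f: len(anc[f]))
    chain = set(anc[deepest]) | {deepest}
    return deepest if all(f in chain for f in formats) else None


if __name__ == "__main__":
    print(most_specific(["JPEG File Interchange Format", "JPEG EXIF"]))
```

With this stanza the two names from the 1.1.0 output sit on the same branch, so the deeper one can be reported instead of flagging a conflict.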

@jcoyne
Author

jcoyne commented Jun 1, 2017

@daveneiman excellent. Thanks for your hard work.

@conorom

conorom commented Aug 17, 2018

In case anyone else is reading this and wondering whether to upgrade/downgrade or edit fits.xml: confirming that the problem exists at least from 1.0.7 through 1.1.1. It looks like the fix is in 1.2.0 and up.
Funny, it seems this issue is the reason Hyrax essentially recommends sticking with 1.0.5 in its README.
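
The version ranges reported in this thread can be summed up in a short lookup. This is a sketch based only on what commenters observed here (1.0.5 works, 1.0.7 through 1.1.1 show the conflict, 1.2.0 and up appear fixed); anything else, including 1.0.6, is treated as unknown. The function name `conflict_status` is made up for illustration.

```python
def parse_version(version):
    """Turn a dotted version string such as '1.1.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def conflict_status(version):
    """Classify a FITS version using only the reports in this thread."""
    v = parse_version(version)
    if v >= (1, 2, 0):
        return "fixed"          # "Looks like the fix is in 1.2.0 and up"
    if (1, 0, 7) <= v <= (1, 1, 1):
        return "affected"       # confirmed from 1.0.7 through 1.1.1
    if v == (1, 0, 5):
        return "not affected"   # reported as working correctly
    return "unknown"            # e.g. 1.0.6, which nobody here tested


if __name__ == "__main__":
    for candidate in ("1.0.5", "1.0.6", "1.1.0", "1.2.0"):
        print(candidate, conflict_status(candidate))
```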

@jcoyne
Author

jcoyne commented Aug 17, 2018

@conorom I suspect the reason that Hyrax hasn't changed is that nobody has suggested it. If you want to make a PR on that repo, I suspect they would merge it for you.
