JSON output handler via jhove-gui will only output information for one file where other handlers list all objects validated #667

Closed
ross-spencer opened this issue Apr 8, 2021 · 3 comments · Fixed by #728
Labels
P1 High priority issues to be scheduled in the upcoming release

@ross-spencer

When using jhove-gui one can scan multiple objects at a time and see multiple results in the window. When those results are exported, the expectation is that every result appears in the output.

For JSON output, only one of the results is returned, i.e. the exported JSON describes just one file. Take this example (it doesn't reveal much on its own, as it shows just one file, but two files were scanned):

{
	"jhove": {
		"name": "JhoveView",
		"release": "1.25.0-SNAPSHOT",
		"date": "2021-04-07",
		"executionTime": "2021-04-08T09:50:13+02:00",
		"repInfo": {
			"uri": "/tmp/opf/Summary.pdf",
			"reportingModule": {
				"name": "PDF-hul",
				"release": "1.12.2",
				"date": "2019-12-10"
			},
			"lastModified": "2021-04-08T00:05:16+02:00",
			"size": 132877,
			"format": "PDF",
			"version": "1.4",
			"status": "Not well-formed",
			"sigMatch": ["PDF-hul"],
			"messages": [{
				"message": "Unexpected exception java.lang.NullPointerException",
				"severity": "error",
				"id": "PDF-HUL-94"
			}],
			"mimeType": "application/pdf",
			"properties": [{
				"PDFMetadata": [{
					"Objects": 38
				}, {
					"FreeObjects": 3
				}, {
					"IncrementalUpdates": 2
				}, {
					"DocumentCatalog": [{
						"PageLayout": "SinglePage"
					}, {
						"PageMode": "UseNone"
					}]
				}, {
					"Info": [{
						"Title": "DocuSign-Zertifikat"
					}, {
						"Author": ""
					}, {
						"Subject": "DocuSign-Zertifikat"
					}]
				}, {
					"ID": ["0x35663464323637302d633562612d346261622d623964322d636163386437306164313131", "0x9d91f58b1574d3a15937b1cd2b36d684"]
				}, {
					"XMP": "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\" x:xmptk=\"Adobe XMP Core 5.1.0-jc003\">\n  <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n    <rdf:Description rdf:about=\"\"\n        xmlns:pdf=\"http://ns.adobe.com/pdf/1.3/\"\n        xmlns:xmp=\"http://ns.adobe.com/xap/1.0/\"\n        xmlns:dc=\"http://purl.org/dc/elements/1.1/\"\n      pdf:Producer=\"PDFKit.NET 21.1.102.20091\"\n      pdf:Keywords=\"\"\n      pdf:PDFVersion=\"1.4\"\n      xmp:CreateDate=\"2021-04-08T00:05:16-07:00\"\n      xmp:ModifyDate=\"2021-04-08T00:05:16-07:00\"\n      xmp:CreatorTool=\"\"\n      xmp:MetadataDate=\"2021-04-08T00:05:16-07:00\"\n      dc:format=\"application/pdf\">\n      <dc:creator>\n        <rdf:Seq>\n          <rdf:li/>\n        </rdf:Seq>\n      </dc:creator>\n      <dc:subject>\n        <rdf:Bag/>\n      </dc:subject>\n      <dc:description>\n        <rdf:Alt>\n          <rdf:li xml:lang=\"x-default\">DocuSign-Zertifikat</rdf:li>\n        </rdf:Alt>\n      </dc:description>\n      <dc:title>\n        <rdf:Alt>\n          <rdf:li xml:lang=\"x-default\">DocuSign-Zertifikat</rdf:li>\n        </rdf:Alt>\n      </dc:title>\n    </rdf:Description>\n  </rdf:RDF>\n</x:xmpmeta>"
				}, {
					"Pages": [{
						"Page": [{
							"Sequence": 1
						}, {
							"Annotations": [{
								"Annotation": [{
									"Subtype": "Widget"
								}, {
									"Rect": [0, 0, 0, 0]
								}, {
									"Flags": 132
								}, {
									"AppearanceDictionary": true
								}]
							}]
						}]
					}]
				}]
			}]
		}
	}
}
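
A quick way to confirm what a saved JSON report contains is to list its repInfo URIs. The following is only a minimal sketch: the function name is made up for illustration, and the list form of "repInfo" is an assumed shape for a report that covers several files, not something the output above confirms.

```python
import json


def rep_info_uris(report_text):
    """Return the URI of every repInfo entry found in a JHOVE JSON report."""
    report = json.loads(report_text)
    rep_info = report["jhove"].get("repInfo", [])
    # The report above holds a single object under "repInfo". A list is an
    # ASSUMED shape for a report that covers more than one file.
    if isinstance(rep_info, dict):
        rep_info = [rep_info]
    return [entry["uri"] for entry in rep_info]


# Trimmed-down version of the report shown above.
sample = '{"jhove": {"name": "JhoveView", "repInfo": {"uri": "/tmp/opf/Summary.pdf"}}}'
print(rep_info_uris(sample))  # prints ['/tmp/opf/Summary.pdf']
```

With two files scanned, a result of length one is the symptom described here.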

The equivalent outputs for XML and text look as follows:

Note: 2x repInfo uri

<?xml version="1.0" encoding="utf-8"?>
<jhove xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schema.openpreservation.org/ois/xml/ns/jhove" xsi:schemaLocation="http://schema.openpreservation.org/ois/xml/ns/jhove https://schema.openpreservation.org/ois/xml/xsd/jhove/1.8/jhove.xsd" name="JhoveView" release="1.25.0-SNAPSHOT" date="2021-04-07">
  <date>2021-04-08T09:49:07+02:00</date>
  <repInfo uri="/tmp/opf/combined_Please_review__sign_your_document.pdf">
    <reportingModule release="1.12.2" date="2019-12-10">PDF-hul</reportingModule>
    <lastModified>2021-04-08T09:05:38+02:00</lastModified>
    <size>319340</size>
    <format>PDF</format>
    <version>1.4</version>
    <status>Not well-formed</status>
    <sigMatch>
      <module>PDF-hul</module>
    </sigMatch>
    <messages>
      <message severity="error" id="PDF-HUL-94">Unexpected exception java.lang.NullPointerException</message>
    </messages>
    <mimeType>application/pdf</mimeType>
    <properties>
      <property>
        <name>PDFMetadata</name>
        <values arity="List" type="Property">
          ...
      </property>
    </properties>
  </repInfo>
  <repInfo uri="/tmp/opf/Summary.pdf">
    <reportingModule release="1.12.2" date="2019-12-10">PDF-hul</reportingModule>
    <lastModified>2021-04-08T00:05:16+02:00</lastModified>
    <size>132877</size>
    <format>PDF</format>
    <version>1.4</version>
    <status>Not well-formed</status>
    <sigMatch>
      <module>PDF-hul</module>
    </sigMatch>
    <messages>
      <message severity="error" id="PDF-HUL-94">Unexpected exception java.lang.NullPointerException</message>
    </messages>
    <mimeType>application/pdf</mimeType>
    <properties>
      <property>
        <name>PDFMetadata</name>
        <values arity="List" type="Property">
          ...
      </property>
    </properties>
  </repInfo>
</jhove>
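
For comparison, the repInfo elements in the XML export can be counted in a similar way. This is a sketch only: the namespace URI is copied from the output above, the function name is hypothetical, and the sample is cut down to the two repInfo elements.

```python
import xml.etree.ElementTree as ET

# Default namespace taken from the <jhove> root element in the XML output above.
JHOVE_NS = {"j": "http://schema.openpreservation.org/ois/xml/ns/jhove"}


def xml_rep_info_uris(xml_text):
    """Return the uri attribute of every repInfo child of the jhove root."""
    root = ET.fromstring(xml_text)
    return [element.get("uri") for element in root.findall("j:repInfo", JHOVE_NS)]


# Reduced sample: same root namespace, repInfo contents left out.
xml_sample = """<jhove xmlns="http://schema.openpreservation.org/ois/xml/ns/jhove" name="JhoveView">
  <repInfo uri="/tmp/opf/combined_Please_review__sign_your_document.pdf"/>
  <repInfo uri="/tmp/opf/Summary.pdf"/>
</jhove>"""

print(len(xml_rep_info_uris(xml_sample)))  # prints 2
```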

Note: 2x RepresentationInformation strings

JhoveView (Rel. 1.25.0-SNAPSHOT, 2021-04-07)
 Date: 2021-04-08 09:51:15 CEST
 RepresentationInformation: /tmp/opf/combined_Please_review__sign_your_document.pdf
  ReportingModule: PDF-hul, Rel. 1.12.2 (2019-12-10)
  LastModified: 2021-04-08 09:05:38 CEST
  Size: 319340
  Format: PDF
  Version: 1.4
  Status: Not well-formed
  SignatureMatches:
   PDF-hul
  ErrorMessage: Unexpected exception java.lang.NullPointerException
   ID: PDF-HUL-94
  MIMEtype: application/pdf
  PDFMetadata: 
   ...
   Pages: 
    Page: 
     Sequence: 1
     Annotations: 
      Annotation: 
       Subtype: Widget
       Rect: 0, 0, 0, 0
       Flags: 132
       AppearanceDictionary: true
 RepresentationInformation: /tmp/opf/Summary.pdf
  ReportingModule: PDF-hul, Rel. 1.12.2 (2019-12-10)
  LastModified: 2021-04-08 00:05:16 CEST
  Size: 132877
  Format: PDF
  Version: 1.4
  Status: Not well-formed
  SignatureMatches:
   PDF-hul
  ErrorMessage: Unexpected exception java.lang.NullPointerException
   ID: PDF-HUL-94
  MIMEtype: application/pdf
  PDFMetadata: 
   ...
   Pages: 
    Page: 
     Sequence: 1
     Annotations: 
      Annotation: 
       Subtype: Widget
       Rect: 0, 0, 0, 0
       Flags: 132
       AppearanceDictionary: true
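
The text export can be checked the same way by collecting the RepresentationInformation lines. A minimal sketch, assuming each scanned file starts with exactly one such line as in the output above (the function name is hypothetical):

```python
def text_rep_info_paths(report_text):
    """Return the path from every RepresentationInformation line in a text report."""
    prefix = "RepresentationInformation:"
    paths = []
    for line in report_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            paths.append(stripped[len(prefix):].strip())
    return paths


# Reduced sample with only the lines that matter for the count.
text_sample = """JhoveView (Rel. 1.25.0-SNAPSHOT, 2021-04-07)
 Date: 2021-04-08 09:51:15 CEST
 RepresentationInformation: /tmp/opf/combined_Please_review__sign_your_document.pdf
  Format: PDF
 RepresentationInformation: /tmp/opf/Summary.pdf
  Format: PDF
"""

print(len(text_rep_info_paths(text_sample)))  # prints 2
```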

The audit output:

<?xml version="1.0" encoding="utf-8"?>
<jhove xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schema.openpreservation.org/ois/xml/ns/jhove" xsi:schemaLocation="http://schema.openpreservation.org/ois/xml/ns/jhove https://schema.openpreservation.org/ois/xml/xsd/jhove/1.8/jhove.xsd" name="JhoveView" release="1.25.0-SNAPSHOT" date="2021-04-07">
 <date>2021-04-08T09:49:53+02:00</date>
 <audit home="/home/user/jhove">
  <file mime="application/pdf" status="not well-formed">/tmp/opf/combined_Please_review__sign_your_document.pdf</file>
  <file mime="application/pdf" status="not well-formed">/tmp/opf/Summary.pdf</file>
 </audit>
</jhove>
<!-- Summary by MIME type:
<!-- [mime type]: [file count] ([valid],[well-formed],[not well-formed],[unknown])
application/pdf: 2 (0,0,2,0)
Total: 2 (0,0,2,0)
-->
<!-- Summary by directory:
<!-- [directory]: [file count] ([valid],[well-formed],[not well-formed],[unknown])
/home/user/jhove: 2 (0,0,2,0)
Total: 2 (0,0,2,0)
-->
<!-- Elapsed time: 0:00:01 -->
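
The audit export can also be tallied per status. The sketch below uses a regular expression rather than an XML parser, because the summary comments that follow the root element may trip up a strict parser. The attribute order in the pattern is assumed to match the output above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches <file mime="..." status="...">path</file> as it appears in the audit output above.
FILE_RE = re.compile(r'<file mime="([^"]*)" status="([^"]*)">([^<]*)</file>')


def audit_status_counts(audit_text):
    """Count the audited files per status value."""
    return Counter(status for _mime, status, _path in FILE_RE.findall(audit_text))


audit_sample = """<audit home="/home/user/jhove">
  <file mime="application/pdf" status="not well-formed">/tmp/opf/combined_Please_review__sign_your_document.pdf</file>
  <file mime="application/pdf" status="not well-formed">/tmp/opf/Summary.pdf</file>
</audit>"""

print(audit_status_counts(audit_sample)["not well-formed"])  # prints 2
```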

To repeat this:

  1. Scan two files using jhove-gui.
  2. Select save-as from the File menu.
  3. Save as each of the different formats.

NB. I haven't tried this with the CLI today. It might be worth checking; I'm not familiar with what the expected behavior is there.
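
For anyone who wants to try the same comparison from the CLI, a sketch like the following could build one invocation per handler. Everything in it is an assumption to verify against your own install: the `-h` (handler) and `-o` (output file) flags, the handler names, the `jhove` executable name, and the output file names. The helper itself is hypothetical, and nothing is executed here.

```python
import shlex


def build_jhove_command(handler, output_path, files, executable="jhove"):
    """Build an argument list for one jhove run (flags are assumed, see above)."""
    return [executable, "-h", handler, "-o", output_path, *files]


files = [
    "/tmp/opf/combined_Please_review__sign_your_document.pdf",
    "/tmp/opf/Summary.pdf",
]

# Handler names are assumptions; check the handlers configured in your jhove.conf.
for handler in ("JSON", "XML", "Text", "Audit"):
    command = build_jhove_command(handler, f"/tmp/opf/report.{handler.lower()}", files)
    # Print instead of running, so the commands can be reviewed first.
    print(shlex.join(command))
```

Each printed command can then be run by hand and the resulting reports compared.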

Related to #385

@orgabor

orgabor commented Jun 17, 2021

Hello, we are using the CLI option and we are facing the exact same issue described by @ross-spencer.

@ambs

ambs commented Sep 12, 2021

Same problem here, moving to XML temporarily.

@Slange-Mhath

Hey,

I know that this is a long shot, but I think the issue still exists and I was wondering if anyone has an idea how to solve it. It only occurs when we are trying to get the output in JSON; XML seems to be fine.
Looking back into the history, it seems that #515 and #544 introduced the JSON output, so we were wondering whether it will still be maintained, @carlwilson?

Thank you so much!

@carlwilson carlwilson self-assigned this Apr 7, 2022
@carlwilson carlwilson added the P1 High priority issues to be scheduled in the upcoming release label Apr 7, 2022
@carlwilson carlwilson added this to the JHOVE 1.26 milestone Apr 7, 2022
tledoux pushed a commit to tledoux/jhove that referenced this issue Apr 10, 2022
5 participants