JHOVE outputs non-XML content to stdout for certain files #20

Closed
mistydemeo opened this issue Sep 15, 2014 · 7 comments

@mistydemeo
Contributor

When FITS's embedded JHOVE is enabled, certain files will print non-XML content to stdout. For example, here's the first five lines when running on balloon_eciRGBv2_ps_adobeplugin.jpf from the OPF format corpus:

READBOX seen=true
<?xml version="1.0" encoding="UTF-8"?>
<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/fits/fits_output http://hul.harvard.edu/ois/xml/xsd/fits/fits_output.xsd" version="0.8.2" timestamp="15/09/14 10:00 AM">
  <identification>
    <identity format="JPEG 2000 JP2" mimetype="image/jp2" toolname="FITS" toolversion="0.8.2">

Since the first line isn't valid XML, anything that attempts to parse FITS's output as XML will fail.

This doesn't happen when JHOVE is disabled in the FITS configuration file, so the problem appears to be coming from the embedded version of JHOVE specifically.
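
As a consumer-side stopgap, the stray line can be dropped before parsing. The following is only a minimal sketch in Python, assuming the noise always appears before the XML declaration; `strip_preamble` and the abbreviated `sample` bytes are hypothetical names and data made up for illustration, not part of FITS or JHOVE.

```python
import xml.etree.ElementTree as ET

XML_DECL = b"<?xml"


def strip_preamble(raw: bytes) -> bytes:
    """Return raw with everything before the XML declaration removed.

    Hypothetical helper: it assumes the tool noise is printed only
    before the declaration, as in the output shown above.
    """
    start = raw.find(XML_DECL)
    if start == -1:
        raise ValueError("no XML declaration found in FITS output")
    return raw[start:]


# Abbreviated stand-in for the stdout shown above (not real FITS output).
sample = (
    b"READBOX seen=true\n"
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output" version="0.8.2">\n'
    b"  <identification/>\n"
    b"</fits>\n"
)

clean = strip_preamble(sample)
root = ET.fromstring(clean)  # parses once the leading line is gone
```

This does not help if a tool prints messages in the middle of or after the XML, so it is a workaround rather than a fix.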

@andreagoethals

I hadn't seen that before but confirmed that the non-XML is printed out with FITS version 0.8.2. We don't accept JPX format (just JP2) in our repository so we haven't tested FITS with this format.

@spmcewen
Contributor

I believe those are just messages from JHOVE being sent to the standard output stream. If you output to a file (-o, if I remember correctly), you shouldn't see them. It's similar to the problem of nlnz internally using System.out.println() to show errors.

On Sep 15, 2014, at 1:06 PM, "Misty De Meo" <notifications@github.com> wrote:

When FITS's embedded JHOVE is enabled, certain files will print non-XML content to stdout. For example, here's the first five lines when running on balloon_eciRGBv2_ps_adobeplugin.jpf (https://github.com/openplanets/format-corpus/raw/master/jp2k-test/icc/balloon_eciRGBv2_ps_adobeplugin.jpf) from the OPF format corpus (https://github.com/openplanets/format-corpus):

READBOX seen=true

Since the first line isn't valid XML, anything that attempts to parse FITS's output as XML will fail.

This doesn't happen when JHOVE is disabled in the FITS configuration file, so the problem appears to be coming from the embedded version of JHOVE specifically.


@mistydemeo
Contributor Author

Sorry for the late reply.

Thanks for the advice. Unfortunately, our workflow depends on stdout - we're not writing files to disk, but piping data between processes. Does it seem like it would be feasible for FITS to trap the stdout from the tools it's running to prevent this?
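
To illustrate what "trapping" could look like, here is a rough sketch of the idea in Python rather than in FITS's own Java (where `System.setOut` would be the analogous hook). `noisy_tool` and `run_quietly` are hypothetical stand-ins invented for this example; this is not how FITS or JHOVE are actually structured.

```python
import contextlib
import io
import sys


def noisy_tool(path):
    """Hypothetical stand-in for an embedded tool that logs via print()."""
    print("READBOX seen=true")  # diagnostic that would pollute stdout
    return {"format": "JPEG 2000 JP2", "path": path}


def run_quietly(tool, *args):
    """Run tool, capturing whatever it writes to stdout in the meantime."""
    trapped = io.StringIO()
    with contextlib.redirect_stdout(trapped):
        result = tool(*args)
    return result, trapped.getvalue()


result, noise = run_quietly(noisy_tool, "balloon_eciRGBv2_ps_adobeplugin.jpf")

# Keep the diagnostics, but send them to stderr so stdout stays clean.
sys.stderr.write(noise)
print(result["format"])
```

Note that `contextlib.redirect_stdout` only intercepts writes made through Python's `sys.stdout`; output from child processes would need to be captured separately.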

@andreagoethals

Hi Misty. Our main FITS developer is temporarily away so he can't comment on this...

@jcoyne

jcoyne commented Aug 31, 2015

@mistydemeo I'm running into this now. Did you figure out a solution?

jcoyne added a commit to samvera/hydra-file_characterization that referenced this issue Aug 31, 2015
This is due to JHOVE putting some log messages on STDOUT
See harvard-lts/fits#20
@ross-spencer

Is it happening with the same files in standalone versions of JHOVE? Has it been fixed in more recent versions of the tool?

@daveneiman
Member

Ran this JPF file with FITS 1.2.0 (which contains JHOVE 1.16) and there was no non-XML output to the console.
