Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for Grok JPEG 2000 decoder #376

Merged
merged 1 commit into from Aug 20, 2020
Merged

add support for Grok JPEG 2000 decoder #376

merged 1 commit into from Aug 20, 2020

Conversation

boxerab
Copy link
Contributor

@boxerab boxerab commented Jun 9, 2020

Add support for Grok decompressor.

Grok features include:

  1. High performance - currently around 1/2 speed of Kakadu demo decoder
  2. Fast sub-region decode
  3. Supports output to stdout for certain file formats
  4. Supports TLM code stream marker for fast single tile and sub-tile decoding
    of large tiled images
  5. Supports PLT code stream marker for fast sub-tile decoding of large single-tile images
  6. Full support for ICC profiles and other meta-data such as XML, IPTC and XMP
  7. Supports new Part 15 of the standard, aka High Throughput JPEG 2000, which promises
    up to 10x speed up over current Part 1.

@boxerab
Copy link
Contributor Author

boxerab commented Jul 27, 2020

@adolski do you have any issues with the PR ?

@adolski
Copy link
Contributor

adolski commented Jul 28, 2020

Hi @boxerab, sorry for the delay.

This looks pretty much like a renamed OpenJpegProcessor? Is there any reason that that processor couldn't be modified to invoke grk_decompress instead of opj_decompress? (Maybe this would fix #349 at the same time.)

OpenJpegProcessor has a couple of major problems, which are firstly that it can only read from files, and secondly that it relies on a weird technique involving /dev/stdout to pipe image data from the opj_decompress process into the application process. (In Windows it's even worse, because there is no /dev/stdout there, so temporary files are involved.) I assume an implementation that would take better advantage of OpenJPEG or Grok would call directly into its native library. (I'm not familiar with the API but I assume it supports reading source image data from byte arrays or whatever.)

This ties into #388, which describes a longer-term plan for replacing OpenJpegProcessor with an Image I/O plugin for OpenJPEG. Another user has mentioned working on such a plugin, but I haven't had time to look at it: #293 (comment) I wonder if this plugin could be retrofitted to work with Grok--or if it's already compatible?

@boxerab
Copy link
Contributor Author

boxerab commented Jul 28, 2020

This looks pretty much like a renamed OpenJpegProcessor? Is there any reason that that processor couldn't be modified to invoke grk_decompress instead of opj_decompress? (Maybe this would fix #349 at the same time.)

Grok differs somewhat from OpenJPEG, not just in the name of the binary but also in some command line args, so it's probably a good idea to have a separate processor.

OpenJpegProcessor has a couple of major problems, which are firstly that it can only read from files, and secondly that it relies on a weird technique involving /dev/stdout to pipe image data from the opj_decompress process into the application process. (In Windows it's even worse, because there is no /dev/stdout there, so temporary files are involved.) I assume an implementation that would take better advantage of OpenJPEG or Grok would call directly into its native library. (I'm not familiar with the API but I assume it supports reading source image data from byte arrays or whatever.)

Grok supports decoding to stdout on Linux and Windows for certain file formats : BMP for example. To call directly into the API is certainly do-able, but this brings the Grok license into play. It is released under AGPL, so a Cantaloupe release with Grok API would require an AGPL license on the Cantaloupe release. Using the binary doesn't trigger the AGPL.

This ties into #388, which describes a longer-term plan for replacing OpenJpegProcessor with an Image I/O plugin for OpenJPEG. Another user has mentioned working on such a plugin, but I haven't had time to look at it: #293 (comment) I wonder if this plugin could be retrofitted to work with Grok--or if it's already compatible?

Yes it could be retrofitted, but we would have the same issue with licensing.

@adolski
Copy link
Contributor

adolski commented Aug 3, 2020

I see. That makes sense.

Okay, I'm willing to take this if you can add a test for it, and get the test working in continuous integration. You could probably just copy OpenJpegProcessorTest and exclude it from the nodeps profile.

The Grok dependency would then have to be added to the CI Dockerfiles.

I can then add documentation for this new processor to the user manual.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 5, 2020

Thanks, will try to get a working test this week.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 6, 2020

copy OpenJpegProcessorTest and exclude it from the nodeps profile.

The Grok dependency would then have to be added to the CI Dockerfiles.

Hi @adolski , I have added a test, and excluded it form the nodeps profile, as you suggested.
As for the CI, unfortunately there is no debian package yet for Grok (it is currently making its way through the debian release process)- but there are binaries from the Github repo. Also, I guess I could create a ppa for the package. What do you think is the best approach ?

@adolski
Copy link
Contributor

adolski commented Aug 6, 2020

Hi @boxerab, I'll trust your judgment on whatever will be easiest to get working and maintain.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 9, 2020

Hi @boxerab, I'll trust your judgment on whatever will be easiest to get working and maintain.

Thanks, actually, it looks like you use Fedora for the Docker image, so I will go with simple wget from github

@boxerab
Copy link
Contributor Author

boxerab commented Aug 15, 2020

Hi @adolski , I have Grok now installed in the ubuntu-latest, jdk Docker, but I am getting a test failure that I can't figure out, as the error message is quite cryptic. When you have time, can you take a look ?

cantaloupe_1  | [ERROR]   GrokProcessorTest.testProcessWithActualFormatDifferentFromSetFormat:14->AbstractProcessorTest.testProcessWithActualFormatDifferentFromSetFormat:540 Unexpected exception type thrown ==> expected: <edu.illinois.library.cantaloupe.processor.SourceFormatException> but was: <edu.illinois.library.cantaloupe.processor.ProcessorException>

@adolski
Copy link
Contributor

adolski commented Aug 17, 2020

The idea of that test is that when GrokProcessor.setSourceFormat() has been invoked with an argument like Format.JP2, but the underlying source image is some other format, does process() throw a SourceFormatException? Here it is failing because it's throwing a ProcessorException instead.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 17, 2020

The idea of that test is that when GrokProcessor.setSourceFormat() has been invoked with an argument like Format.JP2, but the underlying source image is some other format, does process() throw a SourceFormatException? Here it is failing because it's throwing a ProcessorException instead.

Thanks. So, what's the best way of debugging the exception ?

@adolski
Copy link
Contributor

adolski commented Aug 17, 2020

Does grk_decompress emit some kind of information when given a non-JP2 image to read, that you could check to determine what kind of exception to throw? For example, if it emits a message like, "Error: invalid JP2 image", could you check for that string?

@boxerab
Copy link
Contributor Author

boxerab commented Aug 17, 2020

Does grk_decompress emit some kind of information when given a non-JP2 image to read, that you could check to determine what kind of exception to throw? For example, if it emits a message like, "Error: invalid JP2 image", could you check for that string?

Here is the output when passing in a PNG:

[2020-08-17 11:47:26.506] [error]  edinburgh_bayer.png does not contain a JPEG 2000 code stream
[2020-08-17 11:47:26.507] [error] Unable to open file edinburgh_bayer.png for decoding.

and exit value is 1.

Why is this not a problem for openjpeg, I wonder ? My test is quite similar to the OpenJPEG processor test.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 17, 2020

does not contain a JPEG 2000 code stream

OK, I think I found the problem. As you suggested, grok outputs a different error string when file format is not recognized.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 17, 2020

GrokProcessorTest.testProcessWithActualFormatDifferentFromSetFormat:14->AbstractProcessorTest.testProcessWithActualFormatDifferentFromSetFormat:540 Unexpected exception type thrown ==> expected: <edu.illinois.library.cantaloupe.processor.SourceFormatException> but was: <edu.illinois.library.cantaloupe.processor.ProcessorException>

@adolski is there a way of printing out the message string in the ProcessorException ?

@boxerab
Copy link
Contributor Author

boxerab commented Aug 17, 2020

@adolski after a workaround, all GrokProcessor tests pass on ubuntu! For the Oracle Linux VM, it is possible to upgrade to a newer version of the OS ? I need to build Grok from source on the VM, and the compiler toolchain is a bit old.

@adolski
Copy link
Contributor

adolski commented Aug 17, 2020

after a workaround, all GrokProcessor tests pass on ubuntu! For the Oracle Linux VM, it is possible to upgrade to a newer version of the OS ? I need to build Grok from source on the VM, and the compiler toolchain is a bit old.

Oracle Linux is what the oracle/graalvm-ce image is based on. But the GraalVM tests really just need GraalVM + some kind of Linux--there's no reason they have to use this particular image. If GraalVM could be installed in Ubuntu instead, that would be fine, and then you wouldn't have to compile anything.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 17, 2020

after a workaround, all GrokProcessor tests pass on ubuntu! For the Oracle Linux VM, it is possible to upgrade to a newer version of the OS ? I need to build Grok from source on the VM, and the compiler toolchain is a bit old.

Oracle Linux is what the oracle/graalvm-ce image is based on. But the GraalVM tests really just need GraalVM + some kind of Linux--there's no reason they have to use this particular image. If GraalVM could be installed in Ubuntu instead, that would be fine, and then you wouldn't have to compile anything.

Yes, that would be great. As the compiler version is gcc 4.8.5. I think I will still need to build Grok from source, as it has to link to the same libstdc++6 version as is on the VM. But, 4.8.5 won't work.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 17, 2020

after a workaround, all GrokProcessor tests pass on ubuntu! For the Oracle Linux VM, it is possible to upgrade to a newer version of the OS ? I need to build Grok from source on the VM, and the compiler toolchain is a bit old.

Oracle Linux is what the oracle/graalvm-ce image is based on. But the GraalVM tests really just need GraalVM + some kind of Linux--there's no reason they have to use this particular image. If GraalVM could be installed in Ubuntu instead, that would be fine, and then you wouldn't have to compile anything.

So, what needs to happen to switch GraalVM images?

@adolski
Copy link
Contributor

adolski commented Aug 17, 2020

I would change the FROM in that Dockerfile to one of the later Ubuntu images and then install GraalVM manually. (I believe that would just involve downloading, unzipping, and also declaring ENV JAVA_HOME=... and ENV GRAALVM_HOME=... in the Dockerfile.) And then the rest of the Dockerfile would end up looking much like the Linux-JDK11 one.

I'm happy to do this if you'd rather not, but it might take me a few days.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 17, 2020

I'm happy to do this if you'd rather not, but it might take me a few days.

Thanks, if you don't mind. Out of curiosity, why can't you run all of the tests on the first VM ?

@adolski
Copy link
Contributor

adolski commented Aug 18, 2020

The new Ubuntu+GraalVM image is available on develop now.

Thanks, if you don't mind. Out of curiosity, why can't you run all of the tests on the first VM ?

That could probably work too... I like that there is one Dockerfile per test environment, but, you're right, it does cause some duplication.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 18, 2020

The new Ubuntu+GraalVM image is available on develop now.

Great, thanks. I've updated the PR. How long should the tests take on that VM?
It looks like it may be hanging.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 19, 2020

After 6 hours:

debconf: unable to initialize frontend: Dialog
debconf: (TERM is not set, so the dialog frontend is not usable.)
debconf: falling back to frontend: Readline
Configuring tzdata
------------------

Please select the geographic area in which you live. Subsequent configuration
questions will narrow this down by presenting a list of cities, representing
the time zones in which they are located.

  1. Africa      4. Australia  7. Atlantic  10. Pacific  13. Etc
  2. America     5. Arctic     8. Europe    11. SystemV
  3. Antarctica  6. Asia       9. Indian    12. US
##[error]The operation was canceled.

@adolski
Copy link
Contributor

adolski commented Aug 19, 2020

Hmm, there must be something in the addition of Grok that is triggering that, because I can't reproduce it. I've pushed a change on develop that I hope will fix it.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 19, 2020

thanks, it looks like that fixed the hang.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 19, 2020

Test still fails, but this is due to other errors, not in Grok processor test.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 19, 2020

Cool, passing now.

@boxerab
Copy link
Contributor Author

boxerab commented Aug 20, 2020

@adolski anything else that needs doing before this is merged ?

@adolski adolski merged commit 7e52d41 into cantaloupe-project:develop Aug 20, 2020
@adolski
Copy link
Contributor

adolski commented Aug 20, 2020

It looks good to me @boxerab! Thanks for all of your work on this.

I will make a note to check at some point whether Grok is available as a Debian package.

Sorry this took so long--between other projects and vacation time you caught me at almost the worst possible time (not your fault).

@boxerab
Copy link
Contributor Author

boxerab commented Aug 20, 2020

Great, thanks! No worries about the time, just glad it was relatively painless to get this merged in.
I will probably create a PPA for the project, and create another MR to update the docker, but that won't be for another few
weeks. Enjoy your vacation !!!

@boxerab
Copy link
Contributor Author

boxerab commented Aug 20, 2020

btw, forgot to mention that Grok will not delete symlink output file, if decode fails (unlike openjpeg)

@adolski
Copy link
Contributor

adolski commented Aug 20, 2020

Thanks @boxerab, and thanks for letting me know about the symlink thing, I will make a note to take a look.

adolski pushed a commit that referenced this pull request Aug 26, 2020
@boxerab
Copy link
Contributor Author

boxerab commented Sep 9, 2020

@adolski surely you are finished your vacation by now ? :)
Just wondering when you are planning on the next cantaloupe release with new support for Grok.
I think folks at https://groups.google.com/g/iiif-discuss would be interested.

@adolski
Copy link
Contributor

adolski commented Sep 9, 2020

Hi @boxerab , I don't know for sure, but maybe early next year. 5.0 needs some more work, but I'm only able to spend around 10% of my time on this project.

@boxerab
Copy link
Contributor Author

boxerab commented Sep 9, 2020

Thanks, sounds good. I have a number of improvements planned for the library, so by early next year it should be trading blows with Kakadu.

@boxerab
Copy link
Contributor Author

boxerab commented Oct 2, 2020

@adolski still waiting for my deb package to be accepted by Debian. In the meantime, I've posted the .deb files for my latest release
on github: https://github.com/GrokImageCompression/grok/releases/tag/v7.6.1

@adolski
Copy link
Contributor

adolski commented Oct 6, 2020

Thanks for letting me know, @boxerab. I will put this on my to-do list. By the way, are you planning to release to macOS Homebrew?

@boxerab
Copy link
Contributor Author

boxerab commented Oct 7, 2020

Thanks. I took a look at the Homebrew docs : seems easy to create a package from a tarball. So, yes, I will look into releasing that way as well.

@boxerab
Copy link
Contributor Author

boxerab commented Jan 2, 2021

@adolski hello and Happy New Year! Just wanted to let you know I've submitted a homebrew PR

Homebrew/homebrew-core#68156

@adolski
Copy link
Contributor

adolski commented Jan 4, 2021

That's great, @boxerab! This will make it more convenient for all Mac users to use and test against Grok. I was already able to install the package, and all of the GrokProcessorTests passed right away. 👍

@boxerab boxerab deleted the grok branch January 14, 2021 03:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants