Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IMAGING-319] Fix for lost first character on String #359

Merged
merged 2 commits into from Apr 13, 2024

Conversation

StefanOltmann
Copy link
Contributor

@StefanOltmann StefanOltmann commented Feb 4, 2024

I encountered a bug while testing the Kotlin Multiplatform port (https://github.com/ashampoo/kim) of this library. The bug is a critical issue causing metadata corruption during the write process.

This bug has been reported at https://issues.apache.org/jira/browse/IMAGING-319. Fortunately, a functional fix was identified by @gwlucastrig, although, for some reason, it hasn't been incorporated into a pull request. All credit for the fix goes to him.

Here is the equivalent code in my library:
https://github.com/Ashampoo/kim/blob/01fbb7825d3ccf53170c4625aad63b713614919e/src/commonMain/kotlin/com/ashampoo/kim/format/tiff/write/TiffWriterLossless.kt#L257-L261

I can confirm that the fix is effective. Removing that specific line causes my unit tests to fail, particularly with the test image photo_28.jpg.

Unfortunately, I'm unable to include this image in the pull request since I don't own it. The image falls under the Unsplash license, requiring attribution, and I cannot make decisions regarding this for Apache.

@garydgregory
Copy link
Member

@StefanOltmann
Thank you for your PR.
You'll need a unit test to avoid future regressions and prove this fixes something.

@StefanOltmann
Copy link
Contributor Author

@garydgregory
Do you want me add the image and add Unsplash attribution to your NOTICE file like this?

https://github.com/Ashampoo/kim/blob/5d1b28e4423e6bc240196d3ad7db25495da095c9/NOTICE.txt#L13-L15

@garydgregory
Copy link
Member

@StefanOltmann
Sounds good to me, ty!

@StefanOltmann
Copy link
Contributor Author

StefanOltmann commented Feb 4, 2024

Alright, this might take a bit of time. I've added it to the regular test files, but none of the existing tests are failing. I'm not entirely sure what they cover, but it seems like they are overlooking some aspects.

In my library, the JpegRewriterTest is able to identify this issue. Unfortunately, the author of ExifRewriterRoundtripTest (#275 by @charleshope) never had the chance to complete it. I believe that completing this test would help to showcase problem and proof that this is the correct fix.

This pull request is also related to https://issues.apache.org/jira/browse/IMAGING-351, as it addresses the same underlying problem.

Also it's worth to note that it does not always affect String fields. It can corrupt any field / offset. In the provided sample the MakerNotes get corrupted, due to an offset shift.

@kinow FYI

@garydgregory
Copy link
Member

@StefanOltmann
I don't see any new tests here.

@StefanOltmann
Copy link
Contributor Author

StefanOltmann commented Feb 4, 2024

I don't see any new tests here.

I did not commit it. I added the file above locally to the test folders hoping that one of the rewrite tests will pick it up and fail.

It gets actually processed, but the existing tests do not fail, because they don’t detect the issue. There is apparently a lack of coverage here.

The test that might pick it up is still disabled, because the author never finished it.

See this comment: #275 (review)

The offsets are actually expected to change and must be ignored. The wrong orientation value is most likely a problem that came from changing the byte order.

I guess I need to fix that test first to have a proper unit test, before I can add an image that makes it fail.

@garydgregory
Copy link
Member

@StefanOltmann
Thanks for the update. We're glad to get help to improve it all 😃

@StefanOltmann
Copy link
Contributor Author

@garydgregory I have fixed the test first. As soon you pulled the change I will refresh this PR.

@garydgregory
Copy link
Member

@garydgregory I have fixed the test first. As soon you pulled the change I will refresh this PR.

See my comments in your other PR. To summarize, please provide your main change and test in the same PR.

@StefanOltmann
Copy link
Contributor Author

See my comments in your other PR. To summarize, please provide your main change and test in the same PR.

As I commented on the other PR you will not see a proof that it fixes something then.

You demanded a proof, because you won't believe my words. So I fixed the roundtrip test that it actually tests something and will show the problem.

If I commit the test and the fix together you won't see it.

This is why I instructed you to drop the test file locally to see it fail.

@StefanOltmann
Copy link
Contributor Author

StefanOltmann commented Feb 24, 2024

@garydgregory As you requested #367 is a combined PR of bugfix and repaired unit test plus a file that shows the problem.

@garydgregory
Copy link
Member

A PR should include main and test changes, not one PR for test changes, and a different PR for main changes.

@StefanOltmann
Copy link
Contributor Author

A PR should include main and test changes, not one PR for test changes, and a different PR for main changes.

Yes, but the bugfix and the test are unrelated.

One PR fixes a bug without a test. The other PR completes a test that’s right now disabled, because it wasn’t working.

@StefanOltmann
Copy link
Contributor Author

StefanOltmann commented Mar 10, 2024

@kinow

Oh, I just see that @gwlucastrig once provided a PR for his fix in #203.

For some reason he closed the PR a few months later, but the fix is still valid.

Looks like a unit test was missing. I provided that with #366, but unfortunately we don't have a test image that can be used to proof the test. You will need to verify that locally without checking the image in.

You could also pull his PR, because he is the one who found and fixed the bug.

@kinow
Copy link
Member

kinow commented Mar 10, 2024

Oh, I will have to re-read the issues and pull requests as I'm a bit out of the loop, but thanks for pinging me here, @StefanOltmann ! I know @gwlucastrig from previous issues and emails. Let's give him some days, and I can ping him by email too if needed. I had a look and the PR was indeed closed... but I cannot recall if there was any reason for that 😥

You could also pull his PR, because he is the one who found and fixed the bug.

Sounds good too, although I am sure he wouldn't mind, will keep it in mind to see if I can pull the branches locally and maybe merge commits or combine to give credit where it's due. I will add something in the changes.xml that's used to publish the website as well.

Thanks @StefanOltmann !

@kinow
Copy link
Member

kinow commented Mar 10, 2024

(in the meantime, approved the GH Actions to run -- gotta go 🏃 )

@codecov-commenter
Copy link

codecov-commenter commented Mar 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.58%. Comparing base (fd2b5b8) to head (416ce7e).
Report is 79 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #359      +/-   ##
============================================
+ Coverage     71.48%   71.58%   +0.09%     
- Complexity     3580     3584       +4     
============================================
  Files           352      352              
  Lines         16919    16907      -12     
  Branches       2648     2648              
============================================
+ Hits          12095    12103       +8     
+ Misses         3785     3764      -21     
- Partials       1039     1040       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@garydgregory
Copy link
Member

Still no unit test. I'm not sure how we can avoid a future regression without one. 😞

@StefanOltmann
Copy link
Contributor Author

Still no unit test. I'm not sure how we can avoid a future regression without one. 😞

Wrong. There is a unit test in #366.

@kinow
Copy link
Member

kinow commented Mar 29, 2024

Reviewed #366, I think that one can be merged. That pull request fixes a @Disabled test that does the roud-trip test for EXIF data.

Once that's merged, and the test is re-enabled, there is a test image in #366 that should fail. I think here we have to:

  1. Wait for [IMAGING-351] ExifRewriterRoundtripTest fixed, completed & enabled #366 to be merged
  2. Use the test image to confirm the test fails for that image
  3. Check out this branch, re-run the test with the new image
  4. Review the license for that image, and if we really cannot use it, then decide what to do (I would spend some time checking if a similar image can be crafted or found somewhere else)
  5. Once the image is added, confirm the code here is being covered by the test+new image, and approve and merge! 🚀

Cheers

@StefanOltmann
Copy link
Contributor Author

StefanOltmann commented Mar 29, 2024

@kinow Great news, thanks!

Maybe we can use the image attached to https://issues.apache.org/jira/browse/IMAGING-319. I just asked the reporter.

@kinow
Copy link
Member

kinow commented Mar 29, 2024

@kinow Great news, thanks!

Maybe we can use the image attached to https://issues.apache.org/jira/browse/IMAGING-319. I just asked the reporter.

Excellent idea! That'd save us a lot of time, thanks for the initiative, @StefanOltmann 👍

@kinow
Copy link
Member

kinow commented Apr 13, 2024

Merged your other PR, @StefanOltmann (thanks very much for that), I'll push a new commit here but that'll be just to rebase it for now 👍

@StefanOltmann
Copy link
Contributor Author

StefanOltmann commented Apr 13, 2024

@kinow Yes, you can pull this one or the original #203, which are both the same change @gwlucastrig proposed in JIRA. Credit for this should solely go to him. :)

Unfortunately the author of https://issues.apache.org/jira/browse/IMAGING-319 did not respond. So we could ask legal if attaching an image to the JIRA ticket can be seen as a permission to use that one. Just to make progress and have this nasty bug fixed for 1.0-alpha5

@kinow
Copy link
Member

kinow commented Apr 13, 2024

Thanks @StefanOltmann . I searched my hard drive for JPEG images to check if I could reproduce the issue with any, but only the IMAGING-319 had this issue (maybe it happened in another field, but I only tested for the Offset Time like the issue reported).

package org.apache.commons.imaging;

import org.apache.commons.imaging.common.ImageMetadata;
import org.apache.commons.imaging.formats.jpeg.JpegImageMetadata;
import org.apache.commons.imaging.formats.jpeg.exif.ExifRewriter;
import org.apache.commons.imaging.formats.tiff.TiffImageMetadata;
import org.apache.commons.imaging.formats.tiff.write.TiffOutputSet;
import org.apache.commons.io.IOUtils;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Search for images that cause the bug in IMAGING-319.
 */
public class LibraryTest {

    public static void main(String[] args) throws ImagingException, IOException, InterruptedException {
        File path = new File("/home/kinow/Desktop/haystack/");
        String needle = "Offset Time\\s+:\\s+";
        Pattern pattern = Pattern.compile(String.format("^%s$", needle), Pattern.MULTILINE);
        for (File source : Objects.requireNonNull(path.listFiles(s -> s.getName().toLowerCase().endsWith(".jpg")))) {
            File result = Files.createTempFile("test_", ".jpg").toFile();
            try {
                final ImageMetadata metadata = Imaging.getMetadata(source);
                final JpegImageMetadata jpegMetadata = (JpegImageMetadata) metadata;
                if (jpegMetadata != null) {
                    final TiffImageMetadata exif = jpegMetadata.getExif();
                    if (exif != null) {
                        TiffOutputSet outputSet = exif.getOutputSet();
                        BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(Files.newOutputStream(result.toPath()));
                        new ExifRewriter().updateExifMetadataLossless(source, bufferedOutputStream, outputSet);

                        String[] cmd = {
                                "/bin/sh",
                                "-c",
                                String.format("exiftool %s | grep \"Offset Time\"", result.getAbsolutePath())
                        };
                        Process process = Runtime.getRuntime().exec(cmd);
                        process.waitFor();
                        String output = IOUtils.toString(process.getInputStream(), Charset.defaultCharset());
                        // System.out.println(output);

                        Matcher m = pattern.matcher(output);
                        if (m.find()) {
                            System.out.printf("Bug detected in %s%n", source.getAbsolutePath());
                        }
                    }
                }
            } catch (RuntimeException|ImagingException ignore) {
                ignore.printStackTrace();
            } // invalid metadata or bad format
        }
    }
}

I will debug the code again to either craft a test image by hand, or write a test code + mock. Let's see if that works...

@StefanOltmann
Copy link
Contributor Author

StefanOltmann commented Apr 13, 2024

@kinow I spent some time looking around different sources, too. Luckily I just found another test image.

It's https://github.com/drewnoakes/metadata-extractor-images/blob/main/jpg/Sony%20DigitalMavica.jpg

There is no LICENSE file in the repository, but the README says You are free to use these media files however you wish.

For my metadata library I used some other images of that repository, too. I added them to my NOTICE for completeness: https://github.com/Ashampoo/kim/blob/9ab5fb627d3d450286c94d1d7bb0f3335629c6c2/NOTICE.txt#L17C1-L19C73

But I believe this is not strictly necessary in this case.

What do you think? Sounds like "public domain with no attribution required" to me.

@kinow
Copy link
Member

kinow commented Apr 13, 2024

Thanks for searching @StefanOltmann !

I modified the script to search for any field in the diff output that is empty, and ran on some personal photos and found one!

Code:

package org.apache.commons.imaging;

import org.apache.commons.imaging.common.ImageMetadata;
import org.apache.commons.imaging.formats.jpeg.JpegImageMetadata;
import org.apache.commons.imaging.formats.jpeg.exif.ExifRewriter;
import org.apache.commons.imaging.formats.tiff.TiffImageMetadata;
import org.apache.commons.imaging.formats.tiff.write.TiffOutputSet;
import org.apache.commons.io.IOUtils;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Search for images that cause the bug in IMAGING-319.
 */
public class LibraryTest {
//
//    public static void main2(String[] args) throws ImagingException, IOException {
//        File source = new File("/home/kinow/Desktop/haystack/20220514_102409.jpg");
//        File result = new File("/home/kinow/Desktop/haystack/editted-20220514_102409.jpg");
//        final ImageMetadata metadata = Imaging.getMetadata(source);
//        final JpegImageMetadata jpegMetadata = (JpegImageMetadata) metadata;
//        final TiffImageMetadata exif = jpegMetadata.getExif();
//        TiffOutputSet outputSet = exif.getOutputSet();
//        BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(Files.newOutputStream(result.toPath()));
//        new ExifRewriter().updateExifMetadataLossless(source, bufferedOutputStream, outputSet);
//    }

    public static void main(String[] args) throws ImagingException, IOException, InterruptedException {
        File path = new File("/home/kinow/Desktop/haystack/");
        String needle = ".*([\\s+])*\\s+:\\s+";
        Pattern pattern = Pattern.compile(String.format("^%s$", needle), Pattern.MULTILINE);
        for (File source : Objects.requireNonNull(path.listFiles(s -> s.getName().toLowerCase().endsWith(".jpg")))) {
            File result = Files.createTempFile("test_", ".jpg").toFile();
            try {
                final ImageMetadata metadata = Imaging.getMetadata(source);
                final JpegImageMetadata jpegMetadata = (JpegImageMetadata) metadata;
                if (jpegMetadata != null) {
                    final TiffImageMetadata exif = jpegMetadata.getExif();
                    if (exif != null) {
                        TiffOutputSet outputSet = exif.getOutputSet();
                        BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(Files.newOutputStream(result.toPath()));
                        new ExifRewriter().updateExifMetadataLossless(source, bufferedOutputStream, outputSet);

                        String[] cmd = {
                                "/bin/sh",
                                "-c",
                                String.format("/home/kinow/Desktop/haystack/test.sh %s %s", source.getAbsolutePath(), result.getAbsolutePath())
                                // String.format("exiftool %s | grep \"Offset Time\"", result.getAbsolutePath())
                        };
                        Process process = Runtime.getRuntime().exec(cmd);
                        process.waitFor();
                        String output = IOUtils.toString(process.getInputStream(), Charset.defaultCharset());
                        // System.out.println(output);

                        Matcher m = pattern.matcher(output);
                        if (m.find()) {
                            System.out.printf("Bug detected in %s%n", source.getAbsolutePath());
                        }
                    }
                }
            } catch (RuntimeException|ImagingException ignore) {
                ignore.printStackTrace();
            } // invalid metadata or bad format
        }
    }
}

Test image (I drew something some years ago, covered parts of it that were irrelevant).

20220514_102409

To confirm this is causing the issue, I tried the following:

  • Copied it in data.images.jpg.8, ran ExifRewriterRoundtripTest and it failed for that file (master)
  • Changed to the code of this pull request, all tests passed! 🎉
  • Then re-ran the sample code from JIRA, then compared EXIF metadata with exiftool + diff
kinow@ranma:~/Desktop/haystack$ diff <(exiftool 20220514_102409.jpg) <(exiftool editted-20220514_102409.jpg)
2c2
< File Name                       : 20220514_102409.jpg
---
> File Name                       : editted-20220514_102409.jpg
5,8c5,8
< File Modification Date/Time     : 2024:04:14 00:29:34+02:00
< File Access Date/Time           : 2024:04:14 00:29:37+02:00
< File Inode Change Date/Time     : 2024:04:14 00:29:34+02:00
< File Permissions                : -rwxrwxrwx
---
> File Modification Date/Time     : 2024:04:14 00:36:13+02:00
> File Access Date/Time           : 2024:04:14 00:36:13+02:00
> File Inode Change Date/Time     : 2024:04:14 00:36:13+02:00
> File Permissions                : -rw-rw-r--
15c15
< Camera Model Name               : SM-S901E
---
> Camera Model Name               : 
29c29
< Offset Time                     : +12:00
---
> Offset Time                     : +12:00.
40c40
< Thumbnail Offset                : 586
---
> Thumbnail Offset                : 38

(Camera Model Name ended up being replaced by empty string, like the JIRA example)

  • Finally, added a quick test code in TiffImageWriterLossless.java (fixed in this branch) with the condition where the bug was fixed, see below
                if (offset != length && (offset & 1L) != 0 && length > outputItemLength) {
                    int i = 1; // breakpoint here to just confirm the file triggers this...
                }

And the breakpoint was reached for the iPhone photo from JIRA, and also for my image file 🎉

I will clean up my local modifications, and add my test image to this branch, then rebase + squash. Feel free to check the image file and confirm it if you'd like, @StefanOltmann. Thank you for the help, and for the JIRA OP for providing a useful code snippet to verify the issue.

@StefanOltmann
Copy link
Contributor Author

I am glad that we finally have a usable test image for this PR! 🎉💪

Copy link
Member

@kinow kinow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nearly 1AM now, and I rebased/edited commits. @StefanOltmann feel free to have a look at the changes to confirm everything looks OK.

I've left one extra commit here where I added the image, and the commit message points that the author (I) is the author of the image. Hopefully that way we won't have any more issues with Legal/etc. (someday I'm really going to write a test images generator based on specs).

👋 thanks! (p.s. if you find nothing wrong here, @StefanOltmann , I think this is one more ready to be merged for alpha5).

Thanks for the help and enjoy the weekend!

Bruno

<!-- UPDATE -->
<action dev="ggregory" type="update" due-to="Dependabot">Bump org.apache.commons:commons-parent from 67 to 69 #382.</action>
<action dev="ggregory" type="update" due-to="Dependabot">Bump commons-io:commons-io from 2.16.0 to 2.16.1 #385.</action>
<action issue="IMAGING-351" dev="kinow" type="update" due-to="Stefan Oltmann, Gary Lucas">
<action issue="IMAGING-351" dev="kinow" type="update" due-to="Stefan Oltmann, Gary Lucas, Charles Hope">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@StefanOltmann didn't have much time to look at history (a bit late, won't have time tomorrow/next week), so I added your suggestion here 👍, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it’s fair. Charles wrote the first version and did half the job. I think he did not understand why the offsets changed and lost interest in fixing his test, but we are a community, so I stepped in. 🙂

Gary was not actually involved in the unit test, but as he is the hero solving the problem, I’m fine with mentioning him. 🤷‍♂️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~450kb, tried reducing the image with GIMP, but that caused the image to not fail the test anymore 🤷‍♂️ , but I think 450kb for a new test file is OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s a really small image and should not be a problem.
To my lib I added images that are 5 MB in size. 🫣

@kinow kinow self-assigned this Apr 13, 2024
@StefanOltmann
Copy link
Contributor Author

@kinow By porting this library to Kotlin I found and fixed many minor bugs, but this was the only really serious one as it corrupts users data in a non-recoverable way. (For example the problem of dropping the GpsVersionId, which is optional and has always the same value if the file is written/modified by ExifTool, is not a serious bug.)

I don’t know what the planned release cycles are, but the last release was two years ago. I would recommend against allowing this bug to stay for another two years. People might use this library in production, so I think the next release should be soon. You may agree that the problem is serious enough.

(I don’t understand why Gary made a release with knowledge of this bug anyway.)

@kinow
Copy link
Member

kinow commented Apr 13, 2024

@kinow By porting this library to Kotlin I found and fixed many minor bugs, but this was the only really serious one as it corrupts users data in a non-recoverable way. (For example the problem of dropping the GpsVersionId, which is optional and has always the same value if the file is written/modified by ExifTool, is not a serious bug.)

👍

I don’t know what the planned release cycles are, but the last release was two years ago. I would recommend against allowing this bug to stay for another two years. People might use this library in production, so I think the next release should be soon. You may agree that the problem is serious enough.

We do not really have planned release cycles. Sometimes volunteers have time for cutting releases, sometimes they get busy with other projects, $work, life, etc. There are times where we have a lot of people active, and times of low bandwidth.

For me it is more or less based on weather and some yearly milestones (cold? ==> open source for me, taxes season or good weather ==> no open source for a while for me).

And for the past few years I have been working with Python on HPC with climate models. So I don't have $work time to dedicate to open source Java projects like Commons. But if I work again on Java projects, I will probably be more active around here.

(I don’t understand why Gary made a release with knowledge of this bug anyway.)

RERO, release early, release often (as often as we can). If we have unreleased changes, and one of us (Apache committers) has time, it's common that we just go around releasing multiple components so users can use at least what was not released but fixed.

Unfortunately, I think Gary is the only one that has been doing that, so I must thank him for going around and keeping the release train going on Apache Commons components. We may be slow to fix/release, but believe me, without what Gary is doing, we would definitely be much slower!

Thanks!

@StefanOltmann
Copy link
Contributor Author

All good. Your changes look good to me. 🙂

@garydgregory
Copy link
Member

FWIW and in general I put whomever was involved in an issue in the changes file in chrono order. This avoids (for me) deciding whom is worth mentioning and in what order. HTH.

@kinow
Copy link
Member

kinow commented Apr 13, 2024

All good. Your changes look good to me. 🙂

Thanks for the prompt reply (as always!). Going to merge and update JIRA then.

@kinow kinow merged commit 92d4cd3 into apache:master Apr 13, 2024
15 checks passed
@StefanOltmann StefanOltmann deleted the IMAGING-319 branch April 13, 2024 23:23
@garydgregory
Copy link
Member

As to why a bug was present in any release, we are still in alpha and we can create an RC whenever we want, and (in general) I prefer RERO instead of big bang releases.

@StefanOltmann
Copy link
Contributor Author

we can create an RC whenever we want

Out of curiosity:
When do you think will we want a release containing the bugfix?

@garydgregory
Copy link
Member

garydgregory commented Apr 14, 2024

Well, I can do one in the morning (EST) but why don't you make sure there are no other changes you'd like to see in there.

@StefanOltmann
Copy link
Contributor Author

Well, I can do one in the morning (EST) but why don't you make sure there are no other changes you'd like to see in there.

There are no other open PRs for bugfixes. And right now I don’t plan to create one.

I would suggest a timely release because of the serious bug that got fixed by this PR, but of course you don’t have to do that. Especially not for me as I don’t depend on this library (no more).

@garydgregory
Copy link
Member

@kinow Should we create a release candidate for Alpha 5 ASAP or wait?

@kinow
Copy link
Member

kinow commented Apr 14, 2024

I'd wait a bit more to add more things into alpha5. In the meantime users can use the snapshot jars from Maven (IIRC we had that enabled in Jenkins?). For alpha5 I think it'd be good to knock out a few more 1.0 issues, so that we continue progressing towards the final release.

I should have more time this year as I'm finally getting settled in in Spain 🌞 🥘 So I plan to continue voting on threads (a bit quicker so we have enough binding votes sooner rather than later) and slowly working my way through the Imaging/Jena/OpenNLP issues over next months.

But up to you, if you'd like to cut an alpha5 now that's Ok to me too (no harm done :))

Cheers

@StefanOltmann
Copy link
Contributor Author

StefanOltmann commented Apr 14, 2024

Feel free to approach this matter in any way you see fit, but I'd like to offer my perspective:

It's generally ill-advised to utilize a snapshot version of a dependency in a production application, as its behavior can change unpredictably and may not undergo thorough testing.

Currently, this project boasts over 400 stars, suggesting that potentially 400+ individuals or projects are employing it in production environments - thus risking the corruption of user metadata in this very moment. Despite being labeled as "alpha", there's no stable alternative version available - at least not one that's not many years old.

While Android offers "ExifInterface" for handling metadata, JVM desktop applications lack adequate alternatives, especially for writing metadata. While "metadata-extractor" suffices for reading, options for writing are limited, with "pixymeta" being the only one I found. I chose to build upon Commons Imaging for "Ashampoo Kim" due to issues encountered with "pixymeta", notably its tendency to corrupt Makernote data during writing due to a design flaw. Commons Imaging only had this one serious issue, which Gary Lucas addressed in said JIRA comment.

In my view, it's crucial to recognize the likelihood of numerous production projects relying on this library, underscoring the responsibility that accompanies its maintenance. "Alpha" status or not, people trust Apache projects and you shouldn't let them down.

Should any bugs be discovered in "Ashampoo Kim" I would promptly release a fix — not only due to the dependency of my own production project "Ashampoo Photos", but also out of a broader sense of responsibility to the community.

Even with the existence of my library, people might still want to use Commons Imaging, because my library comes with a Kotlin dependency. That's something a pure JVM project might not want.

So in my view Commons Imaging is for pure JVM projects (that need to write metadata and can't accept a Kotlin dependency) still the go-to project for production use. Keep that in mind.

@garydgregory
Copy link
Member

@kinow
I just pushed a snapshot to https://repository.apache.org/content/repositories/snapshots/org/apache/commons/commons-imaging/1.0.0-alpha5-SNAPSHOT in case Jenkins is not set up for that.

@kinow
Copy link
Member

kinow commented Apr 14, 2024

That is a good and very well put argument @StefanOltmann, thanks! I hadn't considered the impact of a user with metadata being written incorrectly, although I am afraid it has probably happened for a while (the JIRA Issue had alpha2 in the impacted version, but I wouldn't be surprised if the code ended up being from when the project was still Sanselan).

So in my view Commons Imaging is for pure JVM projects (that need to write metadata and can't accept a Kotlin dependency) still the go-to project for production use. Keep that in mind.

I started working on Imaging as I needed a pure-java for IIIF servers and couldn't find one, and the metadata is really important.

@garydgregory if you have time to cut a release in the next days/weeks that'd be good. But given you have done so many times, I can do that in one or two weeks if you prefer, so that you can continue with the other projects you were planning to release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants