-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IMAGING-345] Unit test : make tests in memory instead of writing hundred of tmp files #271
Conversation
Codecov Report
@@ Coverage Diff @@
## master #271 +/- ##
============================================
+ Coverage 70.76% 70.81% +0.04%
- Complexity 3368 3373 +5
============================================
Files 332 332
Lines 16936 16936
Branches 2652 2652
============================================
+ Hits 11985 11993 +8
+ Misses 3904 3898 -6
+ Partials 1047 1045 -2
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm busy with $work these days, but
An attempt to perform unit tests in memory using byte array when possible instead of writing temporary files.
Tests pass much faster and do not make useless write on my SSD.
I know that having the tmp files might be useful in some case, if you think it it necessary I could update the patch to write tmp file in case of failed test.
I think it could be useful in some cases, like if you want to debug/troubleshoot it, but I think speeding up the tests and reducing IO sound like a good idea. In case a test fails and we need to troubleshoot, it shouldn't be too hard to write it to disk to investigate (assuming that's needed at all). So I'm a big +1 for this, @kpouer .
I had a quick look and looks like you are adding some public methods too, in non-test files. Were these necessary for the change in this PR, @kpouer ?
Final note, it'd be necessary to squash the commits later. If you are still sorting out how to organize the change, feel free to keep it as is until you have it ready. Otherwise, squash it so that it's ready to be merged, please.
Thanks heaps!
Keep in mind that tests must be able to run in containers that may not provide much memory. |
Hi @kinow, public static ICC_Profile getICCProfile(final InputStream is, final String fileName) Then I created About the troubleshooting I'll update the branch in order to save tmp files for failed test. @garydgregory I don't think it will take a lot of memory as images are processed one by one, I think it should store something like 2 or 3 time the same image during some tests, then the memory is released to the GC to process the next image. And test images do not seem to be huge. However I am interested but not sure about what you mean by tested in containers. Do I have a way to test it to validate ? |
I was referring to GitHub Actions builds being resource constrained (to what extent I don't know exactly). |
Hey, I was able to do the same test without changing public api so I reverted the change (even if I think those functions might be useful, it could be done in another PR if necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kpouer can you squash your commits? And would you be able to create a JIRA for this? Even though it's a change in the tests I'd like to review (will try to compare number of IO ops, and memory on my laptop at least) and include it in the changelog for the next release. Thanks!
Hey, I was not sure how I could squash commits without deleting my branch and pushing it again so I created a new branch and a new pull request (same code but squashed into one commit) |
Hi @kpouer
It's probably useful to get used to squashing commits, both for Apache Commons code contributions, but for other OSS that follow similar approach. The issue with opening new PR's is that a) you may have to add more commits again, and then squash again, and b) it gets more complicated to keep track of discussions and reviews (there's also c) where users/devs subscribed to the repo may get spammed by similar changes). If you'd like you can try on this PR, as it's still open: https://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html I squash rebasing interactively. Basically what I'd do here would be something similar to: git fetch --all
git rebase upstream/master # upstream for me is apache/commons-imaging, kinow for me is the kinow/commons-imaging
gitk # or git log -n N, but I like to take a look at the GUI too, then count the number of commits since the top of master
# you have 17 commits here, ATOW
git rebase -i HEAD~17 # -i is interactive, i.e. I will interact with git to decide how to rebase
# each commit is prefixed by an action code, I mark the first line as r-eword
# and for each new commit line in the VIM/editor, I mark as f-ixup When I save my editor, When I save this file again, git will use the text as the commit message of the reworded commit. The other commits have been all "fixed-up". So now everything has been "squashed" into a single commit. At this point I At this point your local branch contains the squashed commits for your change. You can browse the code, run And the pull request will have been updated with your forced (-f) commit, showing a single commit for reviewers 👍 If that works you can close your other PR. Or if you prefer to try this later, feel free to close this issue 👍 up to you.
I am sorry, due to bots SPAMing our JIRA server we had to lock it for new users 😥 Send the email, please, to the commons private mail (it's the email address you mentioned above, minus the "-imaging" part) with the info on https://infra.apache.org/jira-guidelines.html#account, and I or another committer will create your account. Really sorry for the inconvenience 🙇 Thanks @kpouer ! |
eddd5bf
to
1947c86
Compare
Yes I was able to squash but was missing the "force" option for the push |
1947c86
to
23208d9
Compare
Hi, I just created a JIRA ticket for this patch and updated the commit description |
Sorry for the delay, next on my list! 🤓 |
src/test/java/org/apache/commons/imaging/formats/bmp/BmpRoundtripTest.java
Outdated
Show resolved
Hide resolved
final ImagingParameters params = Util.getImageParser(formatInfo.format).getDefaultParameters(); | ||
try (FileOutputStream fos = new FileOutputStream(temp1)) { | ||
imageParser.writeImage(testImage, fos, params); | ||
try (ByteArrayOutputStream bos = new ByteArrayOutputStream(1000000)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe leave it the default value and let it handle expanding the internal buffer? That could be helpful if we load many smaller images, I think (although I have no idea the distribution of sizes of our images & their metadata too).
00b6560
to
94aba6c
Compare
Used Docker to run the build from this PR limiting the allowed memory for the container. The build crashed at 64mb, 128mb, 256mb. And it passed for 512mb and higher. Switched to docker run -it --rm --name my-maven-project -v "$(pwd)":/usr/src/mymaven -w /usr/src/mymaven --memory=256m maven:3.8-openjdk-17 mvn clean install I think the overhead of doing everything in memory is not noticeable, and also the operations to read/parse & write might end up loading things to memory anyway before writing to disk. Still, might be a good idea to implement this change somehow that's not hard to undo it if needed (maybe encapsulate it into a single class/function reused?) |
94aba6c
to
a49411b
Compare
I updated the branch, first I wanted to comment out code to create tmp files for troubleshooting but finally I just did enclose it in a if (false) {} block, this way no problem with imports and it is even easiest to set it to true for troubleshooting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kpouer I'd be OK removing the if (false)
block, or leaving it commented out in one test class, or adding a README.md
in the test directory, etc.
As more tests are added, I guess other devs may ignore copying that if (false)
block, and it's not very hard to change the code to write to the disk in case a test is failing.
Furthermore, it's hard to be consistent with this code as it's not used (see comments where in some blocks there's a call to LOGGER.severe
, while others are missing it.
What do you think?
src/test/java/org/apache/commons/imaging/formats/png/PngTextTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/imaging/formats/png/PngWriteForceTrueColorText.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/imaging/formats/png/PngWriteForceTrueColorText.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/imaging/roundtrip/RoundtripBase.java
Outdated
Show resolved
Hide resolved
Hi @kinow I think you are right, those blocks adds complexity to the test and I am not sure they will be so useful. In the rare case someone really needs to see the file it is finally easy to add a few lines of code to write the file. I will remove them. |
Excellent @kpouer . Feel free to ignore/resolve my last review comments then. I believe most/all are about the missing |
a49411b
to
598e5e4
Compare
src/test/java/org/apache/commons/imaging/formats/jpeg/exif/ExifRewriteTest.java
Show resolved
Hide resolved
src/test/java/org/apache/commons/imaging/formats/jpeg/iptc/IptcUpdateTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/imaging/roundtrip/NullParametersRoundtripTest.java
Show resolved
Hide resolved
src/test/java/org/apache/commons/imaging/roundtrip/ImageAsserts.java
Outdated
Show resolved
Hide resolved
598e5e4
to
95405b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment about the JUnit method, but I think this is my last review feedback. It should be ready for a final pass of 👀 and merge. Thanks @kpouer !!!
src/test/java/org/apache/commons/imaging/roundtrip/ImageAsserts.java
Outdated
Show resolved
Hide resolved
Hey I missed the notification, yes a simple assertArrayEquals would be more efficient, do you want that I remove it ? |
Yes, if possible. Rebase and fix the conflicts, and then we can get it ready to be merged. Thanks @kpouer ! |
95405b1
to
0cc46af
Compare
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, @kpouer ! I will add a changelog and merge it. Thank you for your contribution, and for the patience!
Merged! |
An attempt to perform unit tests in memory using byte array when possible instead of writing temporary files.
Tests pass much faster and do not make useless write on my SSD.
I know that having the tmp files might be useful in some case, if you think it it necessary I could update the patch to write tmp file in case of failed test.