Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SUREFIRE-2082] - Close file handles asap to prevent breaching the system's maximum number of open files #531

Closed
wants to merge 1 commit into from
Closed

[SUREFIRE-2082] - Close file handles asap to prevent breaching the system's maximum number of open files #531

wants to merge 1 commit into from

Conversation

spannm
Copy link
Contributor

@spannm spannm commented May 6, 2022

This pull request fixes bug SUREFIRE-2082 by closing stdout/stderr capture file handles of a huge test set as early as possible rather than at the completion of the whole set.

Following this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed
    for the change (usually before you start working on it). Trivial changes like typos do not
    require a JIRA issue. Your pull request should address just this issue, without
    pulling in other changes.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Format the pull request title like [SUREFIRE-XXX] - Fixes bug in ApproximateQuantiles,
    where you replace SUREFIRE-XXX with the appropriate JIRA issue. Best practice
    is to use the JIRA issue title in the pull request title and in the first line of the
    commit message.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Run mvn clean install to make sure basic checks pass. A more thorough check will
    be performed on your pull request automatically.
  • You have run the integration tests successfully (mvn -Prun-its clean install).

If your pull request is about ~20 lines of code you don't need to sign an
Individual Contributor License Agreement if you are unsure
please ask on the developers list.

To make clear that you license your contribution under
the Apache License Version 2.0, January 2004
you have to acknowledge this by using the following check-box.

@spannm spannm changed the title [SUREFIRE-2082] Close file handles asap to prevent breaching the system's maximum number of open files [SUREFIRE-2082] - Close file handles asap to prevent breaching the system's maximum number of open files May 6, 2022
@slawekjaranowski
Copy link
Member

@sman-81 please verify IT tests result ...

@spannm
Copy link
Contributor Author

spannm commented May 6, 2022

@sman-81 please verify IT tests result ...

wdym? mvn -Prun-its clean install is green locally on this PR.
I'm fairly new to the process. If you kindly provide some pointers, I can get it fixed, however at this point I do not see what's the issue.

@slawekjaranowski
Copy link
Member

Please look at result of build on GH

@slawekjaranowski
Copy link
Member

eg: https://github.com/apache/maven-surefire/actions/runs/2280904237
at the bottom of the page there are artifacts with logs

@spannm
Copy link
Contributor Author

spannm commented May 7, 2022

eg: https://github.com/apache/maven-surefire/actions/runs/2280904237 at the bottom of the page there are artifacts with logs

After combing through the logs I found test failures in Surefire1367AssumptionLogsIT.

I will try to reproduce using the same command line as the Github build job:
mvn --errors --batch-mode --show-version -D"invoker.streamLogsOnFailures" clean install -nsu -P run-its -Dit.test=Surefire1367AssumptionLogsIT

@slawekjaranowski
Branch master fails on the same tests. Therefore does not appear to be a regression introduced by my code changes.

@spannm
Copy link
Contributor Author

spannm commented May 7, 2022

git checkout master
mvn --errors --batch-mode --show-version -D"invoker.streamLogsOnFailures" clean install -nsu -P run-its -Dit.test=Surefire1367AssumptionLogsIT
[ERROR] Failures: 
[ERROR]   Surefire1367AssumptionLogsIT.shouldSeeLogsForked:66->verifyReportA:93 
[ERROR]   Surefire1367AssumptionLogsIT.shouldSeeLogsInPlugin:76->verifyReportA:93 
[ERROR]   Surefire1367AssumptionLogsIT.shouldSeeLogsParallelForked:46->verifyReportB:109 
[ERROR]   Surefire1367AssumptionLogsIT.shouldSeeLogsParallelInPlugin:57->verifyReportB:109 
[ERROR] Tests run: 4, Failures: 4, Errors: 0, Skipped: 0
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-failsafe-plugin:3.0.0-M6:verify (default) on project surefire-its: There are test failures.

@slawekjaranowski
Copy link
Member

@spannm
Copy link
Contributor Author

spannm commented May 7, 2022

... strange but build on CI are green

mysterious!
My repo is up-to-date with upstream apache:master.

Do the CI builds include integration tests?

Is the build green if you run this locally:

git checkout master
mvn --errors --batch-mode --show-version
    -D"invoker.streamLogsOnFailures" clean install -nsu -P run-its -Dit.test=Surefire1367AssumptionLogsIT

@slawekjaranowski
Copy link
Member

Do the CI builds include integration tests?

yes - with many oses

Is the build green if you run this locally:

yes

Apache Maven 3.8.5 (3599d3414f046de2324203b78ddcf9b5e4388aa0)
Maven home: /usr/local/Cellar/maven/3.8.5/libexec
Java version: 1.8.0_322, vendor: Homebrew, runtime: /usr/local/Cellar/openjdk@8/1.8.0+322/libexec/openjdk.jdk/Contents/Home/jre
Default locale: en_GB, platform encoding: UTF-8
OS name: "mac os x", version: "12.3.1", arch: "x86_64", family: "mac"


[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.maven.surefire.its.jiras.Surefire1367AssumptionLogsIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.914 s - in org.apache.maven.surefire.its.jiras.Surefire1367AssumptionLogsIT

git show -s
commit fea624d1503eadfcf95e979bdac7438db05669a5 (HEAD -> master, upstream/master, origin/master, origin/HEAD)

@Tibor17
Copy link
Contributor

Tibor17 commented May 7, 2022

Utf8RecodingDeferredFileOutputStream.java should not be changed from Java NIO to Java IO, it cost me some time to do it with NIO. You involved two interests in one method. Write is write operation. Close is close.
I can show you my proposal to keep the file open between testStarting and test*** but then the method writeTo has to reopen it for reading. But is still does not make sense to do this because I am reworking it. Currently it is supposed that the test events are sequential.

@spannm
Copy link
Contributor Author

spannm commented May 9, 2022

Utf8RecodingDeferredFileOutputStream.java should not be changed from Java NIO to Java IO, it cost me some time to do it with NIO.

The member file was back changed from Path to File, as the original code mostly operates on File (Path.toFile() called in serveral places).

You involved two interests in one method. Write is write operation. Close is close.

The original code opens the file in write, it writes the file in free. There was no clear separation of interests before. Why should my pull request have to introduce a separation?

I can show you my proposal to keep the file open between testStarting and test*** but then the method writeTo has to reopen it for reading. But is still does not make sense to do this because I am reworking it. Currently it is supposed that the test events are sequential.

wym?

The original code opens one file for stdout and one for stderr redirection for each test case (writing to stdout/err) and keeps the file handles open until completion of the whole test set <-- this is were the original code is flawed!
Example: a parameterized test with 2000 test iterations that write to stdout and stderr hogs 4000 file handles (in addition to the rest of surefire and system) and will fail on Linux assuming a maximum number of file descriptors (ulimit -n) setting of 4096.
I've seen it happen, thus the bug report and pull request. The PR resolves the issue nicely.

@Tibor17
Copy link
Contributor

Tibor17 commented May 12, 2022

Here are my changes #534, we can talk about it.

@spannm
Copy link
Contributor Author

spannm commented May 12, 2022

Here are my changes #534, we can talk about it.

Please feel free to go ahead and use your solution. I close my pull request.

@Tibor17
Copy link
Contributor

Tibor17 commented Jun 4, 2022

@sman-81
Sorry that I am late. I will go ahead by today or this week. I have to talk with my colleagues.

@Tibor17
Copy link
Contributor

Tibor17 commented Jun 9, 2022

@sman-81
Thx for contributing anyway!
T

@Tibor17 Tibor17 closed this Jun 9, 2022
@spannm spannm deleted the sman-81-SUREFIRE-2082-too-many-open-files branch June 10, 2022 06:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants