Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[馃悰 Bug]: ExternalProcess produces a huge amount of excessive records which prevents to fetch/read CI logs #13096

Closed
mykola-mokhnach opened this issue Nov 4, 2023 · 20 comments

Comments

@mykola-mokhnach
Copy link

mykola-mokhnach commented Nov 4, 2023

What happened?

We use SIGINT to kill Appium process created using process builder (e.g. graceful shutdown). This closes the process stdout, but the process remains alive for a while.

There is this code in the ExternalProcess:

        new Thread(
                () -> {
                  InputStream input = process.getInputStream();
                  // use the CircularOutputStream as mandatory, we know it will never raise a
                  // IOException
                  OutputStream output = new MultiOutputStream(circular, copyOutputTo);
                  while (process.isAlive()) {
                    try {
                      // we must read the output to ensure the process will not lock up
                      input.transferTo(output);
                    } catch (IOException ex) {
                      LOG.log(
                          Level.WARNING,
                          "failed to copy the output of process " + process.pid(),
                          ex);
                    }
                  }
                })
            .start();

As a result the above while loop spins for a few seconds thus producing a huge amount of warnings that significantly increase the overall log size, so it cannot be even fetched from the CI properly or read in the browser. A single record looks like:

Nov 04, 2023 8:10:01 AM org.openqa.selenium.os.ExternalProcess$Builder lambda$start$0
WARNING: failed to copy the output of process 63950
java.io.IOException: Stream closed
	at java.base/java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:168)
	at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:334)
	at java.base/java.io.InputStream.transferTo(InputStream.java:782)
	at org.openqa.selenium.os.ExternalProcess$Builder.lambda$start$0(ExternalProcess.java:209)
	at java.base/java.lang.Thread.run(Thread.java:833)

How can we reproduce the issue?

Run Appium Java client CI. Example: https://github.com/appium/java-client/actions/runs/6751224992/job/18359734036?pr=2067

Relevant log output

https://github.com/appium/java-client/actions/runs/6751224992/job/18359734036?pr=2067

Operating System

macos

Selenium version

4.15.2

What are the browser(s) and version(s) where you see this issue?

Appium

What are the browser driver(s) and version(s) where you see this issue?

Are you using Selenium Grid?

no

Copy link

github-actions bot commented Nov 4, 2023

@mykola-mokhnach, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@mykola-mokhnach
Copy link
Author

cc @diemol @titusfortner

@titusfortner
Copy link
Member

Can you add a daily run of your CI to use the nightly jar? It would be useful to know when this started breaking and try to address it before release.

We can try to make sure we wait a day before making changes before a release, too just to make sure.

@joerg1985
Copy link
Member

I think there is nothing that could be done to avoid this message. In similar cases JDK issues had been rejected, e.g. this.

As a workaround i would suggest to sleep for ~800 milliseconds before resuming to read the input stream, this should significantly reduce the number of warnings logged.

@mykola-mokhnach
Copy link
Author

I'm not quite sure if transferTo can resurrect itself once IOException has happened. Although putting a sleep would improve the situation anyway .

@mykola-mokhnach
Copy link
Author

mykola-mokhnach commented Nov 4, 2023

Also the documentation for transferTo says:

If an I/O error occurs reading from the input stream or writing to the output stream, then it may do so after some bytes have been read or written. Consequently the input stream may not be at end of stream and one, or both, streams may be in an inconsistent state. It is strongly recommended that both streams be promptly closed if an I/O error occurs.

@krmahadevan
Copy link
Contributor

@joerg1985 - What if we resorted to just warning once (the first time the error occurs)? I am going to assume that between subsequent failures, the pid and the exception are going to be the same.

Something like this ?

boolean warned = false;
while (process.isAlive()) {
  try {
    // we must read the output to ensure the process will not lock up
    input.transferTo(output);
  } catch (IOException ex) {
    if (!warned) {
      warned = true;
      LOG.log(Level.WARNING,"failed to copy the output of process " + process.pid(),ex);
    }
  }
}

@joerg1985
Copy link
Member

@krmahadevan i think the best option is to close the streams like the documentation does suggest. Swallowing the exception might lead to a high cpu load until the process has terminated.

@titusfortner is this okay for you or do you have other ideas?

@titusfortner
Copy link
Member

My only consideration here is that we should not be throwing warnings in logs if there isn't something a user can do to fix it. I don't see why the user needs to know this at all? Can we log as debug?

@joerg1985
Copy link
Member

When the loop is removed, the stream should be read to it's end without throwing a exception at all. The exception is raised when we try to read the allready closed stream again.

In general it make sense to log a warning if a exception does happen, because something unexpected happens and would propably be ignored when the debug level is used.

@titusfortner
Copy link
Member

But if it is happening everywhere for every run, it seems like we do expect it?
If it happens in the course of expected/passing executions (i.e., isn't always affecting the results of the program), and/or if there isn't anything the user can do to not get the message (e.g., it is a Selenium implementation detail), then it is extra noise we should avoid giving the user.

I do not know enough about how this works to know the right way to fix, only the above input on what I user should get to see in logs. 馃榿 So whatever you think is best

@joerg1985
Copy link
Member

Okay, so i will fix this to the best of my knowledge. This is no permanent decision, the fix might be changed if there are irrelevant messages logged.

@joerg1985
Copy link
Member

@mykola-mokhnach i just pushed a fix to this, thanks for reporting it!

@titusfortner
Copy link
Member

@joerg1985 this fix broke the CI, can you take a look? Thanks.

joerg1985 added a commit that referenced this issue Nov 6, 2023
joerg1985 added a commit that referenced this issue Nov 6, 2023
@joerg1985
Copy link
Member

@titusfortner thank you for the hint, i am currently looking into it

@joerg1985
Copy link
Member

I removed closing the streams, this somehow disturbs the process.
The CI should now flip to green again, will keep an eye on it.

@titusfortner
Copy link
Member

Is it a straight revert and this issue still needs fixing?

@joerg1985
Copy link
Member

The loop is still removed, so this issue should still be fixed.

@titusfortner
Copy link
Member

Then I'll close this; we can re-open if we still see it as an issue with the nightly or after next release.

Copy link

github-actions bot commented Dec 7, 2023

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 7, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants