Propagate stderr from pyre.bin #863

vthemelis · 2024-05-21T22:03:37Z

Pre-submission checklist

I've ran the linters locally and fixed lint errors related to the files I modified in this PR. You can install the linters by running pip install -r requirements-dev.txt && pre-commit install
pre-commit run

Summary

Make it easier to diagnose issues like #860

Test Plan

Ran locally and got the full error message when running pyre

# poetry run pyre
ƛ No watchman binary found. 
To enable pyre incremental, you can install watchman: https://facebook.github.io/watchman/docs/install
ƛ Defaulting to non-incremental check.
ƛ Path does not exist for search path: SubdirectoryElement(root='/', subdirectory='tests')
ƛ /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin: /lib64/libm.so.6: version `GLIBC_2.29' not found (required by /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin)
ƛ /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin: /lib64/libc.so.6: version `GLIBC_2.29' not found (required by /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin)
ƛ /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin: /lib64/libc.so.6: version `GLIBC_2.33' not found (required by /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin)
ƛ /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin)
ƛ /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by /root/.cache/pypoetry/virtualenvs/bug-pyre-il7asoJj-py3.11/bin/pyre.bin)
ƛ Check command exited with non-zero return code: 1.

stroxler · 2024-05-23T21:30:31Z

client/commands/check.py

@@ -192,12 +192,17 @@ def _run_check_command(command: Sequence[str]) -> CheckResult:
            result = subprocess.run(
                command,
                stdout=subprocess.PIPE,
-                stderr=log_file.file,
+                stderr=subprocess.PIPE,


I wonder if it would make more sense to leave this as a file, but dump the file whenever the return code is nonzero.

This might be better if the binary were ever to dump a huge amount to stderr, plus it makes it possible to cat the file if you want to check for messages as the command is running, and it would be enough to fix the issue we just saw where we weren't seeing the glibc link error.

What do you think?

How about I stream the logs to both the file and the python wrapper STDERR? I think it's always worth propagating the stderr as it makes debugging stuff much easier.

Oh wait I think I see what's going on here.

Pyre is piping the stdout, which will only contain type errors, but it is using the log file for stderr which includes all the ocaml logging as well as this backtrace.

So we definitely do want the log file, and we probably don't want to pipe the stderr directly because the python side is doing some prettifying (especially in interactive mode).

I think the problem must be that sometimes not all the logs actually get spit out before we get to line 224 in the case where the binary logs a stack trace and exits nonzero. The problem might be with the assumptions in this code around start.background_logging and how eagerly it logs

@grievejia Do you remember how this works exactly?

I'm not an expert on context managers, but what we saw was that when the pyre binary crashed with a fairly low-level stack trace (a glibc version mismatch), Pyre was never displaying the stderr output.

I wonder if the problem might actually be in the client/log/log.py:

if <big switch on log section from pyre> ... else: LOG.debug(line)

this may mean we are doing debug-level logging of stack traces and so they are invisible by default

@vthemelis and / or I can dig into this but wondering if you remember off the top of your head

Ha, indeed, that seems to be the case (which is why the same error comes up as Debug in this older issue).

Looks like if the log-line doesn't have the usual date-time SEVERITY message format, pyre will be using LOG.Debug to log this line or (I think) the severity of the previously logged line.

Even this though doesn't quite make sense as I can see DEBUG messages in the output provided in #860 yet the libc errors do not seem to have propagated.

I agree that thread joining is the underlying problem, but given that it's a context manager I think it may be that introducing a delay is the easiest fix. I'd be fine with adding a print or time.sleep statement anywhere that fixes the issue for now, with a comment noting the issue it's addressing (could include a link to this discussion).

I slightly prefer using a delay versus delete=False because we probably don't want the backend file to persist if we can avoid it, it's really only there to pipe logs to the frontend. But it's a temp file that will get cleaned up eventually anyway, so maybe my concern is misplaced here.

@samwgoldman might have more ideas on how to join the threads in a more principled way, I'm very much not a threading expert. But I'd like to get a short-term fix in soon and we can improve on it later rather than block this PR, anything that fixes the dropped logs issue is great work and a fix we want sooner than later!

I'd really prefer to avoid adding a sleep call. Can we confirm where the log is being dropped? Based on my reading of the code what should be happening:

pyre.bin's stderr is redirected to a log file

the log file is being "tail"ed by log.log_tailer and converted into an Iterable for each line

the StreamLogger prefixes each line and writes to stderr via the LOG logger

I'm not super familiar with the code in (2) and (3), but honestly it looks overcomplicated. I think it's nice to have a separate thread to read from the file, but that thread should just readline on the file descriptor until EOF. When we join that thread, the join should wait for the thread to hit that EOF.

Sorry, thinking more about this, I think we should move this code to use the lower-level subprocess.Popen code, then read from stderr and stdout in two separate threads. This is what subprocess.communicate does (and what subprocess.run does in turn).

The thread reading from stderr can then output to the log file and transform+log to stderr. This is much nicer than redirecting to the file and then trying to also tail that file.

After a little more looking at this, I think the real issue isn't lack of a join, since we do join. I think the real issue is that StreamLogger has two modes it runs in, a best-effort tailing mode and a proper logging mode.

The tailing mode is used in some situations where we start a pyre daemon, e.g. start._run_in_background, and the logger is currently optimized for that case by having self._should_stop_reading_stream which gets set on __exit__, so that whenever the main thread decides we're done we stop reading logs.

But the proper logging mode is what we want for shell commands like pyre check, and when in that mode we really shouldn't be using _should_stop_reading_stream at all, we should run to completion.

This suggests either using two classes or an attribute to differentiate the cases

Whoops I missed Sam's second comment.

Using Popen and two threads (or even just one thread, stdout is going to be used in the foreground process anyway) should work well here too, and then the existing file-based StreamLogger could be used only for deamonized logic like start._run_in_background.

We can't completely rid of the file-based approach because of the need to use a file in the daemon case; the existing code is trying to use the same approach for both which is indeed questionable.

Propagate stderr from pyre.bin

3d87df4

facebook-github-bot added the CLA Signed label May 21, 2024

vthemelis mentioned this pull request May 22, 2024

Pyre returning ERROR Check command exited with non-zero return code: 1 when running with poetry #860

Closed

stroxler reviewed May 23, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Propagate stderr from pyre.bin #863

Propagate stderr from pyre.bin #863

vthemelis commented May 21, 2024

stroxler May 23, 2024

vthemelis May 29, 2024

stroxler May 29, 2024

stroxler May 29, 2024

vthemelis May 29, 2024

stroxler Jun 1, 2024

samwgoldman Jun 3, 2024

samwgoldman Jun 3, 2024

stroxler Jun 3, 2024

stroxler Jun 3, 2024 •

edited

Propagate stderr from pyre.bin #863

Are you sure you want to change the base?

Propagate stderr from pyre.bin #863

Conversation

vthemelis commented May 21, 2024

Summary

Test Plan

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stroxler Jun 3, 2024 • edited

Choose a reason for hiding this comment

stroxler Jun 3, 2024 •

edited