
Bugfix: Log consumers are now called with exactly one complete log line #5854

Merged
15 commits merged into testcontainers:main from the bugfix/log-stream-lines branch on Feb 16, 2023

Conversation

SgtSilvio
Contributor

Fixes #5843: log consumers are now called with exactly one complete log line

Motivation
Docker logs are delivered as a stream, and the frames in that stream are not necessarily split at newline characters, so a single frame may contain partial log lines.

Changes

  • Line splitting, partial-line buffering, and line merging now happen independently of the stream type (RAW, STDOUT, STDERR)
  • OutputFrame consistently contains no newline characters (independent of TTY mode)
  • ToStringConsumer now adds the newlines itself
  • Slf4jLogConsumer no longer needs to remove any newlines

Also fixes #4110, #455
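
To illustrate the intended behavior, here is a minimal usage sketch (not part of this PR's changes; the image, command, and logger name are placeholders):

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.testcontainers.containers.GenericContainer;
    import org.testcontainers.containers.output.OutputFrame;

    public class LogConsumerSketch {
        private static final Logger LOG = LoggerFactory.getLogger("container");

        public static void main(String[] args) {
            try (GenericContainer<?> container = new GenericContainer<>("alpine:3.17")
                    .withCommand("sh", "-c", "echo first line; echo second line; sleep 1")
                    // With this fix, each callback receives exactly one complete log line
                    .withLogConsumer((OutputFrame frame) ->
                        LOG.info("[{}] {}", frame.getType(), frame.getUtf8String().trim()))) {
                container.start();
            }
        }
    }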

@SgtSilvio
Contributor Author

Can someone approve the missing workflows? The code should now be ready.

@SgtSilvio
Contributor Author

Sorry to ask again, but now no more checks/tests should fail (hopefully 😁).

@kiview
Member

kiview commented Sep 27, 2022

Thanks for the PR @SgtSilvio, I think this is an important topic. We will look into it soon.

@SgtSilvio
Contributor Author

I am happy to explain in more detail and answer any questions to help move this forward.
We are heavily affected by this bug because of failing/flaky tests.

@SgtSilvio
Contributor Author

I updated the branch on GitHub and now the workflows need approval again.
I don't want to put pressure on you, but it would be nice to get a rough estimate of when you plan to incorporate this fix, just so we know whether we should add another workaround on our side.

A-Imal
A-Imal previously approved these changes Oct 10, 2022

LGTM

@SgtSilvio force-pushed the bugfix/log-stream-lines branch 3 times, most recently from 2a03c34 to f9c0d7e on October 17, 2022
@SgtSilvio force-pushed the bugfix/log-stream-lines branch 2 times, most recently from 1ec6467 to aa8f15c on October 22, 2022
@SgtSilvio
Contributor Author

Any updates?

@SgtSilvio
Contributor Author

If you have any questions or concerns I am happy to explain and discuss.

@eddumelendez
Member

@SgtSilvio thanks for your contribution! I will be taking a look at it next week. Sorry for the delay.

@perlun
Contributor

perlun commented Nov 16, 2022

Ping @eddumelendez, just a friendly reminder that we are eagerly waiting on this fix. 🙂

@SgtSilvio
Contributor Author

SgtSilvio commented Dec 7, 2022

Can we help in any way to get this merged? I will also take care of updating the PR.
I always assumed you were too busy, so I did not want to ping you unnecessarily, but other PRs like #6239 have been merged, so what can I do to move this PR forward?
@kiview @eddumelendez

@kiview
Member

kiview commented Dec 7, 2022

Hi @SgtSilvio, sorry for not yet reviewing and merging this. #6239 was a trivial change with a minimal effect radius, which is why it was merged easily, while PRs that touch the core of the library sometimes tend to take longer to get reviewed. From the change in CosmosDBEmulatorContainer we can assume this is possibly a breaking change for our users as well, so we will need a bit of time to look into it and prioritize it accordingly.

I will discuss with @eddumelendez when we will find the time to look into this, sorry again and thanks for your patience.

@SgtSilvio
Contributor Author

Thanks @kiview, that is perfectly valid feedback if you have any concerns that the changes might be too breaking. Although I rather see this as broken behavior that was never reliable, and fixing broken behavior is imho not a breaking change, I can understand if you see this differently.
If you like, I could take another look to restore the old behavior as much as possible and introduce new APIs for the fix instead, for example a method getBytesWithoutNewLine.

My comment was not about putting any stress on you, just about getting some feedback from you on how we can help you get this done.

@SgtSilvio
Contributor Author

Any updates? Should I invest more time into it? Can you give me some feedback?

@eddumelendez
Member

Sorry for the delay on this one. We are doing our best to work on new features and triage new issues/PRs while also taking a look at the backlog.

@SgtSilvio I just noticed there are some conflicts; if you can resolve them, it would be appreciated. Also, regarding "We are heavily affected by this bug because of failing/flaky tests": can you tell us more about when you are having this issue? Just asking to get more context.

@alexandrklimov

alexandrklimov commented Jan 16, 2023

Hi all,

Is there any update on this issue?

I suspect it significantly affects integration with Podman.
At least, LogMessageWaitStrategy doesn't seem to work correctly for PostgreSQLContainer on Podman - I've been forced to override it with HostPortWaitStrategy as a workaround.
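
For reference, a minimal sketch of that workaround, assuming the standard wait-strategy API (the image tag is just a placeholder):

    import org.testcontainers.containers.PostgreSQLContainer;
    import org.testcontainers.containers.wait.strategy.Wait;

    PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:14-alpine")
        // Replace the default log-message wait strategy with a plain host-port wait
        .waitingFor(Wait.forListeningPort());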

@SgtSilvio
Contributor Author

If you like, I could take another look to restore the old behavior as much as possible and introduce new APIs for the fix instead, for example a method getBytesWithoutNewLine.

I now restored the old behavior of the methods OutputFrame.getBytes and OutputFrame.getUtf8String, and added a new method OutputFrame.getUtf8StringWithoutLineEnding.
@kiview This should now resolve all your concerns about this being a breaking change.

From the change in CosmosDBEmulatorContainer we can assume this is possibly a breaking change for our users as well, so we will need a bit of time to look into it and prioritize it accordingly.

Even the change to CosmosDBEmulatorContainer is now reverted. Tests should pass (though I did not run them yet).
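
For illustration, a minimal sketch of how a consumer could use the new method (the println sink is just a placeholder):

    container.withLogConsumer(frame ->
        // getBytes()/getUtf8String() keep their previous behavior; the new method
        // returns the same content with the trailing line ending stripped
        System.out.println("[" + frame.getType() + "] " + frame.getUtf8StringWithoutLineEnding()));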

@SgtSilvio
Contributor Author

Now all tests should pass.

@SgtSilvio
Contributor Author

Can someone approve the workflows?

@SgtSilvio
Contributor Author

Is anything else required from my side? For example, do you require more comments on the code? Or should I just wait for your review?

@alexandrklimov

Hi team,

Are there any plans for releasing this fix?

@SgtSilvio
Contributor Author

Sorry to ping again, but when will this be reviewed?

@eddumelendez added this to the next milestone on Feb 16, 2023
@eddumelendez merged commit 3ca5e67 into testcontainers:main on Feb 16, 2023
@eddumelendez
Member

Thanks for your contribution, @SgtSilvio, and sorry for the long wait on this one. This is finally merged into the main branch and will be part of the next release.

@perlun
Contributor

perlun commented Feb 16, 2023

@eddumelendez Thanks a lot for this, appreciated! 🙏 Now, if we could just get #3018 merged into the next release as well, I would be completely happy... 😂 😉

@alexandrklimov

@eddumelendez I'm sorry for the disturbance, but could you please provide some info about the upcoming release date?
We are waiting impatiently for this fix because of our Podman usage.

@perlun
Contributor

perlun commented Apr 12, 2023

@alexandrklimov The fix is included in https://github.com/testcontainers/testcontainers-java/releases/tag/1.18.0, released last week. 🎉

However, what I don't understand is why we're now seeing \r\n (CR+LF) line endings when calling org.testcontainers.containers.ContainerState#getLogs() on a Tomcat-based container. 🤔 Both the container and the host are Linux-based in this case. We split the log output with String[] logLines = logsString.split("\n"), which is why only the \r remains in the strings in this array.
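
For context, our splitting code looks roughly like this (variable names are approximate):

    String logsString = container.getLogs();      // comes back with CR+LF line endings here
    String[] logLines = logsString.split("\n");   // splitting on LF only leaves a trailing \r on each element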


@SgtSilvio
Contributor Author

@perlun I looked at my changes again, and this must be caused by something else (I can't see how the code could add a \r).

@kiview
Member

kiview commented Apr 13, 2023

@perlun Is this happening from within a Windows process (JVM running on Windows)?

@perlun
Contributor

perlun commented Apr 14, 2023

Is this happening from within a Windows process (JVM running on Windows)?

@kiview Nope, interestingly not. Both the container and the host are Linux-based in this case.

I debugged this a bit and concluded that it comes out as CR+LF all the way here, in org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder#executeAndStream.


And given the fact that HTTP uses CR+LF (not LF only), this is perhaps not so strange after all. 🤔 The data is just propagated from the DockerHttpClient.Response all the way, and this byte buffer contains CR+LF => log lines include CR+LF.

Here's what I think is an approximate description of the history regarding this:

In other words, this change is very likely not "intentional" but a side effect of how the handling of the log output has changed.


One (seemingly "nasty") way to deal with it would be to let the processFrame method below silently discard all CR characters. It would probably be a quite easy and reasonable way to fix this, and it would make the log lines more natural for Linux users. (TODO: should we detect Windows and still return CR+LF log lines in that case? 🤔)

    void processFrame(final byte[] b) {
        int start = 0;
        int i = 0;
        while (i < b.length) {
            switch (b[i]) {
                case '\n':
                    // LF completes a line: flush everything up to and including the LF
                    buffer.write(b, start, i + 1 - start);
                    start = i + 1;
                    consume();
                    lastCR = false;
                    break;
                case '\r':
                    // A second CR in a row means the previously buffered line (ending in CR) is complete
                    if (lastCR) {
                        consume();
                    }
                    buffer.write(b, start, i + 1 - start);
                    start = i + 1;
                    lastCR = true;
                    break;
                default:
                    // Any other byte following a lone CR means that CR terminated a line
                    if (lastCR) {
                        consume();
                    }
                    lastCR = false;
            }
            i++;
        }
        // Keep the trailing partial line buffered until the next frame arrives
        buffer.write(b, start, b.length - start);
    }

@SgtSilvio
Contributor Author

And given the fact that HTTP uses CR+LF

You probably mean that HTTP uses CRLF to separate its headers. But the body contains arbitrary bytes as sent by the sender.

@SgtSilvio
Contributor Author

SgtSilvio commented Apr 14, 2023

Seems to be the behavior of tty: https://stackoverflow.com/questions/55034214/why-do-docker-run-t-outputs-include-r-in-the-command-output
So I would say this is expected behavior. An option to normalize line endings to LF only could be added, but please don't change the log output by default.
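
Until such an option exists, normalization could also be done on the consumer side; a minimal, purely illustrative sketch (the logging sink is a placeholder):

    container.withLogConsumer(frame -> {
        // Strip any carriage returns that TTY mode may have introduced
        String line = frame.getUtf8StringWithoutLineEnding().replace("\r", "");
        myLogger.info(line); // 'myLogger' is a placeholder for whatever sink is used
    });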

@perlun
Contributor

perlun commented Apr 18, 2023

You probably mean that HTTP uses CRLF to separate its headers. But the body contains arbitrary bytes as sent by the sender.

Ahh, you're right. 🙈 Now I'm even more confused than before... 😅

Seems to be the behavior of tty: https://stackoverflow.com/questions/55034214/why-do-docker-run-t-outputs-include-r-in-the-command-output

Big thanks for digging this up for us, greatly helpful. ❤️

So I would say this is expected behavior. An option to normalize line endings to LF only could be added, but please don't change the log output by default.

I agree with your thinking. If it comes out from Docker with CR+LF (when running in TTY mode), it feels weird that Testcontainers would mangle the log output without letting the user know about it.

The reason why we used TTY mode to begin with is this, quoting @slovdahl from #4110:

Might be worth adding some context on why we're using TTY as well. We previously had some spurious test failures with the Tomcat container. We never really found a way of reliably reproducing it outside CI (maybe even not at all?), but we believe that it was caused by a WaitStrategy that looked for a certain row logged by Tomcat when it had successfully started, and that that log row was sometimes split up on two different lines in the log output. After adding..

Since this was likely caused by #5843, which is now fixed, we probably no longer need to run in TTY mode. I'll try that and see; this is obviously a much simpler and better fix than making any changes in Testcontainers.
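
For anyone in the same situation, a minimal sketch of what dropping TTY mode could look like if it was enabled via a create-command modifier (the image tag is a placeholder; removing the original withTty(true) modifier has the same effect):

    GenericContainer<?> tomcat = new GenericContainer<>("tomcat:9.0")
        // No TTY requested; the log stream then keeps the application's own LF line endings
        .withCreateContainerCmdModifier(cmd -> cmd.withTty(false));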

@perlun
Contributor

perlun commented Apr 18, 2023

I'll try that and see; this is obviously a much simpler and better fix than making any changes in Testcontainers.

Verified; this fixes the problem with CR+LF. ✔️

I'll let you know in case we see any more (logging/line ending-related) failures, but hopefully this should be the end of it. 👍
