Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LISAMZA-27395 removing the current recursive call prevention logic #1641

Merged
merged 2 commits into from
Nov 17, 2022

Conversation

jia-gao
Copy link
Contributor

@jia-gao jia-gao commented Nov 8, 2022

Issue:
We observed logging data loss happens in StreamAppender
The RC is the current implementation of the “append()” method uses one AtomicBoolean variable to detect if the thread is called recursively, which could lead to race conditions when "append()" is invoked by multiple threads and could cause log data loss.

we have reproduced the data loss issue locally in the multithread logging scenarios.

When the current implementation was introduced in samza, log4j1 framework didn't have recursive call prevention logic at the framework layer. More context can be found in SAMZA-723

However, the current implementation in samza has the side effect that data loss might occur in the parallel logging scenario. We believe it is a bug existing for a long time but never got noticed.

Change:
Samza now is upgraded to log4j2 framework which applies recursive call prevention at the framework layer

So we don't need recursive call prevention logic at Samza

API Changes:
Remove a metric recursiveCalls from StreamAppenderMetrics since it is no longer used

Test Done:
./gradlew build

@mynameborat
Copy link
Contributor

Is the default support to prevent recursion not available when this change was introduced or did we upgrade to newer version of log4j that adds this prevention at the framework layer?

@jia-gao
Copy link
Contributor Author

jia-gao commented Nov 14, 2022

Is the default support to prevent recursion not available when this change was introduced or did we upgrade to newer version of log4j that adds this prevention at the framework layer?

log4j1 doesn't have this prevention that's when we added the change in samza. Later we upgraded to log4j2 which has the prevention at the framework layer.

@@ -46,7 +43,6 @@ public class StreamAppenderMetrics extends MetricsBase {
public StreamAppenderMetrics(String prefix, MetricsRegistry registry) {
super(prefix + "-", registry);
bufferFillPct = newGauge("buffer-fill-percent", 0);
recursiveCalls = newCounter("recursive-calls");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this metric published in some Samza doc? If yes, let's remove it from there as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't find StreamAppenderMetrics published in any oss samza doc
https://samza.apache.org/learn/documentation/latest/container/metrics-table.html

@shanthoosh
Copy link
Contributor

@jia-gao As discussed offline, it would be great if we can add a unit-test for validating these changes in this patch.

@jia-gao
Copy link
Contributor Author

jia-gao commented Nov 17, 2022

validating

@jia-gao As discussed offline, it would be great if we can add a unit-test for validating these changes in this patch.

Added unit tests

@shanthoosh
Copy link
Contributor

Thanks for adding the tests @jia-gao . After going through the linkedin documents and JIRA tickets, it seems like we are just proceeding with the approach finalized as a part of the investigation.

We had just removed the recursive call check from the log4j2-appender implementation and we're relying upon native-recursive check from log4j2. Mostly the changes look good to me, let us wait for the build to pass and we can merge this in!

Copy link
Contributor

@shanthoosh shanthoosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the changes! LGTM.

@shanthoosh shanthoosh merged commit d5d3d98 into apache:master Nov 17, 2022
ehoner pushed a commit to ehoner/samza that referenced this pull request Apr 11, 2023
…pache#1641)

* LISAMZA-27395 removing the current recursive call prevention logic since it doesn’t work as expected

* Add unit tests for streamappender change
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants