Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-6415: Use WARN log level for Metadata in system test #4375

Closed

Conversation

rajinisivaram
Copy link
Contributor

@rajinisivaram rajinisivaram commented Jan 2, 2018

When a log entry is appended to a Kafka topic using KafkaLog4jAppender, the producer.send operation may block waiting for metadata. This can result in deadlocks in a couple of scenarios if a log entry from the producer network thread is also at a log level that results in the entry being appended to a Kafka topic.

  1. Producer's network thread will attempt to send data to a Kafka topic and this is unsafe since producer.send may block waiting for metadata, causing a deadlock since the thread will not process the metadata request/response.
  2. KafkaLog4jAppender#append is invoked while holding the lock of the logger. So the thread waiting for metadata in the initial send will be holding the logger lock. If the producer network thread has.a log entry that needs to be appended, it will attempt to acquire the logger lock and deadlock.

This is a temporary workaround to avoid deadlocks in system tests by setting log level to WARN for Metadata in VerifiableLog4jAppender. The fix has been verified using the system tests log4j_appender_test.py which started failing when the info-level log entry was introduced.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@rajinisivaram
Copy link
Contributor Author

@ijuma Can you review please? Thanks.

@ijuma
Copy link
Contributor

ijuma commented Jan 2, 2018

Thanks @rajinisivaram. Did you consider increasing the log level of the log4j appender system test to warn? Perhaps that would be more contained.

@rajinisivaram
Copy link
Contributor Author

@ijuma Thank you for the review. I wasn't sure if there could be users who were doing something similar with INFO level logs, so I thought it was safer to reduce the log level to debug for that particular entry. But if that is unlikely, then it would be better to change the system test. What do you think?

Copy link
Contributor

@tedyu tedyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about disable syncSend until we figure out the solution ?

Change looks good as workaround.

Copy link
Member

@satishd satishd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not want to change logging level of specific code to make tests pass. Other log message can be added in future with a level which may fail the tests.

Agree with @tedyu on not setting syncSend as true because that seems to avoid deadlock scenario.

KafkaLog4jAppenderTest has SyncSend as false. So this issue may not occur in that test.
VerifiableLog4jAppender uses syncSend as false as here[1] which can be changed to false. Tests can be modified where this logger is used if it is required.

[1] https://github.com/apache/kafka/blob/trunk/tools/src/main/java/org/apache/kafka/tools/VerifiableLog4jAppender.java#L192

@ijuma
Copy link
Contributor

ijuma commented Jan 2, 2018

@satishd that option seems worse as we then don't test syncSend == true at all. Either way, we understand that all of the options in this PR are a short-term workaround until a more complete fix is available.

@satishd
Copy link
Member

satishd commented Jan 2, 2018

@ijuma Suggested solution was intended to be temporary. Your suggestion of having log level as warn in system tests as temporary solution makes sense as that also does not require the code to be changed and it covers SyncSend as true.

@rajinisivaram
Copy link
Contributor Author

Thank you for the reviews and suggestions. I have updated the code to log just Metadata at WARN level in VerifiableLog4jAppender used in system tests.

@rajinisivaram rajinisivaram changed the title KAFKA-6415: Avoid info log during metadata update in network thread KAFKA-6415: Use WARN log level for Metadata in system test Jan 3, 2018
Copy link
Contributor

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. LGTM, assuming you have verified that the test now passes.

@rajinisivaram
Copy link
Contributor Author

@ijuma Thank you for the review. Yes, I have verified that the system tests pass. Will wait for the builds to complete and then merge.

Copy link
Member

@satishd satishd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rajinisivaram +1 LGTM, thanks for your changes.

@rajinisivaram
Copy link
Contributor Author

retest this please

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants