Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZOOKEEPER-4427 Migrate to Logback #1793

Closed
wants to merge 20 commits into from

Conversation

anmolnar
Copy link
Contributor

@anmolnar anmolnar commented Jan 5, 2022

This is the first commit of Logback migration task.

  1. Replaced log4j dependencies with logback alternatives,
  2. Replaced log4j.properties files with logback.xml
  3. Refactored unit tests to use logback for log capturing.

This patch doesn't cover the zookeeper-contrib projects, only the main codebase.

Copy link
Member

@ctubbsii ctubbsii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the XML is far less convenient and readable than the properties files for configuring logging. I do like the consolidation of the test code that uses logging, and it might make sense to use logback for that implementation for now. But, I'm still not a fan of logback over log4j2, as I think it's outdated, and a downgrade in functionality. If the goal is to trade off functionality for simplicity, then I don't think this achieves that, especially with the use of XML. slf4j-simple might achieve it better. But, I'd still prefer log4j2 as the default slf4j binding at runtime.

pom.xml Outdated
Comment on lines 605 to 611
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-core</artifactId>
<version>${mockito.version}</version>
<scope>test</scope>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a good idea to put the <scope> in the <dependencyManagement> section of the POM, because that will propagate, and require the <scope>compile</scope> to be explicitly added (normally, it's not needed, because it's the default scope). It's better to just manage the version in the <dependencyManagement> section, and defer adding a scope until the dependency is actually used.

It is unlikely that these dependencies are going to be used in a scope other than test, but the only reason to put the scope here is to omit it later when it is used. However, omitting it later will make it appear as though the scope is used is the compile scope, so it doesn't really have any benefit to adding it here.

Comment on lines +78 to +83
<groupId>ch.qos.logback</groupId>
<artifactId>logback-core</artifactId>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These really should be runtime scoped dependencies, since they should not actually be used anywhere in the code at compile time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they are needed for tests, they could use the test scope explicitly here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. @anmolnar is there a reason why these are not in 'test' scope?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one makes me confused: we didn't do this for the log4j1 reference. If you check it in this file, it's not "test" scoped. Also, if I make logback to "test" scoped, it won't be included in the binary distribution. Is that okay?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... you are right. I didn't think this through. We still need to put logback to the binary distribution as we need some default log engine.

Copy link
Member

@ctubbsii ctubbsii Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anmolnar Yeah, it wasn't that way for log4j1, because log4j1 was used at compile time. However, part of this is to complete the migration to slf4j-api at compile time, so the runtime implementations can be chosen by users. So, no actual implementation jar should need to be compile scoped anymore. They can just be runtime (or test if it is only needed to compile the tests; runtime implies test). The main point is that they aren't needed at compile time, because slf4j-api is used at compile time instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runtime dependencies should still be included in the binary distribution... I'd have to check to see how you are constructing the binary distribution. But, you shouldn't put it on the compile class path just to put it in the binary distribution. There's better ways to include it in the distribution than changing the scope.


import java.io.ByteArrayOutputStream;

public class LoggerTestTool<T> implements AutoCloseable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting all the log capture test code in one place was a really nice decision. The implementation is pretty simple, too. Not sure what the equivalent in log4j2 or another slf4j binding would look like. So, logback here makes sense, if the alternatives would be substantially worse.

An alternative would be to make better use of mocking in the tests that use this. Capturing the logs is not usually an ideal way to do unit testing. But, this is a nice simple replacement for existing tests.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 as far as @ctubbsii comments about pom.xml are addressed

@anmolnar
Copy link
Contributor Author

anmolnar commented Jan 5, 2022

@ctubbsii Thanks for the useful feedback, I really appreciate it. Isn't log4j2's config file also XML based? Afaik log4j1 was the only one which still uses properties file, but I'm not sure.

I also need some help with the contrib projects, because currently they don't build. The reason is that I removed log4j dependencies from the parent pom and can't figure out how to put them back for only the. contrib projects

Thanks!

@ctubbsii
Copy link
Member

ctubbsii commented Jan 5, 2022

@ctubbsii Thanks for the useful feedback, I really appreciate it. Isn't log4j2's config file also XML based? Afaik log4j1 was the only one which still uses properties file, but I'm not sure.

log4j1 supported XML and properties, but some options were only available with XML. With log4j2, everything can be done with either XML, JSON, properties, or even YAML. We use properties files with log4j2 for Accumulo (https://github.com/apache/accumulo/blob/main/assemble/conf/log4j2.properties)

@ctubbsii
Copy link
Member

ctubbsii commented Jan 5, 2022

I also need some help with the contrib projects, because currently they don't build. The reason is that I removed log4j dependencies from the parent pom and can't figure out how to put them back for only the. contrib projects

To fix the contribs, their POMs use the zookeeper-contrib parent POM, which itself points to the parent POM that you edited. You can leave the log4j stuff in the dependencyManagement section. That section only manages dependencies when they are used. Since they are still used in the contribs, you still need to manage them somewhere. Adding them back to the dependencyManagement section won't add them back as dependencies to the modules that moved to logback, so it's fine to have them there.

Copy link
Contributor

@cnauroth cnauroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anmolnar , thanks for picking this up. It's looking good so far! I entered a few comments.

Some additional areas of the codebase to consider, not currently in the pull request:

  • Shell scripts under bin/.
  • owaspSuppressions.xml can remove Log4J CVEs.
  • Minor: the native client has a comment mentioning Log4J in zk_log.c.
  • Various documentation under zookeeper-docs will need a rewrite to discuss Logback instead of Log4J.
  • Possibly some things in zookeeper-recipes reference Log4J?
  • zookeeper-server NOTICE.txt should be updated for Logback instead of Log4J. I'm not sure if the phrasing needs to take care to indicate the license terms. Logback is dual-licensed under LGPL and EPL. LGPL is a category X license, disallowed in Apache project distributions, so we need to take the dependency under EPL intsead.
  • zookeeper-server still has a few references to Log4J in the JMX functionality, mostly just comments and method naming, but good to clean up. (See grep -ri log4j zookeeper-server for the full list.)

I know you already said that you were treating contrib as separate scope. I think it would be fair to split some of this other stuff out into separate tickets/pull requests too.

@anmolnar
Copy link
Contributor Author

anmolnar commented Jan 6, 2022

@anmolnar , thanks for picking this up. It's looking good so far! I entered a few comments.

Some additional areas of the codebase to consider, not currently in the pull request:

  • Shell scripts under bin/.
  • owaspSuppressions.xml can remove Log4J CVEs.
  • Minor: the native client has a comment mentioning Log4J in zk_log.c.
  • Various documentation under zookeeper-docs will need a rewrite to discuss Logback instead of Log4J.
  • Possibly some things in zookeeper-recipes reference Log4J?
  • zookeeper-server NOTICE.txt should be updated for Logback instead of Log4J. I'm not sure if the phrasing needs to take care to indicate the license terms. Logback is dual-licensed under LGPL and EPL. LGPL is a category X license, disallowed in Apache project distributions, so we need to take the dependency under EPL intsead.
  • zookeeper-server still has a few references to Log4J in the JMX functionality, mostly just comments and method naming, but good to clean up. (See grep -ri log4j zookeeper-server for the full list.)

I know you already said that you were treating contrib as separate scope. I think it would be fair to split some of this other stuff out into separate tickets/pull requests too.

Thanks @cnauroth , that's a very nice list of outstanding items. Very useful, because it contains a lot of stuff that I didn't consider previously. First, I'd like to get the buy-in from the community, then I'll polish up as much as I can fit into this PR.

@anmolnar
Copy link
Contributor Author

anmolnar commented Jan 18, 2022

The following commits have been pushed:

  • Clean-up shell scripts under bin/. (this is a new limitation: root logger is no longer configurable via system property)
  • Remove Log4J CVEs from owaspSuppressions.xml
  • Added jmxConfigurator to logback.xml as commented suggestion to register logback beans (not enabled by default)

Still outstanding:

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great

@sourabhsparkala
Copy link

Hello All,

Requesting you to confirm if this is the fix to completely remove log4j from zookeeper? If yes, could you please provide a planned release date for this fix?

Thanks and Regards
Sourabh

@anmolnar
Copy link
Contributor Author

@sourabhsparkala Yes, this patch will completely remove log4j from ZK. Unfortunately we don't have a release date yet.

@anmolnar
Copy link
Contributor Author

anmolnar commented Jan 19, 2022

Almost there @eolivelli hold your last approval for a minute. ;)
I've submitted docs changes recently and would like to take care of the licenses file based on our LEGAL ticket. Stay tuned.
Recipes and contrib projects will come in a follow-up patch.

Copy link
Contributor

@cnauroth cnauroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for incorporating the feedback! I see just a few more things that would be nice to include before committing this.

  • The top-level pom.xml still contains some references to Log4J.
  • zookeeperAdmin.md contains a java -cp ... example that references Log4J jars.
  • ReconfigExceptionTest contains a comment with a java -cp ... command referencing Log4J jars.
  • ZooTrace contains a comment that "Log4J must be correctly configured".
  • zookeeper-server/src/main/resources/NOTICE.txt still mentions Log4J (though I'm unclear on whether or not this file is actually getting into the distribution).

I tested building a binary distribution and confirmed that there are no Log4J jars or configuration files. I ran mvn dependency:tree and confirmed no transitive dependencies to Log4J either. I also tested swapping in Log4J 2 as the SLF4J back-end, and that worked fine:

rm -f lib/logback*.jar
ln -sfn ~/.m2/repository/org/apache/logging/log4j/log4j-api/2.17.1/log4j-api-2.17.1.jar lib/log4j-api-2.17.1.jar
ln -sfn ~/.m2/repository/org/apache/logging/log4j/log4j-core/2.17.1/log4j-core-2.17.1.jar lib/log4j-core-2.17.1.jar
ln -sfn ~/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.17.1/log4j-slf4j-impl-2.17.1.jar lib/log4j-slf4j-impl-2.17.1.jar

cat > conf/log4j2.properties <<EOF
appender.console.type = Console
appender.console.name = consoleLogger
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

rootLogger.level = info
rootLogger.appenderRef.stdout.ref = consoleLogger
EOF

LICENSE.txt Outdated
@@ -200,3 +200,209 @@
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading through this...

https://infra.apache.org/licensing-howto.html#permissive-deps

...I don't think we're supposed to put the entire text of the EPL in here. Instead, it's typically a pointer to the dependency's license.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can put the entire text in there, but more importantly, you should not include it in here, because you haven't bundled Logback into the source. You should only include LICENSE/NOTICE information for things actually distributing as a bundle. This means the LICENSE/NOTICE files can be different between the source tarball (official release) and the distribution tarball (AKA "convenience binary") that actually includes the logback jar bundled.

So, these are the wrong files to modify. I'm not familiar enough with the ZK build to know how the convenience binaries are built, but there should be a separate mechanism to bundle LICENSE/NOTICE files into the convenience binary than these, which are for the source.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My assumption was based on this guide: https://infra.apache.org/licensing-howto.html

You must customize LICENSE and NOTICE files according to the content of the specific distribution they reside within. Do not add to LICENSE and NOTICE dependencies which are not in the distribution. Only bundled bits matter.

Even though logback is not included in our source distribution, we also have a convenience binary distribution with first-level dependencies included. But. Jetty falls into this bucket too which also has EPL v1.0 license and we don't include it in top-level files. So, I ended up reverting these files and add the logback license only to the resource folder.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ctubbsii , great point about the different needs for source vs. binary distribution. Thank you.

It's fine to externalize license files to the resources folder (which land in the lib folder for the binary distribution). However, I think there still needs to be some kind of pointer from top-level to those separate licenses. It seems like zookeeper-server/src/main/resources/NOTICE.txt is trying to serve that purpose. I don't see this file actually showing up though when I build a binary distribution. It only has the same LICENSE.txt and NOTICE.txt from the root. (Additionally, my reading of the licensing how-to is that this stuff should be handled in LICENSE instead of NOTICE.)

Can someone else double-check me by testing building a binary distribution? Maybe we have a separate build bug preventing inclusion of the right files in the binary distribution? If so, that could be a separate topic, but it should be addressed before our next release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cnauroth I compared the binary and source distribution in terms of lincensing and I think we don't have a separate build process for that. They contain the same LICENSE and NOTICE files. Both the binary and source distribution also contains the per-jar lincense.txt files, but the binary distribution includes the jar files themselves. To be honest I don't plan making any changes in this build process. I believe this patch is already aligned with the current habit of releasing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. That confirms what I saw.

To be honest I don't plan making any changes in this build process.

That makes sense. If we need to address anything in this area for release, it can be handled as separate scope.

Copy link
Contributor

@symat symat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the patch, I saw only a single issue raised by @ctubbsii before that needs to be addressed. Otherwise if the community gives a green light for logback, then I am OK with the patch.

Comment on lines +78 to +83
<groupId>ch.qos.logback</groupId>
<artifactId>logback-core</artifactId>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. @anmolnar is there a reason why these are not in 'test' scope?

@anmolnar
Copy link
Contributor Author

The top-level pom.xml still contains some references to Log4J.

It's needed for contrib projects to build. I'll remove it in the next patch.

zookeeperAdmin.md contains a java -cp ... example that references Log4J jars.

Thought it's only an example and the CP is probably already outdated, so not worth replacing it. I'll fix it anyway.

ReconfigExceptionTest contains a comment with a java -cp ... command referencing Log4J jars.

I'll fix this too.

ZooTrace contains a comment that "Log4J must be correctly configured".

Fixed.

zookeeper-server/src/main/resources/NOTICE.txt still mentions Log4J (though I'm unclear on whether or not this file is actually getting into the distribution).

Weird. I've never come across this file, but replace it with logback alternatives.

Copy link
Contributor

@symat symat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the community agrees with logback for now, the patch LGTM. Thanks @anmolnar for the work!

The ZooKeeper default *log4j.properties*
file resides in the *conf* directory. Log4j requires that
*log4j.properties* either be in the working directory
version 1.7.5 as its logging infrastructure. By default ZooKeeper is shipped with
Copy link

@craiglawson craiglawson Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be version 1.7.30?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend being less tightly coupled to the current patch version in this document. Just "1.7" should suffice here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done.

@anmolnar
Copy link
Contributor Author

@cnauroth Please double check the PR and let me know what exactly is outstanding and required for your +1.

Copy link
Contributor

@cnauroth cnauroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@anmolnar , thank you for addressing the feedback.

@anmolnar
Copy link
Contributor Author

I've finished the last change that was requsted in the review. SLF4j version number is now just 1.7 in the admin docs.
Got the green light from @cnauroth as well, so @eolivelli if you're in the mood, launch it.

@eolivelli
Copy link
Contributor

great work @anmolnar
committing as soon as CI is green

@eolivelli eolivelli closed this in 85551f9 Jan 27, 2022
@eolivelli
Copy link
Contributor

Committed to master branch.
I restarted CI many times, the last failure is unrelated to the patch.

@anmolnar anmolnar deleted the ZOOKEEPER-4427 branch January 27, 2022 16:15
@c3ivodujmovic
Copy link

@eolivelli @anmolnar did this merge?

@eolivelli
Copy link
Contributor

@c3ivodujmovic yes

We are VOTING on ZooKeeper 3.8.0 release candidate 0, please participate in the VOTE

@Co-Din
Copy link

Co-Din commented Feb 9, 2022

Is the vote open to anyone to participate? If so where can we vote?

@ctubbsii
Copy link
Member

ctubbsii commented Feb 9, 2022

Is the vote open to anyone to participate? If so where can we vote?

Voting happens on the project mailing lists, whose addresses you can find on the project website, https://zookeeper.apache.org

Anybody can provide feedback on a release candidate, but at the Apache Software Foundation, only PMC members' votes count (are considered "binding") for a release.

@djuarezg
Copy link

Does anyone know how to switch back to using log4j? Docs mention you can still switch

@ctubbsii
Copy link
Member

@djuarezg This is the first Google result I get for this: https://howtodoinjava.com/log4j2/log4j2-with-slf4j/ (assuming you want log4j2). Basically, you just need to put your preferred slf4j runtime/implementation jar on the class path and whatever config file that goes with it. See https://logging.apache.org/log4j/2.x/index.html, https://logging.apache.org/log4j/1.2/index.html, and https://www.slf4j.org/manual.html for more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
10 participants