Conversation

@trietopsoft
Contributor

Implements fix for #2416.

Computes correct override class and adds unit test for both Hadoop and System property configuration.

Properly set Hadoop compression codec override and log error on failure.
Continue to return null for failures to preserve existing logic.
@milleruntime
Contributor

I think this looks like a good fix, and thanks for the additions to the test. My only concern is the exception handling. With your changes, I now see an exception printed when CompressionTest runs, even though it passes. We should avoid printing errors when nothing is actually wrong. The problem with the class loading in this part of the code is that I can't tell what is supposed to be an error and what is a typical configuration. This is not necessarily an issue with your change; it is mostly a matter of figuring out the original intent, how it has evolved, and how it interacts with your change.

@trietopsoft
Contributor Author

Thank you for the review. I am concerned that the error which was previously ignored can be fatal in certain contexts. For example, if data is written with a codec available on a MapReduce system (such as LZO) but not on the Accumulo system, the data read back will be garbage. There is no good immediate fix here, the risk is relatively low, and the exception will not occur in normal operation; however, if it does happen, I'd rather know about it than spend time tracking down some apparently unrelated data-corruption issue.

@trietopsoft
Contributor Author

@milleruntime I do not see an exception printed, either in my local run or in the GitHub Actions run. Could you please provide the steps to reproduce the exception?

@milleruntime
Contributor

Run:
mvn clean test -Dtest=CompressionTest
cat <accumulo-src>/core/target/surefire-reports/org.apache.accumulo.core.file.rfile.bcfile.CompressionTest.txt
The error is printed in the log:

[main] ERROR org.apache.accumulo.core.file.rfile.bcfile.Compression [] - Unable to load codec class org.apache.hadoop.io.compress.LzoCodec for io.compression.codec.lzo.class
java.lang.ClassNotFoundException: org.apache.hadoop.io.compress.LzoCodec
	at jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581) ~[?:?]
	at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) ~[?:?]
	at java.lang.ClassLoader.loadClass(ClassLoader.java:522) ~[?:?]
	at java.lang.Class.forName0(Native Method) ~[?:?]
	at java.lang.Class.forName(Class.java:315) ~[?:?]
	at org.apache.accumulo.core.file.rfile.bcfile.Compression$Algorithm.createNewCodec(Compression.java:773) [classes/:?]
	at org.apache.accumulo.core.file.rfile.bcfile.Compression$Algorithm$2.createNewCodec(Compression.java:268) [classes/:?]
	at org.apache.accumulo.core.file.rfile.bcfile.Compression$Algorithm.initCodec(Compression.java:750) [classes/:?]
	at org.apache.accumulo.core.file.rfile.bcfile.Compression$Algorithm$2.initializeDefaultCodec(Compression.java:263) [classes/:?]
	at org.apache.accumulo.core.file.rfile.bcfile.Compression$Algorithm.<init>(Compression.java:635) [classes/:?]
	at org.apache.accumulo.core.file.rfile.bcfile.CompressionTest.testSupport(CompressionTest.java:57) [test-classes/:?]

@trietopsoft
Contributor Author

Thank you - this makes sense as LZO is not bundled with Hadoop due to licensing issues. In this case, the exception would be useful in determining which codec is missing and how to resolve the missing library.

@milleruntime
Contributor

> In this case, the exception would be useful in determining which codec is missing and how to resolve the missing library.

I agree but I would like to avoid printing an error if the codec is not there by default and most users aren't going to override the implementation.

@trietopsoft
Contributor Author

Understood. I would suggest the following, which omits the stack trace but keeps enough context:

        log.warn("Unable to load codec class {} for {}, reason: {}", clazz, codecClazzProp,
            e.getMessage());

Member

@ctubbsii ctubbsii left a comment

I think the logic can be further simplified

Comment on lines +764 to 768
String extClazz = conf.get(codecClazzProp);
if (extClazz == null) {
extClazz = System.getProperty(codecClazzProp);
}
String clazz = (extClazz != null) ? extClazz : defaultClazz;
Member

I don't think this behaves the way it should. This would get the class from the config file, and only use the system property if it's not set in the config file. I think it should work the other way around. Setting the system property on the command-line, for example, should be able to override what is set in the config file.

Suggested change
String extClazz = conf.get(codecClazzProp);
if (extClazz == null) {
extClazz = System.getProperty(codecClazzProp);
}
String clazz = (extClazz != null) ? extClazz : defaultClazz;
String clazz = System.getProperty(codecClazzProp, conf.get(codecClazzProp, defaultClazz));

Also, this syntax is much more readable: try system props, fall back to config, fall back to default.
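The precedence chain in that one-liner can be sketched outside Hadoop. Below is a minimal illustration using `java.util.Properties` as a hypothetical stand-in for the Hadoop `Configuration`; the property key and codec class names mirror this thread, but the `resolveCodecClass` helper itself is illustrative, not Accumulo code.

```java
import java.util.Properties;

// Hypothetical stand-in for the lookup in Compression.Algorithm.createNewCodec():
// system property first, then config, then the built-in default.
public class CodecClassLookup {
    static String resolveCodecClass(Properties conf, String codecClazzProp, String defaultClazz) {
        return System.getProperty(codecClazzProp,
                conf.getProperty(codecClazzProp, defaultClazz));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        String prop = "io.compression.codec.lzo.class";
        String def = "org.apache.hadoop.io.compress.LzoCodec";

        // Nothing set anywhere: the default wins.
        System.out.println(resolveCodecClass(conf, prop, def));

        // Config value set: it overrides the default.
        conf.setProperty(prop, "com.example.ConfLzoCodec");
        System.out.println(resolveCodecClass(conf, prop, def));

        // Command-line -D system property set: it overrides the config too.
        System.setProperty(prop, "com.example.CliLzoCodec");
        System.out.println(resolveCodecClass(conf, prop, def));
        System.clearProperty(prop);
    }
}
```

Since `System.getProperty(key, fallback)` only returns the fallback when the property is unset, a `-D` flag on the command line always wins, matching the precedence argued for above.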

Contributor Author

The behavior comes from the existing Accumulo code (which didn't work) and from the Hadoop TFile example linked here: https://github.com/apache/hadoop/blob/e103c83765898f756f88c27b2243c8dd3098a989/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/file/tfile/Compression.java#L88 - It is a bit backwards, I agree, and probably reflects Hadoop semantics rather than Accumulo's. I would agree with your approach.

Member

Yeah, the simpler version of that implementation would look like:

      String clazz = conf.get(codecClazzProp, System.getProperty(codecClazzProp, defaultClazz));

But that would make the value in the config file override a Java property given on the command-line, and I think it should go the other way around: the command-line should take precedence.

Comment on lines +775 to +777
// This is not okay.
log.warn("Unable to load codec class {} for {}, reason: {}", clazz, codecClazzProp,
e.getMessage());
Member

Suggested change
// This is not okay.
log.warn("Unable to load codec class {} for {}, reason: {}", clazz, codecClazzProp,
e.getMessage());
log.warn("Unable to load codec class {} for {}", clazz, codecClazzProp, e);

Contributor Author

This was discussed here: #2417 (comment). The exception stack trace is not required, since the message is sufficient.

Contributor Author

Varargs doesn't capture the exception object and it is unbound in your requested change.

Member

I don't see harm in having a stack trace in the logs with a warning. I would leave it unbound so it does the stack trace, primarily because I think the code looks cleaner, but could go either way if somebody felt strongly about it.
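For background on the bound-vs-unbound distinction being discussed: SLF4J documents that when the final argument to a logging call is a Throwable that is not consumed by a `{}` placeholder, it is treated as the exception and its stack trace is printed. The sketch below is not the real SLF4J implementation; the `format` helper and class name are purely illustrative, mimicking the documented rule so the two variants from this thread can be compared.

```java
import java.util.Arrays;

// Illustrative-only mimic of SLF4J's trailing-Throwable rule: if there are
// more arguments than "{}" placeholders and the last one is a Throwable,
// it is treated as the exception (stack trace would be printed).
public class VarargsLoggingDemo {
    static String format(String msg, Object... args) {
        // Count "{}" placeholders in the message.
        int placeholders = 0;
        for (int i = 0; msg.indexOf("{}", i) >= 0; i = msg.indexOf("{}", i) + 2) {
            placeholders++;
        }
        // Unbound trailing Throwable -> attach the stack trace.
        boolean hasThrowable = args.length > placeholders
                && args[args.length - 1] instanceof Throwable;
        Object[] bound = hasThrowable ? Arrays.copyOf(args, args.length - 1) : args;
        StringBuilder out = new StringBuilder(msg);
        for (Object a : bound) {
            int at = out.indexOf("{}");
            if (at >= 0) out.replace(at, at + 2, String.valueOf(a));
        }
        if (hasThrowable) out.append(" [stack trace follows]");
        return out.toString();
    }

    public static void main(String[] args) {
        Exception e = new ClassNotFoundException("org.apache.hadoop.io.compress.LzoCodec");
        // Variant 1: e.getMessage() bound to the third placeholder -> message only.
        System.out.println(format("Unable to load codec class {} for {}, reason: {}",
                "clazz", "prop", e.getMessage()));
        // Variant 2: e left unbound (two placeholders, three args) -> stack trace.
        System.out.println(format("Unable to load codec class {} for {}", "clazz", "prop", e));
    }
}
```

This is why leaving the exception "unbound" in the suggested change produces a stack trace, while binding `e.getMessage()` to a placeholder logs only the message.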

Comment on lines +134 to +139
Algorithm.conf.clear();
if (extLz4 != null) {
System.setProperty(Compression.Algorithm.CONF_LZ4_CLASS, extLz4);
} else {
System.clearProperty(Compression.Algorithm.CONF_LZ4_CLASS);
}
Member

We run our unit tests in parallel and reuse JVM forks to do so. Messing with system properties like this could make these tests not thread safe. But, I'm not sure.
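The concern can be made concrete: system properties are process-global, so when unit tests share a JVM fork, one test's override is visible to every other test in that fork. A minimal sketch, with hypothetical test-method names (not the actual CompressionTest code):

```java
// Illustrates why mutating system properties is risky when tests share a JVM:
// the property table is global, so testA's override leaks into testB.
public class SharedSystemPropertyDemo {
    static final String PROP = "io.compression.codec.lzo.class";

    static void testA() {
        // A test that overrides the codec class via a system property.
        System.setProperty(PROP, "com.example.DummyCodec");
    }

    static String testB() {
        // testB never set the property, yet observes testA's value.
        return System.getProperty(PROP, "org.apache.hadoop.io.compress.LzoCodec");
    }

    public static void main(String[] args) {
        testA();
        System.out.println(testB()); // prints com.example.DummyCodec, not the default
        System.clearProperty(PROP);  // cleanup, as the test in this PR does
    }
}
```

Save/clear cleanup (as the PR's test does) helps sequential reuse, but cannot isolate tests that run concurrently in the same process.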

Contributor Author

Parallel tests may cause issues here. I can back these tests out.

Comment on lines +304 to +306
Algorithm.conf.set(Compression.Algorithm.CONF_LZ4_CLASS, DummyCodec.class.getName());
CompressionCodec dummyCodec = Compression.Algorithm.LZ4.createNewCodec(4096);
assertEquals("Hadoop override DummyCodec not loaded", DummyCodec.class, dummyCodec.getClass());
Member

These tests seem a bit heavyweight for what they actually cover: that we can read the class name from the config file and/or a system property. We don't need to create the codec to test that, so they could be made more lightweight. However, given the triviality of my proposed one-liner above for getting the class name, I'm not sure there's much to test here at all.

Contributor Author

Eh, for someone who had to chase down this bug, I would typically argue for more test cases and code coverage, but I get your point.

Member

Given the potential problems with JVM reuse and setting system properties, I'd probably just remove the test in this PR, but I agree with you in the general case about more unit testing for greater code coverage.

Yeah, I just don't want us to get so eager in our testing that we're doing things like assertTrue(Set.of(x).contains(x)); at a certain point, we're not even testing Accumulo code, but that conf.get() and System.getProperty() work correctly.

@dlmarion
Contributor

dlmarion commented Jul 6, 2022

Closing in favor of #2800

@dlmarion dlmarion closed this Jul 6, 2022
@ctubbsii ctubbsii added this to the 2.1.0 milestone Jul 12, 2024