-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to guice-4.1.0. #3222
Update to guice-4.1.0. #3222
Conversation
What versions did you test with? I tried updating to guice 4.0 previously, but #1628 |
and related #1608 (comment) |
Just local mode, I didn't try a remote cluster. I'll see if we can give that a shot too. |
I tested this patch (on top of druid master) against a few hadoop remote clusters, using a batch ingestion task with S3 as both input and deep storage. Running without
If i configure |
|
Added https://groups.google.com/d/msg/druid-user/i1oPmt9ltR8/Z1xAIk3uBgAJ to the PR description as possibly related. |
@drcrallen I believe not all of those hadoop versions work even without the guice upgrade. What's your reasoning behind not wanting to recommend @jon-wei how many of those hadoop versions work with stock Druid but without |
Running druid 0.9.1.1 without user.classpath.first=true: SequenceIQ hadoop docker:
EMR works. Cloudera 5.7.0.0 (2.6.0) and Hortonworks Sandbox 2.4 (2.7.1) fail with
|
I started digging to try and find what's causing the conflict, but hadoop uses This error should only be caused by something pulling in I went through and excluded all the things which were even referencing jackson.core: #3225 @jon-wei can you try that branch and see if it fixes things? Otherwise we need to figure out what's pulling in an old version of jackson-databind |
@drcrallen sure, i can give that a try |
I think it's hadoop-aws pulling in jackson, most of the distros include that. |
i.e. the actual hadoop machines have com.fasterxml.jackson stuff on their classpath. I think their databind replaces the one Druid wants, but then Druid adds in the jackson guava module, which doesn't work with the databind provided by the hadoop machine. (and that's also why setting user classpath first helps) That's my story and I'm sticking to it :) |
@gianm that sounds like a reasonable theory. In such a case the proper solution is for hadoop to not pollute the application classloader with extension jars. But, if |
If hadoop-aws indeed turns out to be the culprit, then we should file an issue against https://issues.apache.org/jira/browse/HADOOP re: conflicting jackson classes caused by hadoop-aws. Related to https://issues.apache.org/jira/browse/HADOOP-12705 and https://issues.apache.org/jira/browse/HADOOP-11074 |
@drcrallen The patch you provided runs into the same errors described in my last comment #3222 (comment) |
@gianm @drcrallen Not sure if jackson-databind is only used by hadoop-aws, but I do see it on the hadoop classpath on the 2.6.0+ versions that failed with stock druid 0.9.1.1 e.g. HDP 2.4 sandbox:
cloudera
For the later sequenceIQ generic hadoop images (2.6.0+), jackson-databind is stored at:
and I had to add that directory to the classpath on the hadoop side to use S3 successfully. |
On https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml I see a property mapreduce.job.classloader, defaults to false, that might be a nicer way of dealing with this. I suggest the following:
And after doing that we should also look into updating Jackson :) [in a separate PR] |
If so then I think @gianm 's comments in #3222 (comment) are the best approach we have right now. |
Ran this again with SequenceIQ hadoop 2.30: fails with:
SequenceIQ 2.4.0 to 2.5.2, fails with:
SequenceIQ 2.6.0, 2.7.1 succeeds Cloudera Quickstart succeeds Hortonworks Sandbox 2.4 succeeds EMR 4.7.1 succeeds Looks like Running now with both |
@jon-wei those are odd, makes me think SequenceIQ did something weird to Hadoop classloading. Especially since Hortonworks 2.4 works |
@drcrallen HortonWorks 2.4 is based on hadoop 2.7.1, I believe |
ah, I see |
Based on those I'm pretty happy with recommending If that sounds good then I will update the docs on this PR. |
@gianm I just got some results from running with both Having |
@jon-wei @drcrallen I updated the hadoop docs in #3252 as discussed (see the section "Tip 2: Classloader modification on Hadoop"). |
👍 |
Fixes google/guice#757 among other issues. I tried out the quickstart (batch & streaming with tranquility) and it still worked after this change.
Possibly related: https://groups.google.com/d/msg/druid-user/i1oPmt9ltR8/Z1xAIk3uBgAJ