
Update dependencies #4313

Merged
merged 7 commits into apache:master from the update-dependencies branch
Jun 9, 2017
Conversation

leventov
Member

@leventov leventov commented May 23, 2017

Updated many, but not all, library dependencies: mainly infrastructure dependencies, skipping testing and logging dependencies.

  • Netty
    • 3.10.4.Final -> 3.10.6.Final in kafka-indexing-service
    • 4.1.6.Final -> 4.1.11.Final in other modules
  • Zookeeper 3.4.9 -> 3.4.10
  • Apache Curator 2.11.0 -> 2.12.0
  • Jetty: 9.3.16.v20170120 -> 9.3.19.v20170502
  • Jersey: 1.19 -> 1.19.3 (see also #4310, Move from com.sun.jersey to org.glassfish.jersey)
  • com.metamx:emitter:0.4.1 -> 0.4.5
  • com.metamx:http-client:1.0.6 -> 1.1.0
  • commons-io:commons-io:2.4 -> 2.5
  • com.amazonaws:aws-java-sdk-ec2:1.10.56 -> 1.10.77 (an update to 1.11+ is not possible because Druid is stuck on Jackson 2.4). Also depend just on aws-java-sdk-ec2 instead of aws-java-sdk, fixing #4382 (Depend only on needed parts of aws-java-sdk).
  • com.ibm.icu:icu4j:4.8.1 -> 54.1.1
  • joda-time:joda-time:2.8.2 -> 2.9.9
  • com.lmax:disruptor:3.3.0 -> 3.3.6
  • net.spy:spymemcached:2.11.7 -> 2.12.3
  • org.apache.httpcomponents:httpclient:4.5.1 -> 4.5.3
  • org.apache.httpcomponents:httpcore:4.4.3 -> 4.4.6
  • it.unimi.dsi:fastutil:7.0.13 -> 7.2.0

Also reduced dependency conflicts.

Updated only patch or minor dependency versions, in order to avoid source code changes and for better compatibility in the 0.10.1 patch release. However, still labelling this Design Review because of potential compatibility concerns.

@leventov leventov added this to the 0.10.1 milestone May 23, 2017
@gianm
Contributor

gianm commented May 23, 2017

Curator should not be upgraded. 2.12.0 has a bug that affects Druid, see #4103 and #3837 (comment).

@gianm
Contributor

gianm commented May 23, 2017

Thanks for leaving a comment about Curator, I should have done that in #4103. For the other changes, let's run integration tests and test with hadoop as well (it is notoriously finicky about which versions of things get used). If those look good then I'm +1.

@gianm
Contributor

gianm commented May 23, 2017

If you don't have an easy way to test with hadoop, maybe someone else could help, perhaps @nishantmonu51 or @b-slim. We also have a hadoop-gauntlet test suite we use from time to time we could dig up if needed.

@leventov leventov removed this from the 0.10.1 milestone May 23, 2017
@b-slim
Contributor

b-slim commented May 24, 2017

Yeah, sure, will check this out, hopefully this week.

@leventov
Member Author

@b-slim thank you!

@leventov leventov closed this Jun 1, 2017
@leventov leventov reopened this Jun 1, 2017
@leventov
Member Author

leventov commented Jun 2, 2017

It passes integration tests in Travis CI.

@leventov leventov added this to the 0.10.1 milestone Jun 6, 2017
@drcrallen
Contributor

Filed #4372

@b-slim
Contributor

b-slim commented Jun 6, 2017

IMO we need to make sure that this will not lock out all the Hadoop users; Hadoop is the default way to ingest batch data in Druid as far as I know, so I am not sure rushing it is a good idea. Can we set this back to the 0.10.2 release timeline in order to run tests? CI tests do not cover the Hadoop indexing job. 👎

@leventov
Member Author

leventov commented Jun 6, 2017

@b-slim 0.10.1 will be in RC for at least two weeks, and the first RC will go out, optimistically, by the end of this week. Isn't that enough time to do testing?

@b-slim
Contributor

b-slim commented Jun 6, 2017

It will take us at least one week to roll this out in the best case, even if I make it a number one goal, so two weeks can be really tight. I would also like to see other Hadoop users, like the Yahoo folks @himanshug and @cheddar, roll this out and see if there are any issues. Again, I am not against upgrading, but I would like to release this after testing at scale.

@gianm
Contributor

gianm commented Jun 6, 2017

@b-slim, I can't see the sense in vetoing this patch. We would want you and yahoo (and others as well) to test 0.10.1 no matter what, regardless of whether it includes this patch or not. And I do think that when a Druid release is in the RC stage, it should be a high priority for committers to do what they can to put it through the paces and verify that the release will be solid. Otherwise we risk doing a bad release.

If the patch causes issues with hadoop we'll revert it, just like we reverted a curator upgrade in the 0.10.0 cycle since it broke tranquility (#4103).

@b-slim
Contributor

b-slim commented Jun 6, 2017

@gianm yes, we are working on testing the RCs no matter what. I am not vetoing the merge of this PR, but I guess releasing it without proper testing is not a good idea either.

@leventov
Member Author

leventov commented Jun 7, 2017

This PR is the only one left in 0.10.1, so what do we decide to do? @b-slim @gianm

@b-slim
Contributor

b-slim commented Jun 7, 2017

@leventov I am currently testing a homemade 0.10.1 RC + this PR. If cutting an RC with this PR will help test it as part of the RC testing process, then let's go for it.

@jon-wei
Contributor

jon-wei commented Jun 7, 2017

I'm running a build with this patch against our hadoop version test suite as well.

@leventov
Member Author

leventov commented Jun 7, 2017

@b-slim @jon-wei thank you

<exclusions>
<exclusion>
<groupId>com.metamx</groupId>
<artifactId>java-util</artifactId>
Contributor

We can't exclude this, though. server-metrics uses the metamx origin of java-util; things in druid.io only use the io.druid fork.

Member Author

It's to resolve a conflict. I added this because some other library needs a newer version of java-util than server-metrics depends on, maybe emitter or http-client.
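For context, a hedged sketch of what such an exclusion looks like in a full Maven dependency entry — only the exclusion itself reflects the diff under discussion; the server-metrics version shown is illustrative:

```xml
<!-- Hypothetical sketch, not the exact merged POM. The exclusion stops
     server-metrics from contributing its own (older) com.metamx:java-util
     to version resolution, so the newer java-util required elsewhere in
     the tree (e.g. by emitter or http-client) wins. The version number
     below is illustrative. -->
<dependency>
  <groupId>com.metamx</groupId>
  <artifactId>server-metrics</artifactId>
  <version>0.2.8</version>
  <exclusions>
    <exclusion>
      <groupId>com.metamx</groupId>
      <artifactId>java-util</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```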

</dependency>
<dependency>
<groupId>org.codehaus.jackson</groupId>
<artifactId>jackson-mapper-asl</artifactId>
Contributor

what are these brought in for?

Member Author

Specifying the version in dependencyManagement. There are conflicts in subprojects, where different dependencies want different versions of this lib.
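As a sketch of the approach described above (the pinned version here is illustrative, not necessarily the one in the PR), a dependencyManagement entry in the parent POM resolves such conflicts by forcing one version across all subprojects:

```xml
<!-- Hypothetical parent-POM sketch: dependencyManagement pins the version
     used whenever any subproject (or one of its transitive dependencies)
     pulls in jackson-mapper-asl, without making it a direct dependency
     anywhere. The version number is illustrative. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.codehaus.jackson</groupId>
      <artifactId>jackson-mapper-asl</artifactId>
      <version>1.9.13</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```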

@jon-wei
Contributor

jon-wei commented Jun 9, 2017

Update on Hadoop compatibility with this patch:

As a workaround suggested by @b-slim, I downgraded hadoop.compile.version and removed the hadoop-aws dependency from druid-hdfs-storage to test indexing tasks against versions < 2.6.0. The indexing tasks all succeeded, so I think this PR is good on that aspect.

@b-slim
Contributor

b-slim commented Jun 9, 2017

Great, thanks @jon-wei. LGTM

@jon-wei jon-wei merged commit 5285eb9 into apache:master Jun 9, 2017
@leventov leventov deleted the update-dependencies branch June 9, 2017 21:32
@himanshug
Contributor

@leventov @drcrallen
our integration test build tries to download test segments from AWS, e.g.

"loadSpec":{"type":"s3_zip","bucket":"static.druid.io","key":"data/segments/twitterstream/2013-01-01T00:00:00.000Z_2013-01-02T00:00:00.000Z/2013-01-02T04:13:41.980Z_v9/0/index.zip"}

and fails with following error...

AWS profiles config file not found in the given path: /root/.aws/config
	at com.amazonaws.auth.profile.internal.ProfilesConfigFileParser.loadProfiles(ProfilesConfigFileParser.java:55)
	at com.amazonaws.auth.profile.ProfilesConfigFile.loadProfiles(ProfilesConfigFile.java:129)
	at com.amazonaws.auth.profile.ProfilesConfigFile.<init>(ProfilesConfigFile.java:97)
	at com.amazonaws.auth.profile.ProfilesConfigFile.<init>(ProfilesConfigFile.java:77)
	at com.amazonaws.auth.profile.ProfileCredentialsProvider.<init>(ProfileCredentialsProvider.java:45)
	at io.druid.common.aws.AWSCredentialsUtils.defaultAWSCredentialsProviderChain(AWSCredentialsUtils.java:31)
	at io.druid.guice.AWSModule.getAWSCredentialsProvider(AWSModule.java:45)

I wonder if this is related to aws sdk change. Has anyone faced this issue ?

@leventov
Member Author

@himanshug internally we had to upgrade to aws-java-sdk 1.11 because of a Spark incompatibility, as far as I remember, and also to Jackson 2.6, which is not compatible with Hadoop.

But we didn't have such problems with aws-java-sdk 1.11.

@himanshug
Contributor

@leventov I guess it's working for you probably because you're setting up AWS credentials (as described in http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html ) in your environment.
For us, AWS usage is only in the integration tests, and they used to work by default without supplying any credentials explicitly. Do you have any vanilla environment where you run the integration tests without any AWS credentials explicitly set up?

@himanshug
Contributor

@leventov I'm going to try a build with aws-java-sdk-ec2 reverted back to aws-java-sdk, to check if this is really the cause.

@leventov
Member Author

@himanshug we only run the integration tests in Travis, and those have nothing to do with AWS S3.

You could also try reverting 1.10.77 -> 1.10.56.

@gianm
Contributor

gianm commented Jun 20, 2017

@himanshug the line numbers in the stack trace you posted don't correspond to 1.10.77, but instead look more like 1.7.4. Perhaps that version is getting pulled in somehow for your build. Fwiw, the 0.10.1-rc1 tarball is bundled with:

druid-0.10.1-rc1/extensions/druid-hdfs-storage/aws-java-sdk-1.7.4.jar
druid-0.10.1-rc1/extensions/druid-hdfs-storage/hadoop-aws-2.7.3.jar
druid-0.10.1-rc1/lib/druid-aws-common-0.10.1-rc1.jar
druid-0.10.1-rc1/lib/aws-java-sdk-ec2-1.10.77.jar
druid-0.10.1-rc1/lib/aws-java-sdk-core-1.10.77.jar

So if you run out of the tarball, you should be getting 1.10.77 in the io.druid.guice.AWSModule code path.

@himanshug
Contributor

@gianm here are the jars on my build machine...

druid_dev/aws-java-sdk-core-1.10.77.jar
druid_dev/aws-java-sdk-ec2-1.10.77.jar
druid_dev/druid-aws-common-0.10.1-1497388461-ecb8bd9-1422.jar

druid_dev/extensions/druid-hdfs-storage/aws-java-sdk-1.7.4.jar
druid_dev/extensions/druid-hdfs-storage/hadoop-aws-2.7.3.jar

OK, I see that the error in the stack trace comes from the file ProfilesConfigFileParser, which belongs to aws-java-sdk-1.7.4, which is pulled in as a transitive dependency of hadoop-aws-2.7.3 in the hdfs-storage module.

I believe it is coming from the hadoop version change in #4116.

@himanshug
Contributor

@gianm @leventov continuing the thread in #4116 (comment) .

@leventov
Member Author

@himanshug looks like a bug; aws-java-sdk should be excluded from the hdfs-storage dependency

@b-slim
Contributor

b-slim commented Jun 20, 2017

Looks like we end up getting different SDKs, due to https://github.com/druid-io/druid/pull/4313/files#diff-600376dffeb79835ede4a0b285078036L182.
We used to pull the entire suite, so we ended up with the same aws sdk. IMO it would be better to use one version rather than having different versions of core and the sdk.

@b-slim
Contributor

b-slim commented Jun 20, 2017

looks like a bug, aws-java-sdk should be excluded from hdfs-storage dependency

This dependency is used by HDFS storage when asked to load data from S3a.

@leventov
Member Author

@b-slim exclusion means that the version resolver won't consider 1.7.4. The project may then depend on aws-java-sdk-s3:1.10.77 globally, provided it is binary-compatible in the functionality that hdfs-storage uses.

@himanshug
Contributor

@leventov @b-slim
yes, if hadoop-aws-2.7.3 can work with aws-java-sdk-s3:1.10.77 to have S3a support, then I am in favor of adding that in the top-level pom and excluding aws-java-sdk from the hadoop-aws dep in the hdfs-storage pom.
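A rough POM-level sketch of this proposal — versions follow the discussion above, but module layout and exact coordinates are assumptions, not the merged change:

```xml
<!-- Hypothetical sketch of the proposal, not the actual patch. -->

<!-- In the top-level pom's dependencyManagement: pin the s3 module so one
     version is used everywhere. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-s3</artifactId>
  <version>1.10.77</version>
</dependency>

<!-- In the hdfs-storage module's pom: keep hadoop-aws for S3a support,
     but exclude the monolithic aws-java-sdk 1.7.4 it drags in
     transitively. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-aws</artifactId>
  <version>2.7.3</version>
  <exclusions>
    <exclusion>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

This only works if hadoop-aws is binary-compatible with the newer s3 module in the code paths it exercises, which is exactly the condition discussed above.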

@b-slim
Contributor

b-slim commented Jun 20, 2017

Another way to fix this would be to make profile loading lazily initialized. What do you guys think?

@himanshug
Contributor

@b-slim that might work for the problem I posted, but it's not great to have two different jars in the system that provide the same functionality or conflict with each other. My understanding is that aws-java-sdk and aws-java-sdk-ec2 are in that boat. Are they?

@b-slim
Contributor

b-slim commented Jun 20, 2017

@himanshug I have tested it with 1.10.56 against an internal 2.7.3 (patched, so it's more like 2.8-ish) and it worked. I guess we can go the route of having one version at the top.

@himanshug
Contributor

@b-slim OK, let me also test whether that solves the original problem I posted; will update.

@himanshug
Contributor

@b-slim if you know that would fix the original problem too, then go ahead and send the PR.

@b-slim
Contributor

b-slim commented Jun 21, 2017

I see you already have this: himanshug@9632b0c
