
Update dependencies #4313

Merged
merged 7 commits into apache:master from the update-dependencies branch
Jun 9, 2017
Conversation

leventov
Member

@leventov leventov commented May 23, 2017

Updated many, but not all, library dependencies: mainly infrastructure dependencies, skipping testing and logging dependencies.

  • Netty
    • 3.10.4.Final -> 3.10.6.Final in kafka-indexing-service
    • 4.1.6.Final -> 4.1.11.Final in other modules
  • Zookeeper 3.4.9 -> 3.4.10
  • Apache Curator 2.11.0 -> 2.12.0
  • Jetty: 9.3.16.v20170120 -> 9.3.19.v20170502
  • Jersey: 1.19 -> 1.19.3 (see also #4310, Move from com.sun.jersey to org.glassfish.jersey)
  • com.metamx:emitter:0.4.1 -> 0.4.5
  • com.metamx:http-client:1.0.6 -> 1.1.0
  • commons-io:commons-io:2.4 -> 2.5
  • com.amazonaws:aws-java-sdk-ec2:1.10.56 -> 1.10.77 (an update to 1.11+ is not possible because Druid is stuck on Jackson 2.4). Also depend just on aws-java-sdk-ec2 instead of aws-java-sdk, fixing #4382 (Depend only on needed parts of aws-java-sdk).
  • com.ibm.icu:icu4j:4.8.1 -> 54.1.1
  • joda-time:joda-time:2.8.2 -> 2.9.9
  • com.lmax:disruptor:3.3.0 -> 3.3.6
  • net.spy:spymemcached:2.11.7 -> 2.12.3
  • org.apache.httpcomponents:httpclient:4.5.1 -> 4.5.3
  • org.apache.httpcomponents:httpcore:4.4.3 -> 4.4.6
  • it.unimi.dsi:fastutil:7.0.13 -> 7.2.0

Also reduced dependency conflicts.

Updated only patch or minor dependency versions, in order to avoid source code changes and for better compatibility in the 0.10.1 patch release. However, still labelling this Design Review because of potential compatibility concerns.

@leventov leventov added this to the 0.10.1 milestone May 23, 2017
@gianm
Contributor

gianm commented May 23, 2017

Curator should not be upgraded. 2.12.0 has a bug that affects Druid, see #4103 and #3837 (comment).

@gianm
Contributor

gianm commented May 23, 2017

Thanks for leaving a comment about Curator, I should have done that in #4103. For the other changes, let's run integration tests and test with hadoop as well (it is notoriously finicky about which versions of things get used). If those look good then I'm +1.

@gianm
Contributor

gianm commented May 23, 2017

If you don't have an easy way to test with hadoop, maybe someone else could help, perhaps @nishantmonu51 or @b-slim. We also have a hadoop-gauntlet test suite we use from time to time we could dig up if needed.

@leventov leventov removed this from the 0.10.1 milestone May 23, 2017
@b-slim
Contributor

b-slim commented May 24, 2017

Yeah, sure, will check this out, hopefully this week.

@leventov
Member Author

@b-slim thank you!

@leventov leventov closed this Jun 1, 2017
@leventov leventov reopened this Jun 1, 2017
@leventov
Member Author

leventov commented Jun 2, 2017

It passes integration tests in Travis CI.

@leventov leventov added this to the 0.10.1 milestone Jun 6, 2017
@drcrallen
Contributor

Filed #4372

@b-slim
Contributor

b-slim commented Jun 6, 2017

IMO we need to make sure that this will not lock out all the Hadoop users; Hadoop is the default way to ingest batch data in Druid as far as I know, so I am not sure rushing it is a good idea. Can we set this back to the 0.10.2 release timeline in order to run tests? CI tests do not cover the Hadoop indexing job. 👎

@leventov
Member Author

leventov commented Jun 6, 2017

@b-slim 0.10.1 will be in RC for at least two weeks, and the first RC will go out, optimistically, by the end of this week. Isn't that enough time to do testing?

@b-slim
Contributor

b-slim commented Jun 6, 2017

It will take us at least one week to roll this out in the best case, even if I make it a number one goal, so two weeks can be really tight. I would also like to see other Hadoop users, like the Yahoo folks @himanshug and @cheddar, roll this out and see if there are any issues. Again, I am not against upgrading, but I would like to release this after testing at scale.

@gianm
Contributor

gianm commented Jun 6, 2017

@b-slim, I can't see the sense in vetoing this patch. We would want you and yahoo (and others as well) to test 0.10.1 no matter what, regardless of whether it includes this patch or not. And I do think that when a Druid release is in the RC stage, it should be a high priority for committers to do what they can to put it through the paces and verify that the release will be solid. Otherwise we risk doing a bad release.

If the patch causes issues with hadoop we'll revert it, just like we reverted a curator upgrade in the 0.10.0 cycle since it broke tranquility (#4103).

@b-slim
Contributor

b-slim commented Jun 6, 2017

@gianm yes, we are working on testing the RCs no matter what. I am not vetoing the merge of this PR, but I guess releasing it without proper testing is not a good idea either.

@leventov
Member Author

leventov commented Jun 7, 2017

This PR is the only one left in 0.10.1, so what do we decide to do? @b-slim @gianm

@b-slim
Contributor

b-slim commented Jun 7, 2017

@leventov I am currently testing a homemade 0.10.1 RC + this PR. If cutting an RC with this PR will help test it as part of the RC testing process, then let's go for it.

@jon-wei
Contributor

jon-wei commented Jun 7, 2017

I'm running a build with this patch against our hadoop version test suite as well.

@leventov
Member Author

leventov commented Jun 7, 2017

@b-slim @jon-wei thank you

<exclusions>
<exclusion>
<groupId>com.metamx</groupId>
<artifactId>java-util</artifactId>
Contributor

We can't exclude this, though. server-metrics uses the metamx origin of java-util; things in druid.io only use the io.druid fork.

Member Author

It's to resolve a conflict. I added this because some other library needs a newer version of java-util than server-metrics depends on, maybe emitter or http-client.
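For context, a hedged sketch of what such an exclusion looks like in a full Maven dependency entry — only the exclusion itself reflects the diff under discussion; the server-metrics version shown is illustrative:

```xml
<!-- Hypothetical sketch, not the exact merged POM. The exclusion stops
     server-metrics from contributing its own (older) com.metamx:java-util
     to version resolution, so the newer java-util required elsewhere in
     the tree (e.g. by emitter or http-client) wins. The version number
     below is illustrative. -->
<dependency>
  <groupId>com.metamx</groupId>
  <artifactId>server-metrics</artifactId>
  <version>0.2.8</version>
  <exclusions>
    <exclusion>
      <groupId>com.metamx</groupId>
      <artifactId>java-util</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```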

</dependency>
<dependency>
<groupId>org.codehaus.jackson</groupId>
<artifactId>jackson-mapper-asl</artifactId>
Contributor

what are these brought in for?

Member Author

Specifying the version in dependencyManagement. There are conflicts in subprojects, where different dependencies want different versions of this lib.
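As a sketch of the approach described above (the pinned version here is illustrative, not necessarily the one in the PR), a dependencyManagement entry in the parent POM resolves such conflicts by forcing one version across all subprojects:

```xml
<!-- Hypothetical parent-POM sketch: dependencyManagement pins the version
     used whenever any subproject (or one of its transitive dependencies)
     pulls in jackson-mapper-asl, without making it a direct dependency
     anywhere. The version number is illustrative. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.codehaus.jackson</groupId>
      <artifactId>jackson-mapper-asl</artifactId>
      <version>1.9.13</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```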

@jon-wei
Contributor

jon-wei commented Jun 9, 2017

Update on Hadoop compatibility with this patch:

As a workaround suggested by @b-slim, I downgraded hadoop.compile.version and removed the hadoop-aws dependency from druid-hdfs-storage to test indexing tasks against versions < 2.6.0. The indexing tasks all succeeded, so I think this PR is good on that aspect.

@b-slim
Contributor

b-slim commented Jun 9, 2017

Great, thanks @jon-wei. LGTM

@jon-wei jon-wei merged commit 5285eb9 into apache:master Jun 9, 2017
@leventov leventov deleted the update-dependencies branch June 9, 2017 21:32
@himanshug
Contributor

@leventov @drcrallen
our integration test build tries to download test segments from AWS, e.g.

"loadSpec":{"type":"s3_zip","bucket":"static.druid.io","key":"data/segments/twitterstream/2013-01-01T00:00:00.000Z_2013-01-02T00:00:00.000Z/2013-01-02T04:13:41.980Z_v9/0/index.zip"}

and fails with following error...

AWS profiles config file not found in the given path: /root/.aws/config
	at com.amazonaws.auth.profile.internal.ProfilesConfigFileParser.loadProfiles(ProfilesConfigFileParser.java:55)
	at com.amazonaws.auth.profile.ProfilesConfigFile.loadProfiles(ProfilesConfigFile.java:129)
	at com.amazonaws.auth.profile.ProfilesConfigFile.<init>(ProfilesConfigFile.java:97)
	at com.amazonaws.auth.profile.ProfilesConfigFile.<init>(ProfilesConfigFile.java:77)
	at com.amazonaws.auth.profile.ProfileCredentialsProvider.<init>(ProfileCredentialsProvider.java:45)
	at io.druid.common.aws.AWSCredentialsUtils.defaultAWSCredentialsProviderChain(AWSCredentialsUtils.java:31)
	at io.druid.guice.AWSModule.getAWSCredentialsProvider(AWSModule.java:45)

I wonder if this is related to aws sdk change. Has anyone faced this issue ?

@leventov
Member Author

@himanshug internally we had to upgrade to aws-java-sdk 1.11 because of a Spark incompatibility, as far as I remember, and also to Jackson 2.6, which is not compatible with Hadoop.

But we didn't have such problems with aws-java-sdk 1.11.

@himanshug
Contributor

@leventov I guess it's working for you probably because you're setting up AWS credentials (as described in http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html ) in your environment.
For us, AWS usage is only in the integration tests, and they used to work by default without supplying any credentials explicitly. Do you have any vanilla environment where you run the integration tests without any AWS credentials explicitly set up?

@himanshug
Contributor

@leventov I'm going to try a build with aws-java-sdk-ec2 reverted back to aws-java-sdk, to check if this is really the cause.

@leventov
Member Author

@himanshug we only run the integration tests in Travis, and those have nothing to do with AWS S3.

You could also try reverting 1.10.77 -> 1.10.56.

@gianm
Contributor

gianm commented Jun 20, 2017

@himanshug the line numbers in the stack trace you posted don't correspond to 1.10.77, but instead look more like 1.7.4. Perhaps that version is getting pulled in somehow for your build. Fwiw, the 0.10.1-rc1 tarball is bundled with:

druid-0.10.1-rc1/extensions/druid-hdfs-storage/aws-java-sdk-1.7.4.jar
druid-0.10.1-rc1/extensions/druid-hdfs-storage/hadoop-aws-2.7.3.jar
druid-0.10.1-rc1/lib/druid-aws-common-0.10.1-rc1.jar
druid-0.10.1-rc1/lib/aws-java-sdk-ec2-1.10.77.jar
druid-0.10.1-rc1/lib/aws-java-sdk-core-1.10.77.jar

So if you run out of the tarball, you should be getting 1.10.77 in the io.druid.guice.AWSModule code path.

@himanshug
Contributor

@gianm here are the jars on my build machine...

druid_dev/aws-java-sdk-core-1.10.77.jar
druid_dev/aws-java-sdk-ec2-1.10.77.jar
druid_dev/druid-aws-common-0.10.1-1497388461-ecb8bd9-1422.jar

druid_dev/extensions/druid-hdfs-storage/aws-java-sdk-1.7.4.jar
druid_dev/extensions/druid-hdfs-storage/hadoop-aws-2.7.3.jar

OK, I see that the error in the stack trace comes from the file ProfilesConfigFileParser, which belongs to aws-java-sdk-1.7.4, which is pulled in as a transitive dependency of hadoop-aws-2.7.3 in the hdfs-storage module.

I believe it is coming from the hadoop version change in #4116.

@himanshug
Contributor

@gianm @leventov continuing the thread in #4116 (comment) .

@leventov
Member Author

@himanshug looks like a bug; aws-java-sdk should be excluded from the hdfs-storage dependency

@b-slim
Contributor

b-slim commented Jun 20, 2017

Looks like we end up getting different SDKs, due to https://github.com/druid-io/druid/pull/4313/files#diff-600376dffeb79835ede4a0b285078036L182.
We used to pull the entire suite, so we ended up with the same aws sdk. IMO it would be better to use one version rather than having different versions of core and the sdk.

@b-slim
Contributor

b-slim commented Jun 20, 2017

looks like a bug, aws-java-sdk should be excluded from hdfs-storage dependency

This dependency is used by HDFS storage when asked to load data from S3a.

@leventov
Member Author

@b-slim exclusion means that the version resolver won't consider 1.7.4. The project may then depend on aws-java-sdk-s3:1.10.77 globally, provided it is binary-compatible in the functionality that hdfs-storage uses.

@himanshug
Contributor

@leventov @b-slim
yes, if hadoop-aws-2.7.3 can work with aws-java-sdk-s3:1.10.77 to have S3a support, then I am in favor of adding that in the top-level pom and excluding aws-java-sdk from the hadoop-aws dep in the hdfs-storage pom.
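A rough POM-level sketch of this proposal — versions follow the discussion above, but module layout and exact coordinates are assumptions, not the merged change:

```xml
<!-- Hypothetical sketch of the proposal, not the actual patch. -->

<!-- In the top-level pom's dependencyManagement: pin the s3 module so one
     version is used everywhere. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-s3</artifactId>
  <version>1.10.77</version>
</dependency>

<!-- In the hdfs-storage module's pom: keep hadoop-aws for S3a support,
     but exclude the monolithic aws-java-sdk 1.7.4 it drags in
     transitively. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-aws</artifactId>
  <version>2.7.3</version>
  <exclusions>
    <exclusion>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

This only works if hadoop-aws is binary-compatible with the newer s3 module in the code paths it exercises, which is exactly the condition discussed above.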

@b-slim
Contributor

b-slim commented Jun 20, 2017

Another way to fix this would be to make profile loading lazily initialized. What do you guys think?

@himanshug
Contributor

@b-slim that might work for the problem I posted, but it's not great to have two different jars in the system that provide the same functionality or conflict with each other. My understanding is that aws-java-sdk and aws-java-sdk-ec2 are in that boat. Are they?

@b-slim
Contributor

b-slim commented Jun 20, 2017

@himanshug I have tested it with 1.10.56 against an internal 2.7.3 (patched, so it's more like 2.8-ish) and it worked. I guess we can go the route of having one version at the top.

@himanshug
Contributor

@b-slim OK, let me also test whether that solves the original problem I posted; will update.

@himanshug
Contributor

@b-slim if you know that would fix the original problem too, then go ahead and send the PR.

@b-slim
Contributor

b-slim commented Jun 21, 2017

I see you already have this: himanshug@9632b0c
