tuningConfig.jobProperties not passed to hadoop #5135

Closed
gvsmirnov opened this issue Dec 4, 2017 · 6 comments

Comments

@gvsmirnov
Contributor

gvsmirnov commented Dec 4, 2017

While upgrading from 0.9.1 to 0.10.1, we noticed that the segment reindexing tasks are failing with the following exception:

Caused by: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties (respectively).
	at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:70) ~[?:?]
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:80) ~[?:?]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191) ~[?:?]
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[?:?]
	at org.apache.hadoop.fs.s3native.$Proxy209.initialize(Unknown Source) ~[?:?]
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:334) ~[?:?]
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669) ~[?:?]
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94) ~[?:?]
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703) ~[?:?]
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685) ~[?:?]
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373) ~[?:?]
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295) ~[?:?]
	at io.druid.indexer.hadoop.DatasourceInputFormat$3$1.listStatus(DatasourceInputFormat.java:173) ~[?:?]
	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315) ~[?:?]
	at io.druid.indexer.hadoop.DatasourceInputFormat.lambda$getLocations$1(DatasourceInputFormat.java:213) ~[?:?]
	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267) ~[?:1.8.0_131]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374) ~[?:1.8.0_131]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[?:1.8.0_131]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[?:1.8.0_131]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_131]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_131]
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) ~[?:1.8.0_131]
	at io.druid.indexer.hadoop.DatasourceInputFormat.getFrequentLocations(DatasourceInputFormat.java:236) ~[?:?]
	at io.druid.indexer.hadoop.DatasourceInputFormat.toDataSourceSplit(DatasourceInputFormat.java:194) ~[?:?]
	at io.druid.indexer.hadoop.DatasourceInputFormat.getSplits(DatasourceInputFormat.java:124) ~[?:?]
	at org.apache.hadoop.mapreduce.lib.input.DelegatingInputFormat.getSplits(DelegatingInputFormat.java:115) ~[?:?]
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301) ~[?:?]
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318) ~[?:?]
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196) ~[?:?]
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) ~[?:?]
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) ~[?:?]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131]
	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) ~[?:?]
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) ~[?:?]
	at io.druid.indexer.IndexGeneratorJob.run(IndexGeneratorJob.java:205) ~[druid-indexing-hadoop-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:372) ~[druid-indexing-hadoop-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:95) ~[druid-indexing-hadoop-0.10.1-iap3.jar:0.10.1-iap3]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:277) ~[druid-indexing-service-0.10.1-iap3.jar:0.10.1-iap3]

However, a few lines earlier in the logs the values are clearly present, set as per the documentation (and in a way that worked fine before the upgrade):

{
  "type" : "index_hadoop",
  "spec" : {
    // ...
    "tuningConfig" : {
      "type" : "hadoop",
      "jobProperties" : {
        "fs.s3n.awsAccessKeyId" : "<key id>",        // <- here are the
        "fs.s3n.awsSecretAccessKey" : "<secret key>" // <- properties
      },
      //...
    }
  },
  // ...
}

After some investigation, I found out that the real config is ignored, and what Hadoop gets instead is this:

JobConf dummyConf = new JobConf();

In the PR that introduced this line, the only discussion of it is that dummyConf should be a local variable instead of a field: #2223 (comment)

I am currently looking for a workaround, but this should be fixed for good. However, I cannot understand how dummyConf was ever supposed to work. Maybe @navis can help explain?
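
For illustration only (this is not the actual Druid code or a patch, and the class and method names below are made up), the general shape of a fix would be to build the JobConf used for listing segment files from the job's real Configuration instead of a bare new JobConf(), so the jobProperties are present when the FileSystem is initialized:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;

final class SegmentListingConf
{
  private SegmentListingConf() {}

  // Copy the job's real Configuration instead of constructing an empty JobConf,
  // so that tuningConfig.jobProperties (e.g. fs.s3n.awsAccessKeyId and
  // fs.s3n.awsSecretAccessKey) are visible to FileSystem.get() during split listing.
  static JobConf withJobProperties(Configuration jobConfiguration)
  {
    return new JobConf(jobConfiguration);
  }
}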

@gvsmirnov
Contributor Author

I can also see that this line was already present in 0.9.1. Back then the error may not have manifested because the config was never queried. Perhaps a newer change (e.g. #4116) exposed this configuration-loss bug.

@himanshug
Contributor

Commented in #2223 (review).

Thanks for identifying the root cause.

@gvsmirnov
Contributor Author

@himanshug I fixed this issue for 0.10.1 here: Plumbr@640277f. We have by now successfully reindexed some months' worth of data with a version built from that branch. However, porting the fix to master is non-trivial because io.druid.indexer.hadoop.DatasourceInputFormat has changed in newer versions, and we don't have a 0.11 cluster at hand to verify the fix. Is it possible to merge the fix at least for 0.10?

@himanshug
Contributor

@gvsmirnov In the Druid development workflow, bugs are always fixed in master and then backported to a specific release branch if necessary. We wouldn't be able to do a new 0.10.1 release even if we merged this into the 0.10.1 branch, and it would be strange for a bug to be fixed in an older, unreleased branch but not in master and upcoming releases.

@gvsmirnov
Contributor Author

@himanshug I see, that's reasonable and I agree. It will likely take some time before I can verify the fix in a version built from master, though. I cannot give an ETA at the moment, but I will get back to it.

@psalaberria002

We are having the same issue with the properties not being passed to Hadoop. Is there any other way to set them?
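
Not a verified workaround, and the snippet below is our own probe rather than Druid code, but it may be worth checking: because the splits are listed with a bare new JobConf(), that configuration only contains whatever Hadoop loads from the classpath (core-default.xml, core-site.xml, and so on), so putting the fs.s3n.* keys into a core-site.xml visible to the indexing task might be picked up even though jobProperties are dropped. Something like the following prints what such a bare JobConf actually resolves:

import org.apache.hadoop.mapred.JobConf;

public class DummyConfProbe
{
  public static void main(String[] args)
  {
    // Same construction as in DatasourceInputFormat: only classpath resources are loaded.
    JobConf dummyConf = new JobConf();
    System.out.println("fs.s3n.awsAccessKeyId = " + dummyConf.get("fs.s3n.awsAccessKeyId"));
    System.out.println("fs.s3n.awsSecretAccessKey set: "
        + (dummyConf.get("fs.s3n.awsSecretAccessKey") != null));
  }
}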
