"Index datasource from Hadoop 3.1.1 hdfs failed in kerberized cluster"
Affected Version
0.18.1
Description
I am trying to index data from HDFS into Druid, but the task fails.
Command:
curl --negotiate -u:druid-XXX@XXX.COM -b /tmp/krb5cc_1008 -X 'POST' -H 'Content-Type:application/json' -d @/home/druid/wikipedia-index-hadoop.json http://XXX.XXX.com:8390/druid/indexer/v1/task
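For completeness, the Kerberos ticket for the submitting user is obtained beforehand, roughly like this (a minimal sketch; the keytab path and principal are the ones from the _common settings quoted further down):
# sketch only: get a TGT for the Druid principal and confirm it is in the cache
kinit -kt /etc/security/keytabs/druid.headless.keytab druid-XXX@XXX.COM
klist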
JSON spec:
{
  "type" : "index_hadoop",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "wikipedia_hadoop_29092020",
      "parser" : {
        "type" : "hadoopyString",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : {
            "dimensions" : [
              "channel",
              "cityName",
              "comment",
              "countryIsoCode",
              "countryName",
              "isAnonymous",
              "isMinor",
              "isNew",
              "isRobot",
              "isUnpatrolled",
              "metroCode",
              "namespace",
              "page",
              "regionIsoCode",
              "regionName",
              "user",
              { "name": "added", "type": "long" },
              { "name": "deleted", "type": "long" },
              { "name": "delta", "type": "long" }
            ]
          },
          "timestampSpec" : {
            "format" : "auto",
            "column" : "time"
          }
        }
      },
      "metricsSpec" : [],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "queryGranularity" : "none",
        "intervals" : ["2015-09-12/2015-09-13"],
        "rollup" : false
      }
    },
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "/user/druid/quickstart/wikiticker-2015-09-12-sampled.json.gz"
      }
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 5000000
      },
      "forceExtendableShardSpecs" : true,
      "jobProperties" : {
        "fs.default.name" : "hdfs://nn",
        "fs.defaultFS" : "hdfs://nn/user/druid",
        "dfs.datanode.address" : "0.0.0.0:50010",
        "dfs.client.use.datanode.hostname" : "true",
        "dfs.datanode.use.datanode.hostname" : "true",
        "yarn.resourcemanager.hostname" : "xxx.xxx.com",
        "yarn.nodemanager.vmem-check-enabled" : "false",
        "mapreduce.map.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.job.user.classpath.first" : "true",
        "mapreduce.reduce.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.map.memory.mb" : 1024,
        "mapreduce.reduce.memory.mb" : 1024
      }
    }
  },
  "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:3.1.1"]
}
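(For context, this is roughly how I would check from the same host that the principal can reach the input path at all, outside of Druid; a sketch only, with the namenode URI taken from fs.default.name in jobProperties above and the ticket from the kinit shown earlier.)
# sketch: list the input file directly with the kerberized Hadoop client
hdfs dfs -ls hdfs://nn/user/druid/quickstart/wikiticker-2015-09-12-sampled.json.gz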
Errors from the task log while the indexing task is running:
org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2020-09-30T03:27:20,417 WARN [task-runner-0-priority-0] org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2020-09-30T03:27:20,531 WARN [task-runner-0-priority-0] org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
Error: com.google.inject.internal.Errors.checkNotNull(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
Error: com.google.inject.internal.Errors.checkNotNull(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
I have already configured Druid for the kerberized Hadoop cluster by setting the following in _common:
druid.security.extensions.loadList=["druid-kerberos"]
druid.hadoop.security.kerberos.keytab=/etc/security/keytabs/druid.headless.keytab
druid.hadoop.security.kerberos.principal=druid-XXX@XXX.COM
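(A sketch of how the keytab itself can be checked against that principal, in case it helps rule out a mismatch; klist with -kt only lists the keytab entries, it does not authenticate.)
# sketch: confirm the keytab actually contains the configured principal
klist -kt /etc/security/keytabs/druid.headless.keytab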
I am also following the doc at https://druid.apache.org/docs/0.18.1/tutorials/tutorial-kerberos-hadoop.html and have copied the Hadoop configuration *-site.xml files into the Druid conf dir, but I still get the same error.
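(As a sanity check on those copied files, something like the following should show that the copied core-site.xml declares Kerberos authentication; the _common path here is an assumption based on the standard 0.18.1 cluster layout.)
# sketch: verify the core-site.xml that Druid picks up is the kerberized one
grep -A1 "hadoop.security.authentication" conf/druid/cluster/_common/core-site.xml
# expected: <value>kerberos</value>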
Am I missing something?
Thanks in advance.