
Not able to transfer data from hive to elastic-search #417

Closed

jackx-wz opened this issue Apr 7, 2015 · 3 comments

jackx-wz commented Apr 7, 2015

I want to use es-hadoop with Hive. My environment is:

  • CentOS release 6.6 (Final)
  • elasticsearch-1.5.0
  • hadoop-2.5.2
  • hive-1.0.0
  • elasticsearch-hadoop-hive-2.1.0.Beta3.jar

I've already seen these issues: #416 #359 #409

so I cloned the source code (branch master) and used gradlew to build
elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar

./bin/plugin -u file:///tmp/elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar -i elasticsearch/hive

I also copied this jar file to the Hive lib directory and set hive-site.xml:

  <property>
    <name>hive.aux.jars.path</name>
    <value>elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar</value>                                                                             
    <description>The location of the plugin jars that contain implementations of user defined functions and serdes.</description>
  </property>
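
(Note: hive.aux.jars.path generally expects a full path or URI rather than a bare jar name, so a fuller form would look something like this - the location below is only illustrative:)

  <property>
    <name>hive.aux.jars.path</name>
    <value>/usr/local/hive/lib/elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar</value>
  </property>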

but it still does not work.

My script is:

CREATE EXTERNAL TABLE sos(
    id bigint,
    name string
) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/file';

CREATE EXTERNAL TABLE ss(
    id bigint,
    name string
) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes'='127.0.0.1',
'es.resource'='ss/{id}');
hive> INSERT OVERWRITE TABLE ss select id,name from sos;      
Query ID = root_20150407165656_78e4b9bc-8d8f-46ff-b747-a98366cedf79
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1428396432391_0002, Tracking URL = http://localhost:8088/proxy/application_1428396432391_0002/
Kill Command = /root/hadoop/hadoop-2.5.2/bin/hadoop job  -kill job_1428396432391_0002
Hadoop job information for Stage-0: number of mappers: 0; number of reducers: 0
2015-04-07 16:57:23,355 Stage-0 map = 0%,  reduce = 0%
Ended Job = job_1428396432391_0002 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-0:  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

node log:

2015-04-07 16:57:20,764 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1428396432391_0002_000002
2015-04-07 16:57:21,705 FATAL [main] org.apache.hadoop.conf.Configuration: error parsing conf job.xml
org.xml.sax.SAXParseException; systemId: file:///root/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1428396432391_0002/container_1428396432391_0002_02_000001/job.xml; lineNumber: 647; columnNumber: 51; Character reference "&#
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2252)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2205)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1078)
    at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
2015-04-07 16:57:21,721 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:///root/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1428396432391_0002/container_1428396432391_0002_02_000001/job.xml; lineNumber: 647; columnNumber: 51; Character reference "&#
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2348)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2205)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1078)
    at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
Caused by: org.xml.sax.SAXParseException; systemId: file:///root/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1428396432391_0002/container_1428396432391_0002_02_000001/job.xml; lineNumber: 647; columnNumber: 51; Character reference "&#
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2252)
    ... 5 more
2015-04-07 16:57:21,734 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1

Thanks for any help!!
Best Regards

costin (Member) commented Apr 7, 2015

You seem to be confusing a lot of things here. es-hadoop is not a plugin. I'm not sure where you got this idea, but if you look at the docs (which I highly recommend you do) you will notice that

./bin/plugin -u file:///tmp/elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar -i elasticsearch/hive

doesn't do anything.

Furthermore, you indicate that you are using elasticsearch-hadoop-hive-2.1.0.Beta3.jar yet install the same library as a plugin.

es-hadoop is a connector and should not be installed as a plugin. In fact, it doesn't touch the elasticsearch installation. As explained several times in the docs, it's a jar that you add to your Hadoop job, not to elasticsearch.
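
For example, a minimal way to make the jar visible to a Hive job is to register it in the session before running the script (a sketch - the path is illustrative):

ADD JAR /path/to/elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar;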

Second, as mentioned in the issues you posted, you need only one version of es-hadoop - if you have multiple versions (like 2.1.0.Beta3 and 2.1.0-SNAPSHOT) you are bound to run into errors.

I highly recommend you spend some time with the docs, which explain how to install the es-hadoop connector.
The error you are encountering is fixed in master - again, the issues you link confirm this.
So please, triple-check your classpath, make sure you don't have an older es-hadoop connector installed, install the latest version, and then run your script.

jackx-wz (Author) commented Apr 8, 2015

@costin Thank you for your help.
I deleted the redundant versions, removed es/plugin/*, and now use the master elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar.
Everything is running OK now. Thank you again!!

jackx-wz closed this as completed Apr 8, 2015
ritz3623 commented

Hi guys,

I am facing the same issue even after applying the solution you mention above.

My environment is as follows:

  • CentOS 6
  • elasticsearch-1.5.2
  • hadoop-2.2.0
  • hive-0.14.0
  • ES-Hadoop connector: elasticsearch-hadoop-2.1.0.BUILD-SNAPSHOT.jar

I have performed the following steps so far:

1) Placed the ES-Hadoop connector jar file in hive/lib.

2) Made this entry in hive-site.xml:

<property>
  <name>hive.aux.jars.path</name>
  <value><path to hive lib>/elasticsearch-hadoop-2.1.0.BUILD-SNAPSHOT.jar</value>
</property>

3) From a CSV file, I created a table with the following command:

create table event(event_time STRING,name STRING,serverity STRING,sourceaddress STRING,categorysignificance STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
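
(For completeness, the CSV data itself would be loaded with something along these lines - the local path is illustrative:)

LOAD DATA LOCAL INPATH '/tmp/test.csv' INTO TABLE event;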

4) Then created an external table on top of ES:

create external table eventlog(event_time STRING,name STRING,serverity STRING,sourceaddress STRING,categorysignificance STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.node' = 'localhost','es.port' = '9200','es.index.auto.create' = 'false','es.resource' = 'eventlog/events');

NOTE - I have already created the eventlog index in ES and indexed one document into it manually.
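
(For context, indexing a document manually in ES 1.x can be done with something like the following - the field values are illustrative:)

curl -XPUT 'http://localhost:9200/eventlog/events/1' -d '{
  "event_time": "2015-05-15T05:30:00",
  "name": "test-event",
  "serverity": "low",
  "sourceaddress": "10.0.0.1",
  "categorysignificance": "informational"
}'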

5) After creating the external table, I am able to read the indexed data from ES using the following query:

select * from eventlog;

6) But when I try to load the data into ES from the table created from the CSV file, using the following command:

insert into table eventlog select * from event;

After running the above query, I get this error in the Hive console:

Query ID = horton_20150515053434_f30d4a09-afbf-4593-9dc6-3f28c17ae651
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1430301842979_0007, Tracking URL = http://sandbox.hortonworks.com:8088/proxy/application_1430301842979_0007/
Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job -kill job_1430301842979_0007
Hadoop job information for Stage-0: number of mappers: 0; number of reducers: 0
2015-05-15 05:34:11,845 Stage-0 map = 0%, reduce = 0%
Ended Job = job_1430301842979_0007 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-0: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

Below is the detailed log of the same issue:

2015-05-18 03:07:36,054 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: insert into table eventlog select * from event
2015-05-18 03:07:36,057 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(206)) - Parse Completed
2015-05-18 03:07:36,057 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=parse start=1431918456054 end=1431918456057 duration=3 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,060 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,063 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(9962)) - Starting Semantic Analysis
2015-05-18 03:07:36,064 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(10001)) - Completed phase 1 of Semantic Analysis
2015-05-18 03:07:36,064 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1554)) - Get metadata for source tables
2015-05-18 03:07:36,088 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1702)) - Get metadata for subqueries
2015-05-18 03:07:36,088 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1726)) - Get metadata for destination tables
2015-05-18 03:07:36,210 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(10004)) - Completed getting MetaData in Semantic Analysis
2015-05-18 03:07:36,210 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:canHandleAstForCbo(10258)) - Not invoking CBO because the statement has too few joins
2015-05-18 03:07:36,247 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:genFileSinkPlan(6412)) - Set stats collection dir : hdfs://sandbox.hortonworks.com:8020/tmp/hive/horton/e84fab86-6fcc-4f60-9d33-9ffc2cc0d43f/hive_2015-05-18_03-07-36_054_3350098020850259592-1/-ext-10000
2015-05-18 03:07:36,254 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(771)) - Processing for FS(6)
2015-05-18 03:07:36,254 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(135)) - Processing for LIM(5)
2015-05-18 03:07:36,254 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(771)) - Processing for SEL(4)
2015-05-18 03:07:36,255 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(771)) - Processing for RS(3)
2015-05-18 03:07:36,255 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(135)) - Processing for LIM(2)
2015-05-18 03:07:36,255 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(771)) - Processing for SEL(1)
2015-05-18 03:07:36,255 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(390)) - Processing for TS(0)
2015-05-18 03:07:36,257 INFO [main]: optimizer.ColumnPrunerProcFactory (ColumnPrunerProcFactory.java:pruneReduceSinkOperator(761)) - RS 3 oldColExprMap: {VALUE._col4=Column[_col4], VALUE._col3=Column[_col3], VALUE._col2=Column[_col2], VALUE._col1=Column[_col1], VALUE._col0=Column[_col0]}
2015-05-18 03:07:36,257 INFO [main]: optimizer.ColumnPrunerProcFactory (ColumnPrunerProcFactory.java:pruneReduceSinkOperator(812)) - RS 3 newColExprMap: {VALUE._col4=Column[_col4], VALUE._col3=Column[_col3], VALUE._col2=Column[_col2], VALUE._col1=Column[_col1], VALUE._col0=Column[_col0]}
2015-05-18 03:07:36,262 INFO [main]: optimizer.GlobalLimitOptimizer (GlobalLimitOptimizer.java:transform(126)) - Qualify the optimize that reduces input size for 'limit' for limit 1
2015-05-18 03:07:36,263 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,263 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=partition-retrieving start=1431918456263 end=1431918456263 duration=0 from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner>
2015-05-18 03:07:36,263 INFO [main]: optimizer.GenMapRedUtils (GenMapRedUtils.java:setMapWork(558)) - Try to reduce input size for 'limit' sizeNeeded: 100000 file limit : 10
2015-05-18 03:07:36,264 INFO [main]: optimizer.SamplePruner (SamplePruner.java:addPath(260)) - Path pattern = hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes/*
2015-05-18 03:07:36,267 INFO [main]: optimizer.SamplePruner (SamplePruner.java:addPath(283)) - Got file: hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes/test.csv
2015-05-18 03:07:36,314 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(154)) - Looking for table scans where optimization is applicable
2015-05-18 03:07:36,314 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(178)) - Found 0 null table scans
2015-05-18 03:07:36,314 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(154)) - Looking for table scans where optimization is applicable
2015-05-18 03:07:36,315 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(178)) - Found 0 null table scans
2015-05-18 03:07:36,315 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(154)) - Looking for table scans where optimization is applicable
2015-05-18 03:07:36,315 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(178)) - Found 0 null table scans
2015-05-18 03:07:36,316 INFO [main]: physical.Vectorizer (Vectorizer.java:validateMapWork(334)) - Validating MapWork...
2015-05-18 03:07:36,316 INFO [main]: physical.Vectorizer (Vectorizer.java:validateMapWork(342)) - Input format: org.apache.hadoop.mapred.TextInputFormat, doesn't provide vectorized input
2015-05-18 03:07:36,316 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(10174)) - Completed plan generation
2015-05-18 03:07:36,316 INFO [main]: ql.Driver (Driver.java:compile(427)) - Semantic Analysis Completed
2015-05-18 03:07:36,316 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=semanticAnalyze start=1431918456060 end=1431918456316 duration=256 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,317 INFO [main]: ql.Driver (Driver.java:getSchema(235)) - Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testes.event_time, type:string, comment:null), FieldSchema(name:testes.name, type:string, comment:null), FieldSchema(name:testes.severity, type:string, comment:null), FieldSchema(name:testes.sourceaddress, type:string, comment:null), FieldSchema(name:testes.categorysignificance, type:string, comment:null)], properties:null)
2015-05-18 03:07:36,317 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=compile start=1431918456053 end=1431918456317 duration=264 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,317 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,376 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=acquireReadWriteLocks start=1431918456317 end=1431918456376 duration=59 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,376 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,376 INFO [main]: ql.Driver (Driver.java:execute(1285)) - Starting command: insert into table eventlog select * from event
2015-05-18 03:07:36,376 INFO [main]: hooks.ATSHook (ATSHook.java:<init>(87)) - Created ATS Hook
2015-05-18 03:07:36,376 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,379 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=PreHook.org.apache.hadoop.hive.ql.hooks.ATSHook start=1431918456376 end=1431918456379 duration=3 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,380 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Query ID = horton_20150518030707_32a637a4-9b82-43ef-9515-32979a634352
2015-05-18 03:07:36,380 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Total jobs = 1
2015-05-18 03:07:36,380 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=TimeToSubmit start=1431918456053 end=1431918456380 duration=327 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,380 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,380 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,381 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Launching Job 1 out of 1
2015-05-18 03:07:36,382 INFO [main]: ql.Driver (Driver.java:launchTask(1602)) - Starting task [Stage-0:MAPRED] in serial mode
2015-05-18 03:07:36,382 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - Number of reduce tasks determined at compile time: 1
2015-05-18 03:07:36,382 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - In order to change the average load for a reducer (in bytes):
2015-05-18 03:07:36,382 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - set hive.exec.reducers.bytes.per.reducer=
2015-05-18 03:07:36,383 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - In order to limit the maximum number of reducers:
2015-05-18 03:07:36,383 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - set hive.exec.reducers.max=
2015-05-18 03:07:36,383 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - In order to set a constant number of reducers:
2015-05-18 03:07:36,383 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - set mapreduce.job.reduces=
2015-05-18 03:07:36,383 INFO [main]: ql.Context (Context.java:getMRScratchDir(266)) - New scratch dir is hdfs://sandbox.hortonworks.com:8020/tmp/hive/horton/e84fab86-6fcc-4f60-9d33-9ffc2cc0d43f/hive_2015-05-18_03-07-36_054_3350098020850259592-1
2015-05-18 03:07:36,385 INFO [ATS Logger 0]: hooks.ATSHook (ATSHook.java:createPreHookEvent(148)) - Received pre-hook notification for :horton_20150518030707_32a637a4-9b82-43ef-9515-32979a634352
2015-05-18 03:07:36,402 INFO [main]: mr.ExecDriver (ExecDriver.java:execute(287)) - Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
2015-05-18 03:07:36,402 INFO [main]: mr.ExecDriver (ExecDriver.java:execute(309)) - adding libjars: file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core-0.14.0.2.2.0.0-2041.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-pig-adapter-0.14.0.2.2.0.0-2041.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-pig-adapter.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-server-extensions-0.14.0.2.2.0.0-2041.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-server-extensions.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-streaming-0.14.0.2.2.0.0-2041.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-streaming.jar
2015-05-18 03:07:36,402 INFO [main]: exec.Utilities (Utilities.java:getInputPaths(3265)) - Processing alias testes
2015-05-18 03:07:36,403 INFO [main]: exec.Utilities (Utilities.java:getInputPaths(3282)) - Adding input file hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes
2015-05-18 03:07:36,403 INFO [main]: exec.Utilities (Utilities.java:isEmptyPath(2605)) - Content Summary not cached for hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes
2015-05-18 03:07:36,406 INFO [main]: ql.Context (Context.java:getMRScratchDir(266)) - New scratch dir is hdfs://sandbox.hortonworks.com:8020/tmp/hive/horton/e84fab86-6fcc-4f60-9d33-9ffc2cc0d43f/hive_2015-05-18_03-07-36_054_3350098020850259592-1
2015-05-18 03:07:36,450 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,450 INFO [main]: exec.Utilities (Utilities.java:serializePlan(899)) - Serializing MapWork via kryo
2015-05-18 03:07:36,511 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=serializePlan start=1431918456450 end=1431918456511 duration=61 from=org.apache.hadoop.hive.ql.exec.Utilities>
2015-05-18 03:07:36,543 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,543 INFO [main]: exec.Utilities (Utilities.java:serializePlan(899)) - Serializing ReduceWork via kryo
2015-05-18 03:07:36,605 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=serializePlan start=1431918456543 end=1431918456605 duration=62 from=org.apache.hadoop.hive.ql.exec.Utilities>
2015-05-18 03:07:36,842 INFO [main]: impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2015-05-18 03:07:36,842 INFO [main]: client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at sandbox.hortonworks.com/100.77.84.63:8050
2015-05-18 03:07:37,121 INFO [main]: impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2015-05-18 03:07:37,121 INFO [main]: client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at sandbox.hortonworks.com/100.77.84.63:8050
2015-05-18 03:07:37,128 INFO [main]: mr.EsOutputFormat (EsOutputFormat.java:init(356)) - Writing to [eventlog/events]
2015-05-18 03:07:37,139 WARN [main]: mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(153)) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2015-05-18 03:07:38,735 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:38,740 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getCombineSplits(387)) - CombineHiveInputSplit creating pool for hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes; using filter path hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes
2015-05-18 03:07:38,747 INFO [main]: input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 1
2015-05-18 03:07:38,748 INFO [main]: input.CombineFileInputFormat (CombineFileInputFormat.java:createSplits(413)) - DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 0
2015-05-18 03:07:38,750 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getCombineSplits(442)) - number of splits 1
2015-05-18 03:07:38,750 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=getSplits start=1431918458735 end=1431918458750 duration=15 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2015-05-18 03:07:38,750 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getSplits(521)) - Number of all splits 1
2015-05-18 03:07:38,949 INFO [main]: mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(494)) - number of splits:1
2015-05-18 03:07:39,058 INFO [main]: mapreduce.JobSubmitter (JobSubmitter.java:printTokens(583)) - Submitting tokens for job: job_1430301842979_0026
2015-05-18 03:07:39,136 INFO [main]: impl.YarnClientImpl (YarnClientImpl.java:submitApplication(251)) - Submitted application application_1430301842979_0026
2015-05-18 03:07:39,139 INFO [main]: mapreduce.Job (Job.java:submit(1300)) - The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1430301842979_0026/
2015-05-18 03:07:39,139 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - Starting Job = job_1430301842979_0026, Tracking URL = http://sandbox.hortonworks.com:8088/proxy/application_1430301842979_0026/
2015-05-18 03:07:39,139 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job -kill job_1430301842979_0026
2015-05-18 03:07:46,362 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - Hadoop job information for Stage-0: number of mappers: 0; number of reducers: 0
2015-05-18 03:07:46,411 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2015-05-18 03:07:46,411 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - 2015-05-18 03:07:46,410 Stage-0 map = 0%, reduce = 0%
2015-05-18 03:07:46,415 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2015-05-18 03:07:46,416 ERROR [main]: exec.Task (SessionState.java:printError(833)) - Ended Job = job_1430301842979_0026 with errors
2015-05-18 03:07:46,419 ERROR [Thread-81]: exec.Task (SessionState.java:printError(833)) - Error during job, obtaining debugging information...
2015-05-18 03:07:46,419 INFO [Thread-81]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1051)) - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-18 03:07:46,473 INFO [main]: impl.YarnClientImpl (YarnClientImpl.java:killApplication(364)) - Killed application application_1430301842979_0026
2015-05-18 03:07:46,500 INFO [main]: hooks.ATSHook (ATSHook.java:<init>(87)) - Created ATS Hook
2015-05-18 03:07:46,501 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:46,501 INFO [ATS Logger 0]: hooks.ATSHook (ATSHook.java:createPostHookEvent(170)) - Received post-hook notification for :horton_20150518030707_32a637a4-9b82-43ef-9515-32979a634352
2015-05-18 03:07:46,508 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=FailureHook.org.apache.hadoop.hive.ql.hooks.ATSHook start=1431918466501 end=1431918466508 duration=7 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:46,537 ERROR [main]: ql.Driver (SessionState.java:printError(833)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
2015-05-18 03:07:46,537 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=Driver.execute start=1431918456376 end=1431918466537 duration=10161 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:46,537 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - MapReduce Jobs Launched:
2015-05-18 03:07:46,539 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
2015-05-18 03:07:46,539 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Stage-Stage-0: HDFS Read: 0 HDFS Write: 0 FAIL
2015-05-18 03:07:46,539 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Total MapReduce CPU Time Spent: 0 msec
2015-05-18 03:07:46,539 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:46,649 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=releaseLocks start=1431918466539 end=1431918466649 duration=110 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:46,668 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:46,668 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=releaseLocks start=1431918466668 end=1431918466668 duration=0 from=org.apache.hadoop.hive.ql.Driver>

NOTE - when I tried creating an external table with this command:

create external table sample_ex(event_time STRING,name STRING,serverity STRING,sourceaddress STRING,categorysignificance STRING);

I'm able to insert the data from the event table into this newly created sample_ex table.

I want to know why the insert fails only for the external table whose storage is ES. Also, in the detailed log I can see the failure hook being invoked for this job - what is that? The hook was created without any problem when working with the "sample_ex" external table.

Thanks for any help and let me know if you want any further information.

Regards!
