
Not able to transfer data from hive to elastic-search #417

Closed

jackx-wz opened this issue Apr 7, 2015 · 3 comments

jackx-wz commented Apr 7, 2015

I want to use es-hadoop with Hive. My environment is:

  • CentOS release 6.6 (Final)
  • elasticsearch-1.5.0
  • hadoop-2.5.2
  • hive-1.0.0
  • elasticsearch-hadoop-hive-2.1.0.Beta3.jar

I've already seen these issues: #416 #359 #409

so I cloned the source code (branch master) and used gradlew to build
elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar

./bin/plugin -u file:///tmp/elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar -i elasticsearch/hive

I also copied this jar file to the Hive lib directory and set hive-site.xml:

  <property>
    <name>hive.aux.jars.path</name>
    <value>elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar</value>                                                                             
    <description>The location of the plugin jars that contain implementations of user defined functions and serdes.</description>
  </property>
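
(Note: hive.aux.jars.path generally expects a full path or URI rather than a bare jar name, so a fuller form would look something like this - the location below is only illustrative:)

  <property>
    <name>hive.aux.jars.path</name>
    <value>/usr/local/hive/lib/elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar</value>
  </property>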

but it still does not work.

My script is:

CREATE EXTERNAL TABLE sos(
    id bigint,
    name string
) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/file';

CREATE EXTERNAL TABLE ss(
    id bigint,
    name string
) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes'='127.0.0.1',
'es.resource'='ss/{id}');
hive> INSERT OVERWRITE TABLE ss select id,name from sos;      
Query ID = root_20150407165656_78e4b9bc-8d8f-46ff-b747-a98366cedf79
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1428396432391_0002, Tracking URL = http://localhost:8088/proxy/application_1428396432391_0002/
Kill Command = /root/hadoop/hadoop-2.5.2/bin/hadoop job  -kill job_1428396432391_0002
Hadoop job information for Stage-0: number of mappers: 0; number of reducers: 0
2015-04-07 16:57:23,355 Stage-0 map = 0%,  reduce = 0%
Ended Job = job_1428396432391_0002 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-0:  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

node log:

2015-04-07 16:57:20,764 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1428396432391_0002_000002
2015-04-07 16:57:21,705 FATAL [main] org.apache.hadoop.conf.Configuration: error parsing conf job.xml
org.xml.sax.SAXParseException; systemId: file:///root/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1428396432391_0002/container_1428396432391_0002_02_000001/job.xml; lineNumber: 647; columnNumber: 51; Character reference "&#
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2252)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2205)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1078)
    at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
2015-04-07 16:57:21,721 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:///root/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1428396432391_0002/container_1428396432391_0002_02_000001/job.xml; lineNumber: 647; columnNumber: 51; Character reference "&#
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2348)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2205)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1078)
    at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407)
Caused by: org.xml.sax.SAXParseException; systemId: file:///root/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1428396432391_0002/container_1428396432391_0002_02_000001/job.xml; lineNumber: 647; columnNumber: 51; Character reference "&#
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2252)
    ... 5 more
2015-04-07 16:57:21,734 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1

Thanks for any help!!
Best Regards

costin (Member) commented Apr 7, 2015

You seem to be confusing a lot of things here. es-hadoop is not a plugin. I'm not sure where you got this idea, but if you look at the docs (which I highly recommend you do) you will notice that

./bin/plugin -u file:///tmp/elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar -i elasticsearch/hive

doesn't do anything.

Furthermore, you indicate that you are using elasticsearch-hadoop-hive-2.1.0.Beta3.jar yet install the same library as a plugin.

es-hadoop is a connector and should not be installed as a plugin. In fact, it doesn't touch the elasticsearch installation. As explained several times in the docs, it's a jar that you add to your Hadoop job, not to elasticsearch.
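
For example, a minimal way to make the jar visible to a Hive job is to register it in the session before running the script (a sketch - the path is illustrative):

ADD JAR /path/to/elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar;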

Second, as mentioned in the issues you posted, you need only one version of es-hadoop - if you have multiple versions (like 2.1.0.Beta3 and 2.1.0-SNAPSHOT) you are bound to run into errors.

I highly recommend you spend some time with the docs, which explain how to install the es-hadoop connector.
The error you are encountering is fixed in master - again, the issues you link confirm this.
So please, triple-check your classpath, make sure you don't have an older es-hadoop connector installed, install the latest version, and then run your script.

jackx-wz (Author) commented Apr 8, 2015

@costin Thank you for your help.
I deleted the redundant versions, removed es/plugin/*, and now use the master elasticsearch-hadoop-hive-2.1.0.BUILD-SNAPSHOT.jar.
Everything is running OK now. Thank you again!!

jackx-wz closed this as completed Apr 8, 2015
ritz3623 commented

Hi guys,

I am facing the same issue even after applying the solution you mention above.

My environment is as follows:

  • CentOS 6
  • elasticsearch-1.5.2
  • hadoop-2.2.0
  • hive-0.14.0
  • ES-Hadoop connector: elasticsearch-hadoop-2.1.0.BUILD-SNAPSHOT.jar

I have performed the following steps so far:

1) Placed the ES-Hadoop connector jar file in hive/lib.

2) Made this entry in hive-site.xml:

<property>
  <name>hive.aux.jars.path</name>
  <value><path to hive lib>/elasticsearch-hadoop-2.1.0.BUILD-SNAPSHOT.jar</value>
</property>

3) From a CSV file, I created a table with the following command:

create table event(event_time STRING,name STRING,serverity STRING,sourceaddress STRING,categorysignificance STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
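
(For completeness, the CSV data itself would be loaded with something along these lines - the local path is illustrative:)

LOAD DATA LOCAL INPATH '/tmp/test.csv' INTO TABLE event;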

4) Then created an external table on top of ES:

create external table eventlog(event_time STRING,name STRING,serverity STRING,sourceaddress STRING,categorysignificance STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.node' = 'localhost','es.port' = '9200','es.index.auto.create' = 'false','es.resource' = 'eventlog/events');

NOTE - I have already created the eventlog index in ES and indexed one document into it manually.
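
(For context, indexing a document manually in ES 1.x can be done with something like the following - the field values are illustrative:)

curl -XPUT 'http://localhost:9200/eventlog/events/1' -d '{
  "event_time": "2015-05-15T05:30:00",
  "name": "test-event",
  "serverity": "low",
  "sourceaddress": "10.0.0.1",
  "categorysignificance": "informational"
}'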

5) After creating the external table, I am able to read the indexed data from ES using the following query:

select * from eventlog;

6) But when I try to load the data into ES from the table created from the CSV file, using the following command:

insert into table eventlog select * from event;

After running the above query, I get this error in the Hive console:

Query ID = horton_20150515053434_f30d4a09-afbf-4593-9dc6-3f28c17ae651
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1430301842979_0007, Tracking URL = http://sandbox.hortonworks.com:8088/proxy/application_1430301842979_0007/
Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job -kill job_1430301842979_0007
Hadoop job information for Stage-0: number of mappers: 0; number of reducers: 0
2015-05-15 05:34:11,845 Stage-0 map = 0%, reduce = 0%
Ended Job = job_1430301842979_0007 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-0: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

Below is the detailed log of the same issue:

2015-05-18 03:07:36,054 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: insert into table eventlog select * from event
2015-05-18 03:07:36,057 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(206)) - Parse Completed
2015-05-18 03:07:36,057 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=parse start=1431918456054 end=1431918456057 duration=3 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,060 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,063 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(9962)) - Starting Semantic Analysis
2015-05-18 03:07:36,064 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(10001)) - Completed phase 1 of Semantic Analysis
2015-05-18 03:07:36,064 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1554)) - Get metadata for source tables
2015-05-18 03:07:36,088 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1702)) - Get metadata for subqueries
2015-05-18 03:07:36,088 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:getMetaData(1726)) - Get metadata for destination tables
2015-05-18 03:07:36,210 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(10004)) - Completed getting MetaData in Semantic Analysis
2015-05-18 03:07:36,210 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:canHandleAstForCbo(10258)) - Not invoking CBO because the statement has too few joins
2015-05-18 03:07:36,247 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:genFileSinkPlan(6412)) - Set stats collection dir : hdfs://sandbox.hortonworks.com:8020/tmp/hive/horton/e84fab86-6fcc-4f60-9d33-9ffc2cc0d43f/hive_2015-05-18_03-07-36_054_3350098020850259592-1/-ext-10000
2015-05-18 03:07:36,254 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(771)) - Processing for FS(6)
2015-05-18 03:07:36,254 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(135)) - Processing for LIM(5)
2015-05-18 03:07:36,254 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(771)) - Processing for SEL(4)
2015-05-18 03:07:36,255 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(771)) - Processing for RS(3)
2015-05-18 03:07:36,255 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(135)) - Processing for LIM(2)
2015-05-18 03:07:36,255 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(771)) - Processing for SEL(1)
2015-05-18 03:07:36,255 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:process(390)) - Processing for TS(0)
2015-05-18 03:07:36,257 INFO [main]: optimizer.ColumnPrunerProcFactory (ColumnPrunerProcFactory.java:pruneReduceSinkOperator(761)) - RS 3 oldColExprMap: {VALUE._col4=Column[_col4], VALUE._col3=Column[_col3], VALUE._col2=Column[_col2], VALUE._col1=Column[_col1], VALUE._col0=Column[_col0]}
2015-05-18 03:07:36,257 INFO [main]: optimizer.ColumnPrunerProcFactory (ColumnPrunerProcFactory.java:pruneReduceSinkOperator(812)) - RS 3 newColExprMap: {VALUE._col4=Column[_col4], VALUE._col3=Column[_col3], VALUE._col2=Column[_col2], VALUE._col1=Column[_col1], VALUE._col0=Column[_col0]}
2015-05-18 03:07:36,262 INFO [main]: optimizer.GlobalLimitOptimizer (GlobalLimitOptimizer.java:transform(126)) - Qualify the optimize that reduces input size for 'limit' for limit 1
2015-05-18 03:07:36,263 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,263 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=partition-retrieving start=1431918456263 end=1431918456263 duration=0 from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner>
2015-05-18 03:07:36,263 INFO [main]: optimizer.GenMapRedUtils (GenMapRedUtils.java:setMapWork(558)) - Try to reduce input size for 'limit' sizeNeeded: 100000 file limit : 10
2015-05-18 03:07:36,264 INFO [main]: optimizer.SamplePruner (SamplePruner.java:addPath(260)) - Path pattern = hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes/*
2015-05-18 03:07:36,267 INFO [main]: optimizer.SamplePruner (SamplePruner.java:addPath(283)) - Got file: hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes/test.csv
2015-05-18 03:07:36,314 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(154)) - Looking for table scans where optimization is applicable
2015-05-18 03:07:36,314 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(178)) - Found 0 null table scans
2015-05-18 03:07:36,314 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(154)) - Looking for table scans where optimization is applicable
2015-05-18 03:07:36,315 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(178)) - Found 0 null table scans
2015-05-18 03:07:36,315 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(154)) - Looking for table scans where optimization is applicable
2015-05-18 03:07:36,315 INFO [main]: physical.NullScanTaskDispatcher (NullScanTaskDispatcher.java:dispatch(178)) - Found 0 null table scans
2015-05-18 03:07:36,316 INFO [main]: physical.Vectorizer (Vectorizer.java:validateMapWork(334)) - Validating MapWork...
2015-05-18 03:07:36,316 INFO [main]: physical.Vectorizer (Vectorizer.java:validateMapWork(342)) - Input format: org.apache.hadoop.mapred.TextInputFormat, doesn't provide vectorized input
2015-05-18 03:07:36,316 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(10174)) - Completed plan generation
2015-05-18 03:07:36,316 INFO [main]: ql.Driver (Driver.java:compile(427)) - Semantic Analysis Completed
2015-05-18 03:07:36,316 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=semanticAnalyze start=1431918456060 end=1431918456316 duration=256 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,317 INFO [main]: ql.Driver (Driver.java:getSchema(235)) - Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testes.event_time, type:string, comment:null), FieldSchema(name:testes.name, type:string, comment:null), FieldSchema(name:testes.severity, type:string, comment:null), FieldSchema(name:testes.sourceaddress, type:string, comment:null), FieldSchema(name:testes.categorysignificance, type:string, comment:null)], properties:null)
2015-05-18 03:07:36,317 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=compile start=1431918456053 end=1431918456317 duration=264 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,317 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,376 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=acquireReadWriteLocks start=1431918456317 end=1431918456376 duration=59 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,376 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,376 INFO [main]: ql.Driver (Driver.java:execute(1285)) - Starting command: insert into table eventlog select * from event
2015-05-18 03:07:36,376 INFO [main]: hooks.ATSHook (ATSHook.java:<init>(87)) - Created ATS Hook
2015-05-18 03:07:36,376 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,379 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=PreHook.org.apache.hadoop.hive.ql.hooks.ATSHook start=1431918456376 end=1431918456379 duration=3 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,380 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Query ID = horton_20150518030707_32a637a4-9b82-43ef-9515-32979a634352
2015-05-18 03:07:36,380 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Total jobs = 1
2015-05-18 03:07:36,380 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=TimeToSubmit start=1431918456053 end=1431918456380 duration=327 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:36,380 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,380 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,381 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Launching Job 1 out of 1
2015-05-18 03:07:36,382 INFO [main]: ql.Driver (Driver.java:launchTask(1602)) - Starting task [Stage-0:MAPRED] in serial mode
2015-05-18 03:07:36,382 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - Number of reduce tasks determined at compile time: 1
2015-05-18 03:07:36,382 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - In order to change the average load for a reducer (in bytes):
2015-05-18 03:07:36,382 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - set hive.exec.reducers.bytes.per.reducer=
2015-05-18 03:07:36,383 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - In order to limit the maximum number of reducers:
2015-05-18 03:07:36,383 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - set hive.exec.reducers.max=
2015-05-18 03:07:36,383 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - In order to set a constant number of reducers:
2015-05-18 03:07:36,383 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - set mapreduce.job.reduces=
2015-05-18 03:07:36,383 INFO [main]: ql.Context (Context.java:getMRScratchDir(266)) - New scratch dir is hdfs://sandbox.hortonworks.com:8020/tmp/hive/horton/e84fab86-6fcc-4f60-9d33-9ffc2cc0d43f/hive_2015-05-18_03-07-36_054_3350098020850259592-1
2015-05-18 03:07:36,385 INFO [ATS Logger 0]: hooks.ATSHook (ATSHook.java:createPreHookEvent(148)) - Received pre-hook notification for :horton_20150518030707_32a637a4-9b82-43ef-9515-32979a634352
2015-05-18 03:07:36,402 INFO [main]: mr.ExecDriver (ExecDriver.java:execute(287)) - Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
2015-05-18 03:07:36,402 INFO [main]: mr.ExecDriver (ExecDriver.java:execute(309)) - adding libjars: file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core-0.14.0.2.2.0.0-2041.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-pig-adapter-0.14.0.2.2.0.0-2041.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-pig-adapter.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-server-extensions-0.14.0.2.2.0.0-2041.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-server-extensions.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-streaming-0.14.0.2.2.0.0-2041.jar,file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-streaming.jar
2015-05-18 03:07:36,402 INFO [main]: exec.Utilities (Utilities.java:getInputPaths(3265)) - Processing alias testes
2015-05-18 03:07:36,403 INFO [main]: exec.Utilities (Utilities.java:getInputPaths(3282)) - Adding input file hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes
2015-05-18 03:07:36,403 INFO [main]: exec.Utilities (Utilities.java:isEmptyPath(2605)) - Content Summary not cached for hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes
2015-05-18 03:07:36,406 INFO [main]: ql.Context (Context.java:getMRScratchDir(266)) - New scratch dir is hdfs://sandbox.hortonworks.com:8020/tmp/hive/horton/e84fab86-6fcc-4f60-9d33-9ffc2cc0d43f/hive_2015-05-18_03-07-36_054_3350098020850259592-1
2015-05-18 03:07:36,450 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,450 INFO [main]: exec.Utilities (Utilities.java:serializePlan(899)) - Serializing MapWork via kryo
2015-05-18 03:07:36,511 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=serializePlan start=1431918456450 end=1431918456511 duration=61 from=org.apache.hadoop.hive.ql.exec.Utilities>
2015-05-18 03:07:36,543 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:36,543 INFO [main]: exec.Utilities (Utilities.java:serializePlan(899)) - Serializing ReduceWork via kryo
2015-05-18 03:07:36,605 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=serializePlan start=1431918456543 end=1431918456605 duration=62 from=org.apache.hadoop.hive.ql.exec.Utilities>
2015-05-18 03:07:36,842 INFO [main]: impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2015-05-18 03:07:36,842 INFO [main]: client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at sandbox.hortonworks.com/100.77.84.63:8050
2015-05-18 03:07:37,121 INFO [main]: impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2015-05-18 03:07:37,121 INFO [main]: client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at sandbox.hortonworks.com/100.77.84.63:8050
2015-05-18 03:07:37,128 INFO [main]: mr.EsOutputFormat (EsOutputFormat.java:init(356)) - Writing to [eventlog/events]
2015-05-18 03:07:37,139 WARN [main]: mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(153)) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2015-05-18 03:07:38,735 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:38,740 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getCombineSplits(387)) - CombineHiveInputSplit creating pool for hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes; using filter path hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/testes
2015-05-18 03:07:38,747 INFO [main]: input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 1
2015-05-18 03:07:38,748 INFO [main]: input.CombineFileInputFormat (CombineFileInputFormat.java:createSplits(413)) - DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 0
2015-05-18 03:07:38,750 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getCombineSplits(442)) - number of splits 1
2015-05-18 03:07:38,750 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=getSplits start=1431918458735 end=1431918458750 duration=15 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2015-05-18 03:07:38,750 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getSplits(521)) - Number of all splits 1
2015-05-18 03:07:38,949 INFO [main]: mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(494)) - number of splits:1
2015-05-18 03:07:39,058 INFO [main]: mapreduce.JobSubmitter (JobSubmitter.java:printTokens(583)) - Submitting tokens for job: job_1430301842979_0026
2015-05-18 03:07:39,136 INFO [main]: impl.YarnClientImpl (YarnClientImpl.java:submitApplication(251)) - Submitted application application_1430301842979_0026
2015-05-18 03:07:39,139 INFO [main]: mapreduce.Job (Job.java:submit(1300)) - The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1430301842979_0026/
2015-05-18 03:07:39,139 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - Starting Job = job_1430301842979_0026, Tracking URL = http://sandbox.hortonworks.com:8088/proxy/application_1430301842979_0026/
2015-05-18 03:07:39,139 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job -kill job_1430301842979_0026
2015-05-18 03:07:46,362 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - Hadoop job information for Stage-0: number of mappers: 0; number of reducers: 0
2015-05-18 03:07:46,411 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2015-05-18 03:07:46,411 INFO [main]: exec.Task (SessionState.java:printInfo(824)) - 2015-05-18 03:07:46,410 Stage-0 map = 0%, reduce = 0%
2015-05-18 03:07:46,415 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2015-05-18 03:07:46,416 ERROR [main]: exec.Task (SessionState.java:printError(833)) - Ended Job = job_1430301842979_0026 with errors
2015-05-18 03:07:46,419 ERROR [Thread-81]: exec.Task (SessionState.java:printError(833)) - Error during job, obtaining debugging information...
2015-05-18 03:07:46,419 INFO [Thread-81]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1051)) - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-05-18 03:07:46,473 INFO [main]: impl.YarnClientImpl (YarnClientImpl.java:killApplication(364)) - Killed application application_1430301842979_0026
2015-05-18 03:07:46,500 INFO [main]: hooks.ATSHook (ATSHook.java:<init>(87)) - Created ATS Hook
2015-05-18 03:07:46,501 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:46,501 INFO [ATS Logger 0]: hooks.ATSHook (ATSHook.java:createPostHookEvent(170)) - Received post-hook notification for :horton_20150518030707_32a637a4-9b82-43ef-9515-32979a634352
2015-05-18 03:07:46,508 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=FailureHook.org.apache.hadoop.hive.ql.hooks.ATSHook start=1431918466501 end=1431918466508 duration=7 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:46,537 ERROR [main]: ql.Driver (SessionState.java:printError(833)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
2015-05-18 03:07:46,537 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=Driver.execute start=1431918456376 end=1431918466537 duration=10161 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:46,537 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - MapReduce Jobs Launched:
2015-05-18 03:07:46,539 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
2015-05-18 03:07:46,539 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Stage-Stage-0: HDFS Read: 0 HDFS Write: 0 FAIL
2015-05-18 03:07:46,539 INFO [main]: ql.Driver (SessionState.java:printInfo(824)) - Total MapReduce CPU Time Spent: 0 msec
2015-05-18 03:07:46,539 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:46,649 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=releaseLocks start=1431918466539 end=1431918466649 duration=110 from=org.apache.hadoop.hive.ql.Driver>
2015-05-18 03:07:46,668 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
2015-05-18 03:07:46,668 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=releaseLocks start=1431918466668 end=1431918466668 duration=0 from=org.apache.hadoop.hive.ql.Driver>

NOTE - when I tried creating an external table with this command:

create external table sample_ex(event_time STRING,name STRING,serverity STRING,sourceaddress STRING,categorysignificance STRING);

I'm able to insert the data from the event table into this newly created sample_ex table.

I want to know why the insert fails only for the external table whose storage is ES. Also, in the detailed log I can see the failure hook being invoked for this job - what is that? The hook was created without any problem when working with the "sample_ex" external table.

Thanks for any help and let me know if you want any further information.

Regards!
