
unable to query elastic search repository from hive #172

Closed
ashishrathi1 opened this Issue Mar 21, 2014 · 16 comments

Comments


ashishrathi1 commented Mar 21, 2014

I'm using the Elasticsearch-Hadoop integration jar with Hadoop 2.2.0, querying data from the ES repository through a Hive table. I consistently get the following error:

Error: java.io.IOException: Cannot create an instance of InputSplit class = org.elasticsearch.hadoop.hive.EsHiveInputFormat$ESHiveSplit:Class org.elasticsearch.hadoop.hive.EsHiveInputFormat$ESHiveSplit not found
    at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:147)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:370)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:402)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.ClassNotFoundException: Class org.elasticsearch.hadoop.hive.EsHiveInputFormat$ESHiveSplit not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:144)
    ... 10 more

I copied all the elasticsearch-hadoop jars (elasticsearch-hadoop-1.3.0.M2.jar and elasticsearch-hadoop-1.3.0.M2-yarn.jar) into the hadoop and aux_lib folders, but that does not resolve the problem. Any clue?

costin commented Mar 21, 2014

Looks like you have a classpath problem. First, I recommend using master - there is only one jar, and it works for both YARN and non-YARN.
Second, if you want to copy it, make sure to put it under the hive/lib folder, next to the Hive jars.
There are other ways (see the docs), but this one looks like the easiest in your case.
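For completeness, one of the other provisioning options could be registering the jar per Hive session with ADD JAR (the path below is purely illustrative - adjust it to wherever the es-hadoop jar actually lives):

```sql
-- Session-scoped alternative to copying the jar into hive/lib;
-- the path is a placeholder, not the actual install location.
ADD JAR /opt/es-hadoop/elasticsearch-hadoop.jar;
```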

ashishrathi1 commented Mar 21, 2014

I tried the way you suggested, but still no luck. I took out all the nodes and am now trying with just the master, but no luck.
Could you please suggest other ways to resolve this issue?
This looks like an issue with Hadoop 2.2.0, and I wonder if it is a naming issue with the new elasticsearch jars.

costin commented Mar 21, 2014

I'm not sure what you mean by taking out all the nodes - by "master" I was referring to the latest Elasticsearch-Hadoop code.
What does your script look like? I've tried Hadoop 2.2.0 and it works fine for me - plus we have other reports of it running.
What's your setup like?

ashishrathi1 commented Mar 21, 2014

I have a Hadoop 2.2.0 cluster (1 master, 1 slave); both nodes run an Elasticsearch (ES) repository, integrated with Hadoop using the elasticsearch-hadoop M3 plugin.

Now I'm using Hive to query the data from the ES repository and encountered the above error.
Basically, I followed the article below for the setup and hit the error at the last stage, when querying the data from Hive.

http://ryrobes.com/systems/connecting-tableau-to-elasticsearch-read-how-to-query-elasticsearch-with-hive-sql-and-hadoop/

ashishrathi1 commented Mar 21, 2014

On the Tableau client side, I got this error: [MapR]HiveODBC Error from ThriftHiveClient: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

costin commented Mar 22, 2014

@ashishrathi1 I suggest you take a step back and start again with your setup.
There is no elasticsearch-hadoop M3 plugin - the latest released version is M2. My suggestion is to use master - that is the latest code on GitHub, which you can download from Maven as a snapshot; see the docs.

The error is clear - the es-hadoop classes cannot be found. This section of the documentation covers the provisioning: http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html
If you want to copy the es-hadoop jar to the classpath, make sure you copy it into the Hive lib folder on each node of your cluster; otherwise the task might be executed on a node without elasticsearch-hadoop and will fail.

Again, take it step by step to make sure you don't skip anything - mistakes can easily creep in for a newbie and are hard to diagnose.

ashishrathi1 commented Mar 24, 2014

It was a typo; I'm using the M2 jars only.

costin commented Mar 24, 2014

Then I reiterate my previous suggestion to try the master snapshot, as there is only one jar you need to use.

ashishrathi1 commented Mar 24, 2014

Thanks costin, I was able to resolve the problem by providing the path of each jar in hive-site.xml instead of the whole aux_lib folder.
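For reference, the hive-site.xml approach described here uses Hive's hive.aux.jars.path property; a minimal sketch, with an illustrative jar path:

```xml
<!-- hive-site.xml: point Hive at the es-hadoop jar explicitly
     (the path is a placeholder for the actual jar location) -->
<property>
  <name>hive.aux.jars.path</name>
  <value>/opt/es-hadoop/elasticsearch-hadoop.jar</value>
</property>
```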

I ran into another problem: I created the following table in Hive, and I know there is a lot of data in the Elasticsearch repository.

create external table models(
    CATEGORY STRING, COMPONENT STRING, DATEOFMODELCHANGE STRING, DESCRIPTION STRING,
    FILEPATH STRING, ID BIGINT, PDFFILEPATH STRING, PNGFILEPATH STRING, SCORE DOUBLE,
    TERM STRING, TEXTFILEPATH STRING, TITLE STRING, VERSION STRING)
stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource'='models/plus/', 'es.index.auto.create'='true', 'es.nodes' = 'localhost:9200');

The table was created successfully, but when I query it I get the following output:

select * from models;
OK
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
Time taken: 0.253 seconds, Fetched: 10 row(s)

Although it fetched the correct number of records from the ES repository, all the data is displayed as NULL.
I have no idea why. Am I missing anything here?

costin commented Mar 24, 2014

Most likely because your table mapping doesn't match the mapping in Elasticsearch, and Hive infers NULL from that.
You can enable logging (through Hive's log4j) on the org.elasticsearch.hadoop package - set it to TRACE and you'll see everything that goes to and from Elasticsearch.
As explained in the docs, the table mapping (models) is used to create the query - if the field names do not match, no data is returned.
Make sure the fields in Elasticsearch are the same as in your table definition.
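The logging switch described above would look roughly like this in Hive's hive-log4j.properties (log4j 1.x syntax):

```properties
# Trace everything es-hadoop sends to and receives from Elasticsearch
log4j.logger.org.elasticsearch.hadoop=TRACE
```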

ashishrathi1 commented Mar 24, 2014

As you suggested, I enabled the logging and could not find any errors or issues. If you need the logs, I can send them to you separately.

Here is the mapping definition in ES:

{
  "models": {
    "mappings": {
      "xxxxxplus": {
        "properties": {
          "xxxxx-plus": {
            "properties": {
              "model": {
                "properties": {
                  "category":          { "type": "string" },
                  "component":         { "type": "string" },
                  "dateOfModelChange": { "type": "date", "format": "dateOptionalTime" },
                  "description":       { "type": "string" },
                  "filepath":          { "type": "string" },
                  "id":                { "type": "string" },
                  "pdfFilepath":       { "type": "string" },
                  "pngFilepath":       { "type": "string" },
                  "score":             { "type": "string" },
                  "term":              { "type": "string" },
                  "textFilepath":      { "type": "string" },
                  "title":             { "type": "string" },
                  "version":           { "type": "string" }
                }
              }
            }
          }
        }
      }
    }
  }
}

and here is the table I created inside Hive:

create external table xxxxxplus_models_new(
    category string, component string, dateOfModelChange timestamp, description string,
    filepath string, id string, pdfFilepath string, pngFilepath string, score string,
    term string, textFilepath string, title string, version string)
stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource'='models/xxxxxplus', 'es.index.auto.create'='true', 'es.nodes' = 'localhost:9200');

As per the documentation, I mapped the Elasticsearch "date" type to the Hive "timestamp" type.

Do you see any conflicts here ?
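One observation on the mapping above: the fields live nested under xxxxx-plus.model, while the Hive columns are flat, top-level names, so a name match at the top level would not find them. A sketch using es-hadoop's es.mapping.names table property to point a Hive column at a nested ES field - assuming the es-hadoop version in use supports that property:

```sql
-- Sketch only: map a flat Hive column onto the nested ES field.
-- Whether 'es.mapping.names' is available depends on the es-hadoop version.
create external table xxxxxplus_models_nested(title string)
stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.resource'      = 'models/xxxxxplus',
  'es.nodes'         = 'localhost:9200',
  'es.mapping.names' = 'title:xxxxx-plus.model.title'
);
```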

costin commented Mar 24, 2014

If you enable logging, you should see the query sent to Elasticsearch as well as the data retrieved from Elasticsearch by es-hadoop. If you have the logs and still don't see anything wrong with them, upload them somewhere and I'll take a look.

P.S. How do you load the data into ES? You can do this through Hive, which guarantees/verifies that your definition is correct.
P.P.S. Your table doesn't have to map the entire Elasticsearch index - you can create a table with only one field, and only that field will be used by es-hadoop.
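The Hive-side load mentioned above could look roughly like this, where staging_models is a hypothetical Hive table holding the source data:

```sql
-- Hypothetical sketch: write into the ES-backed table through Hive itself,
-- so the field names and types used for writing match the table definition.
INSERT OVERWRITE TABLE models
SELECT * FROM staging_models;
```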

ashishrathi1 commented Mar 24, 2014

Here is the log for both the creation and the query of the data: I created a new table and then queried it.

CREATED TABLE


hive> create external table xxxxxplus_models_title(title STRING) stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource'='models/xxxxxplus','es.index.auto.create'='true', 'es.nodes' = 'localhost:9200');
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO parse.ParseDriver: Parsing command: create external table xxxxxplus_models_title(title STRING) stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource'='models/xxxxxplus','es.index.auto.create'='true', 'es.nodes' = 'localhost:9200')
14/03/24 16:06:42 INFO parse.ParseDriver: Parse Completed
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=parse start=1395691602563 end=1395691602565 duration=2>
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
14/03/24 16:06:42 INFO parse.SemanticAnalyzer: Creating table xxxxxplus_models_title position=22
14/03/24 16:06:42 INFO ql.Driver: Semantic Analysis Completed
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=semanticAnalyze start=1395691602565 end=1395691602565 duration=0>
14/03/24 16:06:42 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=compile start=1395691602563 end=1395691602566 duration=3>
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO ql.Driver: Starting command: create external table xxxxxplus_models_title(title STRING) stored by 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource'='models/xxxxxplus','es.index.auto.create'='true', 'es.nodes' = 'localhost:9200')
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1395691602563 end=1395691602566 duration=3>
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO exec.DDLTask: Use StorageHandler-supplied org.elasticsearch.hadoop.hive.EsSerDe for table xxxxxplus_models_title
14/03/24 16:06:42 INFO metastore.HiveMetaStore: 0: create_table: Table(tableName:xxxxxplus_models_title, dbName:default, owner:hadoop, createTime:1395691602, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:title, type:string, comment:null)], location:null, inputFormat:org.elasticsearch.hadoop.hive.EsHiveInputFormat, outputFormat:org.elasticsearch.hadoop.hive.EsHiveOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.elasticsearch.hadoop.hive.EsSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{EXTERNAL=TRUE, es.nodes=localhost:9200, es.index.auto.create=true, storage_handler=org.elasticsearch.hadoop.hive.EsStorageHandler, es.resource=models/xxxxxplus}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:null, groupPrivileges:null, rolePrivileges:null))
14/03/24 16:06:42 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=create_table: Table(tableName:xxxxxplus_models_title, dbName:default, owner:hadoop, createTime:1395691602, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:title, type:string, comment:null)], location:null, inputFormat:org.elasticsearch.hadoop.hive.EsHiveInputFormat, outputFormat:org.elasticsearch.hadoop.hive.EsHiveOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.elasticsearch.hadoop.hive.EsSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{EXTERNAL=TRUE, es.nodes=localhost:9200, es.index.auto.create=true, storage_handler=org.elasticsearch.hadoop.hive.EsStorageHandler, es.resource=models/xxxxxplus}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:null, groupPrivileges:null, rolePrivileges:null))
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=task.DDL.Stage-0 start=1395691602566 end=1395691602650 duration=84>
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=runTasks start=1395691602566 end=1395691602651 duration=85>
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=Driver.execute start=1395691602566 end=1395691602651 duration=85>
OK
14/03/24 16:06:42 INFO ql.Driver: OK
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=releaseLocks start=1395691602651 end=1395691602651 duration=0>
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=Driver.run start=1395691602562 end=1395691602651 duration=89>
Time taken: 0.089 seconds
14/03/24 16:06:42 INFO CliDriver: Time taken: 0.089 seconds
14/03/24 16:06:42 INFO ql.Driver:
14/03/24 16:06:42 INFO ql.Driver: </PERFLOG method=releaseLocks start=1395691602651 end=1395691602651 duration=0>

QUERIED TABLE


hive> select * from xxxxxplus_models_title;
14/03/24 16:06:50 INFO ql.Driver:
14/03/24 16:06:50 INFO ql.Driver:
14/03/24 16:06:50 INFO ql.Driver:
14/03/24 16:06:50 INFO ql.Driver:
14/03/24 16:06:50 INFO parse.ParseDriver: Parsing command: select * from xxxxxplus_models_title
14/03/24 16:06:50 INFO parse.ParseDriver: Parse Completed
14/03/24 16:06:50 INFO ql.Driver: </PERFLOG method=parse start=1395691610538 end=1395691610539 duration=1>
14/03/24 16:06:50 INFO ql.Driver:
14/03/24 16:06:50 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
14/03/24 16:06:50 INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
14/03/24 16:06:50 INFO parse.SemanticAnalyzer: Get metadata for source tables
14/03/24 16:06:50 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=xxxxxplus_models_title
14/03/24 16:06:50 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_table : db=default tbl=xxxxxplus_models_title
14/03/24 16:06:50 INFO parse.SemanticAnalyzer: Get metadata for subqueries
14/03/24 16:06:50 INFO parse.SemanticAnalyzer: Get metadata for destination tables
14/03/24 16:06:50 INFO ql.Context: New scratch dir is hdfs://hadoop1:9000/tmp/hive-hadoop/hive_2014-03-24_16-06-50_538_7990993942077945656-1
14/03/24 16:06:50 INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
14/03/24 16:06:50 INFO ppd.OpProcFactory: Processing for FS(2)
14/03/24 16:06:50 INFO ppd.OpProcFactory: Processing for SEL(1)
14/03/24 16:06:50 INFO ppd.OpProcFactory: Processing for TS(0)
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [GET]@[localhost:9200][] w/ payload [null]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{
"status" : 200,
"name" : "es-master",
"version" : {
"number" : "1.0.1",
"build_hash" : "5c03844e1978e5cc924dab2a423dc63ce881c42b",
"build_timestamp" : "2014-02-25T15:52:53Z",
"build_snapshot" : false,
"lucene_version" : "4.6"
},
"tagline" : "You Know, for Search"
}
]
14/03/24 16:06:50 DEBUG hive.EsStorageHandler: Discovered Elasticsearch version [1.0.1]
14/03/24 16:06:50 INFO parse.SemanticAnalyzer: Completed plan generation
14/03/24 16:06:50 INFO ql.Driver: Semantic Analysis Completed
14/03/24 16:06:50 INFO ql.Driver: </PERFLOG method=semanticAnalyze start=1395691610539 end=1395691610676 duration=137>
14/03/24 16:06:50 INFO exec.TableScanOperator: Initializing Self 0 TS
14/03/24 16:06:50 INFO exec.TableScanOperator: Operator 0 TS initialized
14/03/24 16:06:50 INFO exec.TableScanOperator: Initializing children of 0 TS
14/03/24 16:06:50 INFO exec.SelectOperator: Initializing child 1 SEL
14/03/24 16:06:50 INFO exec.SelectOperator: Initializing Self 1 SEL
14/03/24 16:06:50 INFO exec.SelectOperator: SELECT structtitle:string
14/03/24 16:06:50 INFO exec.SelectOperator: Operator 1 SEL initialized
14/03/24 16:06:50 INFO exec.SelectOperator: Initializing children of 1 SEL
14/03/24 16:06:50 INFO exec.ListSinkOperator: Initializing child 3 OP
14/03/24 16:06:50 INFO exec.ListSinkOperator: Initializing Self 3 OP
14/03/24 16:06:50 INFO exec.ListSinkOperator: Operator 3 OP initialized
14/03/24 16:06:50 INFO exec.ListSinkOperator: Initialization Done 3 OP
14/03/24 16:06:50 INFO exec.SelectOperator: Initialization Done 1 SEL
14/03/24 16:06:50 INFO exec.TableScanOperator: Initialization Done 0 TS
14/03/24 16:06:50 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:title, type:string, comment:null)], properties:null)
14/03/24 16:06:50 INFO ql.Driver: </PERFLOG method=compile start=1395691610537 end=1395691610678 duration=141>
14/03/24 16:06:50 INFO ql.Driver:
14/03/24 16:06:50 INFO ql.Driver: Starting command: select * from xxxxxplus_models_title
14/03/24 16:06:50 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1395691610537 end=1395691610678 duration=141>
14/03/24 16:06:50 INFO ql.Driver:
14/03/24 16:06:50 INFO ql.Driver: </PERFLOG method=runTasks start=1395691610678 end=1395691610678 duration=0>
14/03/24 16:06:50 INFO ql.Driver: </PERFLOG method=Driver.execute start=1395691610678 end=1395691610678 duration=0>
OK
14/03/24 16:06:50 INFO ql.Driver: OK
14/03/24 16:06:50 INFO ql.Driver:
14/03/24 16:06:50 INFO ql.Driver: </PERFLOG method=releaseLocks start=1395691610678 end=1395691610679 duration=1>
14/03/24 16:06:50 INFO ql.Driver: </PERFLOG method=Driver.run start=1395691610537 end=1395691610679 duration=142>
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [GET]@[localhost:9200][_nodes/transport] w/ payload [null]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"cluster_name":"xxxxx-es-cluster","nodes":{"uCSlP4ZZQXWrvM-nI7dMNA":{"name":"es-slave1","transport_address":"inet[/10.16.9.231:9300]","host":"hadoop2","ip":"10.16.9.231","version":"1.0.1","build":"5c03844","http_address":"inet[/10.16.9.231:9200]","transport":{"bound_address":"inet[/0:0:0:0:0:0:0:0%0:9300]","publish_address":"inet[/10.16.9.231:9300]"}},"3WYvFKnGS0K1uYC9tyP5eQ":{"name":"es-master","transport_address":"inet[/10.16.10.79:9300]","host":"hadoop1","ip":"10.16.10.79","version":"1.0.1","build":"5c03844","http_address":"inet[/10.16.10.79:9200]","transport":{"bound_address":"inet[/0:0:0:0:0:0:0:0:9300]","publish_address":"inet[/10.16.10.79:9300]"}}}}]
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Nodes discovery enabled - found [10.16.9.231:9200, 10.16.10.79:9200]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [GET]@[localhost:9200][] w/ payload [null]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{
"status" : 200,
"name" : "es-master",
"version" : {
"number" : "1.0.1",
"build_hash" : "5c03844e1978e5cc924dab2a423dc63ce881c42b",
"build_timestamp" : "2014-02-25T15:52:53Z",
"build_snapshot" : false,
"lucene_version" : "4.6"
},
"tagline" : "You Know, for Search"
}
]
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Discovered Elasticsearch version [1.0.1]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [HEAD]@[localhost:9200][models/xxxxxplus] w/ payload [null]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [null]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [GET]@[localhost:9200][_nodes/http] w/ payload [null]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"cluster_name":"xxxxx-es-cluster","nodes":{"uCSlP4ZZQXWrvM-nI7dMNA":{"name":"es-slave1","transport_address":"inet[/10.16.9.231:9300]","host":"hadoop2","ip":"10.16.9.231","version":"1.0.1","build":"5c03844","http_address":"inet[/10.16.9.231:9200]","http":{"bound_address":"inet[/0:0:0:0:0:0:0:0%0:9200]","publish_address":"inet[/10.16.9.231:9200]","max_content_length_in_bytes":104857600}},"3WYvFKnGS0K1uYC9tyP5eQ":{"name":"es-master","transport_address":"inet[/10.16.10.79:9300]","host":"hadoop1","ip":"10.16.10.79","version":"1.0.1","build":"5c03844","http_address":"inet[/10.16.10.79:9200]","http":{"bound_address":"inet[/0:0:0:0:0:0:0:0:9200]","publish_address":"inet[/10.16.10.79:9200]","max_content_length_in_bytes":104857600}}}}]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [GET]@[localhost:9200][models/xxxxxplus/_search_shards] w/ payload [null]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"nodes":{"3WYvFKnGS0K1uYC9tyP5eQ":{"name":"es-master","transport_address":"inet[/10.16.10.79:9300]"},"uCSlP4ZZQXWrvM-nI7dMNA":{"name":"es-slave1","transport_address":"inet[/10.16.9.231:9300]"}},"shards":[[{"state":"STARTED","primary":false,"node":"uCSlP4ZZQXWrvM-nI7dMNA","relocating_node":null,"shard":3,"index":"models"},{"state":"STARTED","primary":true,"node":"3WYvFKnGS0K1uYC9tyP5eQ","relocating_node":null,"shard":3,"index":"models"}],[{"state":"STARTED","primary":false,"node":"uCSlP4ZZQXWrvM-nI7dMNA","relocating_node":null,"shard":2,"index":"models"},{"state":"STARTED","primary":true,"node":"3WYvFKnGS0K1uYC9tyP5eQ","relocating_node":null,"shard":2,"index":"models"}],[{"state":"STARTED","primary":false,"node":"uCSlP4ZZQXWrvM-nI7dMNA","relocating_node":null,"shard":4,"index":"models"},{"state":"STARTED","primary":true,"node":"3WYvFKnGS0K1uYC9tyP5eQ","relocating_node":null,"shard":4,"index":"models"}],[{"state":"STARTED","primary":true,"node":"3WYvFKnGS0K1uYC9tyP5eQ","relocating_node":null,"shard":1,"index":"models"},{"state":"STARTED","primary":false,"node":"uCSlP4ZZQXWrvM-nI7dMNA","relocating_node":null,"shard":1,"index":"models"}],[{"state":"STARTED","primary":true,"node":"3WYvFKnGS0K1uYC9tyP5eQ","relocating_node":null,"shard":0,"index":"models"},{"state":"STARTED","primary":false,"node":"uCSlP4ZZQXWrvM-nI7dMNA","relocating_node":null,"shard":0,"index":"models"}]]}]
14/03/24 16:06:50 TRACE mr.EsInputFormat: Creating splits for shards {Shard[state=STARTED, primary=false, node=uCSlP4ZZQXWrvM-nI7dMNA, name=3, index=models]=Node[id=uCSlP4ZZQXWrvM-nI7dMNA, name=es-slave1, ipAddress=10.16.9.231, httpPort=9200], Shard[state=STARTED, primary=false, node=uCSlP4ZZQXWrvM-nI7dMNA, name=2, index=models]=Node[id=uCSlP4ZZQXWrvM-nI7dMNA, name=es-slave1, ipAddress=10.16.9.231, httpPort=9200], Shard[state=STARTED, primary=false, node=uCSlP4ZZQXWrvM-nI7dMNA, name=4, index=models]=Node[id=uCSlP4ZZQXWrvM-nI7dMNA, name=es-slave1, ipAddress=10.16.9.231, httpPort=9200], Shard[state=STARTED, primary=true, node=3WYvFKnGS0K1uYC9tyP5eQ, name=1, index=models]=Node[id=3WYvFKnGS0K1uYC9tyP5eQ, name=es-master, ipAddress=10.16.10.79, httpPort=9200], Shard[state=STARTED, primary=true, node=3WYvFKnGS0K1uYC9tyP5eQ, name=0, index=models]=Node[id=3WYvFKnGS0K1uYC9tyP5eQ, name=es-master, ipAddress=10.16.10.79, httpPort=9200]}
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [GET]@[localhost:9200][models/xxxxxplus/_mapping] w/ payload [null]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"models":{"mappings":{"xxxxxplus":{"properties":{"xxxxx-plus":{"properties":{"model":{"properties":{"category":{"type":"string"},"component":{"type":"string"},"dateOfModelChange":{"type":"date","format":"dateOptionalTime"},"description":{"type":"string"},"filepath":{"type":"string"},"id":{"type":"string"},"pdfFilepath":{"type":"string"},"pngFilepath":{"type":"string"},"score":{"type":"string"},"term":{"type":"string"},"textFilepath":{"type":"string"},"title":{"type":"string"},"version":{"type":"string"}}}}}}}}}}]
14/03/24 16:06:50 INFO mr.EsInputFormat: Discovered mapping {models=[mappings=[xxxxxplus=[xxxxx-plus=[model=[category=STRING, component=STRING, dateOfModelChange=DATE, description=STRING, filepath=STRING, id=STRING, pdfFilepath=STRING, pngFilepath=STRING, score=STRING, term=STRING, textFilepath=STRING, title=STRING, version=STRING]]]]]} for [models/xxxxxplus]
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 INFO mr.EsInputFormat: Created [5] shard-splits
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Initializing RecordReader for [ShardInputSplit [node=[uCSlP4ZZQXWrvM-nI7dMNA/es-slave1|10.16.9.231:9200],shard=3]]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [POST]@[localhost:9200][models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:3;_only_node:uCSlP4ZZQXWrvM-nI7dMNA] w/ payload [{"query":{"match_all":{}}}]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"_scroll_id":"c2NhbjsxOzg6dUNTbFA0WlpRWFdydk0tbkk3ZE1OQTsxO3RvdGFsX2hpdHM6MDs=","took":3,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":0,"max_score":0.0,"hits":[]}}]
14/03/24 16:06:50 TRACE mr.EsInputFormat: Received scroll [ScrollQuery [scrollId=c2NhbjsxOzg6dUNTbFA0WlpRWFdydk0tbkk3ZE1OQTsxO3RvdGFsX2hpdHM6MDs=]], size [0] for query [QueryBuilder [models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:3;_only_node:uCSlP4ZZQXWrvM-nI7dMNA]]
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Closing RecordReader for [ShardInputSplit [node=[uCSlP4ZZQXWrvM-nI7dMNA/es-slave1|10.16.9.231:9200],shard=3]]
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Initializing RecordReader for [ShardInputSplit [node=[uCSlP4ZZQXWrvM-nI7dMNA/es-slave1|10.16.9.231:9200],shard=2]]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [POST]@[localhost:9200][models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:2;_only_node:uCSlP4ZZQXWrvM-nI7dMNA] w/ payload [{"query":{"match_all":{}}}]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"_scroll_id":"c2NhbjsxOzk6dUNTbFA0WlpRWFdydk0tbkk3ZE1OQTsxO3RvdGFsX2hpdHM6MTs=","took":2,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":0.0,"hits":[]}}]
14/03/24 16:06:50 TRACE mr.EsInputFormat: Received scroll [ScrollQuery [scrollId=c2NhbjsxOzk6dUNTbFA0WlpRWFdydk0tbkk3ZE1OQTsxO3RvdGFsX2hpdHM6MTs=]], size [1] for query [QueryBuilder [models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:2;_only_node:uCSlP4ZZQXWrvM-nI7dMNA]]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [POST]@[localhost:9200][_search/scroll?scroll=5m] w/ payload [c2NhbjsxOzk6dUNTbFA0WlpRWFdydk0tbkk3ZE1OQTsxO3RvdGFsX2hpdHM6MTs=]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czoxOw==","took":5,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":0.0,"hits":[{"_index":"models","_type":"xxxxxplus","_id":"5GXZku4fR2KybGO1UcD2kg","_score":0.0, "_source" : {}}]}}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: About to parse scroll content {"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czoxOw==","took":5,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":0.0,"hits":[{"_index":"models","_type":"xxxxxplus","_id":"5GXZku4fR2KybGO1UcD2kg","_score":0.0, "_source" : {}}]}}
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [5GXZku4fR2KybGO1UcD2kg]=[{}]
14/03/24 16:06:50 INFO exec.TableScanOperator: 0 forwarding 1 rows
14/03/24 16:06:50 INFO exec.SelectOperator: 1 forwarding 1 rows
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Closing RecordReader for [ShardInputSplit [node=[uCSlP4ZZQXWrvM-nI7dMNA/es-slave1|10.16.9.231:9200],shard=2]]
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Initializing RecordReader for [ShardInputSplit [node=[uCSlP4ZZQXWrvM-nI7dMNA/es-slave1|10.16.9.231:9200],shard=4]]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [POST]@[localhost:9200][models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:4;_only_node:uCSlP4ZZQXWrvM-nI7dMNA] w/ payload [{"query":{"match_all":{}}}]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"_scroll_id":"c2NhbjsxOzEwOnVDU2xQNFpaUVhXcnZNLW5JN2RNTkE7MTt0b3RhbF9oaXRzOjQ7","took":6,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":4,"max_score":0.0,"hits":[]}}]
14/03/24 16:06:50 TRACE mr.EsInputFormat: Received scroll [ScrollQuery [scrollId=c2NhbjsxOzEwOnVDU2xQNFpaUVhXcnZNLW5JN2RNTkE7MTt0b3RhbF9oaXRzOjQ7]], size [4] for query [QueryBuilder [models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:4;_only_node:uCSlP4ZZQXWrvM-nI7dMNA]]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [POST]@[localhost:9200][_search/scroll?scroll=5m] w/ payload [c2NhbjsxOzEwOnVDU2xQNFpaUVhXcnZNLW5JN2RNTkE7MTt0b3RhbF9oaXRzOjQ7]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czo0Ow==","took":7,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":4,"max_score":0.0,"hits":[{"_index":"models","_type":"xxxxxplus","_id":"58cLg3O9S-qdq6ceHoTGLA","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"IezOEjugTFmdZ-Wd_Jkhzg","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"wXstc17-RdaAa23GvaJadg","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"cltcL1XqTsKdhaQURbhyeg","_score":0.0, "_source" : {}}]}}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: About to parse scroll content {"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czo0Ow==","took":7,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":4,"max_score":0.0,"hits":[{"_index":"models","_type":"xxxxxplus","_id":"58cLg3O9S-qdq6ceHoTGLA","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"IezOEjugTFmdZ-Wd_Jkhzg","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"wXstc17-RdaAa23GvaJadg","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"cltcL1XqTsKdhaQURbhyeg","_score":0.0, "_source" : {}}]}}
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [58cLg3O9S-qdq6ceHoTGLA]=[{}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [IezOEjugTFmdZ-Wd_Jkhzg]=[{}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [wXstc17-RdaAa23GvaJadg]=[{}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [cltcL1XqTsKdhaQURbhyeg]=[{}]
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Closing RecordReader for [ShardInputSplit [node=[uCSlP4ZZQXWrvM-nI7dMNA/es-slave1|10.16.9.231:9200],shard=4]]
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Initializing RecordReader for [ShardInputSplit [node=[3WYvFKnGS0K1uYC9tyP5eQ/es-master|10.16.10.79:9200],shard=1]]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [POST]@[localhost:9200][models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:1;_only_node:3WYvFKnGS0K1uYC9tyP5eQ] w/ payload [{"query":{"match_all":{}}}]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"_scroll_id":"c2NhbjsxOzEwNjozV1l2RktuR1MwSzF1WUM5dHlQNWVROzE7dG90YWxfaGl0czo0Ow==","took":2,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":4,"max_score":0.0,"hits":[]}}]
14/03/24 16:06:50 TRACE mr.EsInputFormat: Received scroll [ScrollQuery [scrollId=c2NhbjsxOzEwNjozV1l2RktuR1MwSzF1WUM5dHlQNWVROzE7dG90YWxfaGl0czo0Ow==]], size [4] for query [QueryBuilder [models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:1;_only_node:3WYvFKnGS0K1uYC9tyP5eQ]]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [POST]@[localhost:9200][_search/scroll?scroll=5m] w/ payload [c2NhbjsxOzEwNjozV1l2RktuR1MwSzF1WUM5dHlQNWVROzE7dG90YWxfaGl0czo0Ow==]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czo0Ow==","took":3,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":4,"max_score":0.0,"hits":[{"_index":"models","_type":"xxxxxplus","_id":"e7-smRFwQPK66256p8BJbA","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"2qW2fngWQra3jAzN1arI7A","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"eQLKyletTT2xev5s6JN2PQ","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"KiZBenEPQf6MsLsR0-z88A","_score":0.0, "_source" : {}}]}}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: About to parse scroll content {"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czo0Ow==","took":3,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":4,"max_score":0.0,"hits":[{"_index":"models","_type":"xxxxxplus","_id":"e7-smRFwQPK66256p8BJbA","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"2qW2fngWQra3jAzN1arI7A","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"eQLKyletTT2xev5s6JN2PQ","_score":0.0, "_source" : {}},{"_index":"models","_type":"xxxxxplus","_id":"KiZBenEPQf6MsLsR0-z88A","_score":0.0, "_source" : {}}]}}
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [e7-smRFwQPK66256p8BJbA]=[{}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [2qW2fngWQra3jAzN1arI7A]=[{}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [eQLKyletTT2xev5s6JN2PQ]=[{}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [KiZBenEPQf6MsLsR0-z88A]=[{}]
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Closing RecordReader for [ShardInputSplit [node=[3WYvFKnGS0K1uYC9tyP5eQ/es-master|10.16.10.79:9200],shard=1]]
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Initializing RecordReader for [ShardInputSplit [node=[3WYvFKnGS0K1uYC9tyP5eQ/es-master|10.16.10.79:9200],shard=0]]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [POST]@[localhost:9200][models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:0;_only_node:3WYvFKnGS0K1uYC9tyP5eQ] w/ payload [{"query":{"match_all":{}}}]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"_scroll_id":"c2NhbjsxOzEwNzozV1l2RktuR1MwSzF1WUM5dHlQNWVROzE7dG90YWxfaGl0czoxOw==","took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":0.0,"hits":[]}}]
14/03/24 16:06:50 TRACE mr.EsInputFormat: Received scroll [ScrollQuery [scrollId=c2NhbjsxOzEwNzozV1l2RktuR1MwSzF1WUM5dHlQNWVROzE7dG90YWxfaGl0czoxOw==]], size [1] for query [QueryBuilder [models/xxxxxplus/_search?search_type=scan&scroll=5&size=50&_source=title&preference=_shards:0;_only_node:3WYvFKnGS0K1uYC9tyP5eQ]]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Sending [POST]@[localhost:9200][_search/scroll?scroll=5m] w/ payload [c2NhbjsxOzEwNzozV1l2RktuR1MwSzF1WUM5dHlQNWVROzE7dG90YWxfaGl0czoxOw==]
14/03/24 16:06:50 TRACE commonshttp.CommonsHttpTransport: Received [200-OK] [{"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czoxOw==","took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":0.0,"hits":[{"_index":"models","_type":"xxxxxplus","_id":"7892dqGOQqy0qOAXQbDfwg","_score":0.0, "_source" : {}}]}}]
14/03/24 16:06:50 TRACE serialization.ScrollReader: About to parse scroll content {"_scroll_id":"c2NhbjswOzE7dG90YWxfaGl0czoxOw==","took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":1,"max_score":0.0,"hits":[{"_index":"models","_type":"xxxxxplus","_id":"7892dqGOQqy0qOAXQbDfwg","_score":0.0, "_source" : {}}]}}
14/03/24 16:06:50 TRACE serialization.ScrollReader: Read hit result [7892dqGOQqy0qOAXQbDfwg]=[{}]
14/03/24 16:06:50 INFO exec.TableScanOperator: 0 forwarding 10 rows
14/03/24 16:06:50 INFO exec.SelectOperator: 1 forwarding 10 rows
14/03/24 16:06:50 DEBUG mr.EsInputFormat: Closing RecordReader for [ShardInputSplit [node=[3WYvFKnGS0K1uYC9tyP5eQ/es-master|10.16.10.79:9200],shard=0]]
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
14/03/24 16:06:50 DEBUG rest.RestRepository: Closing repository and connection to Elasticsearch ...
NULL
NULL
NULL
NULL
NULL
NULL
NULL
NULL
NULL
NULL
14/03/24 16:06:50 INFO exec.TableScanOperator: 0 finished. closing...
14/03/24 16:06:50 INFO exec.TableScanOperator: 0 forwarded 10 rows
14/03/24 16:06:50 INFO exec.SelectOperator: 1 finished. closing...
14/03/24 16:06:50 INFO exec.SelectOperator: 1 forwarded 10 rows
14/03/24 16:06:50 INFO exec.ListSinkOperator: 3 finished. closing...
14/03/24 16:06:50 INFO exec.ListSinkOperator: 3 forwarded 0 rows
14/03/24 16:06:50 INFO exec.SelectOperator: 1 Close done
14/03/24 16:06:50 INFO exec.TableScanOperator: 0 Close done
Time taken: 0.284 seconds, Fetched: 10 row(s)
14/03/24 16:06:50 INFO CliDriver: Time taken: 0.284 seconds, Fetched: 10 row(s)
14/03/24 16:06:50 INFO ql.Driver:
14/03/24 16:06:50 INFO ql.Driver: </PERFLOG method=releaseLocks start=1395691610821 end=1395691610821 duration=0>

@costin (Member) commented Mar 24, 2014

You are using an incorrect mapping.

Discovered mapping {models=[mappings=[xxxxxplus=[xxxxx-plus=[model=[category=STRING, component=STRING, dateOfModelChange=DATE, description=STRING, filepath=STRING, id=STRING, pdfFilepath=STRING, pngFilepath=STRING, score=STRING, term=STRING, textFilepath=STRING, title=STRING, version=STRING]]]]]} for [models/xxxxxplus]

Your category, component, etc. fields are nested, not top-level fields. I recommend you spend some time with the Elasticsearch mapping documentation, namely indices and types.
Currently you are telling es-hadoop to retrieve a top-level field called category when there isn't one; what you actually want is models.xxxxxplus.xxxxx-plus.model.category under models/xxxxxplus.
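Reconstructing from the discovered mapping in the log above, the index structure looks roughly like this (a sketch; only a few of the string fields are shown):

```json
{
  "models": {
    "mappings": {
      "xxxxxplus": {
        "properties": {
          "xxxxx-plus": {
            "properties": {
              "model": {
                "properties": {
                  "category":  { "type": "string" },
                  "component": { "type": "string" },
                  "title":     { "type": "string" }
                }
              }
            }
          }
        }
      }
    }
  }
}
```

So within each document, `category` lives at the path `xxxxx-plus.model.category`, not at the document root where the Hive table is looking for it.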

I'm closing the issue - if you still have questions, let's move the discussion to the mailing list where it belongs.

P.S. It would speed things up considerably if your posts were more readable - using a gist or another site for huge logs, and applying some formatting, helps a lot. In its current form your last post is unreadable, which makes finding the issue difficult.

costin closed this Mar 24, 2014

costin added the invalid label Mar 24, 2014

@ashishrathi1 (Author) commented Mar 24, 2014

OK, thanks.

Do you have any documentation on how to handle nested mappings?

Or would it be easier to write the data directly through Hive, so that ES creates the nested structure automatically? If so, can you point me to an example of writing data from Hive using a JSON object? It is not quite clear in the documentation.

@costin (Member) commented Mar 24, 2014

http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html#hive-alias
You can use field aliases to define nested mappings - that is, map a Hive field X to a.b.c.d.
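Applied to the mapping discussed above, the alias approach could look like the following sketch. The table name, columns, and alias paths here are illustrative (derived from the discovered mapping in the log); verify the exact property syntax against the es-hadoop Hive documentation for your version:

```sql
-- External Hive table backed by the models/xxxxxplus ES resource.
-- 'es.mapping.names' aliases each Hive column to its nested field path
-- inside the document (index/type are not part of the path).
CREATE EXTERNAL TABLE models_plus (
  category  STRING,
  component STRING,
  title     STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource' = 'models/xxxxxplus',
  'es.mapping.names' = 'category:xxxxx-plus.model.category,component:xxxxx-plus.model.component,title:xxxxx-plus.model.title');
```

With the aliases in place, `SELECT category, title FROM models_plus` should resolve the nested fields instead of returning NULLs as in the log above.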
