can't append a value to array in elasticsearch from hive #2078

Open
ealio opened this issue Apr 15, 2023 · 1 comment
Comments

@ealio

ealio commented Apr 15, 2023

What kind of issue is this?

  • [x] Bug report. If you’ve found a bug, please provide a code snippet or test to reproduce it below.
    The easier it is to track down the bug, the faster it is solved.

Issue description

Description
I want to sync data from Hive to Elasticsearch using elasticsearch-hadoop. One field is of array type (defined as keyword in the mapping), and I want to append the new value to that array whenever I run an INSERT statement in Hive. But it always fails with the error below.
Ended Job = job_local1121272017_0005 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-2: HDFS Read: 0 HDFS Write: 0 FAIL

Steps to reproduce

1. Create an index in Elasticsearch with the mapping defined as shown below.
curl -X PUT -H "Content-Type:application/json" -d '{"mappings":{"employee":{"dynamic":false,"properties" : {"empname" : {"type" : "keyword"},"id" : {"type" : "long"},"targetid":{"type":"keyword"}}}}}' "http://localhost:9200/vidaa"

It's created successfully as shown below.

es@ecs-18775:~$ curl "http://localhost:9200/vidaa/employee/_mapping?pretty"
{
  "vidaa" : {
    "mappings" : {
      "employee" : {
        "dynamic" : "false",
        "properties" : {
          "empname" : {
            "type" : "keyword"
          },
          "id" : {
            "type" : "long"
          },
          "targetid" : {
            "type" : "keyword"
          }
        }
      }
    }
  }
}

2. Create an external table in Hive, with an update script defined to append data on update (a guarded variant of this script is sketched after step 4).
CREATE EXTERNAL TABLE ext_employee (
  id BIGINT,
  empName STRING,
  targetid ARRAY<STRING>)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource' = 'vidaa/employee',
  'es.mapping.id' = 'id',
  'es.write.operation' = 'upsert',
  'es.update.script.params' = 'a_data:targetid',
  'es.update.script.inline' = "ctx._source.targetid.add(params.a_data)",
  'es.nodes' = 'localhost',
  'es.port' = '9200',
  'es.nodes.wan.only' = 'true');

3. Then I attempted to insert values into the ext_employee table, expecting the data to be synced to the Elasticsearch index. The SQL statements are:
insert into ext_employee values (8, 'Vicky', array('co2'));
insert into ext_employee values (9, 'Kevin', array('co3'));

4. I want to store 'co2' and 'co3' in an array in Elasticsearch, like "targetid": ["co2", "co3"]. But I always get the error below.
Ended Job = job_local1121272017_0005 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-2: HDFS Read: 0 HDFS Write: 0 FAIL
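One possible refinement of the update script from step 2, offered only as a sketch and not as a confirmed fix for the failure above: ctx._source.targetid.add(params.a_data) throws if an existing document has no targetid field, so a null guard is a common defensive pattern for this kind of append script. It can be applied to the existing table without recreating it (property names and params wiring unchanged from step 2):

ALTER TABLE ext_employee SET TBLPROPERTIES (
  -- create the array if it is missing on the existing document, otherwise append to it
  'es.update.script.inline' = 'if (ctx._source.targetid == null) { ctx._source.targetid = [params.a_data] } else { ctx._source.targetid.add(params.a_data) }'
);

After re-running the INSERT statements, the resulting document can be checked with curl "http://localhost:9200/vidaa/employee/8?pretty" to see whether targetid contains the expected values.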

Version Info

OS: Ubuntu 18
JVM: 1.8
Hadoop/Spark: Hive 3.1.3
ES-Hadoop: elasticsearch-hadoop-8.7.0.jar
ES: 6.1.4


@jbaiera
Member

jbaiera commented Jun 29, 2023

Sorry for the late response here - Are you able to obtain more information from the failed tasks?
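Since the job id in the output (job_local1121272017_0005) indicates Hive executed the job in local mode, the task-level stack trace normally ends up in Hive's own log rather than in YARN. A sketch of how to surface it, assuming the default Hive CLI and the default local log location (/tmp/<user>/hive.log), which may differ on this setup:

# re-run the failing statement with task logging sent to the console
hive --hiveconf hive.root.logger=INFO,console -e "insert into ext_employee values (9, 'Kevin', array('co3'));"

# or search the local Hive log for the failed job id
grep -A 40 'job_local1121272017_0005' /tmp/$USER/hive.log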
