Unable to write elastic search data #891
Comments
Could you increase the logging level to |
For some reason I was unable to get the logs when running on the grid, so I ran it locally and hit a different issue. Here are the logs with auto index creation turned off: https://gist.github.com/jayesh4192/fbbd3e7a5280dd12569831993483ba56
My use case is to store the data in multiple indices based on a filtering criterion. For example, if I have a table containing dep_name and employee records, I need to store the employee data per department: computer_science/students, electrical/students.
register /tmp/elasticsearch-hadoop-pig-5.0.0.jar ;
define EsStorage org.elasticsearch.hadoop.pig.EsStorage('es.resource.write={name}/publisher', 'es.nodes=host', 'es.port=4080', 'es.http.timeout=2m', 'es_net_proxy_http_host=httpproxy', 'es_net_proxy_http_port=4080', 'es.http.retries=2', 'es.index.auto.create=true');
layout = load '$LAYOUT_TABLE' using PigStorage(',') as (layout_id, name);
Here is the gist with the error: https://gist.github.com/jayesh4192/011e9c8683224e0944fd103125602df7
@jayesh4192 My guess in this case is that you have index creation disabled in Elasticsearch when a document is indexed. When using a pattern for your resource write you will need to set |
Issue description
Unable to write data with EsStorage when using multi-index dynamic writes.
I am able to index and write the data without dynamic writes.
When I set es.resource=index_name/some_type, which is static, the write succeeds.
If I change it to es.resource.write=index_name/{some_field_name}, it doesn't work. It just says "Cannot detect ES version", which is also misleading.
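To make the expected behavior concrete, here is a minimal Python sketch of what a dynamic resource pattern is meant to do: each `{field}` placeholder is replaced by that field's value in the record being written, so different records land in different indices. This is only an illustration of the idea, not ES-Hadoop's implementation; the function name `resolve_resource` and the sample records are hypothetical.

```python
import re


def resolve_resource(pattern, record):
    """Replace each {field} placeholder in pattern with record[field].

    Hypothetical helper for illustration only; ES-Hadoop resolves
    patterns internally and may behave differently in edge cases.
    """
    def substitute(match):
        field = match.group(1)
        if field not in record:
            raise KeyError(f"field '{field}' missing from record")
        return str(record[field])

    return re.sub(r"\{([^{}]+)\}", substitute, pattern)


# Sample records (made up), one per department.
records = [
    {"layout_id": 1, "name": "computer_science"},
    {"layout_id": 2, "name": "electrical"},
]

# Each record resolves to its own index/type target.
targets = [resolve_resource("{name}/students", r) for r in records]
for target in targets:
    print(target)
```

A static resource such as index_name/some_type contains no placeholders and passes through unchanged, which is the case that already works here.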
Steps to reproduce
Code:
register /tmp/elasticsearch-hadoop-pig-5.0.0.jar ;
define EsStorage org.elasticsearch.hadoop.pig.EsStorage('es.resource={name}/some', 'es.nodes=some server', 'es.port=4080', 'es.http.timeout=2m', 'es_net_proxy_http_host=some_proxy_host', 'es_net_proxy_http_port=4080', 'es.http.retries=2', 'es_batch_write_retry_wait=100s', 'es_batch_write_refresh=false');
layout = load '$LAYOUT_TABLE' using PigStorage(',') as (layout_id, name);
layout = foreach layout generate
REPLACE(layout_id,'"','') as layout_id,
REPLACE(name,'"','') as name;
layout = foreach layout generate CurrentTime() as timestamp, (long)layout_id as layout, (chararray)name as name;
STORE layout into '{name}/some' USING EsStorage();
Stack trace:
16/11/10 20:41:25 INFO tez.TezJob: DAG Status: status=FAILED, progress=TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 1 Killed: 0 FailedTaskAttempts: 4, diagnostics=Vertex failed, vertexName=scope-23, vertexId=vertex_1478766437060_121497_1_00, diagnostics=[Task failed, taskId=task_1478766437060_121497_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:247)
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:545)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:173)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:149)
at org.elasticsearch.hadoop.pig.EsStorage.putNext(EsStorage.java:189)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:129)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:378)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:243)
Version Info
OS:
JVM :
Hadoop/Spark:
ES-Hadoop : 5.0
ES :