savetoES can't use pre-defined mapping #424
Comments
That is strange. What version of Elasticsearch are you using? Can you […]
On Wed, Apr 15, 2015 at 4:29 AM, jiangff notifications@github.com wrote:
Below is the output from debug. I am using a 3 node ES cluster (version 1.4.4):
@jiangff I've edited your post to make the snippets readable. I hope you agree they look much better, and the effort was minimal: the Markdown editor in GitHub makes this a breeze and instead of using […]
Coming back to the issue at hand, it looks like es-hadoop is not creating the index but rather using the existing one. I'm not sure how you are using the client, but after creating the mapping, test the mapping in Elasticsearch (on the 10.0.0.x nodes that the connector uses, as mentioned in the logs). You can use […] Then report back.
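One way to carry out the check suggested above is to read the mapping back over the REST API and inspect the field in question. The following is a minimal sketch, not the tool referred to in the comment: it assumes the `GET /{index}/_mapping` endpoint and the Elasticsearch 1.x response shape, and the index name `devices`, the type name `schema` and the helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_mapping(base_url, index):
    """GET /{index}/_mapping from one node and return the parsed JSON body."""
    with urlopen(f"{base_url}/{index}/_mapping") as resp:
        return json.loads(resp.read().decode("utf-8"))


def field_setting(mapping_response, index, doc_type, field):
    """Return the stored mapping for one field, or None if it is absent."""
    props = (
        mapping_response.get(index, {})
        .get("mappings", {})
        .get(doc_type, {})
        .get("properties", {})
    )
    return props.get(field)


# Shape of a 1.x response for a hypothetical index "devices" and type "schema".
sample = {
    "devices": {
        "mappings": {
            "schema": {
                "properties": {
                    "deviceid": {"type": "string", "index": "not_analyzed"}
                }
            }
        }
    }
}

setting = field_setting(sample, "devices", "schema", "deviceid")
# A missing "index" key typically means the default ("analyzed") is in effect.
print(setting.get("index", "analyzed"))
```

Against a live cluster, `fetch_mapping("http://10.0.0.1:9200", "devices")` (address hypothetical) would supply the dictionary that `sample` stands in for here.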
Not sure if it is related, but I recall a case in the past (unrelated to elasticsearch-hadoop) where I created an index, sent a mapping definition, and then, within milliseconds, tried to index a document (basically create, mapping, insert in series). In some instances, indexing a document that relied on a specific mapping failed because the mapping had not propagated to all cluster nodes in time. Adding a short pause after the mapping was set made it all work.
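Instead of a fixed pause, one could poll until the mapping is visible and only then start indexing. The sketch below is a generic retry loop under that assumption; `wait_until` and its parameters are hypothetical names, and the `check` callable is whatever test the caller chooses (for example, reading the mapping back and comparing it).

```python
import time


def wait_until(check, timeout_s=5.0, interval_s=0.1,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout_s elapses.

    Returns True if check() succeeded, False if the deadline passed first.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Stand-in for "is the mapping visible yet?": succeeds on the third attempt.
attempts = []


def mapping_visible():
    attempts.append(1)
    return len(attempts) >= 3


ready = wait_until(mapping_visible, sleep=lambda _: None)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays; in a job they would be left at their defaults.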
I found out what the issue is. There was a typo ("not_anayzed") when I created the mapping, so the mapping was never successfully created. The other library (elastic4s) didn't throw an exception, so I didn't catch it. Thanks for the reply, @costin!
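A typo like this can be caught before or after the request is sent. The sketch below assumes Elasticsearch 1.x string fields, where `index` accepts `analyzed`, `not_analyzed` or `no`, and assumes a successful mapping request returns a JSON body containing `"acknowledged": true`. The helper names are hypothetical and this is not part of elastic4s or elasticsearch-hadoop.

```python
VALID_STRING_INDEX = {"analyzed", "not_analyzed", "no"}


def invalid_index_values(properties, path=""):
    """Return (field_path, value) pairs whose "index" value is not recognised."""
    problems = []
    for name, spec in properties.items():
        field_path = f"{path}{name}"
        value = spec.get("index")
        if value is not None and value not in VALID_STRING_INDEX:
            problems.append((field_path, value))
        # Object fields nest their own "properties"; check those as well.
        nested = spec.get("properties")
        if nested:
            problems.extend(invalid_index_values(nested, field_path + "."))
    return problems


def require_acknowledged(response):
    """Raise if a parsed mapping response does not report acknowledgement."""
    if not response.get("acknowledged"):
        raise RuntimeError(f"mapping request was not acknowledged: {response}")
    return response


properties = {
    "deviceid": {"type": "string", "index": "not_anayzed"},  # the typo
    "timestamp": {"type": "date"},
}
problems = invalid_index_values(properties)
```

Checking the response explicitly is the more general safeguard, since it does not depend on knowing every valid setting in advance.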
Is there an easy way to do this now using Spark 2.2.0 and ES 6.1.2? Can the elasticsearch-hadoop connector create a mapping given a Streaming […] For example, I have the following code, but it doesn't create the index, the type, or its mapping before writing docs to ES.
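This does not settle whether the connector itself can create the mapping. As one possible approach, the index could be created with its mapping before the job starts, as in the original report. The sketch below only builds the request body, assuming Elasticsearch 6.x, where the `string` type was replaced by `text` and `keyword` and `keyword` plays the role of a not-analyzed string. The type name `doc`, the field name and the helper name are hypothetical.

```python
import json


def keyword_mapping_body(doc_type, keyword_fields):
    """Build a create-index body that maps each listed field as `keyword`."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    name: {"type": "keyword"} for name in keyword_fields
                }
            }
        }
    }


body = keyword_mapping_body("doc", ["deviceid"])
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of `PUT /{index}` before any documents are written.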
I am using the saveToEs method to write an RDD into Elasticsearch. Because I need to specify the mapping, I set "es.index.auto.create" to "false" and define a mapping beforehand using another Scala Elasticsearch library (com.sksamuel.elastic4s._). The mapping is created successfully, but after I call saveToEs to save the RDD to the predefined index/mapping, the data are not consistent with the mapping I created beforehand. Below is my code:
To create mapping:
---- and later, write to ES ----------------------
I believe I have set the spark conf properly:
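For reference, a sketch of how such settings can be passed on the command line. It assumes the documented convention that Spark only accepts `--conf` keys starting with `spark.`, and that elasticsearch-hadoop accepts its `es.*` settings with that prefix. This is not the configuration from the report; the node address and the helper name are hypothetical.

```python
def spark_submit_conf_args(es_settings):
    """Turn es-hadoop settings into `--conf spark.<key>=<value>` arguments."""
    args = []
    for key, value in sorted(es_settings.items()):
        args.extend(["--conf", f"spark.{key}={value}"])
    return args


settings = {
    "es.nodes": "10.0.0.1",           # hypothetical node address
    "es.index.auto.create": "false",  # use the existing index, do not create one
}

conf_args = spark_submit_conf_args(settings)
print(" ".join(conf_args))
```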
So although the code specifies, for example, deviceid as "not_analyzed", the deviceid field in indexName/schema still ends up as "analyzed" in ES. It seems that saveToEs overwrites the predefined mapping.
Is there a workaround for this?
Thanks!