Using es.mapping.exclude/include and still getting StrictDynamicMappingException on excluded fields #1015
Comments
Hi there, could you include the full stack trace?
I set the log level to trace and updated the stack trace. Also, is creating a new field and excluding it from being inserted the only way to get elasticsearch-hadoop to use a document id that is a combination of two fields?
For the time being, that is correct. It's fairly easy to generate a field via the provided tools that we integrate with, so we just provide the ability to pull a value from a field for specifying those pieces of metadata. This keeps the connector simpler and easier to test. Thanks for the logs, but the attached log contents seem like they're just from the job driver. Could you include the logs from the workers? I have a hunch on where this might be breaking, but I want to make sure that it's not something else.

Edit: To elaborate, it seems that the serialization code for Pig does not uniformly apply the field filtering logic the same way that the serialization code for other integrations does. I've still got some work to do in order to pin down the exact place where it's breaking, but it definitely smells like a bug with Pig.
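For reference, the field-generation approach described above can be sketched in Pig Latin. The input path, field names, and index name below are illustrative placeholders, not taken from the original report:

```pig
-- Load the source data (path and schema are hypothetical).
data = LOAD 's3://bucket/input' USING PigStorage(',')
       AS (field_a:chararray, field_b:chararray, payload:chararray);

-- Build the document id by concatenating the two fields with an underscore.
with_id = FOREACH data GENERATE
          CONCAT(field_a, CONCAT('_', field_b)) AS doc_id,
          field_a, field_b, payload;

-- Use doc_id as the Elasticsearch _id, but exclude it from the indexed
-- document so only the original fields are stored.
STORE with_id INTO 'myindex/mytype' USING org.elasticsearch.hadoop.pig.EsStorage(
    'es.mapping.id = doc_id',
    'es.mapping.exclude = doc_id'
);
```

This is exactly the configuration the issue exercises: `es.mapping.id` pulls the metadata value from a field, and `es.mapping.exclude` is meant to keep that field out of the document body.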
Added the worker logs.
Thanks for the worker logs. I'll see about getting a clean reproduction of this.
Just for the record: writing a Hive script with the same parameters instead of using Pig worked as expected.
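For comparison, a minimal Hive version of the same write (table, field, and index names are hypothetical) might look like:

```sql
-- External Hive table backed by Elasticsearch; doc_id drives _id but is
-- excluded from the indexed document.
CREATE EXTERNAL TABLE es_out (doc_id STRING, field_a STRING, field_b STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource' = 'myindex/mytype',
  'es.mapping.id' = 'doc_id',
  'es.mapping.exclude' = 'doc_id'
);

INSERT OVERWRITE TABLE es_out
SELECT concat(field_a, '_', field_b) AS doc_id, field_a, field_b
FROM source_table;
```

The `es.mapping.*` properties are identical to the Pig case, which is why the Hive path succeeding points at the Pig serialization code rather than the settings themselves.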
@JohnS89 I haven't had a chance to look too deeply into this yet, but I have a suspicion that this has to do with how the Pig serialization logic in the connector handles tuples: see these lines. Normally, before writing an entry that has a name, that name should be filtered against the mapping exclusion and inclusion properties. It looks like that check is just missing here. Will get a fix in hopefully soon.
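To make the suspected bug concrete, here is an illustrative sketch (not the connector's actual code) of the kind of include/exclude check that should run before each field is written; the real settings also support nested paths and wildcards, which this sketch omits:

```java
import java.util.Arrays;
import java.util.List;

public class FieldFilterSketch {
    // Returns true when the field name survives the include/exclude settings:
    // excludes win, and an empty include list means "include everything".
    static boolean shouldWrite(String field, List<String> includes, List<String> excludes) {
        if (excludes.contains(field)) {
            return false;
        }
        return includes.isEmpty() || includes.contains(field);
    }

    public static void main(String[] args) {
        List<String> includes = Arrays.asList();          // no include filter set
        List<String> excludes = Arrays.asList("doc_id");  // es.mapping.exclude = doc_id

        // doc_id is filtered out of the document; ordinary fields pass through.
        System.out.println(shouldWrite("doc_id", includes, excludes));   // false
        System.out.println(shouldWrite("field_a", includes, excludes));  // true
    }
}
```

If the Pig tuple serializer skips this check while the other integrations apply it, the excluded `doc_id` field gets written into the document and triggers the `StrictDynamicMappingException` on strict mappings.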
Issue description
I am trying to import data into ES using ES-Hadoop. I need the document ID to be a combination of two fields concatenated together with an underscore. I don't want the concatenated string to be a field in the document, since the two source fields are already there.
Steps to reproduce
Code:
Stack trace:
Driver:
Worker:
Version Info
OS: 4.4.35-33.55.amzn1.x86_64
JVM : openjdk version "1.8.0_131"
Hadoop/Spark: Hadoop 2.7.3-amzn-2
ES-Hadoop : pig-2.4.2
ES : 1.x