-
Notifications
You must be signed in to change notification settings - Fork 986
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix es.output.json
for Cascading.
#898
Conversation
Added tests to verify functionality. fixes elastic#885
Pipe pipe = new Pipe("copy"); | ||
|
||
Tap out = new HadoopPrintStreamTap(Stream.NULL); | ||
build(cfg(), in, out, pipe); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my education, do we check the result somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no great way to check the result in this test case since it's writing in a separate process. We're mostly just checking that the job completes instead of throwing.
context[0] = sourceCall.getInput().createKey(); | ||
context[1] = sourceCall.getInput().createValue(); | ||
// as the tuple _might_ vary (some objects might be missing), we use a map rather then a collection | ||
Settings settings = loadSettings(flowProcess.getConfigCopy(), true); | ||
context[2] = CascadingUtils.alias(settings); | ||
context[3] = settings.getOutputAsJson(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use a constant for the index in the context array, something like
context[OUTPUT_AS_JSON] = settings.getOutputAsJson();
That would help to read the context array as well as quickly identify where it is read & set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 9ef7ffe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not much to say, it looks good to me with my very limited knowledge of Cascading and its integration.
I left 2 minor comments
Added tests to verify functionality. fixes #885
If
es.output.json
is enabled, instead of deserializing JSON data into a map object ES-Hadoop will simply return the raw JSON string to the user. This is returned from the Hadoop RecordReader as aText
object containing the string data. When the Cascading integration receives data from a RecordReader or from a ScrollQuery, it automatically assumes that the data is a Map object. This leads to a cast exception when it receives a Text object instead.This relates to issue #885.