Support for List of Maps (Nested objects) #41

Closed
abhishek376 opened this issue May 1, 2013 · 5 comments

Comments

@abhishek376

For example :
if I have a map type (rdata) column in hive

image

and when I try to push the data to Elasticsearch, I only see the last mapid/value pair in Elasticsearch.

Log for the writable

INFO org.elasticsearch.hadoop.rest.BufferedRestClient: Writable{rid=1, rdata={value=5, mapid=4}, rdate=1234, mapids=[7, 8, 9]}

Is there any workaround for this? Is this a bug?

Thanks

@costin
Member

costin commented May 1, 2013

Hmm - I think there's a problem with your data set. You're using a map but have the same key declared twice which is actually illegal (it's not a valid JSON object).
What happens is that the JSON Object gets converted to a map in which the entries are added one by one but since the key is the same, each entry causes an update instead of insert.
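
To illustrate the overwrite behaviour described above, here is a minimal Python sketch. The payload string is a hypothetical reconstruction (the actual request body is not shown in this thread), and Python's json module only stands in for the JSON-to-map conversion; it is not the code path used by BufferedRestClient.

```python
import json

# Hypothetical payload: the keys "mapid" and "value" each appear twice.
raw = '{"mapid": "1", "value": "3", "mapid": "4", "value": "5"}'

# Loading into a dict adds entries one by one; a repeated key updates
# the existing entry instead of inserting a new one, so the last pair wins.
as_map = json.loads(raw)
print(as_map)

# Collecting the raw key/value pairs instead shows that the earlier
# entries were present in the input and were lost during conversion.
as_pairs = json.loads(raw, object_pairs_hook=list)
print(as_pairs)
```

The first result contains only mapid 4 and value 5, which matches the rdata entry in the log above; the second keeps all four pairs.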

@abhishek376
Author

Costin,

Thanks, that makes sense.

I will hard-code the mapping from the Hive map type [{1: 3}, {4: 5}] to the ES query string

[
  {
    "mapid": "1",
    "value": "3"
  },
  {
    "mapid": "4",
    "value": "5"
  }
]

in BufferedRestClient.

Thanks for the support.

@costin
Member

costin commented May 2, 2013

Instead of hard-coding, why not apply some transformation to the data, such as converting the map to a bag/list? Note that invalid JSON implies the data might not be properly represented in ES, which can trigger issues when doing queries.
Note that while functionally we could support a map with duplicates (using identity instead of the hash code), the problem of duplicates still persists and is likely to cause issues later down the road.
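
As a rough sketch of the map-to-list transformation suggested here, the following Python snippet turns each map entry into its own object, producing the list-of-objects shape shown earlier in the thread. The function name map_to_entries and its parameters are hypothetical, and in practice the conversion would be done on the Hive side or in the job that writes to ES, not in Python.

```python
import json


def map_to_entries(data, key_field="mapid", value_field="value"):
    """Convert {k: v, ...} into [{key_field: k, value_field: v}, ...].

    Each entry becomes its own object, so no key is repeated
    within any single JSON object.
    """
    return [{key_field: k, value_field: v} for k, v in data.items()]


# Sample data taken from the example in this thread.
rdata = {"1": "3", "4": "5"}
entries = map_to_entries(rdata)
print(json.dumps(entries, indent=2))
```

Because every element of the list is a separate object, both pairs survive serialization instead of the second overwriting the first.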

@abhishek376
Author

Makes sense, thanks Costin.

@costin
Member

costin commented May 14, 2013

@abhishek376 I'm closing this issue as the problem seems to be fixed. Note that mapping is on the roadmap as a way to cope with the different data structures; however, it will take a bit until I get to it.
Watch this project for future updates.

@costin closed this as completed May 14, 2013