Indexing JSON #75

Closed
mohitanchlia opened this issue Aug 15, 2013 · 2 comments

@mohitanchlia

I see how we can index a flat, structured file, but is there a way to index a file whose data is already in JSON format? For example:

{
  "user": {
    "phones": [1, 2, 4]
  }
}

If this is not possible, I can probably contribute and add this functionality.

One way I can think of is to read each record as a chararray in Pig and index it directly, though I am not sure this will work:

A = LOAD 'file' USING PigStorage() AS (data:chararray);
STORE A INTO ...

@costin
Member

costin commented Aug 16, 2013

Duplicate of #9

Loading JSON straight into ES is on the roadmap, probably for M2. There are several performance considerations here: loading the file through Hive or Pig would only slow things down and would also confuse the translation (the JSON is not recognized as such but is instead translated as strings, which is not what one wants).
One could use the Pig JSON integration to load the file, but that would only complicate the workflow without any benefit (quite the opposite).

I'm currently thinking of bypassing all layers, reading the resource directly, and passing it to ES.
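
To illustrate the "read directly and pass to ES" idea, here is a minimal sketch, not the project's implementation. It assumes the input holds one JSON document per line and wraps each line in the newline-delimited framing that the Elasticsearch bulk API expects, without converting the document into another representation. The function name build_bulk_body and the index name "contacts" are hypothetical, and older ES versions may also require a _type in the action line.

```python
import json


def build_bulk_body(json_lines, index_name):
    """Wrap pre-serialized JSON documents in bulk-API framing.

    Each document is passed through verbatim; it is parsed only to
    check that it is well-formed, never re-serialized.
    """
    # Action line that precedes every document in a bulk request.
    action = json.dumps({"index": {"_index": index_name}})
    parts = []
    for line in json_lines:
        doc = line.strip()
        if not doc:
            continue  # skip blank lines
        json.loads(doc)  # validation only; raises ValueError if malformed
        parts.append(action)
        parts.append(doc)
    # The bulk format requires a newline after the last line.
    return "\n".join(parts) + "\n" if parts else ""


lines = ['{"user": {"phones": [1, 2, 4]}}', "", '{"user": {"phones": [7]}}']
body = build_bulk_body(lines, "contacts")
print(body, end="")
```

The resulting string could then be sent as the body of a bulk request; because the documents stay as raw JSON, nested objects and arrays keep their structure instead of being indexed as one opaque string.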

@Downchuck

What is the current issue with handling this in the Hadoop MR module?
