Pig Storage index name with "_" #91
The entire parsing of the URL needs to be overhauled, especially as one might not use _search and rely on an embedded
Thanks!
On 03/10/2013 12:30 PM, Nicolas Maillard wrote:
Same as issue #80, I hit this problem too :(
This is fixed in master - let me know if you still encounter issues.
Hello everyone
I came across a situation this morning where an "_" in an ES index name crashes the PigStorage.
Here is my Pig script:
DEFINE ESStorage org.elasticsearch.hadoop.pig.ESStorage('es.host=myhost');
A = LOAD '/file' USING PigStorage() AS (id:long, name, url:chararray, picture: chararray);
STORE A INTO 'index_1/artists' USING org.elasticsearch.hadoop.pig.ESStorage();
This crashes with a StringIndexOutOfBoundsException.
The reason is in org.elasticsearch.hadoop.rest.Resource.
At line 32 we split the location like this: int location = resource.lastIndexOf("_");
I'm guessing this is to find the _search part of the resource in the case of a load.
However, in a store this truncates my store location 'index_1/artists' to 'index'
and the next lines:
location = localRoot.substring(0, root.length() - 1).lastIndexOf("/");
will return -1, since there is no "/" left in the truncated string.
I'm thinking we should either look for a stricter match, "_search" instead of "_",
or disallow "_" in index names altogether and raise an error if one is present.
I'm using the current master, I'll go ahead and try my first idea of a stricter matching and let you know how it works.
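To make the problem and the first idea concrete, here is a minimal standalone sketch. It is not the actual Resource code: the class name ResourceSketch and the methods naiveRoot and strictRoot are hypothetical, and matching on "/_search" is only one assumed way to do the stricter matching.

```java
public class ResourceSketch {

    // Naive split, as described above: cut at the last "_" anywhere in the location.
    // Hypothetical helper, not the real Resource implementation.
    static String naiveRoot(String resource) {
        int location = resource.lastIndexOf("_");
        return location < 0 ? resource : resource.substring(0, location);
    }

    // Stricter split (assumed variant of the first idea): only cut when the
    // location contains a "/_search" path segment, so "_" in index names survives.
    static String strictRoot(String resource) {
        int location = resource.lastIndexOf("/_search");
        return location < 0 ? resource : resource.substring(0, location);
    }

    public static void main(String[] args) {
        String store = "index_1/artists";

        // The naive split loses everything after the "_" in the index name.
        String truncated = naiveRoot(store);
        System.out.println(truncated);                  // index
        // No "/" is left, so a later substring using this value would fail.
        System.out.println(truncated.lastIndexOf("/")); // -1

        // The stricter split leaves a store location untouched...
        System.out.println(strictRoot(store));
        // ...and still strips the query part of a load location.
        System.out.println(strictRoot("index_1/artists/_search?q=name"));
    }
}
```

This does not cover locations that rely on something other than _search, so it is only a stopgap for the store case.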
Thanks for all the hard work.