key_prefix is ignored, so nothing gets into elasticsearch #18

Closed
McSvenster opened this issue May 11, 2017 · 11 comments

Comments

@McSvenster

McSvenster commented May 11, 2017

I set a key_prefix as described in the documentation. In my catmandu.yml:

store:
  mydb:
    package: ElasticSearch
    options:
      index_name: oai_sk
      key_prefix: sk

Then I ran a YAML import:
catmandu import YAML to mydb < test.yml

The Elasticsearch log shows:
..."_id":"1_12345-GS"...
and:
org.elasticsearch.index.mapper.MapperParsingException: Field [_id] is a metadata field and cannot be added inside a document. Use...

Feel free to contact me if I can be of any help.
Best regards

@jorol
Member

jorol commented May 11, 2017

I can reproduce that on Catmandu-Bart http://lib.ugent.be/download/librecat/Catmandu-Bart.ova:

catmandu@catmandu:~$ echo '{ "title":"My first blog entry","text": "Just trying this out...","date": "2014/01/01","_id":"333"}' | catmandu -I ./lib import -v JSON to ElasticSearch --index_name website --bag blog --key_prefix my_
imported 1 object
done
catmandu@catmandu:~$ catmandu count ElasticSearch --index_name website --bag blog
0
catmandu@catmandu:~$ echo '{ "title":"My first blog entry","text": "Just trying this out...","date": "2014/01/01","my_id":"333"}' | catmandu -I ./lib import -v JSON to ElasticSearch --index_name website --bag blog
imported 1 object
done
catmandu@catmandu:~$ catmandu count ElasticSearch --index_name website --bag blog
0

If I manually prefix the id key (my_id instead of _id) and pass the option --key_prefix, the record is stored, but when I export it again (see #13), I get an error message:

catmandu@catmandu:~$ echo '{ "title":"My first blog entry","text": "Just trying this out...","date": "2014/01/01","my_id":"333"}' | catmandu -I ./lib import -v JSON to ElasticSearch --index_name website --bag blog --key_prefix my_
imported 1 object
done
catmandu@catmandu:~$ catmandu count ElasticSearch --index_name website --bag blog
1
catmandu@catmandu:~$ catmandu export ElasticSearch --index_name website --bag blog
[{"my_id":"333","date":"2014/01/01","title":"My first blog entry","text":"Just trying this out..."}Oops! [Request] ** [http://localhost:9200]-[400] [illegal_argument_exception] Failed to parse request body, called from sub Search::Elasticsearch::Client::5_0::Scroll::next at /home/catmandu/perl5/perlbrew/perls/perl-5.22.2/lib/site_perl/5.22.2/Catmandu/Store/ElasticSearch/Bag.pm line 58. With vars: {'body' => {'error' => {'root_cause' => [{'type' => 'illegal_argument_exception','reason' => 'Failed to parse request body'}],'caused_by' => {'type' => 'json_parse_exception','reason' => 'Unrecognized token \'DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAABgFkVhVVNsQjNtUVRlN05BQ3M2bTE5SFEAAAAAAAAAYRZFYVVTbEIzbVFUZTdOQUNzNm0xOUhRAAAAAAAAAGIWRWFVU2xCM21RVGU3TkFDczZtMTlIUQAAAAAAAABjFkVhVVNsQjNtUVRlN05BQ3M2bTE5SFEAAAAAAAAAZBZFYVVTbEIzbVFUZTdOQUNzNm0xOUhR\': was expecting (\'true\', \'false\' or \'null\')
 at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@6edd7fea; line: 1, column: 457]'},'reason' => 'Failed to parse request body','type' => 'illegal_argument_exception'},'status' => 400},'status_code' => 400,'request' => {'ignore' => [],'qs' => {'scroll' => '1m'},'path' => '/_search/scroll','mime_type' => 'application/json','method' => 'GET','serialize' => 'std','body' => 'DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAABgFkVhVVNsQjNtUVRlN05BQ3M2bTE5SFEAAAAAAAAAYRZFYVVTbEIzbVFUZTdOQUNzNm0xOUhRAAAAAAAAAGIWRWFVU2xCM21RVGU3TkFDczZtMTlIUQAAAAAAAABjFkVhVVNsQjNtUVRlN05BQ3M2bTE5SFEAAAAAAAAAZBZFYVVTbEIzbVFUZTdOQUNzNm0xOUhR'}}

Cheers

@nics
Member

nics commented May 12, 2017

Hi, are you using the latest versions of Catmandu and Catmandu::Store::ElasticSearch?

@jorol
Member

jorol commented May 12, 2017

Just updated the VM this week:

Catmandu 1.0504
Catmandu::Store::ElasticSearch 0.0509

@phochste
Member

And the ElasticSearch version on Bart is 5.4.0.

Even with the option "client: 5_0::Direct" I can't get it to work.

@McSvenster
Author

Sorry, I think I made a mistake. Looking at the rejected documents, I find two id fields: the one in question is added by catmandu and has the key "_id", while mine is correctly prefixed: "sk_id".

I'll have to dig deeper into catmandu.yml and its options.

I am using catmandu version 1.0306 and Catmandu::Store::ElasticSearch version 0.0507

@nics
Member

nics commented May 12, 2017

@jorol I think you get these results because all the examples that fail use key_prefix inconsistently. Either the option is missing, or the data uses another prefix. All your commands should specify the option and all your records should also use it if they contain an id or other special key.

This one is correct, data and options are in sync:

catmandu@catmandu:~$ echo '{ "title":"My first blog entry","text": "Just trying this out...","date": "2014/01/01","my_id":"333"}' | catmandu -I ./lib import -v JSON to ElasticSearch --index_name website --bag blog --key_prefix my_

The last error could be a bug with the Store and scrolling in 5.4.0. I'll investigate.
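
To make "data and options in sync" concrete, here is a rough Python illustration (not Catmandu code). It assumes the store looks for the record id under the key prefix followed by "id", which fits the commands in this thread but is not spelled out here. The helper names id_key_for and stray_id_keys are made up for this sketch.

```python
def id_key_for(key_prefix="_"):
    # Assumption: the id key is the configured prefix followed by "id".
    return key_prefix + "id"


def stray_id_keys(record, key_prefix="_"):
    """Return id keys in the record that do not match the configured prefix.

    Only the default "_id" and the configured id key are considered.
    """
    expected = id_key_for(key_prefix)
    candidates = {"_id", expected}
    return sorted(k for k in record if k in candidates and k != expected)


in_sync = {"title": "My first blog entry", "my_id": "333"}
out_of_sync = {"title": "My first blog entry", "_id": "333"}

print(stray_id_keys(in_sync, "my_"))      # []
print(stray_id_keys(out_of_sync, "my_"))  # ['_id']
```

A non-empty result means the record still carries an id under a prefix other than the one passed with --key_prefix.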

@jorol
Member

jorol commented May 12, 2017

@nics: Can you give a CLI example of how to index this object with ES 5.x?

{ "title":"My first blog entry","text": "Just trying this out...","date": "2014/01/01","_id":"333"}

Thanks

@nics
Member

nics commented May 12, 2017

@jorol This should work:

echo '{ "title":"My first blog entry", "my_id":"333"}' | catmandu import to ElasticSearch --index_name website --bag blog --key_prefix my_
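
For records that already contain "_id", like the object asked about above, the key would have to be renamed before import. A minimal Python sketch of that preprocessing step follows. It is not part of Catmandu, the helper name rename_id is hypothetical, and it assumes the target key is the prefix followed by "id" (here my_id).

```python
import json


def rename_id(record, key_prefix):
    """Return a copy of the record with "_id" moved to key_prefix + "id"."""
    renamed = dict(record)  # leave the caller's record untouched
    if "_id" in renamed:
        renamed[key_prefix + "id"] = renamed.pop("_id")
    return renamed


line = '{ "title":"My first blog entry","text": "Just trying this out...","date": "2014/01/01","_id":"333"}'
print(json.dumps(rename_id(json.loads(line), "my_")))
```

The printed JSON carries my_id instead of _id and could then be piped into the import command above with --key_prefix my_.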

@jorol
Member

jorol commented May 12, 2017

Thanks

@nics
Member

nics commented May 12, 2017

Actually, to be correct (and this may also fix the export error), the client version should also be specified:

echo '{ "title":"My first blog entry", "my_id":"333"}' | catmandu import to ElasticSearch --index_name website --bag blog --key_prefix my_ --client '5_0::Direct'

@McSvenster
Author

Thanks a lot for your help - from my point of view we can close this issue.
