delete_by_query alternatives? #307

Closed
royrusso opened this issue Nov 17, 2015 · 7 comments

Comments

@royrusso

Now that delete_by_query has been removed from ES 2.x and is only available as a plugin, what are the alternatives within the py lib for calling that API when the plugin is actually installed? Is the idea that it won't be supported at all in the py lib?

@corey-hammerton

As mentioned [here](https://www.elastic.co/guide/en/elasticsearch/reference/1.7/docs-delete-by-query.html), you can use the scan/scroll API to find the matching IDs and then use the bulk API to delete all of those documents.

The simplest approach is to use the scan helper function to run the query and turn its results into an iterable of actions with a 'delete' op_type, which you then pass to the bulk helper function.
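A minimal sketch of that scan-then-bulk approach might look like the following. The function names `delete_actions` and `delete_matching` are made up for illustration, and the sketch assumes the `scan` and `bulk` helpers from `elasticsearch.helpers`; the client import is done inside the function only so that the pure action builder can be tried without a cluster.

```python
def delete_actions(hits):
    """Turn search hits into bulk actions with a 'delete' op_type."""
    for hit in hits:
        action = {
            "_op_type": "delete",
            "_index": hit["_index"],
            "_id": hit["_id"],
        }
        # Hits from ES 2.x also carry a mapping type; pass it along if present.
        if "_type" in hit:
            action["_type"] = hit["_type"]
        yield action


def delete_matching(es, index, query):
    """Scroll over all hits for `query` and delete them in bulk.

    `es` is an Elasticsearch client instance and `query` is a full
    search body such as {"query": {"term": {"user": "kimchy"}}}.
    """
    from elasticsearch import helpers  # imported here so delete_actions works standalone

    hits = helpers.scan(es, query=query, index=index)
    # helpers.bulk consumes the generator lazily and sends the deletes in chunks.
    return helpers.bulk(es, delete_actions(hits))
```

Because both helpers work on iterables, the matching IDs are never all held in memory at once.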

@royrusso
Author

I understand that, but was wondering if elasticsearch-py had considered leaving the delete_by_query call (for those that installed the plugin). Offering a plugin, yet removing the SDK support is counter-intuitive.

@honzakral
Contributor

There are several ways to call an API that is not supported by the raw client. The first is to call the underlying transport instance manually. The only drawback is that you are responsible for collecting the parameters and constructing the URL; the API itself is very simple:

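from elasticsearch import Elasticsearch
es = Elasticsearch()
# the URL, the query body and the params below are placeholders to fill in
status, data = es.transport.perform_request('DELETE', 'some_constructed_url', body={"query": {"..."}}, params={"..."})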

You will still get most of the benefits of the library:

  • connection handling
  • (de)serialization
  • error handling
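As a rough sketch, the transport call above could be wrapped for the delete-by-query plugin like this. It assumes the plugin exposes `DELETE /{index}/_query` (optionally `/{index}/{type}/_query`), as in the 2.x plugin documentation; the helper names are hypothetical, and the return value of `perform_request` is passed through untouched because its shape depends on the client version.

```python
def delete_by_query_url(index, doc_type=None):
    """Build the URL assumed for the delete-by-query plugin endpoint."""
    parts = [index]
    if doc_type:
        parts.append(doc_type)
    parts.append("_query")
    return "/" + "/".join(parts)


def delete_by_query_via_transport(es, index, query, doc_type=None, params=None):
    """Send a delete-by-query request through the client's transport.

    `es` is an Elasticsearch client; `query` is the query clause only,
    e.g. {"term": {"user": "kimchy"}}.
    """
    return es.transport.perform_request(
        "DELETE",
        delete_by_query_url(index, doc_type),
        body={"query": query},
        params=params,
    )
```

Keeping URL construction in its own function makes it easy to check without a running cluster.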

Alternatively, you can write a plugin for the client itself; have a look at https://github.com/elastic/elasticsearch-watcher-py to see how that would look.

Hope this helps

@honzakral
Contributor

This API has been removed. In the official client we don't intend to support all available plugins. For now you can use the transport directly, as mentioned above, or we might create a set of Python plugins in the future, following the pattern in elasticsearch-watcher-py.

Please let me know if this is an acceptable solution for you, thanks!

@pierre-24

pierre-24 commented Dec 30, 2016

Is it me, or did the API come back in ES 5.0?

Also, this function is still mentioned in the documentation, even though it does not exist in the actual code.

@honzakral
Contributor

Delete by query is present in the code released as 5.0.1 - https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/client/__init__.py#L733
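For illustration, a small guard like the one below can make the version mismatch obvious. It is only a sketch: `delete_matching_native` is a made-up name, and it assumes the 5.x client method accepts `index` and `body` keyword arguments.

```python
def delete_matching_native(es, index, query):
    """Use the client's own delete_by_query when it is available.

    `es` is an Elasticsearch client; `query` is the query clause only,
    e.g. {"match": {"status": "stale"}}.
    """
    if not hasattr(es, "delete_by_query"):
        # Older client releases (such as 2.x) do not expose this method.
        raise NotImplementedError(
            "this elasticsearch-py version has no delete_by_query; "
            "upgrade the client or fall back to scan + bulk"
        )
    return es.delete_by_query(index=index, body={"query": query})
```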

@pierre-24

Indeed, I was still using version 2.x of the library, thanks :)
