Bulk API in High Level Rest Client does not support global parameters #26026
Comments
This is a side effect of reusing Java API objects in the high-level REST client, and it also applies to the default index, type and routing values. These parameters are applied at parsing time, when the request comes in as bytes and goes through a parsing phase. In that phase the default values are copied to each item that doesn't have specific values set. When providing the items as proper request objects (Java API or high-level client), such parameters need to be specified on each item directly. The other option would be to make them all settable to the Also, users have been using the Java API this way for a while, hence this should not cause problems and the migration is straightforward. We should probably consider making these changes if we ever move away from sharing classes with Elasticsearch core.
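The default-copying step described in this comment can be sketched in plain Java. This is only an illustration under stated assumptions: `Item` and `applyDefaults` are hypothetical names invented here, not Elasticsearch classes or methods, and the real parsing code may work differently.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkDefaultsSketch {

    /** Hypothetical stand-in for one bulk item; NOT an Elasticsearch class. */
    public static final class Item {
        public final String index;    // null means "not set on this item"
        public final String routing;
        public final String pipeline;

        public Item(String index, String routing, String pipeline) {
            this.index = index;
            this.routing = routing;
            this.pipeline = pipeline;
        }
    }

    private static String orDefault(String value, String fallback) {
        return value != null ? value : fallback;
    }

    /**
     * Returns new items in which every unset field is filled from the
     * request-level default. Values set on an item always win.
     */
    public static List<Item> applyDefaults(List<Item> items, String defaultIndex,
                                           String defaultRouting, String defaultPipeline) {
        List<Item> resolved = new ArrayList<>(items.size());
        for (Item item : items) {
            resolved.add(new Item(
                    orDefault(item.index, defaultIndex),
                    orDefault(item.routing, defaultRouting),
                    orDefault(item.pipeline, defaultPipeline)));
        }
        return resolved;
    }
}
```

The sketch builds new items instead of mutating the originals, which sidesteps the concern raised later in the thread about modifying the state of shared request objects.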
Isn't that the same level of complexity when we have to deal with index name or type? Like the following (untested though):
I agree that this can totally wait. It's more of a concern for people like me who are moving from their own homemade REST client to the official one, as the REST API supports this global
Yes, in fact that is not supported by BulkRequest either, as far as I can see.
Ha! Good point.
We discussed this in Fix-it Friday and agreed that it would be nice to be able to set a default pipeline/index/routing etc. on BulkRequest, similar to what the other clients are doing. But now that I've reread @javanna's comments, I agree that we won't be able to do this with the core classes except by modifying their state (and copying them over), which is really not good.
I don't think this is how we should do that. Instead we could add methods in the
The values could be passed to the
What is the problem that we are trying to solve here: ease of use or performance? If it is the latter, it should be measured rather than assumed to be a problem. The transport client always worked like this, although it uses the transport layer rather than the REST layer, which is quite a difference. On the ease-of-use aspect, I agree this is not ideal, yet it is not a huge issue, and it is something that transport client users got used to. I am against adding specialized methods to the high-level client, as it will complicate things and set a precedent for doing the same in similar cases. We should take this into account as a reason to have our own request objects one day, or accept such limitations as a consequence of depending on Elasticsearch core and reusing its classes. It saves us a lot of work, but it does come with a cost.
To me, it's the potential network usage overhead. It's not really about performance; it's more about network "costs". The second thing is that I'd love for our Java REST client to expose all the options that we have in the REST API.
I agree with this ideally, but we gave priority to migrating from the transport client and reusing existing classes. This is the compromise we made, and it's not possible to have 100% consistency at the same time. I think we will encounter more similar issues, and that is kind of expected.
Thinking more about this, we have encountered a few parameters that are only supported at REST, and the way we dealt with them was to add support for them to the request so that the high-level REST client could expose and use them, although the transport client doesn't do anything with them. We could do the same here by adding index, pipeline etc. to
While the plan is to deprecate TransportClient and use the high-level REST client instead, is there any solid roadmap for this enhancement?
Note that the transport client is already deprecated. This is something that we want to address, yet it is not high priority as there is a workaround: you can provide the info on each inner request like you'd do with the transport client.
That will work only if this is not turned on: https://www.elastic.co/guide/en/elasticsearch/reference/6.2/url-access-control.html
@javanna Has anyone already started working on this? If not, can I pick this up?
@pushpavanthar nobody is working on this, feel free to take it. Thanks!
@javanna indeed we could just add these to the bulk request and then apply defaults when converting in RequestConverters, but if index and type are missing, it would fail per-request validation (IndexRequest validation, for instance).
@pgomulka I see the problem: the request gets converted after validation, and validation fails if we don't have all of the items with the required parameters set. Let's add some logic to
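The ordering problem in the two comments above (validation runs before conversion, so a missing per-item index fails even when a global one exists) can be illustrated with a small stand-alone sketch. Everything here is hypothetical: `validate` is not the Elasticsearch method, and each item is modelled only by its index name, where `null` means "not set".

```java
import java.util.ArrayList;
import java.util.List;

public class ValidateWithGlobalsSketch {

    /**
     * Returns one error per item that has neither its own index nor a
     * global index to fall back on. An empty list means the bulk is valid.
     */
    public static List<String> validate(List<String> itemIndices, String globalIndex) {
        List<String> errors = new ArrayList<>();
        for (int i = 0; i < itemIndices.size(); i++) {
            boolean missingOnItem = itemIndices.get(i) == null;
            if (missingOnItem && globalIndex == null) {
                errors.add("item " + i + ": index is missing");
            }
        }
        return errors;
    }
}
```

With no global index, an item without an index is reported; once a global index is supplied, the same list of items validates cleanly. Whether the real fix consults the global value during validation or applies it to the items beforehand is not something this sketch decides.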
I'm ++ on this. Thanks for adding
Things that need to be covered:
1 & 2 & 3 should be supported when used with all of the following styles:
Bulk Request in the high-level REST client should be consistent with what is possible in the REST API and should therefore support global parameters. Global parameters are passed in the URL in the REST API. Some parameters are mandatory (index, type) and would fail validation if not provided before the bulk is executed. Optional parameters: routing, pipeline. The usage of these should be consistent across sync/async execution, bulk processor and BulkRequestBuilder. Closes #26026
Bulk Request in the high-level REST client should be consistent with what is possible in the REST API and should therefore support global parameters. Global parameters are passed in the URL in the REST API. Some parameters are mandatory (index, type) and would fail validation if not provided before the bulk is executed. Optional parameters: routing, pipeline. The usage of these should be consistent across sync/async execution, bulk processor and BulkRequestBuilder. Closes #26026. Backport of #34528
When using the REST API we can use:
But with the High Level REST API, this global pipeline option is not available.
Which means that we must use:
This means that much more data has to pass over the network.
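As a rough way to measure that overhead rather than assume it, the sketch below compares the size of a bulk body with and without the pipeline name repeated on every action line. It assumes the per-item pipeline ends up as a `pipeline` key in each action line of the newline-delimited body; the class, method names, and sample values are made up for illustration, and no Elasticsearch client code is involved.

```java
import java.nio.charset.StandardCharsets;

public class BulkPayloadSizeSketch {

    /** Builds the action line for one index operation; pipeline may be null. */
    static String actionLine(String index, String pipeline) {
        StringBuilder sb = new StringBuilder("{\"index\":{\"_index\":\"")
                .append(index).append('"');
        if (pipeline != null) {
            sb.append(",\"pipeline\":\"").append(pipeline).append('"');
        }
        return sb.append("}}\n").toString();
    }

    /** Total UTF-8 size of a bulk body holding `items` copies of `source`. */
    public static int bodyBytes(int items, String index, String pipeline, String source) {
        int perItem = (actionLine(index, pipeline) + source + "\n")
                .getBytes(StandardCharsets.UTF_8).length;
        return perItem * items;
    }

    /** Extra bytes caused by repeating the pipeline on every item. */
    public static int overheadBytes(int items, String index, String pipeline, String source) {
        return bodyBytes(items, index, pipeline, source)
                - bodyBytes(items, index, null, source);
    }
}
```

For an ASCII pipeline name, the extra cost per item under this assumption is the name's length plus 14 bytes for the surrounding `,"pipeline":""` text, so the total grows linearly with the number of items in the bulk.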
I suggest that we add a `pipeline` field to the `BulkRequest` class. The same is true for the default `index` and `type`.