Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set application/x-ndjson content type on bulk requests. #618

Merged
merged 2 commits into from
Oct 4, 2017

Conversation

jmcarp
Copy link
Contributor

@jmcarp jmcarp commented Jul 13, 2017

Resolves #609

@jmcarp jmcarp force-pushed the issue-609-x-ndjson branch 2 times, most recently from 38d30c0 to 7ee9be3 Compare July 13, 2017 03:41
Copy link
Contributor

@tylerjharden tylerjharden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Nice work to stay compliant as ES changes.

Copy link
Contributor

@honzakral honzakral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the patch! It is very solid, left a few tiny issues and one big one - changes like this needs tests.

@@ -111,6 +111,8 @@ def perform_request(self, method, url, params=None, body=None, timeout=None, ign
if not isinstance(method, str):
method = method.encode('utf-8')

request_headers = dict(self.headers)
request_headers.update(headers or {})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if this block only ran if headers: to avoid copying the self.headers dict every time when it is only needed for the bulk-like bodies

@@ -1164,7 +1164,8 @@ def bulk(self, body, index=None, doc_type=None, params=None):
if body in SKIP_IN_PATH:
raise ValueError("Empty value passed for a required argument 'body'.")
return self.transport.perform_request('POST', _make_path(index,
doc_type, '_bulk'), params=params, body=self._bulk_body(body))
doc_type, '_bulk'), params=params, body=self._bulk_body(body),
headers={'Content-Type': 'application/x-ndjson'})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use lowercase for all headers to avoid conflicts - see https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/connection/http_urllib3.py#L59-L62

A test is needed to make sure the headers are merged properly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other APIs using the _bulk_body like msearch and mpercolate

@@ -269,6 +269,8 @@ def perform_request(self, method, url, params=None, body=None):

:arg method: HTTP method to use
:arg url: absolute url (without host) to target
:arg headers: dictionary of headers, will be handed over to the
underlying :class:`~elasticsearch.Connection` class for serialization
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no serialization happen to the headers, we need to set expectations correctly

@davidt99
Copy link

davidt99 commented Aug 7, 2017

@jmcarp Any plan to fix the review comments? Most of the driver requests (if not all) doesn't state content type, resulting warning logs in elastic 5.5.
The header support you wrote will be allow adding content type to all the request, thus fixing the warning.

@honzakral
Copy link
Contributor

@davidt99 All requests are anotated with content-type accepted by elasticsearch since d862c8a (released in 5.2). You can also set headers globally by specifying headers={'custom': 'header'} when instantiating Elasticsearch.

@davidt99
Copy link

davidt99 commented Aug 7, 2017

@honzakral Oh, I was using 5.1 and looked for similar issue. Thanks.

@jmcarp
Copy link
Contributor Author

jmcarp commented Aug 8, 2017

I cleaned this up based on your suggestions @honzakral; let me know if this needs more revisions. I'm not sure why the tests are failing, but it looks like they're failing in the same way as master.

@honzakral honzakral changed the base branch from master to 5.x August 15, 2017 14:52
@honzakral honzakral changed the base branch from 5.x to master August 15, 2017 14:52
Copy link
Contributor

@honzakral honzakral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the updated patch, it looks really good. Only outstanding issues are more of a housekeeping issues where a PR should ideally only touch the code it needs to touch for easier merging and reviewing.

I am more than happy to make those changes when merging your PR, will do that in a few days. It is after all my pedantic standard, not any fault with the code :)

if param in SKIP_IN_PATH:
raise ValueError("Empty value passed for a required argument.")
return self.transport.perform_request('GET', _make_path(index,
doc_type, id, '_percolate', 'count'), params=params, body=body)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

several APIs have been removed from master as they have been removed in elasticsearch 6.0. Either please remove them from the patch or base the patch on 5.x

@@ -1387,4 +1540,3 @@ def field_caps(self, index=None, body=None, params=None):
"""
return self.transport.perform_request('GET', _make_path(index,
'_field_caps'), params=params, body=body)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please configure your editor not to make changes like these. It makes the PR hard to review when they are unrelated stylistic changes in unrelated places.

@@ -123,5 +123,3 @@ def _raise_error(self, status_code, raw_data):
logger.warning('Undecodable raw error response from server: %s', err)

raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please configure your editor not to make changes like these. It makes the PR hard to review when they are unrelated stylistic changes in unrelated places.

@@ -292,7 +294,7 @@ def perform_request(self, method, url, params=None, body=None):

if body is not None:
try:
body = body.encode('utf-8', 'surrogatepass')
body = body.encode('utf-8', 'surrogatepass')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please configure your editor not to make changes like these. It makes the PR hard to review when they are unrelated stylistic changes in unrelated places.

@@ -16,7 +16,7 @@ def bulk(self, *args, **kwargs):
self._called += 1
if self._called in self._fail_at:
raise self._fail_with
return self.client.bulk(*args, **kwargs)
return self.client.bulk(*args, **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please configure your editor not to make changes like these. It makes the PR hard to review when they are unrelated stylistic changes in unrelated places.

@jmcarp
Copy link
Contributor Author

jmcarp commented Aug 21, 2017

I rebased on master and dropped the deprecated methods, and reverted whitespace changes. This is ready for another look when you have time @honzakral.

@fxdgear
Copy link
Contributor

fxdgear commented Oct 4, 2017

This LGTM. Can you please rebase off master and push again? I had to change the version of ES that travis uses.

@jmcarp
Copy link
Contributor Author

jmcarp commented Oct 4, 2017

Done @fxdgear.

@fxdgear
Copy link
Contributor

fxdgear commented Oct 4, 2017

Great thanks @jmcarp

@fxdgear fxdgear merged commit a03504c into elastic:master Oct 4, 2017
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch-py that referenced this pull request Dec 13, 2017
outside. However, `Urllib3HttpConnection` did not use these headers but
only its default ones.

With this commit, `Urllib3HttpConnection` will either use the default
ones or merge the default headers with the provided ones if there are
any.

Relates elastic#618
honzakral added a commit that referenced this pull request Jan 1, 2018
outside. However, `Urllib3HttpConnection` did not use these headers but
only its default ones.

With this commit, `Urllib3HttpConnection` will either use the default
ones or merge the default headers with the provided ones if there are
any.

Relates #618
rciorba added a commit to rciorba/elasticsearch-py that referenced this pull request Mar 2, 2018
Set application/x-ndjson content type on bulk requests.
rciorba added a commit to rciorba/elasticsearch-py that referenced this pull request Mar 2, 2018
outside. However, `Urllib3HttpConnection` did not use these headers but
only its default ones.

With this commit, `Urllib3HttpConnection` will either use the default
ones or merge the default headers with the provided ones if there are
any.

Relates elastic#618
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants