Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing get_hits_count for es 7.x #333

Merged
merged 6 commits into from Aug 3, 2021

Conversation

JeffAshton
Copy link
Contributor

@JeffAshton JeffAshton commented Jul 7, 2021

get_hits_count isn't matching in 7.x because of the doc_type. This is an especially bad because it fails to match silently.

@JeffAshton JeffAshton force-pushed the es7_get_hits_count branch 2 times, most recently from 16549c7 to a58cbf8 Compare July 7, 2021 14:44
try:
res = self.thread_data.current_es.count(index=index, doc_type=rule['doc_type'], body=query, ignore_unavailable=True)
if es_client.is_atleastsixsix():
res = es_client.count(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without doc_type:

es_client.count(
                    index=index,
                    body={'query': {'match_all': {}}},
                    ignore_unavailable=True
                )
{'count': 14074, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}}

body=query,
ignore_unavailable=True
)
else:
Copy link
Contributor Author

@JeffAshton JeffAshton Jul 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With doc_type:

es_client.count(
                    index=index,
                    doc_type=rule['doc_type'],
                    body={'query': {'match_all': {}}},
                    ignore_unavailable=True
                )
{'count': 0, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}}

This is silently just not matching any documents in 7.x. :(

Copy link
Contributor Author

@JeffAshton JeffAshton Jul 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks to match if you use the implicit _doc type, but this makes side by side migrations not possible where 6.x clusters had a doc_type different from the default.

Copy link
Owner

@jertel jertel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed no unit test was included. I'm assuming because we're already missing unit coverage for this method?

@JeffAshton
Copy link
Contributor Author

I noticed no unit test was included. I'm assuming because we're already missing unit coverage for this meth

I noticed no unit test was included. I'm assuming because we're already missing unit coverage for this method?

I can look at adding some coverage. Found the issue and wanted to get people aware. This is one where we really should have a few integration tests with an Elasticsearch docker instance. Mocking could have easily hidden the same issue.

Copy link
Owner

@jertel jertel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any additional coverage you can provide would be very appreciated. A small improvement to your change would be to define a new helper method called determineDocType() which would return None when using ES 6.2+. That would serve three purposes:

  1. Eliminates maintenance of two similar code statements that call the count() function.
  2. Allows for simple unit testing of the new helper method.
  3. Provides a reusable method that we're likely going to need for ES 8.x compatibility.

@JeffAshton
Copy link
Contributor Author

Any additional coverage you can provide would be very appreciated. A small improvement to your change would be to define a new helper method called determineDocType() which would return None when using ES 6.2+. That would serve three purposes:

  1. Eliminates maintenance of two similar code statements that call the count() function.
  2. Allows for simple unit testing of the new helper method.
  3. Provides a reusable method that we're likely going to need for ES 8.x compatibility.

I was following the established pattern:

if self.writeback_es.is_atleastsixtwo():

There's a few of these in this file. I think my preference would be to move this branching logic into the Elasticsearch provider. I wish they hadn't done inheritance, but composition instead, so that all methods used from the external library are veted in one location.

@jertel jertel merged commit 4ee8a3b into jertel:master Aug 3, 2021
@JeffAshton JeffAshton deleted the es7_get_hits_count branch October 8, 2021 14:53
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 20, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants