Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail hard on ES shard failure #23724

Merged
merged 2 commits into from
Mar 22, 2019
Merged

Conversation

dannyroberts
Copy link
Member

this seems to happen only very rarely anyways, but returning incomplete results seems
worse in most cases than just getting an error

this seems to happen only very rarely anyways, but returning incomplete results seems
worse in most cases than just getting an error
@@ -536,3 +540,6 @@ def report_shard_failures(search_result):

if search_result.get('_shards', {}).get('failed'):
datadog_counter('commcare.es.partial_results', value=1)
# Example message:
# "_shards: {'successful': 2, 'failed': 0, 'total': 2}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: except the failed value should never be zero if we get here

@@ -536,3 +540,6 @@ def report_shard_failures(search_result):

if search_result.get('_shards', {}).get('failed'):
datadog_counter('commcare.es.partial_results', value=1)
# Example message:
# "_shards: {'successful': 2, 'failed': 0, 'total': 2}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function (and docstring) is no longer well named as it now raises an exception.
Would it be useful to also include the query in the error message so we can figure out what happened? Or is that included in the stacktrace already?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ES requests show up in the sentry breadcrumbs. I've confirmed that's the case for some ESError 500 sentry events, so I'm fairly sure about this

@dannyroberts dannyroberts added the product/invisible Change has no end-user visible impact label Mar 21, 2019
@millerdev millerdev merged commit 1f81467 into master Mar 22, 2019
@millerdev millerdev deleted the dmr/fail-hard-on-es-shard-failure branch March 22, 2019 12:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
product/invisible Change has no end-user visible impact
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants