Fixes blockage of Transfer Queue if UpsertSearchAttributes is invoked on a Temporal Server w/o Elastic Search #727

Merged: 1 commit, Sep 15, 2020

@mastermanu mastermanu commented Sep 15, 2020

The current implementations of UpsertWorkflowExecution on the Cassandra/SQL visibility persistence stores return a Service Error since Query Operations are not supported for those stores.

The problem is that we allow customers to invoke UpsertWorkflowExecution successfully even if Elasticsearch isn't enabled. As a result, our Transfer Task Queues get clogged, because the transfer task is retried against a persistence operation that will always fail.

The fix is to make every UpsertWorkflowExecution call on Cassandra/SQL a no-op so that it does not perpetually "fail" the transfer task. We will still fail with explicit errors on List/Scan/Count, so users should be able to tell easily that their application logic is broken if it depends on Elasticsearch.

The change was verified as follows:

  1. The existing unit test was modified to verify that UpsertWorkflowExecution always behaves as a no-op.
  2. With this change, running the Bench Test against a Temporal Server without an ES cluster no longer spams the "Critical error processing task" error log.

This is a very low-risk change.

@mastermanu mastermanu merged commit a09867d into temporalio:master Sep 15, 2020