Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: lost messages when clickhouse closes while sending messages #10997

Conversation

kjellwinblad
Copy link
Contributor

@kjellwinblad kjellwinblad commented Jun 9, 2023

I think this can be merged without test case so that it gets into the 5.1 release. I can create a follow up PR with a test case.

Fixes:
https://emqx.atlassian.net/browse/EMQX-10217

Summary

🤖 Generated by Copilot at 61488ec

Improve error handling in emqx_bridge_clickhouse_connector.erl by distinguishing recoverable and fatal errors when sending queries to clickhouse. Add a TODO comment for future improvement.

PR Checklist

Please convert it to a draft if any of the following conditions are not met. Reviewers may skip over until all the items are checked:

  • [] Added tests for the changes. Will add test case in a follow up PR
  • [] Changed lines covered in coverage report
  • Change log has been added to changes/{ce,ee}/(feat|perf|fix)-<PR-id>.en.md files
  • [x For internal contributor: there is a jira ticket to track this change
  • If there should be document changes, a PR to emqx-docs.git is sent, or a jira ticket is created to follow up
  • Schema changes are backward compatible

@kjellwinblad kjellwinblad requested a review from a team as a code owner June 9, 2023 14:10
sstrigler

This comment was marked as duplicate.

{error, {closed, PartialBody}} ->
{error, {recoverable_error, {closed_partial_body, PartialBody}}};
{error, disconnected} ->
{error, {recoverable_error, disconnected}};
_ ->
Copy link
Contributor

@sstrigler sstrigler Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  case is_recoverable_error(ClickhouseErrorResult) of
    true ->
      to_recoverable_error(ClickHouseErrorResult):
    false ->
      {error, ClickhouseErrorResult} // not sure, but doesn't that end up as a double wrapped error?
  end.
    
is_recoverable_error({error, Reason}) ->
  is_recoverable_error_reason(Reason).
  
is_recoverable_error_reason(...) -> 
 // fill in all clauses from above
 true;
is_recoverable_error_reason(_) -> 
 false.
 
to_recoverable_error({error, Reason}) ->
  {error, {recoverable_error, Reason}}.
  

#nit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed in commit below

@kjellwinblad kjellwinblad merged commit 43292c0 into emqx:release-51 Jun 9, 2023
109 checks passed
@yanzhiemq
Copy link
Collaborator

yanzhiemq commented Jun 13, 2023

Bug Fixes

  • The ClickHouse bridge had a problem that could cause messages to be dropped when the ClickHouse server is closed while sending messages even when the request_ttl is set to infinity. This has been fixed by treating errors due to a closed connection as recoverable errors.

@yanzhiemq
Copy link
Collaborator

yanzhiemq commented Jun 13, 2023

修复

  • ClickHouse 桥接存在一个问题,即当 ClickHouse 服务器在发送消息时关闭时,即使请求的 ttl(time to live)设置为无限大,也可能导致消息丢失。通过将由于连接关闭引起的错误视为可恢复错误修复了该问题。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants