Set a non-zero exit code when client elapses receive_timeout#91432
Conversation
|
Workflow [PR], commit [a196398] Summary: ❌
|
f978952 to
2997c1e
Compare
2997c1e to
ad54a68
Compare
|
Good shout @Fgrtue - I have added a test to check for the new exit code. |
|
@sberss сould you also add a test with a different |
There was a problem hiding this comment.
Pull request overview
This PR improves error handling in clickhouse-client by ensuring timeout failures return a proper non-zero exit code (159 - TIMEOUT_EXCEEDED) instead of exiting successfully with code 0. This change enables scripts and automation to correctly detect and handle query timeout scenarios.
Key Changes:
- Modified timeout handling to throw an exception instead of just printing an error message
- Added comprehensive test coverage to verify the new exit code behavior
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/Client/ClientBase.cpp |
Changed timeout handling from printing an error message to throwing a TIMEOUT_EXCEEDED exception, ensuring non-zero exit code |
tests/queries/0_stateless/03734_client_receive_timeout_exit_code.sh |
Added test script to verify timeout returns exit code 159 and error message is displayed |
tests/queries/0_stateless/03734_client_receive_timeout_exit_code.reference |
Added expected output for the test (exit code 159 and grep success code 0) |
tests/queries/0_stateless/03734_client_receive_timeout_exit_code.sh
Outdated
Show resolved
Hide resolved
tests/queries/0_stateless/03734_client_receive_timeout_exit_code.sh
Outdated
Show resolved
Hide resolved
This also seems to happen on the I'm struggling to find anything else that triggers the timeout when set to greater than 0. Do you have any suggestions? |
ad54a68 to
3337ad6
Compare
|
I don't have ideas on the test at the moment, but I just wanted to make sure that this is the behavior that you want. Probably the client does receive some packets internally from the server, and that is why the timeout does not fire. However, I can imagine it happening in the cases when there are no packets received for time longer than the Could you also fix the |
If this refers to the Copilot message, it seems to have been deleted. Do you have the specifics on what is needed here? |
I deleted the Copilot message as it was misleading and added my own explanation of what I suggest to do. If it works for you, feel free to use it a guidance. |
|
Sorry @Fgrtue, I'm potentially being completely blind, but I don't see the explanation you are referring to. |
|
No worries, I'll copy it here: The Exception constructor uses PreformattedMessage and error code in the constrictor. At the moment we additionally create another exception before passing it to make_unique constructor. This seems to be redundant, since we can just create PreformatterMessage instead of a std::string and pass it as an argument to make_unique. Could you adjust this part please? |
3337ad6 to
a07fd91
Compare
|
Ah, I see what you mean - does the change I made look OK? |
fcfc6e1 to
fc35478
Compare
|
Hi @sberss, your change looked a bit different from what I had in mind, so I updated it to streamline the implementation. I hope this is fine. I discussed the test issue with other colleagues. As I mentioned, the server is probably sending some information to the client, which is why the timeout doesn't fire. It's unclear what exactly the server is sending, and this requires some investigation. Unfortunately, I don't think I can merge without a test that properly validates the functionality with a non-trivial |
|
I had another look into this and found that the server was sending progress packets every 100ms (controlled by the I've updated the test to set |
8fbe23f to
a0d71f2
Compare
…ogress packets resetting the timeout counter
a0d71f2 to
a196398
Compare
|
Hi @sberss! Sorry for the late reply. Yes, the tests look good now. Let's see what the tests do, and if there is nothing related inside we will proceed. |
002a1ee
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Update clickhouse-client to return a non-zero exit code (159 - TIMEOUT_EXCEEDED) when a query times out due to
receive_timeout. Previously, timeouts would return exit code 0 (success), making it difficult for scripts and automation to detect timeout failures.Documentation entry for user-facing changes
Change Details:
Motivation:
When running clickhouse-client in non-interactive mode (scripts/automation), timeout errors were incorrectly treated as successful execution. The client would print "Timeout exceeded while receiving data from server" but exit with code 0, causing scripts to incorrectly assume the query completed successfully.
This is particularly problematic for long-running INSERT statements where timeouts are a real failure condition that should stop script execution.
Behavior Change:
Example: