Skip to content

Reduce the impact of not using fail points#88196

Merged
Algunenano merged 3 commits intoClickHouse:masterfrom
Algunenano:failpoint_imp
Oct 8, 2025
Merged

Reduce the impact of not using fail points#88196
Algunenano merged 3 commits intoClickHouse:masterfrom
Algunenano:failpoint_imp

Conversation

@Algunenano
Copy link
Copy Markdown
Member

@Algunenano Algunenano commented Oct 7, 2025

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Reduce the impact of not using fail points

selectPartsToRead is a really common function to do index analysis and checking for failpoints is expensive as it requires a function call and a global lock (see #88184). This PR removes the slowdown_index_analysis failpoint while trying to keep the associated test running correctly (let's see the CI).

It also introduces a fast path for the most common case which is not using fail points at all (they should only be used in CI) by having a global atomic variable that checks whether any fail point has been enabled or not. If it hasn't, then it avoids the path.

Requires ClickHouse/libfiu#4

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Oct 7, 2025

Workflow [PR], commit [a3c0f11]

@Algunenano Algunenano changed the title Failpoint improvements Reduce the impact of not using fail points Oct 7, 2025
@clickhouse-gh clickhouse-gh bot added pr-performance Pull request with some performance improvements submodule changed At least one submodule changed in this PR. labels Oct 7, 2025
@rschu1ze
Copy link
Copy Markdown
Member

rschu1ze commented Oct 7, 2025

Re the fastpath that uses atomics: I am not again it but tbh, the more robust solution would be to force-disable failpoints via CMake in release builds (but not 100% sure if anyone ever used failpoints in a prod/customer build).

@devcrafter devcrafter self-assigned this Oct 7, 2025
@Algunenano
Copy link
Copy Markdown
Member Author

Re the fastpath that uses atomics: I am not again it but tbh, the more robust solution would be to force-disable failpoints via CMake in release builds (but not 100% sure if anyone ever used failpoints in a prod/customer build).

We do, for CI only. For example the test that I'm modifying is enabled only for release build. We can try this way and if it's a problem again think about keeping failpoints only for debug and sanitizer builds, but that will require reviewing all tests.

@Algunenano Algunenano enabled auto-merge October 8, 2025 08:35
@Algunenano Algunenano added this pull request to the merge queue Oct 8, 2025
Merged via the queue into ClickHouse:master with commit 8e6feb5 Oct 8, 2025
32 of 112 checks passed
@Algunenano Algunenano deleted the failpoint_imp branch October 8, 2025 09:37
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Oct 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-performance Pull request with some performance improvements pr-synced-to-cloud The PR is synced to the cloud repo submodule changed At least one submodule changed in this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants