-
Notifications
You must be signed in to change notification settings - Fork 6.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added possibility to save logs on crash and options to configure logs buffer #52174
Conversation
Good day to all, Please label 'can be tested' and assign a reviewer. |
This is an automated comment for commit 112058d with description of existing statuses. It's updated for the latest CI running
|
Integration tests failures are probably related to the changes |
You are right, Alexander. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general LGTM
- `partition_by` — устанавливает [произвольный ключ партиционирования](../../operations/server-configuration-parameters/settings.md). Нельзя использовать если используется `engine` | ||
- `engine` - устанавливает [настройки MergeTree Engine](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-creating-a-table) для системной таблицы. Нельзя использовать если используется `partition_by`. | ||
- `flush_interval_milliseconds` — период сброса данных из буфера в памяти в таблицу. | ||
- `max_size` – максимальный размер в линиях для буфера с логами. Когда буфер будет заполнен полностью, сбрасывает логи на диск. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
в строках?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
node.restart_clickhouse() | ||
for i in range(10): | ||
node.query(f"select {i}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need TRUNCATE here as well? (afaik tests may be reordered)
Also we probably need to add these tests to parallel_skip.json
, so they will not be executed in parallel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TRUNCATE
is not strictly necessary. It just removes entries from previous log queries from other test cases.
Thanks for letting me know about parallel_skip.json
. I did not know that integrational tests could be run in parallel. Does parallel test execution use the same test fixture
?
I saw that the test order is not equal to the declaration order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests to parallel_skip.json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does parallel test execution use the same test fixture?
I'm not sure
src/Interpreters/Session.cpp
Outdated
@@ -182,7 +182,7 @@ class NamedSessionsStorage | |||
std::chrono::steady_clock::time_point close_cycle_time = std::chrono::steady_clock::now(); | |||
UInt64 close_cycle = 0; | |||
|
|||
void scheduleCloseSession(NamedSessionData & session, std::unique_lock<std::mutex> &) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using TSA
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunally, It may be some kind of Sanitizer error, at least i don't get why the error happens in this context:
Error text:
/home/admin/ClickHouse5/ClickHouse/src/Interpreters/Session.cpp:212:29: error: calling function 'closeSessions' requires holding mutex 'mutex' exclusively [-Werror,-Wthread-safety-analysis]
auto interval = closeSessions();
^
/home/admin/ClickHouse5/ClickHouse/src/Interpreters/Session.cpp:207:5: note: thread warning in function 'cleanThread'
{
^
1 error generated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just reverted my changes, and keep unused unique_lock
in parameters.
But later I found this:
https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
To use unique_lock with TSA, we need to use PT_GUARDED_BY
macro in Clang
Please note that it will not guarantee that all logs are flushed on crash. However, we could implement a simple "WAL" for system logs, like:
Of course it will be too much for introspection logs ( But it's out of the scope of this PR |
Yes, I confirm that this solution does not guarantee log flushing. It just notifies the flushing thread. I like the idea of the Write Ahead Log. It could be a good addition to these changes. |
#52841 WAL implementation task. |
by-value + move
idiom for system logs.max_size
for the system logs instead of a hardcoded parameter.reserved_size
for the system logs instead of the constexpr non-configurable parameter.buffer_size_flush_threshold
for the system logs instead of the constexpr non-configurable parameter.Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):