In memory coordination inside ClickHouse #19580

Merged
alesapin merged 105 commits into master from in_memory_raft
Feb 13, 2021

Conversation

alesapin
Member

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

@robot-clickhouse added the pr-not-for-changelog label Jan 25, 2021
@alesapin
Member Author

Todo: simplify startup

@alesapin
Member Author

> Todo: simplify startup

Done.

@alesapin
Member Author

#20175

@alesapin
Member Author

alesapin commented Feb 12, 2021

test_materialize_mysql_database/test.py::test_clickhouse_killed_while_insert_5_7[clickhouse_node1]
test_materialize_mysql_database/test.py::test_clickhouse_killed_while_insert_8_0[clickhouse_node1]
cc @sundy-li

@alesapin
Member Author

2021.02.12 14:20:58.800988 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Debug> executeQuery: (from [::1]:45792, using production parser) (comment: /usr/share/clickhouse-test/queries/0_stateless/01520_client_print_query_id.expect) SELECT * FROM numbers(34599)
2021.02.12 14:20:58.916484 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Trace> ContextAccess (default): Access granted: CREATE TEMPORARY TABLE ON *.*
2021.02.12 14:20:59.071980 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2021.02.12 14:21:10.708202 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Information> executeQuery: Read 34599 rows, 270.30 KiB in 11.876294055 sec., 2913 rows/sec., 22.76 KiB/sec.
2021.02.12 14:22:10.506261 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Debug> DynamicQueryHandler: Done processing query
2021.02.12 14:22:18.238037 [ 375 ] {} <Fatal> BaseDaemon: (version 21.3.1.5996, build id: 8DBCED54529C989F7AD4D991F51410774D55DE6C) (from thread 17728) Terminate called for uncaught exception:
2021.02.12 14:22:18.811071 [ 18134 ] {} <Fatal> BaseDaemon: ########################################
2021.02.12 14:22:18.878935 [ 18134 ] {} <Fatal> BaseDaemon: (version 21.3.1.5996, build id: 8DBCED54529C989F7AD4D991F51410774D55DE6C) (from thread 17728) (query_id: de3e7894-b401-4f7d-8530-90cd5ab06682) Received signal Aborted (6)
2021.02.12 14:22:18.943148 [ 18134 ] {} <Fatal> BaseDaemon: 
2021.02.12 14:22:19.007073 [ 18134 ] {} <Fatal> BaseDaemon: Stack trace: 0x7f109932018b 0x7f10992ff859 0x8bb33ae 0x8e301dd 0x17dac8c4 0x17dac7c7 0x8c3fe0b 0x8d552c5 0x8d552ea 0x11a29914 0x11a2a2ca 0x12f96092 0x12f8c65e 0x12f84300 0x15b84110 0x15bc0913 0x15bc103f 0x15d29a12 0x15d27fb0 0x15d267b8 0x8badbad 0x7f10994d5609 0x7f10993fc293
2021.02.12 14:22:19.255998 [ 18134 ] {} <Fatal> BaseDaemon: 5. raise @ 0x4618b in /usr/lib/x86_64-linux-gnu/libc-2.31.so
2021.02.12 14:22:19.270203 [ 18134 ] {} <Fatal> BaseDaemon: 6. abort @ 0x25859 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
2021.02.12 14:22:50.108918 [ 370 ] {} <Fatal> Application: Child process was terminated by signal 6.

Not related to changes

@alesapin
Member Author

No related failures, going to merge today or tomorrow.

@sundy-li
Contributor

test_materialize_mysql_database

cc @zhang2014

@alesapin merged commit f801376 into master on Feb 13, 2021
@alesapin deleted the in_memory_raft branch on February 13, 2021 07:19
@azat
Collaborator

azat commented Feb 13, 2021

> Not related to changes

Code: 24, e.displayText() = DB::Exception: Cannot write to ostream at offset 262994, Stack trace (when copying this message, always include the lines below):

This came from #19451
I have some ideas, but let's improve diagnostics in stress tests first #20462

@azat
Collaborator

azat commented Feb 13, 2021

> 2021.02.12 14:22:18.238037 [ 375 ] {} <Fatal> BaseDaemon: (version 21.3.1.5996, build id: 8DBCED54529C989F7AD4D991F51410774D55DE6C) (from thread 17728) Terminate called for uncaught exception:

@alesapin should be fixed in #20464

@tisonkun
Contributor

Commit 57c9b6c, which belongs to this pull request, prevents ClickHouse from running on Darwin (macOS); it fails with:

<Error> Application: DB::Exception: ClickHouse server built without NuRaft library. Cannot use internal coordination.

Is this intended? If so, is there a plan to fix it?

@alesapin
Member Author

> Commit 57c9b6c, which belongs to this pull request, prevents ClickHouse from running on Darwin (macOS); it fails with:
>
> <Error> Application: DB::Exception: ClickHouse server built without NuRaft library. Cannot use internal coordination.
>
> Is this intended? If so, is there a plan to fix it?

Yes, it's intended. I think we will fix it once the Linux version is completely ready.

@tisonkun
Contributor

@alesapin Where does the original incompatibility come from? NuRaft itself is compatible with OSX.

@tisonkun
Contributor

tisonkun commented Feb 25, 2021

> > Commit 57c9b6c, which belongs to this pull request, prevents ClickHouse from running on Darwin (macOS); it fails with:
> >
> > <Error> Application: DB::Exception: ClickHouse server built without NuRaft library. Cannot use internal coordination.
> >
> > Is this intended? If so, is there a plan to fix it?
>
> Yes, it's intended. I think we will fix it once the Linux version is completely ready.

@alesapin Locally, replacing the ulong / size_t types inherited from NuRaft with nuraft::ulong makes it compile on OS_DARWIN. I noticed that you're actively working on this topic, so maybe you can take it into consideration.

That is, ulong → nuraft::ulong, and size_t used as the type in overriding functions also → nuraft::ulong.

@alesapin
Member Author

> > > Commit 57c9b6c, which belongs to this pull request, prevents ClickHouse from running on Darwin (macOS); it fails with:
> > >
> > > <Error> Application: DB::Exception: ClickHouse server built without NuRaft library. Cannot use internal coordination.
> > >
> > > Is this intended? If so, is there a plan to fix it?
> >
> > Yes, it's intended. I think we will fix it once the Linux version is completely ready.
>
> @alesapin Locally, replacing the ulong / size_t types inherited from NuRaft with nuraft::ulong makes it compile on OS_DARWIN. I noticed that you're actively working on this topic, so maybe you can take it into consideration.

Yes, but it's not the only problem: in the TCP server handler I use epoll for incoming connections, which is not available on non-Linux Unix systems. I've replaced it with poll, but it will probably perform worse.

The current development stage is still a prototype, but I can fix the OS_DARWIN build if you need it.
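
The epoll-versus-poll portability point above can be illustrated with a small sketch. This is not the handler code from this pull request; `waitReadable` is a hypothetical helper that waits on a single descriptor, using epoll on Linux and falling back to POSIX poll elsewhere. For brevity it creates an epoll instance per call, which a real server would not do.

```cpp
#include <poll.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/epoll.h>
#endif

// Hypothetical helper: returns true if `fd` becomes readable within
// `timeout_ms` milliseconds, false on timeout or error.
bool waitReadable(int fd, int timeout_ms)
{
#if defined(__linux__)
    // Linux-only path: epoll. A long-lived server would keep one epoll
    // instance and register many descriptors with it.
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return false;

    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fd;

    bool ready = false;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0)
    {
        epoll_event out{};
        ready = epoll_wait(epfd, &out, 1, timeout_ms) > 0 && (out.events & EPOLLIN);
    }
    close(epfd);
    return ready;
#else
    // Portable path: POSIX poll, available on other Unix systems too.
    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLIN;
    return poll(&pfd, 1, timeout_ms) > 0 && (pfd.revents & POLLIN);
#endif
}
```

With a single descriptor both paths behave the same. The performance concern raised above arises with many descriptors, since poll passes and scans the whole descriptor array on every call.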

@tisonkun
Contributor

@alesapin I'd appreciate it if you fixed that. Either fix the type issue, the epoll issue, and any other issues that come up, or disable the NuKeeper module for OS_DARWIN for now.

@alesapin
Member Author

> @alesapin I'd appreciate it if you fixed that. Either fix the type issue, the epoll issue, and any other issues that come up, or disable the NuKeeper module for OS_DARWIN for now.

But it's already disabled? https://github.com/ClickHouse/ClickHouse/blob/master/src/CMakeLists.txt#L197-L199

@tisonkun
Contributor

#else
throw Exception(ErrorCodes::SUPPORT_IS_DISABLED, "ClickHouse server built without NuRaft library. Cannot use internal coordination.");
#endif

You're right. I should have used the term "run": I meant running normally on OS_DARWIN. However, if it is a significant burden while prototyping, it's OK to make another pass at the fix later.
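
For readers unfamiliar with the pattern in the quoted fragment, here is a minimal sketch of a build-time feature guard: the module compiles everywhere, but the feature throws at run time when the library was not built in. The macro `DEMO_USE_NURAFT` and the function `startCoordination` are hypothetical names chosen for illustration, and `std::runtime_error` stands in for the project's own exception type.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical build flag; a real build system would define it to 1
// only on platforms where the library is compiled in.
#ifndef DEMO_USE_NURAFT
#define DEMO_USE_NURAFT 0
#endif

std::string startCoordination()
{
#if DEMO_USE_NURAFT
    // Feature available: the real startup logic would go here.
    return "internal coordination started";
#else
    // Feature compiled out: fail at run time with a clear message.
    throw std::runtime_error(
        "ClickHouse server built without NuRaft library. Cannot use internal coordination.");
#endif
}
```

This is why the build can be disabled for a platform while the binary still starts: the error surfaces only when the feature is actually requested.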

Labels
  • pr-not-for-changelog: This PR should not be mentioned in the changelog
  • submodule changed: At least one submodule changed in this PR.

5 participants