Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge LIVE VIEW #6425

Merged
merged 101 commits into from Aug 20, 2019
Merged

Merge LIVE VIEW #6425

merged 101 commits into from Aug 20, 2019

Conversation

@KochetovNicolai
Copy link
Contributor

KochetovNicolai commented Aug 9, 2019

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

For changelog. Remove if this is non-significant change.

Category (leave one):

  • New Feature

Short description (up to few sentences):
Added implementation of LIVE VIEW tables that were originally proposed in #3925, and then updated in #5541. Note that currently only LIVE VIEW tables are supported. See #5541 for detailed description.

ludv1x and others added 30 commits Nov 16, 2017
…3452]

Fixed handling of obsolete parts.
Fixed conflict resolution between simultaneous PreCommitted covering parts.
Fixed memory leak caused by ordinary MergeTree parts stucked in Deleting state.
Added hidden _state column into system.parts.
…logger.flush logger.rotateOnOpen

Conflicts:
	.gitmodules
	dbms/src/Common/BackgroundSchedulePool.h
robot-metrika-test
… SELECT

* Adding support for _version column to LIVE VIEWS when using WATCH query
* Adding initial support for WATCH query on LIVE VIEWs
* Fixing issue with setting _version virtual column
* Added uexpect.py module
* Fixed support for CREATE TEMPORARY LIVE VIEW
* Adding support for watching live views via HTTPHandler
* Small fix to WriteBufferValidUTF8.cpp to propagate next() call
* Updated copyData.cpp to treat block with no rows as
  "flush block"
* Updated PushingToViewsBlockOutputStream.cpp to directly use
  LIVE VIEW output stream instead of calling its write method
…House into liveview"

This reverts commit 160ff4d, reversing
changes made to a946d09.
…nterval

  and making the setting to be in seconds instead of usec
* Updated LIVE VIEW table to keep version consistent as long as the server is up
* Removed calculation of hash in writeIntoLiveView method as it is recalculated
  in the LiveViewBlockOutputStream
* Changed to using values from the local context for live_view_heartbeat_interval and
  temporary_live_view_timeout instead of the global context
* Updated LIVE VIEW tests to match the changes
@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 16, 2019

Hi @KochetovNicolai, thanks for trying to merge the PR. I apologize for not getting back earlier. I was just distracted by other work.

What was wrong with the 00963_temporary_live_view_watch_live_timeout test? Maybe I can improve it. I also see that you have rewritten one of the tests, I guess the uexpect.py module did not work for some reason?

@KochetovNicolai

This comment has been minimized.

Copy link
Contributor Author

KochetovNicolai commented Aug 18, 2019

Hi, @vzakaznikov
Now I'm just trying to rewrite tests because for some reasons most of them fails under tsan build (it's not tsan issue, but uexpect seem to hang sometimes). I think it wouldn't be very difficult.
After that I will try to merge pr just to avoid further conflicts with master. And after we could gradually fix other issues.

@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 18, 2019

Hi @KochetovNicolai. I will try building with clang and thread sanitizer enabled to see if I can reproduce the issue. Hopefully, I can fix it so that we don't have to rewrite all the tests.

@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 19, 2019

@KochetovNicolai, I've reproduced the hang issue in the uexpect.py. I will try to fix it.

@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 19, 2019

@KochetovNicolai, could you try the patch below and see if it works for you

$ git diff
diff --git a/dbms/tests/queries/0_stateless/helpers/uexpect.py b/dbms/tests/queries/0_stateless/helpers/uexpect.py
index d65190e2b2..6f149992fc 100644
--- a/dbms/tests/queries/0_stateless/helpers/uexpect.py
+++ b/dbms/tests/queries/0_stateless/helpers/uexpect.py
@@ -189,6 +189,7 @@ def spawn(command):
     queue = Queue()
     reader_kill_event = Event()
     thread = Thread(target=reader, args=(process, master, queue, reader_kill_event))
+    thread.daemon = True
     thread.start()
 
     return IO(process, master, queue, reader={'thread':thread, 'kill_event':reader_kill_event})

Now running the tests shows that all tests pass except your new test which I think we can remove.

Running 21 stateless tests.

00991_live_view_watch_http:                                             [ FAIL ] - result differs with reference:
--- /home/user/github/ClickHouse_Master/dbms/tests/queries/0_stateless/00991_live_view_watch_http.reference	2019-08-18 16:55:59.127478676 -0400
+++ /home/user/github/ClickHouse_Master/dbms/tests/queries/0_stateless/00991_live_view_watch_http.stdout	2019-08-18 21:39:32.760480337 -0400
@@ -0,0 +1,5 @@
+0	1
+0	1
+6	2
+6	2
+curl: (28) Operation timed out after 10001 milliseconds with 8 bytes received

00980_create_temporary_live_view:                                       [ OK ]
00979_live_view_watch_live:                                             [ OK ]
00978_live_view_watch:                                                  [ OK ]
00977_live_view_watch_events:                                           [ OK ]
00976_live_view_select_version:                                         [ OK ]
00975_live_view_create:                                                 [ OK ]
00974_live_view_select_with_aggregation:                                [ OK ]
00973_live_view_select:                                                 [ OK ]
00972_live_view_select_1:                                               [ OK ]
00971_live_view_watch_http_heartbeat:                                   [ OK ]
00970_live_view_watch_events_http_heartbeat:                            [ OK ]
00969_live_view_watch_format_jsoneachrowwithprogress:                   [ OK ]
00968_live_view_select_format_jsoneachrowwithprogress:                  [ OK ]
00967_live_view_watch_http:                                             [ OK ]
00966_live_view_watch_events_http:                                      [ OK ]
00965_live_view_watch_heartbeat:                                        [ OK ]
00964_live_view_watch_events_heartbeat:                                 [ OK ]
00962_temporary_live_view_watch_live:                                   [ OK ]
00961_temporary_live_view_watch:                                        [ OK ]
00960_live_view_watch_events_live:                                      [ OK ]

Having 1 errors! 20 tests passed. 0 tests skipped.
Won't run stateful tests because test data wasn't loaded.

The root cause of the hang issue is that the clickhouse-client is not being exited.
Therefore, the bash process goes into a "defunct" state and can't be killed because it is waiting
for its child to exit. The test itself does not exit because the reader thread is block in os.read().

The solution is to make the reader thread a daemon so that the test is not blocked on the reader thread.

@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 19, 2019

@KochetovNicolai, here is an update to my diff to resolve an issue with clickhouse-client processes left over after test execution.

$ git diff
diff --git a/dbms/tests/queries/0_stateless/helpers/client.py b/dbms/tests/queries/0_stateless/helpers/client.py
index de4da79480..9700c6a374 100644
--- a/dbms/tests/queries/0_stateless/helpers/client.py
+++ b/dbms/tests/queries/0_stateless/helpers/client.py
@@ -10,14 +10,24 @@ import uexpect
 prompt = ':\) '
 end_of_block = r'.*\r\n.*\r\n'
 
-def client(command=None, name='', log=None):
-    client = uexpect.spawn(['/bin/bash','--noediting'])
-    if command is None:
-        command = os.environ.get('CLICKHOUSE_BINARY', 'clickhouse') + '-client'
-    client.command = command
-    client.eol('\r')
-    client.logger(log, prefix=name)
-    client.timeout(20)
-    client.expect('[#\$] ', timeout=2)
-    client.send(command)
-    return client
+class client(object):
+    def __init__(self, command=None, name='', log=None):
+        self.client = uexpect.spawn(['/bin/bash','--noediting'])
+        if command is None:
+            command = os.environ.get('CLICKHOUSE_BINARY', 'clickhouse') + '-client'
+        self.client.command = command
+        self.client.eol('\r')
+        self.client.logger(log, prefix=name)
+        self.client.timeout(20)
+        self.client.expect('[#\$] ', timeout=2)
+        self.client.send(command)
+
+    def __enter__(self):
+        return self.client.__enter__()
+
+    def __exit__(self, type, value, traceback):
+        # send Ctrl-C
+        self.client.send('\x03', eol='')
+        self.client.send('\x03', eol='')
+        self.client.send('\x03', eol='')
+        return self.client.__exit__(type, value, traceback)
diff --git a/dbms/tests/queries/0_stateless/helpers/uexpect.py b/dbms/tests/queries/0_stateless/helpers/uexpect.py
index d65190e2b2..6f149992fc 100644
--- a/dbms/tests/queries/0_stateless/helpers/uexpect.py
+++ b/dbms/tests/queries/0_stateless/helpers/uexpect.py
@@ -189,6 +189,7 @@ def spawn(command):
     queue = Queue()
     reader_kill_event = Event()
     thread = Thread(target=reader, args=(process, master, queue, reader_kill_event))
+    thread.daemon = True
     thread.start()
 
     return IO(process, master, queue, reader={'thread':thread, 'kill_event':reader_kill_event})
Running 20 stateless tests.

00980_create_temporary_live_view:                                       [ OK ]
00979_live_view_watch_live:                                             [ OK ]
00978_live_view_watch:                                                  [ OK ]
00977_live_view_watch_events:                                           [ OK ]
00976_live_view_select_version:                                         [ OK ]
00975_live_view_create:                                                 [ OK ]
00974_live_view_select_with_aggregation:                                [ OK ]
00973_live_view_select:                                                 [ OK ]
00972_live_view_select_1:                                               [ OK ]
00971_live_view_watch_http_heartbeat:                                   [ OK ]
00970_live_view_watch_events_http_heartbeat:                            [ OK ]
00969_live_view_watch_format_jsoneachrowwithprogress:                   [ OK ]
00968_live_view_select_format_jsoneachrowwithprogress:                  [ OK ]
00967_live_view_watch_http:                                             [ OK ]
00966_live_view_watch_events_http:                                      [ OK ]
00965_live_view_watch_heartbeat:                                        [ OK ]
00964_live_view_watch_events_heartbeat:                                 [ OK ]
00962_temporary_live_view_watch_live:                                   [ OK ]
00961_temporary_live_view_watch:                                        [ OK ]
00960_live_view_watch_events_live:                                      [ OK ]

20 tests passed. 0 tests skipped.
Won't run stateful tests because test data wasn't loaded.

$ ps wax | grep clickhouse-client
12976 pts/0    S+     0:00 grep --color=auto clickhouse-client
@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 19, 2019

@KochetovNicolai, here is another update as sometimes I still could see clickhouse-client processes left over. I think the fundamental issue is that clickhouse-client process
ignores SIGCHLD. Do you know why?
Also the tests do not attempt to exit clickhouse-client cleanly.

$ git diff
diff --git a/dbms/tests/queries/0_stateless/helpers/client.py b/dbms/tests/queries/0_stateless/helpers/client.py
index de4da79480..f3938d3bf6 100644
--- a/dbms/tests/queries/0_stateless/helpers/client.py
+++ b/dbms/tests/queries/0_stateless/helpers/client.py
@@ -1,5 +1,6 @@
 import os
 import sys
+import time
 
 CURDIR = os.path.dirname(os.path.realpath(__file__))
 
@@ -10,14 +11,26 @@ import uexpect
 prompt = ':\) '
 end_of_block = r'.*\r\n.*\r\n'
 
-def client(command=None, name='', log=None):
-    client = uexpect.spawn(['/bin/bash','--noediting'])
-    if command is None:
-        command = os.environ.get('CLICKHOUSE_BINARY', 'clickhouse') + '-client'
-    client.command = command
-    client.eol('\r')
-    client.logger(log, prefix=name)
-    client.timeout(20)
-    client.expect('[#\$] ', timeout=2)
-    client.send(command)
-    return client
+class client(object):
+    def __init__(self, command=None, name='', log=None):
+        self.client = uexpect.spawn(['/bin/bash','--noediting'])
+        if command is None:
+            command = os.environ.get('CLICKHOUSE_BINARY', 'clickhouse') + '-client'
+        self.client.command = command
+        self.client.eol('\r')
+        self.client.logger(log, prefix=name)
+        self.client.timeout(20)
+        self.client.expect('[#\$] ', timeout=2)
+        self.client.send(command)
+
+    def __enter__(self):
+        return self.client.__enter__()
+
+    def __exit__(self, type, value, traceback):
+        self.client.reader['kill_event'].set()
+        # send Ctrl-C
+        self.client.send('\x03', eol='')
+        time.sleep(0.3)
+        self.client.send('quit', eol='\r')
+        self.client.send('\x03', eol='')
+        return self.client.__exit__(type, value, traceback)
diff --git a/dbms/tests/queries/0_stateless/helpers/uexpect.py b/dbms/tests/queries/0_stateless/helpers/uexpect.py
index d65190e2b2..930a45a099 100644
--- a/dbms/tests/queries/0_stateless/helpers/uexpect.py
+++ b/dbms/tests/queries/0_stateless/helpers/uexpect.py
@@ -189,6 +189,7 @@ def spawn(command):
     queue = Queue()
     reader_kill_event = Event()
     thread = Thread(target=reader, args=(process, master, queue, reader_kill_event))
+    thread.daemon = True
     thread.start()
 
     return IO(process, master, queue, reader={'thread':thread, 'kill_event':reader_kill_event})
@@ -199,6 +200,6 @@ def reader(process, out, queue, kill_event):
             data = os.read(out, 65536)
             queue.put(data)
         except OSError, e:
-            if e.errno == 5 and kill_event.is_set():
+            if kill_event.is_set():
                 break
             raise
@KochetovNicolai

This comment has been minimized.

Copy link
Contributor Author

KochetovNicolai commented Aug 19, 2019

I think the fundamental issue is that clickhouse-client process
ignores SIGCHLD. Do you know why?

@vzakaznikov I know almost nothing about unix signals. But I have just googled that SIGCHLD is sent to parent process by os after child have finished (and usually should be ignored). Why do you think it's related?

@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 19, 2019

@KochetovNicolai, you are right SIGCHLD has nothing to do with it. I confused it with SIGHUP which clickhouse-client actually does not ignore. We should try killing bash with SIGHUP
so that it re-sends it to all its children.

https://www.gnu.org/software/bash/manual/html_node/Signals.html#Signals

The shell exits by default upon receipt of a SIGHUP. Before exiting, an interactive shell re-sends the SIGHUP to all jobs, running or stopped.

@KochetovNicolai

This comment has been minimized.

Copy link
Contributor Author

KochetovNicolai commented Aug 19, 2019

@vzakaznikov Tests still fail sometimes. For example on release in prev commit.
Wouldn't you mind if I rewrite all tests without uexpect.py and so on? As I did with live_view_watch_http.

@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 19, 2019

@KochetovNicolai, no I don't mind but I think we are very close to making it solid. I will provide you a diff by the end of the day that should fix this last issue and all tests should be good to go.
The small fix is in the reader method and change "except OSError" to just "except" so that all exceptions are ignored when kill event is set.

@KochetovNicolai

This comment has been minimized.

Copy link
Contributor Author

KochetovNicolai commented Aug 19, 2019

@vzakaznikov ok. I will wait till you fix tests. I've already rewritten four, but hope it's not a problem :)

@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 19, 2019

Nicolai (@KochetovNicolai), here is the diff

$ git diff
diff --git a/dbms/tests/queries/0_stateless/helpers/uexpect.py b/dbms/tests/queries/0_stateless/helpers/uexpect.py
index 930a45a099..f71b32a53e 100644
--- a/dbms/tests/queries/0_stateless/helpers/uexpect.py
+++ b/dbms/tests/queries/0_stateless/helpers/uexpect.py
@@ -101,6 +101,7 @@ class IO(object):
 
     def close(self, force=True):
         self.reader['kill_event'].set()
+        os.system('pkill -TERM -P %d' % self.process.pid)
         if force:
             self.process.kill()
         else:
@@ -199,7 +200,7 @@ def reader(process, out, queue, kill_event):
         try:
             data = os.read(out, 65536)
             queue.put(data)
-        except OSError, e:
+        except:
             if kill_event.is_set():
                 break
             raise
@vzakaznikov

This comment has been minimized.

Copy link
Contributor

vzakaznikov commented Aug 19, 2019

Nicolai (@KochetovNicolai), looking at the results of the last commit there are no more failing live view tests. Hopefully, you don't have to rewrite any more of my tests :). I only see the 00395_nullable as failing.

Is there a way for you to re-run all the tests a few more times to make sure we are solid?

@KochetovNicolai

This comment has been minimized.

Copy link
Contributor Author

KochetovNicolai commented Aug 20, 2019

@vzakaznikov I've just merged branch with master. Let's wait for tests again and I will merge the pr if they pass.

@KochetovNicolai KochetovNicolai merged commit aef03fd into master Aug 20, 2019
23 of 24 checks passed
23 of 24 checks passed
Functional stateless tests (unbundled) fail:1, passed:1293, skipped:11
Details
ClickHouse build check 12/12 builds are OK
Details
Code coverage Coverage prepared
Details
Compatibility check Compatibility check passed
Details
Docs check Docs check passed
Details
Functional stateful tests (address) fail:0, passed:97
Details
Functional stateful tests (debug) fail:0, passed:97
Details
Functional stateful tests (release) fail:0, passed:97
Details
Functional stateful tests (thread) fail:0, passed:95, skipped:2
Details
Functional stateful tests (ubsan) fail:0, passed:97
Details
Functional stateless tests (address) fail:0, passed:1303, skipped:2
Details
Functional stateless tests (debug) fail:0, passed:1302, skipped:3
Details
Functional stateless tests (release) fail:0, passed:1305
Details
Functional stateless tests (thread) fail:0, passed:1302, skipped:3
Details
Functional stateless tests (ubsan) fail:0, passed:1303, skipped:2
Details
Integration tests (asan) failed:0, passed:182, error:0
Details
Integration tests (release) failed:0, passed:181, error:0
Details
Marker check All checks were started
PVS check Found 25 errors and 48 warnings
Details
Performance test 1202/1202 queries are OK
Details
Stress test (address) No errors found
Details
Stress test (thread) No errors found
Details
Style check Style check passed
Details
Unit tests fail: 0, passed: 4883
Details
@alexey-milovidov

This comment has been minimized.

Copy link
Member

alexey-milovidov commented Aug 22, 2019

It was too early to merge.

https://clickhouse-test-reports.s3.yandex.net/0/a52c004b5b364d64fa2fd6c5ae521beda60d106d/stress_test_(thread).html

2019.08.23 00:00:17.715007 [ 937 ] {} <Fatal> BaseDaemon: ########################################
2019.08.23 00:00:17.716036 [ 937 ] {} <Fatal> BaseDaemon: (version 19.14.1.1148 (official build)) (from thread 935) Received signal Aborted (6).
2019.08.23 00:00:17.716316 [ 937 ] {} <Fatal> BaseDaemon: 
2019.08.23 00:00:17.716929 [ 937 ] {} <Fatal> BaseDaemon: Stack trace: 0x7fcee345ce97 0x7fcee345e801 0x6afde1a 0x6df097d 0xe3f04a4 0xe3f03a7 0x6b8465b 0xe35bfec 0xc784b24 0xc787d16 0x6af8423 0x7fcee3bb46db 0x7fcee353f88f
2019.08.23 00:00:17.717257 [ 937 ] {} <Fatal> BaseDaemon: 5. 0x7fcee345ce97 raise /lib/x86_64-linux-gnu/libc-2.27.so
2019.08.23 00:00:17.717417 [ 937 ] {} <Fatal> BaseDaemon: 6. 0x7fcee345e801 abort /lib/x86_64-linux-gnu/libc-2.27.so
2019.08.23 00:00:18.033219 [ 937 ] {} <Fatal> BaseDaemon: 7. 0x6afde1a abort /usr/lib/debug/usr/bin/clickhouse
2019.08.23 00:00:18.061689 [ 937 ] {} <Fatal> BaseDaemon: 8. 0x6df097d terminate_handler() /build/obj-x86_64-linux-gnu/../libs/libdaemon/src/BaseDaemon.cpp:0
2019.08.23 00:00:18.156328 [ 937 ] {} <Fatal> BaseDaemon: 9. 0xe3f04a4 std::__terminate(void (*)()) /build/obj-x86_64-linux-gnu/../contrib/libcxxabi/src/cxa_handlers.cpp:61
2019.08.23 00:00:18.234581 [ 937 ] {} <Fatal> BaseDaemon: 10. 0xe3f03a7 std::terminate() /build/obj-x86_64-linux-gnu/../contrib/libcxxabi/src/include/atomic_support.h:78
2019.08.23 00:00:18.307794 [ 937 ] {} <Fatal> BaseDaemon: 11. 0x6b8465b ? /usr/lib/debug/usr/bin/clickhouse
2019.08.23 00:00:18.371763 [ 937 ] {} <Fatal> BaseDaemon: 12. 0xe35bfec ? /build/obj-x86_64-linux-gnu/../contrib/libcxx/src/mutex.cpp:126
2019.08.23 00:00:18.446308 [ 937 ] {} <Fatal> BaseDaemon: 13. 0xc784b24 DB::StorageLiveView::noUsersThread(unsigned long const&) /build/obj-x86_64-linux-gnu/../dbms/src/Storages/StorageLiveView.cpp:299
2019.08.23 00:00:18.561487 [ 937 ] {} <Fatal> BaseDaemon: 14. 0xc787d16 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (DB::StorageLiveView::*)(unsigned long const&), DB::StorageLiveView*, unsigned long> >(void*) /build/obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:2648
2019.08.23 00:00:18.635849 [ 937 ] {} <Fatal> BaseDaemon: 15. 0x6af8423 __tsan_thread_start_func /usr/lib/debug/usr/bin/clickhouse
2019.08.23 00:00:18.653918 [ 937 ] {} <Fatal> BaseDaemon: 16. 0x7fcee3bb46db start_thread /lib/x86_64-linux-gnu/libpthread-2.27.so
2019.08.23 00:00:18.667288 [ 937 ] {} <Fatal> BaseDaemon: 17. 0x7fcee353f88f __clone /lib/x86_64-linux-gnu/libc-2.27.so
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.