### _Why are the changes needed?_
Add quick start documents of the Flink SQL Engine.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run tests](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before making a pull request
Closes #2106 from deadwind4/KYUUBI-1866-quickstart.

Closes #1866

2533aaf [Ada Wong] remove Yarn section
6aa4db8 [Ada Wong] compress png
ff6bff7 [Ada Wong] [KYUUBI #1866][DOCS] Add flink sql engine quick start
Authored-by: Ada Wong <rsl4@foxmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 8f7b2c6)
Signed-off-by: Kent Yao <yao@apache.org>
@@ -36,49 +36,51 @@ You can get the most recent stable release of Apache Kyuubi here:
## Requirements

These are essential components required for Kyuubi to start up.
-For quick start deployment, the only thing you need is `JAVA_HOME` and `SPARK_HOME` being correctly set.
+For quick start deployment, the only thing you need is `JAVA_HOME` being correctly set.

The Kyuubi release package you downloaded or built already contains the rest of the prerequisites inside.

Components | Role | Optional | Version | Remarks
--- | --- | --- | --- | ---
Java | Java<br>Runtime<br>Environment | Required | Java 8/11 | Kyuubi is pre-built with Java 8
-Spark | Distributed<br>SQL<br>Engine | Required | 3.0.0 and above | By default, the Kyuubi binary release is delivered without<br> a Spark tarball.
+Spark | Distributed<br>SQL<br>Engine | Optional | 3.0.0 and above | By default, the Kyuubi binary release is delivered without<br> a Spark tarball.
+Flink | Distributed<br>SQL<br>Engine | Optional | 1.14.0 and above | By default, the Kyuubi binary release is delivered without<br> a Flink tarball.
HDFS | Distributed<br>File<br>System | Optional | referenced<br>by<br>Spark | The Hadoop Distributed File System is a<br> part of the Hadoop framework, used to<br> store and process the datasets.<br> You can interact with any<br> Spark-compatible version of HDFS.
Hive | Metastore | Optional | referenced<br>by<br>Spark | Hive Metastore for Spark SQL to connect to
Zookeeper | Service<br>Discovery | Optional | Any<br>Zookeeper<br>ensemble<br>compatible<br>with<br>Curator (2.12.0) | By default, Kyuubi provides an<br> embedded Zookeeper server inside for<br> non-production use.

-Additionally, if you want to work with other Spark-compatible systems or plugins, you only need to take care of them as you would when using them with regular Spark applications.
-For example, you can run Spark SQL engines created by Kyuubi on any cluster manager, including YARN, Kubernetes, Mesos, etc.
-Or, you can manipulate data from different data sources with the Spark Datasource API, e.g. Delta Lake, Apache Hudi, Apache Iceberg, and Apache Kudu.
+Additionally, if you want to work with other Spark/Flink-compatible systems or plugins, you only need to take care of them as you would when using them with regular Spark/Flink applications.
+For example, you can run Spark/Flink SQL engines created by Kyuubi on any cluster manager, including YARN, Kubernetes, Mesos, etc.
+Or, you can manipulate data from different data sources with the Spark Datasource/Flink Table API, e.g. Delta Lake, Apache Hudi, Apache Iceberg, and Apache Kudu.

## Installation

To install Kyuubi, you need to unpack the tarball. For example,

```bash
-tar zxf apache-kyuubi-1.3.1-incubating-bin.tgz
+tar zxf apache-kyuubi-1.5.0-incubating-bin.tgz
```

-This will result in the creation of a subdirectory named `apache-kyuubi-1.3.1-incubating-bin` shown below,
+This will result in the creation of a subdirectory named `apache-kyuubi-1.5.0-incubating-bin` shown below,
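The requirement above boils down to one precondition: `JAVA_HOME` must be set and point at a JDK installation. As a minimal illustrative sketch (not part of Kyuubi, names are my own), the check could look like this:

```python
import os


def java_home_ok(env=os.environ):
    """Return True if JAVA_HOME is set and contains bin/java (hypothetical helper)."""
    home = env.get("JAVA_HOME", "")
    if not home:
        return False
    # The quick-start precondition: a usable Java runtime under $JAVA_HOME/bin.
    return os.path.isfile(os.path.join(home, "bin", "java"))


# With an empty environment the check fails, as expected:
print(java_home_ok({}))  # False
```

This is only a sanity-check sketch; in practice `bin/kyuubi start` will fail fast with a clear message if `JAVA_HOME` is missing.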
The formerly created Spark application for user 'anonymous' will not be reused in this case; a brand new application will be submitted for user 'kentyao' instead.

-Then, you can see 3 processes running in your local environment, including one `KyuubiServer` instance and 2 `SparkSubmit` instances as the SQL engines.
+Then, you can see two processes running in your local environment, including one `KyuubiServer` instance and one `SparkSubmit` or `FlinkSQLEngine` instance as the SQL engine.
+
+- Spark

```
75730 Jps
70843 KyuubiServer
72566 SparkSubmit
-75356 SparkSubmit
+```
+
+- Flink
+
+```
+43484 Jps
+43194 KyuubiServer
+43260 FlinkSQLEngine
```
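The process listings above come from `jps`, which prints one `PID ClassName` pair per line. A toy sketch (not part of Kyuubi; the function name is my own) that picks the Kyuubi-related processes out of such output:

```python
# Sample jps-style output, matching the Flink listing in the doc.
JPS_OUTPUT = """\
43484 Jps
43194 KyuubiServer
43260 FlinkSQLEngine
"""


def kyuubi_processes(jps_text):
    """Return (pid, name) pairs for Kyuubi server/engine processes."""
    names = {"KyuubiServer", "SparkSubmit", "FlinkSQLEngine"}
    procs = []
    for line in jps_text.splitlines():
        parts = line.split(None, 1)  # "PID ClassName"
        if len(parts) == 2 and parts[1] in names:
            procs.append((int(parts[0]), parts[1]))
    return procs


print(kyuubi_processes(JPS_OUTPUT))  # [(43194, 'KyuubiServer'), (43260, 'FlinkSQLEngine')]
```

In a live environment you would feed it the output of `jps` directly, e.g. via `subprocess.run(["jps"], capture_output=True, text=True).stdout`.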
### Execute Statements

+#### Execute Spark SQL Statements
+
If the beeline session is successfully connected, then you can run any query supported by Spark SQL now. For example,

```logtalk
@@ -303,6 +342,88 @@ For example, you can get the Spark web UI from the log for debugging or tuning.

+#### Execute Flink SQL Statements
+
+If the beeline session is successfully connected, then you can run any query supported by Flink SQL now. For example,
+
+```logtalk
+0: jdbc:hive2://127.0.0.1:10009/default> CREATE TABLE T (
+0: jdbc:hive2://127.0.0.1:10009/default> INSERT INTO T VALUES (1, 'Hi'), (2, 'Hello');
+16:28:52.780 INFO org.apache.kyuubi.operation.ExecuteStatement: Processing anonymous's query[d79abf78-d2ae-468f-87b2-19db1fc6e19a]: INITIALIZED_STATE -> PENDING_STATE, statement: INSERT INTO T VALUES (1, 'Hi'), (2, 'Hello')
+16:28:52.786 INFO org.apache.kyuubi.operation.ExecuteStatement: Processing anonymous's query[d79abf78-d2ae-468f-87b2-19db1fc6e19a]: PENDING_STATE -> RUNNING_STATE, statement: INSERT INTO T VALUES (1, 'Hi'), (2, 'Hello')
+16:28:57.827 INFO org.apache.kyuubi.operation.ExecuteStatement: Query[d79abf78-d2ae-468f-87b2-19db1fc6e19a] in RUNNING_STATE
+16:28:59.836 INFO org.apache.kyuubi.operation.ExecuteStatement: Query[d79abf78-d2ae-468f-87b2-19db1fc6e19a] in FINISHED_STATE
+16:28:59.837 INFO org.apache.kyuubi.operation.ExecuteStatement: Processing anonymous's query[d79abf78-d2ae-468f-87b2-19db1fc6e19a]: RUNNING_STATE -> FINISHED_STATE, statement: INSERT INTO T VALUES (1, 'Hi'), (2, 'Hello'), time taken: 7.05 seconds
++-------------------------------------+
+| default_catalog.default_database.T |
++-------------------------------------+
+| -1 |
++-------------------------------------+
+1 row selected (7.104 seconds)
+0: jdbc:hive2://127.0.0.1:10009/default>
+0: jdbc:hive2://127.0.0.1:10009/default> SELECT * FROM T;
+16:29:08.092 INFO org.apache.kyuubi.operation.ExecuteStatement: Processing anonymous's query[af5660c0-fcc4-4f80-b3fd-c4a799faf33f]: INITIALIZED_STATE -> PENDING_STATE, statement: SELECT * FROM T
+16:29:08.101 INFO org.apache.kyuubi.operation.ExecuteStatement: Processing anonymous's query[af5660c0-fcc4-4f80-b3fd-c4a799faf33f]: PENDING_STATE -> RUNNING_STATE, statement: SELECT * FROM T
+16:29:12.519 INFO org.apache.kyuubi.operation.ExecuteStatement: Query[af5660c0-fcc4-4f80-b3fd-c4a799faf33f] in FINISHED_STATE
+16:29:12.520 INFO org.apache.kyuubi.operation.ExecuteStatement: Processing anonymous's query[af5660c0-fcc4-4f80-b3fd-c4a799faf33f]: RUNNING_STATE -> FINISHED_STATE, statement: SELECT * FROM T, time taken: 4.419 seconds
++----+--------+
+| a | b |
++----+--------+
+| 1 | Hi |
+| 2 | Hello |
++----+--------+
+2 rows selected (4.466 seconds)
+```
+
+As shown in the above case, you can retrieve all the operation logs, the result schema, and the result to your client side in the beeline console.
+
+Additionally, some useful information about the background Flink SQL application associated with this connection is also printed in the operation log.
+For example, you can get the Flink web UI from the log for debugging or tuning.
+
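The operation logs above trace a fixed lifecycle for each statement: `INITIALIZED_STATE -> PENDING_STATE -> RUNNING_STATE -> FINISHED_STATE`. The following is a toy model of those transitions built only from what the logs print; it is illustrative, not Kyuubi's actual implementation:

```python
# Allowed forward transitions, as observed in the operation logs above.
# (This toy model ignores failure/cancel states the logs don't show.)
ALLOWED = {
    "INITIALIZED_STATE": {"PENDING_STATE"},
    "PENDING_STATE": {"RUNNING_STATE"},
    "RUNNING_STATE": {"FINISHED_STATE"},
    "FINISHED_STATE": set(),  # terminal
}


def is_valid_lifecycle(states):
    """Check that every consecutive pair of states is an allowed transition."""
    return all(b in ALLOWED.get(a, set()) for a, b in zip(states, states[1:]))


print(is_valid_lifecycle(
    ["INITIALIZED_STATE", "PENDING_STATE", "RUNNING_STATE", "FINISHED_STATE"]))  # True
```

Reading the logs with this model in mind makes it easy to spot where a slow statement is stuck: a long gap between the `PENDING_STATE -> RUNNING_STATE` and `RUNNING_STATE -> FINISHED_STATE` lines means the time went into execution, not queueing.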
### Closing a Connection

Close the session between beeline and Kyuubi server by executing `!quit`, for example,
@@ -338,4 +459,4 @@ Bye!
The `KyuubiServer` instance will be stopped immediately while the SQL engine's application will still be alive for a while.

If you start Kyuubi again before the SQL engine application terminates itself, it will reconnect to the newly created `KyuubiServer` instance.