Skip to content

[feature](graph)Support querying data from the Nebula graph database#18769

Closed
chenlinzhong wants to merge 3 commits intoapache:masterfrom
chenlinzhong:feature_graph
Closed

[feature](graph)Support querying data from the Nebula graph database#18769
chenlinzhong wants to merge 3 commits intoapache:masterfrom
chenlinzhong:feature_graph

Conversation

@chenlinzhong
Copy link
Copy Markdown
Contributor

@chenlinzhong chenlinzhong commented Apr 18, 2023

Proposed changes

Problem summary

Support querying data from the Nebula graph database
This feature comes from the needs of commercial customers who have used Doris and Nebula, hoping to connect these two databases

changes mainly include:

  • add New Graph Database JDBC Type
  • Adapt the type and map the graph to the Doris type

How to use ?

Nebula-graph type map

nebula Doris
tinyint/samllint/int/int64 bigint
double/float double
date date
timestamp bigint
datetime datetime
bool boolean
vertex/edge/path/list/set/time等 varchar

NebulaGraphTest (仅支持查询)

nebula版本 JDBC驱动版本
3.0.0 nebula-jdbc-3.0.0-jar-with-dependencies.jar
#step1.在nebula创建测试数据
#1.1 创建结点
(root@nebula) [basketballplayer]> CREATE TAG test(t_str string, 
    t_int int, 
    t_date date,
    t_datetime datetime,
    t_bool bool,
    t_timestamp timestamp,
    t_float float,
    t_double double
);

#1.2 插入数据
(root@nebula) [basketballplayer]> INSERT VERTEX test_type(t_str,t_int,t_date,t_datetime,t_bool,t_timestamp,t_float,t_double) values "zhangshan":("zhangshan",1000,date("2023-01-01"),datetime("2023-01-23 15:23:32"),true,1234242423,1.2,1.35);

#1.3 查询数据
(root@nebula) [basketballplayer]> match (v:test_type) where id(v)=="zhangshan" return v.test_type.t_str,v.test_type.t_int,v.test_type.t_date,v.test_type.t_datetime,v.test_type.t_bool,v.test_type.t_timestamp,v.test_type.t_float,v.test_type.t_double,v limit 30;
+-------------------+-------------------+--------------------+----------------------------+--------------------+-------------------------+---------------------+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| v.test_type.t_str | v.test_type.t_int | v.test_type.t_date | v.test_type.t_datetime     | v.test_type.t_bool | v.test_type.t_timestamp | v.test_type.t_float | v.test_type.t_double | v                                                                                                                                                                                                         |
+-------------------+-------------------+--------------------+----------------------------+--------------------+-------------------------+---------------------+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "zhangshan"       | 1000              | 2023-01-01         | 2023-01-23T15:23:32.000000 | true               | 1234242423              | 1.2000000476837158  | 1.35                 | ("zhangshan" :test_type{t_bool: true, t_date: 2023-01-01, t_datetime: 2023-01-23T15:23:32.000000, t_double: 1.35, t_float: 1.2000000476837158, t_int: 1000, t_str: "zhangshan", t_timestamp: 1234242423}) |
+-------------------+-------------------+--------------------+----------------------------+--------------------+-------------------------+---------------------+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Got 1 rows (time spent 1616/2048 us)

Mon, 17 Apr 2023 17:23:14 CST


#step2.在doris中创建外表
#2.1 创建一个resource
MySQL [test_db]> CREATE EXTERNAL RESOURCE gg_jdbc_resource 
properties (
   "type"="jdbc",
   "user"="root",
   "password"="123",
   "jdbc_url"="jdbc:nebula://127.0.0.1:9669/basketballplayer",
   "driver_url"="file:///home/clz/baidu/bdg/doris/be/lib/nebula-jdbc-3.0.0-jar-with-dependencies.jar",  --仅支持本地路径,需放到be/lib目录下--
   "driver_class"="com.vesoft.nebula.jdbc.NebulaDriver"
);

#2.2 创建一个外表,这个主要是告诉doris如何解析nebulagraph返回的数据
MySQL [test_db]> CREATE TABLE `test_type` ( 
 `t_str` varchar(64),
 `t_int` bigint,
 `t_date` date,
 `t_datetime` datetime,
 `t_bool` boolean,
 `t_timestamp` bigint,
 `t_float` double,
 `t_double` double,
 `t_vertx`  varchar(128) --vertex对应doris类型是varchar---
) ENGINE=JDBC
PROPERTIES (
"resource" = "gg_jdbc_resource",
"table" = "xx",  --因为graph没有表的概念,这里随便填一个值--
"table_type"="nebula"
);

#2.3 查询graph外表,用g()函数把图的nGQL透传给nebula
MySQL [test_db]> select * from test_type where g('match (v:test_type) where id(v)=="zhangshan" return v.test_type.t_str,v.test_type.t_int,v.test_type.t_date,v.test_type.t_datetime,v.test_type.t_bool,v.test_type.t_timestamp,v.test_type.t_float,v.test_type.t_double,v')\G;
*************************** 1. row ***************************
      t_str: zhangshan
      t_int: 1000
     t_date: 2023-01-01
 t_datetime: 2023-01-23 15:23:32
     t_bool: 1
t_timestamp: 1234242423
    t_float: 1.2000000476837158
   t_double: 1.35
    t_vertx: ("zhangshan" :test_type {t_datetime: utc datetime: 2023-01-23T15:23:32.000000, timezoneOffset: 0, t_timestamp: 1234242423, t_date: 2023-01-01, t_double: 1.35, t_str: "zhangshan", t_int: 1000, t_bool: true, t_float: 1.2000000476837158})
1 row in set (0.024 sec)

#2.3 与doris的其他表进行关联查询
#假设有张用户表
MySQL [test_db]> select * from t_user;
+-----------+------+---------------------------------+
| username  | age  | addr                            |
+-----------+------+---------------------------------+
| zhangshan |   26 | 北京市西二旗街道1008号          |
+-----------+------+---------------------------------+
| lisi |   29 | 北京市西二旗街道1007号          |
+-----------+------+---------------------------------+
1 row in set (0.013 sec)

#与这张用表关联查询用户相关的信息
MySQL [test_db]> select u.* from (select t_str username  from test_type where g('match (v:test_type) where id(v)=="zhangshan" return v.test_type.t_str limit 1')) g left join t_user u on g.username=u.username;
+-----------+------+---------------------------------+
| username  | age  | addr                            |
+-----------+------+---------------------------------+
| zhangshan |   26 | 北京市西二旗街道1008号          |
+-----------+------+---------------------------------+
1 row in set (0.029 sec)

Checklist(Required)

  • Does it affect the original behavior
  • Has unit tests been added
  • Has document been added or modified
  • Does it need to update dependencies
  • Is this PR support rollback (If NO, please explain WHY)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot added area/planner Issues or PRs related to the query planner area/sql/function Issues or PRs related to the SQL functions area/vectorization kind/docs Categorizes issue or PR as related to documentation. labels Apr 18, 2023
@github-actions
Copy link
Copy Markdown
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor

Is it too simple to use g as a function name? The function that is also located in the ES is called es_query

for (int i = 0; i < columnCount; ++i) {
block.get(i)[curBlockRows] = resultSet.getObject(i + 1);
if (isGraph()) {
block.get(i)[curBlockRows] = tranferToDorisObject((ValueWrapper) resultSet.getObject(i + 1));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe can't at here to convert, it's will be exectue low efficiency

@chenlinzhong
Copy link
Copy Markdown
Contributor Author

Is it too simple to use g as a function name? The function that is also located in the ES is called es_query

In the field of graphs, the variable g is usually used to execute GQL, so it is best to keep their usage consistent and simple here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/planner Issues or PRs related to the query planner area/sql/function Issues or PRs related to the SQL functions area/vectorization kind/docs Categorizes issue or PR as related to documentation.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants