What processing does the LDBC dataset need? #11

Sean58238 · 2021-06-17T04:10:12Z

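The section on importing the LDBC dataset for testing is not very detailed, so I have a few questions.
1. LDBC dataset
Does the LDBC dataset currently need any special processing? When I asked on the forum earlier, the reply was: nebula-bench has not been updated yet, so the simplest approach is to generate the data with ldbc v0.3.3 and then remove the first line of each csv. Do we still need to modify the data ourselves? Was the v2.0.1 test report also based on ldbc v0.3.3?

2.
nebula-bench has a merger step for processing the LDBC data. We observed that it only modifies the updateStream.csv file, but the yaml configuration does not seem to use this csv file.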

HarrisChu · 2021-06-17T05:36:14Z

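Q1:
You only need to remove the first line and then set up the importer configuration file. No other changes to the data are needed.

Q2:
The merger combines the csv files that LDBC generates in the 0_0 format.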
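
The header removal described in Q1 could be scripted roughly as follows. This is a minimal sketch, not part of nebula-bench: the function name `strip_header`, the file names, and the column header in the sample are hypothetical, and the sample assumes the pipe-delimited layout that LDBC csv files use.

```python
import tempfile
from pathlib import Path


def strip_header(src: Path, dst: Path) -> int:
    """Copy src to dst without its first line; return the number of data rows written."""
    rows = 0
    with src.open("r", encoding="utf-8", newline="") as fin, \
            dst.open("w", encoding="utf-8", newline="") as fout:
        next(fin, None)  # drop the header row; does nothing if the file is empty
        for line in fin:
            fout.write(line)
            rows += 1
    return rows


if __name__ == "__main__":
    # Hypothetical sample file; the header names are an assumption for illustration.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "forum_0_0.csv"
        dst = Path(tmp) / "forum.csv"
        src.write_text(
            "id|title|creationDate\n"
            "0|Wall of Mahinda Perera|2010-02-14T15:32:20.447+0000\n",
            encoding="utf-8",
        )
        print(strip_header(src, dst), "data row(s) kept")
```

Writing to a separate destination file keeps the generated data intact, so the step can be repeated if the importer configuration changes.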

HarrisChu · 2021-06-17T05:41:51Z

Q1:
Just delete the first line and then set up the importer configuration file; there is no need to modify any other data.

Q2:
The merger script is used to merge the csv files.
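
To illustrate what merging such files can look like, here is a rough sketch. It is not the nebula-bench merger script. It assumes part files are named `<entity>_<i>_<j>.csv` (for example `person_0_0.csv`, `person_1_0.csv`) and that every part starts with the same header row. The names `group_parts` and `merge_parts` are hypothetical.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Assumed naming scheme: <entity>_<i>_<j>.csv
PART_RE = re.compile(r"^(?P<entity>.+?)_(?P<part>\d+)_(?P<sub>\d+)\.csv$")


def group_parts(directory: Path) -> Dict[str, List[Path]]:
    """Group part files by entity name, ordered numerically by (i, j)."""
    groups = defaultdict(list)
    for path in directory.glob("*.csv"):
        match = PART_RE.match(path.name)
        if match:
            key = (int(match.group("part")), int(match.group("sub")))
            groups[match.group("entity")].append((key, path))
    return {entity: [p for _, p in sorted(items)] for entity, items in groups.items()}


def merge_parts(parts: List[Path], dst: Path, has_header: bool = True) -> int:
    """Concatenate parts into dst, keeping the header once; return the data row count."""
    rows = 0
    with dst.open("w", encoding="utf-8", newline="") as fout:
        for index, part in enumerate(parts):
            with part.open("r", encoding="utf-8", newline="") as fin:
                if has_header:
                    header = fin.readline()
                    if index == 0:
                        fout.write(header)
                for line in fin:
                    # Guard against a part whose last line lacks a newline.
                    fout.write(line if line.endswith("\n") else line + "\n")
                    rows += 1
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "person_0_0.csv").write_text("id|firstName\n1|A\n", encoding="utf-8")
        (base / "person_1_0.csv").write_text("id|firstName\n2|B\n", encoding="utf-8")
        out_dir = base / "merged"
        out_dir.mkdir()
        for entity, parts in group_parts(base).items():
            print(entity, merge_parts(parts, out_dir / f"{entity}.csv"))
```

Pass `has_header=False` if the first line has already been removed from every part, since otherwise one data row per part would be dropped.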

Sean58238 · 2021-06-24T07:36:19Z

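A question: if I run performance tests on a single machine, for example to verify how different SSD products affect nebula's performance, how many meta, storaged and graphd instances do you recommend deploying? Is it feasible to start one of each?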

HarrisChu · 2021-06-24T07:52:16Z

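The impact of different SSDs on nebula is mainly in storage. Starting one of each is fine.
If you have only a few SSDs to test, you can also keep using the same meta and storaged, put only graphd on the SSD under test, and run the tests that way.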

Sean58238 · 2021-06-24T08:34:46Z

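Thanks. Also, nebula-importer can be used to measure data import performance. LDBC contains a lot of data; for example, there are many tables under dynamic. For an import performance test, do you recommend importing all the tables together, or is it enough to pick one or a few of them?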

HarrisChu · 2021-06-24T09:07:30Z

It depends on your scenario.
If you only want to test import performance and simple queries, you can import just one file.
If you want to test more complex queries, you can import all the files, e.g. person -> KNOWS -> person -> created -> POST.

Sean58238 · 2021-06-24T10:57:21Z

I have a question about importing.
1. Create the space:
CREATE SPACE IF NOT EXISTS importer_test2(partition_num=5, replica_factor=1, vid_type=FIXED_STRING(100));

2. Create the schema for Forum:
CREATE TAG IF NOT EXISTS Forum(title string,creationDate string);

3. Import the data
Two kinds of errors occur during the import, and both seem to be related to the vertex id.

INSERT VERTEX Forum(title,creationDate) VALUES 0: ("Wall of Mahinda Perera","2010-02-14T15:32:20.447+0000");
ErrMsg: SemanticError: No schema found for `Forum', ErrCode: -12

INSERT VERTEX Forum(title,creationDate) VALUES 2199023255564: ("Album 11 of Mahinda Perera","2012-09-08T16:20:33.879+0000");
ErrMsg: Wrong vertex id type: 2199023255564, ErrCode: -8

HarrisChu · 2021-06-24T11:06:56Z

Question 1: please refer to https://docs.nebula-graph.com.cn/2.0.1/5.configurations-and-logs/1.configurations/3.graph-config/#networking

You need to wait for graphd to sync the schema.
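
One way to apply this in a script is to retry the first statement for a short while after creating the tag. The sketch below is only an illustration: `retry_until_ok` and `make_flaky_insert` are hypothetical helpers, and the `run` callable stands in for whatever client call actually executes the nGQL statement and reports success plus an error message.

```python
import time
from typing import Callable, Tuple


def retry_until_ok(
    run: Callable[[], Tuple[bool, str]],
    attempts: int = 5,
    delay_secs: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call run() until it succeeds; return the attempt number that succeeded."""
    last_msg = ""
    for attempt in range(1, attempts + 1):
        ok, last_msg = run()
        if ok:
            return attempt
        if attempt < attempts:
            sleep(delay_secs)  # give graphd time to pick up the new schema
    raise RuntimeError(f"still failing after {attempts} attempts: {last_msg}")


def make_flaky_insert(failures: int) -> Callable[[], Tuple[bool, str]]:
    """Stand-in for a real client call: fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def run() -> Tuple[bool, str]:
        state["calls"] += 1
        if state["calls"] <= failures:
            return False, "SemanticError: No schema found for `Forum'"
        return True, ""

    return run


if __name__ == "__main__":
    used = retry_until_ok(make_flaky_insert(2), attempts=5, delay_secs=0.1)
    print("succeeded on attempt", used)
```

A fixed pause after `CREATE TAG` would also work; the retry loop simply avoids guessing how long the sync takes.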

Question 2: your space's VID type is FIXED_STRING, so the VID must be quoted:
INSERT VERTEX Forum(title,creationDate) VALUES '2199023255564': ("Album 11 of Mahinda Perera","2012-09-08T16:20:33.879+0000");
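
When statements are generated by a script, the quoting rule above can be applied based on the space's `vid_type`. This is a minimal sketch with hypothetical helper names (`format_vid`, `insert_forum`). It uses double quotes for the VID, which nGQL accepts in the same way as single quotes, and it assumes the property values contain no quote characters.

```python
def format_vid(raw, vid_type: str) -> str:
    """Render a vertex id as an nGQL literal matching the space's vid_type."""
    kind = vid_type.strip().upper()
    if kind.startswith("FIXED_STRING"):
        text = str(raw)
        # Escaping rules are deliberately not guessed here: reject awkward input.
        if '"' in text or "\\" in text:
            raise ValueError(f"unsupported character in VID: {text!r}")
        return f'"{text}"'
    if kind == "INT64":
        return str(int(raw))
    raise ValueError(f"unknown vid_type: {vid_type!r}")


def insert_forum(vid, title: str, creation_date: str, vid_type: str) -> str:
    """Build the INSERT VERTEX statement from the thread (values assumed quote-free)."""
    return (
        "INSERT VERTEX Forum(title,creationDate) VALUES "
        f'{format_vid(vid, vid_type)}: ("{title}","{creation_date}");'
    )


if __name__ == "__main__":
    print(
        insert_forum(
            2199023255564,
            "Album 11 of Mahinda Perera",
            "2012-09-08T16:20:33.879+0000",
            "FIXED_STRING(100)",
        )
    )
```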

HarrisChu · 2021-06-24T11:07:24Z

Closing the issue. If you have other questions, please raise a new one.

HarrisChu closed this as completed Jun 24, 2021

This was referenced Jun 27, 2021

[test] Weekly Report 2021-06-27 vesoft-inc/nebula-community#11

Closed

Weekly Report 2021-06-27 vesoft-inc/nebula-community#12

Closed
