
gpu benchmark does not support PyTorch 1.5.0. #77

Closed
gaoteng-git opened this issue Jun 8, 2020 · 6 comments · Fixed by #85 or #86
Labels
bug Something isn't working

Comments

@gaoteng-git

gaoteng-git commented Jun 8, 2020

Hello, I tried to use the latest PyTorch as the control group for the GPU benchmark, with the following configuration:
pytorch: 1.5.0
torchvision: 0.6.0
CUDA: 10.2
OS: Ubuntu 18.04
That is, I changed the corresponding line in Dockerfile.gpu to "conda install pytorch=1.5.0 torchvision=0.6.0 cudatoolkit=10.2 -c pytorch".

After building inside Docker, several test cases fail when running the tests:

Test project /tmp/build
Start 1: tt_core_test
1/12 Test #1: tt_core_test ..................... Passed 0.52 sec
Start 2: tt_kernels_test
2/12 Test #2: tt_kernels_test .................. Passed 29.18 sec
Start 3: bert_attention_test
3/12 Test #3: bert_attention_test ..............***Failed 4.50 sec
date time ( uptime ) [ thread name/id ] file:line v|
2020-06-08 13:10:51.358 ( 0.000s) [main thread ] loguru.cpp:610 INFO| arguments: turbo_transformers_cxx
2020-06-08 13:10:51.358 ( 0.000s) [main thread ] loguru.cpp:613 INFO| Current dir: /tmp/build/turbo_transformers/python
2020-06-08 13:10:51.358 ( 0.000s) [main thread ] loguru.cpp:615 INFO| stderr verbosity: 0
2020-06-08 13:10:51.358 ( 0.000s) [main thread ] loguru.cpp:616 INFO| -----------------------------------
FFFFFFFFFFFFFFFFFFFFFFFBertAttention "(1,010)" CPU Torch QPS, 492.80298203436234, time, 0.002029208500061941
BertAttention "(1,010)" CPU Turbo QPS, 1082.363535550833, time, 0.0009239040000466048

...

The following tests FAILED:
3 - bert_attention_test (Failed)
5 - bert_encoder_test (Failed)
6 - bert_intermediate_test (Failed)
7 - bert_layer_test (Failed)
8 - bert_model_test (Failed)
9 - bert_output_test (Failed)
10 - bert_pooler_test (Failed)

What is the highest PyTorch version currently supported by this GPU benchmark? And which PyTorch version were the benchmark results on the project's homepage compared against?

@feifeibear
Collaborator

The comparison currently uses 1.4.0. It is known that the results of tensor transpose and concat operations in PyTorch 1.5.0 are inconsistent with 1.4.0; whether this is a PyTorch bug or an intended change remains to be confirmed.
You could measure whether there is any performance difference between 1.4.0 and 1.5.0; there should be none.
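
A minimal sketch (not part of the original thread) of one way to compare the transpose and concat results across the two PyTorch versions: run the same deterministic script under 1.4.0 and 1.5.0, save the outputs, and diff the two files afterwards. The shapes and the file name are illustrative assumptions.

import torch

# Deterministic input so both PyTorch versions operate on identical data.
w = torch.arange(32, dtype=torch.float32).reshape(8, 4)

transposed = torch.clone(torch.t(w))      # the pattern used in from_torch
concatenated = torch.cat([w, w], dim=1)

# Save the per-version results; load both files in one interpreter later and
# compare them with torch.allclose() and .is_contiguous() to see what differs.
torch.save(
    {
        "transposed": transposed,
        "transposed_is_contiguous": transposed.is_contiguous(),
        "concatenated": concatenated,
    },
    "ops_" + torch.__version__ + ".pt",   # e.g. ops_1.5.0.pt
)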

@gaoteng-git
Author

The comparison currently uses 1.4.0. It is known that the results of tensor transpose and concat operations in PyTorch 1.5.0 are inconsistent with 1.4.0; whether this is a PyTorch bug or an intended change remains to be confirmed.
You could measure whether there is any performance difference between 1.4.0 and 1.5.0; there should be none.

Thanks! I tested it, and PyTorch 1.4.0 and 1.5.0 indeed show no performance difference on this GPU benchmark.

@feifeibear feifeibear added the bug Something isn't working label Jun 15, 2020
@feifeibear feifeibear changed the title from PyTorch version support in the gpu benchmark to gpu benchmark does not support PyTorch 1.5.0 Jun 15, 2020
@feifeibear feifeibear reopened this Jun 15, 2020
@feifeibear
Collaborator

The member function from_torch of the BertModelWithPooler and BertModel classes does not support PyTorch 1.5.0. In my opinion, the tensor transpose API of PyTorch is not stable. We use the following way to transpose weight matrices.

weight = torch.clone(torch.t(pooler_params['dense.weight']))

I have no idea why it does not work as expected in PyTorch 1.5.0.
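
One possible explanation (my assumption, not confirmed in this thread) is that clone() applied to a transposed view may preserve the view's strides in PyTorch 1.5 instead of returning a contiguous copy. A minimal sketch of a layout-independent alternative, with a dummy pooler_params standing in for the real parameters:

import torch

# Stand-in for the pooler_params mapping referenced above, just to keep the
# snippet self-contained; the real dict comes from the torch model being converted.
pooler_params = {"dense.weight": torch.randn(768, 768)}

# .contiguous() forces a dense, row-major copy of the transposed view, which
# does not depend on how clone() treats the view's strides in a given version.
weight = torch.t(pooler_params["dense.weight"]).contiguous()
assert weight.is_contiguous()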

@feifeibear feifeibear changed the title to the English gpu benchmark does not support PyTorch 1.5.0. Jun 15, 2020
@feifeibear feifeibear linked a pull request Jun 15, 2020 that will close this issue
@feifeibear
Collaborator

The bug will be fixed in version v0.3.0.

@feifeibear feifeibear linked a pull request Jun 28, 2020 that will close this issue
@feifeibear
Collaborator

The bug is fixed!

@gaoteng-git
Author

Great work!
