add mt5 #382

xiezipeng-ML · 2022-09-09T03:39:59Z

MT5 release plan for 0.3.0

Discussions: #380

xiezipeng-ML · 2022-09-10T02:04:05Z

add training config
add MT5 model
add model config
add model test
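Sync the T5 Loader: the MT5 model loads the lm_head weights, while the T5 model shares the embedding layer weights
Generator test: load the huggingface weights, then check the generation quality
Plan to replace the T5 model part of the IDEA project with the model from this branch to avoid duplicating the model. This branch serves as the "main" for MT5 in libai, and the IDEA delivery project, as a project, can use the model from here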

@CPFLAME @strint

xiezipeng-ML · 2022-09-10T08:30:46Z

Model unit test

Loader unit test

xiezipeng-ML · 2022-09-11T07:41:30Z

T5 test on the generator

Translation task test

CPFLAME · 2022-09-15T07:50:46Z

projects/MT5/layers/attention_layer.py

+        if attention_mask is not None:
+            attention_scores = flow.mul(attention_scores, attention_mask)
+            attention_scores = attention_scores - 10000.0 * (1 - attention_mask)
+            # TODO(xingyu.liao): graph will occur `where_scalar` errors


Could the comment here be revised?
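
For reference, the two masking lines in this hunk multiply the scores by the mask and then subtract 10000 at the masked positions, so those positions get close to zero weight after softmax. A minimal NumPy sketch of that idea follows. It is not the OneFlow code from the PR, and `apply_additive_mask` and `softmax` are illustrative names only:

```python
import numpy as np


def apply_additive_mask(scores, mask, penalty=10000.0):
    """Zero out masked scores, then push them down by a large penalty.

    `mask` holds 1.0 for positions to keep and 0.0 for positions to hide.
    """
    scores = scores * mask
    return scores - penalty * (1.0 - mask)


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


scores = np.array([[2.0, 1.0, 3.0]])
mask = np.array([[1.0, 1.0, 0.0]])  # hide the last position

masked = apply_additive_mask(scores, mask)
probs = softmax(masked)
print(masked)
print(probs)
```

Because the mask is applied arithmetically, no `where`-style op is needed, which appears to be what the TODO in the hunk is working around.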

CPFLAME · 2022-09-15T07:55:10Z

libai/models/utils/model_utils/t5_loader.py

@@ -75,6 +75,12 @@ def _convert_state_dict(self, flow_state_dict, cfg):
            prefix1 + "decoder.final_layer_norm.weight"
        )

+        # Convert MT5's lm_head
+        if cfg.model_type == "mt5":


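One question: this t5_loader.py lives in the libai package,

but this change looks like it specifically supports the loader written for MT5 under projects.

Wouldn't it be more reasonable to put it under the projects/MT5 folder?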
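
To make the branch under discussion concrete, here is a rough sketch of a converter that treats the output projection differently per model type, following the PR description (MT5 loads its own lm_head weights, T5 shares the embedding weights). The function name and the state dict keys are hypothetical and do not reflect the actual layout in t5_loader.py:

```python
def convert_state_dict(src, model_type):
    """Hypothetical converter: copy the embedding, then pick the output head.

    `src` maps parameter names to weights; plain lists stand in for tensors.
    """
    out = {"embedding.weight": src["shared.weight"]}
    if model_type == "mt5":
        # MT5: the checkpoint carries a separate lm_head weight, so load it.
        out["lm_head.weight"] = src["lm_head.weight"]
    else:
        # T5: the output projection reuses the embedding weight (tied weights).
        out["lm_head.weight"] = out["embedding.weight"]
    return out


t5 = convert_state_dict({"shared.weight": [1, 2]}, "t5")
mt5 = convert_state_dict(
    {"shared.weight": [1, 2], "lm_head.weight": [3, 4]}, "mt5"
)
print(t5["lm_head.weight"] is t5["embedding.weight"])
print(mt5["lm_head.weight"])
```

Keeping the `model_type` check in one shared loader avoids duplicating the rest of the conversion, at the cost of the core library knowing about a project-level model.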

CPFLAME · 2022-09-15T07:57:15Z

projects/MT5/layers/embed_layer.py

+            )
+        )
+        self.init_method(self.weight)
+        # FIXME(lxy): Fill padding_idx is not supported in nd_sbp right now.


The comment here could be cleaned up as well
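
For context on what the FIXME refers to: embedding layers with a `padding_idx` conventionally zero the weight row at that index after initialization (as `torch.nn.Embedding` does). A small NumPy sketch of that step, assuming a normal initializer; `init_embedding` and its parameters are illustrative and are not taken from embed_layer.py:

```python
import numpy as np


def init_embedding(vocab_size, hidden_size, padding_idx=None, seed=0):
    """Initialize an embedding table and zero the padding row if one is given."""
    rng = np.random.default_rng(seed)
    weight = rng.normal(0.0, 0.02, size=(vocab_size, hidden_size))
    if padding_idx is not None:
        # Padding tokens should contribute nothing, so their row is all zeros.
        weight[padding_idx] = 0.0
    return weight


weight = init_embedding(vocab_size=8, hidden_size=4, padding_idx=0)
print(weight[0])
```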

xiezipeng-ML added 2 commits September 9, 2022 03:37

add mt5

73e9622

refine

f522b6a

xiezipeng-ML and others added 4 commits September 10, 2022 02:13

update T

e44ebca

reformat

8caeeda

add mt5 test

1defcb1

add model test

26c20ce

add mt5 loader test

5b3a79a

xiezipeng-ML requested review from oneflow-ci-bot and CPFLAME September 15, 2022 06:25

xiezipeng-ML added 4 commits September 15, 2022 06:33

Merge branch 'main' of https://github.com/Oneflow-Inc/libai into MT5

f2353e0

Merge branch 'MT5' of https://github.com/Oneflow-Inc/libai into MT5

2c0199e

reformat

eb23d2d

refine

297f43e

xiezipeng-ML requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 15, 2022 06:41

CPFLAME reviewed Sep 15, 2022


xiezipeng-ML added 2 commits September 15, 2022 09:45

refine

6382f79

delete comment

e4d7a2f

CPFLAME approved these changes Sep 15, 2022


xiezipeng-ML requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 15, 2022 11:49

xiezipeng-ML merged commit b9ee884 into main Sep 15, 2022

xiezipeng-ML deleted the MT5 branch September 15, 2022 11:54
