add mt5 #382
if attention_mask is not None:
    attention_scores = flow.mul(attention_scores, attention_mask)
    attention_scores = attention_scores - 10000.0 * (1 - attention_mask)
    # TODO(xingyu.liao): graph will occur `where_scalar` errors
Could the comment here be reworded?
OK.
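For readers following the thread, the masking in the snippet above (multiply the scores by `attention_mask`, then subtract `10000.0 * (1 - attention_mask)`) can be illustrated on a single row of scores. This is only a minimal pure-Python sketch, not the OneFlow code from the PR: `mask_scores`, `softmax`, and the sample values are hypothetical names and data chosen for illustration.

```python
import math


def mask_scores(scores, mask, penalty=10000.0):
    """Zero out masked scores, then push them down by a large penalty.

    mask holds 1.0 for positions to keep and 0.0 for positions to hide.
    (Hypothetical helper; mirrors the arithmetic in the reviewed snippet.)
    """
    return [s * m - penalty * (1.0 - m) for s, m in zip(scores, mask)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 3.0]   # illustrative raw attention scores
mask = [1.0, 1.0, 0.0]     # hide the last position

masked = mask_scores(scores, mask)
probs = softmax(masked)
```

After the softmax, the hidden position receives effectively zero weight while the kept positions still sum to one, which is the purpose of the large negative offset.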
@@ -75,6 +75,12 @@ def _convert_state_dict(self, flow_state_dict, cfg):
prefix1 + "decoder.final_layer_norm.weight"
)

# Convert MT5's lm_head
if cfg.model_type == "mt5":
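One question: this t5_loader.py lives inside libai, but this branch appears to specifically support the loader written for mt5 under projects. Wouldn't it be more reasonable to put it in the projects/MT5 folder?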
OK.
projects/MT5/layers/embed_layer.py
Outdated
)
)
self.init_method(self.weight)
# FIXME(lxy): Fill padding_idx is not supported in nd_sbp right now.
The comment here could be cleaned up as well.
MT5 release plan for 0.3.0
discussions:#380