PaddlePaddle Hackathon 56 submission #1088

JunnYu · 2021-09-24T10:27:12Z

Task: #1074

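The weight files and related files are uploaded to Baidu Cloud. Link: https://pan.baidu.com/s/1dyaOShLEgnL_RJW4sbs40g, extraction code: kayr.
Some LayerNorm layers in the model did not have eps=1e-5 set.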
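To illustrate why the eps value matters when comparing a converted model against its reference, here is a minimal pure-Python sketch of layer normalization. It is not the code in modeling.py; the `layer_norm` helper, the input vector, and the second eps value (1e-12) are assumptions chosen only to make the effect visible.

```python
import math


def layer_norm(x, eps):
    """Normalize a 1-D list of floats using (x - mean) / sqrt(var + eps)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    scale = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * scale for v in x]


# Hypothetical low-variance activations: here eps is comparable to the variance,
# so the two settings produce clearly different outputs.
x = [0.001, 0.002, 0.003, 0.004]
out_a = layer_norm(x, eps=1e-5)   # the value named in this PR
out_b = layer_norm(x, eps=1e-12)  # illustrative alternative, not taken from the PR
max_diff = max(abs(a - b) for a, b in zip(out_a, out_b))
print(f"max abs difference between the two eps settings: {max_diff:.4f}")
```

For inputs with larger variance the gap shrinks, so a mismatch like this may only show up on some layers or some inputs.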
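Removed the name attribute of paddle.ParamAttr in GPTEmbeddings. When it is set, an error is raised saying the same name has been used more than once, so the unit tests fail and the model cannot be re-initialized in a Jupyter notebook.
Added compare.py to compare the prediction results of the converted models. (The file's location will be changed after the review is approved.)
Added convert.py, the conversion script. (The file's location will be changed after the review is approved.)
The error after converting microsoft-DialoGPT-small is somewhat large; the cause is probably similar to the earlier one. The other two models show normal error after conversion. (The same conversion code was used for all of them, so the discrepancy is not caused by a bug in the conversion code.)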
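As a rough sketch of the kind of check a comparison script can perform, the snippet below computes the maximum and mean absolute difference between two sets of outputs. It is not the compare.py from this PR; the helper names, the sample values, and the 1e-3 tolerance are all assumptions for illustration.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def mean_abs_diff(a, b):
    """Average element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical flattened logits from the reference model and the converted model.
reference = [0.1234, -1.5, 2.25, 0.0]
converted = [0.1235, -1.5, 2.2498, 0.0001]

TOLERANCE = 1e-3  # assumed threshold, not a value stated in this PR
worst = max_abs_diff(reference, converted)
average = mean_abs_diff(reference, converted)
within_tolerance = worst <= TOLERANCE
print(f"max abs diff: {worst:.6f}, mean abs diff: {average:.6f}, ok: {within_tolerance}")
```

Reporting both numbers helps separate a uniformly small drift from a few badly mismatched positions.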
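Added the two classes GPTForTokenClassification and GPTForSequenceClassification, with comments.
Added unit test code: TestElectraForSequenceClassification, TestGPTForTokenClassification.
The conversion script does not add lm_head.weight, because it is tied to the word embedding and therefore does not need to be converted. Modify the script yourself if you need it.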
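The weight-tying point can be sketched as follows, assuming a toy state dict held in plain Python lists. This is not the convert.py from this PR; apart from `lm_head.weight`, every name here (`convert_state_dict`, `lm_logits`, the embedding key) is hypothetical.

```python
def convert_state_dict(source_state, skip_keys=("lm_head.weight",)):
    """Copy a state dict, dropping keys that are tied to another parameter."""
    return {name: value for name, value in source_state.items()
            if name not in skip_keys}


def lm_logits(hidden, word_embedding):
    """Project one hidden vector onto the vocabulary by reusing the embedding rows."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in word_embedding]


# Toy weights: vocab_size=3, hidden_size=2. The key name below is hypothetical.
embedding = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
source_state = {
    "embeddings.word_embeddings.weight": embedding,
    "lm_head.weight": embedding,  # tied: the very same object as the embedding
}

converted = convert_state_dict(source_state)
logits = lm_logits([1.0, 2.0], converted["embeddings.word_embeddings.weight"])
print(sorted(converted), logits)
```

Because the output projection reuses the embedding matrix, dropping the tied key loses no information in this sketch.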

paddlenlp/transformers/gpt/modeling.py

yingyibiao · 2021-09-27T11:46:14Z

Please upload merges.txt via Baidu Cloud as well.

JunnYu · 2021-09-27T13:03:58Z

@yingyibiao The Baidu Cloud folder already contains merges.txt. All files under community/junnyu have now been deleted.

yingyibiao · 2021-10-13T11:01:56Z

The relevant files still need to be uploaded under community/junnyu. For details, see:
https://paddlenlp.readthedocs.io/zh/latest/community/contribute_models/contribute_awesome_pretrained_models.html

yingyibiao · 2021-10-18T04:39:49Z

The weights have been uploaded to BOS.

yingyibiao · 2021-10-18T04:41:20Z

Please fix similar issues here, following the review comments on #1085.

community/junnyu/uer-gpt2-chinese-poem/README.md

community/junnyu/microsoft-DialoGPT-small/README.md

yingyibiao · 2021-10-18T12:13:58Z

While you are at it, please also import the DialoGPT-medium and DialoGPT-large weights.

yingyibiao · 2021-10-19T03:30:56Z

https://github.com/PaddlePaddle/PaddleNLP/blob/develop/docs/model_zoo/transformers.rst
This file also needs to be updated accordingly.

ZHUI

Very nice!
Could you also add some practical usage examples for GPTForTokenClassification and GPTForSequenceClassification to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/examples/language_model/gpt/ ?

paddlenlp/transformers/gpt/modeling.py

ZHUI · 2021-10-19T08:45:11Z

paddlenlp/transformers/gpt/modeling.py

+        return logits
+
+
+class GPTForSequenceClassification(GPTPretrainedModel):


GPTForTokenClassification
GPTForSequenceClassification
Could an example for these two be added to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/examples/language_model/gpt/ ?

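GPTForTokenClassification is basically the same as BertForTokenClassification, except that the model does not take token type ids as input.

The GPTForSequenceClassification example is now done.

An example that uses GPTForTokenClassification for NER has now been added, but the results turned out to be very poor.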

JunnYu · 2021-10-22T11:14:49Z

JunnYu · 2021-10-22T11:35:20Z

@yingyibiao
The large and medium weights are here:
Link: https://pan.baidu.com/s/1cdqfjHVrE6CwBvYW5692rg
Extraction code: 83jh

docs/model_zoo/transformers.rst

yingyibiao · 2021-10-24T07:18:58Z

Uploaded.

yingyibiao · 2021-10-24T09:08:49Z

LGTM for community related files

ZHUI

LGTM

ZHUI · 2021-11-01T10:45:43Z

examples/language_model/gpt/README.md

+Precision                     | 0.484939    |
+Recall                        | 0.634716    |
+F1                            | 0.549810    |
+


These results seem somewhat low.
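As a quick sanity check on the table quoted above, F1 can be recomputed from the reported precision and recall as their harmonic mean. This is a minimal sketch; the `f1_score` helper is written here for illustration and is not the metric code used by the example script.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the README diff under review.
precision = 0.484939
recall = 0.634716
f1 = f1_score(precision, recall)
print(f"F1 recomputed from the table: {f1:.6f}")
```

The recomputed value agrees with the reported F1 to within rounding, so the three figures are internally consistent. Precision is notably lower than recall.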

ZeyuChen · 2021-11-02T06:56:43Z

examples/language_model/gpt/README.md

+
+After fine-tuning `gpt-cpm-small-cn-distill` on the MSRA NER task, the results on the validation set are as follows:
+
+ Metric                       | Result      |


Is there any paper supporting the claim that GPT can handle token classification? @ZHUI

yingyibiao

LGTM

JunnYu added 2 commits September 24, 2021 18:17

update

cf2d21c

Merge branch 'develop' into update_gpt

00cf50a

JunnYu mentioned this pull request Sep 24, 2021

[PaddlePaddle Hackathon] Task overview PaddlePaddle/Paddle#35940

Closed

JunnYu added 2 commits September 24, 2021 18:29

update

f4827d0

Merge branch 'update_gpt' of github.com:JunnYu/PaddleNLP into update_gpt

8e6bf70

ZHUI self-requested a review September 26, 2021 02:36

ZHUI requested changes Sep 26, 2021

paddlenlp/transformers/gpt/modeling.py (outdated)

ZeyuChen added the Hackathon label Sep 26, 2021

remove community/junnyu

ae4b490

yingyibiao assigned ZHUI Sep 28, 2021

Update test_modeling.py

5b66fa2

JunnYu added 3 commits October 13, 2021 23:40

Merge branch 'develop' into update_gpt

4b23a50

suggestion from ZHUI

540dfe0

add community/junnyu

c156b0a

JunnYu requested a review from ZHUI October 13, 2021 16:25

rm gpt link

68b33cd

yingyibiao reviewed Oct 18, 2021

community/junnyu/uer-gpt2-chinese-poem/README.md (outdated)

community/junnyu/microsoft-DialoGPT-small/README.md (outdated)

Merge branch 'develop' into update_gpt

563aee0

ZHUI reviewed Oct 19, 2021

yingyibiao and others added 3 commits October 20, 2021 19:03

Merge branch 'develop' into update_gpt

bf8ea3d

Merge branch 'develop' into update_gpt

8f5c136

update

52b5d0c

JunnYu added 2 commits October 22, 2021 19:19

update readme

d2720ac

add large medium

aa6f747

JunnYu added 2 commits October 22, 2021 19:40

Update the number of weights

c9fcf6b

update gpt compare

94f45e9

yingyibiao reviewed Oct 24, 2021

docs/model_zoo/transformers.rst (outdated)

docs/model_zoo/transformers.rst (outdated)

update docs

23b5469

yingyibiao requested a review from ZHUI October 25, 2021 01:55

yingyibiao and others added 3 commits October 25, 2021 11:04

Merge branch 'develop' into update_gpt

fd8c51b

add msra ner example

3f74bb9

Merge branch 'develop' into update_gpt

2e430d3

ZHUI approved these changes Nov 1, 2021

yingyibiao and others added 2 commits November 1, 2021 18:56

Merge branch 'develop' into update_gpt

e8d23a1

Merge branch 'develop' into update_gpt

cbe10ac

ZeyuChen added this to In progress in PaddleNLP v2.2 via automation Nov 2, 2021

ZeyuChen reviewed Nov 2, 2021

Merge branch 'develop' into update_gpt

5139361

yingyibiao approved these changes Nov 9, 2021

PaddleNLP v2.2 automation moved this from In progress to Reviewer approved Nov 9, 2021

yingyibiao merged commit 844f24c into PaddlePaddle:develop Nov 9, 2021

PaddleNLP v2.2 automation moved this from Reviewer approved to Done Nov 9, 2021

JunnYu deleted the update_gpt branch November 9, 2021 08:32
