
Some questions about transformers #14

Closed
yclzju opened this issue Sep 8, 2021 · 9 comments

Comments

yclzju commented Sep 8, 2021

Hi, I'd like to ask: is calling the roformer models through transformers now exactly the same as using this repository?
If I want to use the roformer-sim models, which interface should I use? RoFormerForMaskedLM?
After loading with transformers' RoFormerForMaskedLM, I found that some parameters were not loaded (probably the pooler-related ones). For inference it works quite well out of the box, but when using it as a backbone for training, the loss won't go down (the same code trains fine with roberta). I wonder if that's because the pooler is missing. I couldn't find a roformer-sim example in your examples; could you help clear this up when you have time? Thanks!

JunnYu (Owner) commented Sep 8, 2021

@yclzju The roformer in the huggingface repository does not include pooling. The code in this repository adds pooling and can be loaded directly.

https://github.com/JunnYu/RoFormer_pytorch/blob/f0ca803094eab5dacb3f657d93b629d81a0981c0/src/roformer/modeling_roformer.py#L1086 — the pooling layer is added here.
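For context, the pooling layer linked above follows the familiar BERT-style design: a dense projection of the first ([CLS]) token's hidden state followed by an activation. A minimal standalone sketch (the 384 hidden size matches `roformer_chinese_sim_char_small`; a tanh activation is assumed here, as in BERT — the actual activation is configurable):

```python
import torch
import torch.nn as nn

# Minimal sketch of a BERT-style pooler: a dense projection of the
# first token's hidden state plus an activation (tanh assumed here).
class Pooler(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.Tanh()

    def forward(self, hidden_states):
        first_token = hidden_states[:, 0]  # (batch, hidden) slice of [CLS]
        return self.activation(self.dense(first_token))

sequence_output = torch.randn(1, 5, 384)  # (batch, seq_len, hidden)
pooled = Pooler(384)(sequence_output)
print(pooled.shape)  # torch.Size([1, 384])
```

When these weights are absent from a checkpoint (as with the huggingface roformer), the pooled vector comes from a randomly initialized layer, which would explain poor similarity behavior until it is trained.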

JunnYu (Owner) commented Sep 8, 2021

Install this repository's code: `pip install roformer==0.2.1`

```python
import torch
from roformer import RoFormerModel

model = RoFormerModel.from_pretrained("junnyu/roformer_chinese_sim_char_small", add_pooling_layer=True)
model.eval()
x = torch.tensor([[5, 6, 7, 8, 9]])
output = model(x)
print(output.pooler_output.shape)
# torch.Size([1, 384])
```

yclzju (Author) commented Sep 8, 2021

It works now.
I was originally loading the model with transformers and adding a CLS pooling operation myself. For evaluation it looked good on STS tasks, but during training the loss eventually stopped decreasing. After switching to the example you gave, it works again.

yclzju (Author) commented Sep 8, 2021

> Install this repository's code: `pip install roformer==0.2.1`
>
> ```python
> import torch
> from roformer import RoFormerModel
>
> model = RoFormerModel.from_pretrained("junnyu/roformer_chinese_sim_char_small", add_pooling_layer=True)
> model.eval()
> x = torch.tensor([[5, 6, 7, 8, 9]])
> output = model(x)
> print(output.pooler_output.shape)
> # torch.Size([1, 384])
> ```

Why is `RoFormerModel` used here rather than `RoFormerForMaskedLM`?

JunnYu (Owner) commented Sep 8, 2021

```python
self.roformer = RoFormerModel(config, add_pooling_layer=False)
```

As the line above shows, the masked-LM model does not enable the pooler. If you need it, you can modify the original code yourself and then pass `return_dict=False` when calling the model to get the pooler output.

yclzju (Author) commented Sep 14, 2021

Hi, thanks for the answer.
Do you plan to add the pooler in transformers as well? Or is there a relatively quick way to load the pooler parameters on top of the transformers model?

yclzju (Author) commented Sep 14, 2021

By the way, is most of the code in transformers the same as yours, e.g. `RoFormerSelfAttention`? So I should only need to modify a few classes such as `RoFormerModel` on top of transformers, right?

JunnYu (Owner) commented Sep 14, 2021

(1) The code in the transformers library is basically identical to the code here; it is only missing the pooler part.
(2) There is no good way. The best option is to use this repository's code (which adds the pooler) and just call it.

JunnYu (Owner) commented Apr 2, 2022

If you want generation results, download the roformer package below and you can generate them.

roformer.zip

```python
import torch
import numpy as np
from roformer import RoFormerTokenizer, RoFormerForCausalLM, RoFormerConfig

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
pretrained_model = "junnyu/roformer_chinese_sim_char_base"
tokenizer = RoFormerTokenizer.from_pretrained(pretrained_model)
config = RoFormerConfig.from_pretrained(pretrained_model)
config.is_decoder = True
config.eos_token_id = tokenizer.sep_token_id
config.pooler_activation = "linear"
model = RoFormerForCausalLM.from_pretrained(pretrained_model, config=config)
model.to(device)
model.eval()

def gen_synonyms(text, n=100, k=20):
    """Generate n sentences similar to `text`, then return the k most similar ones.
    Approach: generate with seq2seq, then score and rank with the encoder.
    """
    # Generate candidate similar sentences.
    r = []
    inputs1 = tokenizer(text, return_tensors="pt")
    for _ in range(n):
        inputs1.to(device)
        output = tokenizer.batch_decode(
            model.generate(**inputs1, top_p=0.95, do_sample=True, max_length=128),
            skip_special_tokens=True,
        )[0].replace(" ", "").replace(text, "")  # strip spaces and the original text
        r.append(output)

    # Rank the candidates by similarity to the original text.
    r = [i for i in set(r) if i != text and len(i) > 0]
    r = [text] + r
    inputs2 = tokenizer(r, padding=True, return_tensors="pt")
    with torch.no_grad():
        inputs2.to(device)
        outputs = model(**inputs2)
        Z = outputs.pooler_output.cpu().numpy()
    Z /= (Z**2).sum(axis=1, keepdims=True) ** 0.5  # L2-normalize each row
    argsort = np.dot(Z[1:], -Z[0]).argsort()       # most similar first

    return [r[i + 1] for i in argsort[:k]]

out = gen_synonyms("广州和深圳哪个好？")
print(out)
# ['深圳和广州哪个好？',
#  '广州和深圳哪个好',
#  '深圳和广州哪个好',
#  '深圳和广州哪个比较好。',
#  '深圳和广州哪个最好？',
#  '深圳和广州哪个比较好',
#  '广州和深圳那个比较好',
#  '深圳和广州哪个更好？',
#  '深圳与广州哪个好',
#  '深圳和广州，哪个比较好',
#  '广州与深圳比较哪个好',
#  '深圳和广州哪里比较好',
#  '深圳还是广州比较好？',
#  '广州和深圳哪个地方好一些？',
#  '广州好还是深圳好？',
#  '广州好还是深圳好呢？',
#  '广州与深圳哪个地方好点？',
#  '深圳好还是广州好',
#  '广州好还是深圳好',
#  '广州和深圳哪个城市好？']
```
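The ranking step inside `gen_synonyms` can be illustrated in isolation: each pooled vector is L2-normalized, so the dot product between rows is their cosine similarity; negating the products before `argsort` yields a most-similar-first order. A toy sketch with made-up 2-d vectors standing in for real pooler outputs:

```python
import numpy as np

# Toy pooled vectors: row 0 is the query, rows 1-3 are candidates.
Z = np.array([[1.0, 0.0],   # query sentence
              [0.9, 0.1],   # candidate 0: nearly the same direction
              [0.0, 1.0],   # candidate 1: orthogonal (least similar)
              [0.7, 0.7]])  # candidate 2: in between
Z /= (Z ** 2).sum(axis=1, keepdims=True) ** 0.5  # unit-normalize each row
# Negated cosine similarities, so ascending argsort = most similar first.
argsort = np.dot(Z[1:], -Z[0]).argsort()
print(argsort.tolist())  # [0, 2, 1]
```

The `r[i + 1]` indexing in the real function then maps these candidate indices back past the query at position 0.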

@JunnYu JunnYu closed this as completed Apr 7, 2022