
How do I use this model? #2

Closed
Minjuner-97 opened this issue Jan 5, 2021 · 3 comments

Comments

@Minjuner-97
1. Using the pipeline method raises `TypeError: not a string` — how can this be fixed?
2. If the pipeline method cannot be used, and my data consists of question + passage pairs, how should I use this model?
Any advice would be greatly appreciated!

@Minjuner-97
Author

Alternatively, could the author share the source code? Many thanks!

@uecah

uecah commented Mar 13, 2021

# coding=utf-8
import torch
import numpy as np
from transformers import AutoModelForQuestionAnswering, BertTokenizer

model_path = "./model/albert-chinese-large-qa"

# The checkpoint ships a BERT-style vocab file, so BertTokenizer is used here.
# Loading it with AlbertTokenizer (which expects a SentencePiece model) is a
# likely source of the "TypeError: not a string" mentioned above.
model = AutoModelForQuestionAnswering.from_pretrained(model_path)
tokenizer = BertTokenizer.from_pretrained(model_path)

# question, text = "伯克利的家乡在哪", "我叫伯克利来自美国加州。"
question, text = "鲁迅在哪上学", "鲁迅(1881年9月25日~1936年10月19日),原名周樟寿,后改名周树人,字豫山,后改字豫才,浙江绍兴人。著名文学家、思想家、革命家、民主战士,新文化运动的重要参与者,中国现代文学的奠基人之一。早年与厉绥之和钱均夫同赴日本公费留学,于日本仙台医科专门学校肄业。“鲁迅”,1918年发表《狂人日记》时所用的笔名,也是最为广泛的笔名"

inputs = tokenizer(question, text, return_tensors="pt")

# For training, pass gold span positions to get a loss instead:
#   outputs = model(**inputs, start_positions=torch.tensor([1]),
#                   end_positions=torch.tensor([3]))
#   loss = outputs.loss

# Inference: take the argmax of the start/end logits as the answer span.
with torch.no_grad():
    outputs = model(**inputs)
start_logits = outputs.start_logits.cpu().numpy()
end_logits = outputs.end_logits.cpu().numpy()

all_tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
answer = "".join(all_tokens[np.argmax(start_logits, 1)[0]:np.argmax(end_logits, 1)[0] + 1])

print(all_tokens)
print(answer)

It works.
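The span-extraction step at the end (argmax over start/end logits) can be illustrated with plain NumPy, independent of the model. The tokens and logit values below are made-up stand-ins for what the tokenizer and model would produce:

```python
import numpy as np

# Hypothetical tokenized "[CLS] question [SEP] context [SEP]" sequence and
# per-token model outputs (shape: [1, seq_len] in the real model).
tokens = ["[CLS]", "家", "乡", "[SEP]", "伯", "克", "利",
          "来", "自", "美", "国", "加", "州", "[SEP]"]
start_logits = np.full(len(tokens), -5.0)
end_logits = np.full(len(tokens), -5.0)
start_logits[9] = 4.0   # highest start score at "美"
end_logits[12] = 4.0    # highest end score at "州"

# Same extraction rule as in the snippet above: argmax of each logit vector
# picks the start and (inclusive) end of the answer span.
start = int(np.argmax(start_logits))
end = int(np.argmax(end_logits))
answer = "".join(tokens[start:end + 1])
print(answer)  # → 美国加州
```

In practice you would also mask out positions belonging to the question and enforce `start <= end`; the plain argmax shown here is the minimal version used in the snippet above.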

@wptoux
Owner

wptoux commented Mar 18, 2021

Yes, this was fixed a few days ago; it works now. #3

huggingface/transformers#9850
