Error when loading Xu Liang's ALBERT #31

Closed
yuhao1982 opened this issue Nov 13, 2019 · 16 comments

@yuhao1982

When I use example/task_sentiment_albert.py to load Xu Liang's tiny ALBERT, I get the following error:
ValueError: Layer weight shape (21128, 312) not compatible with provided weight shape (21128, 128)
Loading it with keras_bert gives the same error. What is going on?

@bojone
Owner

bojone commented Nov 13, 2019

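I think the README already states this clearly enough:

(Note: Xu Liang's ALBERT was open-sourced before Google's ALBERT, so the weights of the early Xu Liang releases are not fully consistent with Google's; in other words, the two cannot be swapped for each other directly. To reduce code redundancy, bert4keras 0.2.4 and later only support loading the Google version and those Xu Liang weights whose names contain "Google". To load the earlier weights, please use version 0.2.3.)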
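
The version rule in the README note can be written down as a small check. This is only a sketch: the helper names `parse_version` and `supports_early_xuliang_weights` are hypothetical and not part of bert4keras, and treating every release before 0.2.4 as able to load the early weights is an assumption (the note only names 0.2.3 explicitly).

```python
def parse_version(version):
    """Turn a dotted version string such as '0.2.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def supports_early_xuliang_weights(version):
    """Return True if this bert4keras version should load the early weights.

    Per the README note, 0.2.4 and later only load Google-format weights.
    Assumption: everything up to and including 0.2.3 loads the early ones.
    """
    return parse_version(version) <= (0, 2, 3)


for v in ("0.2.3", "0.2.4"):
    print(v, supports_early_xuliang_weights(v))
```

In practice the string to pass in would be the installed `bert4keras.__version__`.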

@yuhao1982
Author

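Sorry, I missed that. One more question: does keras_bert (https://github.com/CyberZHG/keras-bert) not support loading ALBERT models? Thanks! I ask because I want to reuse some k-fold cross-validation code, although adapting it directly on top of yours should also be easy, right?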

@bojone
Owner

bojone commented Nov 13, 2019

keras_bert does not support it at the moment.

@yuhao1982
Author

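Hello, one more question: have you ever run into this error? I am using Google's pretrained ALBERT with bert4keras 0.2.4, and I modified the code as required in https://github.com/bojone/bert4keras/issues/29, but I get the following error: ValueError: Error when checking target: expected dense_7 to have 2 dimensions, but got array with shape (12, 1, 3). I have not been able to find the cause. Could you please take a look? Thanks!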

@yuhao1982
Author

The full output is:
Epoch 1/50
Traceback (most recent call last):
  File "task_sentiment_albert.py", line 533, in <module>
    train_model_pred, test_model_pred, val_f1_list = run_cv(5, DATA_LIST, DATA_X, DATA_Y, DATA_LIST_TEST)
  File "task_sentiment_albert.py", line 472, in run_cv
    callbacks=mycallbacks,# + [plateau, checkpoint],#[early_stopping, plateau, checkpoint],
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1732, in fit_generator
    initial_epoch=initial_epoch)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training_generator.py", line 220, in fit_generator
    reset_metrics=False)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1508, in train_on_batch
    class_weight=class_weight)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 621, in _standardize_user_data
    exception_prefix='target')
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training_utils.py", line 135, in standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking target: expected dense_7 to have 2 dimensions, but got array with shape (12, 1, 3)

@bojone
Owner

bojone commented Nov 13, 2019

@yuhao1982 According to the message, the shape of your training data does not match the shape of the model output. This has nothing to do with bert4keras.
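
To illustrate the kind of mismatch the message describes, here is a minimal NumPy sketch. The thread does not say how the issue was actually fixed, so this is only one plausible repair: it assumes the targets carry a stray axis of length 1 and that the output layer expects `(batch, num_classes)`. The array `y_batch` and the helper `to_2d_targets` are hypothetical.

```python
import numpy as np

# Hypothetical batch shaped like the one in the error message:
# 12 samples, a stray axis of length 1, then 3 one-hot classes.
y_batch = np.zeros((12, 1, 3), dtype="float32")
y_batch[:, 0, 0] = 1.0


def to_2d_targets(y):
    """Drop a singleton middle axis so targets become (batch, num_classes)."""
    y = np.asarray(y)
    if y.ndim == 3 and y.shape[1] == 1:
        return y[:, 0, :]
    return y


y_fixed = to_2d_targets(y_batch)
print(y_batch.shape, "->", y_fixed.shape)
```

Printing the shape of one batch from the data generator before calling `fit_generator` is usually the quickest way to spot this kind of problem.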

@yuhao1982
Author

Solved.
But now there is another problem: after a few epochs the code suddenly raises StopIteration:
Traceback (most recent call last):
  File "task_sentiment_albert.py", line 595, in <module>
    train_model_pred, test_model_pred, val_f1_list = run_cv(5, DATA_LIST, DATA_X, DATA_Y, DATA_LIST_TEST)
  File "task_sentiment_albert.py", line 534, in run_cv
    callbacks=mycallbacks # + [plateau, checkpoint],#[early_stopping, plateau, checkpoint],
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1732, in fit_generator
    initial_epoch=initial_epoch)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training_generator.py", line 242, in fit_generator
    workers=0)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1791, in evaluate_generator
    verbose=verbose)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training_generator.py", line 365, in evaluate_generator
    generator_output = next(output_generator)
StopIteration

So the training data generator stopped. What is the reason?
I am using the Google ALBERT. Following your answer, is the code below correct? (I wondered whether a mistake here was the cause.)

config_path = '/root/kg/bert/albert_base_en_tfhub/albert_config.json'
checkpoint_path = '/root/kg/bert/albert_base_en_tfhub/variables/variables'
spm_path = '/root/kg/bert/albert_base_en_tfhub/assets/30k-clean.model'

tokenizer = SpTokenizer(spm_path)
model = build_bert_model(config_path, checkpoint_path, albert=True)

token_ids, segment_ids = tokenizer.encode(first_text=text, max_length=maxlen)

Could you please help with this? Thanks.

@yuhao1982
Author

I just tried version 0.2.3 and found that Xu Liang's albert tiny still gives the same problem:
  File "task_sentiment_albert.py", line 123, in <module>
    albert=True
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/bert4keras-0.2.3-py3.6.egg/bert4keras/bert.py", line 319, in build_bert_model
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/bert4keras-0.2.3-py3.6.egg/bert4keras/bert.py", line 188, in load_weights_from_checkpoint
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/base_layer.py", line 1126, in set_weights
    'provided weight shape ' + str(w.shape))
ValueError: Layer weight shape (4408, 312) not compatible with provided weight shape (4408, 128)
Please help.

@bojone
Owner

bojone commented Nov 13, 2019

> But now there is another problem: after a few epochs the code suddenly raises StopIteration [...]
> So the training data generator stopped. What is the reason?

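The generator passed to fit_generator needs to loop forever.

Let me remind you once more: please learn Keras properly before using bert4keras. As long as the model loads and runs, the problem is not in bert4keras. This is not a hand-holding debugging center.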
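
The advice that the generator passed to fit_generator must loop forever can be sketched in plain Python. The traceback above raises StopIteration inside evaluate_generator, which suggests the exhausted generator may have been the validation one; the same rule would apply to it. The function `batch_generator` and the sample data are hypothetical stand-ins, not code from the example script.

```python
import itertools


def batch_generator(samples, batch_size):
    """Yield batches forever, restarting from the first sample after each pass."""
    while True:  # outer loop never ends, so StopIteration is never raised
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


samples = list(range(10))  # hypothetical stand-in for real training data
gen = batch_generator(samples, batch_size=4)
steps_per_epoch = 3        # ceil(10 / 4) batches cover the data once

# Draw two epochs' worth of batches; a single-pass generator would stop after three.
batches = list(itertools.islice(gen, 2 * steps_per_epoch))
print(batches)
```

With a generator like this, the number of batches per epoch is controlled by `steps_per_epoch` (and `validation_steps` for validation data) rather than by the generator running out.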

@bojone
Owner

bojone commented Nov 13, 2019

> I just tried version 0.2.3 and found that Xu Liang's albert tiny still gives the same problem:
> ValueError: Layer weight shape (4408, 312) not compatible with provided weight shape (4408, 128)

Well, I don't get any error.

@yuhao1982
Author

> Well, I don't get any error.

My setup is tf=1.15, keras 2.3.1, bert4keras 0.2.3, and I still get this error. What should I do? I cannot use the Chinese ALBERT.

@bojone
Owner

bojone commented Nov 13, 2019

import bert4keras
print(bert4keras.__version__)

Can you confirm it is 0.2.3?

@yuhao1982
Author

[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bert4keras as bk
>>> bk.__version__
'0.2.3'
>>> import bert4keras
>>> print(bert4keras.__version__)
0.2.3

@yuhao1982
Author

Here is part of the code again. I only changed the pretrained model paths and added the GPU settings.

#! -*- coding:utf-8 -*-
# Similar to the sentiment analysis example; loads albert_zh weights (https://github.com/brightmart/albert_zh)

import json
import numpy as np
import pandas as pd
from random import choice
import re, os
import codecs
from bert4keras.backend import set_gelu
from bert4keras.utils import Tokenizer, load_vocab
from bert4keras.bert import build_bert_model
from bert4keras.train import PiecewiseLinearLearningRate
set_gelu('tanh')  # switch the gelu version

# Select which GPU is visible to the process
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ['CUDA_VISIBLE_DEVICES'] = "5" #,6,7"#"2,1,0,5"

maxlen = 100
root_dir = '/data/yuhao/'
config_path = root_dir + 'Bert_for_CFF/brightmart_albert_tiny_489k/albert_config_tiny.json'#'/root/kg/bert/albert_base_zh/bert_config.json'
checkpoint_path = root_dir + 'Bert_for_CFF/brightmart_albert_tiny_489k/albert_model.ckpt'#'/root/kg/bert/albert_base_zh/bert_model.ckpt'
dict_path = root_dir + 'Bert_for_CFF/brightmart_albert_tiny_489k/vocab.txt'#'/root/kg/bert/albert_base_zh/vocab.txt'

neg = pd.read_excel('datasets/neg.xls', header=None)
pos = pd.read_excel('datasets/pos.xls', header=None)
data, tokens = [], {}

_token_dict = load_vocab(dict_path)  # load the vocabulary
_tokenizer = Tokenizer(_token_dict)  # build a temporary tokenizer

# Collect (text, label) pairs and count token frequencies
for d in neg[0]:
    data.append((d, 0))
    for t in _tokenizer.tokenize(d):
        tokens[t] = tokens.get(t, 0) + 1

for d in pos[0]:
    data.append((d, 1))
    for t in _tokenizer.tokenize(d):
        tokens[t] = tokens.get(t, 0) + 1

# Keep only tokens that occur at least 4 times
tokens = {i: j for i, j in tokens.items() if j >= 4}
token_dict, keep_words = {}, []  # keep_words is the list of tokens kept from the BERT vocabulary

@bojone
Owner

bojone commented Nov 13, 2019

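I just downloaded albert_tiny_489k specifically to try it, and it works without any problem.

I suggest deleting everything related to bert and bert4keras and starting over; delete albert_tiny_489k as well and download it again.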

@bojone
Owner

bojone commented Nov 13, 2019

Or try running it this way:

wget -c https://github.com/bojone/bert4keras/archive/v0.2.3.zip
unzip v0.2.3.zip
cd bert4keras-0.2.3
mv examples/* .
vim task_sentiment_albert.py  # only change the BERT paths here
CUDA_VISIBLE_DEVICES=0 python task_sentiment_albert.py

@bojone bojone closed this as completed Nov 18, 2019