Error when loading Xu Liang's ALBERT #31

yuhao1982 · 2019-11-13T03:32:40Z

I used examples/task_sentiment_albert.py to load Xu Liang's tiny ALBERT and got the following error:
ValueError: Layer weight shape (21128, 312) not compatible with provided weight shape (21128, 128)
Loading it with keras_bert gives the same error. What is going on?

bojone · 2019-11-13T03:46:56Z

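I think the README already makes this clear enough:

(Note: Xu Liang's ALBERT was open-sourced earlier than Google's ALBERT, so the weights of the early Xu Liang releases are not fully consistent with Google's; in other words, the two cannot be swapped for each other directly. To reduce code redundancy, bert4keras 0.2.4 and later only support loading the Google release and those Xu Liang weights that have "Google" in their name. To load weights from the earlier releases, use version 0.2.3.)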
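
For readers puzzling over the two shapes in the error above: ALBERT factorizes the token embedding into a (vocab_size, embedding_size) table followed by a projection up to hidden_size, whereas a BERT-style layer stores a single (vocab_size, hidden_size) table. The sketch below assumes that 128 is the checkpoint's embedding_size and 312 is the model's hidden_size. That is a reading of the error message, not something confirmed in this thread, and both the function name and the "embedding_projection" key are hypothetical.

```python
def embedding_shapes(vocab_size, hidden_size, embedding_size=None):
    """Return the embedding-related weight shapes for a given layout.

    With embedding_size=None the layout is BERT-style (one table).
    Otherwise it is the factorized ALBERT layout (table + projection).
    """
    if embedding_size is None or embedding_size == hidden_size:
        return {"word_embeddings": (vocab_size, hidden_size)}
    return {
        "word_embeddings": (vocab_size, embedding_size),
        "embedding_projection": (embedding_size, hidden_size),
    }


# Shapes taken from the error message; their roles are an assumption.
layer_side = embedding_shapes(21128, 312)            # what the built layer expects
checkpoint_side = embedding_shapes(21128, 312, 128)  # what the checkpoint provides

print(layer_side["word_embeddings"])       # (21128, 312)
print(checkpoint_side["word_embeddings"])  # (21128, 128)
```

If a model is built with the first layout while the checkpoint follows the second, assigning the weights would fail with a mismatch of this kind.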

yuhao1982 · 2019-11-13T04:39:01Z

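Sorry, I overlooked that. One more question: does keras_bert (https://github.com/CyberZHG/keras-bert) not support loading ALBERT models? Thanks! I ask because I want to reuse some k-fold cross-validation code, although adapting it directly on top of yours should be easy too, right?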

bojone · 2019-11-13T04:43:06Z

keras_bert does not support it at the moment.

yuhao1982 · 2019-11-13T05:26:10Z

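Hi, have you run into this error? I am using Google's pretrained ALBERT, still with bert4keras 0.2.4, and changed the code as described in https://github.com/bojone/bert4keras/issues/29, but I get the following error: ValueError: Error when checking target: expected dense_7 to have 2 dimensions, but got array with shape (12, 1, 3). I haven't been able to find the cause. Could you take a look? Thanks!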

yuhao1982 · 2019-11-13T05:26:31Z

The full output looks like this:
Epoch 1/50
Traceback (most recent call last):
  File "task_sentiment_albert.py", line 533, in <module>
    train_model_pred, test_model_pred, val_f1_list = run_cv(5, DATA_LIST, DATA_X, DATA_Y, DATA_LIST_TEST)
  File "task_sentiment_albert.py", line 472, in run_cv
    callbacks=mycallbacks,# + [plateau, checkpoint],#[early_stopping, plateau, checkpoint],
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1732, in fit_generator
    initial_epoch=initial_epoch)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training_generator.py", line 220, in fit_generator
    reset_metrics=False)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1508, in train_on_batch
    class_weight=class_weight)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 621, in _standardize_user_data
    exception_prefix='target')
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training_utils.py", line 135, in standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking target: expected dense_7 to have 2 dimensions, but got array with shape (12, 1, 3)

bojone · 2019-11-13T06:22:56Z

@yuhao1982 According to the message, the shape of your training targets doesn't match the model's output shape. That has nothing to do with bert4keras.
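
As an illustration of that kind of mismatch, here is a minimal NumPy sketch. It assumes the targets are one-hot labels for 3 classes that picked up an extra axis of length 1 when batched, with 12 being the batch size. The thread does not show how the labels were actually built, so the array below is made up.

```python
import numpy as np

# Hypothetical batch: 12 samples, each a one-hot row wrapped in an extra axis.
labels = np.random.randint(0, 3, size=12)
y_bad = np.eye(3)[labels][:, None, :]
print(y_bad.shape)  # (12, 1, 3)

# A Dense output of shape (batch, 3) needs 2-D targets, so drop the extra axis.
y_ok = np.squeeze(y_bad, axis=1)
print(y_ok.shape)  # (12, 3)
```

Whether squeezing is the right fix depends on where the extra axis comes from; fixing it at the point where the generator assembles each batch is usually cleaner.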

yuhao1982 · 2019-11-13T08:24:30Z

Solved.
But there is another problem: after a few epochs the code suddenly raises StopIteration:
Traceback (most recent call last):
  File "task_sentiment_albert.py", line 595, in <module>
    train_model_pred, test_model_pred, val_f1_list = run_cv(5, DATA_LIST, DATA_X, DATA_Y, DATA_LIST_TEST)
  File "task_sentiment_albert.py", line 534, in run_cv
    callbacks=mycallbacks # + [plateau, checkpoint],#[early_stopping, plateau, checkpoint],
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1732, in fit_generator
    initial_epoch=initial_epoch)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training_generator.py", line 242, in fit_generator
    workers=0)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1791, in evaluate_generator
    verbose=verbose)
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/training_generator.py", line 365, in evaluate_generator
    generator_output = next(output_generator)
StopIteration

So the training data generator stopped. What causes this?
I'm using the Google ALBERT; going by your answer, the following code is correct, right? (I wondered whether a mistake here was the cause.)

config_path = '/root/kg/bert/albert_base_en_tfhub/albert_config.json'
checkpoint_path = '/root/kg/bert/albert_base_en_tfhub/variables/variables'
spm_path = '/root/kg/bert/albert_base_en_tfhub/assets/30k-clean.model'

tokenizer = SpTokenizer(spm_path)
model = build_bert_model(config_path, checkpoint_path, albert=True)

token_ids, segment_ids = tokenizer.encode(first_text = text, max_length = maxlen)
Could you please help with this? Thanks.

yuhao1982 · 2019-11-13T08:52:25Z

I just tried version 0.2.3 and found that Xu Liang's albert tiny still has the same problem:
  File "task_sentiment_albert.py", line 123, in <module>
    albert=True
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/bert4keras-0.2.3-py3.6.egg/bert4keras/bert.py", line 319, in build_bert_model
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/bert4keras-0.2.3-py3.6.egg/bert4keras/bert.py", line 188, in load_weights_from_checkpoint
  File "/data/yuhao/anaconda3/lib/python3.6/site-packages/keras/engine/base_layer.py", line 1126, in set_weights
    'provided weight shape ' + str(w.shape))
ValueError: Layer weight shape (4408, 312) not compatible with provided weight shape (4408, 128)
Please help.

bojone · 2019-11-13T08:53:44Z

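The generator passed to fit_generator needs to loop indefinitely.

One more reminder: please learn Keras properly before using bert4keras. As long as the model loads and runs, the problem is not with bert4keras. This is not a hand-holding debugging center.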
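
A minimal sketch of what "loop indefinitely" means here, in plain Python. Note that the StopIteration traceback ends inside evaluate_generator, which suggests the generator that ran dry was the one supplied for validation; that is a reading of the traceback, not something confirmed in this thread. The helper name and the toy data are hypothetical.

```python
import itertools


def batches_forever(samples, batch_size):
    """Yield batches endlessly, restarting from the top after each pass."""
    while True:  # without this outer loop the generator ends after one pass
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


data = list(range(5))
gen = batches_forever(data, batch_size=2)

# One pass over 5 samples is 3 batches; asking for 7 is still fine.
first_seven = list(itertools.islice(gen, 7))
print(first_seven)
# [[0, 1], [2, 3], [4], [0, 1], [2, 3], [4], [0, 1]]
```

With a generator like this, the steps_per_epoch and validation_steps arguments decide how many batches Keras draws per epoch, so the generator itself never has to signal the end.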

bojone · 2019-11-13T08:59:35Z

With 0.2.3 and Xu Liang's albert tiny, I don't get that error on my side, in any case.

yuhao1982 · 2019-11-13T10:04:27Z

My setup is tf 1.15, keras 2.3.1, bert4keras 0.2.3, and I still get this error. What should I do? I can't use the Chinese ALBERT.

bojone · 2019-11-13T10:11:53Z

import bert4keras
print(bert4keras.__version__)

Can you confirm it is 0.2.3?

yuhao1982 · 2019-11-13T10:13:59Z

[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import bert4keras as bk
>>> bk.__version__
'0.2.3'
>>> import bert4keras
>>> print(bert4keras.__version__)
0.2.3
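
Besides the version string, it can help to check which copy of the package Python actually imports, for example when an old .egg sits next to a fresh install. The sketch below is generic and uses only importlib. describe_module is a hypothetical helper, and the standard-library json module stands in for bert4keras so that the snippet runs anywhere.

```python
import importlib


def describe_module(name):
    """Return the name, version and file location of an importable module.

    Returns None when the module cannot be imported at all.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return {
        "name": name,
        "version": getattr(module, "__version__", None),
        "file": getattr(module, "__file__", None),
    }


# Replace "json" with "bert4keras" in a real environment.
info = describe_module("json")
print(info["file"])
```

If the printed path points somewhere unexpected, the interpreter is picking up a different installation from the one that was just set up.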

yuhao1982 · 2019-11-13T10:15:28Z

Here is part of the code again. I only changed the pretrained model paths and added the GPU settings.

#! -*- coding:utf-8 -*-
# Similar to the sentiment analysis example; loads albert_zh weights (https://github.com/brightmart/albert_zh)

import json
import numpy as np
import pandas as pd
from random import choice
import re, os
import codecs
from bert4keras.backend import set_gelu
from bert4keras.utils import Tokenizer, load_vocab
from bert4keras.bert import build_bert_model
from bert4keras.train import PiecewiseLinearLearningRate
set_gelu('tanh') # switch the gelu version

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ['CUDA_VISIBLE_DEVICES'] = "5" #,6,7"#"2,1,0,5"

maxlen = 100
root_dir = '/data/yuhao/'
config_path = root_dir + 'Bert_for_CFF/brightmart_albert_tiny_489k/albert_config_tiny.json'#'/root/kg/bert/albert_base_zh/bert_config.json'
checkpoint_path = root_dir + 'Bert_for_CFF/brightmart_albert_tiny_489k/albert_model.ckpt'#'/root/kg/bert/albert_base_zh/bert_model.ckpt'
dict_path = root_dir + 'Bert_for_CFF/brightmart_albert_tiny_489k/vocab.txt'#'/root/kg/bert/albert_base_zh/vocab.txt'

neg = pd.read_excel('datasets/neg.xls', header=None)
pos = pd.read_excel('datasets/pos.xls', header=None)
data, tokens = [], {}

_token_dict = load_vocab(dict_path) # read the vocabulary
_tokenizer = Tokenizer(_token_dict) # build a temporary tokenizer

# Count token frequencies over the negative samples (label 0)
for d in neg[0]:
    data.append((d, 0))
    for t in _tokenizer.tokenize(d):
        tokens[t] = tokens.get(t, 0) + 1

# Count token frequencies over the positive samples (label 1)
for d in pos[0]:
    data.append((d, 1))
    for t in _tokenizer.tokenize(d):
        tokens[t] = tokens.get(t, 0) + 1

tokens = {i: j for i, j in tokens.items() if j >= 4}
token_dict, keep_words = {}, [] # keep_words is the list of tokens kept from bert's vocabulary
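
The paste stops here. For context, the vocabulary size 4408 in the 0.2.3 traceback (instead of 21128) is consistent with a reduced vocabulary like the one keep_words is meant to hold. The following is only a sketch of how such a reduced vocabulary could be assembled from the counts above, using a toy vocabulary. It is not the poster's elided code, and the helper name and sample tokens are hypothetical.

```python
def build_reduced_vocab(full_vocab, token_counts, min_count=4,
                        special=("[PAD]", "[UNK]", "[CLS]", "[SEP]")):
    """Map frequent tokens to new ids and record their original ids.

    Returns (token_dict, keep_words): token_dict maps token -> new id,
    keep_words lists the original ids in new-id order.
    """
    token_dict, keep_words = {}, []
    frequent = [t for t, c in token_counts.items() if c >= min_count]
    for token in list(special) + frequent:
        if token in full_vocab and token not in token_dict:
            token_dict[token] = len(token_dict)
            keep_words.append(full_vocab[token])
    return token_dict, keep_words


# Toy stand-ins for load_vocab(dict_path) and the counted tokens.
full_vocab = {"[PAD]": 0, "[UNK]": 100, "[CLS]": 101, "[SEP]": 102,
              "好": 200, "差": 201, "的": 202}
counts = {"好": 7, "差": 2, "的": 9}

token_dict, keep_words = build_reduced_vocab(full_vocab, counts)
print(token_dict)  # {'[PAD]': 0, '[UNK]': 1, '[CLS]': 2, '[SEP]': 3, '好': 4, '的': 5}
print(keep_words)  # [0, 100, 101, 102, 200, 202]
```

An embedding table trimmed to the rows in keep_words would then have len(keep_words) as its first dimension, which is how a number like 4408 can appear in place of the full vocabulary size.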

bojone · 2019-11-13T10:29:12Z

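I just downloaded albert_tiny_489k specifically to try it, and there was no problem at all.

I suggest you delete everything related to bert and bert4keras and start over; delete albert_tiny_489k and download it again too.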

bojone · 2019-11-13T10:34:07Z

Or try running it this way:

wget -c https://github.com/bojone/bert4keras/archive/v0.2.3.zip
unzip v0.2.3.zip
cd bert4keras-0.2.3
mv examples/* .
vim task_sentiment_albert.py # only change the bert paths here
CUDA_VISIBLE_DEVICES=0 python task_sentiment_albert.py

bojone closed this as completed Nov 18, 2019
