Please make sure that this is a Bug or a Feature Request and provide all applicable information asked by the template.
If your issue is an implementation question, please ask your question on StackOverflow or on the Keras Slack channel instead of opening a GitHub issue.
System information
Have I written custom code (as opposed to using example directory): yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu VM of Paperspace Gradient with jupyter notebook
TensorFlow backend (yes / no): yes
TensorFlow version: 1.13.1
Keras version: 2.2.4
Python version: Python 3.6.8 :: Anaconda, Inc.
CUDA/cuDNN version: cudatoolkit 10.0.130, cuDNN 7.3.1 (build cuda10.0_0)
GPU model and memory: Tesla V100, 16 GB
The number of sequences is 900. I am using Keras fit_generator and setting steps_per_epoch to 100, 200, 300, etc. to keep the batch size between 1 and 10. Whatever I set steps_per_epoch to, the error line is the same:
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[900,50,11710] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node training/Adam/gradients/MLM-Sim/truediv_grad/Neg}}]]
When I change steps_per_epoch, I am effectively changing the batch size, so I don't understand why the problematic shape is "shape[45000,11710]" every time. The 11710 is not surprising, since 11710 is the number of tokens in the dictionary built from sentence_pairs. 45000 must be 900 x 50, where 900 is the number of sequences and 50 is the number of tokens in a single sequence. However, since I am in theory changing the batch size, the first dimension should be the batch size, not the whole 900. I can't think of any explanation other than that the batch size is not actually changed by different values of steps_per_epoch.
This is the code I use as a base:
from keras_bert import get_base_dict, get_model, gen_batch_inputs
import keras

# A toy input example
sentence_pairs = [
    [['all', 'work', 'and', 'no', 'play'], ['makes', 'jack', 'a', 'dull', 'boy']],
    [['from', 'the', 'day', 'forth'], ['my', 'arm', 'changed']],
    [['and', 'a', 'voice', 'echoed'], ['power', 'give', 'me', 'more', 'power']],
]

# Build token dictionary
token_dict = get_base_dict()  # A dict that contains some special tokens
for pairs in sentence_pairs:
    for token in pairs[0] + pairs[1]:
        if token not in token_dict:
            token_dict[token] = len(token_dict)
token_list = list(token_dict.keys())  # Used for selecting a random word

# Build & train the model
model = get_model(
    token_num=len(token_dict),
    head_num=5,
    transformer_num=12,
    embed_dim=100,
    feed_forward_dim=400,
    seq_len=50,
    pos_num=50,
    dropout_rate=0.05,
)
model.summary()

def _generator():
    while True:
        yield gen_batch_inputs(
            sentence_pairs,
            token_dict,
            token_list,
            seq_len=50,
        )

model.fit_generator(
    generator=_generator(),
    steps_per_epoch=100,
    epochs=10,
)
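To sanity-check my reading of the error, this is a small diagnostic I could run (my own addition, not from the keras-bert examples; it only assumes the generator yields the standard (inputs, outputs) tuple that fit_generator expects):

import numpy as np

# Pull one batch from the generator and print the shape of every array.
inputs, outputs = next(_generator())
for i, arr in enumerate(inputs):
    print('input %d shape:' % i, np.asarray(arr).shape)
for i, arr in enumerate(outputs):
    print('output %d shape:' % i, np.asarray(arr).shape)

If every array comes back with a first dimension of 900 regardless of steps_per_epoch, that would match the 900 in the OOM shape.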
This code and the imported functions are taken from https://github.com/CyberZHG/keras-bert.
I only replaced sentence_pairs with my own sentence_pairs, built from a file with the same structure. In the problematic case described above, I am using 900 pairs (1800 sentences) of the whole dataset, which are converted to 900 sequences in the implementation. This is the 900 I mentioned above.
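For reference, here is a sketch of how I would chunk sentence_pairs inside the generator if the whole list is indeed passed to gen_batch_inputs as one batch (a hypothetical variant of the example above; BATCH_SIZE is a value I made up for illustration):

import random

BATCH_SIZE = 8  # hypothetical batch size, not from the keras-bert example

def _batched_generator():
    while True:
        random.shuffle(sentence_pairs)
        for start in range(0, len(sentence_pairs), BATCH_SIZE):
            chunk = sentence_pairs[start:start + BATCH_SIZE]
            # gen_batch_inputs only sees this chunk, so the first dimension
            # of every array should equal len(chunk) instead of 900.
            yield gen_batch_inputs(
                chunk,
                token_dict,
                token_list,
                seq_len=50,
            )

model.fit_generator(
    generator=_batched_generator(),
    steps_per_epoch=len(sentence_pairs) // BATCH_SIZE,
    epochs=10,
)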