keras LSTM Fail to find the dnn implementation #36508

ARozental · 2020-02-06T11:27:23Z

System information

CUDA/cuDNN version: 10.1
GPU model and memory: GeForce RTX 2080
TF 2.1.0:

uncommenting the LSTM layer will yield the following error:

UnknownError:  [_Derived_]  Fail to find the dnn implementation.
	 [[{{node CudnnRNN}}]]
	 [[sequential_6/bidirectional_2/backward_lstm_3/StatefulPartitionedCall]]
	 [[Reshape_11/_38]] [Op:__inference_distributed_function_39046]

working code:

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(encoder.vocab_size, 64),
    #tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['accuracy'])
history = model.fit(train_dataset, epochs=10,
                    validation_data=test_dataset, 
                    validation_steps=30)


Saduf2019 · 2020-02-07T05:19:01Z

@ARozental Could you please provide the supporting files and complete standalone code so that we can replicate the issue in our environment?

alonRozental · 2020-02-09T12:34:29Z

@Saduf2019
the code is from one of the TF official tutorials and the working version is attached here, uncommenting the LSTM line will raise the error:

from __future__ import absolute_import, division, print_function, unicode_literals
import os
import tensorflow_datasets as tfds
import tensorflow as tf
from tensorflow.python.client import device_lib

dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True,
                          as_supervised=True)
train_dataset, test_dataset = dataset['train'], dataset['test']

BUFFER_SIZE = 10000
BATCH_SIZE = 64

train_dataset = train_dataset.shuffle(BUFFER_SIZE)
train_dataset = train_dataset.padded_batch(BATCH_SIZE, train_dataset.output_shapes)

test_dataset = test_dataset.padded_batch(BATCH_SIZE, test_dataset.output_shapes)
encoder = info.features['text'].encoder


model = tf.keras.Sequential([
    tf.keras.layers.Embedding(encoder.vocab_size, 64),
    #tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['accuracy'])
history = model.fit(train_dataset, epochs=10,
                    validation_data=test_dataset, 
                    validation_steps=30)

Also, I use ubuntu 18.04.
Thanks.

Saduf2019 · 2020-02-10T09:52:22Z

@alonRozental I ran the code [on nightly] after un-commenting the LSTM line and did not face any issues, please find the gist here

alonRozental · 2020-02-10T10:16:17Z

@Saduf2019 I'm running TF 2.1.0.
I don't think the problem exists in TF1, which is what the notebook uses.
Also, making the following change makes the code work:

    #tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Bidirectional(tf.keras.layers.RNN(tf.keras.layers.LSTMCell(64))),

I would think that those 2 lines should do the same thing (please correct me if I'm wrong) but it seems only the second line works.

Saduf2019 · 2020-02-11T09:46:40Z

@ARozental I ran the code on nightly ['2.2.0-dev20200210'] and on tensorflow==2.1.0, un-commenting the LSTM line as requested by you and did not face any issues, please find the gist of 2.1.0 here

alonRozental · 2020-02-11T11:20:47Z

@Saduf2019 Then I don't know how to replicate it on Colab; maybe it only occurs with specific hardware (ti 2080). Either way, can you confirm that those two lines should do exactly the same thing? If so, we can look at the difference (which shouldn't exist) between the two implementations to find the bug.

Lay4U · 2020-02-11T20:32:51Z

me too

tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-02-12 04:48:50.916938: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at cudnn_rnn_ops.cc:1510 : Unknown: Fail to find the dnn implementation.
2020-02-12 04:48:50.923690: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Fail to find the dnn implementation.
         [[{{node CudnnRNN}}]]
2020-02-12 04:48:50.931195: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: {{function_node __inference_cudnn_lstm_with_fallback_1954_specialized_for_sequential_1_lstm_StatefulPartitionedCall_at___inference_distributed_function_2139}} {{function_node __inference_cudnn_lstm_with_fallback_1954_specialized_for_sequential_1_lstm_StatefulPartitionedCall_at___inference_distributed_function_2139}} Fail to find the dnn implementation.
         [[{{node CudnnRNN}}]]
         [[sequential_1/lstm/StatefulPartitionedCall]]
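
In logs like the one above, the line that precedes "Fail to find the dnn implementation" reports why the cuDNN handle could not be created (here `CUDNN_STATUS_ALLOC_FAILED`). A minimal sketch for pulling that status out of a captured log follows; the helper name `cudnn_handle_statuses` is hypothetical and the sample text is copied from this log.

```python
import re

# Matches the status name in lines such as
# "... cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED"
HANDLE_ERROR = re.compile(r"Could not create cudnn handle: (CUDNN_STATUS_\w+)")


def cudnn_handle_statuses(log_text):
    """Return every cuDNN handle-creation status found in log_text, in order."""
    return HANDLE_ERROR.findall(log_text)


sample_log = """\
tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-02-12 04:48:50.916938: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at cudnn_rnn_ops.cc:1510 : Unknown: Fail to find the dnn implementation.
"""

statuses = cudnn_handle_statuses(sample_log)
print(statuses)
```

An `ALLOC_FAILED` status suggests the GPU memory was not available to cuDNN when the handle was requested, which is consistent with the memory-growth suggestions in this thread.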

gowthamkpr · 2020-02-11T22:57:16Z

@Lay4U @ARozental Please use the below code while importing tensorflow and let me know if the issue still persists. Thanks!

import tensorflow as tf
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)
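
A slightly more defensive variant of the snippet above, as a sketch: it covers every visible GPU rather than only the first, does nothing on a machine without a GPU, and reports (instead of crashing on) the `RuntimeError` that TensorFlow raises when the GPU has already been initialized. The function name `enable_memory_growth` is an illustrative assumption.

```python
import tensorflow as tf


def enable_memory_growth():
    """Try to enable memory growth on every visible GPU; return how many succeeded."""
    enabled = 0
    for gpu in tf.config.list_physical_devices('GPU'):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            enabled += 1
        except RuntimeError as err:
            # Raised when the GPU runtime has already been initialized.
            print(f"Could not set memory growth for {gpu.name}: {err}")
    return enabled


num_enabled = enable_memory_growth()
print("GPUs with memory growth enabled:", num_enabled)
```

This should run right after `import tensorflow`, before any op or device query touches the GPU.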

alonRozental · 2020-02-16T07:10:44Z

@gowthamkpr It doesn't help

olalakul · 2020-03-01T16:57:57Z

I confirm that it does not help

Yougigun · 2020-03-02T06:12:40Z

I confirm that it does not help

qlzh727 · 2020-03-04T17:50:49Z

Those two lines build different graphs under the hood, but should produce the same mathematical result.
The first line uses the cuDNN kernel if a GPU is available, whereas the second line uses the generic kernel on the GPU.
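
One way to check the "same math result" claim locally is to copy the weights from one layer to the other and compare outputs. This is a minimal sketch, not a test from the TensorFlow code base; the input shape is arbitrary, and it assumes both layers keep their weights in the same order (kernel, recurrent kernel, bias), which is the case with the default arguments used here.

```python
import numpy as np
import tensorflow as tf

# Arbitrary illustrative shapes: 4 sequences, 10 time steps, 8 features.
x = np.random.rand(4, 10, 8).astype("float32")

lstm = tf.keras.layers.LSTM(64)                          # may use the cuDNN kernel on GPU
rnn = tf.keras.layers.RNN(tf.keras.layers.LSTMCell(64))  # generic kernel

y_lstm = np.asarray(lstm(x))         # first call builds the layer weights
rnn(x)                               # build the second layer before copying weights
rnn.set_weights(lstm.get_weights())  # give both layers identical parameters
y_rnn = np.asarray(rnn(x))

max_diff = float(np.max(np.abs(y_lstm - y_rnn)))
print(y_lstm.shape, y_rnn.shape)
print("max abs difference:", max_diff)
```

Small floating-point differences between the two kernels are expected; a large difference would point at a real discrepancy.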

Adding @houtoms from Nvidia side. Is there any recent change to the kernel CudnnRNN?

qlzh727 · 2020-03-04T17:59:12Z

I wasn't able to reproduce this issue on a GPU Colab either. I think this indicates that it's an environment issue; we should probably check the CUDA kernel version.
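
For reporting the environment, a short sketch like the following prints what TensorFlow itself knows. It assumes TensorFlow 2.3 or later for `tf.sysconfig.get_build_info()`; the values are the CUDA/cuDNN versions TensorFlow was built against, which are not necessarily the versions installed on the machine.

```python
import tensorflow as tf

print("TensorFlow:", tf.__version__)
print("Built with CUDA:", tf.test.is_built_with_cuda())

# Assumed to be available (TensorFlow 2.3+). CPU-only builds may not
# report the CUDA/cuDNN keys, hence .get() instead of indexing.
build_info = tf.sysconfig.get_build_info()
print("CUDA version (build):", build_info.get("cuda_version"))
print("cuDNN version (build):", build_info.get("cudnn_version"))
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))
```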

kaixih · 2020-03-05T21:29:16Z

From the error log, cuDNN failed to create its handle, so this does not appear to be a cuDNN RNN issue. Can you try some convolution examples to see whether cuDNN is able to create a handle? @ARozental
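
A convolution check of that kind could look like the sketch below. The shapes are arbitrary, and the assumption is that with a GPU visible TensorFlow places the op on it and runs the convolution through cuDNN; on a CPU-only machine it still runs but tells you nothing about cuDNN.

```python
import tensorflow as tf

# Arbitrary illustrative shapes: one 32x32 RGB image and eight 3x3 filters.
images = tf.random.normal([1, 32, 32, 3])
filters = tf.random.normal([3, 3, 3, 8])

# With a GPU visible, TensorFlow places this op on it by default, and the
# convolution is expected to go through cuDNN.
outputs = tf.nn.conv2d(images, filters, strides=1, padding="SAME")
print(outputs.shape, outputs.device)
```

If this also fails with "Could not create cudnn handle", the problem is with cuDNN initialization in general rather than with the RNN kernel.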

sousandrei · 2020-06-06T11:08:49Z

Ok I managed to make it work after fighting with CUDA 10.1 and 10.2 (10.2 works nice with 2.3 nightly) for a while, environments, OS and everything.

Narrowed it down to a seemingly harmless line.

I was running tf.test.gpu_device_name() to check there was a GPU and print its name. That command when run at any time makes the model fail on train with the mentioned error: Unknown: Fail to find the dnn implementation

The tf.config.experimental.set_visible_devices command that @shaoeChen mentioned didn't change anything for me so I removed it.

I managed to make it work more reliably running this right after importing tensorflow (and other libs, but I don't think it changes anything)

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
tf.config.experimental.set_memory_growth(device=gpus[0], enable=True)

Is this a known bug or some unintended behaviour?

Verythai · 2020-06-14T08:53:58Z

Yes, simply works! Thank you.

paapu88 · 2020-07-11T08:57:52Z

Why is this closed? I got the same error on Ubuntu 20.04 with JupyterLab 2.1.5, TensorFlow 2.2.0 (with GPU) and CUDA 10.1.105, when building a model in JupyterLab using a kernel that has TensorFlow 2.2.0.

Only thing that helped is the workaround presented earlier:

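import tensorflow
from tensorflow.keras.layers import RNN, LSTMCell

# feature_count and seq_len must already be defined at module level.
def build_model(feature_count=feature_count, seq_len=seq_len):
    inputs = tensorflow.keras.Input(shape=(seq_len, feature_count))
    # RNN(LSTMCell(...)) uses the generic kernel instead of the cuDNN one.
    X = RNN(LSTMCell(units=seq_len), input_shape=(seq_len, feature_count), return_sequences=True, stateful=False)(inputs)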

terveisin, Markus

wbadry · 2020-07-14T23:04:07Z

Hello,
Thanks @gowthamkpr 👍
This solved my problem. My configuration is:
OS: Windows 10 x64
Python : 3.6
TensorFlow-GPU : 2.2.0
Cuda : 10.1
Cudnn : 7.6.5

paulmwatson · 2020-07-28T12:31:16Z

conda install -c anaconda cudnn

This worked for us when getting

tensorflow.python.framework.errors_impl.UnknownError:  [_Derived_]  Fail to find the dnn implementation.

Thanks @ElliotVilhelm

huydhoang · 2020-08-21T07:22:14Z

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
tf.config.experimental.set_visible_devices(devices=gpus[1], device_type='GPU')
tf.config.experimental.set_memory_growth(device=gpus[1], enable=True)

above work for me.

Also worked for me (TF 2.3). Does this mean CUDA was not installed correctly, or is this a TensorFlow bug?

marcosclima · 2020-08-29T18:54:24Z

Worked for me. tks.
Running BiLSTM on TF2.1 with two 2080S

ziliangok · 2020-09-03T08:07:29Z

It solved my problem. Using tf 2.2.0 with one 2070s.

vaecole · 2020-10-14T04:57:08Z

It worked for me, running GRU using TF 2.3.0 with one 2060. Thanks!

TaWeiYeh · 2020-10-15T05:07:23Z

This solves the problem for me as well.
OS: Ubuntu 18.04
Python : 3.6.9
TensorFlow-GPU : 2.3.0
Cuda : 10.1
Cudnn : 7.6.5

RRSBG · 2020-11-16T14:20:50Z

thx, solved the problem:
linux mint 20
geforce RTX 2060

leimao · 2020-12-12T20:53:22Z

I think a lot of the cuDNN-related problems could be solved by adding this code.
https://leimao.github.io/blog/TensorFlow-cuDNN-Failure/

3d-illusions · 2020-12-24T01:53:44Z

this solved it for me. What does this do exactly?

JamieMoon · 2020-12-29T17:44:06Z

Just getting this:
RuntimeError: Physical devices cannot be modified after being initialized

Did that work for you?

shubhamdo · 2021-01-08T19:30:01Z

@JamieMoon Just close the terminal/python console and run the below code first, then your LSTM

import tensorflow as tf
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)
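
If call order is hard to control (for example in a notebook where the GPU may already have been touched), an alternative sketch is to request memory growth through the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable. It still has to be set before TensorFlow initializes the GPU, so a fresh interpreter is assumed here as well.

```python
import os

# Must be set before TensorFlow initializes the GPU; doing it before the
# import is the simplest way to guarantee that.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

import tensorflow as tf  # noqa: E402

print(tf.config.list_physical_devices("GPU"))
```

The variable can also be exported in the shell that launches Python or Jupyter, which avoids editing the script at all.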

ICG14 · 2021-01-25T17:23:18Z

I still haven't been able to solve this issue...
I have tried everything mentioned above, but the problem persists.
my OS is:

Ubuntu 18.04
CUDA 10.0
Tensorflow 2.0
Nvidia-driver 460 (Although I have tried with 450 and it also does not work)
geForce RTX2060
Python 3.7

I have also tried with CUDA 10.1 and TF 2.1, but I still can't solve it. It is starting to get a little frustrating.

This is what I obtain after fitting:

Epoch 1/50
2021-01-25 18:59:34.964218: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-01-25 18:59:35.096029: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
128/2156 [>.............................] - ETA: 15sWARNING:tensorflow:Can save best model only with val_loss available, skipping.

.2021-01-25 18:59:35.364099: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2021-01-25 18:59:35.364136: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at cudnn_rnn_ops.cc:1510 : Unknown: Fail to find the dnn implementation.
2021-01-25 18:59:35.364158: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
2021-01-25 18:59:35.364356: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: {{function_node __forward_cudnn_lstm_with_fallback_2517_specialized_for_sequential_lstm_StatefulPartitionedCall_at___inference_distributed_function_3196}} {{function_node __forward_cudnn_lstm_with_fallback_2517_specialized_for_sequential_lstm_StatefulPartitionedCall_at___inference_distributed_function_3196}} Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
[[sequential/lstm/StatefulPartitionedCall]]

All of the cuDNN and CUDA tests work well.

marcelmotta · 2021-02-10T12:55:37Z

Just had the same issue here, managed to fix with this solution

My setup:
Windows 10
CUDA 11.2
Tensorflow 2.3
Nvidia Driver 460.x
Geforce RTX 2060
Python 3.8

trifwn · 2021-04-22T12:17:23Z

Same issue here
I tried all the aforementioned solutions. None seems to resolve the issue

this-is-shashank · 2021-07-04T05:24:10Z

RuntimeError: Physical devices cannot be modified after being initialized

frankl1 · 2021-10-04T12:36:52Z

I had the same issue. Updating tensorflow with pip install -U tensorflow solved it

cloudy-sfu · 2023-06-24T14:06:00Z

I have the same problem. The solutions above don't work for me.

OS: ubuntu 20.04
Python: 3.11
Tensorflow: 2.12.0
cuda: 11.8

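import tensorflow as tf
from sklearn.model_selection import train_test_split  # needed for train_test_split below; x and y are the poster's own data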

l0 = tf.keras.layers.Input(shape=(x.shape[1], x.shape[2]))
_, l1_h_t, _ = tf.keras.layers.LSTM(64, return_state=True)(l0)
l2 = tf.keras.layers.Dense(128, activation='relu')(l1_h_t)
l3 = tf.keras.layers.Dense(128, activation='relu')(l2)
l5 = tf.keras.layers.Dense(32, activation='relu')(l3)
l6 = tf.keras.layers.Dense(1, activation='linear')(l5)
my_model = tf.keras.Model(l0, l6)
my_model.compile(optimizer='adam', loss='mse')
tensorboard = tf.keras.callbacks.TensorBoard(log_dir=f'raw/4_tensorboard/')
stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=60)
save_best = tf.keras.callbacks.ModelCheckpoint('raw/4_lstm.h5', monitor='val_loss', save_best_only=True)
x_train, x_valid, y_train, y_valid = train_test_split(x, y, train_size=0.8, random_state=974238)
history = my_model.fit(
    x_train, y_train, validation_data=(x_valid, y_valid),
    epochs=1000, batch_size=1200, callbacks=[stop_early, save_best, tensorboard]
)
y_hat = my_model.predict(x)


qlzh727 mentioned this issue Aug 17, 2020

Cuda Error when training RNN #41863

Closed

Avditvs mentioned this issue Nov 28, 2020

Fail to find the dnn implementation while using recurrent layers #45248

Closed

ricvo mentioned this issue Jan 27, 2021

NotFoundError: No algorithm worked! when using Conv2D #43174

Closed

stromal mentioned this issue Jul 5, 2021

Tensorflow GPU installation don't want to run this code #50614

Closed

Metawhy mentioned this issue Oct 13, 2021

Autokeras timeseries_forecaster official Tutorial : Colab script works with CPU, but not with GPU : CudnnRNN "Fail to find the dnn implementation." keras-team/autokeras#1638

Open

This was referenced Feb 10, 2022

GRU in model leads to error: Fail to find the dnn implementation Dobiasd/frugally-deep#317

Closed

Workaround for cudnn not found problem (issue #317) Dobiasd/frugally-deep#318

Merged

gowthamkpr mentioned this issue Aug 12, 2022

[URGENT] There is less inputs received while giving the exact number of inputs required 2.0 keras-team/tf-keras#499

Closed
