Fix cuDNN LSTM implementation selection with LoadSavedModel C++ API. #56525
Conversation
Hi @penpornk Can you please review this PR? Thank you!

Hi @ezhulenev Can you please review this PR? Thank you!

@API92 Can you please check the build failures? Thank you!

@gbaned Fixed.

Hi @ezhulenev Can you please review this PR? Thank you!

Adding @reedwm since @ezhulenev is on vacation for a while.

Hi @API92, can you please resolve the conflicts? Thank you!
(Force-pushed from 95bdb1d to e87a7c7.)
If I save a tf.keras.layers.LSTM layer with _could_use_gpu_kernel=True into the SavedModel format with tf.saved_model.save, then the cuDNN kernel is used when I load this model with tf.saved_model.load, and it runs fast. But if I load the same model from C++ with the tensorflow::LoadSavedModel function, the cuDNN kernel is not used and inference is slow.
Here is a colab demonstrating the issue: https://colab.research.google.com/drive/16WN0sqOoL37M7-5XMhGb-irkRX7fh503?usp=sharing . If the model is loaded with tf.saved_model.load, the tf.keras.layers.LSTM and tf.raw_ops.CudnnLSTM layers both take about 100 ms. But if the model is loaded with LoadSavedModel from C++, tf.keras.layers.LSTM takes about 275 ms, while tf.raw_ops.CudnnLSTM still takes 100 ms.
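For context, the Python side of the repro can be sketched as below. This is a minimal illustration, not the exact colab code: the shapes, save path, and layer size are made up for the example. The LSTM is left at its default settings (tanh activation, sigmoid recurrent activation, no masking tricks), which is what makes Keras set _could_use_gpu_kernel=True and dispatch to the cuDNN kernel on GPU.

```python
import numpy as np
import tensorflow as tf

# Build an LSTM with cuDNN-compatible defaults; under these settings
# Keras marks the layer as eligible for the fused cuDNN kernel.
inputs = tf.keras.Input(shape=(16, 8))
outputs = tf.keras.layers.LSTM(32)(inputs)
model = tf.keras.Model(inputs, outputs)

# Export to the SavedModel format.
tf.saved_model.save(model, "/tmp/lstm_savedmodel")

# Reloading through the Python API keeps the implementation-selection
# metadata, so the cuDNN kernel is picked when a GPU is available.
# Loading the same directory via tensorflow::LoadSavedModel from C++
# is where the PR reports the cuDNN selection being lost.
reloaded = tf.saved_model.load("/tmp/lstm_savedmodel")
y = reloaded(tf.constant(np.zeros((1, 16, 8), dtype=np.float32)))
```

On a CPU-only machine this still runs, but silently falls back to the generic kernel, so the timing gap described above only shows up on GPU.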
There were several problems in the FunctionOptimizer, ImplementationSelector, and MetaOptimizer Grappler optimizers: