
TFTRT: Respect device placement requested by user #56056

Conversation

meena-at-work (Contributor) commented May 10, 2022

Previously, TRT engines would always default to running on GPU 0. On a multi-GPU system, this does not make the best use of the available resources. This commit adds the ability to specify the GPU on which a TRT engine should run.

Signed-off-by: Meenakshi Venkataraman meenakshiv@nvidia.com

CC: @bixia1 @DEKHTIARJonathan @tfeher

meena-at-work (Contributor, Author) commented May 10, 2022

A TFTRT user can now specify device placement with the tf.device() API before invoking convert():

import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

model = create_model()
input_saved_model_dir = './saved_model/my_model'
model.save(input_saved_model_dir)

# Variables are placed on the device active at load time.
with tf.device('gpu:1'):
    model_loaded = tf.saved_model.load(export_dir='./saved_model/my_model')
    print("Loaded Model device:", model_loaded.variables[0].device)

# The TRT engine is placed on the device active at conversion time.
with tf.device('gpu:2'):
    converter = trt.TrtGraphConverterV2(input_saved_model_dir=input_saved_model_dir)
    converter.convert()
    converter.summary()

With this change, the output of the above is:

INFO:tensorflow:Assets written to: ./saved_model/my_model/assets
Loaded Model device: /job:localhost/replica:0/task:0/device:GPU:1
INFO:tensorflow:Linked TensorRT version: (8, 2, 4)
INFO:tensorflow:Loaded TensorRT version: (8, 2, 4)
INFO:tensorflow:Placing imported graph from `./saved_model/my_model` on device: /job:localhost/replica:0/task:0/device:GPU:2
TRTEngineOP Name                 Device        # Nodes # Inputs      # Outputs     Input DTypes       Output Dtypes      Input Shapes       Output Shapes     
================================================================================================================================================================
TRTEngineOp_000_000              device:GPU:2  5       1             1             ['float32']        ['float32']        [[-1, 784]]        [[-1, 10]]        

        - Const: 2x
        - MatMul: 2x
        - Relu: 1x

================================================================================================================================================================
[*] Total number of TensorRT engines: 1
[*] % of OPs Converted: 71.43% [5/7]
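The tf.device() scopes used above rely on standard TensorFlow placement semantics, which this PR teaches the converter to respect. As a minimal, CPU-only sketch (no GPUs or TensorRT required, and independent of this PR's converter changes), the placement mechanism itself can be observed like this:

```python
import tensorflow as tf

# tf.device() pins variables and ops created inside the scope to the
# named device; the resolved device string records the placement.
with tf.device('CPU:0'):
    v = tf.Variable([1.0, 2.0])

# The fully qualified device string ends with the requested device,
# e.g. /job:localhost/replica:0/task:0/device:CPU:0
print(v.device)
```

On a multi-GPU machine, replacing 'CPU:0' with 'gpu:1' or 'gpu:2' yields the placements shown in the conversion output above.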

@gbaned gbaned added the comp:gpu:tensorrt Issues specific to TensorRT label May 11, 2022
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation May 11, 2022
@gbaned gbaned requested a review from bixia1 May 11, 2022 14:21
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer May 12, 2022
@copybara-service copybara-service bot merged commit 67529f3 into tensorflow:master May 16, 2022
trevor-m pushed a commit to trevor-m/tensorflow that referenced this pull request Oct 20, 2022