
TensorRT optimises a deep learning model by making it lightweight and by accelerating its inference speed, extracting every ounce of performance from the model so it is well suited to deployment at the edge. This repository helps you convert a deep learning model from TensorFlow to TensorRT!


Accelerating-Inference-in-Tensorflow-using-TensorRT

What is TensorRT?

TensorRT is an optimization toolkit from NVIDIA that applies graph optimizations and layer fusion and then selects the fastest implementation for each part of a deep learning model. In other words, TensorRT optimizes the model so that inference runs faster than the original, unoptimized model, for example 2x or 5x faster. The bigger the model, the more room TensorRT has to optimize it. Furthermore, TensorRT supports NVIDIA GPU devices such as the 1080 Ti and Titan Xp on the desktop, and the Jetson TX1 and TX2 on embedded devices.
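As a rough illustration of how this conversion is driven from TensorFlow 1.12 (the version listed below), the sketch here uses the TF-TRT integration shipped in tensorflow.contrib.tensorrt. The file names and the output node name 'logits' are placeholders for your own model, and precision_mode can be changed to match your hardware:

```python
# Minimal TF-TRT conversion sketch for TensorFlow 1.12.
# Assumption: a frozen graph exists at 'frozen_model.pb' and its output
# node is named 'logits' (both are placeholders for your own model).
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# Load the frozen TensorFlow graph.
with tf.gfile.GFile('frozen_model.pb', 'rb') as f:
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

# Let TF-TRT replace supported subgraphs with TensorRT engines.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=['logits'],                # placeholder output node name
    max_batch_size=1,                  # largest batch size used at inference
    max_workspace_size_bytes=1 << 30,  # ~1 GB of GPU workspace for TensorRT
    precision_mode='FP16')             # or 'FP32' / 'INT8'

# Save the optimized graph next to the original one.
with tf.gfile.GFile('trt_model.pb', 'wb') as f:
    f.write(trt_graph.SerializeToString())
```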

Libraries used

Prerequisite: Install TensorRT by following this tutorial here for Ubuntu desktop or here for Jetson devices

TensorFlow 1.12
OpenCV 3.4.5.20
Pillow 5.2.0
Numpy 1.15.2
Matplotlib 3.0.0
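To confirm your environment matches these versions, an optional check like the one below can be run; the expected versions in the comments come from the list above:

```python
# Quick environment check: print the versions of the libraries this repo expects.
import tensorflow as tf
import cv2
import PIL
import numpy as np
import matplotlib

print('TensorFlow :', tf.__version__)          # expected 1.12
print('OpenCV     :', cv2.__version__)         # expected 3.4.5
print('Pillow     :', PIL.__version__)         # expected 5.2.0
print('NumPy      :', np.__version__)          # expected 1.15.2
print('Matplotlib :', matplotlib.__version__)  # expected 3.0.0
```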

Visualize the original and optimized graphs

One of the easiest ways to do that is to use Netron, available here: https://lutzroeder.github.io/netron/
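Netron can also be installed locally as a pip package (pip install netron) and pointed at the saved graphs from Python; a minimal sketch, assuming the original and optimized graphs were saved under the placeholder file names used above:

```python
# Open the graphs in the browser with the locally installed netron package.
import netron

netron.start('frozen_model.pb')   # view the original frozen graph
# netron.start('trt_model.pb')    # repeat for the TensorRT-optimized graph
```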
