This is a project-based course on optimizing TensorFlow (TF) models for deployment using TensorRT.
- Instructor: Snehan Kekre
- Certificate: Awarded upon completion
- Duration: under 2 hours
By the end of this course, you will achieve the following objectives:
- Optimize TensorFlow models using TensorRT (TF-TRT).
- Optimize deep learning models at FP32, FP16, and INT8 precision using TF-TRT.
- Analyze how tuning TF-TRT parameters impacts performance and inference throughput.
This course is divided into three parts:
- Course Overview: Introductory reading material.
- Optimize TensorFlow Models for Deployment with TensorRT: A hands-on project.
- Graded Quiz: A final assignment required to successfully complete the course.
This hands-on project guides you in optimizing TensorFlow (TF) models for inference with NVIDIA's TensorRT (TRT).
By the end of this project, you will:
- Optimize TensorFlow models using TensorRT (TF-TRT).
- Work with models at FP32, FP16, and INT8 precision, observing how TF-TRT parameters affect performance and inference throughput.
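The conversion workflow these objectives describe can be sketched as follows. This is a minimal sketch assuming TensorFlow 2.x with TensorRT support; the `convert_saved_model` helper is illustrative (not the course's actual code), and the exact `TrtConversionParams` fields vary between TensorFlow versions:

```python
VALID_PRECISIONS = {"FP32", "FP16", "INT8"}

def convert_saved_model(input_dir, output_dir, precision="FP32",
                        calibration_input_fn=None):
    """Convert a TF SavedModel to a TF-TRT graph at the given precision.

    INT8 additionally needs a calibration_input_fn that yields
    representative input batches for range calibration.
    """
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"precision must be one of {sorted(VALID_PRECISIONS)}")
    # Imported lazily so the helper can be defined without TensorRT installed.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=input_dir,
        conversion_params=trt.TrtConversionParams(precision_mode=precision))
    if precision == "INT8":
        # INT8 quantization calibrates activation ranges from sample data.
        converter.convert(calibration_input_fn=calibration_input_fn)
    else:
        converter.convert()
    converter.save(output_dir)
```

For example, `convert_saved_model("inceptionv3_saved_model", "inceptionv3_tftrt_fp16", precision="FP16")` would write an FP16-optimized SavedModel ready for benchmarking (the directory names here are placeholders).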
To complete this project successfully, you should have:
- Competency in Python programming.
- An understanding of deep learning concepts and inference.
- Experience building deep learning models using TensorFlow and its Keras API.
Task | Description |
---|---|
Task 1 | Introduction and Project Overview |
Task 2 | Set up TensorFlow and TensorRT Runtime |
Task 3 | Load Data and Pre-trained InceptionV3 Model |
Task 4 | Create Batched Input |
Task 5 | Load the TensorFlow SavedModel |
Task 6 | Benchmark Prediction Throughput and Accuracy |
Task 7 | Convert TensorFlow SavedModel to TF-TRT Float32 Graph |
Task 8 | Benchmark TF-TRT Float32 |
Task 9 | Convert to TF-TRT Float16 and Benchmark |
Task 10 | Work with TF-TRT INT8 Models |
Task 11 | Convert to TF-TRT INT8 |
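Tasks 4 and 6 above (creating batched input and benchmarking throughput) can be sketched in plain Python. The function names and the dummy `predict_fn` are illustrative assumptions, not the course's actual notebook code:

```python
import time

def make_batches(samples, batch_size):
    """Group a flat list of samples into fixed-size batches (Task 4)."""
    return [samples[i:i + batch_size]
            for i in range(0, len(samples), batch_size)]

def benchmark_throughput(predict_fn, batches):
    """Time predict_fn over all batches and return images/sec (Task 6)."""
    n_images = sum(len(batch) for batch in batches)
    start = time.perf_counter()
    for batch in batches:
        predict_fn(batch)
    elapsed = time.perf_counter() - start
    return n_images / elapsed
```

With a real model, `predict_fn` would wrap a call to the loaded SavedModel's serving signature; comparing the returned images/sec for the native, FP32, FP16, and INT8 graphs is the core of the benchmarking tasks.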
Description | Notebook | Demo |
---|---|---|
Intro to TensorFlow-TensorRT | | HF/Gradio Space |
- Main Course - Coursera.
- Deep Learning Optimization and Deployment Using TensorFlow and TensorRT - NVIDIA DLI.
- Core concepts:
  - Quantization in Signal Processing:
  - Computer Arithmetic:
    - Core:
    - Data types & conversion:
- Maths & Algebra:
  - Tensor (Maths): https://en.wikipedia.org/wiki/Tensor_(intrinsic_definition)
  - Matrix Multiplication: https://en.wikipedia.org/wiki/Matrix_multiplication
  - ML Tensor: https://en.wikipedia.org/wiki/Tensor_(machine_learning)
- Model Zoo: Edge AI Model Zoo.
- Blogs:
  - [2019 June 13] High-Performance Inference with TensorRT Integration.
  - [2019 June 03] High-Performance Inference with TensorRT Integration - TensorFlow Medium.
  - [2018 April 18] Speed Up TensorFlow Inference on GPUs with TensorRT.
  - [2017 April 08] Reduced Precision (FP16, INT8) Inference on Convolutional Neural Networks with TensorRT and NVIDIA Pascal - Chris Gottbrath, NVIDIA (Advanced Spark and TensorFlow Meetup, 2017-05-06).