
TensorFlow profiler

Tracing and profiling tool for TensorFlow. Supports CUDA, HIP, and SYCL builds of TensorFlow.

Requirements

Building the tracers

  1. Go to the tracers folder
  2. Run make (see the sketch below)
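
A minimal build sketch, assuming you start from the repository root:

    cd tracers
    make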

Installing TensorFlow

The first step is to install an instrumented version of TensorFlow:

  1. Check out the lttng branch
  2. Follow the standard instructions to build TensorFlow (a possible sequence is sketched below)
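
A possible sequence, assuming the instrumented fork lives at https://github.com/pzins/tensorflow (the URL and the pip-package target are assumptions based on the standard TensorFlow 1.x build; adapt them to your setup):

    # clone the instrumented fork and switch to the lttng branch (assumed URL)
    git clone https://github.com/pzins/tensorflow.git
    cd tensorflow
    git checkout lttng
    # standard TensorFlow 1.x build and install
    ./configure
    bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    pip install /tmp/tensorflow_pkg/tensorflow-*.whl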

Tracing API

TensorFlow with CUDA

Nothing additional is needed.

TensorFlow with HIP

Install the ROCm platform : https://rocm.github.io/ROCmInstall.html

HSA tracing :

  1. Clone https://github.com/pzins/ROCR-Runtime, check out the lttng branch, and build it
  2. Replace /opt/rocm/hsa/lib/libhsa-runtime64.so with the freshly built libhsa-runtime64.so (see the sketch below)
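
A hedged sketch of the swap (the exact build procedure comes from the ROCR-Runtime documentation; the build path below is a placeholder):

    git clone https://github.com/pzins/ROCR-Runtime.git
    cd ROCR-Runtime
    git checkout lttng
    # build libhsa-runtime64.so following the ROCR-Runtime build instructions (CMake)
    # then back up the original runtime and install the instrumented one
    sudo cp /opt/rocm/hsa/lib/libhsa-runtime64.so /opt/rocm/hsa/lib/libhsa-runtime64.so.orig
    sudo cp path/to/build/libhsa-runtime64.so /opt/rocm/hsa/lib/libhsa-runtime64.so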

It is also possible to profile the HSA API with the interception libraries (described below).

HIP tracing :

  1. Clone https://github.com/pzins/HIP, check out the lttng branch, and build it (a sketch follows)
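
A short sketch, assuming the upstream HIP build procedure (CMake) applies to this fork as well:

    git clone https://github.com/pzins/HIP.git
    cd HIP
    git checkout lttng
    # configure, build, and install following the HIP build instructions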

Asynchronous events

The first possibility is to rebuild an instrumented version of HC.

When rebuilding it is not possible, there are two alternatives:

  1. Parse the log output produced by HC (automated in the scripts)
  2. Use the interception libraries

Interception libraries

These libraries can be preloaded with LD_PRELOAD to collect:

  1. HSA API calls
  2. GPU kernel begin/end events
  3. Performance counters

Build instructions :

  1. Go to tensorflow-profiler/interception-libraries
  2. Run make
  3. The output libraries are placed in tensorflow-profiler/interception-libraries/lib/
  4. Before running your application, add the libraries you want to LD_PRELOAD (see the example below)
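
For example (the library file names below are illustrative; use the ones actually produced in lib/, and my_tf_script.py stands for your own workload):

    cd tensorflow-profiler/interception-libraries
    make
    # preload only the interception libraries you need
    LD_PRELOAD=$PWD/lib/libhsa_interception.so:$PWD/lib/libkernel_interception.so \
        python my_tf_script.py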

Scripts

There are several ways to profile an application.

Automated version

tf_tracer.sh
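
Assuming the script wraps the command to trace (check its header for the exact arguments it expects), a run would look like:

    ./tf_tracer.sh python my_tf_script.py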

Manual version

Use the scripts in scripts/ (a sample session follows the list):

  • start_tracing.sh : starts LTTng tracing
  • stop_tracing.sh : stops LTTng tracing
  • set_env.sh : sets up the environment before tracing; needed when using HIP/ROCm
  • post_process.sh : post-processing script
    • Moves every asynchronous event back to its correct position
    • Matches each GPU kernel with the corresponding TensorFlow operation
  • trace_analysis : prints textual statistics for a trace
  • perfcounters_analysis.py : parses RCP performance counters and matches the values with the callstack trace
  • perfcounters_interception_analysis.py : parses the performance counter trace obtained with the interception libraries and writes a CSV file
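
A possible manual session, assuming the scripts are run from the repository root and python my_tf_script.py is the workload to trace (check each script for the arguments it expects):

    source scripts/set_env.sh      # only needed for HIP/ROCm
    scripts/start_tracing.sh       # start LTTng tracing
    python my_tf_script.py         # run the instrumented TensorFlow workload
    scripts/stop_tracing.sh        # stop LTTng tracing
    scripts/post_process.sh        # reposition async events, match kernels to TF ops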

Distributed TensorFlow

Only basic "in model" parallelism is supported: one master and one worker, with the graph split across the two machines.

The instrumentation is available only with:

  • TensorFlow 1.0 (ROCm/HIP)
  • TensorFlow 1.3 (ROCm/HIP)
  • TensorFlow 1.6 (SYCL)

Scripts :

  1. fabfile.py : Fabric file to automate an execution
  2. scripts/grpc_worker.sh : used by the fabfile on the worker machine
  3. scripts/grpc_master.sh : used by the fabfile on the master machine (a sample run follows)
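
A hedged sketch of a manual distributed run (host names and any script arguments are assumptions; the fabfile automates these steps):

    # on the worker machine
    scripts/grpc_worker.sh

    # on the master machine
    scripts/grpc_master.sh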