# TensorFlow Timeline Analysis on Model Zoo Benchmarks: Intel-Optimized vs. Stock TensorFlow

This Jupyter notebook helps you evaluate the performance benefits of Intel-optimized TensorFlow at the level of individual TensorFlow operations, using several pre-trained models from the Intel Model Zoo. It produces a bar chart like the picture below that compares performance per TensorFlow operation. The red horizontal line represents the performance of the operations in Stock TensorFlow, and the blue bars represent the speedup of the same operations in Intel TensorFlow. Operations marked "mkl-True" are accelerated by MKL-DNN (also known as oneDNN), and you should see a good speedup for them.
> NOTE: You first need to generate TensorFlow Timeline JSON files with another Jupyter notebook, such as benchmark_perf_comparison, before running this notebook.

<img src="images\compared_tf_op_duration_ratio_bar.png" width="700">

The notebook also produces two pie charts, like the picture below, that show each TensorFlow operation's share of the total elapsed time.
These charts make it easy to spot the hotspot operations in both Stock and Intel TensorFlow.

<img src="images\compared_tf_op_duration_pie.png" width="700">

# Get Platform Information 

In [None]:
from profiling.profile_utils import PlatformUtils
plat_utils = PlatformUtils()
plat_utils.dump_platform_info()

#  Section 1: TensorFlow Timeline Analysis
## Prerequisites

In [None]:
!pip install cxxfilt

%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf

In [None]:
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1500)

## List out the Timeline folders

First, list out all Timeline folders from previous runs.

In [None]:
import os

keyword = "Timeline"
# Collect the sub-folders of the current directory whose names contain the keyword.
result = sorted(
    name for name in os.listdir(".")
    if os.path.isdir(name) and keyword in name
)

for index, folder in enumerate(result):
    print(" %d : %s " % (index, folder))

## Select a Timeline folder from previous runs
#### ACTION: Please select one Timeline folder and change FdIndex accordingly

In [None]:
FdIndex = 0

List out all Timeline JSON files inside the selected Timeline folder.

In [None]:
import os
TimelineFd = result[FdIndex]
print(TimelineFd)
# Keep only the .json files in the selected folder.
datafiles = [TimelineFd + os.sep + x for x in os.listdir(TimelineFd) if x.endswith('.json')]
print(datafiles)
# Compare integers with ==, not with the identity operator 'is'.
if len(datafiles) == 0:
    print("ERROR! No JSON file in the selected folder. Please select another folder.")
elif len(datafiles) == 1:
    print("WARNING! There is only 1 JSON file in the selected folder. Please select another folder to proceed with Section 1.2.")

## Section 1.1: Performance Analysis for one TF Timeline result
### Step 1: Pick one of the Timeline files
#### List out all the Timeline files first


In [None]:
index = 0
for file in datafiles:
    print(" %d : %s " %(index, file))
    index+=1

#### ACTION: Please select one timeline json file and change file_index accordingly

In [None]:
## USER INPUT
file_index=0

fn = datafiles[file_index]
tfile_prefix = fn.split('_')[0]
# Slice off the prefix and the '_' that follows it.
tfile_postfix = fn[len(tfile_prefix) + 1:]
fn

### Step 2: Parse timeline into pandas format

In [None]:
from profiling.profile_utils import TFTimelinePresenter
tfp = TFTimelinePresenter(True)
timeline_pd = tfp.postprocess_timeline(tfp.read_timeline(fn))
timeline_pd = timeline_pd[timeline_pd['ph'] == 'X']
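
The filter on `ph == 'X'` keeps only "complete" events. TensorFlow timelines are written in the Chrome trace-event format, where an `X` event carries a start timestamp `ts` and a duration `dur`, both in microseconds, while other phases such as `M` hold metadata. The sketch below shows the same filter on a hand-written trace using only the standard library. The trace content and the helper name `load_complete_events` are made up for illustration, and this is not how `TFTimelinePresenter` is implemented.

```python
import json

# Hypothetical, hand-written trace in Chrome trace-event format.
trace_text = json.dumps({
    "traceEvents": [
        {"ph": "M", "name": "process_name", "pid": 0, "args": {"name": "CPU"}},
        {"ph": "X", "name": "Conv2D", "pid": 0, "tid": 0, "ts": 100, "dur": 40,
         "args": {"name": "conv1", "op": "Conv2D"}},
        {"ph": "X", "name": "Relu", "pid": 0, "tid": 0, "ts": 140, "dur": 5,
         "args": {"name": "relu1", "op": "Relu"}},
    ]
})


def load_complete_events(text):
    """Return only the complete ('X') events from a trace JSON string."""
    events = json.loads(text)["traceEvents"]
    return [event for event in events if event.get("ph") == "X"]


complete = load_complete_events(trace_text)
for event in complete:
    # Each complete event names its op and its duration in microseconds.
    print(event["args"]["op"], event["dur"])
```

The metadata event is dropped, so only the two operation events remain for the per-op analysis.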

### Step 3: Sum up the elapsed time of each TF operation

In [None]:
tfp.get_tf_ops_time(timeline_pd,fn,tfile_prefix)
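
Summing elapsed time per operation is, conceptually, a group-by over the operation column. The following is a minimal pandas sketch of that idea, not the body of `get_tf_ops_time`. The rows are invented, `arg_op` mirrors the column name used in this notebook, and `dur` is assumed to hold each event's duration in microseconds.

```python
import pandas as pd

# Hypothetical events: one row per executed op instance.
events = pd.DataFrame({
    "arg_op": ["Conv2D", "Relu", "Conv2D", "MatMul"],
    "dur": [40, 5, 60, 30],
})

# Total duration per op type, largest first, so hotspots appear at the top.
op_time = (
    events.groupby("arg_op")["dur"]
    .sum()
    .sort_values(ascending=False)
)
print(op_time)
```

Sorting in descending order puts the most expensive operation types first, which is the same ordering the bar and pie charts emphasize.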

### Step 4: Draw a bar chart for elapsed time of TF ops 

In [None]:
filename= tfile_prefix +'_tf_op_duration_bar.png'
title_=tfile_prefix +'TF : op duration bar chart'
ax=tfp.summarize_barh(timeline_pd, 'arg_op', title=title_, topk=50, logx=True, figsize=(10,10))
tfp.show(ax,'bar')

### Step 5: Draw a pie chart for total time percentage of TF ops 

In [None]:
filename= tfile_prefix +'_tf_op_duration_pie.png'
title_=tfile_prefix +'TF : op duration pie chart'
timeline_pd_known = timeline_pd[ ~timeline_pd['arg_op'].str.contains('unknown') ]
ax=tfp.summarize_pie(timeline_pd_known, 'arg_op', title=title_, topk=50, logx=True, figsize=(10,10))
tfp.show(ax,'pie')
ax.figure.savefig(filename,bbox_inches='tight')

## Section 1.2: Analyze TF Timeline results between Stock and Intel TensorFlow
### Speedup from MKL-DNN among different TF operations

### Step 1: Select one Intel and one Stock TF timeline file for analysis

#### List out the different kinds of timeline files by network topology and data type

In [None]:
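import os
if len(datafiles) == 1:
    print("ERROR! There is only 1 JSON file in the selected folder.")
    print("Please select another Timeline folder from the beginning to proceed with Section 1.2.")
# The first half of datafiles is listed as the available kinds of timeline files.
tindex = len(datafiles) // 2
types = datafiles[:tindex]
for index, t in enumerate(types):
    t = os.path.basename(t)
    # Remove a leading "mkl" or "stock" tag from the file name.
    for tag in ("mkl", "stock"):
        if t.startswith(tag):
            t = t[len(tag):]
    print(" %d : %s " % (index, t))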

#### ACTION: Please select one kind of timeline file and change type_index accordingly

In [None]:
type_index = 0

#### List out the two selected timeline files

In [None]:
selected_datafiles = []
selected_datafiles.append(datafiles[type_index])
selected_datafiles.append(datafiles[type_index + tindex])
print(selected_datafiles)
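
The cell above pairs files by position (`datafiles[type_index]` with `datafiles[type_index + tindex]`), so it relies on the order in which the files were listed. As a hedged alternative, the sketch below pairs files by name instead. It assumes, hypothetically, that the Intel and Stock runs of one model differ only by a leading `mkl_` or `stock_` prefix. The file names and the helper `pair_by_suffix` are illustrative and not part of `profile_utils`.

```python
import os

# Hypothetical file names; the "mkl_" / "stock_" prefixes are an assumption.
files = [
    "Timeline/stock_resnet50_fp32.json",
    "Timeline/mkl_resnet50_fp32.json",
    "Timeline/mkl_mobilenet_int8.json",
    "Timeline/stock_mobilenet_int8.json",
]


def pair_by_suffix(paths, tags=("mkl", "stock")):
    """Group paths by the part of the file name that follows the tag prefix."""
    pairs = {}
    for path in paths:
        name = os.path.basename(path)
        for tag in tags:
            prefix = tag + "_"
            if name.startswith(prefix):
                pairs.setdefault(name[len(prefix):], {})[tag] = path
    return pairs


pairs = pair_by_suffix(files)
for key in sorted(pairs):
    print(key, pairs[key])
```

Pairing by name keeps the Intel and Stock files of one model together regardless of the order returned by `os.listdir`.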

### Step 2: Parse timeline results into CSV files

In [None]:
%matplotlib agg
import os
from profiling.profile_utils import TFTimelinePresenter
tfp = TFTimelinePresenter(True)
for fn in selected_datafiles:
    # Drop the folder part of the path, if any.
    fn_nofd = os.path.basename(fn)
    tfile_name = fn_nofd.split('.')[0]
    tfile_prefix = fn_nofd.split('_')[0]
    # Slice off the prefix and the '_' that follows it.
    tfile_postfix = fn_nofd[len(tfile_prefix) + 1:]
    print(tfile_name)
    timeline_pd = tfp.postprocess_timeline(tfp.read_timeline(fn))
    # Keep only complete ('X') events, which carry a duration.
    timeline_pd = timeline_pd[timeline_pd['ph'] == 'X']
    tfp.get_tf_ops_time(timeline_pd,fn,tfile_prefix)

### Step 3: List out result files among different runs

In [None]:
import os
import pandas as pd
csvfiles = [TimelineFd +os.sep+ x for x in os.listdir(TimelineFd) if '.csv' == x[-4:]]
csvarray=[]
for csvf in csvfiles:
    print(csvf)
    a = pd.read_csv(csvf)
    csvarray.append(a)

a = csvarray[0]
b = csvarray[1]

### Step 4: Merge two CSV files and calculate the speedup accordingly

In [None]:
import os
import pandas as pd
fdir='merged'
if not os.path.exists(fdir):
    os.mkdir(fdir)
    
fpath=fdir+os.sep+'merged.csv'
merged=tfp.merge_two_csv_files(fpath,a,b)
merged
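
To make the speedup concrete: for each operation it is the elapsed time in Stock TensorFlow divided by the elapsed time in Intel TensorFlow, so a value above 1 means the Intel build is faster. The sketch below shows that calculation with a plain pandas merge. The column names `op` and `elapsed` and all the numbers are hypothetical, and the CSV layout that `merge_two_csv_files` actually expects may differ.

```python
import pandas as pd

# Hypothetical per-op totals in microseconds for each build.
stock = pd.DataFrame({"op": ["Conv2D", "Relu", "MatMul"],
                      "elapsed": [200.0, 10.0, 90.0]})
intel = pd.DataFrame({"op": ["Conv2D", "Relu", "MatMul"],
                      "elapsed": [50.0, 10.0, 30.0]})

# Join on the op name; suffixes keep the two elapsed-time columns apart.
merged_sketch = stock.merge(intel, on="op", suffixes=("_stock", "_intel"))

# Speedup > 1 means the Intel build spent less time in that op.
merged_sketch["speedup"] = (
    merged_sketch["elapsed_stock"] / merged_sketch["elapsed_intel"]
)
print(merged_sketch)
```

An inner join, the default for `merge`, keeps only operations present in both runs. Operations that exist in just one build would need `how="outer"` and separate handling of the missing values.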

### Step 5: Draw a bar chart comparing elapsed time of TF ops in stock TF and Intel TF

In [None]:
%matplotlib inline
print(fpath)
tfp.plot_compare_bar_charts(fpath)
tfp.plot_compare_ratio_bar_charts(fpath)

### Step 6: Draw pie charts comparing elapsed time of TF ops in stock TF and Intel TF

In [None]:
tfp.plot_compare_pie_charts(fpath)