# Compare Model Zoo benchmark performance between Intel-optimized and stock TensorFlow

This Jupyter notebook helps you evaluate the performance benefits of Intel-optimized TensorFlow using several pre-trained models from the Intel Model Zoo.
It produces a bar chart like the one below that compares the performance of stock and Intel-optimized TensorFlow.

<img src="images/perf_comparison.png" />

# Get Platform Information 

In [None]:
from profiling.profile_utils import PlatformUtils
plat_utils = PlatformUtils()
plat_utils.dump_platform_info()
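
`PlatformUtils` comes from this notebook's `profiling` package, and its internals are not shown here. As a rough, standard-library-only sketch of the kind of host information such a dump might cover (the function name `summarize_platform` and the chosen fields are assumptions, not the helper's actual output):

```python
import os
import platform

def summarize_platform():
    """Collect a few basic host facts using only the standard library."""
    return {
        "system": platform.system(),          # operating system name
        "machine": platform.machine(),        # hardware architecture string
        "python": platform.python_version(),  # interpreter version
        "logical_cpus": os.cpu_count(),       # may be None if undetermined
    }

info = summarize_platform()
for key, value in info.items():
    print(f"{key}: {value}")
```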

# Section 1: Run the benchmark on the selected Jupyter Kernels

## Step 1: Check TensorFlow version and MKL enablement

In [None]:
import tensorflow as tf
print("We are using TensorFlow version", tf.__version__)
major_version = int(tf.__version__.split(".")[0])
if major_version >= 2:
    from tensorflow.python import _pywrap_util_port
    print("MKL enabled:", _pywrap_util_port.IsMklEnabled())
else:
    print("MKL enabled:", tf.pywrap_tensorflow.IsMklEnabled())

## Step 2: Configure parameters for launch_benchmark.py according to the selected Topology

### Step 2.1: List out the supported topologies

In [None]:
import sys
from profiling.profile_utils import ConfigFile
config = ConfigFile()
sections = config.read_section()
print("Supported topologies: ")
for index, section in enumerate(sections):
    print(" %d: %s " % (index, section))
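
`config.read_section()` returns the topology names that the loop above prints. The sketch below shows one way such names could be listed from an INI-style file with the standard `configparser` module. Whether `ConfigFile` is implemented this way is an assumption, and the section names and keys in `SAMPLE_CONFIG` are made up for illustration.

```python
import configparser

# Hypothetical configuration text; real topology names may differ.
SAMPLE_CONFIG = """
[resnet50 infer fp32]
batch-size = 128

[mobilenet_v1 infer fp32]
batch-size = 100
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONFIG)

# Each section header becomes one selectable topology.
sample_sections = parser.sections()
for index, section in enumerate(sample_sections):
    print(" %d: %s " % (index, section))
```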

### Step 2.2: Pick a topology
#### ACTION: Select one supported topology and set topo_index accordingly

In [None]:
# User picks a topology by its index in the list printed above
## USER INPUT
topo_index=0

#### List out the selected topology name

In [None]:
if not 0 <= topo_index < len(sections):
    print("ERROR! Please set topo_index to a value between 0 and", len(sections) - 1)
else:
    topology_name=sections[topo_index]
    print(topology_name)

### Step 2.3: User can also manually set batch size and number of threads

In [None]:
import psutil
import subprocess
import os
cpu_count = psutil.cpu_count(logical=False)
cpu_socket_count = int(subprocess.check_output('cat /proc/cpuinfo | grep "physical id" | sort -u | wc -l', shell=True))
# psutil.cpu_count(logical=False) counts physical cores across all sockets
print("Physical core count:", cpu_count, "\nSocket count:", cpu_socket_count)
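
The socket count above comes from the distinct `physical id` values in `/proc/cpuinfo`. As a minimal sketch, assuming the usual Linux layout where each processor block lists `physical id` before `core id`, the same file can also yield the number of physical cores per socket. The helper name `count_sockets_and_cores` and the abbreviated sample text are made up for illustration.

```python
def count_sockets_and_cores(cpuinfo_text):
    """Return (socket_count, physical_cores_per_socket) from /proc/cpuinfo text."""
    cores_by_socket = {}
    socket_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "physical id":
            socket_id = value
            cores_by_socket.setdefault(socket_id, set())
        elif key == "core id" and socket_id is not None:
            # Hyper-threads share a core id, so the set removes duplicates.
            cores_by_socket[socket_id].add(value)
    socket_count = len(cores_by_socket)
    cores_per_socket = max((len(c) for c in cores_by_socket.values()), default=0)
    return socket_count, cores_per_socket

# Abbreviated, made-up excerpt: 2 sockets, 2 cores each, one hyper-thread.
SAMPLE_CPUINFO = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 1
core id : 0

processor : 3
physical id : 1
core id : 1

processor : 4
physical id : 0
core id : 0
"""

sample_counts = count_sockets_and_cores(SAMPLE_CPUINFO)
print(sample_counts)
```

On a real Linux host, the text could be read with `open("/proc/cpuinfo").read()`.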

#### ACTION: Users can change the following values to see how they affect performance
1. thread_number: the value will apply to the num_cores parameter in launch_benchmark.py
2. utilized_socket_number: the value will apply to the socket-id parameter in launch_benchmark.py
3. num_inter_threads: the value will apply to the num-inter-threads parameter in launch_benchmark.py
4. num_intra_threads: the value will apply to the num-intra-threads parameter in launch_benchmark.py
5. batch_size: the value will apply to the batch_size parameter in launch_benchmark.py
6. log_folder: the folder where the logs are stored.

In [None]:
## USER INPUT
thread_number=cpu_count 
utilized_socket_number=1 #cpu_socket_count
num_inter_threads = utilized_socket_number
num_intra_threads = thread_number
batch_size=32
log_folder=os.getcwd() + os.sep + "logs"
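
To make the mapping in the list above concrete, here is a minimal sketch of how these inputs could be turned into command-line tokens. The helper `build_benchmark_args` is hypothetical, the exact flag spellings are assumptions based on the parameter names listed above, and how `config.get_parameters` actually transforms the values (for example the socket number) is not shown in this notebook.

```python
def build_benchmark_args(batch_size, thread_number, socket_number,
                         num_inter_threads, num_intra_threads, log_folder=""):
    """Turn the user inputs into a flat list of command-line tokens.

    Flag spellings are assumed, not taken from launch_benchmark.py.
    """
    args = [
        "--batch-size", str(batch_size),
        "--num-cores", str(thread_number),
        "--socket-id", str(socket_number),
        "--num-inter-threads", str(num_inter_threads),
        "--num-intra-threads", str(num_intra_threads),
    ]
    if log_folder:
        args += ["--output-dir", log_folder]
    return args

# Illustrative values only.
example_args = build_benchmark_args(32, 28, 1, 1, 28, log_folder="logs")
print(" ".join(example_args))
```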

### Step 2.4: Check mandatory file "launch_benchmark.py"

#### ACTION: Users should change the value of os.environ['ModelZooRoot'] according to their environment

In [None]:
import os
# Users should change ModelZooRoot path according to their environment
## USER INPUT
current_path = os.getcwd()
os.environ['ModelZooRoot'] = current_path + "/../../../"
os.environ['ProfileUtilsRoot'] = os.environ['ModelZooRoot'] + "docs/notebooks/perf_analysis/profiling/"
print(os.environ['ModelZooRoot'])
print(os.environ['ProfileUtilsRoot'])

#### Check that the mandatory Python scripts exist after users assign ModelZooRoot and ProfileUtilsRoot

In [None]:
import os
current_path = os.getcwd()
benchmark_path = os.environ['ModelZooRoot'] + "benchmarks/launch_benchmark.py"
if os.path.exists(benchmark_path):
    print(benchmark_path)
else:
    print("ERROR! Can't find benchmark script!")

profile_utils_path = os.environ['ProfileUtilsRoot'] + "profile_utils.py"
if os.path.exists(profile_utils_path):
    print(profile_utils_path)
else:
    print("ERROR! Can't find profile_utils script!")
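
When more than two files need checking, the same existence test can be collected into one helper that reports everything that is missing at once. This is only a sketch: `find_missing` is a hypothetical name, and the demonstration runs in a throwaway directory rather than a real Model Zoo checkout.

```python
import os
import tempfile

def find_missing(root, relative_paths):
    """Return the relative paths that do not exist under root."""
    return [p for p in relative_paths
            if not os.path.exists(os.path.join(root, p))]

# Demonstration in a temporary directory that contains only one of the files.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "benchmarks"))
    open(os.path.join(root, "benchmarks", "launch_benchmark.py"), "w").close()
    missing = find_missing(root, ["benchmarks/launch_benchmark.py",
                                  "profiling/profile_utils.py"])

print("Missing:", missing)
```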

### Step 2.6: Prepare pre-trained model and model parameters for running the benchmark
1. Get related parameters according to selected topology
2. Get pretrained model if needed

In [None]:
# Get the parameters
configvals=config.read_config(topology_name)

# Get the pre-trained model file
if config.wget != '' and config.in_graph == '':
    pretrain_model_path = config.download_pretrained_model(current_path=current_path)
    config.in_graph = pretrain_model_path 
    configvals.append("--in-graph")
    configvals.append(pretrain_model_path)

# Set output-dir folder
if log_folder != '':
    configvals.append("--output-dir")
    configvals.append(log_folder)

params = config.get_parameters(topology_name, configvals,
                   batch_size=batch_size, thread_number=thread_number, socket_number=utilized_socket_number,
                   num_inter_threads=num_inter_threads, num_intra_threads=num_intra_threads)

sys.argv=[benchmark_path]+params
print(sys.argv)

### Step 2.7: Create a CSV file to log the performance numbers

In [None]:
from profiling.profile_utils import PerfPresenter
job_type = 'inference'
csv_fname=job_type+'_'+topology_name.replace(' ', '')+'.csv'
print(csv_fname)
perfp=PerfPresenter()
perfp.create_csv_logfile(job_type, csv_fname)
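
The CSV file collects one throughput number per run so that runs from different kernels can be compared later. The sketch below shows the general append-a-row pattern with the standard `csv` module. The column names, the `append_result` helper, and the numbers are all made up, and the actual layout written by `PerfPresenter` may differ.

```python
import csv
import os
import tempfile

FIELDS = ["label", "throughput"]  # hypothetical column names

def append_result(csv_path, label, throughput):
    """Append one result row, writing the header first if the file is new."""
    is_new = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({"label": label, "throughput": throughput})

# Demonstration with illustrative numbers in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_csv = os.path.join(tmp, "inference_demo.csv")
    append_result(demo_csv, "stock", 100.0)
    append_result(demo_csv, "intel", 250.0)
    with open(demo_csv, newline="") as f:
        rows = list(csv.DictReader(f))

print(rows)
```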

## Step 3:  Run the benchmark 

> NOTE: The cell below patches the model script to enable the TensorFlow timeline, then unpatches it after the model completes its training or inference.

In [None]:
# patch related model script
repo_path = os.environ['ModelZooRoot'] #current_path + os.sep + "../../"
config.patch_model_to_enable_timeline(repopath=repo_path)

try:
    # run the benchmark with the patch
    import sys
    benchmark_path = os.environ['ModelZooRoot']+os.sep+"benchmarks/"
    sys.path.append(benchmark_path)
    from launch_benchmark import LaunchBenchmark

    util = LaunchBenchmark()
    util.main()
finally:
    # unpatch related model script, even if the benchmark fails
    config.unpatch_model_to_enable_timeline(model_path=repo_path+'/models/')
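
The patch, run, unpatch sequence can also be wrapped in a context manager so that the restore step always runs, even when the benchmark raises. The sketch below is generic: `patched` is a hypothetical helper, and the lambdas only record events instead of touching any model script.

```python
from contextlib import contextmanager

@contextmanager
def patched(apply_patch, remove_patch):
    """Apply a patch, hand control to the caller, and always remove the patch."""
    apply_patch()
    try:
        yield
    finally:
        remove_patch()

events = []
try:
    with patched(lambda: events.append("patch"),
                 lambda: events.append("unpatch")):
        events.append("run")
        raise RuntimeError("simulated benchmark failure")
except RuntimeError:
    events.append("error handled")

print(events)
```

In this notebook, the two callables would wrap `config.patch_model_to_enable_timeline(repopath=repo_path)` and `config.unpatch_model_to_enable_timeline(model_path=repo_path+'/models/')`.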

## Step 4: Parse output for performance number

In [None]:
# identify the path of the latest log file
configvals=config.read_config(topology_name)
import os
used_logpath = None
log_files = [os.path.join(log_folder, f) for f in os.listdir(log_folder) if f.endswith(".log")]
if log_files:
    # take the most recently modified log and rename it so later runs skip it
    logpath = max(log_files, key=os.path.getmtime)
    used_logpath = logpath + ".old"
    os.rename(logpath, used_logpath)
    print(used_logpath)

if used_logpath is None:
    print("ERROR! can't find a .log file in", log_folder)
else:
    val = config.throughput_keyword
    throughput = perfp.read_throughput(used_logpath, keyword=val)
    if throughput is not None:
        print(throughput)
        # log the perf number
        perfp.log_infer_perfcsv(0, throughput, csv_fname)
    else:
        print("ERROR! can't find correct performance number from log. please check log for runtime issues")
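
Extracting the throughput amounts to finding the log line that contains the configured keyword and reading the number that follows it. Here is a minimal sketch of that idea. The function name, the sample log lines, and the keyword are assumptions for illustration, and `perfp.read_throughput` may parse logs differently.

```python
import re

def read_throughput_from_lines(lines, keyword):
    """Return the first number after keyword on the last matching line, or None."""
    result = None
    for line in lines:
        if keyword in line:
            tail = line.split(keyword, 1)[1]
            numbers = re.findall(r"[-+]?\d+(?:\.\d+)?", tail)
            if numbers:
                result = float(numbers[0])
    return result

# Made-up log excerpt.
sample_log = [
    "Iteration 10: 0.05 sec",
    "Throughput: 123.4 images/sec",
]
sample_throughput = read_throughput_from_lines(sample_log, "Throughput:")
print(sample_throughput)
```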

#### Optional: print out the log file to check for runtime issues

In [None]:
# the with-statement closes the file once it has been read
with open(used_logpath) as logfile:
    logout = logfile.read()
print(logout)

#### Users should see a new timeline JSON file after running the benchmark
If no new timeline JSON file appears, check that the model script was patched correctly.

In [None]:
!ls -l -h $ModelZooRoot/benchmarks/*.json

## Step 5: Draw the performance comparison diagram
### NOTE: Run Section 1 on each Jupyter kernel before comparing
See docs/notebooks/perf_analysis/README.md for how to switch between Jupyter kernels.

In [None]:
%matplotlib inline
from profiling.profile_utils import PerfPresenter

perfp=PerfPresenter()
        
# inference  throughput
perfp.draw_perf_diag_from_csv(csv_fname,'throughput','throughput (image/sec)', topology_name)
perfp.draw_perf_ratio_diag_from_csv(csv_fname,'throughput','speedup', topology_name)
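
The second chart plots a speedup ratio. As a sketch of the arithmetic, assuming speedup is defined as each throughput divided by the stock TensorFlow throughput (the notebook does not spell out the definition, and the numbers below are made up):

```python
def speedup(baseline, measured):
    """Return measured / baseline, rejecting a non-positive baseline."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return measured / baseline

# Illustrative throughput values in images per second.
sample_results = {"stock": 100.0, "intel": 250.0}
ratios = {name: speedup(sample_results["stock"], value)
          for name, value in sample_results.items()}
print(ratios)
```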

## Step 6: Gather all generated TensorFlow timeline JSON files
Move the timeline JSON files from the benchmarks folder into a Timeline folder whose name carries a timestamp.
These timeline files will be analyzed in another Jupyter notebook.

In [None]:
!mkdir Timeline; mv $ModelZooRoot/benchmarks/*.json Timeline;mv Timeline Timeline_$(date +%m%d%H%M)
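
The shell one-liner above relies on `mkdir`, `mv`, and `date`. A portable Python equivalent might look like the sketch below. The helper `collect_timelines` is a hypothetical name, and the demonstration uses temporary directories instead of the real benchmarks folder.

```python
import glob
import os
import shutil
import tempfile
from datetime import datetime

def collect_timelines(src_dir, dest_parent):
    """Move every *.json file in src_dir into a timestamped Timeline folder."""
    stamp = datetime.now().strftime("%m%d%H%M")  # same format as date +%m%d%H%M
    dest = os.path.join(dest_parent, "Timeline_" + stamp)
    os.makedirs(dest, exist_ok=True)
    moved = []
    for path in sorted(glob.glob(os.path.join(src_dir, "*.json"))):
        moved.append(shutil.move(path, dest))
    return dest, moved

# Demonstration with throwaway directories and empty files.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as out:
    open(os.path.join(src, "timeline_demo.json"), "w").close()
    open(os.path.join(src, "notes.txt"), "w").close()
    dest, moved = collect_timelines(src, out)
    moved_names = [os.path.basename(p) for p in moved]
    leftover = sorted(os.listdir(src))
    dest_name = os.path.basename(dest)

print(dest_name, moved_names, leftover)
```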