# Theoretical Analysis
> Theoretical Analysis (level 0) for all CNN topologies and hardware platforms

- toc: true 
- badges: true
- comments: true
- categories: [Rooflines,MNIST,ImageNet,CIFAR-10]
- image: images/roofline.png

In [1]:
#hide
import numpy as np
import pandas as pd

pd.options.display.max_rows = 500 # show up to 500 rows when displaying a DataFrame

import altair as alt
W = 600
H = 480

# Introduction

This page presents a theoretical analysis of both the hardware platforms and the CNN topologies.
The following three tables give a general overview of all CNNs and hardware platforms included in our experiments.

# Tables

### CNNs and Their Accuracy Over All Pruning and Quantization Variants

The table below gives a complete overview of all CNNs included in the experiments and their accuracy for every pruning and quantization variant.

In [2]:
#hide_input
%run scripts/script_tables.py  #run the tables script
tableOverviewExperiments(['data/cnn_topologies_accuracy.csv'])

CNN,INT2,INT4,INT8,FP16,FP32
,top1 (top5) [%],top1 (top5) [%],top1 (top5) [%],top1 (top5) [%],top1 (top5) [%]
GoogLeNetv1,nm,nm,69.24 (88.45),66.93 (87.83),66.96 (87.84)
MobileNetv1,nm,nm,69.57 (87.71),nm,nm
EfficientNet small,nm,nm,77,nm,nm
EfficientNet medium,nm,nm,78.6,nm,nm
EfficientNet large,nm,nm,80.2,nm,nm
ResNet-50 100%,nm,nm,73.29 (91.26),75.14 (92.12),75.15 (92.11)
ResNet-50 80%,nm,nm,73.30 (91.40),nm,nm
ResNet-50 50%,nm,nm,69.49 (91.00),nm,nm
ResNet-50 30%,nm,nm,68.83 (90.16),nm,nm
CNV 100%,86.86,87.4,nm,87.02,87.06


In [3]:
df = pd.read_csv('data/cnn_topologies_accuracy.csv')
df1 = pd.DataFrame()
columns = (df.loc[:, df.columns != ' ']).columns  # all columns except the first one (network names, header ' ')
for column in columns:
    df_ = pd.melt(df, id_vars=[' '], value_vars=column)  # long format: network, quantization, accuracy
    df1 = pd.concat([df1, df_])
df1.columns = ['net_prun_quant', 'quant', 'top1']  # set new column names
df1 = df1[df1['top1'] != 'top1 (top5) [%]']  # drop the units row that came from the second header line
df1 = df1.reset_index()
df1.net_prun_quant = df1.net_prun_quant + ' ' + df1.quant
df1 = df1.drop(columns=['index','quant'])
df1 = df1[df1.top1!='nm'] # take all 'nm' out
df1['top1'] = df1['top1'].str.split(' ').str[0] #take top5 acc out
df1['net_prun_quant'] = df1['net_prun_quant'].str.replace(' ','_')
df1

Unnamed: 0,net_prun_quant,top1
9,CNV_100%_INT2,86.86
10,CNV_50%_INT2,84.29
11,CNV_25%_INT2,79.89
12,CNV_12.5%_INT2,73.64
13,MLP_100%_INT2,98.75
14,MLP_50%_INT2,98.49
15,MLP_25%_INT2,98.04
16,MLP_12.5%_INT2,96.85
26,CNV_100%_INT4,87.4
27,CNV_50%_INT4,84.88


In [4]:
df2 = pd.read_csv("data/cleaned_csv/performance_prediction_cifar10.csv")
df3 = pd.read_csv("data/cleaned_csv/performance_prediction_imagenet.csv")
df4 = pd.read_csv("data/cleaned_csv/performance_prediction_mnist.csv")
df2= pd.concat([df2, df3, df4])

In [5]:
df2.head(500)

Unnamed: 0,y,x,values
0,Ultra96-INT8,CNV_100%_INT2,
1,ZCU104-DPU-INT8,CNV_100%_INT2,
2,ZCU102-DPU-INT8,CNV_100%_INT2,
3,ZCU104-FINN-INT2,CNV_100%_INT2,12457.94043
4,ZCU104-FINN-INT4,CNV_100%_INT2,
5,ZCU104-BISMO-INT2,CNV_100%_INT2,12457.94043
6,ZCU104-BISMO-INT4,CNV_100%_INT2,
7,TX2-maxn-FP16,CNV_100%_INT2,
8,TX2-maxn-FP32,CNV_100%_INT2,
9,TX2-maxp-FP16,CNV_100%_INT2,


In [6]:
df2['x'] = df2['x'].str.replace(' ','_')
df2['x']= df2['x'].str.replace('-','_')
df2 = df2[df2['values'].notna()]
df2.columns=['y','net_prun_quant','values']

In [7]:
df1.head(200)

Unnamed: 0,net_prun_quant,top1
9,CNV_100%_INT2,86.86
10,CNV_50%_INT2,84.29
11,CNV_25%_INT2,79.89
12,CNV_12.5%_INT2,73.64
13,MLP_100%_INT2,98.75
14,MLP_50%_INT2,98.49
15,MLP_25%_INT2,98.04
16,MLP_12.5%_INT2,96.85
26,CNV_100%_INT4,87.4
27,CNV_50%_INT4,84.88


In [8]:
df=pd.merge(df1, df2, on='net_prun_quant', how='outer')
df.columns = ['net_prun_quant', 'top1', 'hardw', 'fps']
df = df[df['fps'].notna()]
df = df[df['top1'].notna()]
df.head(500)

Unnamed: 0,net_prun_quant,top1,hardw,fps
0,CNV_100%_INT2,86.86,ZCU104-FINN-INT2,12457.94043
1,CNV_100%_INT2,86.86,ZCU104-BISMO-INT2,12457.94043
2,CNV_50%_INT2,84.29,ZCU104-FINN-INT2,49331.2
3,CNV_50%_INT2,84.29,ZCU104-BISMO-INT2,49331.2
4,CNV_25%_INT2,79.89,ZCU104-FINN-INT2,201600.0
5,CNV_25%_INT2,79.89,ZCU104-BISMO-INT2,201600.0
6,CNV_12.5%_INT2,73.64,ZCU104-FINN-INT2,638592.0
7,CNV_12.5%_INT2,73.64,ZCU104-BISMO-INT2,638592.0
8,MLP_100%_INT2,98.75,ZCU104-FINN-INT2,7680.0
9,MLP_100%_INT2,98.75,ZCU104-BISMO-INT2,7680.0


In [9]:
#cnv_df = df[df.net_prun_quant.isin(['CNV'])] 
#mlp_df = df[df.net_prun_quant.isin(['MLP'])]
#rn50_df = df[df.net_prun_quant.isin(['GNv1','ResNet50','MobileNetv1'])]

In [10]:
network = df['net_prun_quant'].str.split('_').str[0]  # network name is the prefix before the first '_'
cnv_df = df[network == 'CNV']
mlp_df = df[network == 'MLP']
rn50_df = df[network == 'ResNet50']

In [11]:
# Build the combined label on a new DataFrame rather than assigning into a slice of df
cnv_df = cnv_df.assign(net_prun_quant=cnv_df.net_prun_quant + '_' + cnv_df.hardw)
cnv_df = cnv_df.drop(columns=['hardw'])
cnv_df.head(200)


Unnamed: 0,net_prun_quant,top1,fps
0,CNV_100%_INT2_ZCU104-FINN-INT2,86.86,12457.94043
1,CNV_100%_INT2_ZCU104-BISMO-INT2,86.86,12457.94043
2,CNV_50%_INT2_ZCU104-FINN-INT2,84.29,49331.2
3,CNV_50%_INT2_ZCU104-BISMO-INT2,84.29,49331.2
4,CNV_25%_INT2_ZCU104-FINN-INT2,79.89,201600.0
5,CNV_25%_INT2_ZCU104-BISMO-INT2,79.89,201600.0
6,CNV_12.5%_INT2_ZCU104-FINN-INT2,73.64,638592.0
7,CNV_12.5%_INT2_ZCU104-BISMO-INT2,73.64,638592.0
20,CNV_100%_INT4_ZCU104-FINN-INT4,87.4,6228.970213
21,CNV_100%_INT4_ZCU104-BISMO-INT4,87.4,6228.970213


In [12]:
%run scripts/altair_plots.py  #run the plot script if it wasn't previously run
pareto_graph(df= cnv_df, 
             groupcol= 'net_prun_quant', 
             xcol= 'fps', 
             ycol= 'top1', 
             W= W, 
             H= H, 
             title= "CNV Classification Design Space: Accuracy versus Performance")
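
The implementation of `pareto_graph` lives in `scripts/altair_plots.py` and is not shown on this page. As a rough, independent sketch of what a Pareto front over accuracy and throughput means, the snippet below keeps only the design points that no other point beats on both `fps` and `top1`. The helper name `pareto_front` and the row labelled `hypothetical_dominated` are assumptions added for illustration; the other rows are taken from the merged table above.

```python
import pandas as pd

def pareto_front(df, xcol, ycol):
    """Return the rows that are not dominated, assuming higher is better on both axes."""
    # Walk from the highest x to the lowest; a row is kept only if its y
    # is strictly better than every y seen at a higher (or equal) x.
    ordered = df.sort_values([xcol, ycol], ascending=[False, False])
    best_y = float("-inf")
    keep = []
    for idx, y in ordered[ycol].items():
        if y > best_y:
            keep.append(idx)
            best_y = y
    return df.loc[keep].sort_values(xcol)

points = pd.DataFrame({
    "net_prun_quant": ["CNV_100%_INT2", "CNV_50%_INT2", "CNV_25%_INT2",
                       "CNV_12.5%_INT2", "CNV_100%_INT4",
                       "hypothetical_dominated"],  # made-up row, not measured data
    "top1": [86.86, 84.29, 79.89, 73.64, 87.4, 80.0],
    "fps": [12457.94043, 49331.2, 201600.0, 638592.0, 6228.970213, 10000.0],
})

front = pareto_front(points, xcol="fps", ycol="top1")
print(front)
```

Note that `top1` is numeric here, whereas the notebook's `df1` stores it as strings after the `str.split` step, so a conversion with `astype(float)` would be needed before applying a helper like this to the real data.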

### CNNs and Their Compute and Memory Requirements

The next table shows the compute and memory requirements of all CNNs: the number of operations ([GOPs]), the model size ([ME]) and the operational intensity (OI, in [Ops/Byte]), that is, operations per byte read from or written to memory.

In [13]:
#hide_input
%run scripts/altair_plots.py  #run the heatmaps script
tableOverviewExperiments(['data/cnn_topologies_compute_memory_requirements.csv'])

CNN,Total OPs,Total Model Size,OI (INT2),OI (INT4),OI (INT8),OI (FP16),OI (FP32)
,GOPs,[ME],[Ops/Byte],[Ops/Byte],[Ops/Byte],[Ops/Byte],[Ops/Byte]
GoogLeNetv1,3.1,6.0,2093.97,1046.99,523.49,261.75,130.87
MobileNetv1,1.1,4.2,1075.47,537.74,268.87,134.43,67.22
ResNet-50 100%,7.7,25.5,1210.84,605.42,302.71,151.36,75.68
ResNet-50 80%,6.5,23.7,1086.59,543.3,271.65,135.82,67.91
ResNet-50 50%,3.8,15.8,949.85,474.93,237.46,118.73,59.37
ResNet-50 30%,2.5,10.1,970.16,485.08,242.54,121.27,60.64
EfficientNet Edge L,4.7,5.4,3481.48,1740.74,870.37,435.18,217.59
EfficientNet Edge M,7.4,6.9,4289.86,2144.93,1072.46,536.23,268.12
EfficientNet Edge S,19.4,10.6,7313.21,3656.6,1828.3,914.15,457.08
CNV 100%,0.47,6.16,304.95,152.48,76.24,38.12,19.06
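
The OI columns appear to follow directly from the first two columns. The sketch below assumes that [ME] means millions of model elements (weights) and that only the weights count as memory traffic, each stored at the width of the chosen datatype. Both are assumptions inferred from the numbers, not statements from the scripts. Because the table shows rounded GOPs and model sizes, the recomputed values differ slightly from the listed ones.

```python
# Storage size of one element per datatype, in bytes (INT2 = 2 bits = 0.25 byte).
BYTES_PER_ELEMENT = {"INT2": 0.25, "INT4": 0.5, "INT8": 1.0, "FP16": 2.0, "FP32": 4.0}

def operational_intensity(gops, model_size_me, datatype):
    """Operations per byte, assuming only weights are moved to or from memory."""
    ops = gops * 1e9                                   # GOPs -> operations
    bytes_moved = model_size_me * 1e6 * BYTES_PER_ELEMENT[datatype]
    return ops / bytes_moved

# ResNet-50 100%: 7.7 GOPs and 25.5 ME in the table above.
oi_resnet50_int8 = operational_intensity(7.7, 25.5, "INT8")
oi_resnet50_int2 = operational_intensity(7.7, 25.5, "INT2")
print(round(oi_resnet50_int8, 2), round(oi_resnet50_int2, 2))
```

Halving the datatype width doubles the operational intensity, which is the pattern visible across the OI columns.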


### Hardware Platforms

The table below summarizes all included hardware platforms, each with its peak performance for different datatypes (INTx, FPx), its memory bandwidth, its memory capacity and its thermal design power.

In [14]:
#hide_input
%run scripts/altair_plots.py  #run the heatmaps script
tableOverviewExperiments(['data/hardware_platforms.csv'])

Hardware Platforms,INT2,INT4,INT8,FP16,FP32,Memory Bandwidth,Memory Capacity,Power
,[TOP/sec],[TOP/sec],[TOP/sec],[TOP/sec],[TOP/sec],[GBps],[GB],[Watt]
Ultra96-DPU,na,na,0.96,na,na,4.26,2,na
ZCU104-DPU,na,na,4.6,na,na,19.2,4,na
ZCU102-DPU,na,na,6.71,na,na,19.2,4,na
ZCU104-FINN,30.7,8.8,na,na,na,19.2,4,na
ZCU104-BISMO,30.7,8.8,na,na,na,19.2,4,na
TX2 - maxn,na,na,na,1.33,0.67,59.7,8,15
TX2 - maxp,na,na,na,1.15,0.57,59.7,8,15
TX2 - maxq,na,na,na,0.87,0.44,59.7,8,15
TPU-fast,na,na,4,na,na,25.6,1,2
TPU-slow,na,na,2,na,na,25.6,1,2


## Overview of Theoretical Evaluation

See the overview of experiments: https://rcl-lab.github.io/Qutibench_Web/mnist/imagenet/cifar-10/2020/04/30/Overview_of_experiments.html

# Rooflines for all Hardware Platforms and CNNs

Application requirements can be combined with hardware platform characteristics to predict performance using UCB’s roofline model. Assumptions about where the weights, activation tensors, and state of a neural network are stored, together with the size of the datatypes used, allow us to derive the arithmetic intensity of a neural network during inference. Combined with the roofline of a given hardware platform, this indicates whether a neural network will be memory bound or compute bound, and what throughput is theoretically possible.
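
As a minimal sketch of the roofline model itself, attainable performance is the smaller of the platform's peak compute and its memory bandwidth multiplied by the arithmetic intensity. The function names below are illustrative and are not taken from `scripts/altair_plots.py`; the numbers come from the tables above.

```python
def attainable_performance(peak_ops_per_s, bandwidth_bytes_per_s, intensity):
    """Roofline: performance is capped by compute or by memory traffic, whichever is lower."""
    return min(peak_ops_per_s, bandwidth_bytes_per_s * intensity)

def ridge_point(peak_ops_per_s, bandwidth_bytes_per_s):
    """Arithmetic intensity at which a platform switches from memory bound to compute bound."""
    return peak_ops_per_s / bandwidth_bytes_per_s

def bound_type(peak_ops_per_s, bandwidth_bytes_per_s, intensity):
    ridge = ridge_point(peak_ops_per_s, bandwidth_bytes_per_s)
    return "memory bound" if intensity < ridge else "compute bound"

# CNV 100% INT2 (OI 304.95 Ops/Byte) on ZCU104-FINN (30.7 TOP/s INT2, 19.2 GB/s).
cnv_perf = attainable_performance(30.7e12, 19.2e9, 304.95)
cnv_bound = bound_type(30.7e12, 19.2e9, 304.95)

# ResNet-50 100% INT8 (OI 302.71 Ops/Byte) on ZCU104-DPU (4.6 TOP/s INT8, 19.2 GB/s).
rn50_perf = attainable_performance(4.6e12, 19.2e9, 302.71)
rn50_bound = bound_type(4.6e12, 19.2e9, 302.71)

print(cnv_bound, cnv_perf / 1e12, "TOP/s")
print(rn50_bound, rn50_perf / 1e12, "TOP/s")
```

With these inputs, CNV on ZCU104-FINN sits left of the ridge point and is limited by memory bandwidth, while ResNet-50 on ZCU104-DPU sits right of it and is limited by peak compute.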

In [15]:
#hide_input

#first process the following CSVs to get clean, ready-to-plot CSVs
%run scripts/script_load_save_data.py
clean_csv_rooflines(path_topologies='data/topology_details.csv',
                    path_hardware='data/peakPerfBandHardPlatf.csv')

#Now get the cleaned csv, and plot it as a Roofline
%run scripts/altair_plots.py
rooflines(pd.read_csv("data/cleaned_csv/rooflines_hardware_neural_networks.csv"), 'imagenet|mnist|cifar')

# Performance Prediction

The following heatmaps show the theoretical performance of the listed hardware platforms across the machine learning tasks MNIST, ImageNet and CIFAR-10. The metric used for the theoretical performance is inputs per second.
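
The heatmap values appear to be the roofline-limited performance divided by the operations needed per input. The sketch below works under that assumption; the function name `predicted_inputs_per_second` is illustrative and is not the code in `scripts/script_load_save_data.py`. For CNV 100% INT2 on ZCU104-FINN it lands close to the 12457.94 inputs per second listed in the merged table earlier on this page, with the small gap explained by the rounded table inputs.

```python
def predicted_inputs_per_second(peak_ops_per_s, bandwidth_bytes_per_s,
                                intensity, ops_per_input):
    """Theoretical throughput: roofline-limited ops/s divided by ops per inference."""
    attainable = min(peak_ops_per_s, bandwidth_bytes_per_s * intensity)
    return attainable / ops_per_input

# CNV 100% INT2: 0.47 GOPs per input, OI 304.95 Ops/Byte.
# ZCU104-FINN: 30.7 TOP/s (INT2) peak, 19.2 GB/s memory bandwidth.
cnv_int2_fps = predicted_inputs_per_second(
    peak_ops_per_s=30.7e12,
    bandwidth_bytes_per_s=19.2e9,
    intensity=304.95,
    ops_per_input=0.47e9,
)

# Same network at INT4 (OI 152.48 Ops/Byte, 8.8 TOP/s peak).
cnv_int4_fps = predicted_inputs_per_second(8.8e12, 19.2e9, 152.48, 0.47e9)

print(round(cnv_int2_fps, 2), round(cnv_int4_fps, 2))
```

Both cases are memory bound, so halving the operational intensity from INT2 to INT4 halves the predicted throughput.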

In [16]:
#hide
# First process the unfiltered csv and save it as a clean csv, ready to be plotted as a heatmap
%run scripts/script_load_save_data.py
clean_csv_performance_predictions('data/performance_predictions_imagenet_mnist_cifar.csv')

### MNIST

For MNIST, quantization combined with pruning delivers some of the best performance results.

In [17]:
#hide_input
%run scripts/altair_plots.py
#load mnist dataset and plot it
heatmap(pd.read_csv("data/cleaned_csv/performance_prediction_mnist.csv"), 'red', 'Performance Prediction for MNIST')

### ImageNet

For ImageNet, quantization combined with pruning also delivers some of the best performance results.

In [18]:
#hide_input
%run scripts/altair_plots.py  #run the heatmaps script
#load imagenet dataset and plot it
heatmap(pd.read_csv("data/cleaned_csv/performance_prediction_imagenet.csv"), 'lightgrey','Performance Prediction for ImageNet')

### CIFAR-10

Finally, for CIFAR-10, quantization combined with pruning again delivers some of the best performance results.

In [19]:
#hide_input
%run scripts/altair_plots.py  #run the heatmaps script
#load cifar10 dataset and plot it
heatmap(pd.read_csv("data/cleaned_csv/performance_prediction_cifar10.csv"), 'pink','Performance Prediction for CIFAR-10')