This project aims to analyze hardware performance metrics related to tasks executed on FPGA (Field-Programmable Gate Array) systems using High-Level Synthesis (HLS).
Python 3.9.6

To run this project, you need Python installed along with the following libraries: numpy, pandas, seaborn, matplotlib, scikit-learn. Run the script using:

python3 main.py
The following imports are needed:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

You can install these libraries through pip:

pip install numpy pandas seaborn matplotlib scikit-learn
The dataset used in this project was contributed by @georgewzg95 and is loaded from a CSV file. The dataframe contains 1340 entries covering different FPGA tasks, with various hardware performance measurements. Columns include:
clock_speed: The speed at which the task operates. Higher clock speeds usually indicate faster processing times.
alm (Adaptive Logic Modules): The number of adaptive logic modules used by the task. ALMs are the basic building blocks in FPGAs that implement logic functions.
reg (Registers): The number of registers used by the task. Registers are small storage locations within a processor or FPGA that hold data temporarily.
dsp (Digital Signal Processing Units): Specialized hardware units designed to efficiently perform complex mathematical computations, particularly for digital signal processing tasks.
ram (Random Access Memory): A type of computer memory that can be accessed randomly; it is used to store data temporarily while the task is running.
mlab (Memory Lab Usage): Specialized memory blocks within FPGAs used for various memory-related operations.
The purpose of pd.read_csv('data_intel.csv') is to load the CSV file at the specified path; ensure the file path is correct and accessible from the script's location. 'data_intel.csv' is the CSV dataset used in the project.
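As a hedged sketch of the load-and-inspect step — the real data_intel.csv is project-specific, so a small synthetic stand-in with the columns described above (plus an assumed task-name column and made-up values) is used here:

```python
import io
import pandas as pd

# Synthetic stand-in for data_intel.csv; the task-name column and the
# example values are assumptions for illustration only.
csv_text = """name,clock_speed,alm,reg,dsp,ram,mlab
atax_1,210.5,1200,2400,4,8,2
bicg_5,305.0,150,300,0,1,0
gemm_ncubed_6,180.0,9000,18000,64,120,30
"""
HLS_data = pd.read_csv(io.StringIO(csv_text))

# Basic inspection before any modeling: shape, dtypes, and first rows.
print(HLS_data.shape)
print(HLS_data.dtypes)
print(HLS_data.head())
```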
HLS_data = pd.read_csv('data_intel.csv')

Here are some APIs and variables used to analyze the dataset:
X: Feature variables used as input for the first regression model to predict alm.
y: Target variable for the first regression model, representing alm.
X1: Feature variables used as input for the second regression model to predict clock_speed.
y1: Target variable for the second regression model, representing clock_speed.
mse: Mean Squared Error, used to evaluate the performance of the regression models.
r2: R-squared, used to evaluate the goodness of fit of the regression models.
coef and coef1: Coefficients of the linear regression models.
scaled_HLS_features: Standardized features from the dataset used for clustering.
inertia: A measure of how internally coherent the clusters are in K-means clustering.
clusters: Cluster assignments for each task generated by the K-means algorithm.
tasks_by_cluster: A collection of task names grouped by their respective clusters.
The code is organized into four sections, with three ML models created.
- Load and inspect the dataset.
- Visualize relationships between various hardware metrics and 'clock_speed'/'alm'.
- Explore feature distributions using histograms.
vars = HLS_data.columns[3:]
figure, axes = plt.subplots(len(vars), 3, figsize=(15, 30))
sns.set(font_scale=0.8)
figure.subplots_adjust(hspace=0.8, wspace=0.6)
for i, var in enumerate(vars):
    # Column 0: scatter of each metric against alm (skip alm vs itself).
    if var != 'alm':
        sns.scatterplot(x=var, y='alm', data=HLS_data, ax=axes[i, 0], alpha=0.4).set_title(f'{var} vs alm', fontsize=7, weight='bold')
    # Column 1: scatter of each metric against clock_speed (skip clock_speed vs itself).
    if var != 'clock_speed':
        sns.scatterplot(x=var, y='clock_speed', data=HLS_data, ax=axes[i, 1], alpha=0.4).set_title(f'{var} vs clock_speed', fontsize=7, weight='bold')
    # Column 2: distribution of the metric itself.
    sns.histplot(x=var, data=HLS_data, ax=axes[i, 2]).set_title('Distribution', fontsize=7, weight='bold')
plt.show()

Note: the earlier version used `else: continue`, which skipped the remaining plots for the alm and clock_speed rows; the `if` checks above only skip the self-comparison scatter plot.
zvs = HLS_data.select_dtypes(include=[np.number])

- Compute and display correlation coefficients to identify relationships between key features, supporting timing/operation-delay prediction and power estimation.
- Develop and evaluate linear regression models to predict alm (Adaptive Logic Modules) and clock_speed based on other hardware metrics.
- Assess model performance using metrics such as Mean Squared Error (MSE) and R-squared (R²).
- Analyze model coefficients.
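A minimal sketch of the correlation step, using synthetic stand-in data (so the values differ from the project's actual results below):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: alm is constructed to track reg, loosely mimicking
# the strong alm-reg relationship reported in this project.
rng = np.random.default_rng(0)
reg = rng.uniform(100, 10000, 200)
HLS_data = pd.DataFrame({
    'clock_speed': rng.uniform(100, 400, 200),
    'reg': reg,
    'alm': 0.5 * reg + rng.normal(0, 50, 200),
})

# Keep only numeric columns so corr() is well defined, then compute
# pairwise Pearson correlation coefficients.
zvs = HLS_data.select_dtypes(include=[np.number])
corr = zvs.corr()
print(corr.round(3))
```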
ALM Correlation:
- Strong positive correlation with reg (0.981) and mlab (0.909)
- Moderate positive correlation with dsp (0.703)
- Weak negative correlation with clock_speed (-0.362)
Clock Speed Correlation:
- Weak negative correlation with alm (-0.362) and mlab (-0.414)
- Weak correlation with other metrics
ALM Prediction: Model Performance: R² = 0.988, MSE = 24,238,366.47
- Positive influences: reg (0.404), mlab (8.980)
- Negative influences: clock_speed (-23.433), dsp (-96.263), ram (-7.495)
Clock Speed Prediction: Model Performance: R² = 0.382, MSE = 5,969.85
- Positive influences: dsp (0.378), mlab (0.048)
- Negative or near-zero influences: alm (-0.008), reg (0.003), ram (-0.097)
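The regression workflow can be sketched as follows; the data here is synthetic, so the fitted coefficients, MSE, and R² will differ from the results above:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic stand-in for the numeric HLS metrics (column names assumed).
rng = np.random.default_rng(1)
n = 300
clock_speed = rng.uniform(100, 400, n)
reg = rng.uniform(100, 10000, n)
mlab = rng.uniform(0, 50, n)
alm = 0.4 * reg + 9.0 * mlab - 20.0 * clock_speed + rng.normal(0, 100, n)
HLS_data = pd.DataFrame({'clock_speed': clock_speed, 'reg': reg,
                         'mlab': mlab, 'alm': alm})

# X holds the predictors, y the target (alm), matching the variable
# naming used in this project.
X = HLS_data[['clock_speed', 'reg', 'mlab']]
y = HLS_data['alm']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)
mse = mean_squared_error(y_test, pred)   # average squared error
r2 = r2_score(y_test, pred)              # goodness of fit
coef = model.coef_                       # one coefficient per predictor
print(f'MSE={mse:.2f}, R2={r2:.3f}')
print(dict(zip(X.columns, coef.round(3))))
```

The second model (predicting clock_speed) follows the same pattern with X1/y1 in place of X/y.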
- Create three clusters and categorize them using various measures.
- Understand and describe the characteristics of each cluster, providing insights into resource-intensive tasks, performance-optimized tasks, and more.
- Cluster 0: Tasks with moderate resource usage and lower clock speeds; possibly less intensive but still needing considerable resources. Example tasks: atax_1, bicg_14, k2mm_9, bfs_bulk_43, gemm_blocked_39
- Cluster 1: High resource usage across all metrics, likely representing the most demanding tasks that need powerful hardware. Example tasks: gemm_ncubed_6, spmv_crs_27, stencil2D_36
- Cluster 2: High clock speeds but minimal resource consumption, indicating highly efficient tasks that are performance-oriented rather than resource-intensive. Example tasks: atax_3, bicg_5, k3mm_4, syrk_12, bfs_queue_1
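A hedged sketch of the clustering step, using synthetic data with three artificial groups (the real cluster contents and inertia will differ):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Synthetic stand-in: 90 tasks drawn around three centers in
# (clock_speed, alm) space; task names are placeholders.
rng = np.random.default_rng(2)
centers = np.array([[200.0, 2000.0], [150.0, 9000.0], [350.0, 300.0]])
feats = np.vstack([c + rng.normal(0, [10.0, 200.0], (30, 2)) for c in centers])
HLS_data = pd.DataFrame({'name': [f'task_{i}' for i in range(90)],
                         'clock_speed': feats[:, 0], 'alm': feats[:, 1]})

# Standardize so both metrics contribute on a comparable scale.
scaled_HLS_features = StandardScaler().fit_transform(
    HLS_data[['clock_speed', 'alm']])

# Fit K-means with three clusters, as in this project.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
clusters = kmeans.fit_predict(scaled_HLS_features)
inertia = kmeans.inertia_  # within-cluster sum of squared distances

# Group task names by assigned cluster (cf. tasks_by_cluster).
HLS_data['cluster'] = clusters
tasks_by_cluster = HLS_data.groupby('cluster')['name'].apply(list)
print(f'inertia={inertia:.2f}')
print(tasks_by_cluster.apply(len))
```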