# SGEMM GPU Kernel Performance

## Description

The SGEMM GPU Kernel Performance dataset measures the running time of the product of two 2048 x 2048 matrices using a parameterizable SGEMM GPU kernel with 241600 possible parameter combinations. Each combination is represented by a single row in the dataset, with the runtimes of 4 test runs in the last four columns.

The column attributes are as follows:

| Column Name | Notation | Range | Description |
| - | - | - | - |
| mwg | MWG | {16, 32, 64, 128} | Per-matrix 2D tiling at workgroup level |
| nwg | NWG | {16, 32, 64, 128} | Per-matrix 2D tiling at workgroup level |
| kwg | KWG | {16, 32} | Inner dimension of 2D tiling at workgroup level |
| mdimc | MDIMC | {8, 16, 32} | Local workgroup size |
| ndimc | NDIMC | {8, 16, 32} | Local workgroup size |
| mdima | MDIMA | {8, 16, 32} | Local memory shape |
| ndimb | NDIMB | {8, 16, 32} | Local memory shape |
| kwi | KWI | {2, 8} | Kernel loop unrolling factor |
| vwm | VWM | {1, 2, 4, 8} | Per-matrix vector widths for loading and storing |
| vwn | VWN | {1, 2, 4, 8} | Per-matrix vector widths for loading and storing |
| strm | STRM | {0, 1} | Enabling of stride for accessing off-chip memory within a single thread |
| strn | STRN | {0, 1} | Enabling of stride for accessing off-chip memory within a single thread |
| sa | SA | {0, 1} | Per-matrix manual caching of the 2D workgroup tile |
| sb | SB | {0, 1} | Per-matrix manual caching of the 2D workgroup tile |
| run1 | - | - | Run 1 result in milliseconds |
| run2 | - | - | Run 2 result in milliseconds |
| run3 | - | - | Run 3 result in milliseconds |
| run4 | - | - | Run 4 result in milliseconds |

[Source](http://archive.ics.uci.edu/ml/datasets/SGEMM+GPU+kernel+performance)

## Importing the Dataset

In [None]:
import pandas as pd

column_names = ['mwg',
                'nwg',
                'kwg',
                'mdimc',
                'ndimc',
                'mdima',
                'ndimb',
                'kwi',
                'vwm',
                'vwn',
                'strm',
                'strn',
                'sa',
                'sb',
                'run1',
                'run2',
                'run3',
                'run4']

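# Forward slashes avoid the invalid backslash escapes ("\.", "\d", "\s") of the
# original path and work on Windows as well as on other platforms.
# header=0 discards the file's own header row in favour of column_names.
with open("../../datasets/regression/sgemm_product.csv", "r") as dataset_file:
    raw_data = pd.read_csv(dataset_file, delimiter=',', header=0, names=column_names)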

## Preparing the Dataset

In [None]:
# No preparation is needed.
processed_data = raw_data
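
The notebook uses the data as is, but a regression model usually needs a single target rather than four runtime columns. As a minimal sketch, assuming the mean of the four runs is an acceptable target (an assumption, not something the dataset prescribes), the runs could be collapsed as follows. The helper `add_mean_runtime`, the column name `mean_runtime_ms`, and the toy values are hypothetical and not part of the original notebook.

```python
import pandas as pd

# The four runtime columns described in the column table.
RUN_COLUMNS = ["run1", "run2", "run3", "run4"]


def add_mean_runtime(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of frame with a hypothetical 'mean_runtime_ms' column."""
    result = frame.copy()  # leave the caller's DataFrame untouched
    result["mean_runtime_ms"] = result[RUN_COLUMNS].mean(axis=1)
    return result


# Made-up runtimes standing in for two rows of processed_data.
toy_runs = pd.DataFrame({
    "run1": [10.0, 100.0],
    "run2": [12.0, 102.0],
    "run3": [11.0, 101.0],
    "run4": [11.0, 101.0],
})
toy_with_target = add_mean_runtime(toy_runs)
print(toy_with_target)
```

On the real data the same call would be `add_mean_runtime(processed_data)`. The median may be a more robust choice if individual runs contain outliers.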

The following block prints the shape and column datatypes of the processed dataset.

In [None]:
print(processed_data.shape)
print(processed_data.dtypes)
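
Beyond shape and dtypes, the value ranges listed in the column table can be checked directly. The following is a minimal sketch of such a check. The helper `find_out_of_range` and the toy frame are hypothetical. The allowed values are copied from the table in the Description section.

```python
import pandas as pd

# Allowed values per parameter column, copied from the column table.
ALLOWED_VALUES = {
    "mwg": {16, 32, 64, 128},
    "nwg": {16, 32, 64, 128},
    "kwg": {16, 32},
    "mdimc": {8, 16, 32},
    "ndimc": {8, 16, 32},
    "mdima": {8, 16, 32},
    "ndimb": {8, 16, 32},
    "kwi": {2, 8},
    "vwm": {1, 2, 4, 8},
    "vwn": {1, 2, 4, 8},
    "strm": {0, 1},
    "strn": {0, 1},
    "sa": {0, 1},
    "sb": {0, 1},
}


def find_out_of_range(frame: pd.DataFrame) -> dict:
    """Return {column: sorted unexpected values} for columns that violate the table."""
    problems = {}
    for column, allowed in ALLOWED_VALUES.items():
        # tolist() converts NumPy scalars to plain Python ints before comparing.
        unexpected = set(frame[column].unique().tolist()) - allowed
        if unexpected:
            problems[column] = sorted(unexpected)
    return problems


# Tiny made-up frame standing in for processed_data: one row with the smallest
# allowed value per column, one row with the largest.
toy_params = pd.DataFrame([
    {column: min(values) for column, values in ALLOWED_VALUES.items()},
    {column: max(values) for column, values in ALLOWED_VALUES.items()},
])
toy_params.loc[1, "kwi"] = 4  # deliberately invalid value to exercise the check
print(find_out_of_range(toy_params))
```

Calling `find_out_of_range(processed_data)` on the loaded dataset should return an empty dictionary if every parameter column matches the documented ranges.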