SA-DS

Description

The SA-DS project aims to automate the generation of AI accelerators by introducing a specialized dataset tailored to Deep Neural Network (DNN) accelerators. The dataset consists of Scala files used to generate Gemmini hardware accelerators. Due to GitHub policies, only the first 1000 data points are visible in the repository folders, but the zip files with similar names contain the complete dataset. The data is also provided in CSV and JSON formats, which facilitates training, fine-tuning, and multi-shot prompting of Large Language Models (LLMs). For a comprehensive understanding of the SA-DS dataset, please refer to the following paper: SA-DS.
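Since the complete dataset is only available in the zip archives, a first step is usually to unpack one of them. The following is a minimal Python sketch of that step, not part of the official tooling: the function name `extract_archive` and the output folder are assumptions made for illustration, and the sketch assumes nothing about the layout inside the archive beyond it being a regular zip file.

```python
import zipfile
from collections import Counter
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract a dataset archive and count the extracted files by extension.

    Hypothetical helper for illustration; it is not shipped with SA-DS.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Tally files by suffix (e.g. ".scala", ".csv", ".json") as a sanity check.
    return Counter(p.suffix.lower() for p in dest.rglob("*") if p.is_file())
```

For example, `extract_archive("SA_DS_zip.zip", "SA_DS_full")` would unpack the Scala archive from this repository into a folder named `SA_DS_full` (an arbitrary name) and return the number of files per extension, which can be compared against the 1000 data points visible on GitHub.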

Getting Started

This section provides a quick start guide for utilizing the SA-DS dataset along with its associated configurations.

Prerequisites

Ensure Docker is installed on your system for this setup.

Installation

Using Docker

Pull and Run the Docker Image: Execute the following command, which downloads the required Docker image if it is not already present and opens a bash shell inside a new container:
```
sudo docker run -it --privileged deepakvungarala/sa_ds:v_1 bash
```
Navigate to Gemmini Configurations: Within the Docker container, navigate to the Gemmini configurations directory:
```
cd chipyard/generators/gemmini/configs/
```
This directory contains a Python script named autogemm.py, which is used to verify the data points in the dataset.
Replace Configuration Files: Replace GemminiCustomConfigs.scala and GemminiDefaultConfigs.scala with the files provided in this repository.
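The file replacement step can also be scripted. The sketch below is one possible way to do it in Python, under stated assumptions: `replace_configs` is a hypothetical helper (it is not provided by this repository or by Gemmini), and it assumes the two replacement files have already been made available inside the container, for example with `docker cp`. It keeps a `.bak` copy of each original file so the change can be undone.

```python
import shutil
from pathlib import Path

# File names taken from the replacement step above.
CONFIG_FILES = ("GemminiCustomConfigs.scala", "GemminiDefaultConfigs.scala")


def replace_configs(source_dir, configs_dir, names=CONFIG_FILES):
    """Copy each named file from source_dir into configs_dir.

    Hypothetical helper for illustration. An existing file is first
    copied to "<name>.bak" unless such a backup already exists.
    """
    source, target = Path(source_dir), Path(configs_dir)
    replaced = []
    for name in names:
        new_file = source / name
        if not new_file.is_file():
            raise FileNotFoundError(f"missing replacement file: {new_file}")
        old_file = target / name
        backup = old_file.with_name(old_file.name + ".bak")
        if old_file.exists() and not backup.exists():
            # Preserve the original once; later runs leave the backup alone.
            shutil.copy2(old_file, backup)
        shutil.copy2(new_file, old_file)
        replaced.append(old_file)
    return replaced
```

Inside the container, a call such as `replace_configs("/path/to/SA-DS", "chipyard/generators/gemmini/configs")` would perform the replacement; the first path is a placeholder for wherever the repository files were copied.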

Without Docker

Prerequisites

Set up the GitHub version by following the instructions in the following repository: Gemmini. Note: Due to some issues, we could not successfully clone the above repository, so we opted for the Docker setup. The packages differ between the two setups, so compiling the same code in a Git-cloned repository and in Docker may not yield the same results.

After the GitHub repository is successfully cloned, perform the following steps:

Generate the Dataset: Use the file GeneratingSA_DS_with_title.py in this repository to create the dataset for the GitHub version of the code.
Replace Configuration Files: Follow the same steps for file replacement as outlined for Docker users.

Citation Information

If this dataset is used in any research, please use the following citation:

@misc{SADS2024dataset,
      title={A Dataset for Large Language Model-Driven AI Accelerator Generation},
      author={Mahmoud Nazzal and Deepak Vungarala and Mehrdad Morsali and Chao Zhang and Arnob Ghosh and Abdallah Khreishah and Shaahin Angizi},
      year={2024},
      eprint={2404.10875},
      archivePrefix={arXiv},
      primaryClass={cs.AR}
}

SADS2024dataset. A Dataset for Large Language Model-Driven AI Accelerator Generation. Mahmoud Nazzal, Deepak Vungarala, Mehrdad Morsali, Chao Zhang, Arnob Ghosh, Abdallah Khreishah, and Shaahin Angizi (2024). arXiv:2404.10875 [cs.AR] https://arxiv.org/abs/2404.10875
