The Minecraft dataset can be downloaded from Zenodo.
First, clone the repository along with its submodules:
git clone --recursive https://github.com/SET-IITGN/Minecraft.git
Next, from the root of the cloned repository, install the required libraries:
pip3 install -r requirements.txt
Once all the dependencies are installed, generate the dataset by running the following command:
python3 main.py projects.csv
projects.csv lists the open-source repositories to be mined. They span multiple languages, such as C, C++, Java, and Python.
The dataset can be extended further by adding other languages. To add a language, refer to the tree-sitter library and make the necessary changes in AST.py.
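Before starting a long mining run, it can help to check which repositories the input file actually lists. The snippet below is a minimal sketch, assuming that each row of projects.csv holds a repository URL in its first column. That layout is an assumption, not something documented here, and the helper name `read_project_urls` and the sample URLs are hypothetical.

```python
import csv
import io


def read_project_urls(csv_file):
    """Return the non-empty first-column values of a CSV stream, skipping blank rows."""
    urls = []
    for row in csv.reader(csv_file):
        if not row:
            # csv.reader yields an empty list for a blank line
            continue
        value = row[0].strip()
        if value:
            urls.append(value)
    return urls


# Hypothetical sample content; the real projects.csv may have a header or extra columns.
sample = io.StringIO(
    "https://github.com/example/project-a\n"
    "\n"
    "https://github.com/example/project-b\n"
)
print(read_project_urls(sample))
```

To inspect the real file, open it with `open("projects.csv", newline="")` and pass the file object to the same function.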
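As an illustration of the kind of change a new language may involve, the sketch below maps file extensions to grammar names and then extends that mapping with one more language. It is only a sketch under stated assumptions: the table `EXTENSION_TO_LANGUAGE` and the function `language_for` are hypothetical and are not taken from AST.py, which may organize its language support differently. A matching tree-sitter grammar for the new language would also need to be available.

```python
from pathlib import Path

# Hypothetical mapping from file extension to grammar name.
# These names are illustrative and are not taken from AST.py.
EXTENSION_TO_LANGUAGE = {
    ".c": "c",
    ".h": "c",
    ".cpp": "cpp",
    ".hpp": "cpp",
    ".java": "java",
    ".py": "python",
}


def language_for(path, table=EXTENSION_TO_LANGUAGE):
    """Return the grammar name for a file path, or None if the extension is unknown."""
    return table.get(Path(path).suffix.lower())


# Adding a language then amounts to one more entry (plus its grammar).
extended = {**EXTENSION_TO_LANGUAGE, ".go": "go"}

print(language_for("src/main.go"))            # unknown in the base table
print(language_for("src/main.go", extended))  # known after the extension
```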
If you use this dataset and tool in your research, please cite our ASE 2023 paper using the BibTeX entry below:
author={Avula, Sai Krishna and Vobbilisetti, Venkatesh and Mondal, Shouvick},
booktitle={2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE)},
title={Minecraft: Automated Mining of Software Bug Fixes with Precise Code Context},
year={2023},
volume={},
number={},
pages={1969-1979},
doi={10.1109/ASE56229.2023.00116}}