ualberta-smr/PyMigBench

PyMigBench is a benchmark of Python Library Migrations. This repository contains the data and code for the dataset.

PyMigBench v2

The current version, PyMigBench-2.0, includes 3,096 migration-related code changes from 335 migrations between 141 analogous library pairs. This includes all migrations from PyMigBench v1 and additional migrations borrowed from the SALM dataset. The data also includes additional information per migration-related code change compared to v1.

The dataset is published with the FSE 2024 paper titled "Characterizing Python Library Migrations". We will add the citation info once it is available. Release 2.0.2 points to the exact dataset linked to the paper. The data is also permanently archived on figshare. Use either of these links to reproduce the paper.

We may update this repository to correct mistakes or add more data, so it may fall out of sync with the paper. For the latest data, use the latest release in this repository.

PyMigBench v1

We recommend using PyMigBench v2 for any new research. However, if you want to use the v1 dataset, see Release 1.0.3. Cite the paper below if you use the v1 dataset.

@INPROCEEDINGS{pymigbench,
  author={Islam, Mohayeminul and Jha, Ajay Kumar and Nadi, Sarah and Akhmetov, Ildar},
  booktitle={2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR)}, 
  title={PyMigBench: A Benchmark for Python Library Migration}, 
  year={2023},
  volume={},
  number={},
  pages={511-515},
  doi={10.1109/MSR59073.2023.00075}
}

Contributors

For any queries, please contact mohayemin@ualberta.ca.