This project replicates "Towards Accurate Duplicate Bug Retrieval Using Deep Learning Techniques" by Jayati Deshmukh, K. M. Annervaz, Sanjay Podder, Shubhashis Sengupta, and Neville Dubash, published at the International Conference on Software Maintenance and Evolution (ICSME) 2017. We show that performance comparable to the original work can be achieved even without handling structured information separately.
Download the dataset from here and store it in MongoDB.
If you are using Docker for MongoDB, you can find the docker-compose.yaml file in the root directory.
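As a rough illustration of the "store it in MongoDB" step, the sketch below loads bug reports from a file into a collection with pymongo. Everything specific in it is an assumption rather than something this project prescribes: the file name `eclipse.json`, the newline-delimited JSON layout, the connection URI, and the `bugs`/`eclipse` database and collection names are all hypothetical placeholders. If the downloaded dump is in MongoDB Extended JSON, the `mongoimport` tool is likely a better fit.

```python
import json
from pathlib import Path


def read_bug_reports(path):
    """Parse a newline-delimited JSON file into a list of dicts.

    Assumption: one JSON object per line; the real dump may be laid out differently.
    """
    reports = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                reports.append(json.loads(line))
    return reports


def store_reports(collection, reports):
    """Insert the reports into a collection and return how many were stored."""
    if not reports:
        return 0  # pymongo's insert_many rejects an empty list
    result = collection.insert_many(reports)
    return len(result.inserted_ids)


def main():
    # Imported here so the helpers above can be used without pymongo installed.
    from pymongo import MongoClient

    # Hypothetical values: adjust the URI, database, collection, and file name.
    client = MongoClient("mongodb://localhost:27017")
    collection = client["bugs"]["eclipse"]
    count = store_reports(collection, read_bug_reports("eclipse.json"))
    print(f"Stored {count} bug reports")
```

Call `main()` once MongoDB is running (for example, through the provided docker-compose.yaml) and the placeholders match your setup.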
We strongly recommend using a virtual environment to run the project.
You can find the list of necessary packages in the requirements.txt file in the root directory.
Install them by running

    pip install -r requirements.txt

Start a Jupyter server by running

    jupyter notebook

Then open notebooks/siamese-trials/title-descr-eclipse.ipynb in the Jupyter app.
To see results for a different dataset, change the third line in the second cell accordingly.
Check project-report.pdf for a detailed analysis of the evaluation and findings.