Software Refactoring Prediction Model 🔄

Welcome to the Software Refactoring Prediction Model repository! This project uses machine learning techniques to predict code refactoring needs. It is designed to help developers maintain clean and efficient codebases by indicating whether a particular piece of code needs to be refactored.

Introduction 🌟

This project is part of a replication of an existing study 📄

Original Research Paper
Old Codebase
Their code relied on a tool called Refactoring Miner, which takes considerable effort to get up and running. Instead, we used the provided SQL scripts to extract the data in almost half the time.
Data Fetching Scripts

This project uses Python to analyze codebases and predict refactoring opportunities. It uses several machine learning models to assess various aspects of the code and suggest potential refactorings that would improve code quality.

Our Work:

Improved the code by removing methods and features that weren't needed in the final project.
Added support for Randomized and Grid Search Cross Validation.
Because hardware powerful enough to run the full project was hard to obtain, we performed the same analysis on fractions (0.2%, 0.5% and 1.0%) of the original dataset and got similar results.
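To make the difference between the two search strategies listed above concrete, here is a minimal standard-library sketch. It is not this repository's code: the parameter names, the values in `PARAM_GRID`, and the `toy_score` function are all hypothetical stand-ins, and a real run would replace `toy_score` with a cross-validated model score. Grid search evaluates every combination, while randomized search evaluates only a fixed number of randomly chosen ones (here drawn without replacement from the same discrete grid).

```python
import itertools
import random

# Hypothetical search space; names and values are illustrative only.
PARAM_GRID = {
    "max_depth": [3, 6, 12],
    "n_estimators": [50, 100, 200, 400],
}


def grid_candidates(param_grid):
    """Return every combination of values (exhaustive grid search)."""
    keys = sorted(param_grid)
    value_lists = [param_grid[k] for k in keys]
    return [dict(zip(keys, values)) for values in itertools.product(*value_lists)]


def randomized_candidates(param_grid, n_iter, seed=0):
    """Return n_iter distinct combinations chosen at random (randomized search)."""
    all_candidates = grid_candidates(param_grid)
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return rng.sample(all_candidates, min(n_iter, len(all_candidates)))


def toy_score(params):
    """Stand-in for a cross-validated score; higher is better."""
    return -abs(params["max_depth"] - 6) - abs(params["n_estimators"] - 200) / 100


def best(candidates, score_fn):
    """Pick the candidate with the highest score."""
    return max(candidates, key=score_fn)


grid = grid_candidates(PARAM_GRID)
sampled = randomized_candidates(PARAM_GRID, n_iter=5)
print(len(grid), "grid candidates;", len(sampled), "randomized candidates")
print("best from grid:", best(grid, toy_score))
print("best from randomized sample:", best(sampled, toy_score))
```

The trade-off is cost versus coverage: the randomized variant trains far fewer models, which matters on limited hardware, but it may miss the best combination that the full grid would find.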

Getting Started 🚀

Follow these simple steps to get a local copy up and running.

Prerequisites 📋

Python 3.8+
pip

Installation 💽

Clone the repo:

git clone https://github.com/Hetav01/Software-Refactoring-Prediction-Model.git

Use the Data Fetching Scripts to extract as much of the dataset as the pipeline needs.
Copy the CSV dataset into the dataset folder.
Update the file paths where required, namely in preprocessing/preprocessing.py, binaryClassification.py, testing/Runner_Test.py and testing/binaryClassification2.py.
Before running the driver file for the entire pipeline, install all the required dependencies:
```
pip3 install --user -r requirements.txt
```
The driver file is either binaryClassification.py or testing/binaryClassification2.py. Use binaryClassification.py if you only want the results, or testing/binaryClassification2.py if you also want to test the models on unseen data. Run either one with the corresponding command:
```
python3 binaryClassification.py
```
```
python3 testing/binaryClassification2.py
```

The script follows the configuration in configs.py. There, you can define which datasets to analyze, which models to build, which under-sampling algorithms to use, and so on. Please read the comments in this file carefully.
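For readers unfamiliar with under-sampling, the sketch below shows random under-sampling, one common way to balance classes before training. It is an illustration only and is not taken from this repository: the function name `random_undersample` and the toy data are hypothetical, and the algorithms actually selectable in configs.py may differ.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until each class matches the smallest one."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is cut down to the size of the smallest class.
    target = min(len(members) for members in by_class.values())

    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: 8 rows labelled 0 (no refactoring) and 2 rows labelled 1 (refactoring).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(balanced_labels))
```

Under-sampling discards data from the majority class, so it trades some information for a balanced training set.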

To collect the results, the Python scripts automatically update the result.txt and result_unseen.txt files with the latest metrics. Watch the terminal output while the program is running to see which hyperparameters work best for each model.

Contributing 🤝

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request
