Create a virtual environment and install the required dependencies with:
$ python -m venv c:\path\to\myenv
$ c:\path\to\myenv\Scripts\activate
$ pip install -r requirements.txt
- Create a .env file modeled on .env-example to set the default values of the command inputs.
- A GitHub personal access token is required (without one, the GitHub API allows only 60 calls per hour).
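As a rough illustration of how the two points above fit together, the sketch below parses .env-style text and builds the request headers for the GitHub API. The variable names GITHUB_TOKEN and REPOSITORY_NAME and both helper functions are assumptions made for this example. Check .env-example for the names the project actually expects.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def build_headers(env):
    """Return GitHub API headers; add the token only when one is set."""
    headers = {"Accept": "application/vnd.github+json"}
    token = env.get("GITHUB_TOKEN")  # hypothetical variable name
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


# Hypothetical .env content, with a placeholder instead of a real token.
sample = 'GITHUB_TOKEN="<your-token>"\n# default repository\nREPOSITORY_NAME=nvbn/thefuck\n'
print(build_headers(parse_env(sample)))
```

Without a token the headers carry no Authorization entry, and requests fall under the unauthenticated limit of 60 calls per hour.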
You are now ready to run the script from the command line!
- get-data-repo: get the data of the GitHub repository set in the .env file.
  - --repository_name: name of the repository to get the data from
- find-repo: find public GitHub repositories that are written in the language set in the .env file and have more than the minimum number of stars set there.
  - --lang: language of the repositories to find
  - --min_stars: minimum number of stars of the repositories to find
  - --nb_repo: number of repositories to find
- semantic-test-repo: use CodeT5 to get the maximum score of a file related to the issues stored in the database.
  - --repository_name: name of the repository to test
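To give an idea of what the find-repo filters amount to, here is a minimal sketch that turns a language, a star threshold and a result count into a query for GitHub's repository search endpoint. The function name build_search_url and the sort order are assumptions for this example, and the script itself may build its request differently.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(lang, min_stars, nb_repo):
    """Build a repository search URL filtered by language and star count."""
    # "stars:>N" keeps repositories with more than N stars;
    # use "stars:>=N" if the threshold itself should be included.
    query = f"language:{lang} stars:>{min_stars}"
    params = {
        "q": query,
        "sort": "stars",
        "order": "desc",
        "per_page": min(nb_repo, 100),  # the API returns at most 100 results per page
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


print(build_search_url("python", 1000, 20))
```

Asking for more than 100 repositories would require following the result pages, which this sketch leaves out.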
Every command has a --help option that gives more information on that command.
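For readers curious how such a command layout can be wired up, the sketch below declares the three commands and their options with argparse, which generates a --help option for the program and for each subcommand automatically. It is only an illustration: the project may rely on a different CLI library, and the help strings and build_parser function are assumptions.

```python
import argparse


def build_parser():
    """Declare the three commands and their options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="main.py")
    commands = parser.add_subparsers(dest="command", required=True)

    get_data = commands.add_parser("get-data-repo", help="get the data of a repository")
    get_data.add_argument("--repository_name", help="repository to get the data from")

    find = commands.add_parser("find-repo", help="find public repositories")
    find.add_argument("--lang", help="language of the repositories to find")
    find.add_argument("--min_stars", type=int, help="minimum number of stars")
    find.add_argument("--nb_repo", type=int, help="number of repositories to find")

    test = commands.add_parser("semantic-test-repo", help="score files against issues")
    test.add_argument("--repository_name", help="repository to test")
    return parser


# Options left out stay None here; a real script would fall back to the .env defaults.
args = build_parser().parse_args(["find-repo", "--lang", "python", "--nb_repo", "20"])
print(args.command, args.lang, args.nb_repo, args.min_stars)
```

With this layout, running `python main.py find-repo --help` lists only the options of find-repo.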
$ python main.py find-repo --lang python --nb_repo 20
$ python main.py get-data-repo --repository_name "nvbn/thefuck"
Fetching files ⣾
Fetching issues |████████████████████████████████| 41/41
Fetching pulls |████████████████████████████████| 41/41
Fetching data |████████████████████████████████| 34/34