Dataset Scripts & Docs #1
README.md
## Dataset preparation
**Pre-requisites:**
* Python dependencies from `scripts/requirements.txt` installed (run `pip install -r scripts/requirements.txt`)
* A repositories folder (dataset), where git projects are stored in format `[dataset path]/author/repo`
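
To illustrate the expected folder layout, here is a minimal sketch that lists the `author/repo` directories under a dataset path. The helper name `list_repos` is hypothetical and is not part of the project's scripts; it only assumes the two-level layout described above.

```python
from pathlib import Path


def list_repos(dataset_path):
    """Return sorted 'author/repo' names found two levels below dataset_path.

    Hypothetical helper: assumes the layout [dataset path]/author/repo.
    """
    root = Path(dataset_path)
    repos = []
    # First level: one directory per author.
    for author_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Second level: one directory per repository of that author.
        for repo_dir in sorted(p for p in author_dir.iterdir() if p.is_dir()):
            repos.append(f"{author_dir.name}/{repo_dir.name}")
    return repos
```

Plain files at either level are skipped, so stray files in the dataset folder do not show up as repositories.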
Do we have a script we can add for cloning all repos from a spec file? :)
@elatoskinas
Yes, it's here: https://github.com/saltudelft/many-types-4-py-dataset/blob/master/repo_cloner/__main__.py
The first step is to run the cloner with this JSON file as input: https://github.com/saltudelft/many-types-4-py-dataset/blob/master/mypy-dependents-by-stars.json
The second step is to write a shell script that resets each git repository to the state given by its commit hash.
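
The second step could be sketched roughly as follows, written in Python rather than as a shell script. This is only an illustration: it assumes a hypothetical mapping from `author/repo` to a commit hash (the actual structure of `mypy-dependents-by-stars.json` is not shown here), and `checkout_command` and `reset_repos` are made-up names, not existing project scripts.

```python
import subprocess
from pathlib import Path


def checkout_command(dataset_path, repo, commit):
    """Build the git command that checks out `commit` in one repository."""
    repo_dir = Path(dataset_path) / repo
    # `git -C <dir>` runs the command as if started inside <dir>.
    return ["git", "-C", str(repo_dir), "checkout", commit]


def reset_repos(dataset_path, commits, dry_run=True):
    """Check out every repository at its commit hash.

    `commits` maps 'author/repo' -> commit hash (assumed input format).
    With dry_run=True the commands are only returned, not executed.
    """
    commands = [
        checkout_command(dataset_path, repo, commit)
        for repo, commit in sorted(commits.items())
    ]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if git fails.
            subprocess.run(cmd, check=True)
    return commands
```

Keeping `dry_run=True` as the default lets the generated commands be inspected before any repository is modified.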
Pushed the update. I've documented how to run the cloner with the given JSON file, and added an auxiliary script in the scripts folder that resets the repositories to their commit hashes.
@elatoskinas
@mir-am I've just realized this is missing our JSON representation generation (and also the …
I'll add the JSON representation step to the README. Before that, I need to add …
@elatoskinas
Added scripts from ml-typeinf-competition with an updated README.