Missing apps-train-files json file? #4

Closed
ywen666 opened this issue Sep 1, 2021 · 5 comments

Comments

ywen666 commented Sep 1, 2021

Hi,

Thank you for releasing this amazing codebase! I found that the appsdata needs to take an apps-train-files JSON file as input, but I couldn't find such a file in the provided APPS dataset. Am I missing something?

Thanks!

Collaborator

xksteven commented Sep 1, 2021

Could you please provide more details on what you're trying to do?

I couldn't understand from your description.

Author

ywen666 commented Sep 1, 2021

I couldn't find the file that this line requires in the dataset (https://github.com/hendrycks/apps/blob/main/train/tune_apps_gpt.py#L159). But I think it is just a JSON file containing a list of training folders.
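
If the file really is just a JSON list of training folders, something like the following could generate it. This is a minimal sketch and not the repository's own script: the function name `build_split_file`, the example paths, and the assumption that every immediate subdirectory of the training directory is one problem folder are all hypothetical. Whether the loader expects relative or absolute paths is also an assumption to check against the code.

```python
import json
from pathlib import Path
from typing import List


def build_split_file(train_root: str, output_path: str) -> List[str]:
    """Write a JSON list of the folders found directly under train_root.

    Hypothetical helper: assumes each immediate subdirectory of train_root
    is one training problem, which may not match the real dataset layout.
    """
    root = Path(train_root)
    # Keep directories only and sort them so the output is deterministic.
    folders = sorted(str(p) for p in root.iterdir() if p.is_dir())
    with open(output_path, "w") as f:
        json.dump(folders, f, indent=2)
    return folders


if __name__ == "__main__":
    # Example paths are placeholders; adjust them to the local dataset copy.
    problems = build_split_file("APPS/train", "train.json")
    print(f"Wrote {len(problems)} folder paths")
```

The resulting file can then be passed wherever the training script expects the list of training folders.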

Collaborator

xksteven commented Sep 2, 2021

I'll update you later tonight. I need to redownload the apps dataset as I thought we included it in there.
Otherwise I'll create a new one and upload it to the git repo. After that I'll update the README and this issue.

Author

ywen666 commented Sep 2, 2021

Not a big issue, since the JSON file can be inferred from the APPSBaseDataset.py file. By the way, how many GPUs are needed to fine-tune GPT-Neo on APPS? I saw that the batch size per replica is only 2.

Collaborator

xksteven commented Sep 2, 2021

I added the instructions here: https://github.com/hendrycks/apps/blob/main/train/README.md
and the script here: https://github.com/hendrycks/apps/blob/main/train/apps_create_split.py

As for how many GPUs, I believe it is listed in the paper. I can't remember the numbers offhand.

xksteven closed this as completed Sep 2, 2021