Can't find 10token data #6

Closed
salemohamedo opened this issue Apr 17, 2022 · 2 comments

@salemohamedo

Hi, I'm trying to run `python -m run +alg=mend +experiment=gen +model=distilgpt2 data.wiki_webtext=False`, but I get a file-not-found error for `data/10token/data/self_sample/train.json`. I downloaded 10token from the linked Google Drive folder and unzipped it to `data/10token`. However, when I unzip it, all I get is a single `10token` file, no `train.json`. Not sure if I'm missing something here. Thanks!

@eric-mitchell
Owner

After unzipping the Google Drive file (which contains `10token/data/self_sample/*.json`), you should move the top-level `10token` directory into the `data` directory inside the top-level `mend` directory. Please re-open if you're still having trouble with this!

@cin-hubert

@salemohamedo the file is actually a tar.gz archive, so extract it with `tar` rather than `unzip`:
`tar -xf 10token.zip`
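
For anyone who would rather not guess the format from the file extension, here is a minimal Python sketch that inspects the file contents and extracts accordingly. The function name `extract_archive` and the paths in the usage comment are illustrative assumptions and not part of the mend repository; only the standard-library `tarfile` and `zipfile` modules are used.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive, dest):
    """Extract `archive` into `dest`, detecting the format from the file contents.

    Hypothetical helper (not part of mend). Returns "tar" or "zip" to report
    which format was detected, regardless of the file extension.
    """
    archive = Path(archive)
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    # tarfile.open() auto-detects gzip/bz2/xz compression when reading.
    if tarfile.is_tarfile(str(archive)):
        with tarfile.open(str(archive)) as tf:
            tf.extractall(str(dest))
        return "tar"

    if zipfile.is_zipfile(str(archive)):
        with zipfile.ZipFile(str(archive)) as zf:
            zf.extractall(str(dest))
        return "zip"

    raise ValueError(f"{archive} is neither a tar nor a zip archive")


# Example usage, assuming the download sits next to the mend checkout
# and is run from the top-level mend directory:
#   extract_archive("10token.zip", "data")
#   assert Path("data/10token/data/self_sample/train.json").is_file()
```

Extracting directly into `data` should leave the files under `data/10token/data/self_sample/`, which is the path the error message in the original report refers to. Only run this on archives from a source you trust, since `extractall` writes whatever paths the archive contains.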
