
download data issue #1

Closed
redna11 opened this issue Nov 11, 2020 · 4 comments

Comments

@redna11

redna11 commented Nov 11, 2020

Hello,

I'm trying to get the ListOps data from Google Cloud.

https://console.cloud.google.com/storage/browser/long-range-arena
https://storage.googleapis.com/long-range-arena/lra_release

Neither link works.

gsutil -m cp -R gs://long-range-arena ./lra_release

gives: "ServiceException: 401 Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket."

Any suggestions to improve data accessibility?

thanks

@vanzytay
Collaborator

Hi,

Thanks for reporting this issue. You can obtain the ListOps data using these links:

https://storage.cloud.google.com/long-range-arena/lra_release/listops-1000/basic_train.tsv
https://storage.cloud.google.com/long-range-arena/lra_release/listops-1000/basic_val.tsv
https://storage.cloud.google.com/long-range-arena/lra_release/listops-1000/basic_test.tsv

For now, only the individual files have been made public, not the entire folder. Do let us know if you're still not able to access the files.

Thanks!
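
For anyone who prefers to script the download, here is a minimal sketch that fetches the three ListOps files listed above. It rests on one assumption not confirmed in this thread: it uses the `storage.googleapis.com` host instead of `storage.cloud.google.com`, since the latter usually asks for a Google sign-in in the browser. The names `listops_urls` and `download_listops` are hypothetical helpers, not part of the LRA code.

```python
import urllib.request
from pathlib import Path

# Assumption: the same objects are reachable on the storage.googleapis.com host.
BASE_URL = "https://storage.googleapis.com/long-range-arena/lra_release/listops-1000"
SPLITS = ("train", "val", "test")


def listops_urls(base_url: str = BASE_URL) -> dict:
    """Map each split name to the URL of its TSV file."""
    return {split: f"{base_url}/basic_{split}.tsv" for split in SPLITS}


def download_listops(dest_dir: str = "listops-1000") -> list:
    """Fetch the three split files into dest_dir and return the local paths."""
    out = Path(dest_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for url in listops_urls().values():
        target = out / url.rsplit("/", 1)[-1]  # keep the remote file name
        urllib.request.urlretrieve(url, target)  # plain HTTPS GET, no credentials
        paths.append(target)
    return paths
```

Calling `download_listops()` should leave `basic_train.tsv`, `basic_val.tsv`, and `basic_test.tsv` in a local `listops-1000` directory, provided the files are still public.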

@da03

da03 commented Nov 16, 2020

@vanzytay I can download the ListOps dataset, but what about the other datasets? Could you post their filenames as well?

@vanzytay
Collaborator

Hi @da03,

You can find the AAN splits here:

https://storage.googleapis.com/long-range-arena/lra_release/aan/new_aan_pairs.eval.only_ids.tsv
https://storage.googleapis.com/long-range-arena/lra_release/aan/new_aan_pairs.test.only_ids.tsv
https://storage.googleapis.com/long-range-arena/lra_release/aan/new_aan_pairs.train.only_ids.tsv

The IMDB and CIFAR datasets are already available through TFDS, so you should be able to use them without any issues.
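
As a rough illustration of the TFDS route, the sketch below loads either dataset with `tfds.load`. The dataset names `imdb_reviews` and `cifar10` are assumptions, since the thread does not say which TFDS entries are meant, and `load_task` is a hypothetical helper. It requires the `tensorflow-datasets` package.

```python
# Assumption: these TFDS names correspond to the "IMDB" and "Cifar" tasks.
TFDS_NAMES = {"imdb": "imdb_reviews", "cifar": "cifar10"}


def load_task(task: str, split: str = "train"):
    """Return a tf.data.Dataset of (input, label) pairs for the given task."""
    # Imported lazily so the module can be inspected without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(TFDS_NAMES[task], split=split, as_supervised=True)
```

For example, `load_task("imdb", "test")` would download the data on first use and return the test split.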

We're working on a better way to release the Pathfinder data and will reply to this thread ASAP once we have a friendlier format for releasing the folder.

@vanzytay
Collaborator

vanzytay commented Dec 1, 2020

Hey!

You can now download the entire dataset here: https://storage.googleapis.com/long-range-arena/lra_release.gz

Thanks!
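
A possible way to fetch and unpack the full release from Python is sketched below. It assumes the `.gz` file is a gzip-compressed tarball, which the thread does not state; if it turns out to be a plain gzip file, `tarfile.open` will raise an error and `gzip.open` would be the tool to use instead. `download_release` and `extract_release` are hypothetical helper names.

```python
import tarfile
import urllib.request
from pathlib import Path

ARCHIVE_URL = "https://storage.googleapis.com/long-range-arena/lra_release.gz"


def download_release(archive_path: str = "lra_release.gz") -> Path:
    """Download the release archive to archive_path and return that path."""
    urllib.request.urlretrieve(ARCHIVE_URL, archive_path)
    return Path(archive_path)


def extract_release(archive_path: str, dest_dir: str = ".") -> list:
    """Unpack a gzip-compressed tarball into dest_dir and return its member names."""
    # Assumption: the archive is a tar file compressed with gzip.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return names
```

Typical use would be `extract_release(download_release())`. The archive is likely to be large, so make sure there is enough free disk space first.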
