HF dataset caching doesn't work correctly with preprocessing pipeline #62

Closed
entrpn opened this issue May 17, 2024 · 0 comments · Fixed by #63
entrpn commented May 17, 2024

In the function make_pokemon_iterator, the VAE and text encoders captured by the preprocessing transform are not hashable, so the Hugging Face datasets cache fingerprint changes on every run and the dataset transforms are rebuilt every time instead of being loaded from cache.

As a result:

  • Preprocessing is re-run on every invocation of the training script.
  • Unit tests eventually fail once the redundant cache files exhaust disk space.

A minimal sketch of the failure mode and one possible workaround is included below.
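The sketch below is self-contained and hypothetical: FakeEncoder, preprocess, and the config dict are illustrative stand-ins, not this repository's API, and the workaround shown (hashing only the static preprocessing configuration and passing it to Dataset.map as new_fingerprint) is one option, not necessarily the fix adopted in #63.

```python
# Hypothetical sketch of the caching problem with Hugging Face `datasets`.
# The map function below closes over an encoder object; in the real pipeline
# the captured VAE / text encoders cannot be hashed deterministically, so
# `datasets` falls back to a fresh fingerprint and re-runs (and re-caches)
# the transform on every run.
from datasets import Dataset
from datasets.fingerprint import Hasher


class FakeEncoder:
    """Stand-in for the VAE / text encoder captured by the transform."""

    def encode(self, text):
        return len(text)  # placeholder for real encoding


encoder = FakeEncoder()


def preprocess(batch):
    # The closure over `encoder` is what defeats the automatic fingerprint
    # when the captured object cannot be hashed.
    batch["length"] = [encoder.encode(t) for t in batch["text"]]
    return batch


ds = Dataset.from_dict({"text": ["a pokemon", "another pokemon"]})

# Workaround sketch: hash only the static preprocessing config and pass the
# result as `new_fingerprint`. In a real on-disk pipeline this keeps the
# cache file name stable across runs, so the cached transform is reused
# instead of recomputed.
stable_fingerprint = Hasher.hash({"dataset": "pokemon", "resolution": 512})
ds = ds.map(preprocess, batched=True, new_fingerprint=stable_fingerprint)
print(ds[0])
```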
@entrpn entrpn added the bug Something isn't working label May 17, 2024
@entrpn entrpn self-assigned this May 17, 2024