gpt-tfjs only repeats the last prompt token #656

Closed
3 tasks done
JulienVig opened this issue Apr 3, 2024 · 0 comments · Fixed by #658

JulienVig commented Apr 3, 2024

Here are some issues with gpt-tfjs I noted while implementing tokenization:

  • There is a memory leak in the training loop. Memory usage doesn't grow much (~0.01 MB per iteration), but the number of tensors keeps growing (+14 new tensors allocated per dataset batch)
  • A trained model (e.g. on wikitext after more than 1000 iterations, with validation perplexity below 4) almost always repeats the last prompt token
  • Create a test case for the wikitext task
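
To make the tensor leak easy to reproduce, here is a minimal sketch of how the per-batch growth could be measured. The helper `tensorsLeakedPerStep` and the stand-in steps are hypothetical and not part of gpt-tfjs. The counter is passed in as a callback so the snippet runs without TensorFlow.js; with the real library I would expect to pass `() => tf.memory().numTensors` and wrap one training batch in `step`. The step is synchronous here for brevity, while a real training step is likely async.

```typescript
// Hypothetical helper, not part of gpt-tfjs: reports how many tensors stay
// allocated after each call to `step`, averaged over `steps` calls.
type TensorCounter = () => number;

function tensorsLeakedPerStep(
  step: () => void,
  countTensors: TensorCounter,
  steps: number = 10,
): number {
  // Warm-up call so one-time allocations (weights, optimizer state)
  // are not counted as a leak.
  step();
  const before = countTensors();
  for (let i = 0; i < steps; i++) {
    step();
  }
  return (countTensors() - before) / steps;
}

// Stand-ins for a training batch (assumption: the real step would run a
// forward/backward pass). `live` plays the role of tf.memory().numTensors.
let live = 0;
const leakyStep = (): void => {
  live += 14; // intermediates allocated and never disposed
};
const cleanStep = (): void => {
  live += 14; // intermediates allocated...
  live -= 14; // ...and disposed before the step returns
};

console.log(tensorsLeakedPerStep(leakyStep, () => live, 5));
console.log(tensorsLeakedPerStep(cleanStep, () => live, 5));
```

A non-zero result for a real training batch would point at intermediates that are neither created inside `tf.tidy` nor released with `tf.dispose`, which would match the growth described above.
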
JulienVig added the bug (Something isn't working) and discojs (Related to Disco.js) labels on Apr 3, 2024
JulienVig self-assigned this on Apr 8, 2024
martinjaggi added this to the v3.0.0 milestone on Jul 24, 2024