v0-2023-October-31

@smpanaro smpanaro released this 01 Nov 00:55
· 2 commits to main since this release
d31c674

The gpt2 model family is now 2-4x faster than in the prior release.

  • gpt2 now uses KV caching for faster generation
  • all models generate multiple tokens per second (to get the fastest speeds, see the instructions in SETUP.md)
  • iOS 16+/macOS 13+ now required

gpt2-xl is split across multiple files to fit GitHub's file size restrictions. Download both parts, then concatenate and decompress them like so:
cat gpt2-xl.mlpackage.tar.gz.* | tar -xzvf -
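
If a part is missing, the one-liner above only fails once tar hits the truncated stream. The following is a minimal sketch of a wrapper that checks for parts first and then runs the same cat-into-tar pipeline. The function name `extract_split_archive` is hypothetical and not part of this repository; the sketch assumes a POSIX sh or bash shell, where glob matches are expanded in sorted order so the parts are concatenated in sequence.

```shell
# Hypothetical helper (not part of this repo): reassemble a split
# .tar.gz archive from its parts and extract it into the current directory.
# Usage: extract_split_archive PREFIX
#   PREFIX is the file name up to, but not including, the part suffix.
extract_split_archive() {
  prefix="$1"

  # Expand the glob into the positional parameters. The shell sorts
  # the matches, so the parts are concatenated in order.
  set -- "$prefix".*

  # When nothing matches, sh/bash leave the pattern unexpanded, so the
  # first "match" is a path that does not exist.
  if [ ! -e "$1" ]; then
    echo "no parts found matching $prefix.*" >&2
    return 1
  fi

  echo "found $# part(s) for $prefix" >&2

  # Same pipeline as the release notes: stream all parts into tar.
  cat "$@" | tar -xzf -
}
```

After sourcing or pasting the function, it would be called as `extract_split_archive gpt2-xl.mlpackage.tar.gz` from the directory containing the downloaded parts.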