llama2.go

This is a Go port of llama2.c.

Setup

Download a model checkpoint (the example below uses stories110M.bin).
Download tokenizer.bin.
Install the command:

go install github.com/saracen/llama2.go/cmd/llama2go@latest

Run it:

./llama2go --help
llama2go: <checkpoint>
  -cpuprofile string
         write cpu profile to file
  -prompt string
         prompt
  -steps int
         max number of steps to run for, 0: use seq_len (default 256)
  -temperature float
         temperature for sampling (default 0.9)

./llama2go -prompt "Cute llamas are" -steps 38 --temperature 0 stories110M.bin
<s>
Cute llamas are two friends who love to play together. They have a special game that they play every day. They pretend to be superheroes and save the world.
achieved tok/s: 43.268528
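
The -temperature flag controls how the next token is chosen. The following is a minimal sketch of temperature sampling as it is commonly done in llama2.c-style samplers, not this project's actual code. The names argmax, softmax, and sampleToken are hypothetical. It assumes that a temperature of 0 means greedy decoding (always pick the highest-scoring token), which would explain why the run above is deterministic.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// argmax returns the index of the largest logit (greedy decoding).
func argmax(logits []float32) int {
	best := 0
	for i, v := range logits {
		if v > logits[best] {
			best = i
		}
	}
	return best
}

// softmax divides the logits by temperature and normalizes them into
// probabilities. The maximum is subtracted first for numerical stability.
func softmax(logits []float32, temperature float32) []float32 {
	maxv := logits[0]
	for _, v := range logits {
		if v > maxv {
			maxv = v
		}
	}
	probs := make([]float32, len(logits))
	var sum float32
	for i, v := range logits {
		probs[i] = float32(math.Exp(float64((v - maxv) / temperature)))
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}
	return probs
}

// sampleToken picks a token index from logits (hypothetical helper).
// u must be a uniform random number in [0, 1).
func sampleToken(logits []float32, temperature, u float32) int {
	if temperature == 0 {
		return argmax(logits) // assumption: 0 selects greedy decoding
	}
	probs := softmax(logits, temperature)
	var cdf float32
	for i, p := range probs {
		cdf += p
		if u < cdf {
			return i
		}
	}
	return len(probs) - 1 // guard against rounding error
}

func main() {
	logits := []float32{1.0, 3.0, 2.0}
	fmt.Println(sampleToken(logits, 0, rand.Float32()))   // greedy: always index 1
	fmt.Println(sampleToken(logits, 0.9, rand.Float32())) // stochastic
}
```

Lower temperatures sharpen the distribution toward the top token, while higher values flatten it and produce more varied text.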

Performance

Tokens per second:

system	model	llama2.c	llama2.go (no cgo)	llama2.go (cgo)
M1 Max, 10-Core, 32 GB	stories15M	676.392573	246.885611	473.840849
M1 Max, 10-Core, 32 GB	stories42M	267.295597	98.165245	151.396638
M1 Max, 10-Core, 32 GB	stories110M	100.671141	42.592345	69.804907
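
For readers who want to reproduce figures like these, here is a minimal sketch of a throughput measurement. It assumes tokens per second is simply the number of generated tokens divided by elapsed wall-clock time; the exact counting used by llama2.c and this port may differ. The functions tokensPerSecond and measure, and the step callback standing in for one forward pass plus sampling, are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// tokensPerSecond returns the throughput for n tokens generated in the
// given number of seconds. It returns 0 when no time has elapsed.
func tokensPerSecond(n int, seconds float64) float64 {
	if seconds <= 0 {
		return 0
	}
	return float64(n) / seconds
}

// measure calls step once per token position and reports tokens per second.
// step is a hypothetical stand-in for one forward pass plus sampling.
func measure(n int, step func(pos int)) float64 {
	start := time.Now()
	for pos := 0; pos < n; pos++ {
		step(pos)
	}
	return tokensPerSecond(n, time.Since(start).Seconds())
}

func main() {
	// Simulate 38 steps of about 1 ms each in place of real inference.
	rate := measure(38, func(pos int) { time.Sleep(time.Millisecond) })
	fmt.Printf("achieved tok/s: %f\n", rate)
}
```

Wall-clock measurements vary between runs, so averaging several runs per model gives a fairer comparison.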
