
Releases: gotzmann/llama.go

v1.4: Server Mode

28 Apr 11:50
bf2bddd

Introducing Server Mode: an easy-to-use REST API for the underlying GPT model. Let's go to production :)

Better Defaults

23 Apr 18:29

Nothing special: more stable inference and saner default parameters.

AVX2 and NEON

20 Apr 16:12
eea0850

Inference performance is boosted on CPUs that support vector math.

Please use:

--neon flag for Apple Silicon (M1-M3 processors) and ARM servers

--avx flag for Intel and AMD CPUs that support the AVX2 instruction set
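If you are unsure which flag applies, a small sketch like the one below can pick the right one from the machine architecture. The binary name `./llama-go` and the `--model` parameter are assumptions for illustration; only the `--neon` and `--avx` flags come from this release.

```shell
#!/bin/sh
# Choose the SIMD flag based on CPU architecture.
# arm64/aarch64 -> --neon (Apple Silicon, ARM servers)
# x86_64        -> --avx  (Intel/AMD with AVX2)
case "$(uname -m)" in
  arm64|aarch64) SIMD_FLAG="--neon" ;;
  x86_64)        SIMD_FLAG="--avx"  ;;
  *)             SIMD_FLAG="" ;;  # fall back to plain (unvectorized) inference
esac
echo "$SIMD_FLAG"

# Hypothetical invocation (binary and --model flag are assumptions):
# ./llama-go --model ./llama-7b.bin $SIMD_FLAG
```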

Big Models are OK

15 Apr 15:43

This version supports bigger, multipart LLaMA models (tested with 7B and 13B) converted into the latest GGMJ binary format with a custom Python script (see README).

April 12 - First Man in Space

12 Apr 16:10

The very first public release of LLaMA.go.