
Support Adept Persimmon 8b #3410

Conversation

@phillip-kravtsov (Contributor) commented Sep 29, 2023:

  • Adds Persimmon 8B, which is architecturally a standard dense transformer with:
    • Q/K layernorm
    • squared-ReLU activations
    • partial RoPE
    • a very large vocab size (most of it unused for text)

To support partial RoPE and squared ReLU, this PR adds concat and square kernels for Metal.
I've confirmed agreement between the GGML and HF implementations down to the tensor values in the last layer.
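For readers less familiar with these pieces, here is a minimal standalone C++ sketch of squared ReLU and of a "partial" RoPE that rotates only the first `rot_dims` components of each head while the remainder passes through untouched. This is only an illustration, not the Metal kernels added by this PR, and the pairing of rotated components (half-split below) is an implementation detail that may differ from the actual kernels.

```cpp
#include <cmath>
#include <vector>

// Squared ReLU: relu(x)^2, applied element-wise in the MLP.
static float squared_relu(float x) {
    const float r = x > 0.0f ? x : 0.0f;
    return r * r;
}

// Partial RoPE: rotate only the first rot_dims components of one head
// at position pos; components [rot_dims, head.size()) are left unchanged.
static void partial_rope(std::vector<float> & head, int pos, int rot_dims,
                         float theta_base = 10000.0f) {
    const int half = rot_dims / 2;
    for (int i = 0; i < half; ++i) {
        const float freq  = std::pow(theta_base, -2.0f * i / rot_dims);
        const float angle = pos * freq;
        const float c = std::cos(angle);
        const float s = std::sin(angle);
        const float x0 = head[i];
        const float x1 = head[i + half];
        head[i]        = x0 * c - x1 * s;
        head[i + half] = x0 * s + x1 * c;
    }
}
```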

@ggerganov added the labels "high priority" (Very important issue) and "model" (Model specific) on Sep 30, 2023.
@ggerganov (Owner) commented:

Let's resolve the CI failures and merge.

@phillip-kravtsov force-pushed the phillip-kravtsov/support-adept-persimmon-8b branch from 92acb44 to 5d259d3 on October 5, 2023 18:04.
@ggerganov merged commit 0e797c2 into ggerganov:master on Oct 7, 2023 (35 checks passed).
@slaren (Collaborator) commented Oct 7, 2023:

The switches in llm_load_hparams and llama_build_graph are missing breaks, so it should be falling through and using the Refact graph. Does this currently work?
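For context on why the missing breaks matter: in C/C++ a case without a break falls through into the next one, so the Persimmon case would end up also executing whatever the following case does. A minimal illustration (hypothetical enum and functions, not the actual llama.cpp code):

```cpp
#include <cstdio>

enum arch { ARCH_PERSIMMON, ARCH_REFACT };

void build_graph(arch a) {
    switch (a) {
        case ARCH_PERSIMMON:
            printf("build persimmon graph\n");
            // no `break;` here: control falls through into the next case
        case ARCH_REFACT:
            printf("build refact graph\n");
            break;
    }
}

int main() {
    build_graph(ARCH_PERSIMMON); // prints both lines because of the fall-through
    return 0;
}
```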

@ggerganov (Owner) commented:

@phillip-kravtsov PTAL at @slaren's comment and fix as necessary.

@KerfuffleV2 (Collaborator) commented:

I got tired of seeing the compiler warning and created #3535 (not sure if there are any other issues, haven't had a chance to test it yet).

@phillip-kravtsov (Contributor, Author) commented:

Thanks for the fix @KerfuffleV2 -- that PR should be sufficient.

joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 12, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp:
  py : change version of numpy requirement to 1.24.4 (ggerganov#3515)
  quantize : fail fast on write errors (ggerganov#3521)
  metal : support default.metallib load & reuse code for swift package (ggerganov#3522)
  llm : support Adept Persimmon 8B (ggerganov#3410)
  Fix for ggerganov#3454 (ggerganov#3455)
  readme : update models, cuda + ppl instructions (ggerganov#3510)
  server : docs fix default values and add n_probs (ggerganov#3506)
snichols pushed a commit to xgaicc/llama.cpp that referenced this pull request Oct 16, 2023
* Produces garbage output

* wip: correct tensors up to RoPE

* correct tensors thru RoPE

* Correct outputs through masked & softmax'd KQ

* fp32 works

* Rename adept->persimmon

* Produces correct outputs

* clean up convert scripts

* remove printing logic from ggml.c

* remove prints from llama.cpp & fix merge

* trivial cleanups

* Add offload funcs

* update conversion script to directly take adept artifacts rather than .safetensors file

* Fix norm eps bug

* Support sqr and concat on metal, persimmon-8b-q4 runs correctly

* Small changes from review

* Formatting changes

* Minor changes to conversion script

* Remove old script

* Fix editorconfig formatting

* Fix build

* add overlooked offload code ggml-ci
Labels: high priority (Very important issue), model (Model specific)
6 participants