gguf : add BERT, MPT, and GPT-J arch info #3408

Merged: 2 commits into ggerganov:master on Oct 2, 2023

Conversation

cebtenzzre (Collaborator)
These GGUF architectures will be used in a future release of gpt4all.
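(For context: registering an architecture in gguf-py amounts to a new enum member, a human-readable name string, and a per-architecture tensor-name table. Below is a minimal sketch of that shape, assuming the constants layout the gguf Python package used at the time; it is not the PR's actual diff, and the surrounding entries are illustrative.)

```python
# Hypothetical sketch of the kind of change, not the actual diff: gguf-py
# registered each architecture as an enum member plus a name string
# (each arch also gets a tensor-name table, omitted here for brevity).
from enum import IntEnum, auto

class MODEL_ARCH(IntEnum):
    LLAMA  = auto()
    FALCON = auto()
    # entries in the spirit of this PR (names assumed, not verified):
    MPT    = auto()
    GPTJ   = auto()
    BERT   = auto()

MODEL_ARCH_NAMES: dict[MODEL_ARCH, str] = {
    MODEL_ARCH.LLAMA:  "llama",
    MODEL_ARCH.FALCON: "falcon",
    MODEL_ARCH.MPT:    "mpt",
    MODEL_ARCH.GPTJ:   "gptj",
    MODEL_ARCH.BERT:   "bert",
}

print(MODEL_ARCH_NAMES[MODEL_ARCH.MPT])  # -> "mpt"
```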

mchiang0610 commented on Sep 30, 2023

This is awesome! Will you be adding upstream implementations of these architectures, or will the arch info just be used for your private fork?

Support for these architectures would be amazing for the community as a whole.

cebtenzzre (Collaborator, Author)
The conversion scripts and CPU inference implementations are here: https://github.com/nomic-ai/gpt4all/tree/gguf_latest_llama/gpt4all-backend

I was mainly focused on updating the existing gpt4all code, so there are surely improvements from ggml and koboldcpp that have not been included.

Which model architecture would be best to add upstream support for first? There is also GPT-NeoX, which is not part of gpt4all.
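(For readers following along: once an architecture is registered, a conversion script selects it when constructing the GGUF writer. A rough usage sketch follows, assuming the gguf-py writer API of that era; the hyperparameter values are illustrative placeholders, not a verified MPT config.)

```python
# Rough conversion-script skeleton, not a verified implementation: shows
# where the newly registered arch names are consumed by the GGUF writer.
import gguf

ARCH = gguf.MODEL_ARCH.MPT  # one of the architectures added by this PR

writer = gguf.GGUFWriter("mpt-7b.gguf", gguf.MODEL_ARCH_NAMES[ARCH])
writer.add_architecture()        # writes the general.architecture key
writer.add_name("MPT-7B")
writer.add_context_length(2048)  # placeholder hyperparameters
writer.add_embedding_length(4096)
writer.add_block_count(32)
writer.add_head_count(32)

# tensor data would be appended here via writer.add_tensor(name, array)

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```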

cebtenzzre changed the title from "gguf : add BERT, MPT, and GPT-J model architectures" to "gguf : add BERT, MPT, and GPT-J arch info" on Sep 30, 2023
cebtenzzre merged commit 29a404a into ggerganov:master on Oct 2, 2023
9 of 10 checks passed
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request on Oct 5, 2023:
…example

* 'master' of github.com:ggerganov/llama.cpp: (24 commits)
  convert : fix Baichuan2 models by using vocab size in config.json (ggerganov#3299)
  readme : add project status link
  ggml : fix build after ggerganov#3329
  llm : add Refact model (ggerganov#3329)
  sync : ggml (conv 1d + 2d updates, UB fixes) (ggerganov#3468)
  finetune : readme fix typo (ggerganov#3465)
  ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (ggerganov#3453)
  main : consistent prefix/suffix coloring (ggerganov#3425)
  llama : fix session saving/loading (ggerganov#3400)
  llama : expose model's rope_freq_scale in the API (ggerganov#3418)
  metal : alibi for arbitrary number of heads (ggerganov#3426)
  cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (ggerganov#3273)
  Work on the BPE tokenizer (ggerganov#3252)
  convert : fix vocab size when not defined in hparams (ggerganov#3421)
  cmake : increase minimum version for add_link_options (ggerganov#3444)
  CLBlast: Add broadcast support for matrix multiplication (ggerganov#3402)
  gguf : add BERT, MPT, and GPT-J arch info (ggerganov#3408)
  gguf : general usability improvements (ggerganov#3409)
  cmake : make CUDA flags more similar to the Makefile (ggerganov#3420)
  finetune : fix ggerganov#3404 (ggerganov#3437)
  ...
yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request on Oct 7, 2023