Bugfix: missing hparam `type_vocab_size` #32

FFengIll · 2023-09-14T10:56:57Z

`type_vocab_size` is also a hyperparameter, so it cannot be hard-coded as the constant 2.
The converter needs the same change.

FFengIll · 2023-09-14T10:58:17Z

Some BERT-based models do not use `type_vocab_size=2`; e5, for example, uses `type_vocab_size=1`.

https://huggingface.co/intfloat/multilingual-e5-base/blob/main/config.json#L25

skeskinen · 2023-09-18T13:08:20Z

Hi, this seems like a good change.

But surely bert.cpp:358, where the hparams are read from the model file, also needs to be changed?

FFengIll · 2023-09-19T01:56:24Z

> Hi, this seems like a good change.
>
> But surely bert.cpp:358, where the hparams are read from the model file, also needs to be changed?

Sure, I will add it.

FFengIll · 2023-09-19T01:57:32Z

bert.cpp


        model.layers.resize(n_layer);

        model.word_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, n_vocab);
-        model.token_type_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, 2);
+        model.token_type_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, n_vocab_size);


@skeskinen
This line changes because the size of the tensor depends on `n_vocab_size`.
In many cases the value is 2, but it is not a constant.

Some references:
google-research/bert#16
https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertConfig
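
To make the size dependence concrete, here is a minimal sketch of how the element count of the token-type embedding table follows from the hparams. The helper names are hypothetical and are not part of bert.cpp; they only illustrate why a hard-coded row count of 2 disagrees with a checkpoint that stores `type_vocab_size=1`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from bert.cpp): number of elements in a
// token-type embedding table with `type_vocab_size` rows of width `n_embd`.
int64_t token_type_embedding_elems(int32_t n_embd, int32_t type_vocab_size) {
    assert(n_embd > 0 && type_vocab_size > 0);
    return static_cast<int64_t>(n_embd) * type_vocab_size;
}

// True when a loader that hard-codes `assumed_rows` would allocate a table
// whose size differs from what the checkpoint actually stores.
bool hardcoded_rows_mismatch(int32_t n_embd, int32_t type_vocab_size,
                             int32_t assumed_rows) {
    return token_type_embedding_elems(n_embd, assumed_rows) !=
           token_type_embedding_elems(n_embd, type_vocab_size);
}
```

With an illustrative `n_embd` of 768, `hardcoded_rows_mismatch(768, 1, 2)` is true, while `hardcoded_rows_mismatch(768, 2, 2)` is false, which is why the constant only works for models that happen to use two token types.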

FFengIll · 2023-09-19T02:03:25Z

bert.cpp

@@ -23,6 +23,7 @@ struct bert_hparams
    int32_t n_intermediate = 1536;
    int32_t n_head = 12;
    int32_t n_layer = 6;
+    int32_t n_vocab_size = 2;


This is a breaking change because it adds a new field to the file header.
I expect more changes like this in the future,
so we may want to move to GGUF.

https://github.com/philpax/ggml/blob/gguf-spec/docs/gguf.md
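
A rough sketch of why the new field breaks existing files, assuming a header that is just a sequence of raw `int32_t` values read in a fixed order. The struct and function names below are hypothetical and the header is cut down to three fields; this is not the actual bert.cpp loader.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical, reduced header: only the fields needed for the illustration.
struct hparams_sketch {
    int32_t n_head = 12;
    int32_t n_layer = 6;
    int32_t n_vocab_size = 2; // the newly added field
};

// Serialize the header. Files written before the change stop after n_layer.
std::string write_header(const hparams_sketch & hp, bool with_new_field) {
    std::ostringstream out(std::ios::binary);
    out.write(reinterpret_cast<const char *>(&hp.n_head), sizeof(hp.n_head));
    out.write(reinterpret_cast<const char *>(&hp.n_layer), sizeof(hp.n_layer));
    if (with_new_field) {
        out.write(reinterpret_cast<const char *>(&hp.n_vocab_size),
                  sizeof(hp.n_vocab_size));
    }
    return out.str();
}

// Read the header the way an updated loader would: the new field is always
// expected. Returns false if the stream ran out of bytes.
bool read_header_new(const std::string & bytes, hparams_sketch & hp) {
    std::istringstream in(bytes, std::ios::binary);
    in.read(reinterpret_cast<char *>(&hp.n_head), sizeof(hp.n_head));
    in.read(reinterpret_cast<char *>(&hp.n_layer), sizeof(hp.n_layer));
    in.read(reinterpret_cast<char *>(&hp.n_vocab_size), sizeof(hp.n_vocab_size));
    return static_cast<bool>(in);
}
```

In this sketch an old-format header simply runs out of bytes. In a real model file more data follows the header, so the extra read would presumably succeed and consume four bytes that belong to the next section, shifting every later read. A self-describing format such as GGUF avoids this class of problem by storing hparams as named key-value pairs.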

FFengIll · 2023-09-20T05:41:10Z

bert.cpp

@@ -364,6 +365,7 @@ struct bert_ctx * bert_load_from_file(const char *fname)
        fin.read((char *)&hparams.n_intermediate, sizeof(hparams.n_intermediate));
        fin.read((char *)&hparams.n_head, sizeof(hparams.n_head));
        fin.read((char *)&hparams.n_layer, sizeof(hparams.n_layer));
+        fin.read((char *)&hparams.n_vocab_size, sizeof(hparams.n_vocab_size));


The same applies here: the new field has to be read from the model file as well.

FFengIll added 2 commits September 14, 2023 18:55

bugfix: type_vocab_size is also a hparam (can not use const as 2).

dd15c0f

bugfix: add type_vocab_size.

a3c8548

FFengIll mentioned this pull request Sep 18, 2023

GGUF file format specification ggerganov/ggml#302

Merged

bugfix: missing read for new n_vocab_size.

7ef3126

bugfix: add n_vocab_size into quantize.

79825d8
