
convert.py --vocab-type hfft produces <0x0A> instead of new lines #5064

Closed
Artefact2 opened this issue Jan 21, 2024 · 2 comments · Fixed by #5341

Comments

@Artefact2
Collaborator

I am using llama.cpp master. I am trying to convert this model to GGUF: https://huggingface.co/tenyx/TenyxChat-8x7B-v1

After running convert.py --vocab-type hfft, the model will not output new lines correctly:

./main -m ~/TenyxChat-8x7B-v1-Q8_0.gguf -p "[INST]Write some poetry about typography.[/INST]" -n 128

system_info: n_threads = 8 / 16 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 
sampling: 
        repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
        top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order: 
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temp 
generate: n_ctx = 512, n_batch = 512, n_predict = 128, n_keep = 0


 [INST]Write some poetry about typography.[/INST] In the world of words, where letters roam,<0x0A>Typography sets the tone, like a artist's home.<0x0A>Each stroke and curve, a deliberate choice,<0x0A>A symphony of shapes, with a unified voice.<0x0A><0x0A>Serif, sans-serif, or script,<0x0A>A message is sent, before even dict.<0x0A>In a font's design, we see the creator's mind,<0x0A>A reflection of culture, that's hard to find.<0x0A><0x0A>Some letters kiss, others stand apart,<0x0A>Creating rhythm and flow, with an artful heart.<0x0A>The negative space
llama_print_timings:        load time =    1578.15 ms
llama_print_timings:      sample time =      16.40 ms /   128 runs   (    0.13 ms per token,  7803.45 tokens per second)
llama_print_timings: prompt eval time =    1893.05 ms /    14 tokens (  135.22 ms per token,     7.40 tokens per second)
llama_print_timings:        eval time =   40938.67 ms /   127 runs   (  322.35 ms per token,     3.10 tokens per second)
llama_print_timings:       total time =   42878.28 ms /   141 tokens

Possibly related to #4622.

@Artefact2
Collaborator Author

Artefact2 commented Jan 21, 2024

The patch below seems to fix the issue.

diff --git a/convert.py b/convert.py
index 06768033..333cc1a0 100755
--- a/convert.py
+++ b/convert.py
@@ -509,11 +509,13 @@ class HfVocab:
 
             # Convert token text to bytes
             token_text = reverse_vocab[token_id].encode("utf-8")
+            if token_text.startswith(b"<0x") and token_text.endswith(b">"):
+                toktype = gguf.TokenType.BYTE
+            else:
+                toktype = self.get_token_type(token_id, self.special_ids)
 
             # Yield token text, score, and type
-            yield token_text, self.get_token_score(token_id), self.get_token_type(
-                token_id, self.special_ids  # Reuse already stored special IDs
-            )
+            yield token_text, self.get_token_score(token_id), toktype
 
     def get_token_type(self, token_id: int, special_ids: set[int]) -> gguf.TokenType:
         # Determine token type based on whether it's a special token

If this looks good, I can open an MR.
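The detection logic from the patch can be exercised on its own. The is_byte_token helper below is a hypothetical standalone version of the startswith/endswith check above, operating on the UTF-8-encoded token text:

```python
def is_byte_token(token_text: bytes) -> bool:
    # Mirrors the check in the patch: byte-fallback tokens are
    # spelled "<0xNN>" in the HF vocab and should be typed as BYTE.
    return token_text.startswith(b"<0x") and token_text.endswith(b">")

# Byte-fallback tokens are matched; ordinary tokens are not.
print(is_byte_token(b"<0x0A>"))  # True
print(is_byte_token(b"poetry"))  # False
```

Note this is a purely lexical test: it keys off the literal token spelling rather than tokenizer metadata, so any regular token that happened to be spelled "<0x...>" would also match.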

@ggerganov
Owner

I guess it's OK, though I would like a second opinion.
