main.cpp depends on common.cpp and sampling.cpp, both of which I consider out of scope for this project to maintain bindings for. (We wrote our own version of grammar.cpp to avoid extra bindings.)
There is a plan on the llama.cpp side to move sampling.cpp behind llama.h, in which case I imagine the sampling params would align a lot better. If there are specific context, model, or sampling params you want to tune, I'd be happy to add them one by one.
I've created #109 to start moving us towards being able to replicate main.cpp in Rust.
When compiling llama.cpp "out of the box" and prompting it as follows (in this case on a Mac M1)...
./main -p "Write a rhyme haiku about a rabbit and a cube." -m llama-2-7b-chat.Q4_0.gguf -n 128 -ngl 33 --mlock --threads 8
We can see that llama.cpp uses the following sampling settings and order...
sampling:
repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 512, n_batch = 512, n_predict = 128, n_keep = 1
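For reference, the truncation stages in that chain are simple transforms over the logits, so they can be mirrored on the Rust side without any new bindings. Below is a minimal, self-contained sketch (all names are illustrative, not part of this crate's API; it assumes a plain Vec of (token_id, logit) candidates) that applies the stages in the order llama.cpp reports. tfs_z and typical_p are no-ops at their 1.000 defaults, and the penalty stage is left out for brevity.

```rust
// Hypothetical stand-alone sketch of llama.cpp's default sampler chain:
// top_k -> top_p -> min_p -> temperature. Candidates are (token_id, logit).

fn top_k(cands: &mut Vec<(u32, f32)>, k: usize) {
    // Keep only the k highest-logit candidates, sorted descending.
    cands.sort_by(|a, b| b.1.total_cmp(&a.1));
    cands.truncate(k);
}

fn top_p(cands: &mut Vec<(u32, f32)>, p: f32) {
    // Nucleus sampling: keep the smallest prefix whose cumulative
    // probability reaches p. Assumes `cands` is non-empty and sorted
    // descending by logit (true after top_k above).
    let max = cands[0].1;
    let probs: Vec<f32> = cands.iter().map(|(_, l)| (l - max).exp()).collect();
    let sum: f32 = probs.iter().sum();
    let mut cum = 0.0;
    let mut keep = cands.len();
    for (i, q) in probs.iter().enumerate() {
        cum += q / sum;
        if cum >= p {
            keep = i + 1;
            break;
        }
    }
    cands.truncate(keep);
}

fn min_p(cands: &mut Vec<(u32, f32)>, p: f32) {
    // Drop candidates whose probability is below p * P(most likely token).
    // In logit space that is a threshold of max_logit + ln(p).
    let max = cands.iter().map(|c| c.1).fold(f32::NEG_INFINITY, f32::max);
    let threshold = max + p.ln();
    cands.retain(|&(_, l)| l >= threshold);
}

fn temperature(cands: &mut [(u32, f32)], t: f32) {
    // Sharpen (t < 1) or flatten (t > 1) the distribution.
    for c in cands.iter_mut() {
        c.1 /= t;
    }
}

fn main() {
    // Fake logits for 100 tokens, decreasing with token id.
    let mut cands: Vec<(u32, f32)> = (0u32..100).map(|i| (i, -(i as f32) * 0.1)).collect();
    // Apply the default values from the log above, in the reported order.
    top_k(&mut cands, 40);
    top_p(&mut cands, 0.95);
    min_p(&mut cands, 0.05);
    temperature(&mut cands, 0.8);
    println!("{} candidates survive truncation", cands.len());
}
```

With the params surfaced one by one as proposed above, a chain like this could then be compared token-for-token against main.cpp's output.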
The ability to replicate these settings and the sampling order would be very useful when comparing results with llama.cpp.
Also, several of these are key to adjusting LLM behaviour, like temperature and the repeat penalties.