
Replicate llama.cpp default settings... #108

Closed

oddpxl opened this issue Feb 26, 2024 · 2 comments
Labels
🪄 enhancement (additions to the software)

oddpxl commented Feb 26, 2024

When compiling llama.cpp "out of the box" and prompting it as follows (in this case on an M1 Mac)...

./main -p "Write a rhyme haiku about a rabbit and a cube." -m llama-2-7b-chat.Q4_0.gguf -n 128 -ngl 33 --mlock --threads 8

We can see that llama.cpp uses the following sampling settings and order...

sampling:
repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000

sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature

generate: n_ctx = 512, n_batch = 512, n_predict = 128, n_keep = 1
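
For reference, here is a rough plain-Rust sketch of what that filter chain does (illustrative only, not this crate's API; the CFG, penalty, tfs_z, and typical_p stages are omitted):

```rust
// Rough sketch of the default filter chain above (top_k -> top_p -> min_p),
// in plain Rust. Illustrative only, not the bindings' API.

fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

/// Token ids that survive top_k = 40, top_p = 0.95, min_p = 0.05.
fn filter_candidates(logits: &[f32], top_k: usize, top_p: f32, min_p: f32) -> Vec<usize> {
    // Sort token ids by logit, most likely first.
    let mut ids: Vec<usize> = (0..logits.len()).collect();
    ids.sort_by(|&a, &b| logits[b].partial_cmp(&logits[a]).unwrap());

    // top_k: keep only the k most likely tokens.
    ids.truncate(top_k);

    // top_p: keep the smallest prefix whose cumulative probability >= p.
    let probs = softmax(&ids.iter().map(|&i| logits[i]).collect::<Vec<_>>());
    let mut cum = 0.0;
    let mut cutoff = ids.len();
    for (n, &p) in probs.iter().enumerate() {
        cum += p;
        if cum >= top_p {
            cutoff = n + 1;
            break;
        }
    }
    ids.truncate(cutoff);

    // min_p: drop tokens less than min_p times as likely as the best one.
    let probs = softmax(&ids.iter().map(|&i| logits[i]).collect::<Vec<_>>());
    let keep = probs.iter().take_while(|&&p| p >= min_p * probs[0]).count();
    ids.truncate(keep);

    // temp = 0.8 would then divide the surviving logits before the final
    // softmax-and-draw; values below 1.0 sharpen the distribution.
    ids
}

fn main() {
    let logits = [3.1, 2.0, 1.5, 0.2, -1.0];
    println!("surviving ids: {:?}", filter_candidates(&logits, 40, 0.95, 0.05));
}
```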

The ability to replicate these settings and the sampling order would be very useful when comparing results with llama.cpp.

Also, several of these are key to adjusting LLM behaviour, such as temperature and the penalties.

MarcusDunn commented Feb 26, 2024

We do replicate the default settings: the Default impls for both LlamaContextParams and LlamaModelParams defer to llama.cpp.
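
Roughly, the deferral pattern looks like this (a simplified sketch with the C API mocked out so it compiles standalone; in the real crate this is an unsafe FFI call to llama_context_default_params() from llama.h, and the struct fields here are a stand-in):

```rust
// Sketch of the deferral pattern: Default just asks llama.cpp for its own
// defaults, so the two always agree. The C API is mocked out here so the
// example compiles standalone.

// Hypothetical stand-in for the C struct (values match the n_ctx/n_batch
// defaults shown in the issue).
#[derive(Debug, Clone, Copy)]
struct RawContextParams {
    n_ctx: u32,
    n_batch: u32,
}

// Stand-in for the FFI call llama_cpp_sys_2::llama_context_default_params().
fn llama_context_default_params() -> RawContextParams {
    RawContextParams { n_ctx: 512, n_batch: 512 }
}

#[derive(Debug)]
pub struct LlamaContextParams {
    raw: RawContextParams,
}

impl Default for LlamaContextParams {
    fn default() -> Self {
        // Defer to llama.cpp rather than duplicating its defaults in Rust.
        Self { raw: llama_context_default_params() }
    }
}

fn main() {
    println!("{:?}", LlamaContextParams::default());
}
```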

main.cpp depends on common.cpp and sampling.cpp, both of which I consider out of scope for this project to maintain bindings to. (I wrote our own version of grammar.cpp to avoid extra bindings.)

There is a plan on the llama.cpp side to move sampling.cpp behind llama.h, in which case I imagine the sampling params would align a lot better. If there are specific context, model, or sampling params you want to tune, I'd be happy to add them one by one.
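
For a sense of what adding one looks like: a builder-style setter on the params wrapper, along these lines (with_n_ctx and the field layout here are illustrative of the pattern, not a guaranteed crate API):

```rust
use std::num::NonZeroU32;

// Illustrative sketch of exposing one llama.cpp param at a time via a
// builder-style setter; the field layout is a simplified stand-in.
pub struct LlamaContextParams {
    n_ctx: u32, // in llama.cpp, 0 means "take the value from the model"
}

impl LlamaContextParams {
    /// Hypothetical setter: None maps to llama.cpp's 0 sentinel.
    pub fn with_n_ctx(mut self, n_ctx: Option<NonZeroU32>) -> Self {
        self.n_ctx = n_ctx.map_or(0, NonZeroU32::get);
        self
    }
}

fn main() {
    let params = LlamaContextParams { n_ctx: 0 }.with_n_ctx(NonZeroU32::new(2048));
    println!("n_ctx = {}", params.n_ctx);
}
```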

I've created #109 to slowly move us towards being able to replicate main.cpp in Rust.

MarcusDunn added the 🪄 enhancement label Feb 26, 2024
oddpxl commented Feb 26, 2024

That all makes sense!

...and clearly I need to read up on sampling; thanks for the pointers!
