Feature request: expose min_p #254
min_p really is a game changer
What API? Local?
Local, tbh. It doesn't have any kind of big performance hit, and it noticeably raises the coherency of interactions with just a minor tweak (I used to have a laptop with an experimental kobold.cpp build with min-p, and it fixed the majority of coherence issues even in much smaller models). I'm biased toward it right now since Android is literally all I have for inference, so consider me dedicated on-device Android feedback lol.
It will be easy enough to add. But out of curiosity, what is it meant to do? Is it a new parameter of llama.cpp or something?
I first heard of min-p through kalomaze's experimental kobold.cpp builds; it's quite fascinating. Personally I'm not 100% certain whether llama.cpp has implemented it, but it should be standardized. I can scour recent releases and see if llama.cpp has it. "Every possible token has a probability percentage attached to it. …" https://github.com/kalomaze/koboldcpp/releases/tag/minP Also, some other work from kalomaze that gave great results for me was dynamic temp and noisy sampling. Not sure if those are only in their experimental releases, but interesting nonetheless.
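For anyone curious what the sampler actually does: min-p keeps only the tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalizes and samples from what's left. Here's a minimal Python sketch of that idea (the function name `min_p_sample` is just for illustration; the real implementations in kobold.cpp/llama.cpp are in C++ and differ in details):

```python
import math
import random

def min_p_sample(logits, min_p=0.05, temperature=1.0):
    """Sample a token index using min-p filtering (illustrative sketch)."""
    # Softmax with temperature (subtract max for numerical stability).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # min-p: keep tokens whose probability is at least
    # min_p * (probability of the top token).
    threshold = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]

    # Renormalize over the surviving tokens and sample.
    z = sum(p for _, p in kept)
    r = random.random() * z
    acc = 0.0
    for i, p in kept:
        acc += p
        if acc >= r:
            return i
    return kept[-1][0]
```

The appeal over top-k or plain top-p is that the cutoff scales with the model's confidence: when one token dominates, almost everything else is filtered out; when the distribution is flat, many candidates survive.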
Ahh, OK. If it's a kobold.cpp exclusive feature I probably won't add it, but if it's in llama.cpp I'll add it, no problem.
Good news! Apparently llama.cpp did merge it as a feature :D
PR Merge:
Reddit:
Please add min_p to user parameters!
Thanks