
Feature request: expose min_p #254

Closed
morpheus2448 opened this issue Jan 9, 2024 · 7 comments
Labels
enhancement New feature or request good first issue Good for newcomers local Issue related to local generation

Comments

@morpheus2448
Contributor

Please add min_p to user parameters!

Thanks

@Digitous

Digitous commented Jan 9, 2024

min_p really is a game changer.

@danemadsen
Member

What API? Local?

@danemadsen danemadsen added enhancement New feature or request good first issue Good for newcomers labels Jan 9, 2024
@Digitous

Digitous commented Jan 9, 2024

Local, tbh. It doesn't have any kind of big performance hit, and it noticeably raises the coherency of interactions with just a minor tweak (I used to have a laptop with an experimental kobold.cpp build w/ min-p, and it fixed the majority of coherence issues even in much smaller models). I'm biased toward that right now, as Android is literally all I have for inference, so I'm dedicated to on-device Android feedback lol.

@danemadsen
Member

It will be easy enough to add. But out of curiosity, what is it meant to do? Is it a new parameter of llama.cpp or something?

@danemadsen danemadsen added the local Issue related to local generation label Jan 10, 2024
@Digitous

I first heard of min-p through kalomaze's experimental kobold.cpp builds; it's quite fascinating. Personally, I'm not 100% certain whether llama.cpp has implemented this, but it should be standardized. I can scour recent releases and see if llama.cpp has it.

"Every possible token has a probability percentage attached to it.
The base min p value represents the starting required percentage. (For example, 0.05 = only include tokens that are at least 5% probable)
This gets multiplied by the top token in the entire list's probability. So if your top token is 90%, then that 5% is multiplied by 0.9x (4.5%)
So if the top token is 90% probable, and your base_min_p is set to 0.05, then only tokens that are at least 4.5% probable will be sampled from before temperature is applied.
This method seems more effective at selecting the reasonable tokens compared to both Top P and Top K."

https://github.com/kalomaze/koboldcpp/releases/tag/minP
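To make the arithmetic above concrete, here's a minimal Python sketch of the filtering step (illustrative only; the function name and example numbers are mine, not the actual kobold.cpp/llama.cpp code):

```python
import numpy as np

def min_p_filter(probs: np.ndarray, base_min_p: float = 0.05) -> np.ndarray:
    """Keep only tokens whose probability is at least base_min_p times
    the top token's probability, then renormalize. A sketch of the idea
    quoted above, not the real implementation."""
    threshold = base_min_p * probs.max()        # e.g. 0.05 * 0.90 = 0.045
    filtered = np.where(probs >= threshold, probs, 0.0)
    return filtered / filtered.sum()            # renormalize the survivors

# Top token at 90% -> cutoff is 4.5%, so the 3% and 2% tokens are dropped.
probs = np.array([0.90, 0.05, 0.03, 0.02])
print(min_p_filter(probs))                      # [0.947 0.053 0.    0.   ]
```

Note how the cutoff scales with the model's confidence: a confident distribution keeps few candidates, while a flat one keeps many, which is why it behaves better than a fixed Top K or Top P cutoff.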

Some other interesting work from kalomaze that gave great results for me was dynamic temp and noisy sampling. I'm not sure if those are only in their experimental releases, but interesting nonetheless.

https://github.com/kalomaze/koboldcpp/releases

@danemadsen
Member


Ahh, OK. If it's a kobold.cpp-exclusive feature I probably won't add it, but if it's in llama.cpp I'll add it, no problem.

@Digitous

Digitous commented Jan 10, 2024

Good news! Apparently llama.cpp did merge it as a feature :D
Usually I do due diligence up front, but I had a few things I was juggling.

PR Merge:
ggerganov/llama.cpp#3841

Reddit:
https://www.reddit.com/r/LocalLLaMA/comments/17lb6et/min_p_sampler_merged_in_llamacpp/
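For anyone finding this later: once exposed, the parameter can be set like any other sampler knob. A rough sketch of using it via the llama-cpp-python bindings (the model path is a placeholder, and the `min_p` keyword assumes the bindings have picked up the merged sampler):

```python
from llama_cpp import Llama

# Placeholder path; any GGUF model works here.
llm = Llama(model_path="./models/model.gguf")

# min_p=0.05 keeps only tokens with at least 5% of the
# top token's probability before sampling.
out = llm(
    "Once upon a time",
    max_tokens=64,
    temperature=0.8,
    min_p=0.05,
)
print(out["choices"][0]["text"])
```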
