MinP for Exllama 2

@kalomaze kalomaze released this 28 Oct 14:06

Min P sampling added. When Top P is set to 0.69, Min P takes over and the cutoff scales based on the 'Min P' value. Replace the sampler.py in /text-generation-webui-main/installer_files/env/Lib/site-packages/exllamav2/generator with the one from this release and it should work.

The way that it works is:

  • Every possible token has a probability attached to it.
  • The base Min P value is the starting required probability. (For example, 0.05 = only include tokens that are at least 5% probable.)
  • This value is multiplied by the probability of the top token in the list. So if your top token is 90% probable, the 5% is multiplied by 0.9, giving 4.5%.
  • With a 90% top token and base_min_p set to 0.05, only tokens that are at least 4.5% probable are kept for sampling. This filtering happens before temperature is applied.
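
The steps above can be sketched in a few lines of plain Python. This is an illustration only, not the code in the released sampler.py: the function name min_p_filter, the use of a plain list instead of tensors, and the renormalization of the surviving probabilities are all assumptions made for the example.

```python
def min_p_filter(probs, base_min_p=0.05):
    """Hypothetical sketch of Min P filtering on a list of token probabilities."""
    # Scale the base value by the probability of the most likely token.
    threshold = base_min_p * max(probs)
    # Zero out every token that falls below the scaled threshold.
    kept = [p if p >= threshold else 0.0 for p in probs]
    # Renormalize so the surviving probabilities sum to 1 (an assumption here).
    total = sum(kept)
    return [p / total for p in kept], threshold


# Example from the text: the top token is 90% probable, base_min_p is 0.05.
probs = [0.90, 0.05, 0.03, 0.02]
filtered, threshold = min_p_filter(probs, base_min_p=0.05)
print(threshold)  # roughly 0.045, i.e. 4.5%
print(filtered)   # the 3% and 2% tokens are removed
```

In this example only the 90% and 5% tokens clear the 4.5% cutoff. When the top token is less dominant, the cutoff drops with it, so more candidates survive.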

This method seems more effective at selecting reasonable tokens than either Top P or Top K.

Edit the SamplerBaseMinP.txt file to change the base 'consideration' value. The default is 0.05 (5%), but lower values can work surprisingly well even with a high temperature.

image

This is how you toggle it on.

Note: This is built on the ALT version of the Entropy sampling implementation. Dynamic Temp is still applied only when your temperature is set to 1.84, so you are not forced to use it.

Graphic Explanation of Min P:

image