
Generation parameters

Temporarily out of service

This guide explains how to fine-tune the generation settings for GPT models. In NeuroGPT, you can find them under Settings >> Parameters.


temperature

This parameter controls the diversity of the model's responses. A lower value makes generation more deterministic and predictable, since the model tends to choose the most probable words. A higher value makes generation more random and creative, allowing the model to pick less probable words; set too high, it can produce incoherent and illogical responses. The default value is 1. Values above 1 flatten the probability distribution, so less probable words are chosen more often than they would be otherwise.
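As a rough sketch of the mechanics (not NeuroGPT's actual code): temperature divides the model's logits before the softmax, so T < 1 sharpens the distribution and T > 1 flattens it.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw logits into probabilities, scaled by temperature.

    T < 1 sharpens the distribution (more deterministic),
    T > 1 flattens it (more random).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 0.5))  # sharper: ~[0.86, 0.12, 0.02]
print(softmax_with_temperature(logits, 2.0))  # flatter: ~[0.50, 0.30, 0.19]
```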

top_p

The top-p parameter (also known as nucleus sampling) limits the pool of candidate words for the next token: the model keeps only the most probable words whose cumulative probability reaches p and samples from that reduced set. This still allows varied word choices while cutting off the unlikely tail. How does it differ from temperature? Top-p controls the size of the word pool from which the next word is chosen, while temperature regulates how randomly the model picks from within that pool. The default value is 0.9.
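A minimal sketch of the filtering step, assuming a simple word-to-probability mapping:

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest set of words whose cumulative probability
    reaches p, then renormalize so the kept probabilities sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for word, prob in ranked:
        kept[word] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {word: prob / total for word, prob in kept.items()}

probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "xylophone": 0.05}
print(top_p_filter(probs, p=0.9))
# "xylophone" is dropped; the rest are renormalized:
# {'cat': 0.526..., 'dog': 0.315..., 'fish': 0.157...}
```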

n_choices

The n_choices parameter sets the number of alternative responses generated for a single request. The default value is 1.
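For illustration, assuming n_choices maps to the n parameter of an OpenAI-compatible API (a plausible but unverified mapping, with a hypothetical local endpoint), a request for three alternatives might look like this:

```python
import openai

openai.api_base = "http://127.0.0.1:1337/v1"  # hypothetical local NeuroGPT endpoint
openai.api_key = "anything"                    # placeholder key for illustration

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Suggest a name for a cat cafe."}],
    n=3,  # n_choices: generate three alternative answers
)
for choice in response.choices:
    print(choice.message.content)
```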

stop

This is a way to tell the model when to stop generating. In this parameter you can set specific stop words or phrases; as soon as the model produces one of them, it stops generating its response. This is useful when you want to control the length of the response or cut it off at a natural point.
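A toy sketch of the effect (in reality the cutoff happens server-side during decoding): output is truncated at the first occurrence of any stop sequence, and the stop sequence itself is not returned.

```python
def apply_stop(text, stop_sequences):
    """Truncate text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for seq in stop_sequences:
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

raw = "Answer: 42\nUser: next question"
print(apply_stop(raw, ["\nUser:"]))  # -> "Answer: 42"
```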

max_context

This parameter controls the maximum number of tokens kept in the dialogue context. A token is a unit of text, roughly a word or a fragment of a word, including punctuation. For example, max_context = 3000 means the model will "remember" only the last 3000 tokens and use them to generate a response. Note that the model's own limit still applies: gpt-3.5-turbo has a maximum context of 4097 tokens, so even if you set max_context = 10000, the effective maximum remains 4097.
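A minimal sketch of how such trimming might work, using the tiktoken library to count tokens (NeuroGPT's actual implementation may differ, and this ignores the few tokens of per-message overhead):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def trim_history(messages, max_context=3000):
    """Keep the most recent messages that fit within max_context tokens."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        n_tokens = len(enc.encode(msg["content"]))
        if used + n_tokens > max_context:
            break
        kept.append(msg)
        used += n_tokens
    return list(reversed(kept))  # restore chronological order
```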

max_generations

This parameter sets the maximum number of tokens in the generated response. If you need a long and detailed answer, increase this value, but keep in mind that longer responses take longer for the API call to return.
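Assuming max_generations maps to the API's max_tokens parameter (again an illustrative mapping, not confirmed for NeuroGPT), it combines with the other settings like this:

```python
import openai

openai.api_base = "http://127.0.0.1:1337/v1"  # hypothetical endpoint, as above
openai.api_key = "anything"

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet."}],
    max_tokens=512,   # assumed mapping for max_generations: cap the reply at 512 tokens
    temperature=0.7,
    top_p=0.9,
)
print(response.choices[0].message.content)
```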

presence_penalty

Changing this parameter affects how likely the model is to repeat words or phrases in its response. Increasing the value penalizes any word that has already appeared at least once, nudging the model toward new words and topics; decreasing it allows more repetition.

frequency_penalty

This parameter penalizes words in proportion to how often they have already appeared in the generated text: the more times a word has been used, the less likely it is to be picked again. Higher values push the model toward words it has not yet repeated; lower values let frequently used words keep their full weight.
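Both penalties can be pictured as subtractions from the model's logits before sampling, following the scheme described in the OpenAI API documentation: the frequency penalty scales with how many times a token has already appeared, while the presence penalty is a flat deduction applied once a token has appeared at all. A sketch:

```python
from collections import Counter

def apply_penalties(logits, generated_tokens,
                    presence_penalty=0.0, frequency_penalty=0.0):
    """Adjust per-token logits using the penalty scheme from the OpenAI docs:
    logit[token] -= count(token) * frequency_penalty
                    + (1 if count(token) > 0 else 0) * presence_penalty
    """
    counts = Counter(generated_tokens)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            adjusted[token] -= count * frequency_penalty + presence_penalty
    return adjusted

logits = {"the": 3.0, "a": 2.5, "cat": 1.0}
print(apply_penalties(logits, ["the", "the", "a"],
                      presence_penalty=0.5, frequency_penalty=0.3))
# 'the': 3.0 - 2*0.3 - 0.5 = 1.9; 'a': 2.5 - 0.3 - 0.5 = 1.7; 'cat' unchanged
```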