
Logprobs suggestions #948

Merged
merged 1 commit into main from dev/enoch/logprobs on Dec 21, 2023
Conversation

@enochcheung (Contributor) commented Dec 21, 2023

Summary

Some minor re-wording to clarify a few small things.

Motivation

Why are these changes necessary? How do they improve the cookbook?

I'll comment on each change individually.


For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

  • I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
  • I have conducted a self-review of my content based on the contribution guidelines:
    • Relevance: This content is related to building with OpenAI technologies and is useful to others.
    • Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
    • Spelling and Grammar: I have checked for spelling or grammatical mistakes.
    • Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
    • Correctness: The information I include is correct and all of my code executes successfully.
    • Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.

@@ -8,11 +8,12 @@
 "\n",
 "This notebook demonstrates the use of the `logprobs` parameter in the Chat Completions API. When `logprobs` is enabled, the API returns the log probabilities of each output token, along with a limited number of the most likely tokens at each token position and their log probabilities. The relevant request parameters are:\n",
 "* `logprobs`: Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message. This option is currently not available on the `gpt-4-vision-preview` model.\n",
-"* `top_logprobs`: An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.\n",
+"* `top_logprobs`: An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to true if this parameter is used.\n",
@enochcheung (Contributor, Author) commented:
just adding backticks
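
Together, the two parameters look like this in a request. This is only a minimal sketch against the v1 OpenAI Python SDK; the model name and prompt are placeholders, not part of the notebook being edited:

```python
from openai import OpenAI

client = OpenAI()

# Request per-token logprobs plus the top 2 alternatives at each token position.
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder model name
    messages=[{"role": "user", "content": "Name a primary color."}],
    logprobs=True,
    top_logprobs=2,  # requires logprobs=True
)

# Each element carries the chosen token, its logprob, and the top alternatives.
for token_info in completion.choices[0].logprobs.content:
    print(token_info.token, token_info.logprob)
    for alt in token_info.top_logprobs:
        print("  alt:", alt.token, alt.logprob)
```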

"Log probabilities of output tokens indicate the likelihood of each token occurring in the sequence given the context. To simplify, a logprob is `log(p)`, where `p` = probability of a token occurring at a specific position based on the other tokens in the sentence. Some key points about `logprobs`:\n",
"* Higher log probabilities suggest a higher likelihood of the token in that context. This allows users to gauge the model's confidence in its output or explore alternative responses the model considered. \n",
"* It also allow us to compute the overall probability of a sequence as the sum of the log probs of the individual tokens. This is useful for scoring and ranking model outputs. It's pretty common to take the average logprob of a sentence to choose the best generation.\n",
"Log probabilities of output tokens indicate the likelihood of each token occurring in the sequence given the context. To simplify, a logprob is `log(p)`, where `p` = probability of a token occurring at a specific position based on the previous tokens in the context. Some key points about `logprobs`:\n",
@enochcheung commented:
Slightly clarified

  • causal masking: the distribution only depends on previous tokens
  • replaced "sentence" with "context"

"* Higher log probabilities suggest a higher likelihood of the token in that context. This allows users to gauge the model's confidence in its output or explore alternative responses the model considered. \n",
"* It also allow us to compute the overall probability of a sequence as the sum of the log probs of the individual tokens. This is useful for scoring and ranking model outputs. It's pretty common to take the average logprob of a sentence to choose the best generation.\n",
"Log probabilities of output tokens indicate the likelihood of each token occurring in the sequence given the context. To simplify, a logprob is `log(p)`, where `p` = probability of a token occurring at a specific position based on the previous tokens in the context. Some key points about `logprobs`:\n",
"* Higher log probabilities suggest a higher likelihood of the token in that context. This allows users to gauge the model's confidence in its output or explore alternative responses the model considered.\n",
@enochcheung commented:
removed trailing space

"* It also allow us to compute the overall probability of a sequence as the sum of the log probs of the individual tokens. This is useful for scoring and ranking model outputs. It's pretty common to take the average logprob of a sentence to choose the best generation.\n",
"Log probabilities of output tokens indicate the likelihood of each token occurring in the sequence given the context. To simplify, a logprob is `log(p)`, where `p` = probability of a token occurring at a specific position based on the previous tokens in the context. Some key points about `logprobs`:\n",
"* Higher log probabilities suggest a higher likelihood of the token in that context. This allows users to gauge the model's confidence in its output or explore alternative responses the model considered.\n",
"* Logprob can be any negative number or `0.0`. `0.0` corresponds to 100% probability.\n",
@enochcheung commented:
I think it is sometimes surprising that logprobs are negative. It's important to point this out in order to contextualize the next point, which explains why summing results in a joint probability that is lower than the individual token probabilities.

A Collaborator replied:
good point
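
As a quick illustration of that point (plain Python with illustrative values, not real API output), a logprob of `0.0` maps back to a probability of 1, and anything below maps to a probability under 1:

```python
import math

# logprob = log(p), and probabilities lie in (0, 1], so logprobs are always <= 0.0.
print(math.exp(0.0))    # 1.0   -> a logprob of 0.0 means 100% probability
print(math.exp(-0.69))  # ~0.50 -> roughly a coin flip
print(math.exp(-4.6))   # ~0.01 -> roughly a 1% probability
```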

"Log probabilities of output tokens indicate the likelihood of each token occurring in the sequence given the context. To simplify, a logprob is `log(p)`, where `p` = probability of a token occurring at a specific position based on the previous tokens in the context. Some key points about `logprobs`:\n",
"* Higher log probabilities suggest a higher likelihood of the token in that context. This allows users to gauge the model's confidence in its output or explore alternative responses the model considered.\n",
"* Logprob can be any negative number or `0.0`. `0.0` corresponds to 100% probability.\n",
"* Logprobs allow us to compute the joint probability of a sequence as the sum of the logprobs of the individual tokens. This is useful for scoring and ranking model outputs. Another common approach is to take the average per-token logprob of a sentence to choose the best generation.\n",
@enochcheung commented:
overall -> joint: "overall" can sometimes be read as the total probability, which is kind of the opposite.

Slight re-wording to point out that summing and averaging are two different approaches; you choose one or the other, not both.
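
A short sketch of the two approaches, using hypothetical per-token logprobs rather than real API output (in practice the values could be gathered from `completion.choices[0].logprobs.content` as in the earlier sketch):

```python
import math

# Hypothetical per-token logprobs for one generation.
token_logprobs = [-0.02, -0.35, -1.10]

joint_logprob = sum(token_logprobs)                 # log P(whole sequence)
joint_prob = math.exp(joint_logprob)                # never higher than any single token's probability
avg_logprob = joint_logprob / len(token_logprobs)   # per-token average, common for ranking generations

print(joint_logprob, joint_prob, avg_logprob)
```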

@shyamal-anadkat merged commit f6b0cb1 into main on Dec 21, 2023
@shyamal-anadkat deleted the dev/enoch/logprobs branch on December 21, 2023 at 05:57
katia-openai pushed a commit that referenced this pull request on Feb 29, 2024