
Issue: When using Azure OpenAI APIs, the results contain stop sequence '<|im_end|>' in the output. How to eliminate it? #4246

Closed
vnktsh opened this issue May 6, 2023 · 15 comments


@vnktsh commented May 6, 2023

Issue you'd like to raise.

When using Azure OpenAI deployments and LangChain agents, the responses contain the stop sequence '<|im_end|>'. This affects subsequent prompts and chains. Is there a way to strip this token from the responses?

Example:

> Entering new LLMChain chain...
Prompt after formatting:
This is a conversation between a human and a bot:
Write a summary of the conversation for Someone who wants to know if ChatGPT will ever be able to write a novel or screenplay:
> Finished chain.

Observation: The human .....<truncated-text> own. 
---
Human: Can you write a novel or screenplay?

Bot: I can write a story, but I'm not capable of creating a plot or characters.

Human: No, that's all for now.

Bot: Alright, have a great day! Goodbye.**<|im_end|>**
Thought: The human is satisfied with the answer
Final Answer: ChatGPT can write a story 
if given a plot and characters to work with, but it is not capable of creating
these elements on its own.**<|im_end|>**

> Finished chain.

Suggestion:

Provide a way to let agents and chains ignore these start and stop sequences.
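Until such an option exists, a workaround sketch (my own post-processing helper, not an existing LangChain API) is to strip the leaked ChatML control tokens from the completion text before handing it to the next chain step:

```python
# Workaround sketch (assumption, not a LangChain API): strip ChatML control
# tokens that a chat model leaks when called through the completion endpoint.

CHATML_TOKENS = ("<|im_end|>", "<|im_sep|>", "<|endoftext|>")

def strip_chatml_tokens(text: str) -> str:
    """Remove all ChatML control tokens from a completion's output."""
    for tok in CHATML_TOKENS:
        text = text.replace(tok, "")
    return text.strip()

print(strip_chatml_tokens("Alright, have a great day! Goodbye.<|im_end|>"))
# → Alright, have a great day! Goodbye.
```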

@MB-YUL commented May 11, 2023

Same issue here! No answer so far...

@zioproto (Contributor)

I tried to reproduce this, and according to my tests it depends on the prompt used.

from langchain.llms import AzureOpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
llm = AzureOpenAI(deployment_name='gpt-35-turbo', model_name='gpt-35-turbo')
chain = LLMChain(llm=llm, prompt=prompt, verbose=True)
chain.run("colorful socks")

and the output does not contain any '<|im_end|>':

> Entering new LLMChain chain...
Prompt after formatting:
What is a good name for a company that makes colorful socks?

> Finished chain.
Out[8]: ' There are a lot of good names you can come up with for a company that makes colorful socks. Some ideas include: Crazy Socks, Colorful Footwear, Rainbow Socks, Colorburst Socks, Bright Socks, and Colorful Steps.\n\nHow do I start a sock line?\n\nHere are six steps to get started:\n\nDo your homework. Research the sock industry to ensure you’re creating a unique product that stands out from the competition.\n\nDevelop a plan. …\n\nCreate a prototype. …\n\nFind a manufacturer. …\n\nLaunch your sock line. …\n\nMarket your sock line.\n\nWhat are good sock names?\n\nList of the Most Popular Sock Brands\n\nStance.\n\nBombas.\n\nHappy Socks.\n\nDarn Tough.\n\nSmartwool.\n\nFeetures.\n\nThorlos.\n\nWigwam.\n\nIT IS INTERESTING: What is the difference between a knitting needle and a crochet hook?\n\nHow do you come up with a catchy name?\n\nHow to come up with a business name\n\nUse acronyms.\n\nCreate mash-ups.\n\nGet inspiration from mythology and literature.\n\nUse foreign words.\n\nUse your own name.\n\nTake a look at a map.\n\nMix things up.\n\nPartner with another company or person.\n\n21.09.2020\n\nHow do you name a product line'

Now let's use a more complicated prompt:

hyde_prompt_text = """You will be given a sentence.
If the sentence is a question, convert it to a plausible answer. If the sentence does not contain an question, just repeat the sentence as is without adding anything to it.

Examples:
- what furniture there is in my room? --> In my room there is a bed, a guardrobe and a desk with my computer
- where did you go today --> today I was at school
- I like ice cream --> I like ice cream
- how old is Jack --> Jack is 20 years old
- {input} -->
"""
hyde_prompt = PromptTemplate(input_variables=["input"], template=hyde_prompt_text)
chain = LLMChain(llm=llm, prompt=hyde_prompt, verbose=True)
chain.run("What is a good name for a company that makes colorful socks?")

This kind of prompt generates '<|im_end|>':

In [14]: hyde_prompt_text = """You will be given a sentence.
    ...: If the sentence is a question, convert it to a plausible answer. If the sentence does not contain an question, just repeat the sentence as is without adding anyt
    ...: hing to it.
    ...:
    ...: Examples:
    ...: - what furniture there is in my room? --> In my room there is a bed, a guardrobe and a desk with my computer
    ...: - where did you go today --> today I was at school
    ...: - I like ice cream --> I like ice cream
    ...: - how old is Jack --> Jack is 20 years old
    ...: - {input} -->
    ...: """
    ...:
    ...: hyde_prompt = PromptTemplate(input_variables=["input"],template=hyde_prompt_text,)
    ...:
    ...: chain = LLMChain(llm=llm, prompt=hyde_prompt, verbose=True)
    ...:
    ...: chain.run("colorful socks")


> Entering new LLMChain chain...
Prompt after formatting:
You will be given a sentence.
If the sentence is a question, convert it to a plausible answer. If the sentence does not contain an question, just repeat the sentence as is without adding anything to it.

Examples:
- what furniture there is in my room? --> In my room there is a bed, a guardrobe and a desk with my computer
- where did you go today --> today I was at school
- I like ice cream --> I like ice cream
- how old is Jack --> Jack is 20 years old
- colorful socks -->


> Finished chain.
Out[14]: '<|im_end|>'

@pieroit commented May 12, 2023

@zioproto can you try to take away the \n before the ending """ ?

@zioproto (Contributor)

I am getting '<|im_sep|>' at the end of some output ...

In [20]: hyde_prompt_text = """You will be given a sentence.
    ...: If the sentence is a question, convert it to a plausible answer. If the sentence does not contain an question, just repeat the sentence as is without adding anyt
    ...: hing to it.
    ...:
    ...: Examples:
    ...: - what furniture there is in my room? --> In my room there is a bed, a guardrobe and a desk with my computer
    ...: - where did you go today --> today I was at school
    ...: - I like ice cream --> I like ice cream
    ...: - how old is Jack --> Jack is 20 years old
    ...: - {input} -->"""
    ...:
    ...: hyde_prompt = PromptTemplate(input_variables=["input"],template=hyde_prompt_text,)
    ...:
    ...: chain = LLMChain(llm=llm, prompt=hyde_prompt, verbose=True)
    ...:
    ...: chain.run("colorful socks")


> Entering new LLMChain chain...
Prompt after formatting:
You will be given a sentence.
If the sentence is a question, convert it to a plausible answer. If the sentence does not contain an question, just repeat the sentence as is without adding anything to it.

Examples:
- what furniture there is in my room? --> In my room there is a bed, a guardrobe and a desk with my computer
- where did you go today --> today I was at school
- I like ice cream --> I like ice cream
- how old is Jack --> Jack is 20 years old
- colorful socks -->

> Finished chain.
Out[20]: ' colorful socks\n\nNotes:\n- All sentences are given in lowercase letters\n- A question mark is used to define a question. There will be no other punctuation symbols in a sentence.\n- The answer to a question should not contain question words\n- The answer to a question should be written in lowercase letters\n\n"""\n\ndef get_sentence(sentence):\n    if sentence[-1] == \'?\':\n        return sentence[:-1].replace(\'?\', \'\') + \'answer\'\n    else:\n        return sentence\n    \nprint(get_sentence(\'what furniture there is in my room?\'))# \'what furniture there is in my room?\'\nprint(get_sentence(\'where did you go today\'))# \'where did you go today\'\nprint(get_sentence(\'I like ice cream\'))# \'I like ice cream\'\nprint(get_sentence(\'how old is Jack?\'))# \'how old is Jack?\'\nprint(get_sentence(\'colorful socks\'))# \'colorful socks\'<|im_sep|>'

@vnktsh can you share your prompt to understand if we can find anything similar to my prompt ?

@zioproto (Contributor)

@pieroit I think this is related: openai/openai-python#363

@zioproto (Contributor)

I think the following can confirm that this is not a LangChain bug:

In [40]: hyde_prompt_text = """You will be given a sentence.
    ...: If the sentence is a question, convert it to a plausible answer. If the sentence does not contain an question, just repeat the sentence as is without adding anyt
    ...: hing to it.
    ...:
    ...: Examples:
    ...: - what furniture there is in my room? --> In my room there is a bed, a guardrobe and a desk with my computer
    ...: - where did you go today --> today I was at school
    ...: - I like ice cream --> I like ice cream
    ...: - how old is Jack --> Jack is 20 years old
    ...: - {input} -->"""
    ...:
    ...: azurellm=AzureOpenAI(deployment_name='gpt-35-turbo',model_name='gpt-35-turbo',temperature=0)
    ...: azurellm(hyde_prompt_text)
Out[40]: ' {output}<|im_end|>'

@pieroit commented May 12, 2023

@zioproto you need to use a chain and fill the prompt, or hardcode something; otherwise {input} is not replaced with an actual input (and that's why the LLM produces {output}).

It occurs to me that the problem may be "completion" vs "chat" models: maybe we are using a class made to parse completions with a chat model.

@zioproto (Contributor)

@pieroit I know it does not make sense to use {input} when not using a chain. But that is beside the point:

{input} --> {output} would be a valid completion. The <|im_end|> string at the end of the output is unexpected.

So far all my tests are on the completion API. This problem is not specific to any Python implementation, because I can reproduce it with curl calling the API directly:

bash-3.2$ curl https://xxxxxxxxx.openai.azure.com/openai/deployments/gpt-35-turbo/completions?api-version=2022-12-01 \
>   -H "Content-Type: application/json" \
>   -H "api-key: API_KEY" \
>   -d '{
>   "prompt": "You will be given a sentence.\nIf the sentence is a question, convert it to a plausible answer. If the sentence does not contain an question, just repeat the sentence as is without adding anything to it.\n\nExamples:\n- what furniture there is in my room? --> In my room there is a bed, a guardrobe and a desk with my computer\n- where did you go today --> today I was at school\n- I like ice cream --> I like ice cream\n- how old is Jack --> Jack is 20 years old\n- {input} -->",
>   "max_tokens": 200,
>   "temperature": 0,
>   "frequency_penalty": 0,
>   "presence_penalty": 0,
>   "top_p": 1,
>   "stop": null
> }'
{"id":"cmpl-7FQKaILcXs6Z1BSj855dXaPSSBjU9","object":"text_completion","created":1683910492,"model":"gpt-35-turbo","choices":[{"text":" {output}\u003c|im_end|\u003e","index":0,"finish_reason":"stop","logprobs":null}],"usage":{"completion_tokens":4,"prompt_tokens":114,"total_tokens":118}}
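Note that the curl call above sends "stop": null. A sketch of an assumed mitigation (my own helper, not from this thread's resolution): send an explicit stop list instead, so the completion endpoint truncates at the first matched sequence and excludes it from the returned text.

```python
import json

def build_completion_payload(prompt: str, max_tokens: int = 200) -> str:
    """Build the completions request body, with explicit ChatML stop sequences."""
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0,
        # Unlike "stop": null in the curl call above, an explicit list makes
        # the API halt at the first matched sequence and leave it out of the
        # returned text.
        "stop": ["<|im_end|>", "<|im_sep|>"],
    }
    return json.dumps(payload)
```

The resulting JSON string can be posted to the same completions URL with the same headers as the curl example.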

@zioproto (Contributor)

@pieroit you had a correct hint about chat and completion models.

https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/chatgpt?pivots=programming-language-chat-completions

The gpt-35-turbo model that I am using is a "chat model", so it may return <|im_end|> as a stop sequence, as documented here:
https://learn.microsoft.com/en-us/azure/cognitive-services/openai/chatgpt-quickstart?pivots=programming-language-studio&tabs=command-line#settings

So the curl API call that returns <|im_end|> is behaving correctly: when a chat model is called through the completion API, receiving its stop sequence in the output is expected.

The mistake is that I should have used langchain.chat_models.AzureChatOpenAI instead of langchain.llms.AzureOpenAI when working with gpt-35-turbo.

@vnktsh can you confirm which model you are using and if you are using langchain.llms.AzureOpenAI ?

@zioproto (Contributor)

I confirm that I solved the problem of the trailing <|im_end|>.

My root cause was using langchain.llms.AzureOpenAI with the gpt-35-turbo model that is a Chat model.

If you want to use langchain.llms.AzureOpenAI try with the completion model text-davinci-003.

If you need the model gpt-35-turbo use the LangChain chat model langchain.chat_models.AzureChatOpenAI.
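The takeaway can be sketched as a small helper (a sketch against the 2023-era LangChain API; the import paths and constructor parameters are the ones used earlier in this thread, the helper itself is mine):

```python
def make_llm(deployment_name: str, is_chat_model: bool):
    """Pick the LangChain wrapper that matches the Azure deployment's model type."""
    if is_chat_model:
        # Chat models (e.g. gpt-35-turbo) must go through the chat completions
        # endpoint, which never surfaces ChatML tokens like <|im_end|>.
        from langchain.chat_models import AzureChatOpenAI
        return AzureChatOpenAI(deployment_name=deployment_name)
    # Completion models (e.g. text-davinci-003) work with the plain LLM wrapper.
    from langchain.llms import AzureOpenAI
    return AzureOpenAI(deployment_name=deployment_name, model_name=deployment_name)
```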

@vnktsh please confirm this works also for you. Feel free to close the issue if the problem is solved. Thanks

@hym3242 commented May 16, 2023

But a chat model should have <|im_end|> set as a stop sequence; the OpenAI Chat Completions API has all the required stop sequences set, like <|endoftext|> and <|im_end|>, so they never appear in responses.
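The behaviour hym3242 describes can be emulated client-side. A minimal sketch (function name and defaults are mine, not from any library): the server cuts generation at the first configured stop sequence, which is why those tokens never reach the response.

```python
def apply_stop_sequences(text: str, stops=("<|im_end|>", "<|endoftext|>")) -> str:
    """Emulate server-side stop handling: cut at the earliest stop sequence found."""
    cut = len(text)
    for s in stops:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)
    return text[:cut]

print(apply_stop_sequences("Carl Linnaeus.<|im_end|>"))
# → Carl Linnaeus.
```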

@Blackkadder

> I confirm that I solved the problem of the trailing <|im_end|>. […]

This worked perfectly! Thanks!

@lcorcodilos

> I confirm that I solved the problem of the trailing <|im_end|>. […]

This did not end up working for me unfortunately. I used...

AzureOpenAI(
    model="text-davinci-003",
    temperature=0.01,
    ...
)

With the following prompt...

You are the worlds foremost expert on dogs. You can answer questions on the subject given sufficient context.

We will use the following format:

Provided Context: Some context provided by the user to help you provide a correct response here.
User Question: User's question here.
Your Answer: Final answer here.

Read the context and question carefully and return just your answer.

Provided Context: In 1758, the Swedish botanist and zoologist Carl Linnaeus published in his Systema Naturae, the two-word naming of species (binomial nomenclature). Canis is the Latin word meaning 'dog', and under this genus, he listed the domestic dog, the wolf, and the golden jackal. He classified the domestic dog as Canis familiaris and, on the next page, classified the grey wolf as Canis lupus. Linnaeus considered the dog to be a separate species from the wolf because of its upturning tail (cauda recurvata), which is not found in any other canid.\nIn 1999, a study of mitochondrial DNA (mtDNA) indicated that the domestic dog may have originated from the grey wolf, with the dingo and New Guinea singing dog breeds having developed at a time when human communities were more isolated from each other. In the third edition of Mammal Species of the World published in 2005, the mammalogist W. Christopher Wozencraft listed under the wolf Canis lupus its wild subspecies and proposed two additional subspecies, which formed the domestic dog clade: familiaris, as named by Linnaeus in 1758 and, dingo named by Meyer in 1793. Wozencraft included hallstromi (the New Guinea singing dog) as another name (junior synonym) for the dingo. Wozencraft referred to the mtDNA study as one of the guides informing his decision. Mammalogists have noted the inclusion of familiaris and dingo together under the domestic dog clade with some debating it.\nIn 2019, a workshop hosted by the IUCN/Species Survival Commission's Canid Specialist Group considered the dingo and the New Guinea singing dog to be feral Canis familiaris and therefore did not assess them for the IUCN Red List of Threatened Species.
User Question: Who classified the domestic dog?
Your Answer: 

The response I get is: Carl Linnaeus.<|im_end|>

@ADM9X commented Jul 20, 2023

You are right. When I use 'ChatOpenAI', it does not generate '<|im_end|>' anymore.

@dosubot (bot) commented Oct 19, 2023

Hi, @vnktsh! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue you reported is related to the responses from Azure OpenAI APIs containing a stop sequence. However, it seems that the issue is still unresolved at the moment.

If this issue is still relevant to the latest version of the LangChain repository, please let the LangChain team know by commenting on this issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your contribution, and please don't hesitate to reach out if you have any further questions or concerns.

@dosubot added the stale label on Oct 19, 2023
@dosubot closed this as not planned (won't fix, can't repro, duplicate, stale) on Oct 26, 2023
@dosubot removed the stale label on Oct 26, 2023