ChatGPT's API model, gpt-3.5-turbo, doesn't appear to work for summarization tasks #1643

Closed
CMobley7 opened this issue Mar 13, 2023 · 11 comments

@CMobley7 commented Mar 13, 2023

Can the summarization chain be used with ChatGPT's API model, gpt-3.5-turbo? I have tried the following two code snippets, but both result in this error.

openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 6063 tokens. Please reduce the length of the messages.

Trial 1

from langchain import OpenAI
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document

full_text = "The content of this article, https://nymag.com/news/features/mark-zuckerberg-2012-5/?mid=nymag_press"
model_name = "gpt-3.5-turbo"

llm = OpenAI(model_name=model_name, temperature=0)

documents = [Document(page_content=full_text)]

# Summarize the document by summarizing each document chunk and then summarizing the combined summary
chain = load_summarize_chain(llm, chain_type="map_reduce")
summary = chain.run(documents)

Trial 2

from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document
from langchain.llms import OpenAIChat

full_text = "The content of this article, https://nymag.com/news/features/mark-zuckerberg-2012-5/?mid=nymag_press"
model_name = "gpt-3.5-turbo"

llm = OpenAIChat(model_name=model_name, temperature=0)

documents = [Document(page_content=full_text)]

# Summarize the document by summarizing each document chunk and then summarizing the combined summary
chain = load_summarize_chain(llm, chain_type="map_reduce")
summary = chain.run(documents)

I changed the Trial 1 snippet to the following but got the error below, due to a list of prompts being provided to the endpoint. Also, it appears that OpenAIChat doesn't have an llm.modelname_to_contextsize method, despite the endpoint not accepting more than 4097 tokens.

ValueError: OpenAIChat currently only supports single prompt, got ['Write a concise summary of the following:\n\n\n"

Trial 3

from langchain import OpenAI
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

full_text = "The content of this article, https://nymag.com/news/features/mark-zuckerberg-2012-5/?mid=nymag_press"
model_name = "gpt-3.5-turbo"

llm = OpenAI(model_name=model_name, temperature=0)

recursive_character_text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base" if model_name == "gpt-3.5-turbo" else "p50k_base",
    chunk_size=4097
    if model_name == "gpt-3.5-turbo"
    else llm.modelname_to_contextsize(model_name),
    chunk_overlap=0,
)
text_chunks = recursive_character_text_splitter.split_text(full_text)
documents = [Document(page_content=text_chunk) for text_chunk in text_chunks]

# Summarize the document by summarizing each document chunk and then summarizing the combined summary
chain = load_summarize_chain(llm, chain_type="map_reduce")
summary = chain.run(documents)

Do you have any ideas on what needs to be changed to allow OpenAI's ChatGPT to work for summarization? Happy to help if I can.

@vbarda (Contributor) commented Mar 13, 2023

I was having a similar issue with VectorDBQAWithSourcesChain. I hacked around it for prototyping like this, and it seemed to work fine.

import logging
from typing import Any, Dict, List, Optional, Tuple

from langchain.llms import OpenAIChat

logger = logging.getLogger(__name__)


class SneakyOpenAIChat(OpenAIChat):

    def _get_chat_params(
        self, prompts: List[str], stop: Optional[List[str]] = None
    ) -> Tuple:
        # HACK STARTS: join multiple prompts instead of raising a ValueError
        if len(prompts) > 1:
            logger.warning(
                f"WARNING: OpenAIChat currently only supports single prompt, got {len(prompts)}. "
                f"Joining the prompts, but this could result in unexpected behavior"
            )

        prompt = " ".join(prompts)
        messages = self.prefix_messages + [{"role": "user", "content": prompt}]
        # HACK ENDS
        params: Dict[str, Any] = {**{"model": self.model_name}, **self._default_params}
        if stop is not None:
            if "stop" in params:
                raise ValueError("`stop` found in both the input and default params.")
            params["stop"] = stop
        if params.get("max_tokens") == -1:
            # for the ChatGPT API, omitting max_tokens is equivalent to having no limit
            del params["max_tokens"]
        return messages, params
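
For reference, a minimal sketch of dropping this subclass into the summarize chain from the original post (assuming the documents list and load_summarize_chain import from Trial 1):

llm = SneakyOpenAIChat(model_name="gpt-3.5-turbo", temperature=0)
chain = load_summarize_chain(llm, chain_type="map_reduce")
summary = chain.run(documents)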

Perhaps this could be a flag when configuring OpenAIChat (merge_prompts=True or the like)? cc @hwchase17

@EmilianoGarciaLopez commented

I have the same issue

@CMobley7 (Author) commented

@vbarda, if you look at Trial 1 and Trial 2, I originally sent it in as a single prompt but got the following error.

openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 6063 tokens. Please reduce the length of the messages.

It appears to come from the openai library, not langchain. However, OpenAIChat and your SneakyOpenAIChat seem to suggest that ChatGPT doesn't have a maximum context length. So why would I get the error above when I sent 6063 tokens, if there is no maximum context length?

@vbarda (Contributor) commented Mar 13, 2023

> @vbarda, if you look at Trial 1 and Trial 2, I originally sent it in as a single prompt but got the following error.
>
> openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 6063 tokens. Please reduce the length of the messages.
>
> It appears to come from the openai library, not langchain. However, OpenAIChat and your SneakyOpenAIChat seem to suggest that ChatGPT doesn't have a maximum context length. So why would I get the error above when I sent 6063 tokens, if there is no maximum context length?

Great question, it does seem to come from the openai lib. You'll probably need to inspect the underlying summarization prompts to see what the total length of the resulting prompt is, but I am not sure (also, I believe the OpenAIChat interface is bound by the same context-length limits as the underlying API itself).

And I should have been more precise in my response: I only had a similar issue to the error you referenced above (ValueError: OpenAIChat currently only supports single prompt, got ['Write a concise summary of the following:\n\n\n"), and that seems to stem from _get_chat_params expecting a single prompt, hence the suggested hack :)
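
To inspect the total prompt length as suggested above, tiktoken can help; a minimal sketch, assuming full_text is the article text from the original snippets:

import tiktoken

# cl100k_base is the encoding used by gpt-3.5-turbo
encoding = tiktoken.get_encoding("cl100k_base")
prompt = "Write a concise summary of the following:\n\n" + full_text
print(len(encoding.encode(prompt)))  # must stay under the 4097-token context limit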

@CMobley7 (Author) commented

The solution was to switch from OpenAI to ChatOpenAI. So, instead of importing

from langchain import OpenAI
llm = OpenAI(model_name=model_name, temperature=temperature)

import

from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(model_name=model_name, temperature=temperature)

The following two PRs helped: #1463 and #1652.
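
Putting the fix together with token-based chunking, a minimal end-to-end sketch (the chunk_size of 3000 is an assumption, chosen to leave headroom for the summarization prompt and the completion within the 4097-token limit):

from langchain.chains.summarize import load_summarize_chain
from langchain.chat_models import ChatOpenAI
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

full_text = "The content of this article, https://nymag.com/news/features/mark-zuckerberg-2012-5/?mid=nymag_press"

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Split on tokens so each chunk, plus the prompt, fits in the context window
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base", chunk_size=3000, chunk_overlap=0
)
documents = [Document(page_content=chunk) for chunk in text_splitter.split_text(full_text)]

chain = load_summarize_chain(llm, chain_type="map_reduce")
summary = chain.run(documents)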

@EmilianoGarciaLopez commented

Hey @CMobley7, this worked for me, but for some reason gpt-3.5-turbo takes 5x longer than davinci-003.

@LeoGrin (Contributor) commented Mar 18, 2023

> Hey @CMobley7, this worked for me, but for some reason gpt-3.5-turbo takes 5x longer than davinci-003.

It is not parallelised. Quoting @hwchase17: "the OpenAI endpoint allows for batching, so we use that. the ChatOpenAI endpoint does not".
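
To illustrate the difference, a sketch against the openai 0.x client (the prompts are placeholders): the completions endpoint accepts a list of prompts in a single request, while the chat endpoint takes one message list per request, so map_reduce chunks end up being summarized sequentially.

import openai

# Completions endpoint: a list of prompts is batched into one request
openai.Completion.create(
    model="text-davinci-003",
    prompt=["Summarize chunk 1 ...", "Summarize chunk 2 ..."],
)

# Chat endpoint: one conversation per request, so chunks run one at a time
openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize chunk 1 ..."}],
)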

@bathrobe commented

Anyone else having a 'module not found' error when trying to import langchain.chat_models?

@colinricardo commented

@bathrobe make sure you're on the latest version: pip install langchain==0.0.116

@giuliastro commented

I am using v0.0.119 this way:

from langchain.chat_models import ChatOpenAI
llm=ChatOpenAI(temperature=0.7, model_name="gpt-3.5-turbo", max_tokens=num_outputs)

but it is not using gpt-3.5-turbo; instead it's using text-embedding-ada-002-v2 for embeddings and text-davinci for completion, or at least that is what OpenAI's daily usage breakdown shows.

@brandus1 commented

I am using ChatOpenAI as the LLM for the summarize chain, but it's super slow, and somehow the usage bill says I am using gpt-3.5-turbo-0301.
