How do I fix the GPT2 tokenizer error in LangChain map_reduce (Llama2)? #9273

Closed
4 of 14 tasks
dm-2111 opened this issue Aug 15, 2023 · 6 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature Ɑ: models Related to LLMs or chat model modules

Comments


dm-2111 commented Aug 15, 2023

System Info

I'm using the AWS SageMaker JumpStart model for Llama 2 13B: meta-textgeneration-llama-2-13b-f

When I run a LangChain summarize chain with chain_type="map_reduce", I get the error below. Other chain types (refine, stuff) work without issue. I do not have access to https://huggingface.co/ from my environment. Is there a way to point the gpt2 tokenizer at a local directory?

import json
from typing import Dict

from langchain.chains.summarize import load_summarize_chain
from langchain.llms.sagemaker_endpoint import LLMContentHandler, SagemakerEndpoint

parameters = {
    "properties": {
        "min_length": 100,
        "max_length": 1024,
        "do_sample": True,
        "top_p": 0.9,
        "repetition_penalty": 1.03,
        "temperature": 0.8
    }
}

class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:
        input_str = json.dumps({"inputs": prompt, **model_kwargs})
        print(input_str)
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json[0]["generation"]["content"]

content_handler = ContentHandler()

endpoint_name='xxxxxxxxxxxxxxxxxx'

llm=SagemakerEndpoint(
        endpoint_name=endpoint_name,
        region_name="us-east-1",
        model_kwargs=parameters,
        content_handler=content_handler,
        endpoint_kwargs={"CustomAttributes": 'accept_eula=true'}
)

chain = load_summarize_chain(llm, chain_type="map_reduce")
chain.run(docs)

Error:

File /usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:1788, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
   1782     logger.info(
   1783         f"Can't load following files from cache: {unresolved_files} and cannot check if these "
   1784         "files are necessary for the tokenizer to operate."
   1785     )
   1787 if all(full_file_name is None for full_file_name in resolved_vocab_files.values()):
-> 1788     raise EnvironmentError(
   1789         f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
   1790         "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
   1791         f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "
   1792         f"containing all relevant files for a {cls.__name__} tokenizer."
   1793     )
   1795 for file_id, file_path in vocab_files.items():
   1796     if file_id not in resolved_vocab_files:

OSError: Can't load tokenizer for 'gpt2'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'gpt2' is the correct path to a directory containing all relevant files for a GPT2TokenizerFast tokenizer.

Who can help?

No response

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

Provided above

Expected behavior

Provided above

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Aug 15, 2023

dosubot bot commented Aug 15, 2023

🤖

Hello,

The error occurs because the GPT2 tokenizer is trying to download the files it needs from Hugging Face's model hub, which is not accessible from your environment.

To resolve this issue, you can download the necessary tokenizer files in an environment where you have access to HuggingFace's model hub, then transfer these files to your AWS Sagemaker environment. Once the files are in your environment, you can set the path of the GPT2 tokenizer to the local directory where you stored the tokenizer files.
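For example, a minimal sketch (the local directory path is hypothetical):

# In an environment with internet access, download and save the tokenizer files.
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.save_pretrained("/path/to/gpt2-local")

# Copy /path/to/gpt2-local to the offline environment, then load from disk:
tokenizer = GPT2TokenizerFast.from_pretrained("/path/to/gpt2-local")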

However, the LangChain framework, as shown in your provided context, uses the OpenAI API for language model operations and does not directly interact with the GPT2 tokenizer. The tokenizer is typically used for preprocessing the input data before it is fed into the language model. If you are experiencing issues with the GPT2 tokenizer, it might be related to the specific way you are using the LangChain framework or the specific configuration of your AWS Sagemaker environment.

If you are using a custom language model that requires the GPT2 tokenizer, you might need to modify the LangChain framework to support this. This could involve modifying the _get_encoding_model method in the ChatOpenAI class to use the local GPT2 tokenizer instead of the tiktoken package. However, this would require a deep understanding of the LangChain framework and the specific requirements of your application.

Here is the code for the _get_encoding_model method:

def _get_encoding_model(self) -> Tuple[str, tiktoken.Encoding]:
    tiktoken_ = _import_tiktoken()
    if self.tiktoken_model_name is not None:
        model = self.tiktoken_model_name
    else:
        model = self.model_name
        if model == "gpt-3.5-turbo":
            # gpt-3.5-turbo may change over time.
            # Returning num tokens assuming gpt-3.5-turbo-0301.
            model = "gpt-3.5-turbo-0301"
        elif model == "gpt-4":
            # gpt-4 may change over time.
            # Returning num tokens assuming gpt-4-0314.
            model = "gpt-4-0314"
    # Returns the number of tokens used by a list of messages.
    try:
        encoding = tiktoken_.encoding_for_model(model)
    except KeyError:
        logger.warning("Warning: model not found. Using cl100k_base encoding.")
        model = "cl100k_base"
        encoding = tiktoken_.get_encoding(model)
    return model, encoding

You can find this method in the ChatOpenAI class in the libs/langchain/langchain/chat_models/openai.py file in the LangChain codebase.

Please provide more specific information about the error you are experiencing and how you are using the LangChain framework for more detailed assistance.

I hope this helps! Let me know if you have any other questions.



dm-2111 commented Aug 15, 2023

@dosu-bot I'm using the Llama2 jumpstart model in sagemaker (model name: Llama-2-13b-chat | model_id: meta-textgeneration-llama-2-13b-f) setup as an inference endpoint.

Outside of the sagemaker environment, I'm running my langchain summarize chain with map_reduce.

llm=SagemakerEndpoint(
        endpoint_name=endpoint_name,
        region_name="us-east-1",
        model_kwargs=parameters,
        content_handler=content_handler,
        endpoint_kwargs={"CustomAttributes": 'accept_eula=true'}
)
chain = load_summarize_chain(llm, chain_type="map_reduce")
chain.run(docs)

I don't have access to OpenAI or Hugging Face from my environment. I do have the gpt2 model downloaded locally. How do I supply it to the map_reduce chain that's throwing the error?
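One workaround might be to subclass SagemakerEndpoint and override get_num_tokens, since the default implementation loads the "gpt2" tokenizer from the Hugging Face hub, which appears to be what map_reduce trips over. A sketch (the local path is hypothetical):

from langchain.llms.sagemaker_endpoint import SagemakerEndpoint
from transformers import GPT2TokenizerFast


class OfflineSagemakerEndpoint(SagemakerEndpoint):
    """SagemakerEndpoint that counts tokens with a locally saved gpt2 tokenizer."""

    def get_num_tokens(self, text: str) -> int:
        # The default implementation downloads the "gpt2" tokenizer from the
        # Hugging Face hub; load it from a local directory instead.
        tokenizer = GPT2TokenizerFast.from_pretrained("/path/to/gpt2-local")
        return len(tokenizer.encode(text))

It would then be constructed with the same arguments as SagemakerEndpoint above and passed to load_summarize_chain unchanged.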

@luzhouuu commented:

Hi @dm-2111
I encountered a similar issue as you did, where my environment couldn't access the internet, which prevented me from loading GPT-2. Have you managed to resolve this problem?
Thank you very much!


dosubot bot commented Sep 16, 2023

🤖

Hello,

Based on the information you've provided and the context from the LangChain repository, it seems you're trying to use a locally downloaded GPT2 model with LangChain's map_reduce function.

Currently, the LangChain framework does not directly support the use of locally downloaded models in its map_reduce function. The map_reduce function in LangChain is designed to work with language models that are hosted on a server and accessed via an API endpoint, such as the AWS Sagemaker endpoint you're using.

However, you can modify the LangChain code to support locally downloaded models. Here's a rough idea of how you can do this:

  1. Create a new class that inherits from the LLM base class in LangChain. This new class should override the __call__ method to use your locally downloaded GPT2 model instead of making a request to an API endpoint.
from langchain.llms import LLM

class LocalGPT2(LLM):
    def __init__(self, model_path, **kwargs):
        super().__init__(**kwargs)
        self.model = GPT2Model.from_pretrained(model_path)

    def __call__(self, text, **kwargs):
        # Use your GPT2 model to generate a response
        response = self.model.generate(text, **kwargs)
        return response
  2. Use this new LocalGPT2 class when creating your chain object:
llm = LocalGPT2(model_path="/path/to/your/model/gpt2-model.bin")
chain = load_summarize_chain(llm, chain_type="map_reduce")
chain.run(docs)

Please note that this is a rough idea and you might need to adjust the code to fit your specific needs. Also, this approach requires a good understanding of the LangChain codebase and the workings of the GPT2 model.

I hope this helps! If you have any other questions, feel free to ask.


dosubot bot commented Dec 18, 2023

Hi, @dm-2111,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. From what I understand, you encountered a GPT2 tokenizer error when running a LangChain summarize chain with chain_type="map_reduce" against a Llama2 model. You were unable to load the tokenizer for 'gpt2' from 'https://huggingface.co/models' and were seeking a way to point the gpt2 tokenizer at a local directory. Suggestions were made to download the necessary tokenizer files in an environment with access to Hugging Face's model hub and transfer them to the AWS SageMaker environment, and to modify the LangChain code to support locally downloaded models. It was also noted that the LangChain framework does not directly support locally downloaded models in its map_reduce function.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and cooperation. If you have any further questions or need assistance, feel free to reach out.

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Dec 18, 2023
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Dec 25, 2023
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Dec 25, 2023
@analyticsinsights commented:

(Quoting the bot's LocalGPT2 suggestion from Sep 16, 2023, above.)

This answer appears to be wrong.

First, it should be from langchain.llms.base import LLM instead of from langchain.llms import LLM.

Second, running this suggested solution creates an error: 'TypeError: Can't instantiate abstract class LocalGPT2 with abstract methods _call, _llm_type'.

Please provide a working solution for using a local copy of the GPT2 model instead of attempting to download it from the internet, for cases where no internet connection is available.
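For reference, a minimal sketch that addresses both errors noted above: it imports LLM from langchain.llms.base and implements the required _call and _llm_type members. It also overrides get_num_tokens, whose default implementation would again try to download the "gpt2" tokenizer. The local path is hypothetical, and this is a sketch rather than a definitive implementation:

from typing import Any, List, Optional

from langchain.llms.base import LLM
from transformers import GPT2LMHeadModel, GPT2TokenizerFast


class LocalGPT2(LLM):
    """Wraps a locally saved GPT2 checkpoint as a LangChain LLM."""

    model_path: str

    @property
    def _llm_type(self) -> str:
        return "local_gpt2"

    def get_num_tokens(self, text: str) -> int:
        # Count tokens with the local tokenizer; the default implementation
        # downloads "gpt2" from the Hugging Face hub.
        tokenizer = GPT2TokenizerFast.from_pretrained(self.model_path)
        return len(tokenizer.encode(text))

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # Loading on every call keeps the sketch short; cache these in practice.
        tokenizer = GPT2TokenizerFast.from_pretrained(self.model_path)
        model = GPT2LMHeadModel.from_pretrained(self.model_path)
        inputs = tokenizer(prompt, return_tensors="pt")
        outputs = model.generate(
            **inputs, max_new_tokens=256, pad_token_id=tokenizer.eos_token_id
        )
        # Strip the echoed prompt and return only the newly generated text.
        return tokenizer.decode(
            outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )


llm = LocalGPT2(model_path="/path/to/gpt2-local")
chain = load_summarize_chain(llm, chain_type="map_reduce")
chain.run(docs)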
