# Custom

To create a custom chat model in LangChain, you will need to subclass the BaseChatModel class and implement the required methods such as _generate and optionally, _stream, _agenerate, or _astream. Here’s a step-by-step breakdown of how you can implement your own custom chat model:

# 1. How to create a custom chat model class

To create a custom chat model class in LangChain, you need to implement a class that inherits from BaseChatModel and override its key methods, such as _generate and _stream. This allows you to integrate your own logic for generating chat responses and streaming results.

# Steps for Implementation:
1. Message Types:

Understand that LangChain models interact with different types of messages like HumanMessage, AIMessage, etc. These are the building blocks for chat interactions in LangChain.

2. BaseChatModel:

Your custom chat model will inherit from BaseChatModel, which provides a standard interface for chat models. You'll need to implement at least the _generate method to define how your model generates responses.

3. Implementing _generate:

_generate is the key method that generates a response from the input messages. You can use it to implement your custom logic for how the model responds based on the input.

4. Streaming with _stream:

If your model supports streaming responses, you can implement the _stream method. This method allows the model to return partial results as they're generated, which is useful for large or slow responses.

5. Async Support:

If you want to support asynchronous execution, you can implement _agenerate and _astream, which handle async generation and streaming respectively.
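
The split between the one required method and the optional overrides can be pictured with a LangChain-free sketch. `EchoModel` and `StreamingEchoModel` are hypothetical stand-ins, not LangChain classes. The fallbacks shown (one-chunk streaming, running the synchronous method in a worker thread) only approximate what a base class can do when an optional method is not overridden:

```python
import asyncio
from typing import Iterator


class EchoModel:
    """Hypothetical stand-in for a chat model; not a LangChain class."""

    def __init__(self, n: int) -> None:
        self.n = n

    def generate(self, text: str) -> str:
        # The one required method: produce the full response.
        return text[: self.n]

    def stream(self, text: str) -> Iterator[str]:
        # Fallback streaming: emit the whole response as a single chunk.
        yield self.generate(text)

    async def agenerate(self, text: str) -> str:
        # Fallback async: run the synchronous method in a worker thread.
        return await asyncio.to_thread(self.generate, text)


class StreamingEchoModel(EchoModel):
    def stream(self, text: str) -> Iterator[str]:
        # Native streaming override: emit one character at a time.
        yield from self.generate(text)


base = EchoModel(n=3)
native = StreamingEchoModel(n=3)

fallback_chunks = list(base.stream("hello"))
native_chunks = list(native.stream("hello"))
async_result = asyncio.run(base.agenerate("hello"))
```

Only `generate` has to exist for every entry point to work; overriding `stream` changes the granularity of the output, not its content.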

# Example Implementation:
Here's how to create a custom chat model class (ChatParrotLink) that echoes the first parrot_buffer_length characters of the last message in the conversation:


In [28]:
from typing import Any, Dict, Iterator, List, Optional

from langchain_core.callbacks import (
    CallbackManagerForLLMRun,
)
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import (
    AIMessage,
    AIMessageChunk,
    BaseMessage,
)
from langchain_core.messages.ai import UsageMetadata
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
from pydantic import Field


class ChatParrotLink(BaseChatModel):
    """A custom chat model that echoes the first `parrot_buffer_length` characters
    of the input.

    When contributing an implementation to LangChain, carefully document
    the model including the initialization parameters, include
    an example of how to initialize the model and include any relevant
    links to the underlying model's documentation or API.

    Example:

        .. code-block:: python

            model = ChatParrotLink(parrot_buffer_length=2, model="bird-brain-001")
            result = model.invoke([HumanMessage(content="hello")])
            result = model.batch([[HumanMessage(content="hello")],
                                 [HumanMessage(content="world")]])
    """

    model_name: str = Field(alias="model")
    """The name of the model"""
    parrot_buffer_length: int
    """The number of characters from the last message of the prompt to be echoed."""
    temperature: Optional[float] = None
    max_tokens: Optional[int] = None
    timeout: Optional[int] = None
    stop: Optional[List[str]] = None
    max_retries: int = 2

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        """Override the _generate method to implement the chat model logic.

        This can be a call to an API, a call to a local model, or any other
        implementation that generates a response to the input prompt.

        Args:
            messages: the prompt composed of a list of messages.
            stop: a list of strings on which the model should stop generating.
                  If generation stops due to a stop token, the stop token itself
                  SHOULD BE INCLUDED as part of the output. This is not enforced
                  across models right now, but it's a good practice to follow since
                  it makes it much easier to parse the output of the model
                  downstream and understand why generation stopped.
            run_manager: A run manager with callbacks for the LLM.
        """
        # Replace this with actual logic to generate a response from a list
        # of messages.
        last_message = messages[-1]
        tokens = last_message.content[: self.parrot_buffer_length]
        ct_input_tokens = sum(len(message.content) for message in messages)
        ct_output_tokens = len(tokens)
        message = AIMessage(
            content=tokens,
            additional_kwargs={},  # Used to add additional payload to the message
            response_metadata={  # Use for response metadata
                "time_in_seconds": 3,
            },
            usage_metadata={
                "input_tokens": ct_input_tokens,
                "output_tokens": ct_output_tokens,
                "total_tokens": ct_input_tokens + ct_output_tokens,
            },
        )

        generation = ChatGeneration(message=message)
        return ChatResult(generations=[generation])

    def _stream(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> Iterator[ChatGenerationChunk]:
        """Stream the output of the model.

        This method should be implemented if the model can generate output
        in a streaming fashion. If the model does not support streaming,
        do not implement it. In that case streaming requests will be automatically
        handled by the _generate method.

        Args:
            messages: the prompt composed of a list of messages.
            stop: a list of strings on which the model should stop generating.
                  If generation stops due to a stop token, the stop token itself
                  SHOULD BE INCLUDED as part of the output. This is not enforced
                  across models right now, but it's a good practice to follow since
                  it makes it much easier to parse the output of the model
                  downstream and understand why generation stopped.
            run_manager: A run manager with callbacks for the LLM.
        """
        last_message = messages[-1]
        tokens = str(last_message.content[: self.parrot_buffer_length])
        ct_input_tokens = sum(len(message.content) for message in messages)

        for token in tokens:
            usage_metadata = UsageMetadata(
                {
                    "input_tokens": ct_input_tokens,
                    "output_tokens": 1,
                    "total_tokens": ct_input_tokens + 1,
                }
            )
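            # Report the input tokens on the first chunk only, so they are
            # not counted again on every later chunk.
            ct_input_tokens = 0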
            chunk = ChatGenerationChunk(
                message=AIMessageChunk(content=token, usage_metadata=usage_metadata)
            )

            if run_manager:
                # This is optional in newer versions of LangChain
                # The on_llm_new_token will be called automatically
                run_manager.on_llm_new_token(token, chunk=chunk)

            yield chunk

        # Let's add some other information (e.g., response metadata)
        chunk = ChatGenerationChunk(
            message=AIMessageChunk(content="", response_metadata={"time_in_seconds": 3})
        )
        if run_manager:
            # This is optional in newer versions of LangChain
            # The on_llm_new_token will be called automatically
            # Pass the empty text of this final chunk; reusing `token` from the
            # loop would raise NameError when there was nothing to echo.
            run_manager.on_llm_new_token("", chunk=chunk)
        yield chunk

    @property
    def _llm_type(self) -> str:
        """Get the type of language model used by this chat model."""
        return "echoing-chat-model-advanced"

    @property
    def _identifying_params(self) -> Dict[str, Any]:
        """Return a dictionary of identifying parameters.

        This information is used by the LangChain callback system for
        tracing, which makes it possible to monitor LLMs.
        """
        return {
            # The model name allows users to specify custom token counting
            # rules in LLM monitoring applications (e.g., in LangSmith users
            # can provide per token pricing for their model and monitor
            # costs for the given LLM.)
            "model_name": self.model_name,
        }
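
The `stop` docstrings above recommend that the stop token itself be included in the output. A minimal, LangChain-free sketch of that convention follows. `apply_stop` is a hypothetical helper name, and it treats "stopping" as cutting the text right after whichever stop sequence completes first:

```python
from typing import List, Optional


def apply_stop(text: str, stop: Optional[List[str]] = None) -> str:
    """Cut `text` right after the stop sequence that completes first.

    The stop sequence is kept in the result, so a caller can tell
    why generation ended.
    """
    if not stop:
        return text
    cut = len(text)
    for sequence in stop:
        if not sequence:
            continue  # Ignore empty stop strings.
        index = text.find(sequence)
        if index != -1:
            cut = min(cut, index + len(sequence))
    return text[:cut]


print(apply_stop("Hello world. Goodbye.", ["."]))
print(apply_stop("no stop here", ["\n"]))
```

A real model would normally receive `stop` as a request parameter rather than trimming afterwards; post-hoc trimming is shown here only because it is easy to test.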

# Let's test it 🧪
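The cells below exercise the standard LangChain Runnable interface (invoke, ainvoke, batch, abatch, astream), which many LangChain abstractions support. Note that they instantiate CustomLLM, the string-in, string-out model defined in the custom LLM section below, rather than ChatParrotLink, so that class must be defined before these cells are run.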

In [29]:
llm = CustomLLM(n=5)
print(llm)

CustomLLM
Params: {'model_name': 'CustomChatModel'}


In [30]:
llm.invoke("This is a foobar thing")

'This '

In [31]:
await llm.ainvoke("world")

'world'

In [32]:
llm.batch(["woof woof woof", "meow meow meow"])

['woof ', 'meow ']

In [33]:
await llm.abatch(["woof woof woof", "meow meow meow"])

['woof ', 'meow ']

In [34]:
async for token in llm.astream("hello"):
    print(token, end="|", flush=True)

h|e|l|l|o|

# Explanation:
1. Initialization: The class ChatParrotLink takes several parameters, including parrot_buffer_length, which determines how many characters from the last message to echo.

2. _generate method: This method generates a response based on the input messages, specifically returning the first parrot_buffer_length characters from the last message in the sequence.

3. _stream method: If streaming is supported, this method streams the response one character at a time.

4. _llm_type and _identifying_params: These properties help LangChain track and manage your custom model for logging and monitoring purposes.
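
In `_stream`, the input-token count is reported on the first chunk and then reset to zero, presumably so that adding up the per-chunk usage gives the same totals that `_generate` reports. The arithmetic can be checked with a LangChain-free sketch. `streamed_usage` and `add_usage` are hypothetical helpers working on plain dicts, and one character counts as one "token", as in the example class:

```python
from typing import Dict, Iterable, Iterator, List

Usage = Dict[str, int]


def streamed_usage(message_lengths: List[int], n_output_chars: int) -> Iterator[Usage]:
    """Yield one usage record per streamed character."""
    input_tokens = sum(message_lengths)
    for _ in range(n_output_chars):
        yield {
            "input_tokens": input_tokens,
            "output_tokens": 1,
            "total_tokens": input_tokens + 1,
        }
        # Input tokens are reported once, on the first chunk only.
        input_tokens = 0


def add_usage(records: Iterable[Usage]) -> Usage:
    """Sum usage records field by field."""
    total = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for record in records:
        for key in total:
            total[key] += record[key]
    return total


# One 5-character message ("hello") echoed with a buffer length of 3.
summed = add_usage(streamed_usage([5], 3))
print(summed)
```

Without the reset, the 5 input tokens would be counted three times and the summed total would no longer equal input plus output.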

# 2. How to create a custom LLM class

In [35]:
from typing import Any, Dict, Iterator, List, Mapping, Optional

from langchain_core.callbacks.manager import CallbackManagerForLLMRun
from langchain_core.language_models.llms import LLM
from langchain_core.outputs import GenerationChunk


class CustomLLM(LLM):
    """A custom LLM that echoes the first `n` characters of the input.

    When contributing an implementation to LangChain, carefully document
    the model including the initialization parameters, include
    an example of how to initialize the model and include any relevant
    links to the underlying model's documentation or API.

    Example:

        .. code-block:: python

            model = CustomLLM(n=2)
            result = model.invoke("hello")
            result = model.batch(["hello", "world"])
    """

    n: int
    """The number of characters from the prompt to be echoed."""

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        """Run the LLM on the given input.

        Override this method to implement the LLM logic.

        Args:
            prompt: The prompt to generate from.
            stop: Stop words to use when generating. Model output is cut off at the
                first occurrence of any of the stop substrings.
                If stop tokens are not supported consider raising NotImplementedError.
            run_manager: Callback manager for the run.
            **kwargs: Arbitrary additional keyword arguments. These are usually passed
                to the model provider API call.

        Returns:
            The model output as a string. Actual completions SHOULD NOT include the prompt.
        """
        if stop is not None:
            raise ValueError("stop kwargs are not permitted.")
        return prompt[: self.n]

    def _stream(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> Iterator[GenerationChunk]:
        """Stream the LLM on the given prompt.

        This method should be overridden by subclasses that support streaming.

        If not implemented, the default behavior of calls to stream will be to
        fallback to the non-streaming version of the model and return
        the output as a single chunk.

        Args:
            prompt: The prompt to generate from.
            stop: Stop words to use when generating. Model output is cut off at the
                first occurrence of any of these substrings.
            run_manager: Callback manager for the run.
            **kwargs: Arbitrary additional keyword arguments. These are usually passed
                to the model provider API call.

        Returns:
            An iterator of GenerationChunks.
        """
        for char in prompt[: self.n]:
            chunk = GenerationChunk(text=char)
            if run_manager:
                run_manager.on_llm_new_token(chunk.text, chunk=chunk)

            yield chunk

    @property
    def _identifying_params(self) -> Dict[str, Any]:
        """Return a dictionary of identifying parameters."""
        return {
            # The model name allows users to specify custom token counting
            # rules in LLM monitoring applications (e.g., in LangSmith users
            # can provide per token pricing for their model and monitor
            # costs for the given LLM.)
            "model_name": "CustomChatModel",
        }

    @property
    def _llm_type(self) -> str:
        """Get the type of language model used by this LLM. Used for logging purposes only."""
        return "custom"

In [36]:
llm = CustomLLM(n=5)
print(llm.invoke("This is a foobar thing"))  # Output: "This "


This 


In [37]:
await llm.ainvoke("world")

'world'

In [38]:
llm.batch(["woof woof woof", "meow meow meow"])

['woof ', 'meow ']

In [39]:
await llm.abatch(["woof woof woof", "meow meow meow"])

['woof ', 'meow ']

In [40]:
async for token in llm.astream("hello"):
    print(token, end="|", flush=True)

h|e|l|l|o|

In [41]:
from langchain_core.prompts import ChatPromptTemplate

In [42]:
prompt = ChatPromptTemplate.from_messages(
    [("system", "you are a bot"), ("human", "{input}")]
)

In [43]:
llm = CustomLLM(n=7)
chain = prompt | llm

In [44]:
idx = 0
async for event in chain.astream_events({"input": "hello there!"}, version="v1"):
    print(event)
    idx += 1
    if idx > 7:
        # Truncate
        break

{'event': 'on_chain_start', 'run_id': 'b26302f2-4984-4cd7-bc92-dd740d829fd1', 'name': 'RunnableSequence', 'tags': [], 'metadata': {}, 'data': {'input': {'input': 'hello there!'}}, 'parent_ids': []}
{'event': 'on_prompt_start', 'name': 'ChatPromptTemplate', 'run_id': 'f8564c25-0874-4137-a264-5475fd3adb27', 'tags': ['seq:step:1'], 'metadata': {}, 'data': {'input': {'input': 'hello there!'}}, 'parent_ids': []}
{'event': 'on_prompt_end', 'name': 'ChatPromptTemplate', 'run_id': 'f8564c25-0874-4137-a264-5475fd3adb27', 'tags': ['seq:step:1'], 'metadata': {}, 'data': {'input': {'input': 'hello there!'}, 'output': ChatPromptValue(messages=[SystemMessage(content='you are a bot', additional_kwargs={}, response_metadata={}), HumanMessage(content='hello there!', additional_kwargs={}, response_metadata={})])}, 'parent_ids': []}
{'event': 'on_llm_start', 'name': 'CustomLLM', 'run_id': '9ba92711-c625-44ae-9ad5-c6a2f5ef2691', 'tags': ['seq:step:2'], 'metadata': {'ls_provider': 'custom', 'ls_model_type'

# How to create a custom Retriever

Creating a custom Retriever in LangChain involves implementing a class that extends the BaseRetriever interface. A retriever fetches documents relevant to a user's query and is a key component in applications where an LLM relies on external information to generate responses.

Explanation
1. BaseRetriever: The parent class that provides a framework for implementing custom retrieval logic. By inheriting from this class, you ensure compatibility with LangChain's Runnable interface.

2. Required Method:

  * _get_relevant_documents: This synchronous method is where you define the logic to fetch relevant documents.

3. Optional Method:

  * _aget_relevant_documents: If your retriever makes network calls or reads files, a native asynchronous version avoids blocking the event loop. If you do not override it, the default implementation runs _get_relevant_documents in a separate thread.

4. Runnable Interface: Since retrievers are LangChain Runnables, you can use their standard methods, such as invoke, ainvoke, batch, and astream_events.

Code Walkthrough

In [45]:
from typing import List  # To define a list of documents
from langchain_core.callbacks import CallbackManagerForRetrieverRun  # For callback functionality during retrieval
from langchain_core.documents import Document  # Represents a single document
from langchain_core.retrievers import BaseRetriever  # Base class for creating retrievers

# Define a custom retriever by extending BaseRetriever
class ToyRetriever(BaseRetriever):
    """A toy retriever that retrieves top-k documents matching the query."""
    
    documents: List[Document]  # List of documents available for retrieval
    k: int  # Number of top documents to return

    # Define the synchronous method for retrieving documents
    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        """Retrieve documents that match the query."""
        
        # Initialize a list to store matching documents
        matching_documents = []
        
        # Iterate through all available documents
        for document in self.documents:
            # If already fetched k documents, stop searching
            if len(matching_documents) >= self.k:
                return matching_documents
            
            # Check if the query text is found in the document's content (case-insensitive)
            if query.lower() in document.page_content.lower():
                matching_documents.append(document)
        
        # Return the list of matching documents
        return matching_documents

    # Optional: Override to provide asynchronous retrieval logic
    # async def _aget_relevant_documents(self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun):
    #     """Efficient asynchronous version of document retrieval."""
    #     pass  # Implement if needed
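
The commented-out async override is left as a stub. Below is a LangChain-free sketch of one way such a method could be filled in, by delegating to the synchronous search in a worker thread. `find_matches` and `afind_matches` are hypothetical names that operate on plain strings. A retriever that really performs network I/O would more likely call an async client directly:

```python
import asyncio
from typing import List


def find_matches(texts: List[str], query: str, k: int) -> List[str]:
    """Return up to k texts containing the query (case-insensitive)."""
    matches: List[str] = []
    for text in texts:
        if len(matches) >= k:
            break
        if query.lower() in text.lower():
            matches.append(text)
    return matches


async def afind_matches(texts: List[str], query: str, k: int) -> List[str]:
    """Async wrapper: run the synchronous search in a worker thread."""
    return await asyncio.to_thread(find_matches, texts, query, k)


async def main() -> List[List[str]]:
    texts = [
        "Cats are independent pets that often enjoy their own space.",
        "Dogs are great companions, known for their loyalty and friendliness.",
        "Rabbits are social animals that need plenty of space to hop around.",
    ]
    # Run two queries concurrently; gather preserves the argument order.
    return await asyncio.gather(
        afind_matches(texts, "that", 3),
        afind_matches(texts, "dog", 3),
    )


results = asyncio.run(main())
print(results)
```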


Example Usage

1. Define Some Documents

Here, we create a small set of documents to be queried.

In [46]:
documents = [
    Document(
        page_content="Dogs are great companions, known for their loyalty and friendliness.",
        metadata={"type": "dog", "trait": "loyalty"},
    ),
    Document(
        page_content="Cats are independent pets that often enjoy their own space.",
        metadata={"type": "cat", "trait": "independence"},
    ),
    Document(
        page_content="Goldfish are popular pets for beginners, requiring relatively simple care.",
        metadata={"type": "fish", "trait": "low maintenance"},
    ),
    Document(
        page_content="Parrots are intelligent birds capable of mimicking human speech.",
        metadata={"type": "bird", "trait": "intelligence"},
    ),
    Document(
        page_content="Rabbits are social animals that need plenty of space to hop around.",
        metadata={"type": "rabbit", "trait": "social"},
    ),
]


2. Create a Retriever Instance

We initialize the retriever with the documents and specify k=3 to fetch at most three results.

In [47]:
retriever = ToyRetriever(documents=documents, k=3)


3. Retrieve Relevant Documents

The invoke method allows querying the retriever. For example:

In [48]:
retriever.invoke("that")
# Output: Documents containing the word "that" in their content


[Document(metadata={'type': 'cat', 'trait': 'independence'}, page_content='Cats are independent pets that often enjoy their own space.'),
 Document(metadata={'type': 'rabbit', 'trait': 'social'}, page_content='Rabbits are social animals that need plenty of space to hop around.')]

4. Batch Retrieval

You can query multiple inputs at once using batch:

In [49]:
retriever.batch(["dog", "cat"])
# Output: Relevant documents for "dog" and "cat"


[[Document(metadata={'type': 'dog', 'trait': 'loyalty'}, page_content='Dogs are great companions, known for their loyalty and friendliness.')],
 [Document(metadata={'type': 'cat', 'trait': 'independence'}, page_content='Cats are independent pets that often enjoy their own space.')]]

5. Asynchronous Invocation

If required, you can asynchronously retrieve documents with ainvoke:

In [50]:
await retriever.ainvoke("that")


[Document(metadata={'type': 'cat', 'trait': 'independence'}, page_content='Cats are independent pets that often enjoy their own space.'),
 Document(metadata={'type': 'rabbit', 'trait': 'social'}, page_content='Rabbits are social animals that need plenty of space to hop around.')]

6. Event Streaming

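You can stream the events emitted during a retriever run with astream_events (useful for surfacing progress in applications like chatbots). The query "bar" below matches no document, so the streamed chunk and the final output are empty lists: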

In [51]:
async for event in retriever.astream_events("bar", version="v1"):
    print(event)


{'event': 'on_retriever_start', 'run_id': 'b9560f5e-a7eb-4b3c-bc1a-b6fa8832b52b', 'name': 'ToyRetriever', 'tags': [], 'metadata': {}, 'data': {'input': 'bar'}, 'parent_ids': []}
{'event': 'on_retriever_stream', 'run_id': 'b9560f5e-a7eb-4b3c-bc1a-b6fa8832b52b', 'tags': [], 'metadata': {}, 'name': 'ToyRetriever', 'data': {'chunk': []}, 'parent_ids': []}
{'event': 'on_retriever_end', 'name': 'ToyRetriever', 'run_id': 'b9560f5e-a7eb-4b3c-bc1a-b6fa8832b52b', 'tags': [], 'metadata': {}, 'data': {'output': []}, 'parent_ids': []}


# Key Takeaways
* Flexibility: Custom retrievers allow arbitrary logic (e.g., database queries, API calls).
* Synchronous vs. Asynchronous: Use _aget_relevant_documents for efficient asynchronous processing when needed.
* Runnable Compatibility: By inheriting BaseRetriever, you gain access to standard LangChain Runnable methods.

# How to create a custom Document Loader

Creating a custom document loader in LangChain means writing code that extracts raw data, parses it, and yields Document objects. These documents are typically used in LLM-based applications for downstream tasks like summarization, indexing into vector stores, or querying.

Step-by-Step Guide

1. Implement a Standard Document Loader

A standard document loader converts raw data into Document objects. For instance, if your data source is a plain text file, each line in the file can become a document.

Here's how to implement a custom loader:

In [52]:
from typing import AsyncIterator, Iterator
from langchain_core.document_loaders import BaseLoader
from langchain_core.documents import Document

class CustomDocumentLoader(BaseLoader):
    """Custom loader that reads a file line by line."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path

    def lazy_load(self) -> Iterator[Document]:
        """Yield documents line by line."""
        with open(self.file_path, encoding="utf-8") as f:
            for line_number, line in enumerate(f):
                yield Document(
                    page_content=line.strip(),
                    metadata={"line_number": line_number, "source": self.file_path},
                )

    async def alazy_load(self) -> AsyncIterator[Document]:
        """Asynchronously yield documents line by line."""
        import aiofiles
        async with aiofiles.open(self.file_path, encoding="utf-8") as f:
            line_number = 0
            async for line in f:
                yield Document(
                    page_content=line.strip(),
                    metadata={"line_number": line_number, "source": self.file_path},
                )
                line_number += 1


Key points:

Use the lazy_load method for production scenarios to process data incrementally.

Avoid eager loading (e.g., using load()) in production if data size is significant.
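
The practical difference between lazy and eager loading can be seen without LangChain. In this sketch, `lazy_lines` is a hypothetical generator that mirrors the shape of `lazy_load`, yielding plain dicts instead of Document objects, with an in-memory `io.StringIO` standing in for a file:

```python
import io
from itertools import islice
from typing import Dict, Iterator, TextIO


def lazy_lines(stream: TextIO) -> Iterator[Dict[str, object]]:
    """Yield one record per line, reading only as far as the consumer asks."""
    for line_number, line in enumerate(stream):
        yield {"page_content": line.strip(), "line_number": line_number}


stream = io.StringIO("first\nsecond\nthird\n")

# Lazy: take two records; the third line is still unread afterwards.
first_two = list(islice(lazy_lines(stream), 2))
unread = stream.read()

# Eager: materialise every record at once.
everything = list(lazy_lines(io.StringIO("first\nsecond\nthird\n")))
```

With a large source, the lazy form keeps only the current record in memory, while the eager form holds all of them.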

2. Parsing Binary Data Using BaseBlobParser

Sometimes, you need to parse non-text files (e.g., PDFs, images). Use Blob to represent binary data and BaseBlobParser to process it.

In [53]:
from langchain_core.document_loaders import BaseBlobParser, Blob
from typing import Iterator
from langchain_core.documents import Document

class MyBlobParser(BaseBlobParser):
    """A parser that creates a document for each line in a blob."""

    def lazy_parse(self, blob: Blob) -> Iterator[Document]:
        """Parse the blob and yield documents."""
        with blob.as_bytes_io() as f:
            for line_number, line in enumerate(f):
                yield Document(
                    # as_bytes_io() yields bytes; decode explicitly (UTF-8 assumed).
                    page_content=line.decode("utf-8").strip(),
                    metadata={"line_number": line_number, "source": blob.source},
                )
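
Because the blob is opened as a byte stream, each line arrives as `bytes` rather than `str`. A LangChain-free sketch of the same line-by-line loop over in-memory bytes follows. `iter_decoded_lines` is a hypothetical helper, and UTF-8 is an assumed encoding:

```python
import io
from typing import Iterator, Tuple


def iter_decoded_lines(raw: bytes, encoding: str = "utf-8") -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) pairs from in-memory bytes."""
    with io.BytesIO(raw) as f:
        for line_number, line in enumerate(f):
            # Each `line` is a bytes object; decode before treating it as text.
            yield line_number, line.decode(encoding).strip()


raw = "meow\ncaf\u00e9\n".encode("utf-8")
lines = list(iter_decoded_lines(raw))
print(lines)
```

Decoding explicitly makes the encoding assumption visible instead of leaving it to implicit conversion further down the pipeline.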


Testing the Parser:

In [54]:
blob = Blob.from_path(r"C:\Users\Admin\Desktop\10-20-2024\data\paul_graham_essay.txt")
parser = MyBlobParser()

for doc in parser.lazy_parse(blob):
    print(doc)


page_content='' metadata={'line_number': 0, 'source': 'C:\\Users\\Admin\\Desktop\\10-20-2024\\data\\paul_graham_essay.txt'}
page_content='' metadata={'line_number': 1, 'source': 'C:\\Users\\Admin\\Desktop\\10-20-2024\\data\\paul_graham_essay.txt'}
page_content='What I Worked On' metadata={'line_number': 2, 'source': 'C:\\Users\\Admin\\Desktop\\10-20-2024\\data\\paul_graham_essay.txt'}
page_content='' metadata={'line_number': 3, 'source': 'C:\\Users\\Admin\\Desktop\\10-20-2024\\data\\paul_graham_essay.txt'}
page_content='February 2021' metadata={'line_number': 4, 'source': 'C:\\Users\\Admin\\Desktop\\10-20-2024\\data\\paul_graham_essay.txt'}
page_content='' metadata={'line_number': 5, 'source': 'C:\\Users\\Admin\\Desktop\\10-20-2024\\data\\paul_graham_essay.txt'}
page_content='Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short sto

3. Combine Loaders and Parsers Using Blob Loaders

To load blobs from a filesystem and parse them:

In [55]:
from langchain_community.document_loaders.blob_loaders import FileSystemBlobLoader

blob_loader = FileSystemBlobLoader(path="./data", glob="*.txt")
parser = MyBlobParser()

for blob in blob_loader.yield_blobs():
    for doc in parser.lazy_parse(blob):
        print(doc)


4. Use GenericLoader for Convenience

LangChain’s GenericLoader combines blob loaders and parsers for common file-handling scenarios.

In [56]:
from langchain_community.document_loaders.generic import GenericLoader

loader = GenericLoader.from_filesystem(
    path="./data",
    glob="*.txt",
    show_progress=True,
    parser=MyBlobParser(),
)

for doc in loader.lazy_load():
    print(doc)


0it [00:00, ?it/s]


5. Advanced Customization: Custom Generic Loader

To encapsulate custom logic, subclass GenericLoader:

In [57]:
from typing import Any

class MyCustomLoader(GenericLoader):
    @staticmethod
    def get_parser(**kwargs: Any) -> BaseBlobParser:
        return MyBlobParser()

loader = MyCustomLoader.from_filesystem(path="./data", glob="*.txt", show_progress=True)

for doc in loader.lazy_load():
    print(doc)


0it [00:00, ?it/s]


Tips for Implementation

Use lazy_load for memory-efficient processing in production.

Rely on GenericLoader for quick prototyping.

Keep parsing logic modular by separating it into BaseBlobParser subclasses.

If working with in-memory data, leverage the Blob API to avoid creating temporary files.

# How to create a custom Output Parser


1. Using RunnableLambda or RunnableGenerator (Recommended)

RunnableLambda Example

This method is straightforward: write a plain Python function that takes the model output and returns the parsed value. When the function is composed into a chain with |, LangChain automatically wraps it in a RunnableLambda.

Example: Invert Case

In [59]:
from typing import Iterable

from langchain_anthropic.chat_models import ChatAnthropic
from langchain_core.messages import AIMessage, AIMessageChunk

model = ChatAnthropic(model_name="claude-2.1")


def parse(ai_message: AIMessage) -> str:
    """Parse the AI message."""
    return ai_message.content.swapcase()


chain = model | parse
chain.invoke("hello")

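Running this cell requires an Anthropic API key with available credits. The recorded run failed at the API call, before parse was reached:

BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}}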
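
Because the parser is an ordinary function of the message, its logic can be checked without calling the model. Here is a minimal sketch that uses a hypothetical `FakeMessage` stand-in (not LangChain's AIMessage) exposing only a `content` attribute:

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in exposing only the `content` attribute."""

    content: str


def parse(message: FakeMessage) -> str:
    """Invert the case of the message text."""
    return message.content.swapcase()


parsed = parse(FakeMessage(content="Hello, World!"))
print(parsed)
```

Testing the parsing step in isolation like this keeps API credentials and network access out of the unit test.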