# Retrieval Augmented Generation using Runnables and Chains w/ LangChain

Enhance generation with specialized knowledge.

**Purpose**:
This notebook teaches you how to build custom `Runnable`s with the `LangChain` ecosystem and use them to build your own RAG app.

## Definitions: `Runnables` and `Chains`

### *Runnables*:

• A Runnable represents a unit of work that can be executed.

• It can perform a specific task or action, such as making an API call, processing data, or running a machine learning model.

• Runnables can have input and output types specified, and they can be composed together to form more complex workflows.

• They are designed to be flexible and reusable components that can be easily combined and configured.

• Runnables require an `invoke` method, which is used to execute them.

• Examples of Runnables include API calls, data processing functions, and machine learning models.
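
As a rough illustration of the bullets above, the sketch below models a runnable in plain Python. `MiniRunnable` is a hypothetical stand-in written for this example, not LangChain's `Runnable` class; it only shows the idea of a reusable unit of work that is executed through an `invoke` method.

```python
from typing import Callable, Generic, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


class MiniRunnable(Generic[In, Out]):
    """Hypothetical stand-in for a runnable: one unit of work behind `invoke`."""

    def __init__(self, func: Callable[[In], Out]) -> None:
        self.func = func

    def invoke(self, value: In) -> Out:
        # Execute the wrapped unit of work on a single input.
        return self.func(value)


# Two small, reusable units of work.
to_upper = MiniRunnable(str.upper)
count_words = MiniRunnable(lambda text: len(text.split()))

print(to_upper.invoke("hello"))              # HELLO
print(count_words.invoke("one two three"))   # 3
```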

### *Chains*:

• A Chain is a sequence of Runnables that are executed in a specific order.

• Chains provide a way to string together multiple Runnables to create a workflow or pipeline.

• Each Runnable in the Chain takes the output of the previous Runnable as its input.

• Chains can be used to build complex applications by combining and orchestrating the execution of multiple Runnables.

• They provide a higher-level abstraction for organizing and structuring the flow of data and operations.

• Examples of Chains include data processing pipelines, machine learning workflows, and API request/response sequences.
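
The chaining idea can be sketched without any library. `Step` below is a hypothetical class invented for this example; it borrows the `|` operator that LangChain uses for composition, but the implementation is a simplified assumption rather than LangChain's own.

```python
from typing import Any, Callable


class Step:
    """Hypothetical runnable-like step; `|` chains two steps into a new one."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # The output of `self` becomes the input of `other`.
        return Step(lambda value: other.invoke(self.invoke(value)))


strip = Step(str.strip)
lower = Step(str.lower)
split = Step(str.split)

# Steps run left to right, each one receiving the previous output.
pipeline = strip | lower | split
print(pipeline.invoke("  Hello World  "))  # ['hello', 'world']
```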

## **Deeper explanation**:

In the process of building an AI chatbot, we often need to connect different components together to create a functional system.

One way to achieve this is by *chaining* these components, ensuring that the output of one component is properly passed to the next component for further processing. To accomplish this, we can directly call the functions or methods of each component and pass the output as arguments to the next component. 

- This straightforward approach works well when we only need to pass the output from one component to another ***without any*** additional processing or transformations in between.
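
A minimal sketch of this direct approach, assuming two made-up components: `retrieve` and `generate` are hypothetical placeholders for a retriever and a language model call, not functions from any library.

```python
def retrieve(question: str) -> str:
    # Hypothetical retriever: looks the question up in a tiny in-memory table.
    knowledge = {"capital of france": "Paris is the capital of France."}
    return knowledge.get(question.lower().rstrip("?"), "")


def generate(context: str) -> str:
    # Hypothetical generator: stands in for a call to a language model.
    return f"Answer based on: {context}"


# Direct chaining: the output of one call is the argument of the next.
answer = generate(retrieve("Capital of France?"))
print(answer)  # Answer based on: Paris is the capital of France.
```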

However, in more complex scenarios where we require intermediate processing or transformations on the output, we can use a concept called "runnables." Runnables provide a flexible and modular way to encapsulate and compose these processing steps within a chain.

By using runnables, we can easily add additional functionality, such as filtering or modifying the output, before passing it to the next component. 

- This allows us to *customize the behavior* of the chatbot and *ensure* that the output is properly prepared for the subsequent steps.
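
To make the idea of intermediate processing concrete, here is a small sketch in plain Python. All function names (`run_steps`, `retrieve`, `drop_empty`, `clean`, `build_prompt`) are hypothetical and exist only for this example; the point is that filtering and cleaning steps can be slotted between retrieval and prompt construction without changing either one.

```python
from functools import reduce
from typing import Any, Callable, List


def run_steps(steps: List[Callable[[Any], Any]], value: Any) -> Any:
    # Feed the value through each step in order.
    return reduce(lambda acc, step: step(acc), steps, value)


def retrieve(_query: str) -> List[str]:
    # Hypothetical retriever returning raw passages, some of them unusable.
    return ["LangChain composes runnables.", "", "  Chains run steps in order.  "]


def drop_empty(passages: List[str]) -> List[str]:
    # Intermediate step 1: filter out blank passages.
    return [p for p in passages if p.strip()]


def clean(passages: List[str]) -> List[str]:
    # Intermediate step 2: normalize surrounding whitespace.
    return [p.strip() for p in passages]


def build_prompt(passages: List[str]) -> str:
    return "Context:\n" + "\n".join(passages)


prompt = run_steps([retrieve, drop_empty, clean, build_prompt], "What is a chain?")
print(prompt)
```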

### "How are `Runnable`s different than normal classes?"

*Similarities*:

• Runnables can have methods and attributes, just like normal classes.

• They can define and implement their own logic and functionality.

• Runnables can have constructor arguments and can be instantiated with different configurations.

*Differences*:

• Runnables are designed to be executed as part of a larger system or workflow, often in a distributed or parallelized manner.

• They are typically used for data processing, transformation, or analysis tasks.

• Runnables have specific interfaces and methods that define how they interact with other runnables and the overall system.

• They can be composed and combined with other runnables to create complex workflows.

• Runnables often have additional features and capabilities specific to the LangChain platform, such as input and output type validation, configuration management, and error handling.

• They can be executed asynchronously and in parallel, for example through the `ainvoke` and `batch` methods.

• Runnables can be versioned and deployed as part of a larger system, allowing for easy updates and maintenance.
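
The sketch below illustrates what sets a runnable apart from an ordinary class: a shared interface with single, batched, and asynchronous entry points. The method names `invoke`, `batch`, and `ainvoke` mirror those on LangChain runnables, but `MiniRunnable` itself is a hypothetical class and its thread-based implementation is a simplifying assumption made for this example.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


class MiniRunnable:
    """Hypothetical runnable-like class with a uniform execution interface."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        # Run the unit of work on one input.
        return self.func(value)

    def batch(self, values: List[Any]) -> List[Any]:
        # Run several inputs in parallel threads; results keep input order.
        with ThreadPoolExecutor() as pool:
            return list(pool.map(self.invoke, values))

    async def ainvoke(self, value: Any) -> Any:
        # Run the synchronous work in a worker thread so it can be awaited.
        return await asyncio.to_thread(self.invoke, value)


double = MiniRunnable(lambda x: x * 2)
print(double.invoke(4))                # 8
print(double.batch([1, 2, 3]))         # [2, 4, 6]
print(asyncio.run(double.ainvoke(5)))  # 10
```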

### "How do I decide to use either a `Runnable` or a `Chain`?"

Ultimately, the decision to use runnables or a more straightforward sequential approach depends on the specific requirements and complexity of the chatbot system. You might find yourself using one, both, or neither based on your needs.

In summary:

1. Chains, which are sequences of interconnected tasks, can operate effectively on their own, without the need for custom Runnables. They are designed to link various components of a system in a specific order, allowing for the smooth execution of a workflow or pipeline. This makes them particularly useful in scenarios where a straightforward, sequential process is sufficient and where the complexity of custom Runnables is not required.

2. Runnables resemble traditional classes but offer enhanced functionality, particularly in complex AI chatbot systems. They facilitate the integration and processing of outputs between different components, allowing for customization and increased flexibility in system design. This makes Runnables ideal for scenarios requiring more than just sequential processing, such as when intermediate steps or specific transformations of data are necessary.

## Set up

In [14]:
%pip install -Uq openai tiktoken chromadb langchain langchain-openai faiss-cpu beautifulsoup4

Note: you may need to restart the kernel to use updated packages.


In [6]:
# Set API Key Directly
import os

#os.environ["OPENAI_API_KEY"] = ""

# Load from an .env file
# import dotenv

# dotenv.load_dotenv()

## Creating and Executing a Custom Runnable

In this section, we will put into practice the concepts of `Runnables` that we've discussed earlier. We will create a custom `Runnable` called `AddNumbersRunnable` and demonstrate how to execute it within our application.

### Step 1: Define the Runnable

First, we define our `AddNumbersRunnable`. This is a class that we've designed to perform a specific task, in this case adding two numbers. The design of this class follows the principles of `Runnables` in the LangChain ecosystem, making it a modular and reusable component.

### Step 2: Prepare the Input

Next, we prepare the input for our runnable using the `InputType` class. This class is a Pydantic model that ensures our input data is structured and typed correctly. By creating an instance of `InputType`, we are packaging our data (the two numbers we want to add) in a way that our `Runnable` can easily understand and process.

### Step 3: Execute the Runnable

With our input ready, we call the `run` method of our `AddNumbersRunnable` instance. This method encapsulates the logic of our task: it takes our structured input, performs the addition, and returns an output in the form of an `OutputType` instance. The class also defines `invoke`, a thin wrapper around `run`.

### Step 4: Display the Result

Finally, we display the result of our runnable's execution. The `OutputType` class defines the expected structure of the output from our `Runnable`. By accessing the `result` attribute, we can retrieve the sum of the two numbers and present it to the user.

This simple example illustrates the power of `Runnables` and how they can be used to build clean, organized, and reusable components within your applications. By following this pattern, you can create more complex workflows and applications using the LangChain library.

In [3]:
from langchain_core.runnables.base import Runnable
from pydantic import BaseModel

class AddNumbersRunnable(Runnable):
    """
    A class representing a runnable that adds two numbers together.

    Attributes:
        InputType (BaseModel): A Pydantic model representing the input to the runnable. It has two attributes: num1 and num2, both integers.
        OutputType (BaseModel): A Pydantic model representing the output of the runnable. It has one attribute: result, an integer.

    Methods:
        run(input: InputType) -> OutputType: This method takes an instance of InputType as input, adds the two numbers together, and returns an instance of OutputType containing the sum.
        invoke(input: InputType, config=None) -> OutputType: This method is a wrapper around the run method. It takes an instance of InputType (plus the optional config that LangChain passes when the runnable is part of a chain) and returns the result of calling the run method with the same input.
    """
    class InputType(BaseModel):
        num1: int
        num2: int

    class OutputType(BaseModel):
        result: int

    def run(self, input: InputType) -> OutputType:
        result = input.num1 + input.num2
        output = self.OutputType(result=result)
        return output

    def invoke(self, input: InputType, config=None, **kwargs) -> OutputType:
        # `config` is accepted (and ignored) so the signature matches Runnable.invoke.
        return self.run(input)

In [4]:
# Instantiate the AddNumbersRunnable class.
# This class is an example of a Runnable, which is a core concept in LangChain for creating modular and reusable units of work.
# As we've learned, Runnables like this one encapsulate specific tasks—in this case, adding two numbers.
add_numbers = AddNumbersRunnable()

# Prepare the input data for the runnable.
# The InputType class is a Pydantic model that defines the expected structure of the input, ensuring type safety and validation.
# Here, we're creating an instance of InputType with two numbers, demonstrating how inputs are structured for Runnables.
input_data = AddNumbersRunnable.InputType(num1=5, num2=3)

# Execute the runnable with the provided input data.
# The run method is where the logic of the Runnable is executed. This method is a clear example of how a Runnable performs its task.
# By calling this method, we're following the pattern of Runnables where they take an input, process it, and produce an output.
output_data = add_numbers.run(input_data)

# Display the result of the runnable's execution.
# The OutputType class defines the structure of the output, which is another aspect of Runnables that promotes consistency and predictability.
# This print statement not only shows the result but also reinforces the concept of Runnables having defined inputs and outputs.
print(f"The result of adding {input_data.num1} and {input_data.num2} is: {output_data.result}")  # Output: 8

The result of adding 5 and 3 is: 8


---

# ChatOpenAI Chatbot Chain Examples
These end-to-end examples were generated at https://chat.langchain.com/ on 12/22/23:

In [11]:
from langchain_openai.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferWindowMemory
from langchain_core.output_parsers import StrOutputParser

In [12]:
# Create the chat model
chat_model = ChatOpenAI(
    model_name="gpt-3.5-turbo-1106",
    temperature=0.25,
)

# Create the prompt template with memory placeholders
prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful chatbot"),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{input}"),
    ]
)

# Create the memory with a window size of 2
memory = ConversationBufferWindowMemory(k=2, return_messages=True)
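
To show what a window size of 2 means in practice, here is a plain-Python sketch. `WindowMemory` is a hypothetical class written for this example, not the LangChain memory class; it assumes the window counts exchanges (one user message plus one reply), so older exchanges fall out once more than `k` have been saved.

```python
from collections import deque
from typing import Deque, List, Tuple


class WindowMemory:
    """Hypothetical stand-in that keeps only the last `k` exchanges."""

    def __init__(self, k: int) -> None:
        # A bounded deque silently discards the oldest exchange when full.
        self.exchanges: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save_context(self, user_input: str, output: str) -> None:
        self.exchanges.append((user_input, output))

    def load_history(self) -> List[Tuple[str, str]]:
        # Flatten the stored exchanges into (role, text) messages.
        messages: List[Tuple[str, str]] = []
        for human, ai in self.exchanges:
            messages.append(("human", human))
            messages.append(("ai", ai))
        return messages


window = WindowMemory(k=2)
window.save_context("Hi", "Hello!")
window.save_context("What is RAG?", "Retrieval augmented generation.")
window.save_context("Thanks", "You're welcome.")

# Only the two most recent exchanges remain; the first one was dropped.
print(window.load_history())
```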


In [None]:
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS


# Initialize a loader and load documents.
# We'll use the WebBaseLoader to load documents from the web.
loader = WebBaseLoader("https://python.langchain.com/docs/integrations/document_loaders/web_base")

documents = loader.load()

text_splitter = CharacterTextSplitter(chunk_size=256, chunk_overlap=25)

# Split our documents into chunks using the CharacterTextSplitter
docs = text_splitter.split_documents(documents)

# Initialize the embeddings model (reads OPENAI_API_KEY from the environment)
embeddings = OpenAIEmbeddings()

# Initialize a FAISS index with our embedded documents
vector_store = FAISS.from_documents(docs, embeddings)
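
The cell above splits text into overlapping chunks, embeds them, and indexes them for similarity search. The toy sketch below walks through the same three steps with the standard library only. Every function here is a hypothetical simplification: `split_text` uses fixed character windows (not the separator-based logic of `CharacterTextSplitter`), `embed` uses word counts instead of a real embedding model, and `search` is a brute-force scan rather than a FAISS index.

```python
import math
from collections import Counter
from typing import List


def split_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    # Each window starts (chunk_size - chunk_overlap) characters after the last.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    # Toy "embedding": word counts. A real model returns a dense vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, chunks: List[str], k: int = 1) -> List[str]:
    # Rank every chunk by similarity to the query and keep the top k.
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij', 'j']

chunks = [
    "the loader fetches web pages",
    "faiss stores embedding vectors",
    "memory keeps recent chat turns",
]
print(search("where are vectors stored", chunks))
# ['faiss stores embedding vectors']
```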



In [None]:
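from operator import itemgetter

from langchain_core.runnables import RunnableLambda, RunnablePassthrough

# Create the output parser
output_parser = StrOutputParser()

# Create a retriever from the FAISS vector store built above
retriever = vector_store.as_retriever()

# Prompt with slots for the retrieved context and the chat history
rag_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful chatbot. Use this context to answer:\n\n{context}"),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{input}"),
    ]
)


def format_docs(docs):
    """Join the retrieved documents into a single context string."""
    return "\n\n".join(doc.page_content for doc in docs)


# Create the chain: load history and retrieve context first, then prompt the model
chain = (
    RunnablePassthrough.assign(
        history=RunnableLambda(memory.load_memory_variables) | (lambda x: x.get("history", [])),
        context=itemgetter("input") | retriever | format_docs,
    )
    | rag_prompt
    | chat_model
    | output_parser
)

# Define the user input
user_input = "How do I load a web page with WebBaseLoader?"

# Invoke the chain; the final StrOutputParser already returns a plain string
output = chain.invoke({"input": user_input})

# Update the memory with the user input and model output
memory.save_context({"input": user_input}, {"output": output})

# Print the output
print(output)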
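
The first step of the chain relies on `RunnablePassthrough.assign`, which passes the input dictionary through and adds extra keys computed from it. The plain-Python sketch below imitates that behavior; `assign` and `stored` are hypothetical names for this example, and the real LangChain implementation differs.

```python
from typing import Any, Callable, Dict


def assign(**extra: Callable[[Dict[str, Any]], Any]) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Return a step that keeps the input dict and adds computed keys."""

    def step(inputs: Dict[str, Any]) -> Dict[str, Any]:
        # Each extra key is computed from the original inputs.
        return {**inputs, **{key: fn(inputs) for key, fn in extra.items()}}

    return step


# Hypothetical stored conversation, standing in for the memory object.
stored = {"history": ["Hi", "Hello!"]}

add_history = assign(history=lambda inputs: stored.get("history", []))
print(add_history({"input": "What is RAG?"}))
# {'input': 'What is RAG?', 'history': ['Hi', 'Hello!']}
```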

In [6]:
from langchain.chains.conversation.base import ConversationChain
from langchain.memory.buffer_window import ConversationBufferWindowMemory
from langchain_openai.chat_models import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_community.vectorstores import Chroma

# Create the chat model
chat_model = ChatOpenAI(
    model_name="gpt-3.5-turbo-1106",
    temperature=0.25,
)

# Create the memory
memory = ConversationBufferWindowMemory(k=5)

# Create the vector store and retriever (not used by the ConversationChain below)
vector_store = Chroma()
retriever = vector_store.as_retriever()

# Create the chatbot chain
chatbot_chain = ConversationChain(
    llm=chat_model,
    memory=memory,
)

# Run the chatbot chain
response = chatbot_chain.invoke("Hello!")
print(response["response"])