[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aurelio-labs/langchain-course/blob/main/chapters/07-lcel.ipynb)

#### LangChain Essentials Course

# LangChain Expression Language

LangChain is one of the most popular open source libraries for AI Engineers. Its goal is to abstract away the complexity in building AI software, provide easy-to-use building blocks, and make it easier to switch between AI service providers.

In this example, we will introduce LangChain's Expression Language (LCEL), building a full chain and understanding how it works. We'll provide examples for both OpenAI's `gpt-4o-mini` *and* Meta's `llama3.2` via Ollama!

In [1]:
# !pip install -qU \
#   langchain-core==0.3.33 \
#   langchain-openai==0.3.3 \
#   langchain-community==0.3.16 \
#   langsmith==0.3.4 \
#   docarray==0.40.0

In [2]:
from dotenv import load_dotenv

load_dotenv()


True

---

> ⚠️ We will be using OpenAI for this example, which allows us to run everything via API. If you would like to use Ollama instead, check out the [Ollama LangChain Course](https://github.com/aurelio-labs/langchain-course/tree/main/notebooks/ollama).

---

---

> ⚠️ If using LangSmith, add your API key below:

In [3]:
import os
from getpass import getpass

os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY") or \
    getpass("Enter LangSmith API Key: ")

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_PROJECT"] = "aurelioai-langchain-course-lcel-openai"

---

## Traditional Chains vs LCEL

In this section we'll work through a basic example using the traditional method for building chains before jumping into LCEL. We will build a pipeline in which the user inputs a topic and the LLM returns a short _research report_ on that topic.

### Traditional LLMChain

The `LLMChain` is the simplest chain originally introduced in LangChain. This chain takes a prompt, feeds it into an LLM, and _optionally_ adds an output parsing step before returning the result.

Let's see how we construct this using the traditional method. For this we need:

* `prompt` — a `PromptTemplate` that will be used to generate the prompt for the LLM.
* `llm` — the LLM we will be using to generate the output.
* `output_parser` — an optional output parser that will be used to parse the structured output of the LLM.

In [4]:
from langchain_core.prompts import PromptTemplate

prompt_template = "Give me a small report on {topic}"

prompt = PromptTemplate(
    input_variables=["topic"],
    template=prompt_template
)
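
As a rough illustration of what happens when the template is filled in, the sketch below uses plain Python `str.format` on the same template string. It is only a stand-in for `PromptTemplate`, which for a simple template like this one should produce the same text; the `fill_template` helper is a hypothetical name used for illustration.

```python
# Plain-Python stand-in for filling the prompt template above.
# `fill_template` is a hypothetical helper, not part of LangChain.
prompt_template = "Give me a small report on {topic}"


def fill_template(template: str, **variables: str) -> str:
    """Substitute named placeholders such as {topic} with the given values."""
    return template.format(**variables)


filled = fill_template(prompt_template, topic="retrieval augmented generation")
print(filled)
```

The resulting string is what ultimately gets sent to the LLM as the prompt.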

For the LLM, we'll start by initializing our connection to the OpenAI API. We do need an OpenAI API key, which you can get from the [OpenAI platform](https://platform.openai.com/api-keys).

We will use the `gpt-4o-mini` model with a `temperature` of `0.0`:

In [5]:
import os
from getpass import getpass
from langchain_openai import ChatOpenAI

os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") \
    or getpass("Enter your OpenAI API key: ")

llm = ChatOpenAI(
    model_name="gpt-4o-mini",
    temperature=0.0,
)

In [6]:
llm_out = llm.invoke("Hello there")
llm_out

AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 9, 'total_tokens': 18, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_34a54ae93c', 'finish_reason': 'stop', 'logprobs': None}, id='run--6ad8cbcf-536b-4f32-90a3-270c6978414e-0', usage_metadata={'input_tokens': 9, 'output_tokens': 9, 'total_tokens': 18, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

Then we define our output parser, which will be used to parse the output of the LLM. In this case, we will use the `StrOutputParser`, which will parse the `AIMessage` output from our LLM into a single string.

In [7]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

In [8]:
out = output_parser.invoke(llm_out)
out

'Hello! How can I assist you today?'
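
Conceptually, the parsing step above amounts to pulling the text out of a message object. The sketch below mimics that with a hypothetical `FakeMessage` dataclass and `parse_to_str` function; both are illustrative stand-ins, not LangChain classes, and the real parser handles more cases.

```python
from dataclasses import dataclass


# `FakeMessage` and `parse_to_str` are hypothetical stand-ins used only to
# illustrate the idea; they are not part of LangChain.
@dataclass
class FakeMessage:
    content: str


def parse_to_str(message) -> str:
    """Return the text of a message-like object, or the input if it is already a string."""
    if isinstance(message, str):
        return message
    return message.content


text = parse_to_str(FakeMessage(content="Hello! How can I assist you today?"))
print(text)
```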

Through the `LLMChain` class we can place each of our components into a linear `chain`.

In [9]:
from langchain.chains import LLMChain

chain = LLMChain(prompt=prompt, llm=llm, output_parser=output_parser)

Note that `LLMChain` _was_ deprecated in LangChain `0.1.17`. The expected way of constructing these chains today is through LCEL, which we'll cover in a moment.

We can `invoke` our `chain`, providing a `topic` that we'd like to be researched.

In [10]:
result = chain.invoke("retrieval augmented generation")
result

{'topic': 'retrieval augmented generation',
 'text': '### Report on Retrieval-Augmented Generation (RAG)\n\n#### Introduction\nRetrieval-Augmented Generation (RAG) is an innovative approach that combines the strengths of information retrieval and natural language generation. This method enhances the capabilities of language models by allowing them to access external knowledge bases or documents during the generation process, thereby improving the accuracy and relevance of the generated content.\n\n#### Key Concepts\n1. **Information Retrieval**: RAG utilizes a retrieval mechanism to fetch relevant documents or snippets from a large corpus based on a given query. This step ensures that the model has access to up-to-date and contextually relevant information.\n\n2. **Natural Language Generation**: After retrieving relevant information, the model generates responses or text based on both the input query and the retrieved documents. This dual approach allows for more informed and contextua

We can view a formatted version of this output using the `Markdown` display:

In [11]:
from IPython.display import display, Markdown

display(Markdown(result["text"]))

### Report on Retrieval-Augmented Generation (RAG)

#### Introduction
Retrieval-Augmented Generation (RAG) is an innovative approach that combines the strengths of information retrieval and natural language generation. This method enhances the capabilities of language models by allowing them to access external knowledge bases or documents during the generation process, thereby improving the accuracy and relevance of the generated content.

#### Key Concepts
1. **Information Retrieval**: RAG utilizes a retrieval mechanism to fetch relevant documents or snippets from a large corpus based on a given query. This step ensures that the model has access to up-to-date and contextually relevant information.

2. **Natural Language Generation**: After retrieving relevant information, the model generates responses or text based on both the input query and the retrieved documents. This dual approach allows for more informed and contextually rich outputs.

3. **Hybrid Architecture**: RAG typically employs a two-component architecture:
   - **Retriever**: This component identifies and retrieves relevant documents from a knowledge base or database.
   - **Generator**: This component processes the retrieved documents along with the original query to produce coherent and contextually appropriate text.

#### Advantages
- **Enhanced Accuracy**: By leveraging external information, RAG can provide more accurate and factually correct responses, especially in domains where knowledge is constantly evolving.
- **Contextual Relevance**: The integration of retrieval allows the model to generate responses that are more relevant to the specific context of the query.
- **Scalability**: RAG can scale to large datasets, making it suitable for applications requiring extensive knowledge bases.

#### Applications
- **Customer Support**: RAG can be used in chatbots to provide accurate answers to customer inquiries by retrieving relevant information from product manuals or FAQs.
- **Content Creation**: Writers can use RAG to generate articles or reports that are well-informed and factually accurate by pulling in data from various sources.
- **Research Assistance**: RAG can assist researchers by summarizing findings from multiple papers or articles, providing a comprehensive overview of a topic.

#### Challenges
- **Quality of Retrieval**: The effectiveness of RAG heavily depends on the quality of the retrieval component. Poor retrieval can lead to irrelevant or incorrect information being used in the generation process.
- **Complexity**: Implementing a RAG system can be more complex than traditional generation models, requiring careful tuning of both the retrieval and generation components.
- **Latency**: The retrieval process can introduce latency, which may be a concern in real-time applications.

#### Conclusion
Retrieval-Augmented Generation represents a significant advancement in the field of natural language processing. By combining retrieval and generation, RAG systems can produce more accurate, relevant, and context-aware outputs. As research and development in this area continue, RAG is likely to play an increasingly important role in various applications, from customer service to content generation and beyond. Future work will focus on improving retrieval accuracy, reducing latency, and enhancing the overall user experience.

That is a simple `LLMChain` using the traditional LangChain method. Now let's move onto LCEL.

## LangChain Expression Language (LCEL)

**L**ang**C**hain **E**xpression **L**anguage (LCEL) is the recommended approach to building chains in LangChain. Having superseded traditional classes such as `LLMChain`, LCEL gives us a more flexible system for building chains. LCEL uses the pipe operator `|` to _chain_ components together. Let's see how we'd construct the equivalent of our `LLMChain` using LCEL.

In [12]:
lcel_chain = prompt | llm | output_parser

We can `invoke` this chain in the same way as we did before:

In [13]:
result = lcel_chain.invoke("retrieval augmented generation")
result

'### Report on Retrieval-Augmented Generation (RAG)\n\n#### Introduction\nRetrieval-Augmented Generation (RAG) is an innovative approach that combines the strengths of information retrieval and natural language generation. This method enhances the capabilities of language models by allowing them to access external knowledge bases or documents during the generation process, thereby improving the accuracy and relevance of the generated content.\n\n#### Key Concepts\n1. **Retrieval Mechanism**: RAG employs a retrieval system to fetch relevant documents or pieces of information from a large corpus. This is typically done using techniques such as vector embeddings or traditional keyword-based search.\n\n2. **Generation Mechanism**: After retrieving relevant information, a generative model (often based on transformer architectures like GPT or BERT) synthesizes this information into coherent and contextually appropriate responses.\n\n3. **Hybrid Model**: RAG can be viewed as a hybrid model th

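The output format is slightly different: the LCEL chain returns a plain string, whereas `LLMChain` returned a dictionary containing the input `topic` and the generated `text`. The underlying functionality is the same. As before, we can view a formatted version of this output using the `Markdown` display: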

In [14]:
display(Markdown(result))

### Report on Retrieval-Augmented Generation (RAG)

#### Introduction
Retrieval-Augmented Generation (RAG) is an innovative approach that combines the strengths of information retrieval and natural language generation. This method enhances the capabilities of language models by allowing them to access external knowledge bases or documents during the generation process, thereby improving the accuracy and relevance of the generated content.

#### Key Concepts
1. **Retrieval Mechanism**: RAG employs a retrieval system to fetch relevant documents or pieces of information from a large corpus. This is typically done using techniques such as vector embeddings or traditional keyword-based search.

2. **Generation Mechanism**: After retrieving relevant information, a generative model (often based on transformer architectures like GPT or BERT) synthesizes this information into coherent and contextually appropriate responses.

3. **Hybrid Model**: RAG can be viewed as a hybrid model that integrates both retrieval and generation, allowing it to leverage vast amounts of external knowledge while maintaining the fluency and contextual understanding of generative models.

#### Advantages
- **Enhanced Knowledge Access**: By retrieving information from external sources, RAG can provide more accurate and up-to-date responses compared to models that rely solely on pre-trained knowledge.
- **Contextual Relevance**: The integration of retrieval allows the model to generate responses that are more relevant to the specific query or context, improving user satisfaction.
- **Scalability**: RAG systems can scale to incorporate new information without the need for extensive retraining, making them adaptable to changing knowledge landscapes.

#### Applications
- **Question Answering**: RAG is particularly effective in applications where users seek specific information, such as customer support or educational tools.
- **Content Creation**: It can assist in generating articles, summaries, or reports by pulling in relevant data from various sources.
- **Conversational Agents**: RAG enhances chatbots and virtual assistants by enabling them to provide informed responses based on real-time data retrieval.

#### Challenges
- **Complexity**: The integration of retrieval and generation components can complicate the architecture and increase the computational resources required.
- **Quality of Retrieved Information**: The effectiveness of RAG heavily depends on the quality and relevance of the retrieved documents. Poor retrieval can lead to inaccurate or misleading outputs.
- **Latency**: The retrieval process can introduce latency, which may affect the responsiveness of applications, especially in real-time scenarios.

#### Conclusion
Retrieval-Augmented Generation represents a significant advancement in the field of natural language processing, combining the strengths of retrieval systems with generative models. Its ability to access and utilize external knowledge makes it a powerful tool for a variety of applications, from question answering to content creation. However, challenges related to complexity, information quality, and latency must be addressed to fully realize its potential. As research and development in this area continue, RAG is poised to play a crucial role in the evolution of intelligent systems.

### How Does the Pipe Operator Work?

Before moving onto other LCEL features, let's take a moment to understand what the pipe operator `|` is doing and _how_ it works.

Functionally, the pipe is a way of chaining components together: whatever the _left_ side outputs is fed as input into the _right_ side. In the example of `prompt | llm | output_parser`, `prompt` feeds into `llm`, which in turn feeds into `output_parser`.

Let's make a basic class named `Runnable` that will transform a provided function into a _runnable_ class that we will then use with the pipe `|` operator.

In [15]:
class Runnable:
    def __init__(self, func):
        self.func = func
    def __or__(self, other):
        def chained_func(*args, **kwargs):
            return other.invoke(self.func(*args, **kwargs))
        return Runnable(chained_func)
    def invoke(self, *args, **kwargs):
        return self.func(*args, **kwargs)

With the `Runnable` class, we will be able to wrap a function into the class, allowing us to then chain together several of these _runnable_ functions using the `__or__` method.

First, let's create a few functions that we'll chain together:

In [16]:
def add_five(x):
    return x+5

def sub_five(x):
    return x-5

def mul_five(x):
    return x*5

Now we wrap our functions with the `Runnable`:

In [17]:
add_five_runnable = Runnable(add_five)
sub_five_runnable = Runnable(sub_five)
mul_five_runnable = Runnable(mul_five)

Finally, we can chain these together using the `__or__` method from the `Runnable` class:

In [18]:
chain = (add_five_runnable).__or__(sub_five_runnable).__or__(mul_five_runnable)

chain.invoke(3)

15
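
To check that result, the chain can be unrolled by hand. The sketch below repeats the three functions from above and applies them one after another, with each output becoming the next input.

```python
def add_five(x):
    return x + 5


def sub_five(x):
    return x - 5


def mul_five(x):
    return x * 5


# Unroll the chain by hand: each output becomes the next input.
step_1 = add_five(3)       # 3 + 5 = 8
step_2 = sub_five(step_1)  # 8 - 5 = 3
step_3 = mul_five(step_2)  # 3 * 5 = 15
print(step_1, step_2, step_3)
```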

So we can see that we're able to chain together our functions using `__or__`. The pipe `|` operator is simply a shortcut for the `__or__` method, so we can create the exact same chain like so:

In [19]:
chain = add_five_runnable | sub_five_runnable | mul_five_runnable

chain.invoke(3)

15
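
LangChain's own runnables will also accept a plain function on the right-hand side of `|` and wrap it for you. The toy class can mimic that with a small change. The sketch below is an illustrative extension of the `Runnable` class above, given the hypothetical name `CoercingRunnable`; it is not LangChain's actual implementation.

```python
class CoercingRunnable:
    """Toy runnable that also accepts plain callables in a pipe (hypothetical)."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Wrap bare functions so that the right-hand side exposes `invoke`.
        if not isinstance(other, CoercingRunnable):
            other = CoercingRunnable(other)

        def chained_func(*args, **kwargs):
            return other.invoke(self.func(*args, **kwargs))

        return CoercingRunnable(chained_func)

    def invoke(self, *args, **kwargs):
        return self.func(*args, **kwargs)


def add_five(x):
    return x + 5


def mul_five(x):
    return x * 5


# Only the first element needs wrapping; `mul_five` is wrapped inside `__or__`.
chain = CoercingRunnable(add_five) | mul_five
result = chain.invoke(3)  # (3 + 5) * 5
print(result)
```

The design choice here is simply to normalise the right-hand operand before chaining, so every link in the chain can be called through `invoke`.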

## LCEL `RunnableLambda`

The `RunnableLambda` class is LangChain's built-in method for constructing a _runnable_ object from a function. That is, it does the same thing as the custom `Runnable` class we created earlier. Let's try it out with the same functions as before.

In [20]:
from langchain_core.runnables import RunnableLambda

add_five_runnable = RunnableLambda(add_five)
sub_five_runnable = RunnableLambda(sub_five)
mul_five_runnable = RunnableLambda(mul_five)

We chain these together again with the pipe `|` operator:

In [21]:
chain = add_five_runnable | sub_five_runnable | mul_five_runnable

And call them using the `invoke` method:

In [22]:
chain.invoke(3)

15

Now let's try something a little more demanding: this time we will generate a report and then edit that report using this functionality.

In [23]:
prompt_str = "give me a small report about {topic}"
prompt = PromptTemplate(
    input_variables=["topic"],
    template=prompt_str
)

In [24]:
chain = prompt | llm | output_parser

In [25]:
result = chain.invoke("AI")
display(Markdown(result))

### Report on Artificial Intelligence (AI)

#### Introduction
Artificial Intelligence (AI) refers to the simulation of human intelligence processes by machines, particularly computer systems. These processes include learning (the acquisition of information and rules for using it), reasoning (using rules to reach approximate or definite conclusions), and self-correction. AI has become a transformative technology across various sectors, influencing how we work, communicate, and solve problems.

#### Types of AI
AI can be categorized into two main types:

1. **Narrow AI (Weak AI)**: This type of AI is designed and trained for a specific task. Examples include virtual assistants like Siri and Alexa, recommendation systems on platforms like Netflix and Amazon, and image recognition software.

2. **General AI (Strong AI)**: This is a theoretical form of AI that possesses the ability to understand, learn, and apply intelligence across a wide range of tasks, similar to a human being. As of now, General AI remains largely conceptual and has not yet been realized.

#### Applications of AI
AI technologies are being applied in numerous fields, including:

- **Healthcare**: AI is used for diagnostic purposes, personalized medicine, and predictive analytics to improve patient outcomes.
- **Finance**: Algorithms analyze market trends, assess risks, and automate trading processes.
- **Transportation**: AI powers autonomous vehicles, optimizing routes and improving safety.
- **Manufacturing**: Robotics and AI enhance production efficiency, predictive maintenance, and quality control.
- **Customer Service**: Chatbots and virtual assistants provide 24/7 support, improving customer engagement and satisfaction.

#### Challenges and Ethical Considerations
Despite its potential, AI poses several challenges:

- **Bias and Fairness**: AI systems can perpetuate existing biases present in training data, leading to unfair outcomes.
- **Privacy Concerns**: The use of AI in data collection raises significant privacy issues, as personal information can be misused.
- **Job Displacement**: Automation driven by AI may lead to job losses in certain sectors, necessitating workforce retraining and adaptation.
- **Security Risks**: AI can be exploited for malicious purposes, including cyberattacks and misinformation campaigns.

#### Future of AI
The future of AI is promising, with ongoing advancements in machine learning, natural language processing, and robotics. As AI continues to evolve, it is expected to play a crucial role in addressing global challenges, such as climate change, healthcare accessibility, and education.

#### Conclusion
Artificial Intelligence is a rapidly advancing field with the potential to revolutionize various aspects of society. While it offers numerous benefits, it also presents significant ethical and practical challenges that must be addressed to ensure its responsible and equitable deployment. As we move forward, collaboration among technologists, policymakers, and ethicists will be essential to harness the full potential of AI while mitigating its risks.

Here we define two functions: `extract_fact`, which drops everything before the first blank line (the report title) and rejoins the remaining sections, and `replace_word`, which replaces every occurrence of `AI` with `skynet`!

In [26]:
def extract_fact(x):
    if "\n\n" in x:
        return "\n".join(x.split("\n\n")[1:])
    else:
        return x

old_word = "AI"
new_word = "skynet"

def replace_word(x):
    return x.replace(old_word, new_word)
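
To see what these two helpers do without calling the LLM, here is a quick check on a hand-written string. The `sample` text is made up for illustration; the functions are the same as in the cell above.

```python
def extract_fact(x):
    if "\n\n" in x:
        return "\n".join(x.split("\n\n")[1:])
    else:
        return x


old_word = "AI"
new_word = "skynet"


def replace_word(x):
    return x.replace(old_word, new_word)


# Made-up sample text standing in for an LLM report.
sample = "### Report on AI\n\n#### Introduction\nAI is useful.\n\n#### Conclusion\nAI matters."

trimmed = extract_fact(sample)    # drops the title before the first blank line
replaced = replace_word(trimmed)  # swaps every "AI" substring

print(replaced)
# str.replace matches substrings, so "AI" inside other words is replaced too.
print(replace_word("TRAIN"))
```

Note that `str.replace` is case-sensitive and works on substrings, so a stricter version might match whole words only.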

Let's wrap these functions and see what the output is!

In [27]:
extract_fact_runnable = RunnableLambda(extract_fact)
replace_word_runnable = RunnableLambda(replace_word)

In [28]:
chain = prompt | llm | output_parser | extract_fact_runnable | replace_word_runnable

In [29]:
result = chain.invoke("retrieval augmented generation")
display(Markdown(result))

#### Introduction
Retrieval-Augmented Generation (RAG) is an innovative approach that combines the strengths of information retrieval and natural language generation. This method enhances the capabilities of language models by allowing them to access external knowledge sources, thereby improving the accuracy and relevance of generated responses.
#### Concept Overview
RAG operates on the principle that while generative models like GPT-3 can produce coherent text, they may lack specific factual information or context that is not part of their training data. By integrating a retrieval mechanism, RAG can pull in relevant documents or data from a predefined corpus, which the generative model can then use to inform its output.
#### Mechanism
1. **Retrieval Component**: When a query is posed, the system first retrieves relevant documents from a large database or knowledge base. This is typically done using techniques such as vector similarity search or traditional keyword-based search.
2. **Generation Component**: The retrieved documents are then fed into a generative model, which synthesizes a response based on both the input query and the retrieved information. This allows the model to produce more informed and contextually relevant answers.
3. **End-to-End Training**: RAG models can be trained end-to-end, meaning that both the retrieval and generation components can be optimized together, leading to improved performance over time.
#### Applications
- **Question Answering**: RAG is particularly effective in applications where users seek specific information, such as customer support or educational tools.
- **Content Creation**: It can assist in generating articles, reports, or summaries by pulling in relevant data from various sources.
- **Conversational Agents**: RAG enhances chatbots and virtual assistants by providing them with up-to-date information and contextually relevant responses.
#### Advantages
- **Improved Accuracy**: By leveraging external knowledge, RAG can provide more accurate and factually correct information.
- **Contextual Relevance**: The integration of retrieval allows for responses that are more tailored to the specific needs of the user.
- **Dynamic Knowledge**: RAG systems can be updated with new information without retraining the entire model, making them adaptable to changing knowledge bases.
#### Challenges
- **Complexity**: The integration of retrieval and generation components adds complexity to the system architecture.
- **Latency**: The retrieval process can introduce delays, which may affect the responsiveness of applications.
- **Quality of Retrieved Data**: The effectiveness of RAG is heavily dependent on the quality and relevance of the retrieved documents.
#### Conclusion
Retrieval-Augmented Generation represents a significant advancement in the field of natural language processing. By combining retrieval and generation, RAG systems can produce more accurate, relevant, and context-aware responses. As research and development in this area continue, RAG is poised to play a crucial role in various applications, enhancing user experiences across multiple domains.

Those are our `RunnableLambda` functions. It's worth noting that each of these functions is expected to take a single argument. If you have a function that needs multiple values, you can pass in a dictionary and unpack its keys inside the function.
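
As a minimal sketch of that dictionary pattern, the function below takes one `dict` argument and unpacks the values it needs. The `multiply` function and its `x` and `y` keys are hypothetical names chosen for illustration.

```python
def multiply(inputs: dict) -> float:
    """Take a single dict argument and unpack the values the function needs."""
    x = inputs["x"]
    y = inputs["y"]
    return x * y


# The function is called with one argument: a dict holding both values.
product = multiply({"x": 4, "y": 5})
print(product)
```

A function written this way can be wrapped with `RunnableLambda` just like the earlier examples and invoked with the same dictionary.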

## LCEL `RunnableParallel` and `RunnablePassthrough`

LCEL provides us with various `Runnable` classes that allow us to control the flow of data and execution order through our chains. Two of these are `RunnableParallel` and `RunnablePassthrough`.

* `RunnableParallel` — allows us to run multiple `Runnable` instances in parallel. Acting almost as a Y-fork in the chain.

* `RunnablePassthrough` — allows us to pass through a variable to the next `Runnable` without modification.
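
As a rough mental model of the two classes above, the plain-Python sketch below feeds one input to several branches at once and collects the results in a dictionary, with an identity function playing the passthrough role. All names here (`toy_parallel`, `toy_passthrough`, `lookup_a`, `lookup_b`) are hypothetical, and LangChain's real classes do considerably more.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_passthrough(x):
    """Return the input unchanged."""
    return x


def toy_parallel(branches: dict):
    """Build a function that feeds one input to every branch and collects a dict."""
    def run(x):
        with ThreadPoolExecutor() as pool:
            # Submit every branch with the same input, then gather the results.
            futures = {key: pool.submit(fn, x) for key, fn in branches.items()}
            return {key: future.result() for key, future in futures.items()}
    return run


# Hypothetical stand-ins for two retrievers.
def lookup_a(question):
    return f"source A notes for: {question}"


def lookup_b(question):
    return f"source B notes for: {question}"


fork = toy_parallel({
    "context_a": lookup_a,
    "context_b": lookup_b,
    "question": toy_passthrough,
})
outputs = fork("what is LCEL?")
print(outputs)
```

The output dictionary has one key per branch, which is the shape a multi-variable prompt expects as input.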

To see these runnables in action, we will create two data sources. Each source provides specific information, but to answer the question we will need both to be fed to the LLM.

In [30]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import DocArrayInMemorySearch

embedding = OpenAIEmbeddings()

vecstore_a = DocArrayInMemorySearch.from_texts(
    [
        "half the info is here",
        "DeepSeek-V3 was released in December 2024"
    ],
    embedding=embedding
)
vecstore_b = DocArrayInMemorySearch.from_texts(
    [
        "the other half of the info is here",
        "the DeepSeek-V3 LLM is a mixture of experts model with 671B parameters"
    ],
    embedding=embedding
)

The prompt takes three inputs: two context variables (`context_a` and `context_b`) in the system message, and the user's `question` in the human message.

In [31]:
prompt_str = """Using the context provided, answer the user's question.
Context:
{context_a}
{context_b}
"""

In [32]:
from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate

prompt = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(prompt_str),
    HumanMessagePromptTemplate.from_template("{question}")
])

Here we wrap our vector stores as retrievers so that both can run inside a single `RunnableParallel`, named `retrieval`, whose output supplies the variables used by the prompt.

In [33]:
from langchain_core.runnables import RunnablePassthrough, RunnableParallel

retriever_a = vecstore_a.as_retriever()
retriever_b = vecstore_b.as_retriever()

retrieval = RunnableParallel(
    {
        "context_a": retriever_a, "context_b": retriever_b, "question": RunnablePassthrough()
    }
)

The chain we'll be constructing will look something like this:

![](https://github.com/aurelio-labs/langchain-course/blob/main/assets/lcel-flow.png?raw=1)

In [34]:
chain = retrieval | prompt | llm | output_parser

We `invoke` it as usual.

In [35]:
result = chain.invoke(
    "what architecture does the model DeepSeek released in december use?"
)
result

'The DeepSeek-V3 model, released in December 2024, is a mixture of experts model with 671 billion parameters.'

With that we've seen how we can use `RunnableParallel` and `RunnablePassthrough` to control the flow of data and execution order through our chains.

---