# LCEL Interface

- Author: [JeongGi Park](https://github.com/jeongkpa)
- Peer Review: [YooKyung Jeon](https://github.com/sirena1), [Wooseok Jeong](https://github.com/jeong-wooseok)
- Proofread : [Q0211](https://github.com/Q0211)
- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/01-Basic/07-LCEL-Interface.ipynb)
[![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/01-Basic/07-LCEL-Interface.ipynb)

## Overview

The LangChain Expression Language (LCEL) is a declarative interface that simplifies creating and managing custom chains in LangChain.
It is built on the ```Runnable``` protocol, which provides a standard way to build and execute language model chains.


### Table of Contents

- [Overview](#overview)
- [Environment Setup](#environment-setup)
- [LCEL Runnable Protocol](#lcel-runnable-protocol)
- [stream: real-time output](#stream-real-time-output)
- [Invoke](#invoke)
- [batch: unit execution](#batch-unit-execution)
- [async stream](#async-stream)
- [async invoke](#async-invoke)
- [async batch](#async-batch)
- [Parallel](#parallel)
- [Parallelism in batches](#parallelism-in-batches)

### References

- [LangSmith Documentation](https://docs.smith.langchain.com/)

---

## Environment Setup

Set up the environment. You may refer to [Environment Setup](https://wikidocs.net/257836) for more details.

**[Note]**
- ```langchain-opentutorial``` is a package that provides a set of easy-to-use environment setup, useful functions and utilities for tutorials. 
- You can check out the [```langchain-opentutorial```](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) repository for more details.

In [23]:
from dotenv import load_dotenv

load_dotenv(override=True)

True

In [25]:
%%capture --no-stderr
!pip install langchain-opentutorial

In [26]:
# Install required packages
from langchain_opentutorial import package

package.install(
    [
        "langsmith",
        "langchain-openai",
        "langchain",
        "python-dotenv",
        "langchain-core",
    ],
    verbose=False,
    upgrade=False,
)

In [27]:
# Set environment variables
from langchain_opentutorial import set_env
import os

set_env(
    {
        "OPENAI_API_KEY": os.environ.get('AZURE_OPENAI_API_KEY'),
        "LANGCHAIN_API_KEY":os.environ.get('LANGCHAIN_API_KEY'),
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "pr-advanced-blossom-21",  # Set this to match the title
    }
)

Environment variables have been set successfully.


## LCEL Runnable Protocol

---

To make it as easy as possible to create custom chains, we've implemented the ```Runnable``` protocol.

The ```Runnable``` protocol is implemented in most components.

It is a standard interface that makes it easy to define custom chains and call them in a uniform way. The standard interface includes:

- ```stream```: Stream back chunks of the response.
- ```invoke```: Invoke a chain on an input.
- ```batch```: Invoke a chain against a list of inputs.

There are also asynchronous methods:

- ```astream```: Stream chunks of the response asynchronously.
- ```ainvoke```: Invoke a chain asynchronously on an input.
- ```abatch```: Asynchronously invoke a chain against a list of inputs.
- ```astream_log```: Streams the final response as well as intermediate steps as they occur.
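
To make the shape of this interface concrete, here is a minimal sketch in plain Python. ```ToyRunnable``` and the two stand-in steps are hypothetical names invented for this illustration. This is not LangChain's implementation; it only mimics how ```invoke```, ```batch```, ```stream```, and the ```|``` operator relate to each other.

```python
from typing import Any, Callable, Iterator, List


class ToyRunnable:
    """Hypothetical stand-in that mimics the shape of the Runnable interface."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        # Run the wrapped function on a single input.
        return self.func(value)

    def batch(self, values: List[Any]) -> List[Any]:
        # Run the wrapped function on each input, preserving input order.
        return [self.invoke(v) for v in values]

    def stream(self, value: Any) -> Iterator[Any]:
        # This toy yields the whole result as one chunk; a chat model would yield many.
        yield self.invoke(value)

    def __or__(self, other: "ToyRunnable") -> "ToyRunnable":
        # `a | b` builds a new runnable that feeds a's output into b.
        return ToyRunnable(lambda value: other.invoke(self.invoke(value)))


fill_prompt = ToyRunnable(lambda d: f"Describe the {d['topic']} in 3 sentences.")
fake_model = ToyRunnable(lambda text: text.upper())  # pretend model: uppercases the prompt
toy_chain = fill_prompt | fake_model

print(toy_chain.invoke({"topic": "ChatGPT"}))
print(toy_chain.batch([{"topic": "ChatGPT"}, {"topic": "Instagram"}]))
print(list(toy_chain.stream({"topic": "multimodal"})))
```

Because every step exposes the same methods, the composed object can be called exactly like its parts, which is the property the rest of this tutorial relies on.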



### Log your trace

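There are multiple ways to log traces to LangSmith. The example below wraps the OpenAI client with ```wrappers.wrap_openai``` so that its LLM calls are traced automatically; the ```traceable``` decorator can be added to trace your own functions as well.

Use the code below to record a trace in LangSmith.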

In [30]:
import os

import httpx
from openai import AzureOpenAI
from langsmith import wrappers, traceable

# Azure OpenAI connection settings are read from environment variables.
AZURE_OPENAI_ENDPOINT = os.environ.get('AZURE_OPENAI_ENDPOINT')
AZURE_OPENAI_API_VERSION = os.environ.get('AZURE_OPENAI_API_VERSION')
AZURE_OPENAI_DEPLOYMENT_NAME = os.environ.get('AZURE_OPENAI_DEPLOYMENT_NAME')
AZURE_OPENAI_API_KEY = os.environ.get('AZURE_OPENAI_API_KEY')

# NOTE: verify=False disables TLS certificate verification; avoid it outside trusted test environments.
# http2=True requires the optional h2 dependency (pip install "httpx[http2]").
httpx_client = httpx.Client(http2=True, verify=False)

# Wrap the client so that its LLM calls are traced to LangSmith automatically.
client = wrappers.wrap_openai(
    AzureOpenAI(
        azure_endpoint=AZURE_OPENAI_ENDPOINT,
        api_version=AZURE_OPENAI_API_VERSION,
        api_key=AZURE_OPENAI_API_KEY,
        azure_deployment=AZURE_OPENAI_DEPLOYMENT_NAME,
        http_client=httpx_client,
    )
)


# @traceable  # Uncomment to trace this function as well
def pipeline(user_input: str):
    # Send a single user message and return the text of the first choice.
    result = client.chat.completions.create(
        messages=[{"role": "user", "content": user_input}],
        model="gpt-4o-mini",
    )
    return result.choices[0].message.content


pipeline("Hello, world!")
# Out:  Hello there! How can I assist you today?

'Hello! How can I assist you today?'

Create a chain using LCEL syntax.

In [39]:
from langchain_openai import AzureChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langsmith import utils

utils.tracing_is_enabled()
# Instantiate the AzureChatOpenAI model.
model = AzureChatOpenAI(
    temperature=0.5,  # Creativity (range: 0.0 ~ 2.0)
    model_name="gpt-4o-mini",  # Model name
    azure_endpoint = AZURE_OPENAI_ENDPOINT,
    openai_api_version = AZURE_OPENAI_API_VERSION,
    deployment_name = AZURE_OPENAI_DEPLOYMENT_NAME,
    openai_api_key = AZURE_OPENAI_API_KEY,
    http_client = httpx_client
)
# Create a prompt template that asks for a three-sentence description of a given topic.
prompt = PromptTemplate.from_template("Describe the {topic} in 3 sentences.")
# Connect the prompt and model to create a conversation chain.
chain = prompt | model | StrOutputParser()

## stream: real-time output

The ```chain.stream``` method returns an iterator over the response for a given topic, so you can loop over it and print each chunk as soon as it arrives. Because the chain ends with ```StrOutputParser```, each chunk is already a plain string.
The ```end=""``` argument suppresses the newline after each chunk, and ```flush=True``` empties the output buffer immediately.

In [40]:
# Stream the response for the given topic and print each chunk as it arrives.
for token in chain.stream({"topic": "multimodal"}):
    # Print each chunk without a trailing newline and flush the buffer.
    print(token, end="", flush=True)

# example output 
# The multimodal approach involves using multiple modes of communication, such as visual, auditory, and kinesthetic, to enhance learning and understanding. By incorporating different sensory inputs, learners are able to engage with material in a more holistic and immersive way. This approach is especially effective in catering to diverse learning styles and preferences.

Multimodal refers to the integration and use of multiple modes or methods of communication and expression, such as text, images, audio, and video, to convey information or create meaning. This approach recognizes that different modes can complement each other, enhancing understanding and engagement by appealing to various senses and learning styles. In education and media, multimodal strategies can foster creativity and critical thinking by allowing individuals to explore and present ideas through diverse formats.
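
A common follow-up is to keep the streamed chunks so the complete answer is available after the loop. The sketch below shows the pattern with a hypothetical ```fake_stream``` generator standing in for ```chain.stream```, so it runs without an API key; the sample sentence is made up for the illustration.

```python
from typing import Iterator


def fake_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for chain.stream(): yields the text word by word."""
    words = text.split(" ")
    for i, word in enumerate(words):
        # Re-attach the space that split() removed, except after the last word.
        yield word if i == len(words) - 1 else word + " "


chunks = []
for token in fake_stream("Multimodal refers to multiple modes of communication."):
    print(token, end="", flush=True)  # show each chunk as soon as it arrives
    chunks.append(token)              # keep it so the full answer is available later
print()  # finish the line once the stream is exhausted

full_response = "".join(chunks)
```

With a real chain, the same loop body should work unchanged, since each chunk is a string when the chain ends with ```StrOutputParser```.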

## Invoke

The ```invoke``` method of the ```chain``` object takes the input dictionary as an argument, runs the whole chain on it, and returns the final result once processing is complete.

In [41]:
# Call the invoke method of the chain object, passing a dictionary with the topic 'ChatGPT'.
chain.invoke({"topic": "ChatGPT"})

'ChatGPT is an advanced language model developed by OpenAI that utilizes deep learning techniques to understand and generate human-like text. It can engage in conversations, answer questions, and provide information across a wide range of topics, making it a versatile tool for communication and assistance. Designed to learn from vast datasets, ChatGPT continuously improves its responses, aiming to provide relevant and coherent interactions.'

## batch: unit execution

The ```chain.batch``` method takes a list of input dictionaries and runs the chain on each one, filling the prompt with the value of the ```topic``` key in each dictionary.

In [42]:
# Call a function to batch process a given list of topics
chain.batch([{"topic": "ChatGPT"}, {"topic": "Instagram"}])

['ChatGPT is an advanced language model developed by OpenAI, designed to understand and generate human-like text based on the input it receives. It can assist with a wide range of tasks, including answering questions, providing explanations, and engaging in conversation across various topics. Leveraging deep learning techniques, ChatGPT continuously improves its ability to generate coherent and contextually relevant responses.',
 'Instagram is a popular social media platform that allows users to share photos and videos, apply filters, and engage with others through likes, comments, and direct messaging. Launched in 2010, it has evolved to include features like Stories, IGTV, and Reels, catering to diverse content formats and encouraging creativity. With a focus on visual storytelling, Instagram has become a key tool for influencers, brands, and individuals to connect and promote their identities or products.']

You can use the ```max_concurrency``` parameter to limit the number of concurrent requests.

The ```config``` dictionary uses the ```max_concurrency``` key to set the maximum number of operations that can be processed concurrently. Here, it is set to 1, so the five inputs are processed one at a time.

In [43]:
chain.batch(
    [
        {"topic": "ChatGPT"},
        {"topic": "Instagram"},
        {"topic": "multimodal"},
        {"topic": "programming"},
        {"topic": "machineLearning"},
    ],
    config={"max_concurrency": 1},
)

['ChatGPT is an advanced language model developed by OpenAI that uses deep learning techniques to understand and generate human-like text. It is designed to engage in conversations, answer questions, provide explanations, and assist with a wide range of topics. By leveraging a vast dataset, ChatGPT can produce coherent and contextually relevant responses, making it a valuable tool for communication and information retrieval.',
 'Instagram is a popular social media platform that allows users to share photos and videos, often enhanced with filters and editing tools. It emphasizes visual content and community engagement through features like stories, reels, and direct messaging. Users can follow others, like and comment on posts, and explore a wide range of content from friends, influencers, and brands.',
 'Multimodal refers to the integration and use of multiple modes or methods of communication, such as text, audio, images, and video, to convey information or express ideas. This approac
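
The effect of a concurrency cap can be sketched with the standard library. The example assumes a thread-pool model, which is how the LangChain documentation describes the default ```batch```; treat it as an illustration rather than the library's actual code. ```fake_invoke``` and ```toy_batch``` are hypothetical names, and the sleep stands in for a model call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List

_lock = threading.Lock()
_running = 0
peak_running = 0  # highest number of calls observed in flight at once


def fake_invoke(topic: str) -> str:
    """Hypothetical stand-in for chain.invoke(): sleeps instead of calling a model."""
    global _running, peak_running
    with _lock:
        _running += 1
        peak_running = max(peak_running, _running)
    time.sleep(0.05)  # simulate network latency
    with _lock:
        _running -= 1
    return f"summary of {topic}"


def toy_batch(topics: List[str], max_concurrency: int) -> List[str]:
    # At most `max_concurrency` worker threads run fake_invoke at the same time.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # map() returns results in input order, whichever call finishes first.
        return list(pool.map(fake_invoke, topics))


topics = ["ChatGPT", "Instagram", "multimodal", "programming", "machineLearning"]
results = toy_batch(topics, max_concurrency=2)
print(results)
print("peak concurrent calls:", peak_running)
```

A lower cap trades throughput for gentler load on the API, which can help when a provider enforces rate limits.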

## async stream

The ```chain.astream``` method creates an asynchronous stream and processes the response for a given topic asynchronously.

It uses an asynchronous for loop (```async for```) to receive chunks from the stream one by one and prints each chunk (```token```) immediately. ```end=""``` suppresses the newline after each chunk, and ```flush=True``` forces the output buffer to be emptied so the text appears right away.


In [46]:
# Stream the response for the 'YouTube' topic asynchronously.
async for token in chain.astream({"topic": "YouTube"}):
    # Print each chunk without a trailing newline and flush the buffer.
    print(token, end="", flush=True)

APIConnectionError: Connection error.
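
A top-level ```async for``` like the one above works in a notebook, where an event loop is already running. In a plain Python script the loop has to be started explicitly. The sketch below shows that pattern with a hypothetical ```fake_astream``` async generator in place of ```chain.astream```, so it runs offline; the sample sentence is made up.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_astream(text: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for chain.astream(): yields one word at a time."""
    for word in text.split(" "):
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield word


async def consume() -> List[str]:
    received = []
    async for token in fake_astream("YouTube is a video sharing platform"):
        print(token, end=" ", flush=True)
        received.append(token)
    print()
    return received


# In a plain script an event loop must be started explicitly.
# In a notebook, which already runs one, use `await consume()` instead.
tokens = asyncio.run(consume())
```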

## async invoke

The ```ainvoke``` method of the ```chain``` object runs the chain asynchronously on the given input. Here, we pass a dictionary whose ```topic``` key has the value ```NVDA``` (NVIDIA's ticker symbol). Calling ```ainvoke``` returns an awaitable, and the result becomes available once it is awaited.

In [47]:
# Handle the 'NVDA' topic by calling the 'ainvoke' method of the asynchronous chain object.
my_process = chain.ainvoke({"topic": "NVDA"})

In [48]:
# Wait for the asynchronous process to complete.
await my_process

APIConnectionError: Connection error.
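
The two-step pattern above (create the awaitable, then ```await``` it) can be illustrated without a model. In this sketch ```fake_ainvoke``` is a hypothetical coroutine standing in for ```chain.ainvoke```; it also shows ```asyncio.gather```, which lets several calls wait at the same time.

```python
import asyncio
import inspect


async def fake_ainvoke(inputs: dict) -> str:
    """Hypothetical stand-in for chain.ainvoke()."""
    await asyncio.sleep(0.01)  # simulate waiting on the model
    return f"description of {inputs['topic']}"


async def main() -> list:
    pending = fake_ainvoke({"topic": "NVDA"})
    # Nothing has run yet: calling a coroutine function only creates a coroutine object.
    created_coroutine = inspect.iscoroutine(pending)
    first = await pending  # the work happens here

    # Several calls can wait on the model at the same time with asyncio.gather.
    second, third = await asyncio.gather(
        fake_ainvoke({"topic": "YouTube"}),
        fake_ainvoke({"topic": "Instagram"}),
    )
    return [created_coroutine, first, second, third]


# In a notebook, use `await main()` instead of asyncio.run(main()).
outcome = asyncio.run(main())
print(outcome)
```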

## async batch

The ```abatch``` method processes a list of inputs asynchronously.

In this example, we use the ```abatch``` method of the ```chain``` object to process several ```topic``` inputs asynchronously.

The ```await``` keyword is used to wait for those asynchronous tasks to complete.

In [49]:
# Performs asynchronous batch processing on a given topic.
my_abatch_process = chain.abatch(
    [{"topic": "YouTube"}, {"topic": "Instagram"}, {"topic": "Facebook"}]
)

In [20]:
# Wait for the asynchronous batch process to complete.
await my_abatch_process

APIConnectionError: Connection error.
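
As a rough mental model, an asynchronous batch resembles gathering one coroutine per input, optionally behind a concurrency limit. The sketch below is an assumption-laden illustration, not LangChain's code: ```fake_ainvoke``` and ```toy_abatch``` are hypothetical, and the semaphore plays the role of a ```max_concurrency``` setting.

```python
import asyncio
from typing import List

peak = 0    # highest number of simultaneous calls observed
active = 0  # calls currently in flight


async def fake_ainvoke(topic: str) -> str:
    """Hypothetical stand-in for chain.ainvoke()."""
    global peak, active
    active += 1
    peak = max(peak, active)
    await asyncio.sleep(0.01)  # simulate waiting on the model
    active -= 1
    return f"description of {topic}"


async def toy_abatch(topics: List[str], max_concurrency: int) -> List[str]:
    gate = asyncio.Semaphore(max_concurrency)

    async def guarded(topic: str) -> str:
        async with gate:  # wait here while max_concurrency calls are already running
            return await fake_ainvoke(topic)

    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(guarded(t) for t in topics))


# In a notebook, use `await toy_abatch(...)` instead of asyncio.run(...).
batch_results = asyncio.run(
    toy_abatch(["YouTube", "Instagram", "Facebook"], max_concurrency=2)
)
print(batch_results, "peak:", peak)
```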

## Parallel

Let's take a look at how the LangChain Expression Language supports parallel requests. For example, when you use ```RunnableParallel``` (often written in dictionary form), you execute each element in parallel.

Here's an example of running two tasks in parallel using the ```RunnableParallel``` class in the ```langchain_core.runnables``` module.

Create two chains (```chain1```, ```chain2```) that use the ```PromptTemplate.from_template``` method to ask for the capital and the area of a given ```country```.

Each chain connects its prompt, the ```model```, and the output parser with the pipe (```|```) operator. Finally, we use the ```RunnableParallel``` class to combine the two chains under the keys ```capital``` and ```area```, creating a ```combined``` object that runs them in parallel.

In [50]:
from langchain_core.runnables import RunnableParallel

# Create a chain that asks for the capital of {country}.
chain1 = (
    PromptTemplate.from_template("What is the capital of {country}?")
    | model
    | StrOutputParser()
)

# Create a chain that asks for the area of {country}.
chain2 = (
    PromptTemplate.from_template("What is the area of {country}?")
    | model
    | StrOutputParser()
)

# Create a parallel execution chain that runs the two chains above in parallel.
combined = RunnableParallel(capital=chain1, area=chain2)

The ```chain1.invoke()``` function calls the ```invoke``` method of the ```chain1``` object.

As an argument, it passes a dictionary with the value ```Canada``` in the key named ```country```.

In [51]:
# Run chain1 .
chain1.invoke({"country": "Canada"})

'The capital of Canada is Ottawa.'

Call ```chain2.invoke()```, this time passing a different country, the United States (```USA```), for the ```country``` key.

In [27]:
# Run chain2 .
chain2.invoke({"country": "USA"})

'The total land area of the United States is approximately 3.8 million square miles (9.8 million square kilometers).'

The ```invoke``` method of the ```combined``` object performs the processing for the given ```country```.

In this example, the country ```USA``` is passed to the ```invoke``` method.

In [52]:
# Run a parallel execution chain.
combined.invoke({"country": "USA"})

{'capital': 'The capital of the United States is Washington, D.C.',
 'area': 'The total area of the United States is approximately 3.8 million square miles (about 9.8 million square kilometers), making it the third-largest country in the world by total area, after Russia and Canada. This figure includes all 50 states and the District of Columbia, but does not include territories.'}
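
Conceptually, the parallel step sends the same input to every branch and collects the answers into a dictionary keyed by branch name. The following sketch reproduces that behavior with a thread pool. ```toy_parallel```, ```fake_capital```, and ```fake_area``` are hypothetical names, and the real ```RunnableParallel``` may be implemented differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict


def toy_parallel(**branches: Callable[[Any], Any]) -> Callable[[Any], Dict[str, Any]]:
    """Hypothetical sketch: build a function that runs every branch on the same input."""

    def run(value: Any) -> Dict[str, Any]:
        with ThreadPoolExecutor(max_workers=len(branches)) as pool:
            # Submit every branch first so they can overlap, then collect by key.
            futures = {key: pool.submit(branch, value) for key, branch in branches.items()}
            return {key: future.result() for key, future in futures.items()}

    return run


# Stand-ins for chain1 and chain2; a real chain would call the model here.
def fake_capital(inputs: dict) -> str:
    return f"capital of {inputs['country']}"


def fake_area(inputs: dict) -> str:
    return f"area of {inputs['country']}"


toy_combined = toy_parallel(capital=fake_capital, area=fake_area)
parallel_result = toy_combined({"country": "USA"})
print(parallel_result)
```

Since the branches do not depend on each other, the total wait is roughly that of the slowest branch rather than the sum of all branches.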

## Parallelism in batches

Parallelism can be combined with the other ```Runnable``` methods. Let's try using parallelism with ```batch```.

The ```chain1.batch``` method takes a list of dictionaries as an argument and processes the value of the ```country``` key in each dictionary. In this example, we batch process two countries, ```Canada``` and ```USA```.

In [53]:
# Perform batch processing.
chain1.batch([{"country": "Canada"}, {"country": "USA"}])


['The capital of Canada is Ottawa.',
 'The capital of the United States is Washington, D.C.']

The ```chain2.batch``` function takes in multiple dictionaries as a list and performs batch processing.

In this example, we request processing for two countries, ```Canada``` and the ```United States```.

In [30]:
# Perform batch processing.
chain2.batch([{"country": "Canada"}, {"country": "USA"}])


['The total land area of Canada is approximately 9.98 million square kilometers (3.85 million square miles), making it the second largest country in the world by land area.',
 'The total land area of the United States is approximately 3.8 million square miles.']

The ```combined.batch``` method processes the given inputs in batches.

In this example, it takes a list containing two dictionaries and, for each of the two countries (Canada and the United States), runs both chains in parallel.

In [54]:
# Processes the given data in batches.
combined.batch([{"country": "Canada"}, {"country": "USA"}])


[{'capital': 'The capital of Canada is Ottawa.',
  'area': 'The total area of Canada is approximately 9.98 million square kilometers (about 3.85 million square miles), making it the second-largest country in the world by total area, after Russia.'},
 {'capital': 'The capital of the United States is Washington, D.C.',
  'area': 'The total area of the United States is approximately 3.8 million square miles (about 9.8 million square kilometers), making it the third-largest country in the world by total area, following Russia and Canada. This figure includes all 50 states and the District of Columbia, but does not include territories.'}]
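
Because ```batch``` returns one result per input in the same order, the results can be paired back with their inputs. The sketch below uses a hard-coded ```batch_output``` with the same shape as the output above (the ```area``` strings are replaced by placeholders), so it runs without calling the model; ```by_country``` is a name chosen for the illustration.

```python
inputs = [{"country": "Canada"}, {"country": "USA"}]

# Shape copied from the output above; the "area" strings are shortened placeholders.
batch_output = [
    {"capital": "The capital of Canada is Ottawa.", "area": "..."},
    {"capital": "The capital of the United States is Washington, D.C.", "area": "..."},
]

# batch returns one result per input, in the same order, so zip() lines them up.
by_country = {item["country"]: result for item, result in zip(inputs, batch_output)}

print(by_country["Canada"]["capital"])
```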