# (v0.2) Build a Simple LLM Application with LCEL

Following the v0.2 documentation.

ref. [Build a Simple LLM Application with LCEL \| 🦜️🔗 LangChain](https://python.langchain.com/v0.2/docs/tutorials/llm_chain/)

## Setup

### LangSmith

> it becomes crucial to be able to inspect what exactly is going on inside your chain or agent.  
> The best way to do this is with LangSmith.

## Using Language Models

First, I decided to use OpenAI's language models.  
Create a `.env` file under the notebooks directory and set the following environment variables.

```
LANGCHAIN_API_KEY=**********
OPENAI_API_KEY=**********
```

In [1]:
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_PROJECT"] = "langchain-tutorials"

In [2]:
from dotenv import load_dotenv

load_dotenv(dotenv_path="../.env")

True
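Before calling a model, it can help to confirm that the keys from `.env` were actually loaded. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical (not part of LangChain or python-dotenv), and it reports only variable names so the secret values are never printed.

```python
import os
from typing import Iterable, Mapping, Optional

# The two keys listed in the .env example above.
REQUIRED_VARS = ("LANGCHAIN_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(
    names: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Passing an explicit mapping as `env` makes the helper easy to try out without touching the real environment.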

In [3]:
from langchain_openai import ChatOpenAI

# llm = ChatOpenAI(model="gpt-4")
# maybe not available because of legacy(?)

# llm = ChatOpenAI(model="text-davinci-003")
# deprecated

llm_4o = ChatOpenAI(model="gpt-4o")
llm_4o_mini = ChatOpenAI(model="gpt-4o-mini")
llm_35 = ChatOpenAI(model="gpt-3.5-turbo")
llm_cheap = ChatOpenAI(model="gpt-3.5-turbo", max_tokens=100, top_p=0.1)

In [4]:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content="Translate the following from English into Japanese"),
    HumanMessage(content="hi!"),
]

In [5]:
llm_4o.invoke(messages)

AIMessage(content='こんにちは！', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 2, 'prompt_tokens': 20, 'total_tokens': 22, 'completion_tokens_details': {'audio_tokens': 0, 'reasoning_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_831e067d82', 'finish_reason': 'stop', 'logprobs': None}, id='run-fd4deaf6-ae22-4649-946c-db8865caa88b-0', usage_metadata={'input_tokens': 20, 'output_tokens': 2, 'total_tokens': 22, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

gpt-4o

e.g.

```
AIMessage(
    content='こんにちは！',
    additional_kwargs={'refusal': None},
    response_metadata={
        'token_usage': {
            'completion_tokens': 2,
            'prompt_tokens': 20,
            'total_tokens': 22,
            'completion_tokens_details': {'audio_tokens': None, 'reasoning_tokens': 0},
            'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}
        },
        'model_name': 'gpt-4o-2024-08-06',
        'system_fingerprint': 'fp_e5e4913e83',
        'finish_reason': 'stop',
        'logprobs': None
    },
    id='run-**********',
    usage_metadata={
        'input_tokens': 20,
        'output_tokens': 2,
        'total_tokens': 22,
        'input_token_details': {'cache_read': 0},
        'output_token_details': {'reasoning': 0}
    }
)
```

In [6]:
llm_4o_mini.invoke(messages)

AIMessage(content='こんにちは！', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 2, 'prompt_tokens': 20, 'total_tokens': 22, 'completion_tokens_details': {'audio_tokens': 0, 'reasoning_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_0705bf87c0', 'finish_reason': 'stop', 'logprobs': None}, id='run-3a17ff35-1a72-4d31-b547-ce8ba04a21c9-0', usage_metadata={'input_tokens': 20, 'output_tokens': 2, 'total_tokens': 22, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

gpt-4o-mini

e.g.

```
AIMessage(
  content='こんにちは！',
  additional_kwargs={'refusal': None},
  response_metadata={
    'token_usage': {
      'completion_tokens': 2,
      'prompt_tokens': 20,
      'total_tokens': 22,
      'completion_tokens_details': {'audio_tokens': None, 'reasoning_tokens': 0},
      'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}
    },
    'model_name': 'gpt-4o-mini-2024-07-18',
    'system_fingerprint': 'fp_0ba0d124f1',
    'finish_reason': 'stop',
    'logprobs': None
  },
  id='run-**********',
  usage_metadata={
    'input_tokens': 20,
    'output_tokens': 2,
    'total_tokens': 22,
    'input_token_details': {'cache_read': 0},
    'output_token_details': {'reasoning': 0}
  }
)
```

In [7]:
llm_35.invoke(messages)

AIMessage(content='こんにちは！', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 2, 'prompt_tokens': 20, 'total_tokens': 22, 'completion_tokens_details': {'audio_tokens': 0, 'reasoning_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-8942ba73-458d-46f4-9dce-73a005b30fa0-0', usage_metadata={'input_tokens': 20, 'output_tokens': 2, 'total_tokens': 22, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

gpt-3.5-turbo

e.g.

```
AIMessage(
    content='こんにちは！',
    additional_kwargs={'refusal': None},
    response_metadata={
        'token_usage': {
            'completion_tokens': 2,
            'prompt_tokens': 20,
            'total_tokens': 22,
            'completion_tokens_details': {'audio_tokens': None, 'reasoning_tokens': 0},
            'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}
        },
        'model_name': 'gpt-3.5-turbo-0125',
        'system_fingerprint': None,
        'finish_reason': 'stop',
        'logprobs': None
    },
    id='run-**********',
    usage_metadata={
        'input_tokens': 20,
        'output_tokens': 2,
        'total_tokens': 22,
        'input_token_details': {'cache_read': 0},
        'output_token_details': {'reasoning': 0}
    }
)
```

In [8]:
llm_cheap.invoke(messages)

AIMessage(content='こんにちは！', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 2, 'prompt_tokens': 20, 'total_tokens': 22, 'completion_tokens_details': {'audio_tokens': 0, 'reasoning_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-471b86b3-0d3b-444e-a9b9-15049c8492ce-0', usage_metadata={'input_tokens': 20, 'output_tokens': 2, 'total_tokens': 22, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

gpt-3.5-turbo, max_tokens=100, top_p=0.1

e.g.

```
AIMessage(
    content='こんにちは！',
    additional_kwargs={'refusal': None},
    response_metadata={
        'token_usage': {
            'completion_tokens': 2,
            'prompt_tokens': 20,
            'total_tokens': 22,
            'completion_tokens_details': {'audio_tokens': None, 'reasoning_tokens': 0},
            'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}
        },
        'model_name': 'gpt-3.5-turbo-0125',
        'system_fingerprint': None,
        'finish_reason': 'stop',
        'logprobs': None
    },
    id='run-**********',
    usage_metadata={
        'input_tokens': 20,
        'output_tokens': 2,
        'total_tokens': 22,
        'input_token_details': {'cache_read': 0},
        'output_token_details': {'reasoning': 0}
    }
)
```

`content` holds the generated response message.
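Besides `content`, the outputs above carry token counts in `usage_metadata`. As a small sketch, the dictionary below is copied from the gpt-4o output, and the hypothetical helper `check_usage` verifies that the input and output counts add up to the total. Judging from the `AIMessage` repr above, the same dictionary should be reachable on a real response as its `usage_metadata` attribute.

```python
# Values copied from the gpt-4o output shown above.
usage_metadata = {
    "input_tokens": 20,
    "output_tokens": 2,
    "total_tokens": 22,
}


def check_usage(usage: dict) -> bool:
    """Return True when input + output tokens equal the reported total."""
    return usage["input_tokens"] + usage["output_tokens"] == usage["total_tokens"]


print(check_usage(usage_metadata))  # True
```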

API Reference: [HumanMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.human.HumanMessage.html) | [SystemMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.system.SystemMessage.html)

e.g. [Official LangSmith trace](https://smith.langchain.com/public/88baa0b2-7c1a-4d09-ba30-a47985dde2ea/r)

## OutputParsers

> We can parse out just this response by using a simple output parser.

API Reference: [StrOutputParser](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.string.StrOutputParser.html)

In [9]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

In [10]:
llm_4o_mini = ChatOpenAI(model="gpt-4o-mini")
result = llm_4o_mini.invoke(messages)

In [11]:
parser.invoke(result)

'こんにちは！'

In [12]:
chain = llm_4o_mini | parser

In [13]:
chain.invoke(messages)

'こんにちは！'

LangSmith trace (excerpt from my environment)

e.g.

```
RunnableSequence
    human: hi!
    こんにちは！
    0.98s
    22 (tokens)
    $0.0000042
    ChatOpenAI
        human: hi!
        ai: こんにちは！
        0.98s
        22 (tokens)
        $0.0000042
        seq:step:1
        ls_provider: openai
        ls_model_name: gpt-4o-mini
        ls_model_type: chat
        ls_temperature: 0.7
    StrOutputParser
        ai: こんにちは！
        こんにちは！
        0.00s
        0 (tokens)
        seq:step:2
```

e.g. [Official LangSmith trace](https://smith.langchain.com/public/f1bdf656-2739-42f7-ac7f-0f1dd712322f/r/bcd8c25b-8417-4584-aa67-2966b6ccb151)
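Conceptually, the `|` in `llm_4o_mini | parser` builds a new object that feeds the output of the left-hand component into the right-hand one. The toy class below illustrates that idea with Python's `__or__` operator; `Step`, `fake_model`, and `fake_parser` are hypothetical stand-ins, and this is not LangChain's actual implementation.

```python
class Step:
    """A toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # The combined step feeds this step's output into the next one.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for the chat model and the string output parser.
fake_model = Step(lambda messages: {"content": "こんにちは！", "tokens": 22})
fake_parser = Step(lambda message: message["content"])

toy_chain = fake_model | fake_parser
print(toy_chain.invoke(["hi!"]))  # こんにちは！
```

The two nested entries under `RunnableSequence` in the trace above correspond to the two steps being run one after the other.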

## Prompt Templates

API Reference: [ChatPromptTemplate](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html)

Tutorial prompt placeholders:
- `language`: The language to translate text into
- `text`: The text to translate

In [14]:
from langchain_core.prompts import ChatPromptTemplate

In [15]:
system_template = "Translate the following into {language}:"

In [16]:
prompt_template = ChatPromptTemplate.from_messages(
    [("system", system_template), ("user", "{text}")]
)

In [17]:
result = prompt_template.invoke({"language": "japanese", "text": "hi"})

result

ChatPromptValue(messages=[SystemMessage(content='Translate the following into japanese:', additional_kwargs={}, response_metadata={}), HumanMessage(content='hi', additional_kwargs={}, response_metadata={})])

e.g.

```
ChatPromptValue(
  messages=[
    SystemMessage(
      content='Translate the following into japanese:',
      additional_kwargs={},
      response_metadata={}
    ),
    HumanMessage(
      content='hi',
      additional_kwargs={},
      response_metadata={}
    )
  ]
)
```

In [18]:
result.to_messages()

[SystemMessage(content='Translate the following into japanese:', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='hi', additional_kwargs={}, response_metadata={})]

e.g.

```
[
  SystemMessage(
    content='Translate the following into japanese:',
    additional_kwargs={},
    response_metadata={}
  ),
  HumanMessage(
    content='hi',
    additional_kwargs={},
    response_metadata={}
  )
]
```
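To make the placeholder mechanics concrete, here is a standard-library sketch that mimics the brace-style substitution with `str.format`. The helpers `placeholder_names` and `render` are hypothetical and only imitate what the outputs above show; they are not how `ChatPromptTemplate` is implemented.

```python
from string import Formatter

system_template = "Translate the following into {language}:"
user_template = "{text}"


def placeholder_names(template: str) -> list[str]:
    """Collect the `{name}` fields that appear in a format string."""
    return [field for _, field, _, _ in Formatter().parse(template) if field]


def render(values: dict) -> list[tuple[str, str]]:
    """Fill both templates and return (role, content) pairs."""
    return [
        ("system", system_template.format(**values)),
        ("user", user_template.format(**values)),
    ]


print(placeholder_names(system_template) + placeholder_names(user_template))
# ['language', 'text']
print(render({"language": "japanese", "text": "hi"}))
# [('system', 'Translate the following into japanese:'), ('user', 'hi')]
```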

## Chaining together components with LCEL

[LangChain Expression Language (LCEL)](https://python.langchain.com/docs/concepts/lcel/)

In [19]:
chain = prompt_template | llm_35 | parser

In [20]:
chain.invoke({"language": "japanese", "text": "hi"})

'こんにちは'

Because the components are composed with LCEL, LangSmith can trace each step of the chain (prompt template, model, and parser) within a single run.

e.g. [LangSmith TRACE RunnableSequence](https://smith.langchain.com/public/bc49bec0-6b13-4726-967f-dbd3448b786d/r)

### Note

On 2024-10-31, with `gpt-4o-mini` and `gpt-4o`, I got the following result instead of a translation (it means "You are trained on data up to October 2023."):
```
'あなたは2023年10月までのデータで訓練されています。'
```

In [21]:
(prompt_template | llm_4o_mini | parser).invoke({"language": "japanese", "text": "hi"})

'あなたは2023年10月までのデータでトレーニングされています。'

In [22]:
(prompt_template | llm_4o_mini | parser).invoke({"language": "english", "text": "hi"})

'Hello! How can I assist you today?'

In [23]:
(prompt_template | llm_4o_mini | parser).invoke({"language": "english", "text": "やあ"})

'Hello!'

In [24]:
(prompt_template | llm_4o_mini | parser).invoke({"language": "english", "text": ""})

'You are trained on data up to October 2023.'

🤔 Assumption:
- OpenAI's models may be set up to return a preset response (about the range of their training data) for certain simple inputs.

related issue: [Language Translation Is Broken \- API / Bugs \- OpenAI Developer Forum](https://community.openai.com/t/language-translation-is-broken/975691/3)

In [25]:
chain = prompt_template | llm_4o_mini | parser
chain.invoke({"language": "japanese", "text": "May the Force be with you"})

'あなたにフォースの加護がありますように'

The response does NOT mention the range of the training data. -> OK!

In [26]:
exam_chain = ChatPromptTemplate.from_messages(
    [("system", system_template.replace(":", "(Do not respond training dataset range) :")),
     ("user", "{text}")]
) | llm_4o_mini | parser
exam_chain.invoke({"language": "japanese", "text": "hi"})

'こんにちは'

The response does NOT mention the range of the training data. -> OK!

When I told the model in the system message not to respond with the range of the training dataset, I got the result I was expecting.
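The empty-string call in cell 24 suggests that input with nothing to translate is one trigger for the unexpected reply. One possible guard, sketched here with a hypothetical helper `require_text`, is to reject blank input before it reaches the chain. This is an assumption about a reasonable workaround, not something the tutorial prescribes, and it would not cover the `hi` case.

```python
def require_text(inputs: dict) -> dict:
    """Raise ValueError unless `inputs["text"]` is a non-blank string."""
    text = inputs.get("text", "")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("'text' must be a non-empty string to translate")
    return inputs


print(require_text({"language": "english", "text": "やあ"}))

try:
    require_text({"language": "english", "text": ""})
except ValueError as exc:
    print("rejected:", exc)
```

The validated dictionary could then be passed on to `chain.invoke(...)` unchanged.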

In [27]:
chain.steps

[ChatPromptTemplate(input_variables=['language', 'text'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['language'], input_types={}, partial_variables={}, template='Translate the following into {language}:'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['text'], input_types={}, partial_variables={}, template='{text}'), additional_kwargs={})]),
 ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x107ca3f20>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x107db7cb0>, root_client=<openai.OpenAI object at 0x107c97740>, root_async_client=<openai.AsyncOpenAI object at 0x107dcf050>, model_name='gpt-4o-mini', model_kwargs={}, openai_api_key=SecretStr('**********')),
 StrOutputParser()]

e.g.

```
[
  ChatPromptTemplate(...),
  ChatOpenAI(...),
  StrOutputParser()
]
```

## Serving with LangServe

SKIP

# (v0.3) Build a simple LLM application with chat models and prompt templates

Following the v0.3 documentation.

ref. [Build a simple LLM application with chat models and prompt templates \| 🦜️🔗 LangChain](https://python.langchain.com/v0.3/docs/tutorials/llm_chain/)

## Using Language Models

In [28]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")

In [29]:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage("Translate the following from English into Japanese"),
    HumanMessage("hi!"),
]

model.invoke(messages)

AIMessage(content='こんにちは！', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 2, 'prompt_tokens': 20, 'total_tokens': 22, 'completion_tokens_details': {'audio_tokens': 0, 'reasoning_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_0705bf87c0', 'finish_reason': 'stop', 'logprobs': None}, id='run-d514af42-617a-466d-bd16-8d0eaf67e286-0', usage_metadata={'input_tokens': 20, 'output_tokens': 2, 'total_tokens': 22, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

### OpenAI Format

See: [OpenAI Format](https://python.langchain.com/docs/concepts/messages/#openai-format)

e.g.
```python
chat_model.invoke([
    {
        "role": "user",
        "content": "Hello, how are you?"
    },
    {
        "role": "assistant",
        "content": "I'm doing well, thank you for asking."
    },
    {
        "role": "user",
        "content": "Can you tell me a joke?"
    }
])
```

The following are equivalent:

In [30]:
model.invoke("Hello")

AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 8, 'total_tokens': 17, 'completion_tokens_details': {'audio_tokens': 0, 'reasoning_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_818c284075', 'finish_reason': 'stop', 'logprobs': None}, id='run-650eceba-5f17-4831-9b01-68b2e3c05cc3-0', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [31]:
model.invoke([{"role": "user", "content": "Hello"}])

AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 8, 'total_tokens': 17, 'completion_tokens_details': {'audio_tokens': 0, 'reasoning_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_818c284075', 'finish_reason': 'stop', 'logprobs': None}, id='run-dea23cc4-427a-4db8-b2d9-67aa14ea56b1-0', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [32]:
model.invoke([HumanMessage("Hello")])

AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 8, 'total_tokens': 17, 'completion_tokens_details': {'audio_tokens': 0, 'reasoning_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_0705bf87c0', 'finish_reason': 'stop', 'logprobs': None}, id='run-871ed99d-a848-4499-87d5-f3a9c0fbc877-0', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})
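The three calls are equivalent because each input ends up as the same single user message. The toy function below illustrates that idea with plain dictionaries; `to_role_dicts` and the `HumanMsg` dataclass are hypothetical stand-ins, and LangChain's actual conversion logic is not shown here.

```python
from dataclasses import dataclass


@dataclass
class HumanMsg:
    """Hypothetical stand-in for a human message object."""

    content: str


def to_role_dicts(model_input) -> list[dict]:
    """Normalize a string or a list of messages into role/content dicts."""
    if isinstance(model_input, str):
        # A bare string is treated as one user message.
        return [{"role": "user", "content": model_input}]
    normalized = []
    for item in model_input:
        if isinstance(item, HumanMsg):
            normalized.append({"role": "user", "content": item.content})
        elif isinstance(item, dict):
            normalized.append({"role": item["role"], "content": item["content"]})
        else:
            raise TypeError(f"unsupported message type: {type(item).__name__}")
    return normalized


forms = ["Hello", [{"role": "user", "content": "Hello"}], [HumanMsg("Hello")]]
results = [to_role_dicts(form) for form in forms]
print(all(result == results[0] for result in results))  # True
```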

### Streaming

Chat models are [Runnables](https://python.langchain.com/docs/concepts/runnables/), so they expose a standard interface that includes async and streaming modes of invocation.

Stream individual tokens from a chat model.

In [33]:
for token in model.stream(messages):
    print(token.content, end="|")

|こんにちは|！||

In [34]:
for token in model.stream([
    messages[0],
    HumanMessage("May the Force be with you"),
]):
    print(token.content, end="|")

|フォ|ース|と|共|に|あ|ら|ん|こと|を||

In [36]:
for token in model.stream([
    SystemMessage("Translate the following from English into Spanish"),
    HumanMessage("May the Force be with you"),
]):
    print(token.content, end="|")

|Que| la| Fuer|za| te| acompañ|e|.||
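The empty pieces at the start and end of each output line are chunks with empty content. As a sketch of how streamed chunks can be reassembled into the full reply, the generator below replays the chunk contents from the first streaming output; `fake_stream` and `collect` are hypothetical helpers, and no model is called.

```python
from typing import Iterable, Iterator


def fake_stream() -> Iterator[str]:
    """Replay the chunk contents seen in the first streaming output above."""
    yield from ["", "こんにちは", "！", ""]


def collect(chunks: Iterable[str]) -> str:
    """Join streamed chunk contents into the complete message text."""
    return "".join(chunks)


parts = list(fake_stream())
print("|".join(parts) + "|")  # |こんにちは|！||
print(collect(parts))  # こんにちは！
```

With a real model, the same joining step would apply to the `content` of each chunk yielded by `model.stream(messages)`.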