#### References

* https://learn.deeplearning.ai/courses/langchain/lesson/1/introduction


# LangChain for LLM Application Development


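* LangChain is an open-source framework for **developing LLM (large language model) applications**. It **combines LLMs** such as GPT-4 **with external data**.
* LangChain is available as a Python or JavaScript (TypeScript) package.
* LangChain focuses on **composition** and **modularity**. It provides modular components that can be **combined with one another or used on their own**.
* LangChain can be **applied to a wide range of use cases**, and its modular components can be combined into more complete end-to-end applications.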

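### Main components of LangChain

LangChain emphasizes flexibility and modularity. It breaks the natural language processing pipeline into separate modular components so that developers can tailor workflows to their needs. The LangChain framework can be divided into six modules, each covering a different aspect of interacting with LLMs.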

* **Models**
  - LLMs: 20+ integrations
  - Chat Models
  - Text Embedding Models: 10+ integrations
 
* **Prompts**
  - Prompt templates
  - Output Parsers: 5+ implementations
    - Retry/fixing logic
  - Example Selectors: 5+ implementations
 
* **Indexes**
  - Document loaders: 50+ implementations
  - Text Splitters: 10+ implementations
  - Vector stores: 10+ integrations
  - Retrievers: 5+ integrations/implementations
 
* **Chains**
  - Prompt + LLM + Output parsing
  - Can be used as building blocks for longer chains
  - More application specific chains: 20+ types
 
* **Agents**
  - Agent Types: 5+ types
    - Algorithms for getting LLMs to use tools
  - Agent Toolkits: 10+ implementations
    - Agents armed with specific tools for a specific application

## 01. Models, Prompts and Parsers

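This section covers models, prompts, and parsers.

* **Models** are the language models that underpin much of the application.
* **Prompts** refer to the style of creating the inputs that are passed to the model.
* **Parsers** sit on the other end: they take the model's output and parse it into a more structured format so that downstream steps can work with it.

When you build an application with LLMs, there are often reusable models. You repeatedly prompt a model and parse its output, so LangChain provides an easy set of abstractions for this kind of operation.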
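The prompt, model, and parser roles described above can be sketched in plain Python. This is only an illustration under stated assumptions, not LangChain's implementation: `fake_model` is a hypothetical stand-in that returns a canned string instead of calling an LLM, and the function names and template text are invented for the example.

```python
import json
from typing import Callable, Dict


def build_prompt(template: str, **fields: str) -> str:
    """Prompt step: fill a reusable template with task-specific values."""
    return template.format(**fields)


def fake_model(prompt: str) -> str:
    """Model step: hypothetical stand-in that returns a canned JSON string.

    A real application would send the prompt to an LLM here.
    """
    return '{"sentiment": "negative", "language": "pirate English"}'


def parse_output(raw: str) -> Dict[str, str]:
    """Parser step: turn the model's text output into structured data."""
    return json.loads(raw)


def run_pipeline(template: str, model: Callable[[str], str], **fields: str) -> Dict[str, str]:
    """Chain the three steps: prompt -> model -> parser."""
    prompt = build_prompt(template, **fields)
    return parse_output(model(prompt))


TEMPLATE = "Classify the sentiment and language of this email: {email}"
result = run_pipeline(TEMPLATE, fake_model, email="Arrr, I be fuming!")
print(result["sentiment"])
```

Keeping the three steps separate is what makes each one reusable: the same template can be filled with different emails, and the same parser can be applied to any model that returns the same format.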

### Outline

 * Direct API calls to OpenAI
 * API calls through LangChain:
   * Prompts
   * Models
   * Output parsers

## Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)

In [1]:
#!pip install python-dotenv
#!pip install openai

In [2]:
import os
import openai

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']

## Chat API : OpenAI

Let's start with a direct API call to OpenAI.

In [5]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0, # degree of randomness in the model's output (the OpenAI API accepts 0 to 2)
    )
    return response.choices[0].message.content

In [6]:
get_completion("What is 1+1?")

'1+1 equals 2.'

To motivate the LangChain abstractions for models, prompts, and parsers, suppose you receive an email from a customer in a language other than English.
To keep the example accessible, the other language used here is "English pirate language".

In [7]:
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse,\
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

In [8]:
style = """American English in a calm and respectful tone"""

In [9]:
prompt = f"""Translate the text \
that is delimited by triple backticks 
into a style that is {style}.
text: ```{customer_email}```
"""

print(prompt)

Translate the text that is delimited by triple backticks 
into a style that is American English in a calm and respectful tone.
text: ```
Arrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse,the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!
```



In [10]:
response = get_completion(prompt)

In [11]:
response

"I am really frustrated that my blender lid flew off and splattered my kitchen walls with smoothie! And to make matters worse, the warranty doesn't cover the cost of cleaning up my kitchen. I could really use your help right now, friend."

## Chat API : LangChain

Let's try doing the same task with **LangChain**.

### Model

In [12]:
#!pip install langchain
#!pip install langchain_community

In [13]:
from langchain.chat_models import ChatOpenAI

In [14]:
# LangChain's abstraction for the ChatGPT API endpoint
# temperature=0.0 minimizes the randomness of the text the LLM generates
chat = ChatOpenAI(temperature=0.0, model="gpt-3.5-turbo")
chat


ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x0000022B15EE9B10>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x0000022B160E9090>, temperature=0.0, openai_api_key='<redacted>', openai_proxy='')

### Prompt template

Prompts are a new way of programming models. A prompt refers to the style of creating the input that is passed to the model, and it is often built from multiple components. Prompt templates and example selectors provide the base classes and functions that make prompts easy to construct and work with.

We define a template string (`template_string`) and use it with LangChain's `ChatPromptTemplate` to create a prompt template.

In [15]:
template_string = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""

In [16]:
from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(template_string)

In [17]:
prompt_template.messages[0].prompt

PromptTemplate(input_variables=['style', 'text'], template='Translate the text that is delimited by triple backticks into a style that is {style}. text: ```{text}```\n')

In [18]:
prompt_template.messages[0].prompt.input_variables

['style', 'text']

The `prompt_template` above has two fields, `style` and `text`. The original template string can also be extracted from it. To translate text into a different style, we now need to define the `style` and the `text` to translate.

In [19]:
customer_style = """American English in a calm and respectful tone"""

In [20]:
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

In [21]:
customer_messages = prompt_template.format_messages(
                    style=customer_style,
                    text=customer_email)

In [22]:
print(type(customer_messages))
print(type(customer_messages[0]))

<class 'list'>
<class 'langchain_core.messages.human.HumanMessage'>


In [23]:
print(customer_messages[0])

content="Translate the text that is delimited by triple backticks into a style that is American English in a calm and respectful tone. text: ```\nArrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse, the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!\n```\n"


In [24]:
# Call the LLM to translate to the style of the customer message
customer_response = chat(customer_messages)


In [25]:
print(customer_response.content)

I am really frustrated that my blender lid flew off and splattered my kitchen walls with smoothie! And to make matters worse, the warranty doesn't cover the cost of cleaning up my kitchen. I need your help right now, friend!


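Next, suppose the customer service agent wants to reply to the customer in the customer's original language. We specify that the service message should be translated into pirate style.
That is, we want it in a polite tone that speaks in English Pirate.
Because we already created the prompt template, the nice part is that we can now **reuse that prompt template**, specifying that the desired **output style is `service_style_pirate`** and that the **text is the service reply**.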

In [26]:
service_reply = """Hey there customer, \
the warranty does not cover \
cleaning expenses for your kitchen \
because it's your fault that \
you misused your blender \
by forgetting to put the lid on before \
starting the blender. \
Tough luck! See ya!
"""

In [27]:
service_style_pirate = """a polite tone that speaks in English Pirate"""

In [28]:
service_messages = prompt_template.format_messages(
    style=service_style_pirate,
    text=service_reply)

print(service_messages[0].content)

Translate the text that is delimited by triple backticks into a style that is a polite tone that speaks in English Pirate. text: ```Hey there customer, the warranty does not cover cleaning expenses for your kitchen because it's your fault that you misused your blender by forgetting to put the lid on before starting the blender. Tough luck! See ya!
```



In [29]:
service_response = chat(service_messages)
print(service_response.content)

Ahoy there, valued customer! Regrettably, the warranty be not coverin' the cost o' cleanin' yer galley due to yer own negligence in misusin' yer blender. Ye forgot to secure the lid afore startin' the blender, savvy? 'Tis a tough break, matey! Fare thee well!


When building sophisticated applications, prompts can be quite long and detailed. Prompt templates are a useful abstraction that helps you reuse good prompts wherever you can.

<img src="why_prompt_template.PNG" width="400">
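The reuse described above can be sketched with the standard library alone. This is a rough analogy rather than how `ChatPromptTemplate` works internally: `template_fields` and `fill` are hypothetical helpers, and the template uses `<<` and `>>` as delimiters instead of triple backticks purely to keep the example short.

```python
from string import Formatter
from typing import List


def template_fields(template: str) -> List[str]:
    """Return the placeholder names of a str.format-style template, in order."""
    names: List[str] = []
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name and field_name not in names:
            names.append(field_name)
    return names


def fill(template: str, **values: str) -> str:
    """Fill the template, failing early if a placeholder has no value."""
    missing = [name for name in template_fields(template) if name not in values]
    if missing:
        raise KeyError(f"missing template fields: {missing}")
    return template.format(**values)


TEMPLATE = "Translate the text into a style that is {style}. text: <<{text}>>"

# The same template serves both directions of the conversation.
to_customer_style = fill(TEMPLATE, style="calm American English",
                         text="Arrr, I be fuming!")
to_pirate_style = fill(TEMPLATE, style="polite English Pirate",
                       text="The warranty does not cover cleaning.")

print(template_fields(TEMPLATE))
print(to_pirate_style)
```

Checking the fields before formatting mirrors the role of `input_variables` shown earlier: the template declares what it needs, so a missing value is caught before a prompt is sent.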

Another aspect of LangChain's prompt library is that it also supports parsing. When building complex applications with LLMs, you often instruct the LLM to generate its output in a specific format, for example by using specific keywords. The example below shows an LLM carrying out **chain-of-thought reasoning** with a framework called **ReAct**, using keywords such as **Thought, Action, and Observation**.
* **Thought** is what the LLM is thinking. Giving the LLM space to think helps it reach more accurate conclusions.
* **Action** is the keyword for carrying out a specific action.
* **Observation** is the keyword for showing what the LLM learned from that action.

If a prompt instructs the LLM to use specific keywords such as **Thought, Action, and Observation**, you can **pair those keywords with a parser to extract the text tagged with each keyword**.
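As a minimal sketch of keyword-based extraction, the snippet below splits a model reply into `(keyword, content)` pairs with a regular expression. It is not LangChain's parser; `parse_tagged_output` is a hypothetical helper, and `sample_output` is hand-written text assumed to follow a `Keyword: content` layout.

```python
import re
from typing import List, Tuple

KEYWORDS = ("Thought", "Action", "Observation")
_ALTERNATIVES = "|".join(KEYWORDS)

# A keyword at the start of a line, then everything up to the next keyword line
# or the end of the text.
PATTERN = re.compile(
    rf"^({_ALTERNATIVES}):\s*(.*?)(?=^(?:{_ALTERNATIVES}):|\Z)",
    re.MULTILINE | re.DOTALL,
)


def parse_tagged_output(text: str) -> List[Tuple[str, str]]:
    """Split model output into (keyword, content) pairs, in order of appearance."""
    return [(m.group(1), m.group(2).strip()) for m in PATTERN.finditer(text)]


# Hypothetical model output, written by hand for illustration.
sample_output = """\
Thought: I need to find the capital of France.
Action: search[capital of France]
Observation: Paris is the capital of France.
Thought: I now know the answer.
"""

for keyword, content in parse_tagged_output(sample_output):
    print(f"{keyword} -> {content}")
```

Because the pairs keep their order, downstream code can, for instance, take the most recent `Action` entry and dispatch it to a tool.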

<img src="prompt_output.PNG" width="400">

## Output Parsers

Let's start by defining how we would like the LLM output to look, as a dictionary or JSON like the following:

In [30]:
{
  "gift": False,
  "delivery_days": 5,
  "price_value": "pretty affordable!"
}

{'gift': False, 'delivery_days': 5, 'price_value': 'pretty affordable!'}

In [31]:
customer_review = """\
This leaf blower is pretty amazing.  It has four settings:\
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

review_template = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,\
and output them as a comma separated Python list.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

In [32]:
from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(review_template)
print(prompt_template)

input_variables=['text'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['text'], template='For the following text, extract the following information:\n\ngift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.\n\ndelivery_days: How many days did it take for the product to arrive? If this information is not found, output -1.\n\nprice_value: Extract any sentences about the value or price,and output them as a comma separated Python list.\n\nFormat the output as JSON with the following keys:\ngift\ndelivery_days\nprice_value\n\ntext: {text}\n'))]


In [33]:
messages = prompt_template.format_messages(text=customer_review)
chat = ChatOpenAI(temperature=0.0)
response = chat(messages)
print(response.content)

{
    "gift": true,
    "delivery_days": 2,
    "price_value": ["It's slightly more expensive than the other leaf blowers out there"]
}


In [34]:
type(response.content)

str

In [35]:
# Running this line raises an error because
# response.content is a string, not a dictionary,
# so it has no .get() method
response.content.get('gift')

AttributeError: 'str' object has no attribute 'get'
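The error arises because the model's reply is plain text that merely looks like a dictionary. As a minimal sketch, assuming the reply is valid JSON with no surrounding text, Python's standard `json` module can convert it. Here `raw_content` is a hand-copied stand-in for `response.content`, not a live API result.

```python
import json

# Stand-in for response.content: the JSON text shown in the output above.
raw_content = """{
    "gift": true,
    "delivery_days": 2,
    "price_value": ["It's slightly more expensive than the other leaf blowers out there"]
}"""

parsed = json.loads(raw_content)  # str -> dict; JSON true becomes Python True
print(type(parsed).__name__)
print(parsed.get("gift"))
```

This only works while the model returns clean JSON. A reply wrapped in extra prose or a markdown fence would make `json.loads` fail, which is one reason to pair explicit format instructions with an output parser.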

### Parse the LLM output string into a Python dictionary

In [36]:
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser

In [37]:
gift_schema = ResponseSchema(name="gift",
                             description="Was the item purchased\
                             as a gift for someone else? \
                             Answer True if yes,\
                             False if not or unknown.")
delivery_days_schema = ResponseSchema(name="delivery_days",
                                      description="How many days\
                                      did it take for the product\
                                      to arrive? If this \
                                      information is not found,\
                                      output -1.")
price_value_schema = ResponseSchema(name="price_value",
                                    description="Extract any\
                                    sentences about the value or \
                                    price, and output them as a \
                                    comma separated Python list.")

response_schemas = [gift_schema, 
                    delivery_days_schema,
                    price_value_schema]

In [38]:
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

In [39]:
format_instructions = output_parser.get_format_instructions()

In [40]:
print(format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"gift": string  // Was the item purchased                             as a gift for someone else?                              Answer True if yes,                             False if not or unknown.
	"delivery_days": string  // How many days                                      did it take for the product                                      to arrive? If this                                       information is not found,                                      output -1.
	"price_value": string  // Extract any                                    sentences about the value or                                     price, and output them as a                                     comma separated Python list.
}
```
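To make the printed instructions less opaque, here is a rough sketch of how such text could be assembled from a list of name and description pairs. It imitates the output above but is not the library's code: `FieldSpec` and `build_format_instructions` are hypothetical names. The sketch also collapses runs of whitespace, since descriptions written with backslash line continuations inside indented strings carry the indentation into the text, as the long gaps in the output above show.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FieldSpec:
    """Hypothetical stand-in for a response schema: a key name and its description."""
    name: str
    description: str


def build_format_instructions(fields: List[FieldSpec]) -> str:
    """Assemble instructions asking for a fenced JSON snippet with the given keys."""
    fence = "`" * 3  # three backticks, built programmatically
    body = []
    for field in fields:
        description = " ".join(field.description.split())  # collapse stray whitespace
        body.append(f'\t"{field.name}": string  // {description}')
    header = (
        "The output should be a markdown code snippet formatted in the "
        f'following schema, including the leading and trailing "{fence}json" and "{fence}":'
    )
    return "\n".join([header, "", f"{fence}json", "{", *body, "}", fence])


instructions = build_format_instructions([
    FieldSpec("gift", "Was the item purchased\
                       as a gift for someone else?"),
    FieldSpec("delivery_days", "How many days did it take for the product to arrive?"),
])
print(instructions)
```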


In [41]:
review_template_2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product\
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,\
and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""

prompt = ChatPromptTemplate.from_template(template=review_template_2)

messages = prompt.format_messages(text=customer_review, 
                                format_instructions=format_instructions)

In [42]:
print(messages[0].content)

For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the productto arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,and output them as a comma separated Python list.

text: This leaf blower is pretty amazing.  It has four settings:candle blower, gentle breeze, windy city, and tornado. It arrived in two days, just in time for my wife's anniversary present. I think my wife liked it so much she was speechless. So far I've been the only one using it, and I've been using it every other morning to clear the leaves on our lawn. It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features.


The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```

In [43]:
response = chat(messages)

In [44]:
print(response.content)

```json
{
	"gift": true,
	"delivery_days": 2,
	"price_value": ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]
}
```


In [45]:
output_dict = output_parser.parse(response.content)

In [46]:
output_dict

{'gift': True,
 'delivery_days': 2,
 'price_value': ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]}

In [47]:
type(output_dict)

dict

In [48]:
output_dict.get('delivery_days')

2
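For intuition about the parsing step, the sketch below extracts the JSON object from a fenced reply and checks that the expected keys are present. It is an approximation written for this walkthrough, not the actual `StructuredOutputParser` code: `parse_fenced_json` is a hypothetical helper, and `reply` is a hand-built stand-in for `response.content`.

```python
import json
import re
from typing import Any, Dict, List

FENCE = "`" * 3  # three backticks, built programmatically


def parse_fenced_json(text: str, expected_keys: List[str]) -> Dict[str, Any]:
    """Extract the JSON object from a fenced json snippet and check its keys."""
    match = re.search(rf"{FENCE}(?:json)?\s*(.*?){FENCE}", text, re.DOTALL)
    payload = match.group(1) if match else text  # fall back to the raw text
    data = json.loads(payload.strip())
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise ValueError(f"model output is missing keys: {missing}")
    return data


# Hand-built stand-in for the fenced reply printed earlier.
reply = f"""{FENCE}json
{{
    "gift": true,
    "delivery_days": 2,
    "price_value": ["It's slightly more expensive than the other leaf blowers out there"]
}}
{FENCE}"""

output = parse_fenced_json(reply, ["gift", "delivery_days", "price_value"])
print(output["delivery_days"])
```

Validating the keys right after parsing surfaces a malformed reply at the boundary, before any downstream code relies on a field that is not there.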