# LangChain: Models, Prompts and Output Parsers
## Objective
- To demonstrate a maintainable and scalable approach to building a basic chatbot with LangChain.
- To compare LangChain's prompt templates and output parsers hands-on with direct OpenAI API calls.

## Outline

 * Direct API calls to OpenAI
 * API calls through LangChain:
   * Prompts
   * Models
   * Output parsers
   
## Edge of LangChain over OpenAI
- Prompt input variables are declared in a template instead of being spliced in through text concatenation
- Output parsers turn the model's reply into JSON with consistent keys and values

In [1]:
!pip install -q python-dotenv
!pip install -q openai==0.28.1
!pip install -q --upgrade langchain

## Get your [OpenAI API Key](https://platform.openai.com/account/api-keys) & Model

In [2]:
import os
import openai
from dotenv import load_dotenv, find_dotenv

# Get OpenAI API key
try:
    _ = load_dotenv(find_dotenv()) # read local .env file
    openai.api_key = os.environ['OPENAI_API_KEY']
except KeyError:  # OPENAI_API_KEY is not set in the environment or .env file
    openai.api_key = 'your openai key'

# Get OpenAI Model type
llm_model = "gpt-3.5-turbo"

def get_completion(prompt, model=llm_model):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, 
    )
    return response.choices[0].message["content"]

Note: LLMs do not always produce the same results. When you run the code in your notebook, you may get slightly different answers.

# Use case 1: Prompt variables defined through the LangChain library instead of text concatenation

Provided with 
- a customer complaint email, and 
- the target style, 

the LLM will rephrase the customer complaint email in the given respectful style.

In [3]:
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse,\
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

style = """American English \
in a calm and respectful tone
"""

## Chat API : OpenAI

Let's start with a direct API call to OpenAI.

In [5]:
OpenAI_prompt = f"""Translate the text \
that is delimited by triple backticks 
into a style that is {style}.
text: ```{customer_email}```
"""

print(OpenAI_prompt)

Translate the text that is delimited by triple backticks 
into a style that is American English in a calm and respectful tone
.
text: ```
Arrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse,the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!
```



In [6]:
response = get_completion(OpenAI_prompt)
response

"I am really frustrated that my blender lid flew off and splattered my kitchen walls with smoothie! And to make matters worse, the warranty doesn't cover the cost of cleaning up my kitchen. I need your help right now, friend!"

## Chat API : LangChain

Let's see how to do the same using LangChain.

__Edge of LangChain over OpenAI__:
- Input variables are declared in the template instead of being concatenated into the text

### Model

In [7]:
from langchain.chat_models import ChatOpenAI

# temperature controls the randomness of the generated text;
# 0.0 makes the output as deterministic as possible
chat = ChatOpenAI(temperature=0.0, model=llm_model)


### Prompt template

In [8]:
LangChain_prompt = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""

from langchain.prompts import ChatPromptTemplate
LangChain_template = ChatPromptTemplate.from_template(LangChain_prompt)
customer_messages = LangChain_template.format_messages(
                    style=style,
                    text=customer_email)

# Call the LLM to translate the customer message into the requested style
customer_response = chat(customer_messages)
print(customer_response.content)


I'm really frustrated that my blender lid flew off and splattered my kitchen walls with smoothie! And to make matters worse, the warranty doesn't cover the cost of cleaning up my kitchen. I could really use your help right now, friend.


In [9]:
LangChain_template.messages[0].prompt.input_variables

['style', 'text']

- The same LangChain template can be reused directly for other emails and styles, since both are input variables of the template.
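
The reuse described above can be illustrated with a plain-Python analogy. This is only a sketch built on `str.format`, not LangChain's implementation; in the notebook, `ChatPromptTemplate.format_messages` plays this role. The second email and style are invented for illustration.

```python
# Plain-Python analogy for template reuse (not LangChain's implementation).
template = (
    "Translate the text that is delimited by triple quotes "
    "into a style that is {style}. text: '''{text}'''"
)

# Hypothetical second email and style, made up for this example.
other_email = "The package arrived late and the box was damaged."
other_style = "British English in a formal tone"

# The same template is filled in twice with different values.
first_prompt = template.format(
    style="American English in a calm and respectful tone",
    text="Arrr, I be fuming that me blender lid flew off!",
)
second_prompt = template.format(style=other_style, text=other_email)

print(first_prompt)
print(second_prompt)
```

The template string is written once and the variables are supplied at call time, which is the property the LangChain template relies on.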

# Use case 2: Output parsers that return JSON with specific key-value pairs, instead of json.loads(json_text)

Provided with 
- a customer review, and 
- a review extraction template,

the LLM will produce its output in JSON format instead of as a plain string.

In [10]:
customer_review = """\
This leaf blower is pretty amazing.  It has four settings: \
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

# with input variable "text" at the end of the review_template
review_template = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

In [11]:
from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(review_template)
messages = prompt_template.format_messages(text=customer_review)
chat = ChatOpenAI(temperature=0.0, model=llm_model)
response = chat(messages)
print(response.content)

{
  "gift": true,
  "delivery_days": 2,
  "price_value": ["It's slightly more expensive than the other leaf blowers out there"]
}


In [12]:
type(response.content)

str

The LLM returns the JSON as a string. We can use json.loads(json_text) to convert it into a Python dictionary.
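
As a minimal sketch, the string printed by the model above can be loaded with the standard library. The variable names `json_text` and `review` are illustrative.

```python
import json

# The string printed by the model in the cell above.
json_text = """{
  "gift": true,
  "delivery_days": 2,
  "price_value": ["It's slightly more expensive than the other leaf blowers out there"]
}"""

review = json.loads(json_text)

print(type(review))             # <class 'dict'>
print(review["gift"])           # True (JSON true becomes Python True)
print(review["delivery_days"])  # 2
```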

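However, the output format and values may deviate from what the prompt asks for, because the model does not always follow instructions exactly, 
e.g.
- a boolean literal that differs from the one requested (`true` versus `True`); only lowercase `true` is valid JSON
- single quotes instead of the double quotes that JSON requires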
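
A minimal sketch, using only the standard library, of which variants `json.loads` accepts and which it rejects. The sample strings and the helper `try_parse` are made up for illustration.

```python
import json

def try_parse(label, text):
    """Return the parsed value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        print(f"{label}: rejected ({err.msg})")
        return None

valid = try_parse("json literal", '{"gift": true}')
python_bool = try_parse("python literal", '{"gift": True}')
single_quotes = try_parse("single quotes", "{'gift': true}")
with_prose = try_parse("extra prose", 'Here is the JSON: {"gift": true}')

print(valid)  # {'gift': True}
```

Only the first string parses; the other three raise `json.JSONDecodeError`, which is why a pipeline that calls `json.loads` directly on model output needs error handling.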

Instead, we can use __langchain.output_parsers__, which lets us define a __ResponseSchema__ for each JSON key, so that the output is more likely to follow the format we defined and can be parsed reliably.

### Parse the LLM output string into a Python dictionary

In [13]:
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser

In [14]:
gift_schema = ResponseSchema(name="gift",
                             description="Was the item purchased\
                             as a gift for someone else? \
                             Answer True if yes,\
                             False if not or unknown.")
delivery_days_schema = ResponseSchema(name="delivery_days",
                                      description="How many days\
                                      did it take for the product\
                                      to arrive? If this \
                                      information is not found,\
                                      output -1.")
price_value_schema = ResponseSchema(name="price_value",
                                    description="Extract any\
                                    sentences about the value or \
                                    price, and output them as a \
                                    comma separated Python list.")

response_schemas = [gift_schema, 
                    delivery_days_schema,
                    price_value_schema]

output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"gift": string  // Was the item purchased                             as a gift for someone else?                              Answer True if yes,                             False if not or unknown.
	"delivery_days": string  // How many days                                      did it take for the product                                      to arrive? If this                                       information is not found,                                      output -1.
	"price_value": string  // Extract any                                    sentences about the value or                                     price, and output them as a                                     comma separated Python list.
}
```
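
The long runs of spaces in the descriptions above come from the backslash line continuations inside the string literals: the indentation of each continued line becomes part of the string. A small sketch of the effect, with adjacent string literals as one possible alternative (the variable names are illustrative):

```python
# Backslash continuation inside a string literal keeps the next line's indentation.
padded = "How many days\
          did it take for the product\
          to arrive?"

# Adjacent string literals are concatenated without extra whitespace.
compact = ("How many days "
           "did it take for the product "
           "to arrive?")

print(repr(padded))
print(repr(compact))
```

The extra whitespace is sent to the model as part of the prompt. It is usually harmless, but it adds tokens and makes the printed instructions harder to read.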


In [15]:
review_template_2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""

prompt = ChatPromptTemplate.from_template(template=review_template_2)
messages = prompt.format_messages(text=customer_review, 
                                format_instructions=format_instructions)

In [16]:
response = chat(messages)
print(response.content)

```json
{
	"gift": true,
	"delivery_days": 2,
	"price_value": ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]
}
```


Text in Prompt:
```
gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.
```
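However, the value of key "gift" is the JSON literal `true` rather than the `True` requested in the prompt, and the whole reply is still a string wrapped in a markdown code fence, so it must be parsed before the boolean can be used in Python.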

In [17]:
output_dict = output_parser.parse(response.content)

In [18]:
output_dict

{'gift': True,
 'delivery_days': 2,
 'price_value': ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]}

In [19]:
type(output_dict)

dict

After applying the __output_parser__, the output is finally in the format we want: a Python dict with a real boolean and an integer.
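
Conceptually, the parsing step amounts to removing the markdown fence and loading the JSON inside it. The sketch below is a rough standard-library approximation under that assumption, not LangChain's actual code; `parse_fenced_json` is a hypothetical helper and the sample reply is shortened.

```python
import json

FENCE = "`" * 3  # built at runtime so this example does not contain a literal fence

def parse_fenced_json(text):
    """Hypothetical helper: extract and parse the JSON inside a fenced block."""
    start_marker = FENCE + "json"
    if start_marker in text:
        # Keep only what lies between the opening and closing fences.
        text = text.split(start_marker, 1)[1].split(FENCE, 1)[0]
    return json.loads(text.strip())

# Shortened stand-in for the model reply shown earlier.
raw_output = (
    FENCE + "json\n"
    '{\n\t"gift": true,\n\t"delivery_days": 2,\n\t"price_value": ["worth it"]\n}\n'
    + FENCE
)

parsed = parse_fenced_json(raw_output)
print(parsed["gift"], type(parsed["gift"]))
```

Once the reply is a dict, downstream code can branch on `parsed["gift"]` or do arithmetic on `parsed["delivery_days"]` without further string handling.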

# Summary

## 1. Code Scalability

For building prompts and passing variables into them, LangChain adopts a different approach that scales better.

### ChatGPT Approach
When calling the ChatGPT API directly, the prompt is built with f-strings, which can become cumbersome, especially in complex scenarios involving multiple variables: every variable must already be defined at the point where the string is created. For example:

```python
prompt_GPT = f"""xxx {style}.\
text: '''{customer_email}'''
"""
```

### LangChain Approach
LangChain's `ChatPromptTemplate` reads the `{placeholder}` names from a plain string and records them as input variables, eliminating the need for f-strings. The template can be filled in later and reused with different values, which improves code readability and maintainability, particularly in large-scale projects.

```python
from langchain.prompts import ChatPromptTemplate

prompt_LC = """xxx {style}.\
text: '''{customer_email}'''
"""
prompt_template = ChatPromptTemplate.from_template(prompt_LC)
print(prompt_template.messages[0].prompt.input_variables)
# ['style', 'customer_email']
```
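
To make the idea of input variables concrete, here is a standard-library sketch that lists the placeholders of a plain string with `string.Formatter`. It illustrates the concept only and is not a claim about how LangChain extracts them.

```python
from string import Formatter

prompt_text = "xxx {style}. text: '''{customer_email}'''"

# Formatter().parse yields (literal_text, field_name, format_spec, conversion).
input_variables = [
    field_name
    for _, field_name, _, _ in Formatter().parse(prompt_text)
    if field_name
]
print(input_variables)  # ['style', 'customer_email']

# Filling the template without all variables fails loudly instead of silently.
try:
    prompt_text.format(style="a calm tone")
except KeyError as missing:
    print("missing variable:", missing)
```

Knowing the variable names up front lets a pipeline check that every required value is supplied before any API call is made.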

## 2. Output Parsers Ability for JSON Format

### Direct Prompt Output Approach

Despite defining the JSON key-value pairs with clear guidelines, the output string from the LLM may still differ from what you asked for, e.g. `true` instead of `True` for the key "gift" in the example above.

Also, __from one year of experience maintaining an LLM output pipeline__, other challenges arise due to language model hallucination: inconsistent JSON formatting, key misalignment, mixed quotation styles, or even no JSON output at all.
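
One way to guard a pipeline against these failure modes is to validate every reply before using it. The following is a defensive sketch using only the standard library; `validate_review` and `EXPECTED_KEYS` are hypothetical names, and the sample replies are invented.

```python
import json

EXPECTED_KEYS = {"gift", "delivery_days", "price_value"}

def validate_review(text):
    """Hypothetical check: parse text and confirm the expected keys are present.

    Returns (data, None) on success or (None, reason) on failure.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None, "not valid JSON"
    if not isinstance(data, dict):
        return None, "not a JSON object"
    missing = EXPECTED_KEYS - set(data)
    if missing:
        return None, f"missing keys: {sorted(missing)}"
    return data, None

good, good_err = validate_review('{"gift": true, "delivery_days": 2, "price_value": []}')
bad, bad_err = validate_review('{"gift": true, "days": 2}')
prose, prose_err = validate_review("Sure! The item was a gift.")

print(good_err, bad_err, prose_err, sep=" | ")
```

Returning a reason string rather than raising makes it easy to log bad replies and retry the request.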

### LangChain Output Handling

LangChain offers __ResponseSchema__ and __StructuredOutputParser__, which provide more robust output handling, including defining schemas for JSON key-value pairs. The parser generates format instructions that are added to the prompt and then parses the model's fenced JSON reply into a Python dictionary. This setup gives more consistent output formatting, which makes integration within pipelines easier.


Example:
```python
response_schemas = [gift_schema, 
                    delivery_days_schema,
                    price_value_schema]

output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
```

## Conclusion
In conclusion, while both direct ChatGPT API calls and LangChain can drive text processing, LangChain's prompt templates with declared input variables and its structured output handling make it a strong choice for large-scale projects and for maintaining pipeline integrity. By addressing challenges such as inconsistent JSON formatting and key misalignment, LangChain makes LLM output easier to integrate into text processing pipelines.