# Topic 2: LangChain & Prompt Engineering

**Learning Objectives:**

* Learn how to use a LangChain pipeline.
* Structure LLM output for prompt chaining.
* Understand Chain-of-Thought (CoT) prompting.
* Use CoT prompting for various tasks.

**Outline:**

1. **LangChain**
2. **Prompt Chaining**
3. **Chain-of-Thought**
4. **Assignment 2**

## 1. LangChain
LangChain is a framework for developing applications powered by large language models (LLMs).

*   **LangChain module**: Includes chains, agents, and retrieval strategies that make up an application's cognitive architecture.
*   **LangGraph**: Build multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.
*   **LangServe**: Deploy LangChain chains as REST APIs.
*   **LangSmith**: A developer platform that lets developers debug, test, evaluate, and monitor LLM applications.

You can check the LangChain docs for more information. ([link](https://python.langchain.com/docs/introduction/))


### Environment Setup

In [None]:
# Install langchain
!pip install -qU langchain

In [None]:
# Install langchain-openai
!pip install -qU langchain-openai

**Set API key**

In [None]:
# Set API key
OPENAI_API_KEY="YOUR_API_KEY_HERE"
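
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the key has already been exported as an environment variable named `OPENAI_API_KEY` (the helper `load_api_key` is a hypothetical name, not part of LangChain):

```python
import os


def load_api_key(name="OPENAI_API_KEY", env=None):
    """Return the API key stored under `name`, or raise if it is missing."""
    # `env` defaults to the process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running the notebook.")
    return key


# Usage (assumes the variable is already set in the environment):
# OPENAI_API_KEY = load_api_key()
```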

### Using Language Models

In [None]:
# Prepare model
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0.0, api_key=OPENAI_API_KEY)

You can use the model directly by passing a list of messages to the `.invoke` method.

In [None]:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content="Translate the following from English into Korean"),
    HumanMessage(content="hi!"),
]

model.invoke(messages)

### OutputParsers
The model's response contains the reply text along with metadata about the response.
A LangChain output parser can extract just the text.

In [None]:
# Import simple output parser
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

In [None]:
# Pass the response to the parser
result = model.invoke(messages)
parser.invoke(result)

### Prompt Templates
Prompt templates are a LangChain concept designed to help transform raw user input into a list of messages ready to pass to the language model.

In [None]:
from langchain_core.prompts import ChatPromptTemplate

system_template = "Translate the following into {language}:"

prompt_template = ChatPromptTemplate.from_messages(
    [("system", system_template), ("user", "{text}")]
)

# Pass the dictionary input to the prompt template
result = prompt_template.invoke({"language": "English", "text": "좋은 하루 되세요!"})

result
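
Conceptually, the template fills each placeholder with the matching dictionary value and produces one message per (role, text) pair. The plain-Python sketch below illustrates that idea without using LangChain; `format_messages` is a hypothetical helper, and the real `ChatPromptTemplate` returns a prompt value object rather than a list of tuples:

```python
def format_messages(message_templates, values):
    """Fill the {placeholders} of each (role, template) pair with `values`."""
    return [(role, template.format(**values)) for role, template in message_templates]


message_templates = [
    ("system", "Translate the following into {language}:"),
    ("user", "{text}"),
]

filled = format_messages(
    message_templates, {"language": "English", "text": "좋은 하루 되세요!"}
)
print(filled)
```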

### Chaining
LangChain can "chain" the model and the output parser. The `|` operator is used in LangChain to combine elements together.

In [None]:
# Create chain
chain = model | parser

# Use chain
messages = [
    SystemMessage(content="Translate the following into English:"),
    HumanMessage(content="좋은 하루 되세요!"),
]
chain.invoke(messages)

We can also put the prompt template at the front of the chain.

In [None]:
chain = prompt_template | model | parser

chain.invoke({"language": "English", "text": "좋은 하루 되세요!"})

LangChain Expression Language (LCEL) is a declarative way to chain LangChain components.
![LCEL](https://raw.githubusercontent.com/GSDSAML/prompt-engineering-2024fall/main/lcel.png)
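
The `|` syntax is ordinary Python operator overloading. The sketch below is not LangChain's implementation; it uses a hypothetical `Step` class and stand-in functions to show how `a | b` can build a new object that feeds the output of `a` into `b`:

```python
class Step:
    """Wrap a function so that steps can be composed with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b returns a new Step that runs a first, then passes its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for the prompt template, model, and parser (no API call is made).
demo_prompt = Step(lambda d: f"Translate the following into {d['language']}: {d['text']}")
demo_model = Step(lambda text: {"content": f"[model reply to: {text}]", "metadata": {}})
demo_parser = Step(lambda message: message["content"])

demo_chain = demo_prompt | demo_model | demo_parser
print(demo_chain.invoke({"language": "English", "text": "hi!"}))
```

Each stand-in accepts whatever the previous one returns, which is the same constraint the real components have to satisfy.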

The input and output types vary by component:

| Component     | Input Type                                      | Output Type                    |
|---------------|-------------------------------------------------|---------------------------------|
| Prompt        | Dictionary                                       | PromptValue                    |
| ChatModel     | Single string, list of chat messages or a PromptValue | ChatMessage                    |
| LLM           | Single string, list of chat messages or a PromptValue | String                         |
| OutputParser  | The output of an LLM or ChatModel                | Depends on the parser           |
| Retriever     | Single string                                    | List of Documents               |
| Tool          | Single string or dictionary, depending on the tool | Depends on the tool             |


## 2. Prompt Chaining
To improve the reliability and performance of LLMs, a complex task should be split into subtasks. The LLM is prompted with one subtask, and its response is then used as input to the next prompt.

With prompt chaining, you can debug a complex LLM task by inspecting the output generated for each subtask.

This example uses two LLM calls: the first generates a topic, and the second generates a blog post title about that topic.

In [None]:
from langchain_core.prompts import PromptTemplate

# Prompt template for the first step: Generate a topic
topic_template = PromptTemplate(
    input_variables=["domain"],
    template="Generate a topic related to {domain}.",
)

# Prompt template for the second step: Generate a blog post title
title_template = PromptTemplate(
    input_variables=["topic"],
    template="Generate a blog post title about {topic}.",
)

# Chain the prompts together
chain = topic_template | model | parser | title_template | model | parser

# Run the chain
result = chain.invoke({"domain": "artificial intelligence"})

print(result)

In [None]:
# Compare with the single LLM
prompt_template = PromptTemplate(
    input_variables=["domain"],
    template="Generate a topic related to {domain}. Then, generate a blog post title about the topic you generated."
)


chain = prompt_template | model | parser


result = chain.invoke({"domain": "artificial intelligence"})

print(result)

Another example writes a short synopsis of a story in a given genre.

In [None]:
# Prompt template for the first step: Generate a creative story idea
story_idea_template = PromptTemplate(
    input_variables=["genre"],
    template="Generate a creative story idea within the genre of {genre}.",
)

# Prompt template for the second step: Develop the story's main characters
characters_template = PromptTemplate(
    input_variables=["story_idea"],
    template="Based on the story idea: {story_idea}, develop 3 main characters with distinct personalities and backstories.",
)

# Prompt template for the third step: Write a short synopsis of the story
synopsis_template = PromptTemplate(
    input_variables=["story_idea", "characters"],
    template="Given the story idea: {story_idea} and the characters: {characters}, write a short synopsis of the story, highlighting the plot and conflict.",
)

# Chain the prompts
chain1 = (
    story_idea_template
    | model
    | parser
)

chain2 = (
    characters_template
    | model
    | parser
)

chain3 = (
    synopsis_template
    | model
    | parser
)

story_idea = chain1.invoke({"genre": "science fiction"})
characters = chain2.invoke({"story_idea": story_idea})
result = chain3.invoke({"story_idea": story_idea, "characters": characters})

print(result)


In [None]:
# Single LLM version
prompt_template = PromptTemplate(
    input_variables=["genre"],
    template="""Generate a creative story idea within the genre of {genre}.
    Based on the story idea, develop 3 main characters with distinct personalities and backstories.
    Given the story idea and the characters, write a short synopsis of the story, highlighting the plot and conflict.
    """
)

chain = prompt_template | model | parser

result = chain.invoke({"genre": "science fiction"})

print(result)

We can use RunnableParallel to execute multiple Runnables in parallel, and to return the output of these Runnables as a map.

The example below uses RunnableParallel to generate pros and cons for a topic in parallel and then merges them into a single paragraph.

In [None]:
from langchain_core.runnables import RunnableParallel

# Prompt template for generating pros
pros_template = PromptTemplate(
    input_variables=["topic"],
    template="Generate 3 pros for the topic: {topic}.",
)

# Prompt template for generating cons
cons_template = PromptTemplate(
    input_variables=["topic"],
    template="Generate 3 cons for the topic: {topic}.",
)

# Create chains for pros and cons
pros_chain = pros_template | model | parser
cons_chain = cons_template | model | parser

# Combine the two chains in parallel
combined_chain = RunnableParallel(pros=pros_chain, cons=cons_chain)

# Create a chain to merge the results
merge_chain = (
    PromptTemplate(
        input_variables=["pros", "cons"],
        template="""
        Here are the pros: {pros}
        Here are the cons: {cons}
        Merge the pros and cons into a single paragraph summarizing the topic.
        """
    )
    | model
    | parser
)

# Combine the parallel chain with the merge chain
full_chain = combined_chain | merge_chain

# Run the chain
result = full_chain.invoke({"topic": "Artificial Intelligence"})

print(result)
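
The fan-out and merge pattern itself does not depend on LangChain. A rough sketch using only the standard library, where `run_parallel` and the two stand-in functions are hypothetical names and no model is called:

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(steps, value):
    """Run every function in `steps` on the same input and return a dict of results."""
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = {name: pool.submit(func, value) for name, func in steps.items()}
        return {name: future.result() for name, future in futures.items()}


def fake_pros(inputs):
    # Stand-in for pros_chain.
    return f"pros of {inputs['topic']}"


def fake_cons(inputs):
    # Stand-in for cons_chain.
    return f"cons of {inputs['topic']}"


branches = run_parallel({"pros": fake_pros, "cons": fake_cons}, {"topic": "AI"})
print(branches)  # the keys line up with the {pros} and {cons} placeholders of the merge prompt
```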


## 3. Chain-of-Thought

Chain-of-Thought (CoT) prompting lets the LLM construct an entire logical argument, including premises and a conclusion.

Models from GPT-4 onward tend to generate Chain-of-Thought reasoning by default, so we will compare the default behavior with a prompt that says "do not explain".

In [None]:
cot_prompt_template = PromptTemplate(
    input_variables=["question"],
    template="""
    Question: {question}
    Answer:
    """
)

no_cot_prompt_template = PromptTemplate(
    input_variables=["question"],
    template="""
    Question: {question}
    Do not explain, just answer.
    Answer:
    """
)


cot_chain = cot_prompt_template | model | parser
no_cot_chain = no_cot_prompt_template | model | parser


question = "Darrell and Allen's ages are in the ratio of 7:11. If their total age now is 162, calculate Allen's age 10 years from now."

cot_result = cot_chain.invoke({"question": question})
no_cot_result = no_cot_chain.invoke({"question": question})

print(f"Chain-of-Thought Result:\n{cot_result}\n")
print(f"No Chain-of-Thought Result:\n{no_cot_result}\n")
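
To judge which of the two outputs is correct, it helps to have a reference answer. The arithmetic for this question can be checked directly:

```python
total_age = 162
darrell_parts, allen_parts = 7, 11

# The ratio 7:11 splits the total age into 18 equal parts.
one_part = total_age // (darrell_parts + allen_parts)
allen_now = allen_parts * one_part
allen_in_10_years = allen_now + 10

print(allen_now, allen_in_10_years)  # 99 109
```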


CoT can be combined with few-shot prompting to get better results on more complex tasks that require reasoning before responding.

In [None]:
# Provide the cipher puzzle without examples
cipher_puzzle_template = PromptTemplate(
    input_variables=["cipher_text"],
    template="""
    Decode following text:
    {cipher_text}
    """,
)


cipher_chain = cipher_puzzle_template | model | parser

result = cipher_chain.invoke({"cipher_text": "oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz"})

print(result)

In [None]:
# Provide the cipher puzzle with example, without reasoning
cipher_puzzle_template = PromptTemplate(
    input_variables=["cipher_text"],
    template="""
    oyfjdnisdr rtqwainr acxz mynzbhhx -> Think step by step

    Use the example above to decode:
    {cipher_text}
    """,
)


cipher_chain = cipher_puzzle_template | model | parser

result = cipher_chain.invoke({"cipher_text": "oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz"})

print(result)

### Exercise: Write the reasoning for the few-shot example on your own

In [None]:
# Provide the cipher puzzle with example and reasoning

cipher_puzzle_template = PromptTemplate(
    input_variables=["cipher_text"],
    ########### Modify here #############
    template="""
    Decode following text:

    Example 1:
    Cipher text: oyfjdnisdr rtqwainr acxz mynzbhhx
    Reasoning:

    Cipher text: {cipher_text}
    Reasoning:
    """,
    #####################################
)


cipher_chain = cipher_puzzle_template | model | parser

result = cipher_chain.invoke({"cipher_text": "oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz"})

print(result)


<details>
<summary>Click here for the sample prompt!</summary>

```
Break down the ciphertext into pairs:
    1. First word: "oyfjdnisdr"
      a. Pairs: oy, fj, dn, is, dr
      b. Decoded letters:
        oy → (15+25)/2 = 20 → T
        fj → (6+10)/2 = 8 → H
        dn → (4+14)/2 = 9 → I
        is → (9+19)/2 = 14 → N
        dr → (4+18)/2 = 11 → K
      c. Decoded word: THINK
    2. Second word: "rtqwainr"
      a. Pairs: rt, qw, ai, nr
      b. Decoded letters:
        rt → S
        qw → T
        ai → E
        nr → P
      c. Decoded word: STEP
    3. Third word: "acxz"
      a. Pairs: ac, xz
      b. Decoded letters:
        ac → B
        xz → Y
      c. Decoded word: BY
    4. Fourth word: "mynzbhhx"
      a. Pairs: my, nz, bh, hx
      b. Decoded letters:
        my → S
        nz → T
        bh → E
        hx → P
      c. Decoded word: STEP
```

</details>
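
The rule used in the sample reasoning (average the alphabet positions of each letter pair) can be checked with a short script. This is a sketch that assumes lowercase a-z input and words of even length; `decode_pair_average` is a hypothetical helper name:

```python
def decode_pair_average(cipher_text):
    """Decode text in which each letter pair averages to one plaintext letter."""
    decoded_words = []
    for word in cipher_text.split():
        letters = []
        for i in range(0, len(word), 2):
            # Alphabet positions: a=1, ..., z=26.
            first = ord(word[i]) - 96
            second = ord(word[i + 1]) - 96
            position = (first + second) // 2
            letters.append(chr(position + 64))  # 1 -> 'A', ..., 26 -> 'Z'
        decoded_words.append("".join(letters))
    return " ".join(decoded_words)


print(decode_pair_average("oyfjdnisdr rtqwainr acxz mynzbhhx"))  # THINK STEP BY STEP
```

Running the same function on the puzzle text gives a reference to compare against the model's answer.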

# Assignment 2: Python Code Generation

For the given tasks, use Chain-of-Thought prompting to guide the LLM in generating Python code.

The code **must include test code** with **at least three examples** to verify that the functions work correctly.

You only need to complete **4 tasks out of the total 6 tasks**.

In [None]:
# Install langchain
!pip install -qU langchain

# Install langchain-openai
!pip install -qU langchain-openai

In [None]:
# Set API key
OPENAI_API_KEY="YOUR_API_KEY_HERE"

# Prepare model
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate

model = ChatOpenAI(model="gpt-4o-mini", temperature=0.0, api_key=OPENAI_API_KEY)

# Import simple output parser
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

## Example Task 1: Palindrome Checker

Write a Python function `is_palindrome(s)` that checks whether a given string `s` is a palindrome. Ignore spaces, punctuation, and case.

In [None]:
prompt_template = PromptTemplate(
    input_variables=["problem_statement", "cot"],
    template="""
    Generate a python code for the following problem statement:

    {problem_statement}

    {cot}
    """
)

chain = prompt_template | model | parser


problem_statement = """
A Python function is_palindrome(s) that checks whether a given string s is a palindrome. Ignore spaces, punctuation, and case.
"""

################ TODO: Modify here ################

cot = """
You should define the function first.
In the function, normalize the input string and compare it with its reverse.
After defining the function, write test code with at least three examples.
"""

###################################################

result = chain.invoke({"problem_statement": problem_statement, "cot": cot})

print(result)

Then paste the generated code into the cell below and run it to validate it.

In [None]:
################ TODO: Paste the code here ################

def is_palindrome(s):
    # Normalize the string: remove punctuation, spaces, and convert to lowercase
    normalized_str = ''.join(char.lower() for char in s if char.isalnum())

    # Check if the normalized string is equal to its reverse
    return normalized_str == normalized_str[::-1]

# Test cases
if __name__ == "__main__":
    test_cases = [
        "A man, a plan, a canal, Panama!",
        "Was it a car or a cat I saw?",
        "No 'x' in Nixon",
        "Hello, World!"
    ]

    for test in test_cases:
        result = is_palindrome(test)
        print(f"'{test}' -> {result}")
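
The generated tests above only print results, so a wrong answer would still run without complaint. A sketch of assertion-based checks, assuming the same `is_palindrome` implementation as in the cell above (the expected values are written by hand):

```python
def is_palindrome(s):
    # Same normalization as above: keep letters and digits only, lowercased.
    normalized_str = ''.join(char.lower() for char in s if char.isalnum())
    return normalized_str == normalized_str[::-1]


expected = {
    "A man, a plan, a canal, Panama!": True,
    "Was it a car or a cat I saw?": True,
    "No 'x' in Nixon": True,
    "Hello, World!": False,
}

for text, want in expected.items():
    assert is_palindrome(text) == want, f"unexpected result for {text!r}"
print("all checks passed")
```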

## Example Task 2: Prime Numbers in a Range

Create a Python function `find_primes(a, b)` that returns a list of all prime numbers between two integers `a` and `b` (inclusive).

In [None]:
prompt_template = PromptTemplate(
    input_variables=["problem_statement", "cot"],
    template="""
    Generate a python code for the following problem statement:

    {problem_statement}

    {cot}
    """
)

chain = prompt_template | model | parser


problem_statement = """
A Python function find_primes(a, b) that returns a list of all prime numbers between two integers a and b (inclusive).
"""

################ TODO: Modify here ################

cot = """
First, define a function is_prime() that checks whether a given number is prime.
Then, define a function find_primes() that finds all prime numbers between the two given integers (inclusive).
After defining the functions, write test code with at least three examples.
"""

###################################################

result = chain.invoke({"problem_statement": problem_statement, "cot": cot})

print(result)

In [None]:
################ TODO: Paste the code here ################
def is_prime(n):
    """Check if a number is prime."""
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

def find_primes(a, b):
    """Return a list of all prime numbers between two integers a and b (inclusive)."""
    primes = []
    for num in range(a, b + 1):
        if is_prime(num):
            primes.append(num)
    return primes

# Test code
if __name__ == "__main__":
    # Test case 1: Primes between 10 and 30
    print("Primes between 10 and 30:", find_primes(10, 30))

    # Test case 2: Primes between 1 and 20
    print("Primes between 1 and 20:", find_primes(1, 20))

    # Test case 3: Primes between 50 and 60
    print("Primes between 50 and 60:", find_primes(50, 60))

## Task 1: Anagram Grouping

Write a Python function `group_anagrams(words)` that takes a list of strings and groups them into anagrams.

Return a list of lists, where each sublist contains words that are anagrams of each other.

Ex) For the input `["cat", "act", "not", "ate", "ton", "tea"]`, the output will be:
```
[['cat', 'act'], ['not', 'ton'], ['ate', 'tea']]
```

In [None]:
prompt_template = PromptTemplate(
    input_variables=["problem_statement", "cot"],
    template="""
    Generate a python code for the following problem statement:

    {problem_statement}

    {cot}
    """
)

chain = prompt_template | model | parser


problem_statement = """
A Python function group_anagrams(words) that takes a list of strings and groups them into anagrams.
Return a list of lists, where each sublist contains words that are anagrams of each other.
"""

################ TODO: Modify here ################

cot = """
"""

###################################################

result = chain.invoke({"problem_statement": problem_statement, "cot": cot})

print(result)

In [None]:
################ TODO: Paste the code here ################

## Task 2: Word Ladder Length

Given two words, `beginWord` and `endWord`, and a dictionary `wordList`, write a Python function `ladder_length(beginWord, endWord, wordList)` that returns the length of the shortest transformation sequence and the path from `beginWord` to `endWord`.

Each transformed word must exist in `wordList`, and only one letter can be changed at a time.

If no such sequence exists, return 0.

Ex) Input: wordList = `{ABCD, EBAD, EBCD, XYZA}`, beginWord = `ABCV`, endWord = `EBAD`

Output: Length is 4, the path is `ABCV – ABCD – EBCD – EBAD`

In [None]:
prompt_template = PromptTemplate(
    input_variables=["problem_statement", "cot"],
    template="""
    Generate a python code for the following problem statement:

    {problem_statement}

    {cot}
    """
)

chain = prompt_template | model | parser


problem_statement = """
Given two words and a dictionary, write a Python function ladder_length() that returns the length of the shortest transformation sequence and the path from begin word to end word.
Each transformed word must exist in the dictionary, and only one letter can be changed at a time.
If no such sequence exists, return 0.
"""

################ TODO: Modify here ################

cot = """
"""

###################################################

result = chain.invoke({"problem_statement": problem_statement, "cot": cot})

print(result)

In [None]:
################ TODO: Paste the code here ################

## Task 3: 24 Game Solver

The 24 game is to make 24 from 4 numbers using elementary arithmetic operations (+, -, ×, and /).

Write Python code that solves the 24 game for 4 given numbers.

If there is no solution, the output should be "no solution found."

Ex) Input: `[4, 7, 8, 8]`

Output: `(7-(8/8))*4=24`

In [None]:
prompt_template = PromptTemplate(
    input_variables=["problem_statement", "cot"],
    template="""
    Generate a python code for the following problem statement:

    {problem_statement}

    {cot}

    """
)

chain = prompt_template | model | parser


problem_statement = """
The 24 game is to make 24 from exactly 4 numbers using elementary arithmetic operations (+, -, ×, and /).
Write Python code that solves the 24 game for 4 given numbers.
If there is no solution, the output should be "no solution found."
"""

################ TODO: Modify here ################

cot = """
"""

###################################################

result = chain.invoke({"problem_statement": problem_statement, "cot": cot})

print(result)

In [None]:
################ TODO: Paste the code here ################

## Task 4: Word Chain

Write a Python function `find_word_chain()` that finds the longest word chain in a given list of words.

In a word chain, each word begins with the letter that the previous word ends with.

Ex) Input: `[banana, apple, cat, math, dog, exam]`

Output: `banana -> apple -> exam -> math`: word chain with length 4.

In [None]:
prompt_template = PromptTemplate(
    input_variables=["problem_statement", "cot"],
    template="""
    Generate a python code for the following problem statement:

    {problem_statement}

    {cot}

    """
)

chain = prompt_template | model | parser


problem_statement = """
A Python function find_word_chain() that finds the longest word chain in a given list of words.
In a word chain, each word begins with the letter that the previous word ends with.
"""

################ TODO: Modify here ################

cot = """
"""

###################################################

result = chain.invoke({"problem_statement": problem_statement, "cot": cot})

print(result)

In [None]:
################ TODO: Paste the code here ################

## Task 5: Gram Matrix Calculation

Write Python code that calculates the Gram matrix of a given matrix.

The Gram matrix G of a matrix A is calculated by multiplying the transpose of A by A. [ref](https://en.wikipedia.org/wiki/Gram_matrix)

G = AᵀA

Ex) Input: [[1,2],[3,4],[5,6]]

Output: [[35, 44], [44, 56]]
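
Worked out for the example above, with $A$ the $3 \times 2$ input matrix:

$$
G = A^{\top} A =
\begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}
=
\begin{bmatrix} 1+9+25 & 2+12+30 \\ 2+12+30 & 4+16+36 \end{bmatrix}
=
\begin{bmatrix} 35 & 44 \\ 44 & 56 \end{bmatrix}
$$

The result is symmetric, which is a quick sanity check for any generated implementation.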

In [None]:
prompt_template = PromptTemplate(
    input_variables=["problem_statement", "cot"],
    template="""
    Generate a python code for the following problem statement:

    {problem_statement}

    {cot}

    """
)

chain = prompt_template | model | parser


problem_statement = """
Calculate the Gram matrix of a given matrix.
The Gram matrix G of a matrix A is calculated by multiplying the transpose of A by A.
"""

################ TODO: Modify here ################

cot = """
"""

###################################################

result = chain.invoke({"problem_statement": problem_statement, "cot": cot})

print(result)

In [None]:
################ TODO: Paste the code here ################

## Task 6: Hanoi Tower Variation (Hard)

Write a Python function `hanoi_solver()` that solves a variation of the Tower of Hanoi problem in which a disc can only move to an adjacent rod.

Ex) Input: 3 discs

Output:
```
1 A->B
1 B->C
2 A->B
1 C->B
1 B->A
2 B->C
1 A->B
1 B->C
3 A->B
1 C->B
1 B->A
2 C->B
1 A->B
1 B->C
2 B->A
1 C->B
1 B->A
3 B->C
1 A->B
1 B->C
2 A->B
1 C->B
1 B->A
2 B->C
1 A->B
1 B->C
```

In [None]:
prompt_template = PromptTemplate(
    input_variables=["problem_statement", "cot"],
    template="""
    Generate a python code for the following problem statement:

    {problem_statement}

    {cot}

    """
)

chain = prompt_template | model | parser


problem_statement = """
Write a Python function hanoi_solver() that solves a variation of the Tower of Hanoi problem in which a disc can only move to an adjacent rod.
"""

################ TODO: Modify here ################

cot = """
"""

###################################################

result = chain.invoke({"problem_statement": problem_statement, "cot": cot})

print(result)

In [None]:
################ TODO: Paste the code here ################