<a href="https://colab.research.google.com/github/aditya0589/notebooks/blob/main/AI%20Engineering/Langchain/LC03.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **LC03 LANGCHAIN PROMPTS**

Prompts are the input instructions or queries given to a model to guide its output.

There are two ways to prompt an LLM:
1. Static prompts
2. Dynamic prompts

Static prompts are rarely used in real projects where we want to give the user consistent responses. Since the model output depends heavily on the prompt, we must keep the prompt the model receives under our control. This way we can ensure that the quality of the outputs given by our system remains consistent.

In [2]:
!pip install langchain langchain_community langchain_groq


In [3]:
import os
from google.colab import userdata

# Load keys from Colab Secrets
os.environ["GROQ_API_KEY"] = userdata.get("GROQ_API_KEY")
os.environ["GOOGLE_API_KEY"] = userdata.get("GOOGLE_API_KEY")
os.environ['HF_TOKEN'] = userdata.get('HF_TOKEN')
os.environ['HUGGINGFACEHUB_API_TOKEN'] = userdata.get('HUGGINGFACEHUB_API_TOKEN')

# Sanity check (safe partial print)
print("Groq:", os.environ["GROQ_API_KEY"][:8])
print("Google:", os.environ["GOOGLE_API_KEY"][:8])
print('HF_TOKEN', os.environ['HF_TOKEN'][:8])
print('huggingface api token', os.environ['HUGGINGFACEHUB_API_TOKEN'][:8])

Groq: gsk_L243
Google: AIzaSyCX
HF_TOKEN AIzaSyBE
huggingface api token hf_SMPRD


In [4]:
from langchain_groq import ChatGroq

llm = ChatGroq(model_name="llama-3.1-8b-instant")
prompt = input("Enter the prompt: ")
response = llm.invoke(prompt)
print(response.content)

Enter the prompt: summarize attention is all you need
"Attention is All You Need" is a research paper published in 2017 by Vaswani et al., which introduced the Transformer model, a groundbreaking architecture for natural language processing (NLP) tasks. Here's a summary of the paper:

**Motivation:**

1. Traditional sequence-to-sequence models, such as recurrent neural networks (RNNs) and long short-term memory (LSTM) networks, relied on recurrent connections and sequential processing, which can be computationally expensive and prone to vanishing gradients.
2. The paper aimed to design a new architecture that could efficiently process long-range dependencies in sequential data, such as sentences and paragraphs.

**Key contributions:**

1. **Self-Attention Mechanism:** The Transformer model introduced a self-attention mechanism, which allows the model to attend to all positions in the input sequence simultaneously and weigh their importance. This mechanism is computationally efficient a

The above is an application where we allow the user to enter static prompts. Here we cannot control the consistency of the responses, because we give the user full control over the prompt.

In dynamic prompts, we as developers write the prompt beforehand (not the user). The user provides only the key values that the prompt needs.

Below is an example of how to do it:

In [5]:
from langchain_groq import ChatGroq

llm = ChatGroq(model_name="llama-3.1-8b-instant")
name = input("Enter the name of the paper you want to summarize: ")
prompt = f"Summarize the paper {name}. Give detailed response with equations and code snippets wherever needed"
response = llm.invoke(prompt)
print(response.content)


Enter the name of the paper you want to summarize: attention is all you need
**Attention is All You Need**

The paper "Attention is All You Need" by Vaswani et al. (2017) introduced the Transformer model, a groundbreaking neural network architecture that has revolutionized the field of natural language processing (NLP). The paper presents a new approach to sequence-to-sequence learning that relies entirely on self-attention mechanisms, abandoning the traditional recurrent neural network (RNN) and convolutional neural network (CNN) architectures.

**Background**

Traditional sequence-to-sequence models, such as RNNs and CNNs, rely on recurrent or convolutional layers to capture temporal dependencies in sequential data. However, these models suffer from several limitations:

1. **Sequential computation**: RNNs and CNNs process sequences sequentially, which can lead to slow computation and difficulty in parallelizing the computation.
2. **Fixed context window**: RNNs have a fixed context 

In the above example, we as developers pre-defined the prompt, and the user only gives the name of the paper to summarize.

This way we control how the output of our program is structured.
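The same idea can be isolated in a small helper so that the prompt wording lives in one place. This is only a sketch: the function name `build_summary_prompt` and the whitespace clean-up are assumptions added for illustration, not part of the notebook.

```python
def build_summary_prompt(paper_name: str) -> str:
    """Return the developer-defined prompt with the user's paper name filled in."""
    # Collapse stray whitespace and newlines from the user's input.
    cleaned = " ".join(paper_name.split())
    if not cleaned:
        raise ValueError("paper_name must not be empty")
    return (
        f"Summarize the paper {cleaned}. Give detailed response with "
        "equations and code snippets wherever needed"
    )


print(build_summary_prompt("  attention is   all you need "))
```

The user still controls only the paper name; everything else about the prompt stays fixed by the developer.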

## **Langchain PromptTemplate**

A **PromptTemplate** in LangChain is a structured way to create prompts dynamically by inserting variables into a predefined template. Instead of hardcoding prompts, PromptTemplate lets you define placeholders that are filled in at runtime with different inputs.

This makes prompts reusable, flexible and easy to manage, and well suited to dynamic user inputs or automated workflows.


**Why use PromptTemplate over f-strings:**

1. **Built-in validation:** LangChain can check that the variables declared for a template match its placeholders, and it reports an error when a required value is missing.
2. **Reusable:** A prompt can be saved as a JSON file and loaded again later.
3. **LangChain ecosystem:** PromptTemplate is tightly integrated with the rest of the LangChain ecosystem, such as chains and agents.

In [6]:
from langchain_core.prompts import PromptTemplate
from langchain_groq import ChatGroq

template = PromptTemplate(
    template = """
    Summarize the paper {paper_name} in about
    {word_count} words. Add equations and code snippets wherever possible.
    """,
    input_variables = ["paper_name", "word_count"]
)

prompt = template.invoke(
    {
        "paper_name": "Attention is all you need",
        "word_count": 100
    }
)

llm = ChatGroq(model_name="llama-3.1-8b-instant")
response = llm.invoke(prompt)
print(response.content)

**"Attention is all you need" Paper Summary**

The paper "Attention is all you need" by Vaswani et al. (2017) introduced the Transformer model, a revolutionary architecture for sequence-to-sequence tasks. The Transformer replaces traditional recurrent neural networks (RNNs) and convolutional neural networks (CNNs) with self-attention mechanisms.

**Key Components:**

1. **Self-Attention Mechanism**: The Transformer uses a multi-head attention mechanism to compute weighted sums of the input representations.
    \[ \text{Attention(Q, K, V)} = \text{softmax}(\frac{\text{QK}^T}{\sqrt{d_k}}) \text{V} \]
    where Q, K, and V are the query, key, and value matrices, respectively.

2. **Encoder-Decoder Architecture**: The Transformer consists of an encoder and a decoder. The encoder takes in a sequence of tokens and outputs a sequence of context vectors. The decoder generates output tokens based on the context vectors.

**Code Snippet (PyTorch):**

```python
import torch
import torch.nn as nn


Suppose we forget to declare one of the template's variables in `input_variables`. Setting `validate_template = True` makes LangChain catch the mismatch when the template is created:

In [7]:
from langchain_core.prompts import PromptTemplate
from langchain_groq import ChatGroq

template = PromptTemplate(
    template = """
    Summarize the paper {paper_name} in about
    {word_count} words. Add equations and code snippets wherever possible.
    """,
    input_variables = ["paper_name"],
    validate_template = True
)

prompt = template.invoke(
    {
        "paper_name": "Attention is all you need",
        "word_count": 100
    }
)

llm = ChatGroq(model_name="llama-3.1-8b-instant")
response = llm.invoke(prompt)
print(response.content)

ValidationError: 1 validation error for PromptTemplate
  Value error, Invalid prompt schema; check for mismatched or missing input parameters from ['paper_name']. [type=value_error, input_value={'template': '\n    Summa...'partial_variables': {}}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/value_error

This raises a `ValidationError` because we forgot to declare an input variable: the prompt contains two placeholders (`paper_name` and `word_count`), but only `paper_name` is listed in `input_variables`.

We can save the template as a JSON file and reuse it as many times as we want.

In [8]:
template.save('template.json')

In [9]:
!cat template.json  # display the saved template file

In the code snippet below, we load the JSON file to get the prompt template back.

In [10]:
from langchain_core.prompts import PromptTemplate, load_prompt
from langchain_groq import ChatGroq

template = load_prompt('template.json')

prompt = template.invoke(
    {
        "paper_name": "Attention is all you need",
        "word_count": 100
    }
)

llm = ChatGroq(model_name="llama-3.1-8b-instant")
response = llm.invoke(prompt)
print(response.content)

**Summary of "Attention is All You Need" Paper**

The paper "Attention is All You Need" by Vaswani et al. (2017) introduces the Transformer model, a novel architecture for sequence-to-sequence tasks. The Transformer model uses self-attention mechanisms to weigh the importance of different input elements, eliminating the need for recurrent neural networks (RNNs) or convolutional neural networks (CNNs).

**Self-Attention Mechanism:**

Given a sequence of input vectors `x1, x2, ..., xn`, the self-attention mechanism computes a weighted sum of these vectors. The weights are computed as:

`Attention(Q, K, V) = softmax(QK^T/sqrt(d))V`

where `Q` (query), `K` (key), and `V` (value) are learned linear transformations of the input vectors.

**Multi-Head Attention:**

The Transformer model uses multi-head attention to jointly attend to information from different representation subspaces at different positions. This is achieved by applying attention in parallel multiple times, with different lear

## **Langchain Messages**

In [13]:
from langchain_groq import ChatGroq

model = ChatGroq(model_name="llama-3.1-8b-instant")

print("Chat. Type 'exit' to break")
while True:
  user_input = input("You: ")
  if user_input == 'exit':
    break
  response = model.invoke(user_input)
  print("Chat: ", response.content)


Chat. Type 'exit' to break
You: hello
Chat:  Hello. How can I assist you today?
You: what is my name?
Chat:  I don't have any information about your name. I'm a large language model, I don't have personal interactions or memories, so I don't know your name unless you tell me. If you'd like to share your name, I'd be happy to chat with you!
You: sing me a song
Chat:  (Verse 1)
In a world of dreams, where stars are bright
A melody whispers through the night
A gentle breeze that rustles through the trees
A lullaby that only the heart can see

(Chorus)
Oh, the moon is high, the wind is low
The shadows dance, and the night grows slow
In this moment, I am free to roam
Where the music takes me, I am never alone

(Verse 2)
In a sea of clouds, where the sun dips low
A symphony of colors starts to grow
A canvas painted with hues of gold
A masterpiece that's yet to be told

(Chorus)
Oh, the moon is high, the wind is low
The shadows dance, and the night grows slow
In this moment, I am free to roam

Notice that the above chatbot has no memory: it takes the current question and responds to it alone. If we ask about something mentioned earlier in the chat, it cannot recall it.

We can solve this problem by keeping the chat history in a list and passing this chat history to the model on every turn.

This way the model receives the previous messages along with the latest message and can answer accordingly.

In [14]:
from langchain_groq import ChatGroq

model = ChatGroq(model_name="llama-3.1-8b-instant")

chat_history = []
print("Chat. Type 'exit' to break")
while True:
  user_input = input("You: ")
  chat_history.append(user_input)
  if user_input == 'exit':
    break
  response = model.invoke(chat_history)
  chat_history.append(response.content)
  print("Chat: ", response.content)


Chat. Type 'exit' to break
You: hello
Chat:  Hello, how can I assist you today?
You: what is my name?
Chat:  I don't have any information about your name. This is the beginning of our conversation, and I don't retain any data about individual users. If you'd like to share your name with me, I'm happy to chat with you and address you by it.
You: my name is aditya
Chat:  It's nice to meet you, Aditya. I don't retain any information about our conversation, so feel free to start fresh and ask me anything you'd like. What's on your mind today?
You: what is my name
Chat:  We've been here before, Aditya. I told you earlier that I don't retain any information, so I won't be able to recall that your name is Aditya. However, you've just told me, so I'm happy to chat with you as Aditya now.
You: exit


Now the model is able to use the previous conversation.

If we print the chat history, we see that it is a single flat list of messages. It does not record which message was said by whom.

In [15]:
print(chat_history)

['hello', 'Hello, how can I assist you today?', 'what is my name?', "I don't have any information about your name. This is the beginning of our conversation, and I don't retain any data about individual users. If you'd like to share your name with me, I'm happy to chat with you and address you by it.", 'my name is aditya', "It's nice to meet you, Aditya. I don't retain any information about our conversation, so feel free to start fresh and ask me anything you'd like. What's on your mind today?", 'what is my name', "We've been here before, Aditya. I told you earlier that I don't retain any information, so I won't be able to recall that your name is Aditya. However, you've just told me, so I'm happy to chat with you as Aditya now.", 'exit']


To solve this problem, each entry needs to store not only the message but also who said it, for example as a dictionary with a role and a content field.

For example:

You: hi

AI: Hello

and so on
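
A hand-rolled version of this idea could look like the sketch below. The `role` and `content` keys and the `add_message` helper are assumptions chosen for illustration.

```python
chat_history = []


def add_message(role: str, content: str) -> None:
    """Store a message together with the speaker who said it."""
    chat_history.append({"role": role, "content": content})


add_message("human", "hi")
add_message("ai", "Hello")

# Each entry now carries its speaker, so the transcript can be rebuilt.
for message in chat_history:
    print(f"{message['role']}: {message['content']}")
```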

LangChain messages help you achieve this, so you don't need to implement these data structures manually.

LangChain has three types of messages:
1. System message
2. Human message
3. AI message




In [18]:
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
from langchain_groq import ChatGroq

model = ChatGroq(model_name="llama-3.1-8b-instant")

messages=[
    SystemMessage(content='You are a helpful assistant'),
    HumanMessage(content='Tell me about LangChain')
]

result = model.invoke(messages)
print(result.content)

messages.append(AIMessage(content=result.content))

print(messages)


LangChain is an open-source Python library for building large language models and AI systems. It was developed by Llama.com, a company that provides a platform for building and deploying AI models.

LangChain provides a set of tools and APIs that make it easier to build and integrate large language models into various applications. Some of the key features of LangChain include:

1. **Model Integration**: LangChain allows developers to easily integrate large language models, such as LLaMA, BART, and T5, into their applications.
2. **Conversational Interfaces**: LangChain provides tools for building conversational interfaces, such as chatbots and voice assistants, using large language models.
3. **Text Generation**: LangChain allows developers to generate text programmatically using large language models.
4. **Text Classification**: LangChain provides tools for text classification tasks, such as sentiment analysis and topic modeling.
5. **Dialogue Management**: LangChain provides tools f

The above is a basic use of LangChain messages. We can see how the different types of messages work together.

We now use these messages in our chatbot.

In [19]:
from langchain_groq import ChatGroq
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

model = ChatGroq(model_name="llama-3.1-8b-instant")

chat_history = [
    SystemMessage(content='You are a helpful AI assistant')
]

while True:
    user_input = input('You: ')
    chat_history.append(HumanMessage(content=user_input))
    if user_input == 'exit':
        break
    result = model.invoke(chat_history)
    chat_history.append(AIMessage(content=result.content))
    print("AI: ",result.content)

print(chat_history)

You: hello
AI:  Hello. How can I assist you today?
You: what is my name?
AI:  I don't have any information about your name. I'm a conversational AI, and our conversation just started. You can tell me your name if you'd like, and I can use it to address you throughout our conversation.
You: my name is aditya
AI:  Nice to meet you, Aditya. Is there something specific you'd like to talk about or would you like some recommendations on topics I can help you with?
You: what did i ask you until now?
AI:  You asked me the following:

1. "Hello"
2. "what is my name?"
3. "my name is aditya"
4. "what did i ask you until now?"
You: exit
[SystemMessage(content='You are a helpful AI assistant', additional_kwargs={}, response_metadata={}), HumanMessage(content='hello', additional_kwargs={}, response_metadata={}), AIMessage(content='Hello. How can I assist you today?', additional_kwargs={}, response_metadata={}, tool_calls=[], invalid_tool_calls=[]), HumanMessage(content='what is my name?', additional

In the above chat history, every message is labelled as a HumanMessage, a SystemMessage or an AIMessage.

## **Chat Prompt Template**


A **Chat Prompt Template** is a template used to create prompts specifically for **chat-based models** (like ChatGPT).  
It structures the prompt as a **conversation**, using roles such as:
- **System** – instructions or rules for the model
- **Human** – user messages
- **AI** – assistant responses

- **Prompt Template** → Used for **single text prompts** (one block of text).
- **Chat Prompt Template** → Used for **conversations with roles**.

### Simple explanation:
> Chat models think in conversations, not plain text.  
> Chat Prompt Templates help you talk to them the *right way* by separating instructions, user input, and responses.

- Use **PromptTemplate** for simple text generation
- Use **ChatPromptTemplate** when working with **chat models**, so that instructions, user input and earlier replies reach the model with their roles intact


In [20]:
from langchain_core.prompts import ChatPromptTemplate

chat_template = ChatPromptTemplate([
    ('system', 'You are a helpful {domain} expert'),
    ('human', 'Explain in simple terms, what is {topic}')
])

prompt = chat_template.invoke({'domain':'cricket','topic':'Dusra'})

print(prompt)

messages=[SystemMessage(content='You are a helpful cricket expert', additional_kwargs={}, response_metadata={}), HumanMessage(content='Explain in simple terms, what is Dusra', additional_kwargs={}, response_metadata={})]


## **Message Placeholder**

A **Message Placeholder** (`MessagesPlaceholder`) in LangChain is a special placeholder used inside a chat prompt template to dynamically insert the chat history, or any list of messages, at runtime.

Let us assume the chat history is stored in an external file and we want to insert it dynamically at runtime.

In [22]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
# chat template
chat_template = ChatPromptTemplate([
    ('system','You are a helpful customer support agent'),
    MessagesPlaceholder(variable_name='chat_history'),
    ('human','{query}')
])

chat_history = []
# load chat history
with open('chat_history.txt') as f:
    chat_history.extend(f.readlines())

print(chat_history)

# create prompt
prompt = chat_template.invoke({'chat_history':chat_history, 'query':'Where is my refund'})

print(prompt)

['HumanMessage(content="I want to request a refund for my order #12345.")\n', 'AIMessage(content="Your refund request for order #12345 has been initiated. It will be processed in 3-5 business days.")']
messages=[SystemMessage(content='You are a helpful customer support agent', additional_kwargs={}, response_metadata={}), HumanMessage(content='HumanMessage(content="I want to request a refund for my order #12345.")\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='AIMessage(content="Your refund request for order #12345 has been initiated. It will be processed in 3-5 business days.")', additional_kwargs={}, response_metadata={}), HumanMessage(content='Where is my refund', additional_kwargs={}, response_metadata={})]
