Prompt templates help turn raw user input into a format that an LLM can work with. In this case, the raw user input is just a message, which we pass to the LLM.

In [2]:
import os
from dotenv import load_dotenv
load_dotenv()

True

Creating a Prompt Template:

`prompt = ChatPromptTemplate.from_messages(...)`: This creates a `ChatPromptTemplate` object. Prompt templates help you structure your prompts consistently. This specific template defines two parts:
* `('system', 'Hey you are a helpful assistant. Answer all the questions to the best of your ability')`: This sets the "system" role in the chat, providing an initial instruction to the language model.
* `MessagesPlaceholder(variable_name="messages")`: This creates a placeholder within your prompt template where you can dynamically insert a list of messages. This is crucial for multi-turn conversations where you want to include the chat history.
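
To make the role of the placeholder concrete, the sketch below mimics the expansion step in plain Python. It is only a conceptual model, not LangChain's implementation: `render_prompt`, `SYSTEM_TEXT`, and the `(role, content)` tuples are illustrative names assumed for this example.

```python
# Conceptual model only: how a template with one system message and a
# "messages" placeholder could expand into the final message list.
SYSTEM_TEXT = (
    "Hey you are a helpful assistant. "
    "Answer all the questions to the best of your ability"
)


def render_prompt(variables: dict) -> list[tuple[str, str]]:
    """Return the system message followed by every message passed in."""
    rendered = [("system", SYSTEM_TEXT)]
    # The placeholder is replaced by the whole list, in order.
    rendered.extend(variables["messages"])
    return rendered


history = [
    ("human", "Hi my name is poorna"),
    ("ai", "Hello Poorna! How can I assist you today?"),
    ("human", "What is my name?"),
]
final_messages = render_prompt({"messages": history})
for role, content in final_messages:
    print(f"{role}: {content}")
```

The system instruction always comes first, and the placeholder contributes as many messages as the caller supplies, which is what lets the same template serve both single-turn and multi-turn calls.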

In [3]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

openai_api_key = os.getenv('OPENAI_API_KEY')

llm = ChatOpenAI(model = 'gpt-4o')


prompt = ChatPromptTemplate.from_messages(
    [
        ('system','Hey you are a helpful assistant. Answer all the questions to the best of your ability'),
        MessagesPlaceholder(variable_name="messages")
    ]
)

chain = prompt|llm


In [4]:
chain.invoke({'messages': [HumanMessage(content='Hi my name is poorna')]})

AIMessage(content='Hello Poorna! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 35, 'total_tokens': 46, 'completion_tokens_details': {'audio_tokens': None, 'reasoning_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_6b68a8204b', 'finish_reason': 'stop', 'logprobs': None}, id='run-98c4c65a-dcb2-49e2-b3fe-76c8935dd7f3-0', usage_metadata={'input_tokens': 35, 'output_tokens': 11, 'total_tokens': 46, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 0}})

In [5]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables import RunnableWithMessageHistory


store={}
def get_session_history(session_id:str)-> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


with_message_history = RunnableWithMessageHistory(chain, get_session_history=get_session_history)

In [6]:
config = {'configurable':{'session_id':'chat3'}}

response = with_message_history.invoke(
    [HumanMessage(content = 'Hi my name is poorna')],
    config = config
)

response

AIMessage(content='Hello, Poorna! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 12, 'prompt_tokens': 35, 'total_tokens': 47, 'completion_tokens_details': {'audio_tokens': None, 'reasoning_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_6b68a8204b', 'finish_reason': 'stop', 'logprobs': None}, id='run-fafc9e6c-df2d-4636-8912-4e15fd62d6e1-0', usage_metadata={'input_tokens': 35, 'output_tokens': 12, 'total_tokens': 47, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 0}})

In [7]:
# adding more complexity

prompt = ChatPromptTemplate.from_messages(
    [('system','you are a helpful assistant. Answer all questions to the best of your ability in {language}'),
     MessagesPlaceholder(variable_name='messages')]
)

chain = prompt|llm


response  = chain.invoke(
    {
        'messages' : [HumanMessage(content ='Hi my name is poorna')],
        'language': 'Telugu'
        }
)

response.content

'హలో పూర్ణ, మీరు ఎలా ఉన్నారు? నాకు సహాయం అవసరమైతే చెప్పండి!'

`RunnableWithMessageHistory` is a class in LangChain that helps manage conversation history when you're interacting with a language model. It acts as a wrapper around another `Runnable` object (like an LLM or a Chain) and takes care of storing and retrieving messages from the conversation.

Here's a breakdown of what it does:

**1. Storing Messages:**

- Every time you run the `RunnableWithMessageHistory`, it automatically stores the input and output messages in a `ChatMessageHistory` object. This history can be stored in memory or in a persistent store like a database.

**2. Retrieving Messages:**

- When you invoke the wrapped `Runnable` again, `RunnableWithMessageHistory` retrieves the relevant message history and includes it in the prompt. This ensures that the language model has context from the previous turns of the conversation.

**Why is this useful?**

- **Maintaining Context:**  Language models have no memory of past interactions. `RunnableWithMessageHistory` provides this memory by feeding the conversation history back into the model, enabling more coherent and contextually relevant responses.
- **Simplifying Conversation Handling:** You don't have to manually manage message storage and retrieval. `RunnableWithMessageHistory` handles this for you, making it easier to build conversational applications.

**How to Use It:**

```python
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

# Initialize your LLM and prompt
llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful AI assistant."),
        MessagesPlaceholder(variable_name="history"),  # past turns go here
        ("human", "{input}"),
    ]
)
chain = prompt | llm

# Keep one in-memory history per session ID
store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

# Wrap the chain with RunnableWithMessageHistory
runnable_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

# Start the conversation; both calls must use the same session ID
config = {"configurable": {"session_id": "bob-session"}}
runnable_with_history.invoke({"input": "Hi, my name is Bob"}, config=config)
runnable_with_history.invoke({"input": "What's my name?"}, config=config)  # History is included
```

In this example, `RunnableWithMessageHistory` manages the conversation history between the user ("Bob") and the LLM. On the second call, the stored history is fed back into the prompt, so the LLM can "remember" the previous interaction and correctly identify Bob's name.
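
The store-then-retrieve cycle described above can be sketched without any library at all. The following is a plain-Python approximation under stated assumptions, not LangChain's implementation: `HistoryWrapper` and `fake_model` are hypothetical names, and the "model" simply reports how many messages it received so the effect of the history is visible.

```python
from collections import defaultdict
from typing import Callable

Message = tuple[str, str]  # (role, content)


class HistoryWrapper:
    """Wraps a callable that maps a message list to a reply string."""

    def __init__(self, model: Callable[[list[Message]], str]) -> None:
        self.model = model
        # One independent history list per session ID.
        self.store: dict[str, list[Message]] = defaultdict(list)

    def invoke(self, user_text: str, session_id: str) -> str:
        history = self.store[session_id]
        # 1. Retrieve: put the stored history in front of the new input.
        reply = self.model(history + [("human", user_text)])
        # 2. Store: record both the input and the output for next time.
        history.append(("human", user_text))
        history.append(("ai", reply))
        return reply


def fake_model(messages: list[Message]) -> str:
    """Stand-in for an LLM: reports how many messages it was shown."""
    return f"I can see {len(messages)} message(s)."


chat = HistoryWrapper(fake_model)
first = chat.invoke("Hi, my name is Bob", session_id="a")
second = chat.invoke("What's my name?", session_id="a")
other = chat.invoke("Hello", session_id="b")
print(first, second, other, sep="\n")
```

The second call in session `a` sees three messages (two stored plus the new one), while session `b` starts from scratch. This is why the `session_id` in the config matters: it selects which history is loaded.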


In [8]:
prompt = ChatPromptTemplate.from_messages(
    [
        ('system','you are a helpful assistant. Answer all the questions in {language}'),
        MessagesPlaceholder(variable_name='messages')
    ]
)

chain = prompt|llm

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key= 'messages'
)

config = {'configurable':{'session_id':'chat4'}}

response =  with_message_history.invoke(
    {'messages': [HumanMessage(content ='Hi my name is poorna')],
     "language":'telugu'},
     config =  config
)

response.content


'హలో పూర్ణ! మీతో పరిచయం కావడం ఆనందంగా ఉంది. మీకు ఎలా సహాయపడగలను?'

# Managing the conversation history

One important concept to understand when building chatbots is how to manage conversation history. If left unmanaged, the list of messages will grow unbounded and potentially overflow the context window of the LLM. Therefore, it is important to add a step that limits the size of the messages you are passing in.

`trim_messages` is a very useful function in LangChain, especially when building chatbots that involve longer conversations. It helps you manage the context window of language models by trimming the conversation history to a manageable size.

Here's why it's important and how it works:

**The Context Window Problem:**

* Language models (LLMs) have a limited context window – they can only "remember" a certain amount of text from the current conversation.
* As a conversation gets longer, the message history can exceed this limit.
* If you send too much history to the LLM, it might lose track of the earlier parts of the conversation or even exceed the maximum token limit, resulting in errors.

**How `trim_messages` Helps:**

* `trim_messages` allows you to truncate the conversation history while preserving the most important information.
* You can specify how many tokens or messages to keep, and which strategy to use for trimming (e.g., keep the most recent messages, keep the first few messages, etc.).
* This ensures that the LLM receives a concise and relevant history, preventing context window overflow and improving performance.

**Key Benefits:**

* **Improved Performance:**  Avoids exceeding the LLM's token limits, leading to faster responses and fewer errors.
* **Reduced Costs:**  Shorter prompts mean fewer tokens are processed, potentially lowering the cost of using the LLM.
* **Better Context:**  By keeping the most relevant parts of the conversation, you can ensure the LLM has the necessary context to provide meaningful responses.

**Example:**

```python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, trim_messages

messages = [
    SystemMessage(content="You are a helpful AI assistant."),
    HumanMessage(content="Hi, what's the capital of France?"),
    AIMessage(content="The capital of France is Paris."),
    HumanMessage(content="And what's the population of Paris?"),
    AIMessage(content="The population of Paris is about 2.1 million."),
    # ... more messages ...
]

# Keep the most recent messages that fit within 50 tokens.
# `llm` is the chat model defined earlier; here it is only used to count tokens.
trimmed_messages = trim_messages(
    messages,
    max_tokens=50,
    strategy="last",
    token_counter=llm,
)
```

In this example, `trim_messages` will shorten the conversation history to fit within 50 tokens, likely keeping the most recent messages to provide context for the ongoing conversation.

**Important Considerations:**

* **Trimming Strategy:** Choose the right strategy (`"first"`, `"last"`) based on your chatbot's needs.
* **Token Counting:**  Ensure you have an accurate way to count tokens (e.g., using the LLM's tokenizer).
* **System Messages:** Decide whether to include system messages in the trimming process.
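
As a rough illustration of the considerations above, here is a plain-Python sketch of a "last" strategy that always keeps the system message, keeps only whole messages, and starts the kept history on a human turn. It approximates the idea rather than reproducing `trim_messages`: `trim_last` and `count_tokens` are hypothetical helpers, and counting one token per word is an assumption made to keep the arithmetic easy to follow.

```python
Message = tuple[str, str]  # (role, content)


def count_tokens(message: Message) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(message[1].split())


def trim_last(messages: list[Message], max_tokens: int) -> list[Message]:
    """Keep the system message plus the newest messages that fit the budget."""
    system = [m for m in messages[:1] if m[0] == "system"]
    rest = messages[len(system):]
    budget = max_tokens - sum(count_tokens(m) for m in system)

    kept: list[Message] = []
    # Walk backwards so the most recent messages are considered first.
    for message in reversed(rest):
        cost = count_tokens(message)
        if cost > budget:
            break  # whole messages only: no partial content
        kept.insert(0, message)
        budget -= cost

    # Drop leading non-human messages so the history starts on a human turn.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept


messages = [
    ("system", "You are a helpful AI assistant."),
    ("human", "Hi, what's the capital of France?"),
    ("ai", "The capital of France is Paris."),
    ("human", "And what's the population of Paris?"),
    ("ai", "The population of Paris is about 2.1 million."),
]
trimmed = trim_last(messages, max_tokens=20)
for role, content in trimmed:
    print(f"{role}: {content}")
```

With a budget of 20 words, the system message uses 6, the last AI reply 8, and the last human question 6, so the first exchange about the capital is dropped.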

In [9]:
from langchain_core.messages import SystemMessage, trim_messages
from langchain_core.messages import HumanMessage, AIMessage
trimmer = trim_messages(
    max_tokens = 100,
    strategy = "last",
    token_counter = llm,
    include_system = True,
    allow_partial = False,
    start_on = 'human'
    )

messages = [
    SystemMessage(content ="you're a good assistant"),
    HumanMessage(content="Hi, how's the weather today?"),
    AIMessage(content="The weather is sunny with a temperature of 25°C."),
    HumanMessage(content="What's the current news?"),
    AIMessage(content="The latest news is about the coronavirus outbreak."),
    HumanMessage(content="who is YS Jagan Mohan Reddy?"),
    AIMessage(content="YS Jagan Mohan Reddy is a renowned Indian politician."),
    HumanMessage(content="Tell me more about the latest match"),
    AIMessage(content="The latest match is between India and Australia."),
    HumanMessage(content="Can you help me find a restaurant near me?"),
    AIMessage(content="I recommend the Chennai Tandoor."),
    
]

trimmer.invoke(messages)

[SystemMessage(content="you're a good assistant", additional_kwargs={}, response_metadata={}),
 HumanMessage(content='who is YS Jagan Mohan Reddy?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='YS Jagan Mohan Reddy is a renowned Indian politician.', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Tell me more about the latest match', additional_kwargs={}, response_metadata={}),
 AIMessage(content='The latest match is between India and Australia.', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Can you help me find a restaurant near me?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='I recommend the Chennai Tandoor.', additional_kwargs={}, response_metadata={})]

In [10]:
from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough

prompt = ChatPromptTemplate.from_messages(
    [
        ('system','you are a helpful assistant. Answer all the questions in {language}'),
        MessagesPlaceholder(variable_name='messages')
    ]
)


chain = (
    RunnablePassthrough.assign(messages = itemgetter('messages')|trimmer)
    |prompt
    |llm
)

response = chain.invoke({
    'messages': messages +[HumanMessage(content="Who is YS Jagan Mohan Reddy?")],
    'language': 'telugu'}
    )

response.content

'వైఎస్ జగన్ మోహన్ రెడ్డి ఆంధ్రప్రదేశ్ రాష్ట్ర ముఖ్యమంత్రిగా ఉన్నారు. ఆయన 2019లో జరిగిన ఎన్నికల్లో వైఎస్సార్ కాంగ్రెస్ పార్టీ తరఫున ముఖ్యమంత్రి పదవిని సాధించారు. ఆయన అంతకు ముందు దివంగత ముఖ్యమంత్రి వైఎస్ రాజశేఖర రెడ్డి కుమారుడు. వైఎస్ జగన్ మోహన్ రెడ్డి ఆంధ్రప్రదేశ్ రాష్ట్రంలో అనేక సంక్షేమ కార్యక్రమాలను ప్రారంభించారు మరియు రాష్ట్ర అభివృద్ధి కోసం కృషి చేస్తున్నారు.'

In [11]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key= 'messages')

config = {'configurable':{'session_id':'chat5'}}
response = with_message_history.invoke(
    {
        'messages': messages +[HumanMessage(content="Who is YS Jagan Mohan Reddy?")],
    'language': 'telugu'},
config = config
)

response.content

'వైఎస్ జగన్ మోహన్ రెడ్డి ఆంధ్రప్రదేశ్ రాష్ట్రానికి ముఖ్యమంత్రిగా ఉన్నారు. ఆయన 2019 మే 30న ముఖ్యమంత్రిగా బాధ్యతలు స్వీకరించారు. జగన్ మోహన్ రెడ్డి వై ఎస్ ఆర్ కాంగ్రెస్ పార్టీకి అధినేతగా ఉన్నారు. ఆయన ఆంధ్రప్రదేశ్ మాజీ ముఖ్యమంత్రి డాక్టర్ వై.ఎస్. రాజశేఖర రెడ్డి గారి కుమారుడు. రాజకీయాల్లోకి రాక ముందు ఆయన వ్యాపార రంగంలో కూడా ఉన్నారు.'

## Vectors and Retrievers

## Documents

Here's a breakdown of the `langchain_core.documents.Document` class:

**Purpose:**

The `Document` class is a fundamental building block for representing pieces of text within LangChain. It provides a standardized way to store and work with text data along with its associated metadata.

**Key Components:**

* **page_content:** This attribute stores the actual text content of the document. It can be a string of any length, representing an article, code snippet, email, etc.
* **metadata:** This is a dictionary that holds additional information about the document. This metadata can be crucial for:
    * **Source Tracking:**  Store the origin of the document (URL, file path, database ID).
    * **Contextualization:**  Include information like author, date, or category.
    * **Document Relationships:**  Define links or connections to other documents.
    * **Custom Properties:**  Add any other relevant information you need.

**Example:**

```python
from langchain_core.documents import Document

doc = Document(
    page_content="This is an example document.",
    metadata={"source": "example.txt", "author": "John Doe"}
)
```

**Why is this important?**

* **Organization:**  Provides a structured way to manage text data.
* **Contextualization:**  Metadata adds valuable context to the text, which can be used for retrieval or to guide the language model.
* **Integration:**  The `Document` class is used throughout LangChain for various tasks like:
    * **Loading data:**  Document loaders return lists of `Document` objects.
    * **Splitting text:**  Text splitters operate on `Document` objects.
    * **Storing data:**  Document stores often use `Document` objects as their basic unit.
    * **Creating prompts:** You can include `Document` content and metadata in prompts to provide context to the LLM.
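
As a small illustration of the last point, the sketch below formats document text and metadata into a single context string. To stay dependency-free it uses a hypothetical `Doc` dataclass with the same two fields as `Document`; `format_context` and the bracketed source tag are likewise assumptions for this example, not a LangChain convention.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in with the same two fields as `Document`."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(docs: list[Doc]) -> str:
    """Join document text into one context string, tagging each source."""
    return "\n".join(
        f"[{doc.metadata.get('source', 'unknown')}] {doc.page_content}"
        for doc in docs
    )


docs = [
    Doc("This is an example document.", {"source": "example.txt", "author": "John Doe"}),
    Doc("A second document without metadata."),
]
context = format_context(docs)
print(context)
```

Keeping the source next to each passage gives the model something to cite and makes it easier to trace an answer back to the document it came from.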

**Key Takeaway:**

The `langchain_core.documents.Document` class is a fundamental element in LangChain for representing and working with text data and its associated metadata. It plays a crucial role in organizing, contextualizing, and processing text information within your LLM applications.


In [12]:
from langchain_core.documents import Document

documents = [
    Document(
        page_content= "Dogs are great companions, known for their loyality and friendliness.",
        metadata ={'source':"mammal-pets-doc"}
    ),
    Document(
        page_content= "Cats are intelligent and playful creatures, often seen as a symbol of companionship and happiness.",
        metadata = {'source':"mammal-pets-doc"}
    ),
    Document(
        page_content= "Birds are fascinating creatures with a wide range of behaviors and species.",
        metadata = {'source':"bird-behaviour-doc"} ),
    Document(
        page_content= "Fish are aquatic animals that are omnivores, meaning they eat a variety of foods.",
        metadata = {'source':"aquatic-animals-doc"}
    ),
]

In [13]:
documents

[Document(metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyality and friendliness.'),
 Document(metadata={'source': 'mammal-pets-doc'}, page_content='Cats are intelligent and playful creatures, often seen as a symbol of companionship and happiness.'),
 Document(metadata={'source': 'bird-behaviour-doc'}, page_content='Birds are fascinating creatures with a wide range of behaviors and species.'),
 Document(metadata={'source': 'aquatic-animals-doc'}, page_content='Fish are aquatic animals that are omnivores, meaning they eat a variety of foods.')]

In [14]:
import os
from dotenv import load_dotenv
from langchain_groq import ChatGroq

load_dotenv()

groq_api_key = os.getenv('GROQ_API_KEY')

os.environ['HF_TOKEN'] = os.getenv('HF_TOKEN')

llm = ChatGroq(model ="Llama3-8b-8192", groq_api_key = groq_api_key) 


In [15]:
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name = 'sentence-transformers/all-MiniLM-L6-v2'
)



In [16]:
## Vector stores are used to store the embeddings of the input documents

from langchain_chroma import Chroma

vectorstore = Chroma.from_documents(documents,embedding=embeddings)
vectorstore 

<langchain_chroma.vectorstores.Chroma at 0x1ecb6c9f9d0>

In [17]:
vectorstore.similarity_search('cat')

[Document(metadata={'source': 'mammal-pets-doc'}, page_content='Cats are intelligent and playful creatures, often seen as a symbol of companionship and happiness.'),
 Document(metadata={'source': 'bird-behaviour-doc'}, page_content='Birds are fascinating creatures with a wide range of behaviors and species.'),
 Document(metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyality and friendliness.'),
 Document(metadata={'source': 'aquatic-animals-doc'}, page_content='Fish are aquatic animals that are omnivores, meaning they eat a variety of foods.')]

The `await vectorstore.asimilarity_search('fish')` call shown after this explanation performs an **asynchronous similarity search** on the `vectorstore` using the query "fish". Let's break down what this means:

**1. Vectorstore:**

* A vectorstore is a specialized database that stores embeddings (vector representations) of your data (e.g., documents, images).
* These embeddings capture the semantic meaning of your data, allowing for similarity search.

**2. `asimilarity_search()`:**

* This method is used to find the most semantically similar items in the vectorstore to a given query.
* In this case, the query is "fish". The `asimilarity_search()` method will calculate the similarity between the embedding of "fish" and the embeddings of all the items stored in the vectorstore.
* It will then return a list of the most similar items (e.g., documents, images) based on a similarity score.

**3. `await`:**

* The `await` keyword indicates that this is an asynchronous operation.
* Asynchronous operations allow your program to continue executing other tasks while waiting for the similarity search to complete. This is particularly useful when dealing with potentially long-running operations like database queries.
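
The benefit of `await` is easiest to see when several searches run at once. The sketch below uses only `asyncio` from the standard library; `fake_similarity_search` is a hypothetical stand-in for a vector store call, with a short sleep in place of real I/O.

```python
import asyncio


async def fake_similarity_search(query: str) -> str:
    """Stand-in for an async vector store call; the sleep mimics I/O latency."""
    await asyncio.sleep(0.1)
    return f"results for {query!r}"


async def main() -> list[str]:
    # Both searches wait at the same time instead of one after the other.
    return await asyncio.gather(
        fake_similarity_search("fish"),
        fake_similarity_search("cat"),
    )


results = asyncio.run(main())
print(results)
```

In a script, `asyncio.run` starts the event loop. Inside a Jupyter notebook a loop is already running, so you would write `await main()` directly in a cell instead.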

**In simpler terms:**

Imagine you have a library of documents about various aquatic creatures. You want to find the documents that are most relevant to "fish".  

This code does the following:

1.  **Converts "fish" into an embedding:** This captures the meaning of "fish" in a numerical vector format.
2.  **Compares the "fish" embedding to all other embeddings in the vectorstore:** This identifies documents with similar meanings.
3.  **Returns the most similar documents:** You get a list of documents that are most likely related to "fish".
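
The three steps above can be sketched in a few lines of plain Python. This is a toy model under loud assumptions: `embed` uses word counts instead of a learned embedding model, and the ranking uses cosine similarity, which may differ from the metric a given vector store uses internally.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (real models use dense vectors)."""
    return Counter(text.lower().replace(".", "").replace(",", "").split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query: str, texts: list[str], k: int = 2) -> list[str]:
    query_vec = embed(query)  # step 1: embed the query
    # Step 2: compare the query embedding with every stored embedding.
    scored = [(cosine_similarity(query_vec, embed(t)), t) for t in texts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]  # step 3: return the top k


texts = [
    "Dogs are great companions.",
    "Fish are aquatic animals.",
    "Birds are fascinating creatures.",
]
results = similarity_search("fish", texts, k=1)
print(results[0])
```

A word-count embedding only matches exact words, whereas a real embedding model also places related concepts (for example "fish" and "aquatic") close together. The search procedure itself is the same.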

**Example:**

```python
# Assuming you have a vectorstore called 'my_vectorstore'
results = await my_vectorstore.asimilarity_search('fish')

# Print the content of the most similar document
print(results[0].page_content)
```

This might output something like:

```
"Fish are aquatic vertebrates that have gills for breathing and fins for swimming. They come in a wide variety of shapes and sizes..." 
```

**Key takeaway:**

`await vectorstore.asimilarity_search('fish')` efficiently retrieves the most semantically relevant items related to "fish" from a vectorstore using an asynchronous operation. This is a powerful technique for building applications that require semantic search capabilities, such as information retrieval, question answering, and recommendation systems.


In [18]:
## async query

await vectorstore.asimilarity_search('fish')

[Document(metadata={'source': 'aquatic-animals-doc'}, page_content='Fish are aquatic animals that are omnivores, meaning they eat a variety of foods.'),
 Document(metadata={'source': 'bird-behaviour-doc'}, page_content='Birds are fascinating creatures with a wide range of behaviors and species.'),
 Document(metadata={'source': 'mammal-pets-doc'}, page_content='Cats are intelligent and playful creatures, often seen as a symbol of companionship and happiness.'),
 Document(metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyality and friendliness.')]

In [19]:
vectorstore.similarity_search_with_score('cat')

[(Document(metadata={'source': 'mammal-pets-doc'}, page_content='Cats are intelligent and playful creatures, often seen as a symbol of companionship and happiness.'),
  1.0024406909942627),
 (Document(metadata={'source': 'bird-behaviour-doc'}, page_content='Birds are fascinating creatures with a wide range of behaviors and species.'),
  1.578450322151184),
 (Document(metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyality and friendliness.'),
  1.5882408618927002),
 (Document(metadata={'source': 'aquatic-animals-doc'}, page_content='Fish are aquatic animals that are omnivores, meaning they eat a variety of foods.'),
  1.6116491556167603)]

In [20]:
## Retrievers
from typing import List

from langchain_core.documents import Document
from langchain_core.runnables import RunnableLambda


retriver = RunnableLambda(vectorstore.similarity_search).bind(k=1)

retriver.batch(['cat','dog'])


[[Document(metadata={'source': 'mammal-pets-doc'}, page_content='Cats are intelligent and playful creatures, often seen as a symbol of companionship and happiness.')],
 [Document(metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyality and friendliness.')]]

# Vector store implementation with as_retriever



In [21]:
retriever = vectorstore.as_retriever(
    search_type = 'similarity',
    search_kwargs = {'k':1}
)

retriever.batch(['cat','dog'])

[[Document(metadata={'source': 'mammal-pets-doc'}, page_content='Cats are intelligent and playful creatures, often seen as a symbol of companionship and happiness.')],
 [Document(metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyality and friendliness.')]]

In [23]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

message = """
Answer this question using the provided context only.

{question}

Context: {context}

"""

prompt = ChatPromptTemplate.from_messages([('human', message)])

rag_chain = {'context':retriever, "question":RunnablePassthrough()}|prompt|llm

response = rag_chain.invoke('tell me about dogs')
print(response.content)

According to the provided context, dogs are great companions, known for their loyalty and friendliness.
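
The chain above starts with a dict, which sends the same input string to two branches: the retriever fills `context`, and `RunnablePassthrough()` forwards the question unchanged. A minimal plain-Python sketch of that data flow follows. `fake_retriever`, `CORPUS`, and `build_prompt` are hypothetical stand-ins, and the keyword lookup replaces the real vector search.

```python
TEMPLATE = (
    "Answer this question using the provided context only.\n\n"
    "{question}\n\nContext: {context}\n"
)

CORPUS = {
    "dogs": "Dogs are great companions.",
    "cats": "Cats are intelligent and playful creatures.",
}


def fake_retriever(question: str) -> list[str]:
    """Stand-in retriever: returns texts whose key appears in the question."""
    return [text for key, text in CORPUS.items() if key in question.lower()]


def build_prompt(question: str) -> str:
    # The dict step: the same input feeds both branches.
    inputs = {"context": fake_retriever(question), "question": question}
    # The prompt step: fill the template with both values.
    return TEMPLATE.format(**inputs)


prompt_text = build_prompt("tell me about dogs")
print(prompt_text)
```

The model only ever sees the filled-in template, so the quality of the answer depends directly on what the retriever returns for `context`.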
