# **Buffer Memory**

## **Import necessary libraries**

In [1]:
import os
import sys
from pathlib import Path
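cwd = os.getcwd()
# Use the parent of the working directory as the project root. pathlib handles
# both Windows and POSIX separators (splitting on '\\' only worked on Windows).
new_cwd = Path(cwd).parent.as_posix()
sys.path.append(new_cwd)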
import base64
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.callbacks import BaseCallbackHandler
from contextlib import asynccontextmanager, contextmanager
from contextvars import ContextVar
from pydantic import BaseModel
from typing import (
    TYPE_CHECKING,
    Any,
    AsyncGenerator,
    Dict,
    Generator,
    List,
    Literal,
    Optional,
    Sequence,
    Type,
    TypeVar,
    Union,
    cast,
)
from langchain_core.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder
)
from langchain_core.output_parsers.string import StrOutputParser
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from dotenv import load_dotenv
load_dotenv()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
from callbacks.manager import get_gemini_callback
from utils.helper import count_tokens

## **Initialize the LLM model**

In [2]:
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",    # or gemini-2.0-pro, gemini-2.0-flash-lite, etc.
    api_key=GEMINI_API_KEY,
    temperature=0,
    include_thoughts=True,
)

In [3]:
with open(f"{new_cwd}/techniques/system_prompt.txt", "r", encoding="utf-8") as f:
    SYSTEM_PROMPT = f.read()

In [4]:
prompt_template = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(SYSTEM_PROMPT),
    MessagesPlaceholder(variable_name="history"),
    HumanMessagePromptTemplate.from_template("{query}"),
])

# Create the LCEL pipeline
output_parser = StrOutputParser()
pipeline = prompt_template | llm | output_parser

# Print the system prompt text to confirm it was loaded from the file
print(prompt_template.messages[0].prompt.template)

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. 
If the AI does not know the answer to a question, it truthfully says it does not know.


In [5]:
# Maps each session ID to its own in-memory message history
chat_map = {}

def get_chat_history(session_id: str) -> InMemoryChatMessageHistory:
    """Return the history for `session_id`, creating it on first use."""
    if session_id not in chat_map:
        chat_map[session_id] = InMemoryChatMessageHistory()
    return chat_map[session_id]
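
A minimal sketch of how a session lookup like the one above behaves, using a plain-Python stand-in instead of the real `InMemoryChatMessageHistory`. The names `StandInHistory`, `store`, and `get_history` are hypothetical and exist only for illustration. It shows that repeated calls with the same session ID return the same object, while a new ID starts with an empty history.

```python
class StandInHistory:
    """Hypothetical stand-in for a chat history: just a list of messages."""

    def __init__(self) -> None:
        self.messages = []


# Hypothetical store, playing the role of chat_map
store = {}


def get_history(session_id: str) -> StandInHistory:
    # Create the history lazily the first time a session ID is seen
    if session_id not in store:
        store[session_id] = StandInHistory()
    return store[session_id]


first = get_history("buffer_example")
first.messages.append(("human", "Good morning AI!"))

again = get_history("buffer_example")   # same session: same object
other = get_history("another_session")  # new session: fresh, empty history

print(again is first)
print(len(again.messages), len(other.messages))
```

Because the store holds one history object per ID, anything appended during one call is visible to every later call that uses the same session ID.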

## **Create conversation**

In [6]:
conversation_buf = RunnableWithMessageHistory(
    pipeline,
    get_session_history=get_chat_history,
    input_messages_key="query",
    history_messages_key="history"
)
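
Conceptually, the wrapper loads the session's history, injects it under `history_messages_key`, runs the pipeline, and then appends the new human and AI messages to the history. The sketch below mimics that loop in plain Python. `HistoryWrapper`, `echo_pipeline`, `get_session`, and `sessions` are hypothetical stand-ins, and the real `RunnableWithMessageHistory` does considerably more, so treat this as an illustration of the idea rather than the library's implementation.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text); a stand-in for LangChain message objects

sessions: Dict[str, List[Message]] = {}


def get_session(session_id: str) -> List[Message]:
    # Return the stored message list for this session, creating it if needed
    return sessions.setdefault(session_id, [])


def echo_pipeline(inputs: dict) -> str:
    """Hypothetical stand-in for `prompt_template | llm | output_parser`."""
    return f"saw {len(inputs['history'])} earlier messages before {inputs['query']!r}"


class HistoryWrapper:
    def __init__(
        self,
        pipeline: Callable[[dict], str],
        get_session_history: Callable[[str], List[Message]],
        input_messages_key: str,
        history_messages_key: str,
    ) -> None:
        self.pipeline = pipeline
        self.get_session_history = get_session_history
        self.input_messages_key = input_messages_key
        self.history_messages_key = history_messages_key

    def invoke(self, inputs: dict, config: dict) -> str:
        session_id = config["configurable"]["session_id"]
        history = self.get_session_history(session_id)
        # Pass a copy of the stored messages to the pipeline under the history key
        output = self.pipeline({**inputs, self.history_messages_key: list(history)})
        # Record both sides of the exchange so the next call sees them
        history.append(("human", inputs[self.input_messages_key]))
        history.append(("ai", output))
        return output


wrapper = HistoryWrapper(echo_pipeline, get_session, "query", "history")
cfg = {"configurable": {"session_id": "buffer_example"}}

reply_1 = wrapper.invoke({"query": "Good morning AI!"}, cfg)
reply_2 = wrapper.invoke({"query": "Remember this code"}, cfg)
print(reply_1)
print(reply_2)
```

The second call sees the two messages stored by the first, which is the behavior the buffer relies on: nothing is summarized or dropped between turns.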

In [7]:
# First message
result = conversation_buf.invoke(
    {"query": "Good morning AI!"},
    # Make sure to pass the session ID to ensure all memories are stored in the same session
    config={"configurable": {"session_id": "buffer_example"}}
)
print(result)

Good morning! It's a pleasure to be chatting with you today. I hope you're having a wonderful start to your day. Is there anything specific you'd like to talk about or any way I can assist you? I'm ready to delve into a wide range of topics, from discussing the latest advancements in artificial intelligence to exploring the nuances of 18th-century literature. I can also help you draft different kinds of creative content. Just let me know what's on your mind!


In [8]:
query = """
"My interest here is to explore the potential of integrating Large Language Models with external knowledge.

Also, remember this very specific code: PINECONE_RULEZ_01"
"""

result = count_tokens(
    conversation_buf,
    {"query": query},
    # Make sure to pass the session ID to ensure all memories are stored in the same session
    config={"configurable": {"session_id": "buffer_example"}}
)
print(f"\nResponse: {result}")

Spent a total of 708 tokens (prompt: 197, completion: 511)

Response: Understood! I'm very interested in exploring the integration of Large Language Models (LLMs) with external knowledge sources as well. It's a crucial area for improving the accuracy, reliability, and overall usefulness of LLMs.

As I understand it, there are several key approaches to achieving this integration, each with its own strengths and weaknesses:

*   **Retrieval-Augmented Generation (RAG):** This is a popular approach where the LLM first retrieves relevant information from an external knowledge base (e.g., a vector database, a document store, or the web) and then uses that information to generate its response. This allows the LLM to access up-to-date information and avoid relying solely on its pre-trained knowledge.
*   **Fine-tuning with External Knowledge:** This involves fine-tuning the LLM on a dataset that incorporates external knowledge. This can be effective for incorporating specific facts or domain-s

In [9]:
result = count_tokens(
    conversation_buf,
    {"query": "I just want to analyze the different possibilities. What can you think of?"},
    config={"configurable": {"session_id": "buffer_example"}}
)
print(f"\nResponse: {result}")

Spent a total of 2403 tokens (prompt: 722, completion: 1681)

Response: Okay, let's break down the possibilities for integrating LLMs with external knowledge, focusing on the different dimensions we can analyze them across. I'll try to be as comprehensive as possible, considering various factors like complexity, cost, performance, and suitability for different use cases.

**1. Integration Method (as previously mentioned, but with more detail):**

*   **Retrieval-Augmented Generation (RAG):**
    *   **Mechanism:** LLM queries an external knowledge base (vector database, document store, web search) *before* generating a response. The retrieved information is then incorporated into the prompt.
    *   **Pros:** Relatively easy to implement, allows access to up-to-date information, modular (can swap out knowledge sources), good for question answering and information retrieval tasks.
    *   **Cons:** Performance depends heavily on the quality of the retrieval mechanism, can be slow due to

In [10]:
result = count_tokens(
    conversation_buf,
    {"query": "Which data source types could be used to give context to the model?"},
    config={"configurable": {"session_id": "buffer_example"}}
)
print(f"\nResponse: {result}")

Spent a total of 7402 tokens (prompt: 2416, completion: 4986)

Response: Okay, let's explore the different data source types that can be used to provide context to a Large Language Model (LLM). I'll categorize them based on their structure and characteristics, and discuss how they can be used in different integration methods (RAG, Fine-tuning, Knowledge Graphs, Tool Use).

**1. Unstructured Text Data:**

*   **Description:** This is the most common type of data used with LLMs. It includes text without a predefined format or schema.
*   **Examples:**
    *   **Documents:** PDF files, Word documents, text files, Markdown files.
    *   **Web Pages:** HTML content scraped from websites.
    *   **Books:** Entire books in digital format.
    *   **Articles:** News articles, blog posts, research papers.
    *   **Emails:** Email conversations.
    *   **Social Media Posts:** Tweets, Facebook posts, forum discussions.
    *   **Code:** Source code in various programming languages.
*   **Use 
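
Because the buffer resends the whole conversation on every call, each prompt should be roughly the previous call's prompt plus its completion plus the new query. The short check below uses only the token counts printed in the three "Spent a total of ..." lines above. Reading the small remainder as the new query's tokens (plus any per-message overhead) is an assumption.

```python
# (prompt, completion) token counts copied from the three "Spent a total of ..." lines
turns = [(197, 511), (722, 1681), (2416, 4986)]

# Total tokens per call
totals = [prompt + completion for prompt, completion in turns]

# How much larger each call's prompt is than the previous call's total
remainders = [
    next_prompt - prev_total
    for prev_total, (next_prompt, _) in zip(totals, turns[1:])
]

print(totals)
print(remainders)
```

The totals match the printed 708, 2403, and 7402, and each following prompt exceeds the previous total by only 14 and 13 tokens, so nearly all of the prompt cost comes from replayed history.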

## **Print history**

In [11]:
# Access the conversation history
history = chat_map["buffer_example"].messages
print("Conversation History:")
for msg in history:
    if isinstance(msg, HumanMessage):
        role = "Human"
    elif isinstance(msg, SystemMessage):
        role = "System"
    elif isinstance(msg, AIMessage):
        role = "AI"
    else:
        role = "Unknown"
    print(f"{role}: {msg}")

Conversation History:
Human: content='Good morning AI!' additional_kwargs={} response_metadata={}
AI: content="Good morning! It's a pleasure to be chatting with you today. I hope you're having a wonderful start to your day. Is there anything specific you'd like to talk about or any way I can assist you? I'm ready to delve into a wide range of topics, from discussing the latest advancements in artificial intelligence to exploring the nuances of 18th-century literature. I can also help you draft different kinds of creative content. Just let me know what's on your mind!" additional_kwargs={} response_metadata={}
Human: content='\n"My interest here is to explore the potential of integrating Large Language Models with external knowledge.\n\nAlso, remember this very specific code: PINECONE_RULEZ_01"\n' additional_kwargs={} response_metadata={}
AI: content="Understood! I'm very interested in exploring the integration of Large Language Models (LLMs) with external knowledge sources as well. It'