# **Buffer Memory**

## **Import necessary libraries**

In [1]:
import os
import sys
from pathlib import Path
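# Add the project root (two levels above the notebook directory) to sys.path
# so the local `callbacks` and `utils` packages can be imported.
# pathlib handles both Windows and POSIX path separators.
new_cwd = Path.cwd().parents[1].as_posix()
sys.path.append(new_cwd)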
import base64
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.callbacks import BaseCallbackHandler
from contextlib import asynccontextmanager, contextmanager
from contextvars import ContextVar
from pydantic import BaseModel
from typing import Literal
from typing import (
    TYPE_CHECKING,
    Any,
    AsyncGenerator,
    Dict,
    Generator,
    List,
    Optional,
    Sequence,
    Type,
    TypeVar,
    Union,
    cast,
)
from langchain_core.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder
)
from langchain_core.output_parsers.string import StrOutputParser
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
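from dotenv import load_dotenv

# Load environment variables from a local .env file and read the Gemini API key
load_dotenv()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")

# Project-local helpers, importable because the project root was added to sys.path
from callbacks.manager import get_gemini_callback
from utils.helper import count_tokens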

## **Initialize the LLM**

In [2]:
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",    # or gemini-2.0-pro, gemini-2.0-flash-lite, etc.
    api_key=GEMINI_API_KEY,
    temperature=0,
    include_thoughts=True,
)

In [3]:
with open(f"{new_cwd}/techniques/system_prompt.txt", "r", encoding="utf-8") as f:
    SYSTEM_PROMPT = f.read()

In [4]:
prompt_template = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(SYSTEM_PROMPT),
    MessagesPlaceholder(variable_name="history"),
    HumanMessagePromptTemplate.from_template("{query}"),
])

# Create the LCEL pipeline
output_parser = StrOutputParser()
pipeline = prompt_template | llm | output_parser

# Let's examine the prompt template
print(prompt_template.messages[0].prompt.template)

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. 
If the AI does not know the answer to a question, it truthfully says it does not know.


In [5]:
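# Maps each session ID to its own in-memory chat history
chat_map: dict[str, InMemoryChatMessageHistory] = {}


def get_chat_history(session_id: str) -> InMemoryChatMessageHistory:
    """Return the chat history for session_id, creating it on first use."""
    if session_id not in chat_map:
        # First time this session ID is seen: start with an empty history
        chat_map[session_id] = InMemoryChatMessageHistory()
    return chat_map[session_id]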

## **Create conversation**

In [6]:
conversation_buf = RunnableWithMessageHistory(
    pipeline,
    get_session_history=get_chat_history,
    input_messages_key="query",
    history_messages_key="history"
)
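
Conceptually, each call through the wrapper loads the stored messages for the session, fills the `history` placeholder, runs the pipeline, and then appends the new human and AI messages to the store. The sketch below imitates that cycle in plain Python. `SESSION_STORE`, `fake_llm`, and `invoke_with_history` are hypothetical stand-ins for illustration only, not LangChain's implementation.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)

# Hypothetical in-memory store: one message list per session ID
SESSION_STORE: Dict[str, List[Message]] = {}


def fake_llm(messages: List[Message]) -> str:
    """Stand-in for a chat model: reports how many messages it received."""
    return f"I received {len(messages)} messages."


def invoke_with_history(
    query: str,
    session_id: str,
    system_prompt: str = "You are a helpful assistant.",
    llm: Callable[[List[Message]], str] = fake_llm,
) -> str:
    """Run one turn: load history, build the prompt, call the model, save the turn."""
    history = SESSION_STORE.setdefault(session_id, [])
    # Prompt layout mirrors the template: system message, past turns, new query
    prompt = [("system", system_prompt), *history, ("human", query)]
    answer = llm(prompt)
    # Buffer memory keeps every turn verbatim, so the history grows on each call
    history.append(("human", query))
    history.append(("ai", answer))
    return answer


first = invoke_with_history("Good morning AI!", "buffer_example")
second = invoke_with_history("What did I just say?", "buffer_example")
other = invoke_with_history("Hello", "another_session")
print(first)
print(second)
print(other)
```

The second call in the same session sees the earlier turn, while a different session ID starts from an empty history.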

In [7]:
# First message
result = conversation_buf.invoke(
    {"query": "Good morning AI!"},
    # Reuse the same session ID on every call so all turns are stored in the same history
    config={"configurable": {"session_id": "buffer_example"}}
)
print(result)

Good morning! It's a pleasure to be chatting with you today. I hope you're having a wonderful start to your day. Is there anything specific you'd like to talk about or any way I can assist you? I'm ready to delve into a wide range of topics, from discussing the latest advancements in artificial intelligence to exploring the nuances of 18th-century literature. I can also help you draft different kinds of creative content. Just let me know what's on your mind!


In [8]:
query = """
"My interest here is to explore the potential of integrating Large Language Models with external knowledge.

Also, remember this very specific code: PINECONE_RULEZ_01"
"""

result = count_tokens(
    conversation_buf,
    {"query": query},
    # Reuse the same session ID on every call so all turns are stored in the same history
    config={"configurable": {"session_id": "buffer_example"}}
)
print(f"\nResponse: {result}")

Spent a total of 784 tokens (prompt: 197, completion: 587)

Response: Understood! I'm very interested in exploring the integration of Large Language Models (LLMs) with external knowledge sources as well. It's a crucial area for improving the accuracy, reliability, and overall usefulness of LLMs.

As I understand it, there are several key approaches to achieving this integration, each with its own strengths and weaknesses:

1.  **Retrieval-Augmented Generation (RAG):** This is a popular approach where the LLM first retrieves relevant information from an external knowledge base (e.g., a vector database, a search engine, or a document store) and then uses that information to generate its response. This allows the LLM to access up-to-date information and avoid generating responses based solely on its pre-trained knowledge, which can be outdated or incomplete.

    *   **Example:** A user asks, "What is the current weather in London?" The LLM would first query a weather API or a search engi

In [9]:
result = count_tokens(
    conversation_buf,
    {"query": "I just want to analyze the different possibilities. What can you think of?"},
    config={"configurable": {"session_id": "buffer_example"}}
)
print(f"\nResponse: {result}")

Spent a total of 6471 tokens (prompt: 798, completion: 5673)

Response: Okay, let's break down the possibilities for integrating LLMs with external knowledge, focusing on the practical considerations, advantages, and disadvantages of each approach. I'll try to be as comprehensive as possible.

**1. Retrieval-Augmented Generation (RAG)**

*   **Description:** The LLM uses an external knowledge source to retrieve relevant information *before* generating a response. This retrieved information is then incorporated into the prompt, allowing the LLM to ground its answer in external data.
*   **Knowledge Source Examples:**
    *   **Vector Databases (e.g., Pinecone, Weaviate, Chroma):** Store embeddings of text, allowing for semantic similarity search. Ideal for unstructured data like documents, articles, and web pages.
    *   **Traditional Databases (e.g., PostgreSQL, MySQL):** Store structured data. Requires careful query construction to retrieve relevant information.
    *   **Search Engi

In [10]:
result = count_tokens(
    conversation_buf,
    {"query": "Which data source types could be used to give context to the model?"},
    config={"configurable": {"session_id": "buffer_example"}}
)
print(f"\nResponse: {result}")

Spent a total of 8816 tokens (prompt: 6484, completion: 2332)

Response: Okay, let's dive into the specific data source types that can be used to provide context to an LLM. I'll categorize them and provide examples to illustrate their use. The choice of data source depends heavily on the use case and the type of information you want the LLM to access.

**I. Unstructured Data Sources:**

These sources contain data that doesn't have a predefined format or organization. They often require more sophisticated processing techniques to extract meaningful information.

*   **1. Text Documents:**
    *   **Description:** A broad category encompassing various text-based files.
    *   **Examples:**
        *   **PDF Files:** Research papers, reports, manuals, legal documents.
        *   **Word Documents (.docx, .doc):** Articles, essays, letters, memos.
        *   **Plain Text Files (.txt):** Configuration files, log files, code files, notes.
        *   **Ebooks (.epub, .mobi):** Novels, text
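
The token counts printed by the three `count_tokens` calls show the cost of buffer memory: because the full history is resent on every call, each prompt is roughly the previous prompt plus the previous completion plus the new query. The check below uses only the numbers reported in this notebook. The variable names are illustrative, and treating the remainder as the size of the new query is an approximation.

```python
# (prompt_tokens, completion_tokens) as printed by the three count_tokens calls
turns = [(197, 587), (798, 5673), (6484, 2332)]

deltas = []
for (prev_prompt, prev_completion), (next_prompt, _) in zip(turns, turns[1:]):
    # Everything from the previous call is replayed as history in the next prompt
    carried_over = prev_prompt + prev_completion
    # The remainder is approximately what the new question added
    delta = next_prompt - carried_over
    deltas.append(delta)
    print(f"carried over: {carried_over}, next prompt: {next_prompt}, difference: {delta}")
```

In both cases the next prompt exceeds the previous total by only a handful of tokens, which is consistent with the whole conversation being replayed on each turn.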

## **Print history**

In [11]:
# Access the conversation history stored for this session
history = chat_map["buffer_example"].messages
print("Conversation History:")
for msg in history:
    # Map each message class to a readable role label
    if isinstance(msg, HumanMessage):
        role = "Human"
    elif isinstance(msg, SystemMessage):
        role = "System"
    elif isinstance(msg, AIMessage):
        role = "AI"
    else:
        role = "Unknown"
    # Printing the message object shows all of its fields, not just the content
    print(f"{role}: {msg}")

Conversation History:
Human: content='Good morning AI!' additional_kwargs={} response_metadata={}
AI: content="Good morning! It's a pleasure to be chatting with you today. I hope you're having a wonderful start to your day. Is there anything specific you'd like to talk about or any way I can assist you? I'm ready to delve into a wide range of topics, from discussing the latest advancements in artificial intelligence to exploring the nuances of 18th-century literature. I can also help you draft different kinds of creative content. Just let me know what's on your mind!" additional_kwargs={} response_metadata={}
Human: content='\n"My interest here is to explore the potential of integrating Large Language Models with external knowledge.\n\nAlso, remember this very specific code: PINECONE_RULEZ_01"\n' additional_kwargs={} response_metadata={}
AI: content='Understood! I\'m very interested in exploring the integration of Large Language Models (LLMs) with external knowledge sources as well. It