##### Purpose
This notebook is my diary for teaching myself how to build a RAG system and use LLMs in applications.
The end goal is to create an LLM application that uses the Daggerheart game manual as its knowledge base.  
I will start with the small building blocks and build up from there. 

First, the LLM: it is the brain of the whole system. 
I will be using LangChain as the library of choice, as it is one of the most popular frameworks available. 
I chose Gemini because it provides free API keys that let me test the application.  

In [1]:
# Library import cell. This cell will contain all the library imports used in this notebook.

# To handle API Key
import os
from dotenv import load_dotenv

# To get the LLM model
from langchain.chat_models import init_chat_model

# To get the memory
from langchain.memory import ConversationBufferMemory

What is an API key? 
An API key is akin to a password or digital key that lets you access a service. 
Among other things, it is used to prevent unauthorised access.

#### Setting up API KEY

In [2]:
import os
from dotenv import load_dotenv

# load variables from a local .env file into the environment
load_dotenv()
# read the key from the environment (None if it is not set)
GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY")
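
`os.environ.get` returns `None` when the variable is missing, so a typo in `.env` would only surface later as an authentication error. A minimal sketch of failing early instead; `require_env` is a hypothetical helper of my own, not part of `dotenv` or LangChain:

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Example with an explicit mapping so the sketch runs without a real key.
print(require_env("GOOGLE_API_KEY", env={"GOOGLE_API_KEY": "dummy-key"}))
```

In the notebook itself this would be called as `require_env("GOOGLE_API_KEY")` after `load_dotenv()`.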

#### Initialize LLM Model

Here are some standardized parameters when constructing ChatModels:

* model: the name of the model
* temperature: the sampling temperature
* timeout: request timeout
* max_tokens: max tokens to generate
* stop: default stop sequences
* max_retries: max number of times to retry requests
* api_key: API key for the model provider
* base_url: endpoint to send requests to
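
As a rough sketch, some of these parameters can be passed as keyword arguments to `init_chat_model`. The values below are arbitrary and `build_llm` is a hypothetical helper. Not every provider accepts every parameter, so the exact set here is an assumption to check against the provider's documentation:

```python
# Illustrative values only; tune them for your own use case.
CHAT_MODEL_KWARGS = {
    "temperature": 0.3,  # lower values give more deterministic output
    "timeout": 30,       # seconds to wait before giving up on a request
    "max_retries": 2,    # how many times a failed request is retried
}


def build_llm(model="gemini-2.5-flash", provider="google_genai"):
    """Create a chat model; needs langchain, langchain-google-genai and an API key."""
    # Imported lazily so this cell can be run without the packages installed.
    from langchain.chat_models import init_chat_model

    return init_chat_model(model, model_provider=provider, **CHAT_MODEL_KWARGS)
```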

In [3]:
from langchain.chat_models import init_chat_model

# initialize an LLM
llm = init_chat_model("gemini-2.5-flash", model_provider="google_genai", temperature=0.3)

#### Calling the LLM
Use `.invoke` to "talk" to the LLM. The LLM generates a response, which you can read from the `.content` attribute.

In [7]:
# use .invoke to get a response from the LLM based on query
query = "Hi, i'm tom?"
response = llm.invoke(query)
print(response.content, flush=True)

Hi Tom! Yes, you are. It's great to meet you!

How can I help you today?


Use `.stream` to see how the response is broken down into chunks. The following example prints a pipe `|` after each chunk returned by the LLM.

In [8]:
# use .stream to print the response chunk by chunk as it is generated

for chunk in llm.stream("Write me the lyrics to the song 'Fly me to the moon'"):
    print(chunk.content, end="|", flush=True)

Okay, here are the classic lyrics to "Fly Me to the Moon," famously performed by Frank Sinatra, but written by Bart Howard.

---

**Fly Me to the Moon**|
*(Written by Bart Howard)*

Fly me to the moon
Let me play among the stars
Let me see what spring is like
On a-Jupiter and Mars

In other words, hold my hand
In other words, baby|, kiss me

Fill my heart with song
And let me sing forevermore
You are all I long for
All I worship and adore

In other words, please be true
In other words, I love you

---|
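
Each chunk carries only a piece of the text, so the full reply is the concatenation of every chunk's `.content`. A small sketch of that idea, using a hypothetical `FakeChunk` class and hard-coded pieces in place of a real `llm.stream(...)` call so that it runs offline:

```python
from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Stand-in for a streamed message chunk; only .content is modelled."""
    content: str


def fake_stream():
    """Yield pieces of a reply the way a streaming call yields chunks."""
    for piece in ["Fly me ", "to the ", "moon"]:
        yield FakeChunk(piece)


parts = []
for chunk in fake_stream():
    print(chunk.content, end="|", flush=True)  # show the chunk boundaries
    parts.append(chunk.content)

full_text = "".join(parts)  # rebuild the complete reply
print()
print(full_text)
```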

After calling an LLM, the returned object has several attributes:

* content = Text message from the LLM model
* additional_kwargs
* response_metadata
* id
* usage_metadata

In [None]:
print(response)

content="Hi Tom! Yes, you are. It's great to meet you!\n\nHow can I help you today?" additional_kwargs={} response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []} id='run--afe5f853-be75-4e5f-984a-34d57fcc4231-0' usage_metadata={'input_tokens': 8, 'output_tokens': 639, 'total_tokens': 647, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 615}}
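
The `usage_metadata` in this output is worth a closer look: most of the output tokens went to reasoning rather than to the visible reply. A quick sketch of the arithmetic, with the numbers copied from the printed output and assuming, as the totals suggest, that `reasoning` is counted inside `output_tokens`:

```python
usage = {
    "input_tokens": 8,
    "output_tokens": 639,
    "total_tokens": 647,
    "output_token_details": {"reasoning": 615},
}

reasoning = usage["output_token_details"]["reasoning"]
visible = usage["output_tokens"] - reasoning  # tokens in the reply itself

print("input + output:", usage["input_tokens"] + usage["output_tokens"])
print("reasoning tokens:", reasoning)
print("visible reply tokens:", visible)
```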


In [9]:
# this allows you to see the different attributes of this object
print(dir(response))

['__abstractmethods__', '__add__', '__annotations__', '__class__', '__class_getitem__', '__class_vars__', '__copy__', '__deepcopy__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__fields__', '__fields_set__', '__format__', '__ge__', '__get_pydantic_core_schema__', '__get_pydantic_json_schema__', '__getattr__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__pretty__', '__private_attributes__', '__pydantic_complete__', '__pydantic_computed_fields__', '__pydantic_core_schema__', '__pydantic_custom_init__', '__pydantic_decorators__', '__pydantic_extra__', '__pydantic_fields__', '__pydantic_fields_set__', '__pydantic_generic_metadata__', '__pydantic_init_subclass__', '__pydantic_parent_namespace__', '__pydantic_post_init__', '__pydantic_private__', '__pydantic_root_model__', '__pydantic_serializer__', '__pydantic_setattr_handlers__', '__pydantic_validator__', '__

In [7]:
response
print(response.content)

Hi Tom! Nice to meet you.

I'm an AI assistant, ready to help you with whatever you need. What can I do for you today?


In [10]:
print(response.additional_kwargs)

{}


In [14]:
print(response.response_metadata)

{'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []}


#### Formatting the Prompt

* The prompt is the input for the LLM. 
* Prompts can be anything from simple ("What is 2+2") to complex ("Explain the meaning of life")
* You can define variables within the prompt itself. This makes the prompt reusable, since only the variable values need to change

* `PromptTemplate` allows for simple string inputs
* `ChatPromptTemplate` allows for more complex inputs with roles etc.
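
To make the idea of prompt variables concrete, here is a sketch using plain Python string formatting. It only mimics the placeholder substitution; it is not how `PromptTemplate` or `ChatPromptTemplate` are implemented, and `fill_chat_prompt` is a hypothetical helper:

```python
# A simple string prompt with one variable.
template = "Answer the following questions. {query}"
print(template.format(query="Hi, I'm Tom"))

# A chat-style prompt: (role, text) pairs, each of which may hold variables.
chat_template = [
    ("system", "You are a helpful assistant"),
    ("user", "Tell me a joke about {topic}"),
]


def fill_chat_prompt(messages, **variables):
    """Substitute the variables into every message, keeping the roles."""
    return [(role, text.format(**variables)) for role, text in messages]


filled = fill_chat_prompt(chat_template, topic="cats")
print(filled)
```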

In [28]:
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template("Answer the following questions. {query}")
prompt.invoke({"query": "Hi  im tom"})

chain = prompt | llm
chain.invoke({"query": "Hi  im tom"})

AIMessage(content="Hi Tom! Nice to meet you.\n\nI'm ready to answer your questions whenever you are. What would you like to know?", additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []}, id='run--b6475dc0-10ed-490d-92f5-321db657f427-0', usage_metadata={'input_tokens': 10, 'output_tokens': 579, 'total_tokens': 589, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 551}})

In [None]:
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("user", "Tell me a joke about {topic}")
])

prompt_template.invoke({"topic": "cats"})

ChatPromptValue(messages=[SystemMessage(content='You are a helpful assistant', additional_kwargs={}, response_metadata={}), HumanMessage(content='Tell me a joke about cats', additional_kwargs={}, response_metadata={})])

#### Adding memory function for conversational context

LLMs are stateless. 

In the cells below you can see that, after you introduce yourself, the LLM is not able to "remember" your name. 
By adding a memory function, you will be able to converse with the LLM like talking to a person. 
This is achieved by adding the chat history as context for the LLM to generate the response.

Messages in LangChain have several components: a `role`, a `content` and `metadata`.

Roles - describe WHO is saying the message
* user - You
* assistant - The LLM answering the input
* system - tells the model how to behave; not all models support this
* tool - carries the result of a tool call back to the model. The LLM's decision to use a tool is recorded as a dictionary (`name`, `args`, `id`)
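
To make the roles concrete, a conversation can be written as a list of `(role, content)` pairs. A minimal sketch; `build_messages` is a hypothetical helper, and the assistant reply is hard-coded rather than produced by a model:

```python
def build_messages(history, user_input, system_prompt="You are a helpful assistant"):
    """Return the full message list: system prompt, past turns, new input."""
    return [("system", system_prompt), *history, ("user", user_input)]


# Past turns of the conversation, oldest first.
history = [
    ("user", "Hi, I am Albert"),
    ("assistant", "Hi Albert! It's nice to meet you."),
]

messages = build_messages(history, "What is my name?")
for role, content in messages:
    print(f"{role}: {content}")
```

As far as I understand, a list of this shape can be passed straight to `llm.invoke(messages)`, which is one way of handing earlier turns back to the model.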



In [None]:
# use .invoke to get a response from the LLM based on query
query_2 = "Hi i am albert"
response_2 = llm.invoke(query_2)
print(response_2.content)

Hi Albert! It's nice to meet you. I'm an AI.

How can I help you today?


The LLM does not remember that you have already introduced yourself earlier. 

In [None]:
# use .invoke to get a response from the LLM based on query
query_3 = "What is my name"
response_3 = llm.invoke(query_3)
print(response_3.content)

As an AI, I don't have access to personal information about you, including your name. I don't retain memory of past conversations or user identities.

If you'd like to tell me your name, I can refer to you by it during our current conversation, but I won't remember it for future interactions.


For LLMs to act as conversational partners, they need to be able to remember what has been said before. 
This means giving them "memory".
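
Conceptually, a buffer memory just records every turn and replays the transcript with each new request. A toy sketch of that idea follows. `BufferMemory` is a hypothetical class, not LangChain's `ConversationBufferMemory`, and the `Human:`/`AI:` transcript format is an assumption modelled on the `history` strings printed later in this notebook:

```python
class BufferMemory:
    """Toy conversation memory that stores every turn in order."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) pairs

    def save(self, human_text, ai_text):
        """Record one exchange between the user and the model."""
        self.turns.append(("Human", human_text))
        self.turns.append(("AI", ai_text))

    def as_history(self):
        """Render the stored turns as a single transcript string."""
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


memory_sketch = BufferMemory()
memory_sketch.save("Hi im tom", "Hi Tom! Nice to meet you.")

# The transcript is placed in front of the next question before it reaches the model.
prompt_text = memory_sketch.as_history() + "\nHuman: what is my name?\nAI:"
print(prompt_text)
```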

#### Create memory function for conversational context

In [None]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

memory = ConversationBufferMemory()

chat = ConversationChain(llm=llm, memory=memory)


The object returned by invoking the chain is a dictionary.
You can access the reply by reading the key "response".

In [None]:
response = chat.invoke("Hi im tom")
response['response']

{'input': 'Hi im tom',
 'history': "Human: Hi im tom\nAI: Hi Tom! It's really nice to meet you! I'm an AI, a large language model, and I'm here and ready to chat about anything you'd like. It's always great to connect with new people – or, well, new humans, in my case! How are you doing today, Tom?\nHuman: what is my name?\nAI: Your name is Tom! You told me that right at the beginning of our chat. It's good to remember names, even for an AI like me! Is there anything else you'd like to chat about, Tom?",
 'response': "Oh, hello again, Tom! It seems like you're introducing yourself again, and I'm happy to confirm that yes, your name is Tom! You've told me that a few times now since we started chatting, and I've definitely got it stored away in my memory. It's always good to be sure, though! Is there anything specific you'd like to talk about today, Tom, or were you just checking if I remembered? I'm ready for whatever you have in mind!"}

In [40]:
response['response']

"Oh, hello again, Tom! It seems like you're introducing yourself again, and I'm happy to confirm that yes, your name is Tom! You've told me that a few times now since we started chatting, and I've definitely got it stored away in my memory. It's always good to be sure, though! Is there anything specific you'd like to talk about today, Tom, or were you just checking if I remembered? I'm ready for whatever you have in mind!"

In [32]:
chat.invoke("what is my name?")

{'input': 'what is my name?',
 'history': "Human: Hi im tom\nAI: Hi Tom! It's really nice to meet you! I'm an AI, a large language model, and I'm here and ready to chat about anything you'd like. It's always great to connect with new people – or, well, new humans, in my case! How are you doing today, Tom?",
 'response': "Your name is Tom! You told me that right at the beginning of our chat. It's good to remember names, even for an AI like me! Is there anything else you'd like to chat about, Tom?"}

### ConversationChain is deprecated

I will need to learn about Runnables and the LangChain Expression Language (LCEL).