In [1]:
import langchain
print(langchain.__version__)

0.3.23


In [53]:
from dotenv import load_dotenv
# Load API keys (e.g. GROQ_API_KEY, GOOGLE_API_KEY) from the project-level .env file
load_dotenv("../.env")

True

In [3]:
from langchain_groq import ChatGroq
from langchain_core.messages import HumanMessage

llm = ChatGroq(model="llama3-8b-8192")

response = llm.invoke([HumanMessage(content="Tell me a joke")])

print(response.content)

Here's one:

Why couldn't the bicycle stand up by itself?

(Wait for it...)

Because it was two-tired!

Hope that made you laugh!


In [4]:
from langchain_core.output_parsers import StrOutputParser

output_parser=StrOutputParser()
print(output_parser.invoke(response))

Here's one:

Why couldn't the bicycle stand up by itself?

(Wait for it...)

Because it was two-tired!

Hope that made you laugh!


In [5]:
chain = llm | output_parser
llm_response=chain.invoke("Tell me a joke!")

In [6]:
print(llm_response)

Here's one:

Why couldn't the bicycle stand up by itself?

Because it was two-tired!

Hope that made you laugh!


In [7]:
from typing import List
from pydantic import BaseModel, Field

class MobileReview(BaseModel):
    phone_model: str = Field(description="Name and model of the phone")
    rating: float = Field(description="Overall rating out of 5")
    pros: List[str] = Field(description="List of positive aspects")
    cons: List[str] = Field(description="List of negative aspects")
    summary: str = Field(description="Brief summary of the review")

review_text = """
Just got my hands on the new Galaxy S21 and wow, this thing is slick! The screen is gorgeous,
colors pop like crazy. Camera's insane too, especially at night - my Insta game's never been
stronger. Battery life's solid, lasts me all day no problem.
Not gonna lie though, it's pretty pricey. And what's with ditching the charger? C'mon Samsung.
Also, still getting used to the new button layout, keep hitting Bixby by mistake.
Overall, I'd say it's a solid 4 out of 5. Great phone, but a few annoying quirks keep it from
being perfect. If you're due for an upgrade, definitely worth checking out!
"""

structured_llm = llm.with_structured_output(MobileReview)
output = structured_llm.invoke(review_text)
print(output)

phone_model='Galaxy S21' rating=4.0 pros=['gorgeous screen', 'colors pop', 'insane camera', 'solid battery life'] cons=['pricey', 'no charger included', 'annoying button layout'] summary='A solid 4 out of 5, great phone but with a few quirks'


In [8]:
print(output.pros)

['gorgeous screen', 'colors pop', 'insane camera', 'solid battery life']


In [9]:
from langchain_core.prompts import ChatPromptTemplate
prompt=ChatPromptTemplate.from_template("Tell me a short joke about {topic}")
prompt.invoke({"topic":"programming"})

ChatPromptValue(messages=[HumanMessage(content='Tell me a short joke about programming', additional_kwargs={}, response_metadata={})])

In [10]:
chain = prompt | llm | output_parser
response=chain.invoke({"topic":"programmer"})

In [11]:
print(response)

Why do programmers prefer dark mode?

Because light attracts bugs.


In [12]:
from langchain_core.messages import HumanMessage, SystemMessage

system_message=SystemMessage(content="You are a helpful assistant that tells jokes.")
human_message=HumanMessage(content="Tell me about programming")

response=llm.invoke([system_message, human_message])

In [13]:
print(response.content)

Programming! It's like trying to figure out how to get a cat to do tricks for treats... but instead of treats, the cat is a computer and the tricks are code!

But seriously, programming is like solving a puzzle. You have to understand the rules of the game (the programming language), and then use that knowledge to create a solution that works.

Here's a joke to help you understand the concept of loops in programming:

Why did the programmer quit his job?

Because he didn't get arrays! (get a raise)

And here's one to help you understand the concept of conditional statements:

Why do programmers prefer dark mode?

Because light attracts bugs!

And finally, here's one to help you understand the concept of debugging:

Why do programmers prefer debugging to therapy?

Because debugging is like therapy, but with fewer personal questions!

I hope these jokes helped make programming more enjoyable for you!


In [14]:
template=ChatPromptTemplate(
    [
        ("system", "You are a helpful assistant that tells jokes."),
        ("human","Tell me about {topic}")
    ]
)

prompt_value=template.invoke(
    {
        "topic":"programming"
    }
)

prompt_value

ChatPromptValue(messages=[SystemMessage(content='You are a helpful assistant that tells jokes.', additional_kwargs={}, response_metadata={}), HumanMessage(content='Tell me about programming', additional_kwargs={}, response_metadata={})])

In [15]:
response=llm.invoke(prompt_value)

In [16]:
print(response.content)

Programming! It's like trying to explain a joke to a computer... it's a bit of a "buggy" process!

But seriously, programming is like writing a recipe for your computer. You give it a set of instructions (or "code") and it follows them to achieve a specific task or outcome. It's like baking a cake, except instead of flour and sugar, you're using 1s and 0s!

Here's a joke to "debug" your understanding:

Why did the programmer quit his job?

Because he didn't get arrays! (get a raise, haha)

Okay, okay, let me try again:

Why do programmers prefer dark mode?

Because light attracts bugs! (ahem, I mean, attention)

Programming can be a bit of a "loop" (get it?), but with practice and patience, you can become a master "algorithm-ist"!

Keep in mind, these jokes are just a "debugged" attempt to make programming more "byte-sized" and fun. But seriously, programming is an amazing field that can help create incredible innovations and solve real-world problems.


In [17]:
from langchain_community.document_loaders import PyPDFLoader, Docx2txtLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from typing import List
from langchain_core.documents import Document
import os

def load_documents(folder_path: str) -> List[Document]:
    """Load every .pdf and .docx file in folder_path; other file types are skipped."""
    documents = []
    for filename in os.listdir(folder_path):
        file_path = os.path.join(folder_path, filename)
        # Pick a loader based on the file extension
        if filename.endswith('.pdf'):
            loader = PyPDFLoader(file_path)
        elif filename.endswith('.docx'):
            loader = Docx2txtLoader(file_path)
        else:
            print(f"Unsupported file type: {filename}")
            continue
        # PyPDFLoader returns one Document per page
        documents.extend(loader.load())
    return documents

folder_path = "../data/"
documents = load_documents(folder_path)
print(f"Loaded {len(documents)} documents from the folder.")

Unsupported file type: company_profile.txt
Unsupported file type: employee_handbook.txt
Unsupported file type: it_support_policy.txt
Unsupported file type: meeting_guidelines.txt
Unsupported file type: product_faq.txt
Loaded 5 documents from the folder.


In [18]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len
)

splits = text_splitter.split_documents(documents)
print(f"Split the documents into {len(splits)} chunks.")

Split the documents into 5 chunks.
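
The roles of chunk_size and chunk_overlap can be illustrated with a plain sliding window. This is a simplified sketch, not the algorithm RecursiveCharacterTextSplitter uses (that splitter prefers to break on separators such as paragraph and line breaks); the function name sliding_window_chunks and the sample text are hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
print(chunks)

# A text shorter than chunk_size comes back as a single chunk.
print(sliding_window_chunks("short text", chunk_size=1000, chunk_overlap=200))
```

Getting 5 chunks from 5 loaded pages is consistent with every page fitting inside one 1000-character window, so the splitter had nothing to cut.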


In [19]:
print(documents[0])

page_content='Company Profile
Company Name: FutureTech Corp
Founded: 2012
Headquarters: San Francisco, CA
Employees: 1,200+
Mission:
To innovate intelligent solutions that simplify lives and empower industries through AI
and automation.
Flagship Products:
- Vision360 AI Camera
- AutoInsights Business Dashboard
- RoboHR Assistant
Awards:
- Best AI Startup, 2020 (TechCrush Awards)
- Top Workplace in Tech, 2022' metadata={'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'creator': 'PyPDF', 'creationdate': 'D:20250418155508', 'source': '../data/company_profile.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}


In [20]:
print(splits[0])

page_content='Company Profile
Company Name: FutureTech Corp
Founded: 2012
Headquarters: San Francisco, CA
Employees: 1,200+
Mission:
To innovate intelligent solutions that simplify lives and empower industries through AI
and automation.
Flagship Products:
- Vision360 AI Camera
- AutoInsights Business Dashboard
- RoboHR Assistant
Awards:
- Best AI Startup, 2020 (TechCrush Awards)
- Top Workplace in Tech, 2022' metadata={'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'creator': 'PyPDF', 'creationdate': 'D:20250418155508', 'source': '../data/company_profile.pdf', 'total_pages': 1, 'page': 0, 'page_label': '1'}


In [55]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")

document_embeddings=embeddings.embed_documents([split.page_content for split in splits])

print(f"Created embeddings for {len(document_embeddings)} document chunks.")

Created embeddings for 5 document chunks.


In [24]:
document_embeddings[0]

[-0.007752885576337576,
 0.012852292507886887,
 -0.006625112146139145,
 -0.008146611973643303,
 0.022754602134227753,
 -0.003702770220115781,
 0.0162301417440176,
 -0.0550101064145565,
 0.011062280274927616,
 0.030934615060687065,
 0.001954884035512805,
 0.02702544815838337,
 0.04583318904042244,
 0.016693878918886185,
 -0.027962416410446167,
 -0.017020881175994873,
 -0.010214210487902164,
 0.09666426479816437,
 -0.0599079467356205,
 -0.02552757039666176,
 -0.01877083256840706,
 -0.02183852531015873,
 0.015846911817789078,
 -0.028401892632246017,
 -0.04972561448812485,
 -0.03311712294816971,
 0.03311529383063316,
 0.01895793341100216,
 -0.033627744764089584,
 -0.05015123635530472,
 0.013188720680773258,
 0.05714269354939461,
 -0.03079942800104618,
 -0.05481507629156113,
 0.00046006529009900987,
 0.01226368360221386,
 0.018244436010718346,
 0.028454449027776718,
 0.05126721039414406,
 -0.0017024175031110644,
 0.002053704811260104,
 -0.01958993636071682,
 -0.04500453174114227,
 0.0573999

In [25]:
from langchain_chroma import Chroma

collection_name = "my_collection"
vectorstore = Chroma.from_documents(
    collection_name=collection_name,
    documents=splits,
    embedding=embeddings,
    persist_directory="./chroma_db"
)
print("Vector store created and persisted to './chroma_db'")

Vector store created and persisted to './chroma_db'


In [26]:
query = "When was FutureTech Corp founded?"
search_results = vectorstore.similarity_search(query, k=2)
print(f"\nTop 2 most relevant chunks for the query: '{query}'\n")
for i, result in enumerate(search_results, 1):
    print(f"Result {i}:")
    print(f"Source: {result.metadata.get('source', 'Unknown')}")
    print(f"Content: {result.page_content}")
    print()



Top 2 most relevant chunks for the query: 'When was FutureTech Corp founded?'

Result 1:
Source: ../data/company_profile.pdf
Content: Company Profile
Company Name: FutureTech Corp
Founded: 2012
Headquarters: San Francisco, CA
Employees: 1,200+
Mission:
To innovate intelligent solutions that simplify lives and empower industries through AI
and automation.
Flagship Products:
- Vision360 AI Camera
- AutoInsights Business Dashboard
- RoboHR Assistant
Awards:
- Best AI Startup, 2020 (TechCrush Awards)
- Top Workplace in Tech, 2022

Result 2:
Source: ../data/employee_handbook.pdf
Content: Employee Handbook
Welcome to FutureTech Corp!
Working Hours:
- Standard hours are 9:00 AM to 6:00 PM, Monday to Friday.
- Employees may request flexible hours subject to manager approval.
Leave Policy:
- 18 days of paid leave annually.
- Sick leave up to 10 days per year.
- Maternity leave: 6 months paid + 1 month unpaid (optional).
- Paternity leave: 10 days.
Remote Work:
- Employees may work remotely up 
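
A similarity search ranks stored chunk vectors by how close they are to the query vector. The sketch below uses cosine similarity on made-up 3-dimensional vectors to show the idea; the metric the vector store actually applies depends on its configuration and may differ, and the names toy_vectors, toy_query, and top_k are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the names of the k vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Made-up vectors standing in for real embeddings.
toy_vectors = {
    "company_profile": [0.9, 0.1, 0.0],
    "employee_handbook": [0.6, 0.6, 0.1],
    "product_faq": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

ranked = top_k(toy_query, toy_vectors, k=2)
print(ranked)
```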

In [27]:
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
retriever_results = retriever.invoke("When was FutureTech Corp founded?")
print(retriever_results)

[Document(id='a48de656-f35d-463b-8b05-011796d23aa0', metadata={'creationdate': 'D:20250418155508', 'creator': 'PyPDF', 'page': 0, 'page_label': '1', 'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'source': '../data/company_profile.pdf', 'total_pages': 1}, page_content='Company Profile\nCompany Name: FutureTech Corp\nFounded: 2012\nHeadquarters: San Francisco, CA\nEmployees: 1,200+\nMission:\nTo innovate intelligent solutions that simplify lives and empower industries through AI\nand automation.\nFlagship Products:\n- Vision360 AI Camera\n- AutoInsights Business Dashboard\n- RoboHR Assistant\nAwards:\n- Best AI Startup, 2020 (TechCrush Awards)\n- Top Workplace in Tech, 2022'), Document(id='c7f0be87-6afa-4a1d-9fee-81124675d923', metadata={'creationdate': 'D:20250418155508', 'creator': 'PyPDF', 'page': 0, 'page_label': '1', 'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'source': '../data/employee_handbook.pdf', 'total_pages': 1}, page_content='Employee Handbook\nWel

In [28]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

template = """Answer the question based only on the following context:
{context}
Question: {question}
Answer: """

prompt = ChatPromptTemplate.from_template(template)

def docs2str(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | docs2str, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)


In [29]:
question = "When was FutureTech Corp founded?"
response = rag_chain.invoke(question)
print(f"Question: {question}")
print(f"Answer: {response}")

Question: When was FutureTech Corp founded?
Answer: According to the Company Profile, FutureTech Corp was founded in 2012.
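
What reaches the model in this chain can be sketched in plain Python: the question is sent to the retriever, the returned documents are joined into one string, and both values fill the prompt template. ToyDoc and toy_retriever are hypothetical stand-ins for the real Document class and vector-store retriever.

```python
from dataclasses import dataclass

@dataclass
class ToyDoc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str

TEMPLATE = """Answer the question based only on the following context:
{context}
Question: {question}
Answer: """

def toy_retriever(question: str) -> list[ToyDoc]:
    # Hypothetical: always returns the same two snippets, whatever the question.
    return [
        ToyDoc("Company Name: FutureTech Corp\nFounded: 2012"),
        ToyDoc("Employee Handbook\nWelcome to FutureTech Corp!"),
    ]

def docs2str(docs) -> str:
    return "\n\n".join(doc.page_content for doc in docs)

def build_prompt(question: str) -> str:
    # Mirrors the dict step: "context" is retrieved and joined, "question" passes through unchanged.
    inputs = {"context": docs2str(toy_retriever(question)), "question": question}
    return TEMPLATE.format(**inputs)

prompt_text = build_prompt("When was FutureTech Corp founded?")
print(prompt_text)
```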


In [43]:
from langchain_core.messages import HumanMessage, AIMessage

chat_history=[]
chat_history.extend([
    HumanMessage(content=question),
    AIMessage(content=response)
])

In [44]:
chat_history

[HumanMessage(content='When was FutureTech Corp founded?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='According to the Company Profile, FutureTech Corp was founded in 2012.', additional_kwargs={}, response_metadata={})]

In [45]:
from langchain_core.prompts import MessagesPlaceholder

contextualize_q_system_prompt = """
Given a chat history and the latest user question
which might reference context in the chat history,
formulate a standalone question which can be understood
without the chat history. Do NOT answer the question,
just reformulate it if needed and otherwise return it as is.
"""

contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

contextualize_chain = contextualize_q_prompt | llm | StrOutputParser()
print(contextualize_chain.invoke({"input": "Where is it headquartered?", "chat_history": chat_history}))

Where is FutureTech Corp headquartered?


In [46]:
from langchain.chains import create_history_aware_retriever
history_aware_retriever=create_history_aware_retriever(llm, retriever, contextualize_q_prompt)
history_aware_retriever.invoke({"input": "Where is it headquartered?", "chat_history": chat_history})

[Document(id='a48de656-f35d-463b-8b05-011796d23aa0', metadata={'creationdate': 'D:20250418155508', 'creator': 'PyPDF', 'page': 0, 'page_label': '1', 'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'source': '../data/company_profile.pdf', 'total_pages': 1}, page_content='Company Profile\nCompany Name: FutureTech Corp\nFounded: 2012\nHeadquarters: San Francisco, CA\nEmployees: 1,200+\nMission:\nTo innovate intelligent solutions that simplify lives and empower industries through AI\nand automation.\nFlagship Products:\n- Vision360 AI Camera\n- AutoInsights Business Dashboard\n- RoboHR Assistant\nAwards:\n- Best AI Startup, 2020 (TechCrush Awards)\n- Top Workplace in Tech, 2022'),
 Document(id='c7f0be87-6afa-4a1d-9fee-81124675d923', metadata={'creationdate': 'D:20250418155508', 'creator': 'PyPDF', 'page': 0, 'page_label': '1', 'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'source': '../data/employee_handbook.pdf', 'total_pages': 1}, page_content='Employee Handbook\nWe

In [47]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

qa_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant. Use the following context to answer the user's question."),
    ("system", "Context: {context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}")
])

question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

In [48]:
question2 = "Where is it headquartered?"
answer2 = rag_chain.invoke({"input": question2, "chat_history": chat_history})['answer']
chat_history.extend([
    HumanMessage(content=question2),
    AIMessage(content=answer2)
])

print(f"Human: {question2}")
print(f"AI: {answer2}")

Human: Where is it headquartered?
AI: According to the Company Profile, FutureTech Corp is headquartered in San Francisco, CA.
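
The control flow of a history-aware retriever can be sketched as follows: with no chat history the user input goes to the retriever unchanged, otherwise it is first rewritten into a standalone question. This is an illustration under that assumption, not the library's implementation; stub_rewrite stands in for the LLM call and stub_retrieve for the vector-store retriever, and both are hypothetical.

```python
def history_aware_retrieve(user_input, chat_history, rewrite, retrieve):
    """Rewrite the query only when there is history to resolve references against."""
    if not chat_history:
        query = user_input
    else:
        query = rewrite(user_input, chat_history)
    return retrieve(query)

def stub_rewrite(user_input, chat_history):
    # Hypothetical stand-in for the LLM: resolves "it" to the company name.
    return user_input.replace(" it ", " FutureTech Corp ")

seen_queries = []

def stub_retrieve(query):
    # Hypothetical stand-in for the retriever: records the query it received.
    seen_queries.append(query)
    return [f"doc matching: {query}"]

history = [
    ("human", "When was FutureTech Corp founded?"),
    ("ai", "FutureTech Corp was founded in 2012."),
]

history_aware_retrieve("Where is it headquartered?", [], stub_rewrite, stub_retrieve)
history_aware_retrieve("Where is it headquartered?", history, stub_rewrite, stub_retrieve)
print(seen_queries)
```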


In [49]:
import sqlite3
from datetime import datetime
import uuid

DB_NAME = "rag_app.db"

def get_db_connection():
    conn = sqlite3.connect(DB_NAME)
    conn.row_factory = sqlite3.Row
    return conn

def create_application_logs():
    conn = get_db_connection()
    conn.execute('''CREATE TABLE IF NOT EXISTS application_logs
    (id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT,
    user_query TEXT,
    gpt_response TEXT,
    model TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)''')
    conn.close()

def insert_application_logs(session_id, user_query, gpt_response, model):
    conn = get_db_connection()
    conn.execute('INSERT INTO application_logs (session_id, user_query, gpt_response, model) VALUES (?, ?, ?, ?)',
                 (session_id, user_query, gpt_response, model))
    conn.commit()
    conn.close()

def get_chat_history(session_id):
    conn = get_db_connection()
    cursor = conn.cursor()
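    # created_at has one-second resolution, so id breaks ties between turns logged in the same second
    cursor.execute('SELECT user_query, gpt_response FROM application_logs WHERE session_id = ? ORDER BY created_at, id', (session_id,))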
    messages = []
    for row in cursor.fetchall():
        messages.extend([
            {"role": "human", "content": row['user_query']},
            {"role": "ai", "content": row['gpt_response']}
        ])
    conn.close()
    return messages

# Initialize the database
create_application_logs()
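
The logging round trip can be checked without touching rag_app.db. The sketch below uses an in-memory SQLite database and one shared connection (an in-memory database disappears when its connection closes), with the same table layout; log_turn, history_for, and the session names are hypothetical. It orders by id as a tie-breaker because CURRENT_TIMESTAMP only has one-second resolution.

```python
import sqlite3

# One shared in-memory connection, used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    """CREATE TABLE application_logs
    (id INTEGER PRIMARY KEY AUTOINCREMENT,
     session_id TEXT,
     user_query TEXT,
     gpt_response TEXT,
     model TEXT,
     created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)"""
)

def log_turn(session_id, user_query, response, model):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO application_logs (session_id, user_query, gpt_response, model) "
            "VALUES (?, ?, ?, ?)",
            (session_id, user_query, response, model),
        )

def history_for(session_id):
    rows = conn.execute(
        "SELECT user_query, gpt_response FROM application_logs "
        "WHERE session_id = ? ORDER BY created_at, id",
        (session_id,),
    ).fetchall()
    messages = []
    for row in rows:
        messages.append({"role": "human", "content": row["user_query"]})
        messages.append({"role": "ai", "content": row["gpt_response"]})
    return messages

log_turn("session-a", "first question", "first answer", "toy-model")
log_turn("session-b", "other question", "other answer", "toy-model")
log_turn("session-a", "second question", "second answer", "toy-model")

history_a = history_for("session-a")
print(history_a)
```

Each session sees only its own turns, in the order they were logged, and an unknown session id yields an empty history.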

In [50]:
# Example usage for a new user
MODEL_NAME = "llama3-8b-8192"  # log the model that actually produced the answers
session_id = str(uuid.uuid4())
question = "What is FutureTech Corp?"
chat_history = get_chat_history(session_id)
answer = rag_chain.invoke({"input": question, "chat_history": chat_history})['answer']
insert_application_logs(session_id, question, answer, MODEL_NAME)
print(f"Human: {question}")
print(f"AI: {answer}\n")

# Example of a follow-up question
question2 = "What are their flagship products?"
chat_history = get_chat_history(session_id)
answer2 = rag_chain.invoke({"input": question2, "chat_history": chat_history})['answer']
insert_application_logs(session_id, question2, answer2, MODEL_NAME)
print(f"Human: {question2}")
print(f"AI: {answer2}")

Human: What is FutureTech Corp?
AI: FutureTech Corp is a technology company that specializes in developing intelligent solutions that simplify lives and empower industries through artificial intelligence (AI) and automation. It was founded in 2012 and is headquartered in San Francisco, California. The company has a mission to innovate and create products that make a positive impact.

FutureTech Corp's flagship products include the Vision360 AI Camera, AutoInsights Business Dashboard, and RoboHR Assistant, among others. The company has received several awards, including the Best AI Startup in 2020 and Top Workplace in Tech in 2022.

FutureTech Corp values innovation, integrity, inclusion, and impact, and strives to create a positive work environment for its employees. The company offers a range of benefits, including flexible working hours, paid leave, sick leave, and remote work options. With over 1,200 employees, FutureTech Corp is a growing and dynamic organization that is shaping th

In [52]:
chat_history = get_chat_history(session_id)
print(chat_history)

[{'role': 'human', 'content': 'What is FutureTech Corp?'}, {'role': 'ai', 'content': "FutureTech Corp is a technology company that specializes in developing intelligent solutions that simplify lives and empower industries through artificial intelligence (AI) and automation. It was founded in 2012 and is headquartered in San Francisco, California. The company has a mission to innovate and create products that make a positive impact.\n\nFutureTech Corp's flagship products include the Vision360 AI Camera, AutoInsights Business Dashboard, and RoboHR Assistant, among others. The company has received several awards, including the Best AI Startup in 2020 and Top Workplace in Tech in 2022.\n\nFutureTech Corp values innovation, integrity, inclusion, and impact, and strives to create a positive work environment for its employees. The company offers a range of benefits, including flexible working hours, paid leave, sick leave, and remote work options. With over 1,200 employees, FutureTech Corp is