# Project Information

Project Background: Building a Desktop and Laptop Support Agent Assistant with Generative AI

## Context

In today's organizations, providing efficient and effective customer support for technical issues related to desktops and laptops is crucial. Customer support agents often deal with a wide range of queries, from basic troubleshooting steps to more complex hardware and software problems. Access to well-structured and easily searchable support documentation is essential for agents to quickly diagnose and resolve customer issues.


## Problem Outline

Traditional methods of accessing support documentation can be time-consuming. Agents may need to navigate through lengthy PDF files or knowledge base articles to find the relevant information. This can lead to longer resolution times, increased frustration for both agents and customers, and potentially lower customer satisfaction.

## Proposed Solution


This project aims to leverage the power of Generative AI to create an intelligent assistant that can enhance the capabilities of customer support agents dealing with desktop and laptop issues. The core idea is to process existing support documentation (in this case, a synthetic PDF file created for this purpose) and enable agents to quickly retrieve relevant information and potentially generate helpful responses or troubleshooting steps based on customer queries.

# Key Components:

## Support Documentation

This is the primary dataset. A synthetic PDF file containing common desktop and laptop troubleshooting steps, structured into logical sections (e.g., power issues, display problems, network connectivity, battery issues). This document serves as the knowledge base for the AI assistant.

## Generative AI Model

A Large Language Model (LLM), _command-r-plus_ from _COHERE_. This model is used to understand user queries and extract relevant information from the support documentation.

## Embedding Model

The embedding model used is _embed-english-light-v3.0_ also from _COHERE_. 

## Vector Store

Qdrant is used as the vector store in this project, hosted on Qdrant Cloud.

## LangChain and LangGraph Framework

The LangChain library was used as the framework to connect the LLM with the support documentation. This involved techniques like:

* Document Loading: Loading and processing the PDF file.
* Text Splitting: Dividing the document into smaller chunks for efficient retrieval.
* Vector Embeddings: Creating vector representations of the text chunks to enable semantic search.
* Retrieval-Based Question Answering: Using a retrieval mechanism (e.g., a vector store and similarity search) to find relevant sections in the documentation based on the agent's query.
* Response Generation: Utilizing the LLM to generate concise and helpful answers or troubleshooting steps based on the retrieved information.

For the agent creation and operation, LangGraph was used.

# Demonstration of AI Capabilities

In this Capstone Project, the following GenAI capabilities are demonstrated:

1. Structured output/JSON mode/controlled generation
2. Few-shot prompting
3. Document understanding
4. Function Calling
5. Agents
6. Embeddings
7. Retrieval augmented generation (RAG)
8. Vector search/vector store/vector database
9. Grounding

In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('../input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

../input/pm-80499167-at-04-12-2025-20-44-04/__script__.py
../input/pm-80499167-at-04-12-2025-20-44-04/__results__.html
../input/pm-80499167-at-04-12-2025-20-44-04/input_requirements.txt
../input/pm-80499167-at-04-12-2025-20-44-04/__script__.ipynb
../input/pm-80499167-at-04-12-2025-20-44-04/__output__.json
../input/pm-80499167-at-04-12-2025-20-44-04/custom.css
../input/support-documentation/support_documentation.pdf


In [2]:
!pip uninstall -qqy pylibraft-cu12
!pip uninstall -qqy jupyterlab  # Remove unused packages from Kaggle's base image that conflict
!pip install -qqU langchain-core langchain-text-splitters langchain 
!pip install -qqU transformers sentence-transformers pypdf langchain
!pip install -qqU qdrant-client
!pip install -qqU "langchain[cohere]"
!pip install -qqU langchain-community
!pip install -qqU "langchain-qdrant"


In [3]:
from kaggle_secrets import UserSecretsClient

CO_API_KEY = UserSecretsClient().get_secret("COHERE_API_KEY")
QDRANT_API_KEY = UserSecretsClient().get_secret("QDRANT_CLOUD_API_KEY")
QDRANT_CLOUD_HOST = UserSecretsClient().get_secret("QDRANT_CLOUD_HOST")

In [4]:
from langchain.chat_models import init_chat_model
model = init_chat_model("command-r-plus", 
                        model_provider="cohere",
                        cohere_api_key=CO_API_KEY)

# Define loader, text splitters
The next step is to set up the loader and the text splitter so that we can load
the PDF and split its text into a list of Documents.

In [5]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.document_loaders.directory import DirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document

The code below loads PDF files and splits them into manageable chunks using the
`PyPDFDirectoryLoader` class from `langchain_community` and the
`RecursiveCharacterTextSplitter` class from `langchain_text_splitters`.

First, a `PyPDFDirectoryLoader` object named `loader` is created with the following
parameters:
- `path=".."`: The path to the directory containing the PDF files.
- `glob="**/*.pdf"`: A glob pattern specifying that only PDF files should be loaded.
- `recursive=True`: A flag indicating that the search for PDF files should be
performed recursively within subdirectories.

Next, a `RecursiveCharacterTextSplitter` object named `text_splitter` is created with
the following parameters:
- `chunk_size=50`: The maximum number of characters to include in each text chunk.
- `chunk_overlap=10`: The number of characters that overlap between consecutive
chunks.

Finally, the code assigns the result of calling the `load_and_split()` method on the
`loader` object to a variable named `docs`. This method loads all PDF files from the
specified directory and its subdirectories, splits them into text chunks, and returns
a list of Document objects. Note that `load_and_split()` only uses `text_splitter`
when it is passed as an argument; called without one, as here, it falls back to a
default `RecursiveCharacterTextSplitter` with much larger chunks.

In [6]:
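from langchain_community.document_loaders import PyPDFDirectoryLoader

loader = PyPDFDirectoryLoader(path="..", glob="**/*.pdf", recursive=True)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=50, chunk_overlap=10)
# No splitter is passed here, so load_and_split() applies its default splitter;
# use loader.load_and_split(text_splitter) to get the 50-character chunks instead.
docs: list[Document] = loader.load_and_split()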
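
To see what `chunk_size` and `chunk_overlap` mean in practice, here is a minimal sketch of a fixed-window splitter in plain Python. It is a simplification and not what `RecursiveCharacterTextSplitter` does internally: the real splitter prefers to break on separators such as paragraphs, newlines, and spaces. The function name `split_with_overlap` and the sample sentence are made up for illustration.

```python
def split_with_overlap(text: str, chunk_size: int = 50, chunk_overlap: int = 10) -> list[str]:
    """Slice text into windows of at most chunk_size characters,
    where consecutive windows share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


# Illustrative sentence, not taken from the support PDF
sample = (
    "If the laptop does not power on, check the charger, "
    "then hold the power button for ten seconds."
)
for chunk in split_with_overlap(sample):
    print(repr(chunk))
```

With 50-character windows, a single troubleshooting step is usually cut across several chunks, which is worth keeping in mind when choosing a chunk size for retrieval.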

# Vector Store

The Qdrant client and the LangChain integration for it were installed above. Here we create the embedding model that the vector store uses to index and search the documents.

In [7]:
from langchain_cohere.embeddings import CohereEmbeddings
from langchain_cohere.chat_models import ChatCohere

embeddings = CohereEmbeddings(model="embed-english-light-v3.0", cohere_api_key=CO_API_KEY)


In [8]:
from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams

assert QDRANT_CLOUD_HOST is not None, "QDRANT_CLOUD_HOST secret is not set"


vector_store = QdrantVectorStore.from_documents(
    docs,
    embeddings,
    url=QDRANT_CLOUD_HOST,
    api_key=QDRANT_API_KEY,
    collection_name="support_documents",
)

retriever = vector_store.as_retriever()

## Storing Documents into the Vector Store

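At this point the vector store is defined and the retriever is ready. The next cell
generates explicit IDs for the documents and adds them to the vector store. Note that
`QdrantVectorStore.from_documents` above has already embedded and uploaded the same
documents, so this second call stores them again under new IDs; in a fresh run, one of
the two steps is enough.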

In [9]:
from uuid import uuid4
uuids = [str(uuid4()) for _ in range(len(docs))]
doc_ids = vector_store.add_documents(ids = uuids, documents = docs)

In [10]:
# The following will give us the doc_ids which are loaded
doc_ids

['c89815f5-b576-468d-8c26-d2999704a1ed',
 '072e2507-19ac-4122-94ec-cd47832c749e',
 'ab986bad-a6e6-4d38-9895-2d7063a9b260']
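
Because `uuid4()` produces new random IDs on every run, re-running the cell stores fresh copies of the same chunks. One alternative, sketched here as an assumption rather than what this notebook does, is to derive each ID from the chunk itself with `uuid5`, so that adding the same chunk again overwrites the existing point instead of duplicating it. The namespace value and the helper name `stable_chunk_id` are hypothetical.

```python
import uuid

# Hypothetical namespace for this project; any fixed UUID works as long as it never changes.
SUPPORT_DOCS_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "support_documents")


def stable_chunk_id(source: str, page: int, content: str) -> str:
    """Return the same UUID string whenever source, page and content are the same."""
    return str(uuid.uuid5(SUPPORT_DOCS_NAMESPACE, f"{source}|{page}|{content}"))


first = stable_chunk_id("support_documentation.pdf", 0, "Restart your computer.")
again = stable_chunk_id("support_documentation.pdf", 0, "Restart your computer.")
other = stable_chunk_id("support_documentation.pdf", 1, "Check the power cable.")
print(first == again, first == other)  # prints: True False
```

With the loader used here, the source and page would likely come from each document's `metadata` (the `source` and `page` keys) and the content from `page_content`.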

In [11]:
# Test the similarity search 
documents = vector_store.similarity_search("Why is my mouse not working", k=2, )
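
To make the similarity search above concrete, here is a minimal sketch of what a vector store does conceptually: score every stored vector against the query vector and keep the top `k`. It assumes cosine similarity as the metric and uses tiny hand-made three-dimensional vectors instead of real Cohere embeddings; `toy_store`, `toy_query`, and `top_k` are illustrative names only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hand-made toy vectors; real embeddings have hundreds of dimensions.
toy_store = [
    ("Mouse and keyboard troubleshooting", [0.9, 0.1, 0.0]),
    ("Battery not charging", [0.1, 0.9, 0.1]),
    ("Wi-Fi keeps disconnecting", [0.0, 0.2, 0.9]),
]
toy_query = [0.8, 0.2, 0.1]  # stands in for the embedded question

results = top_k(toy_query, toy_store, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

A production vector store such as Qdrant avoids scanning every vector by using an approximate nearest-neighbor index, but the ranking idea is the same.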

# LLM

In [12]:
llm = ChatCohere(model="command-r-plus", cohere_api_key=CO_API_KEY)

## Few Shot Prompting

The retrieval prompt below takes the retrieved context and the user's question as input and instructs the model either to answer with numbered steps or to say that it does not have the information.

For each question, the agent first retrieves the relevant documents and then fills them into the `{context}` slot of the prompt. As written, the template contains instructions but no worked question/answer examples, so it is the starting point for _few-shot prompting_ rather than a complete few-shot prompt: adding a handful of example pairs ahead of the final question would turn it into one.

In [13]:
from langchain_core.prompts import ChatPromptTemplate
retrieval_prompt = ChatPromptTemplate.from_template(
    
    """Answer the user's question based on the context provided below:
    
    Context:
    {context}

    If you are unable to answer based on the document then say "I do not have that
    information with me." else provide step by step, numbered action items to solve 
    the problem.
    
    Question: {question}
    
    Answer:"""
)
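
A few-shot variant of this prompt would place a handful of worked question/answer pairs ahead of the real question. The sketch below assembles such a prompt with plain string formatting; the example pairs are hypothetical (in practice they might come from resolved support tickets), and `build_few_shot_prompt` is an illustrative helper, not part of LangChain. LangChain also ships few-shot prompt templates in `langchain_core.prompts` that could serve the same purpose.

```python
# Hypothetical worked examples, written for illustration only.
FEW_SHOT_EXAMPLES = [
    {
        "question": "My laptop will not turn on.",
        "answer": (
            "1. Check that the charger is plugged in.\n"
            "2. Hold the power button for 10 seconds.\n"
            "3. Try a different power outlet."
        ),
    },
    {
        "question": "What is the warranty period for my printer?",
        "answer": "I do not have that information with me.",
    },
]


def build_few_shot_prompt(context: str, question: str, examples: list[dict]) -> str:
    """Combine instructions, retrieved context, worked examples and the real question."""
    shots = "\n\n".join(
        f"Question: {ex['question']}\nAnswer:\n{ex['answer']}" for ex in examples
    )
    return (
        "Answer the user's question based on the context provided below.\n\n"
        f"Context:\n{context}\n\n"
        "Follow the style of these examples:\n\n"
        f"{shots}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt_text = build_few_shot_prompt(
    context="Try a different USB port. Replace the mouse batteries.",
    question="My mouse is not working, what might be the problem?",
    examples=FEW_SHOT_EXAMPLES,
)
print(prompt_text)
```

Including one example that ends in the refusal sentence shows the model when to decline, not just how to format a positive answer.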

# Constructing the Agent Graph

In [14]:
# Installing LangGraph Libraries

!pip install -Uqq langgraph


## Testing the Retriever

### Creating the State Dictionary

In [15]:
# Let's do the agent now
from typing import Any, Dict, TypedDict

class AgentState(TypedDict):
    keys: Dict[str, Any]

# Here we import the Objects which are necessary to construct the graph

from langgraph.graph import StateGraph, END

In [16]:
user_input = "My mouse is not working, what might be the problem?"
state: AgentState = {"keys": {"question": user_input}}
# as_retriever() takes search options via search_kwargs, e.g. search_kwargs={"k": 2};
# with no options it returns the top 4 matches, which is what the output below shows.
retriever = vector_store.as_retriever()
question = state["keys"]["question"]
retrieved_docs = retriever.invoke(question)  # replaces the deprecated get_relevant_documents()
len(retrieved_docs)


4

The remaining cells wrap this retrieval step into an agent:

1. The `retrieve_documents` function takes a state as input, retrieves relevant
documents based on the user's question, and returns them in a new state.
2. The `generate_response` function takes a state as input, generates a final response
using the Cohere model and retrieved context, and returns it in a new state.
3. The workflow is created with nodes for retrieving documents and generating a
response, as well as edges connecting these nodes.
4. The entry point of the workflow is set to the retrieve node.
5. Finally, the workflow is compiled into an agent.

In [17]:
context = ",".join([doc.page_content for doc in retrieved_docs])


Let's define a state graph-based workflow for an AI agent, which
includes three main components: `retrieve_documents`, `generate_response`, and the
overall workflow structure.

1. The `retrieve_documents` function takes the current `state` of the agent as its
parameter. It retrieves relevant documents for the user's question through the vector
store's retriever (which embeds the question with the `embeddings` model) and returns
an updated state containing the retrieved context and the original question.
2. The `generate_response` function also accepts the current `state`. It generates a
final response using the Cohere model based on the 'context' and 'question'
stored in the state. The response is then added to the state under the key 'response'.
3. The overall workflow, named `workflow`, consists of a node for
each function (`"retrieve"` for `retrieve_documents` and `"generate"`
for `generate_response`) connected by edges: `"retrieve"` to `"generate"`, and
`"generate"` to `END`.
4. Lastly, the AI agent is created by compiling the workflow.

The workflow structure ensures that retrieval occurs first, followed by
response generation, so that answers are grounded in the context retrieved for the
user's question.

In [18]:
workflow = StateGraph(AgentState)

def retrieve_documents(state: AgentState) -> AgentState:
    """Retrieves relevant documents based on the user's question."""
    question = state["keys"].get("question")
    if not question:
        raise KeyError("No question in state")
    # Default retriever (top 4 matches); pass search_kwargs={"k": 2} to limit the results
    retriever = vector_store.as_retriever()
    retrieved_docs = retriever.invoke(question)
    # Store the context as plain text so the prompt receives the passages themselves
    context = "\n\n".join(doc.page_content for doc in retrieved_docs)
    return {"keys": {"question": question, "context": context}}

def generate_response(state: AgentState) -> AgentState:
    """Generates the final response using the Cohere model and retrieved context."""
    context = state["keys"]["context"]
    question = state["keys"]["question"]
    # Fill the retrieval prompt with the context and question, then call the model
    response = llm.invoke(retrieval_prompt.format_messages(context=context, question=question))
    state["keys"]["response"] = response.content
    return state

# Adding Nodes
workflow.add_node("retrieve", retrieve_documents)
workflow.add_node("generate", generate_response)

# Adding Edges
workflow.add_edge("retrieve", "generate")
workflow.add_edge("generate", END)

# Setting the entry point
workflow.set_entry_point("retrieve")

# Compiling the workflow
agent = workflow.compile()
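
The two nodes can also be exercised without Qdrant or Cohere by swapping in stand-ins, which is handy for checking the state handling before spending API calls. The sketch below is an assumption-laden illustration: `fake_search`, `fake_llm`, `retrieve_node`, `generate_node`, and `run_linear` are hypothetical names, and `run_linear` only mimics the fixed retrieve-then-generate order, not LangGraph's actual execution model.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Dict[str, Any]]


def fake_search(question: str) -> List[str]:
    """Hypothetical stand-in for the vector store retriever: returns canned passages."""
    return ["Check that the USB receiver is plugged in.", "Replace the mouse batteries."]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the chat model: reports the prompt size instead of answering."""
    return f"stub answer based on {len(prompt)} prompt characters"


def retrieve_node(state: State, search: Callable[[str], List[str]] = fake_search) -> State:
    """Look up passages for the question and store them as one context string."""
    question = state["keys"]["question"]
    context = "\n\n".join(search(question))
    return {"keys": {"question": question, "context": context}}


def generate_node(state: State, llm: Callable[[str], str] = fake_llm) -> State:
    """Build a prompt from context and question, then store the model's reply."""
    keys = state["keys"]
    prompt = f"Context:\n{keys['context']}\n\nQuestion: {keys['question']}\n\nAnswer:"
    return {"keys": {**keys, "response": llm(prompt)}}


def run_linear(state: State, nodes: List[Callable[[State], State]]) -> State:
    """Call each node in order, feeding the returned state into the next node."""
    for node in nodes:
        state = node(state)
    return state


final_state = run_linear(
    {"keys": {"question": "My mouse is not working"}},
    [retrieve_node, generate_node],
)
print(final_state["keys"]["response"])
```

Passing the retriever and model in as parameters, rather than reading them from globals, is the design choice that makes this kind of offline check possible.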


## The Agent State

The code below uses a dictionary typed as `AgentState` to store and manage
the agent's state. The state starts with two keys,
'question' and 'context', both initially empty.

Next, the user's input, `"My mouse is not working,
what might be the problem?"`, is assigned to a variable named `user_input`.

Then, the code updates the `state` dictionary with the provided user input under the
'question' key.

The code then calls the `invoke()` method on the AI agent and passes the updated `state` as its argument.

Finally, it prints out the response generated by the AI agent based on the given
context and question using the key `'response'`.

In [19]:
state: AgentState = {"keys": {
    "question": "",
    "context": "",
}}

user_input = "My mouse is not working, what might be the problem?"
state["keys"]["question"] = user_input  # update the existing state rather than replacing it
state = agent.invoke(state)


## Response from the Agent

The code snippet below gives you the response from the AI Agent. 

In [20]:

print(state["keys"]["response"],"\n\n")

I do not have information about mouse issues in the provided context. However, based on the troubleshooting tips, I can suggest the following general steps: 

1. Restart your computer: This often resolves temporary software glitches that could be causing the mouse issue. 

2. Check error messages: Note down any error messages displayed related to your mouse or USB devices for further investigation. 

3. Gather information: Think about when the problem started and any recent changes you made to your system, such as updating the operating system or installing new software. 

4. Use the operating system's built-in troubleshooters: Both Windows and macOS have built-in tools for diagnosing common issues, including hardware problems. 

5. Try a different USB port: Connect your mouse to a different USB port on your computer to ensure the current port is not the issue. 

6. Try a different mouse: If possible, test with a known working mouse to determine if the issue is specific to your mouse o

## Final Thoughts

This implementation can be expanded to support other use cases by adding more
nodes and edges to the workflow graph or by extending `retrieve_documents` and/or
`generate_response`.

For instance, additional nodes could be created for pre-processing user queries
(`preprocess_query`), interpreting retrieved documents (`interpret_documents`), or
generating follow-up responses (`generate_followup`). An extended
`retrieve_documents` might query multiple document sources simultaneously. Meanwhile,
an expanded `generate_response` could generate multiple candidate responses and select
one based on certain criteria.
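
As one possible shape for such an extension, here is a sketch of a `preprocess_query` node. The abbreviation table and the normalization rules are assumptions chosen for illustration; the function follows the same convention as the existing nodes of taking a state and returning a new one.

```python
import re
from typing import Any, Dict

State = Dict[str, Dict[str, Any]]

# Hypothetical abbreviation table; a real deployment would tune this to its own tickets.
ABBREVIATIONS = {"pc": "desktop computer", "wifi": "Wi-Fi", "os": "operating system"}


def preprocess_query(state: State) -> State:
    """Collapse extra whitespace and expand a few common abbreviations in the question."""
    question = " ".join(state["keys"]["question"].split())

    def expand(match: re.Match) -> str:
        word = match.group(0)
        return ABBREVIATIONS.get(word.lower(), word)

    # Each run of letters is treated as one word, so "os" inside "cost" is left alone.
    question = re.sub(r"[A-Za-z]+", expand, question)
    return {"keys": {**state["keys"], "question": question}}


cleaned = preprocess_query({"keys": {"question": "  my PC   cant connect to wifi "}})
print(cleaned["keys"]["question"])  # prints: my desktop computer cant connect to Wi-Fi

# Wiring it into the graph, before compiling, would look roughly like this:
# workflow.add_node("preprocess", preprocess_query)
# workflow.add_edge("preprocess", "retrieve")
# workflow.set_entry_point("preprocess")
```

Since the node only rewrites the `question` key, `retrieve_documents` and `generate_response` would not need to change.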

In a nutshell, the workflow structure provides a flexible way to add more complex
behavior in steps without changing much of the underlying logic or architecture. This
is the strength and promise of state graph-based AI agents: they can be incrementally
extended with new features as business needs change.