# **Agentic RAG for Analyzing Customer Issues: A Comprehensive Tutorial**

Retrieval-Augmented Generation (RAG) improves the output of Large Language Models (LLMs) by retrieving relevant documents at generation time and supplying them as context. Traditional methods, such as Naive RAG, can falter when faced with complex, multi-step reasoning tasks. **Agentic RAG** combines RAG's retrieval capabilities with autonomous decision-making agents, so that intricate queries can be broken down and answered step by step.

This tutorial provides a hands-on guide to building an Agentic RAG system for analyzing customer issues, using Python and the libraries imported in the steps below: **CrewAI** (with **crewai_tools**), **LlamaIndex**, **OpenAI**, and **kagglehub**.

---

## **Learning Objectives**
- Understand the key differences between Agentic RAG and Naive RAG.
- Explore the limitations of Naive RAG in handling complex queries.
- Discover real-world use cases where Agentic RAG excels.
- Implement Agentic RAG in Python using CrewAI.
- Strengthen Naive RAG capabilities by adding decision-making agents.

---

## **Agentic RAG: Strengthening Naive RAG**
Agentic RAG enhances Naive RAG by adding autonomous decision-making agents capable of iterative reasoning. In Naive RAG, the retriever is passive, fetching information only when prompted. In contrast, Agentic RAG integrates **agents** that actively decide:
- **When** to retrieve information.
- **What** information to retrieve.
- **How** to synthesize the retrieved data for actionable insights.
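
The difference between the two control flows can be sketched in plain Python. This is a toy illustration, not CrewAI or LlamaIndex code: `CORPUS`, `retrieve`, `brand_planner`, and the other names are hypothetical, and the keyword retriever that returns a single passage stands in for a real vector search with a small top-k.

```python
from typing import Callable, List

# Hypothetical in-memory corpus standing in for a real document store.
CORPUS = {
    "gopro": "GoPro tickets mention Wi-Fi connection problems.",
    "dell": "Dell tickets mention battery and charging problems.",
}


def retrieve(query: str, top_k: int = 1) -> List[str]:
    """Toy keyword retriever: return at most top_k passages whose key appears in the query."""
    matches = [text for key, text in CORPUS.items() if key in query.lower()]
    return matches[:top_k]


def naive_rag(question: str) -> List[str]:
    """Naive RAG: exactly one retrieval call, driven by the raw question."""
    return retrieve(question)


def agentic_rag(
    question: str,
    plan: Callable[[str, List[str]], List[str]],
    max_steps: int = 5,
) -> List[str]:
    """Agentic RAG: a planner decides when and what to retrieve, step by step."""
    evidence: List[str] = []
    for _ in range(max_steps):
        next_queries = plan(question, evidence)  # decide WHAT to retrieve
        if not next_queries:                     # decide WHEN to stop
            break
        for sub_query in next_queries:
            for passage in retrieve(sub_query):
                if passage not in evidence:
                    evidence.append(passage)
    return evidence


def brand_planner(question: str, evidence: List[str]) -> List[str]:
    """Hypothetical planner: one sub-query per brand that has no evidence yet."""
    covered = " ".join(evidence).lower()
    return [
        f"{brand} issues"
        for brand in CORPUS
        if brand in question.lower() and brand not in covered
    ]


question = "Compare GoPro and Dell issues"
print(len(naive_rag(question)))                   # single retrieval call
print(len(agentic_rag(question, brand_planner)))  # one retrieval per brand
```

In a real system the planner role is played by an LLM-driven agent rather than a hand-written function, and the synthesis step (the "how") would be a further LLM call over the collected evidence.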

### **Key Limitations of Naive RAG**
Naive RAG struggles with:
- **Summarization Questions**: E.g., "Summarize this document."
- **Comparative Analysis**: E.g., "Compare PepsiCo's and Coca-Cola's Q4 2023 strategies."
- **Complex Multi-Part Queries**: E.g., "Analyze retail inflation arguments in articles from *The Mint* and *Economic Times*, compare findings, and provide conclusions."

Agentic RAG addresses these challenges by incorporating reasoning agents capable of handling such queries effectively.
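
A short calculation shows why summarization questions are hard for a single top-k retrieval: only the k most similar chunks ever reach the model. The sketch below assumes a hypothetical 1,000-word document, a 100-word chunk size, and `top_k=2`; the helper names are illustrative.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into chunks of at most chunk_size whitespace-separated words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def retrieval_coverage(num_chunks: int, top_k: int) -> float:
    """Fraction of the document's chunks that one top-k retrieval can return."""
    if num_chunks == 0:
        return 0.0
    return min(top_k, num_chunks) / num_chunks


# Hypothetical 1,000-word document split into 100-word chunks.
document = " ".join(f"word{i}" for i in range(1000))
chunks = split_into_chunks(document, chunk_size=100)

print(len(chunks))                                # 10
print(retrieval_coverage(len(chunks), top_k=2))   # 0.2
```

With these numbers, a "summarize this document" request would be answered from 20% of the text, which is why such queries benefit from an agent that can issue several retrievals.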

---

## **Use Cases of Agentic RAG**
1. **Legal Research**: Compare legal documents to extract key clauses.
2. **Market Analysis**: Analyze and compare top brands in a product segment.
3. **Medical Diagnosis**: Combine patient data with research for diagnoses.
4. **Financial Analysis**: Summarize and compare financial reports.
5. **Regulatory Compliance**: Ensure adherence by comparing policies with laws.

---

## **Building Agentic RAG with Python and CrewAI**

### **Step 1: Install Necessary Python Libraries**
Install the required Python libraries to set up the environment.

In [None]:
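#!pip install llama-index-core
#!pip install llama-index-readers-file
#!pip install llama-index-embeddings-openai
# llama-index-llms-openai provides llama_index.llms.openai, imported in Step 2
#!pip install llama-index-llms-openai
# crewai-tools provides the crewai_tools module (LlamaIndexTool)
#!pip install crewai crewai-tools
# kagglehub is used in Step 3 to download the dataset
#!pip install kagglehub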
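
Before moving on, it can help to confirm which of the packages are present. The following is a minimal sketch using only the standard library. The distribution names in `REQUIRED` are assumed to match the modules imported in the later steps (`crewai`, `crewai_tools`, `llama_index`, `kagglehub`); adjust the list to your environment.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Assumed distribution names for the modules imported later in the tutorial.
REQUIRED = [
    "crewai",
    "crewai-tools",
    "llama-index-core",
    "llama-index-llms-openai",
    "kagglehub",
]


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```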

### **Step 2: Import Required Libraries**
Load essential libraries to set up agents, tools, and data processing.

In [1]:
import os
from crewai import Agent, Task, Crew, Process
from crewai_tools import LlamaIndexTool
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI

In [2]:
#!pip show crewai

##### About CrewAI
CrewAI is a multi-agent orchestration framework for managing workflows that involve multiple autonomous AI agents. It enables agents to collaborate on complex tasks, each contributing its own specialty toward a collective goal. Unlike AI systems focused on isolated problem-solving, CrewAI takes a team-based approach in which agents coordinate with one another as the work proceeds.

#### What is LlamaIndexTool?
`LlamaIndexTool` is the class from `crewai_tools` that this tutorial uses to expose a LlamaIndex query engine to CrewAI agents (see Step 7). The exact details depend on the installed version of `crewai_tools`, but in broad terms it provides:

- **Indexing and querying**: It integrates with the LlamaIndex framework so that agents or workflows in CrewAI can query indexed data, acting as a bridge between stored data (documents or embeddings) and natural language queries.
- **Agent integration**: It lets autonomous agents query existing knowledge bases or datasets during their operation.
- **Contextual data retrieval**: It fetches relevant information so that agents can give better responses, generate insights, or make decisions.

**LlamaIndex context**: LlamaIndex (previously GPT Index) is a popular framework for connecting external data sources with language models. It allows for creating indices from:

- Document collections (PDFs, text files, etc.).
- Databases or structured data.
- Embeddings stored in vector databases (like Pinecone or ChromaDB).

These indices are then used to process user queries and retrieve the most relevant data before engaging a language model.

**Use case examples**:

- **Document search for AI agents**: Agents in CrewAI can use LlamaIndexTool to retrieve information from documents or knowledge bases during task execution.
- **Interactive knowledge retrieval**: LlamaIndexTool lets CrewAI agents interact with data sources, combining autonomous reasoning with contextual data.
- **Enhanced QA systems**: When deployed in chatbots or similar systems, the tool can help agents respond with highly relevant information.

### **Step 3: Read Customer Issues Data**
Load customer issues data from a CSV file.

In [27]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("suraj520/customer-support-ticket-dataset")
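# Build the CSV path with os.path.join (avoids the invalid "\c" escape in the f-string)
path = os.path.join(path, "customer_support_tickets.csv")
# Normalize Windows separators to forward slashes
path = path.replace("\\", "/")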
print("Path to dataset files:", path)


Path to dataset files: C:/Users/Sai Krishna/.cache/kagglehub/datasets/suraj520/customer-support-ticket-dataset/versions/1/customer_support_tickets.csv


In [28]:
reader = SimpleDirectoryReader(input_files=[path])
docs = reader.load_data()
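
It is often worth looking at the CSV's columns before indexing it. The sketch below uses only the standard library. `preview_csv` is a hypothetical helper, and the sample data with its column names (`ticket_id`, `product`, `description`) is made up for illustration and is not the schema of the downloaded dataset.

```python
import csv
import io
from typing import Dict, List, Optional, TextIO, Tuple


def preview_csv(
    handle: TextIO, max_rows: int = 3
) -> Tuple[Optional[List[str]], List[Dict[str, str]]]:
    """Return the header and up to max_rows rows from an open CSV file handle."""
    reader = csv.DictReader(handle)
    rows: List[Dict[str, str]] = []
    for i, row in enumerate(reader):
        if i >= max_rows:
            break
        rows.append(row)
    return reader.fieldnames, rows


# Made-up sample data; the real file would be opened with:
#   with open(path, newline="", encoding="utf-8") as f:
#       columns, rows = preview_csv(f)
sample = io.StringIO(
    "ticket_id,product,description\n"
    "1,GoPro Hero,Cannot connect to Wi-Fi\n"
    "2,Dell XPS,Battery drains quickly\n"
)
columns, rows = preview_csv(sample)
print(columns)
print(rows[0]["description"])
```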

### **Step 4: Define OpenAI API Key**
Set up the OpenAI API key to enable access to GPT-based models.

In [4]:
# Replace the placeholder with your own key; never commit a real key to version control
os.environ["OPENAI_API_KEY"] = "<<OPENAI_API_KEY>>"
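
An alternative to hard-coding the key in the notebook is to read it from the environment and prompt for it only when it is missing. This is a minimal sketch using the standard library; `ensure_api_key` is a hypothetical helper, not part of any of the libraries used here.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_api_key(
    var_name: str = "OPENAI_API_KEY",
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return the key from the environment, prompting once if it is not set."""
    key = os.environ.get(var_name)
    if not key:
        key = prompt(f"Enter {var_name}: ").strip()
        if not key:
            raise ValueError(f"{var_name} must not be empty")
        os.environ[var_name] = key  # make it visible to libraries that read the env
    return key


# Usage in a notebook cell (prompts without echoing the key):
#   ensure_api_key()
```

The `prompt` parameter is injectable so the function can be exercised without interactive input.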

### **Step 5: LLM Initialization**
Initialize the GPT-4o model for query processing.

In [29]:
llm = OpenAI(model="gpt-4o")

### **Step 6: Create Vector Store Index and Query Engine**
Create a searchable index and query engine for retrieving similar documents.

In [30]:
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine(similarity_top_k=2, llm=llm)
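
To make `similarity_top_k=2` concrete, here is a toy version of what a vector retriever does: score every stored vector against the query vector and keep the two best matches. This is a pure-Python illustration, not LlamaIndex's implementation. The three-dimensional vectors and document ids are made up; real embeddings have hundreds or thousands of dimensions.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: List[float], doc_vecs: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k (doc_id, score) pairs with the highest cosine similarity."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings for three tickets.
DOCS = {
    "wifi_ticket": [1.0, 0.0, 0.0],
    "battery_ticket": [0.0, 1.0, 0.0],
    "refund_ticket": [0.0, 0.0, 1.0],
}
query = [0.9, 0.1, 0.0]

for doc_id, score in top_k(query, DOCS, k=2):
    print(doc_id, round(score, 3))
```

Only the two highest-scoring chunks are passed to the LLM as context, so a larger `similarity_top_k` trades more context (and tokens) for broader coverage.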

### **Step 7: Create a Query Tool**
Build a tool using the query engine for retrieving customer issue data.

In [31]:
query_tool = LlamaIndexTool.from_query_engine(
    query_engine,
    name="Customer Support Query Tool",
    description="Use this tool to lookup the customer ticket data"
)
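
Conceptually, a tool is a named, described callable: the agent's LLM reads the name and description to decide whether to call it, then passes a query string. The sketch below illustrates that idea only. `SimpleTool` and `fake_query_engine` are hypothetical stand-ins and do not reflect how `LlamaIndexTool` is implemented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """A named, described callable that an agent can choose to invoke."""
    name: str
    description: str
    func: Callable[[str], str]

    def run(self, query: str) -> str:
        """Forward the query to the wrapped function and return its answer."""
        return self.func(query)


def fake_query_engine(query: str) -> str:
    """Stand-in for a real query engine; echoes the query in a canned answer."""
    return f"Top matching tickets for: {query}"


ticket_tool = SimpleTool(
    name="Customer Support Query Tool",
    description="Use this tool to lookup the customer ticket data",
    func=fake_query_engine,
)

print(ticket_tool.run("GoPro customer issues"))
```

The description matters in practice because it is the text the agent reasons over when choosing between tools, so it should state plainly what data the tool can reach.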

### **Step 8: Define Agents**
Define agents with distinct roles and goals.

##### The code below creates an `Agent` object using CrewAI and assigns it a role to analyze customer ticket data.

Here's a detailed breakdown of the code and its components:

1. **Object creation**: `researcher = Agent(...)`
   - The `Agent` is instantiated as `researcher`.
   - This agent is designed to simulate an autonomous AI entity with specific attributes and responsibilities.

2. **Agent attributes**:
   - `role="Customer Ticket Analyst"`: Defines the agent's role as a customer ticket analyst. The agent's behavior and operations are tailored to this function.
   - `goal="Uncover insights about customer issue trends."`: Specifies the primary objective of the agent, which is to examine customer tickets and identify patterns.
   - `backstory="Analyze customer issues for brands like GoPro, LG, Dell, etc."`: Provides additional context about the agent's domain expertise and scope, here experience with handling tickets for major brands like GoPro, LG, and Dell. The backstory might influence how the agent prioritizes or contextualizes tasks.
   - `verbose=True`: Enables detailed logging. The agent provides verbose output during execution, such as explanations of actions, decisions, or intermediate results, which is useful for debugging or monitoring agent behavior.

3. **Tool integration**: `tools=[query_tool]`
   - A tool is a resource or utility the agent can use to achieve its goal.
   - `query_tool` is the querying utility built in Step 7 on top of the customer ticket index. It enables the agent to fetch relevant information.


In [32]:
researcher = Agent(
    role="Customer Ticket Analyst",
    goal="Uncover insights about customer issue trends.",
    backstory="Analyze customer issues for brands like GoPro, LG, Dell, etc.",
    verbose=True,
    tools=[query_tool]
)

writer = Agent(
    role="Product Content Specialist",
    goal="Craft compelling content on customer issue trends.",
    backstory="Transform customer data into engaging narratives.",
    verbose=True
)

### **Step 9: Create Tasks**
Assign tasks to the agents.

In [33]:
task1 = Task(
    description="Analyze customer issues for brands like GoPro, LG, Dell, etc.",
    expected_output="Detailed report with trends and insights.",
    agent=researcher
)

task2 = Task(
    description="Develop an engaging blog post summarizing customer issues for each brand.",
    expected_output="Bullet-point blog post highlighting key issues for each brand.",
    agent=writer
)
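
The two tasks are meant to run in order, with the analyst's report feeding the writer. A plain-Python sketch of that sequential handoff is shown below. It is an illustration of the pattern, not CrewAI's internals, and `run_sequential`, `research`, and `write` are hypothetical names with canned outputs.

```python
from typing import Callable, List


def run_sequential(steps: List[Callable[[str], str]], initial_input: str = "") -> List[str]:
    """Run steps in order, passing each step's output to the next as context."""
    outputs: List[str] = []
    context = initial_input
    for step in steps:
        context = step(context)
        outputs.append(context)
    return outputs


def research(_context: str) -> str:
    """Stand-in for the analyst task: returns a canned finding."""
    return "GoPro: Wi-Fi connection problems"


def write(report: str) -> str:
    """Stand-in for the writer task: formats the previous output as a bullet."""
    return f"- {report}"


outputs = run_sequential([research, write])
print(outputs[-1])  # the final step's output is the overall result
```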

### **Step 10: Instantiate the Crew**
Combine agents and tasks into a crew and execute the process.

In [34]:
crew = Crew(
    agents=[researcher, writer],
    tasks=[task1, task2],
    verbose=True
)

result = crew.kickoff()

[1m[95m# Agent:[00m [1m[92mCustomer Ticket Analyst[00m
[95m## Task:[00m [92mAnalyze customer issues for brands like GoPro, LG, Dell, etc.[00m


[1m[95m# Agent:[00m [1m[92mCustomer Ticket Analyst[00m
[95m## Thought:[00m [92mI need to gather data on customer issues for brands like GoPro, LG, and Dell to identify trends and insights. I will start by searching for customer ticket data related to these brands.[00m
[95m## Using tool:[00m [92mCustomer Support Query Tool[00m
[95m## Tool Input:[00m [92m
"{\"query\": \"GoPro customer issues\"}"[00m
[95m## Tool Output:[00m [92m
Several customers have reported issues with GoPro products. These include trouble connecting to home Wi-Fi networks, peculiar error messages, data loss, and problems that began after recent software updates. Some customers have also inquired about product recommendations and payment issues related to GoPro products.[00m


[1m[95m# Agent:[00m [1m[92mCustomer Ticket Analyst[00m
[95m## T

---

### **Output**
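The system generates a detailed analysis of customer issues and an engaging blog post summarizing key findings for each brand. The final text is stored in `result`, and its exact wording varies from run to run. For instance, the post might include bullets such as: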

- **GoPro**: Common complaints include poor battery life and app connectivity.
- **Fitbit**: Issues with syncing and inaccurate step tracking.

---

## **Conclusion**
Agentic RAG extends Retrieval-Augmented Generation by integrating decision-making agents. It enhances RAG's ability to handle complex queries, such as comparisons and summaries that span several documents. Using Python, CrewAI, and LlamaIndex, developers can create systems that turn raw data like customer support tickets into data-driven insights.

---

## **Key Takeaways**
- Agentic RAG integrates autonomous agents for dynamic decision-making.
- It is better suited than Naive RAG to multi-step reasoning and complex queries.
- Python and CrewAI simplify the implementation of Agentic RAG systems.

---

### **Frequently Asked Questions**
1. **What is Agentic RAG?**
   A hybrid RAG model that combines document retrieval with autonomous agents for enhanced decision-making.

2. **How does Agentic RAG improve Naive RAG?**
   By adding agents capable of multi-step reasoning and iterative data analysis.

3. **What libraries are used?**
   Key libraries include CrewAI (with crewai_tools), LlamaIndex, OpenAI, and kagglehub.

4. **What are its use cases?**
   Agentic RAG is applicable in legal research, market analysis, medical diagnosis, and more.