# Agents and RAG: A Technical Deep Dive

In this notebook, I'll use the Lang family (LangChain, LangGraph, LangSmith) and LlamaIndex to build and explore RAG from scratch, along with the techniques that agents make possible.

### Brief History

The concept of intelligent agents has evolved dramatically over the past seven decades, from simple rule-based systems to today's AI systems that can reason, plan, and act with a high degree of autonomy. Understanding this progression helps explain why modern agentic systems are such a significant step forward and why they are becoming central to how we build AI applications.

The journey began in the 1950s, when researchers like Allen Newell and Herbert Simon created the Logic Theorist, a program that could prove mathematical theorems by exploring different logical paths. These early agents were like skilled craftsmen: they could perform specific tasks very well, but only within narrow, pre-defined domains. The 1970s and 1980s brought expert systems like MYCIN for medical diagnosis and DENDRAL for chemical analysis. While impressive, these systems required months of manual knowledge engineering, in which human experts had to explicitly encode their domain knowledge into rigid rule sets.

The 1990s marked a shift toward more flexible software agents that could operate in networked environments and coordinate with other agents. This period introduced the concept of multi-agent systems, where multiple specialized agents could collaborate to solve complex problems. However, these systems still required extensive manual programming and could only handle situations their creators had anticipated.

The real transformation began in the 2000s with machine learning advances. Agents could now learn from data rather than relying solely on hand-coded rules. In the 2010s, virtual assistants like Siri and Alexa brought agent technology to mainstream consumers, though they remained relatively narrow in scope: essentially voice interfaces for search and simple task execution.

<img src="https://miro.medium.com/1*Ygen57Qiyrc8DXAFsjZLNA.gif" width=700>

The breakthrough arrived with large language models, starting around 2020. Systems like GPT-3 and GPT-4 combined broad knowledge with strong reasoning abilities, producing agents that could understand natural language, maintain context across conversations, and tackle a wide variety of tasks without task-specific programming. Unlike their predecessors, these modern agents can break complex problems into steps, use external tools when needed, and adapt to situations they have not encountered before.

This evolution represents a fundamental shift from automation to augmentation. Where early agents automated specific, predefined tasks, today's agents can understand our goals and work as collaborative partners in problem-solving. They can handle ambiguous instructions, incomplete information, and constantly changing contexts, capabilities that make them valuable for building sophisticated applications like retrieval-augmented generation systems.

## Agents

When we talk about agents in 2025, we're entering a landscape where the term has become both ubiquitous and somewhat ambiguous. Different organizations and researchers use "agent" to describe everything from simple chatbots to fully autonomous systems that can operate independently for weeks. This diversity in definition isn't just academic—it reflects fundamentally different architectural approaches that will determine how we build the next generation of AI applications.

<img src="https://substackcdn.com/image/fetch/$s_!A_Oy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3177e12-432e-4e41-814f-6febf7a35f68_1360x972.png" width=700>

At its core, an agent is a system that can perceive its environment, make decisions, and take actions to achieve specific goals. However, the way these capabilities are implemented varies dramatically. Some define agents as fully autonomous systems that operate independently over extended periods, using various tools and adapting their strategies based on feedback. Think of these as a personal assistant who can manage your entire schedule, book flights, handle emails, and make decisions on your behalf without constant supervision.
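The perceive, decide, act cycle in this definition can be sketched without any LLM at all. The snippet below is a minimal illustration, not a framework API: the `Thermostat` environment, the `decide` policy, and `run_control_loop` are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """Toy environment (hypothetical): a room whose temperature the agent can nudge."""
    temperature: float

    def read(self) -> float:
        # Perception: what the agent is allowed to observe.
        return self.temperature

    def apply(self, action: str) -> None:
        # Action: how the agent changes the environment.
        if action == "heat":
            self.temperature += 1.0
        elif action == "cool":
            self.temperature -= 1.0


def decide(observation: float, goal: float, tolerance: float = 0.5) -> str:
    """Decision: pick the action that moves the observation toward the goal."""
    if observation < goal - tolerance:
        return "heat"
    if observation > goal + tolerance:
        return "cool"
    return "stop"


def run_control_loop(env: Thermostat, goal: float, max_steps: int = 20) -> list[str]:
    """Repeat perceive -> decide -> act until the goal is met or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        observation = env.read()
        action = decide(observation, goal)
        history.append(action)
        if action == "stop":
            break
        env.apply(action)
    return history


room = Thermostat(temperature=18.0)
print(run_control_loop(room, goal=21.0))
```

In an LLM-based agent, the `decide` step is typically a model call and `apply` becomes a tool invocation, but the shape of the loop stays the same.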

Others use the term more broadly to describe any system that follows predefined workflows to accomplish tasks. These implementations are more like following a detailed recipe—each step is predetermined, and while the system can handle some variations, it operates within clearly defined boundaries. The distinction between these approaches is crucial because it affects everything from system reliability to development complexity.

The most useful way to think about this spectrum is through the lens of control and decision-making. Workflows are systems where large language models and tools are orchestrated through predefined code paths. Every decision point is anticipated by the developer, and the system follows predetermined logic to handle different scenarios. Agents, in contrast, are systems where the LLM dynamically directs its own processes and tool usage, maintaining control over how it accomplishes tasks. The model itself decides what to do next, which tools to use, and how to adapt when things don't go as planned.
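To make the distinction concrete, here is a small sketch in plain Python. Everything in it is an assumption made for illustration: `fake_llm` is a deterministic stand-in for a real model call, and `search_docs` and `calculator` are hypothetical tools. The point is only where the control flow lives: in developer code for the workflow, in the model's output for the agent.

```python
def search_docs(query: str) -> str:
    """Hypothetical tool: pretend to look something up."""
    return f"[docs about {query}]"


def calculator(expression: str) -> str:
    """Hypothetical tool: only understands 'a+b' with integers."""
    left, right = expression.split("+")
    return str(int(left) + int(right))


TOOLS = {"search_docs": search_docs, "calculator": calculator}


def fake_llm(task: str, observations: list[str]) -> dict:
    """Stand-in for a model call (assumption): returns the next step to take."""
    if not observations:
        # First step: the "model" chooses a tool based on the task itself.
        if any(ch.isdigit() for ch in task):
            return {"tool": "calculator", "input": task}
        return {"tool": "search_docs", "input": task}
    # Once it has an observation, it decides it can answer.
    return {"final": f"Answer based on {observations[-1]}"}


def run_workflow(task: str) -> str:
    """Workflow: the developer fixes the path in code (always search, then answer)."""
    context = search_docs(task)
    return f"Answer based on {context}"


def run_agent(task: str, max_steps: int = 5) -> str:
    """Agent: the model's output decides which tool runs next and when to stop."""
    observations: list[str] = []
    for _ in range(max_steps):
        step = fake_llm(task, observations)
        if "final" in step:
            return step["final"]
        tool = TOOLS[step["tool"]]
        observations.append(tool(step["input"]))
    return "Stopped: step budget exhausted"


print(run_workflow("2+3"))  # the fixed path searches even though this is arithmetic
print(run_agent("2+3"))     # the "model" routes the same task to the calculator
```

The `max_steps` cap is a common safeguard in this kind of loop: because the model controls the flow, the surrounding code needs its own stopping condition.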

#### Simplicity, not complexity, defines perfection


When building applications with LLMs, the fundamental principle should be finding the simplest solution that meets your requirements. This might mean not building agentic systems at all. Agentic systems inherently trade latency and cost for better task performance, and you need to carefully consider when this tradeoff makes sense for your specific use case.

When more complexity is warranted, workflows offer predictability and consistency for well-defined tasks where you can anticipate most scenarios and edge cases. They're excellent for standardized processes like data processing pipelines, content moderation, or structured analysis tasks. Agents become the better choice when you need flexibility and model-driven decision-making at scale—situations where the variety of inputs and required responses is too broad to predefine, or where the system needs to adapt to entirely new scenarios.

The reality is that for many applications, the most effective approach involves optimizing single LLM calls with retrieval and in-context examples rather than building complex agentic systems. However, as we'll explore throughout this tutorial, there are compelling scenarios where the additional complexity of agents becomes not just beneficial, but necessary for achieving your goals. Understanding when and how to make this transition is what separates effective AI system builders from those who over-engineer solutions to problems that could be solved more simply.
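As a rough illustration of "a single LLM call with retrieval and in-context examples", the sketch below assembles such a prompt in plain Python. It is a toy under stated assumptions: `retrieve` ranks documents by simple word overlap as a stand-in for real vector search, `build_prompt` and the prompt layout are hypothetical, and the model call itself is left out.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (toy stand-in for vector search)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(
    query: str,
    documents: list[str],
    examples: list[tuple[str, str]],
    k: int = 2,
) -> str:
    """Combine retrieved context and few-shot examples into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return (
        "Answer using only the context.\n\n"
        f"Context:\n{context}\n\n"
        f"{shots}\n\n"
        f"Q: {query}\nA:"
    )


documents = [
    "FAISS is a library for vector similarity search",
    "Chroma is an open source vector database",
    "BM25 is a keyword ranking function",
]
examples = [("what is BM25", "A keyword ranking function.")]

# The resulting string would be sent to the model in a single call.
print(build_prompt("chroma vector database", documents, examples))
```

Nothing here loops or picks tools: retrieval happens once, in code, before the model is called. That is what keeps this approach cheaper and more predictable than an agent.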




#### Prompts


##### LangChain PromptTemplate


##### LangChain ChatPromptTemplate


##### LangGraph StateGraph Chains

### Tools

#### Context Management

##### LangChain ConversationBufferMemory

##### LangChain ConversationSummaryMemory


##### LangChain ConversationBufferWindowMemory


##### LangChain ConversationTokenBufferMemory


##### LangGraph MemorySaver


##### LangGraph MessagesState


##### LlamaIndex ChatMemoryBuffer








##### LlamaIndex VectorMemory

#### Routing Workflow



#### Parallelization Workflow



##### LangGraph Parallel Execution


##### LangChain RunnableParallel

#### Orchestrator Workers Workflow



##### LangGraph Multi-Agent Patterns


##### LangChain Agent Executors


##### LlamaIndex AgentRunner

## RAG

### But Why RAG?

Talk about LLM systems in general and, building on the introduction to agents, show where those workflows run into the limits of LLMs in context and actions.

### Finding the Data

#### Webscraping


##### LangChain WebBaseLoader


##### LangChain AsyncHtmlLoader


##### LangChain SitemapLoader


##### LangChain PlaywrightURLLoader


##### LlamaIndex SimpleWebPageReader



##### LlamaIndex BeautifulSoupWebReader

#### Document Loading


##### LangChain PyPDFLoader


##### LangChain UnstructuredFileLoader


##### LangChain CSVLoader


##### LangChain JSONLoader


##### LlamaIndex SimpleDirectoryReader


##### LlamaIndex PDFReader

### Preprocessing the documents

#### Splitting


##### LangChain RecursiveCharacterTextSplitter


##### LangChain TokenTextSplitter


##### LangChain MarkdownHeaderTextSplitter


##### LangChain PythonCodeTextSplitter


##### LlamaIndex SentenceSplitter


##### LlamaIndex SemanticSplitterNodeParser


##### LlamaIndex HierarchicalNodeParser

#### Chunking



##### LangChain SemanticChunker


##### LangChain ParentDocumentRetriever


##### LlamaIndex SimpleNodeParser


##### LlamaIndex SentenceWindowNodeParser

#### Embedding


##### LangChain OpenAIEmbeddings


##### LangChain HuggingFaceEmbeddings


##### LlamaIndex OpenAIEmbedding



##### LlamaIndex HuggingFaceEmbedding

### Storing Documents

#### Vector Databases


##### LangChain Chroma Integration


##### LangChain Pinecone Integration


##### LangChain FAISS Integration


##### LlamaIndex ChromaVectorStore



##### LlamaIndex PineconeVectorStore

#### Knowledge Graphs


##### LangGraph StateGraph


##### LangChain Neo4jGraph



##### LlamaIndex KnowledgeGraphIndex

#### SQL


##### LangChain SQLDatabase


##### LangChain SQLDatabaseChain



##### LlamaIndex SQLStructStoreIndex

### Retrieval Mechanisms

#### Vector search


##### LangChain VectorStoreRetriever


##### LangChain MultiVectorRetriever


##### LlamaIndex VectorIndexRetriever



##### LlamaIndex VectorIndexAutoRetriever

#### Tree Search

#### Node Search

#### Hybrid Search

##### LangChain EnsembleRetriever
##### LangChain BM25Retriever
##### LlamaIndex QueryFusionRetriever

##### LangGraph Conditional Edges


##### LangGraph Router Patterns


##### LlamaIndex RouterQueryEngine

### Evaluation

#### Faithfulness & Accuracy

#### RAGAS (RAG Assessment)



##### LangSmith + RAGAS Integration


##### LangChain Evaluation Chains

#### TruLens RAG Triad


#### Multi-Agent Metrics


#### Advanced Agentic Patterns

#### Human Evaluation


#### LLM-as-Judge



##### LangSmith Tracing


##### LangSmith Evaluation Datasets


##### LangSmith Custom Evaluators

## A Complete Agentic System



##### LangGraph Agent Architecture


##### LangChain Agent Types (ReAct, Plan-and-Execute)


##### LangSmith Agent Monitoring


##### LlamaIndex Multi-Agent Orchestrator

## Limitations & Variations

#### RAPTOR

#### Self-RAG

#### CRAG

#### Adaptive RAG

## Summary

## Citations