## From RAG to Agents: Building Smart AI Assistants
An attempt to recover my lost notes after Codespaces crashed on me 😫

### What is RAG?
RAG (Retrieval-Augmented Generation) is a technique that combines information retrieval with generation by a Large Language Model (LLM). It is particularly effective when you have a specific knowledge base and want the LLM to answer questions using only that context.

RAG consists of 3 parts:
- Search → Finds relevant docs (e.g., FAQ entries about course enrollment).
- Prompt → Combines docs + query into a structured template.
- LLM → Generates a grounded answer (e.g., "Yes, you can join late (see FAQ).").
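
The three parts above can be sketched in a few lines. This is a minimal illustration, not a real implementation: the `FAQ` entries, the keyword-overlap `search`, the prompt wording, and the `llm` callable are all assumptions made up for the example. In practice `search` would query a real index and `llm` would call a model API.

```python
import re

# Toy knowledge base (hypothetical entries, for illustration only).
FAQ = [
    {"question": "Can I join the course after it has started?",
     "answer": "Yes, you can still join late and submit homework."},
    {"question": "How do I install Docker?",
     "answer": "Follow the official Docker installation guide for your OS."},
]

PROMPT_TEMPLATE = """You are a course teaching assistant.
Answer the QUESTION using only the CONTEXT.

QUESTION: {question}

CONTEXT:
{context}"""


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def search(query, docs=FAQ, top_k=1):
    """Search step: rank docs by how many words they share with the query."""
    query_words = tokenize(query)

    def score(doc):
        return len(query_words & tokenize(doc["question"] + " " + doc["answer"]))

    return sorted(docs, key=score, reverse=True)[:top_k]


def build_prompt(query, docs):
    """Prompt step: combine the retrieved docs and the query into one template."""
    context = "\n".join(f"Q: {d['question']}\nA: {d['answer']}" for d in docs)
    return PROMPT_TEMPLATE.format(question=query, context=context)


def rag(query, llm):
    """Run search -> prompt -> LLM. `llm` is any callable taking a prompt string."""
    docs = search(query)
    prompt = build_prompt(query, docs)
    return llm(prompt)
```

Passing `llm` in as a plain callable keeps the sketch independent of any particular model provider; any function that takes a prompt string and returns text will do.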

### AI Agents
AI agents are autonomous systems that interact with an environment, make decisions, and perform actions (e.g., search, answer, modify data). They can:

- Make decisions about what actions to take
- Use tools to accomplish tasks
- Maintain state and context
- Learn from previous interactions
- Work towards specific goals

An agentic flow is not necessarily a fully independent agent, but it can still make some decisions while the flow executes.

A typical agentic flow consists of:
- Receiving a user request
- Analyzing the request and available tools
- Deciding on the next action
- Executing the action using appropriate tools
- Evaluating the results
- Either completing the task or continuing with more actions
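
The steps above map naturally onto a loop. The sketch below is one possible shape for it, under assumptions: `run_agent`, the `decide` callable, the `tools` dictionary, and the `"finish"` action name are hypothetical names chosen for this example. In a real agent, `decide` would be an LLM call; here it is scripted so the loop can be run as-is.

```python
def run_agent(request, tools, decide, max_steps=5):
    """Loop: decide on an action, execute it, record the result, repeat."""
    history = []  # state: every action taken so far and what it returned
    for _ in range(max_steps):
        # Analyze the request, the available tools, and the history.
        decision = decide(request, tools, history)
        if decision["action"] == "finish":
            # The task is complete: return the answer and the trace.
            return decision["answer"], history
        # Execute the chosen tool and keep the result for the next decision.
        tool = tools[decision["action"]]
        result = tool(decision["input"])
        history.append({"action": decision["action"],
                        "input": decision["input"],
                        "result": result})
    # Step budget exhausted without a final answer.
    return None, history


def scripted_decide(request, tools, history):
    """Stand-in for an LLM: search once, then answer from the result."""
    if not history:
        return {"action": "search", "input": request}
    return {"action": "finish", "answer": f"Based on: {history[-1]['result']}"}


# Hypothetical tool set with a single fake search tool.
tools = {"search": lambda q: f"FAQ entry matching '{q}'"}
answer, history = run_agent("late enrollment", tools, scripted_decide)
```

The `max_steps` cap is a safety net: if `decide` never returns `"finish"`, the loop still terminates.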

The key difference from basic RAG is that agents can:
- Make multiple search queries
- Combine information from different sources
- Decide when to stop searching
- Use their own knowledge when appropriate
- Chain multiple actions together

So in agentic RAG, the system:
- has access to the history of previous actions
- makes decisions independently based on the current information and the previous actions

### Agentic RAG (Decision-Making)
Agentic RAG enhances basic RAG by allowing the AI assistant to decide whether to answer a question directly using its own knowledge or to perform a search in the FAQ database. This is achieved by modifying the prompt to include instructions and output templates for different actions:

- SEARCH: If the context is empty and the LLM decides it needs more information from the FAQ database.
- ANSWER (source: CONTEXT): If the LLM can answer the question using the provided context from a search.
- ANSWER (source: OWN_KNOWLEDGE): If the context does not contain the answer, or if the question can be answered without needing a search, the LLM uses its internal knowledge.
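
A prompt with these three output templates might look roughly like the sketch below. The prompt wording, the choice of JSON as the output format, and the names `AGENT_PROMPT`, `agentic_rag`, `fake_llm`, and `fake_search` are all assumptions for illustration. The fake LLM simply asks for a search when the context is empty and answers otherwise.

```python
import json

# Doubled braces are literal braces after str.format().
AGENT_PROMPT = """You are a course teaching assistant.
You are given a QUESTION and a CONTEXT built from the FAQ database.
Choose exactly one action and reply with JSON only.

If CONTEXT is empty and you need the FAQ database:
{{"action": "SEARCH", "reasoning": "<why a search is needed>"}}

If CONTEXT answers the question:
{{"action": "ANSWER", "answer": "<your answer>", "source": "CONTEXT"}}

If CONTEXT does not help, or no search is needed:
{{"action": "ANSWER", "answer": "<your answer>", "source": "OWN_KNOWLEDGE"}}

QUESTION: {question}
CONTEXT: {context}"""


def agentic_rag(question, llm, search):
    """Ask the LLM for an action; run at most one search, then ask again."""
    context = "EMPTY"
    reply = json.loads(llm(AGENT_PROMPT.format(question=question, context=context)))
    if reply["action"] == "SEARCH":
        # The LLM asked for more information: search, then re-prompt with context.
        context = "\n".join(search(question))
        reply = json.loads(llm(AGENT_PROMPT.format(question=question, context=context)))
    return reply


def fake_llm(prompt):
    """Stand-in for a real model: SEARCH on empty context, ANSWER otherwise."""
    if prompt.rstrip().endswith("CONTEXT: EMPTY"):
        return json.dumps({"action": "SEARCH", "reasoning": "context is empty"})
    return json.dumps({"action": "ANSWER",
                       "answer": "Yes, you can join late.",
                       "source": "CONTEXT"})


def fake_search(query):
    """Stand-in for the FAQ search; returns a list of text snippets."""
    return ["Late enrollment is allowed."]


result = agentic_rag("Can I join late?", fake_llm, fake_search)
```

Structured output such as JSON makes the chosen action easy to branch on in code. A real model may return malformed JSON, so production code would need error handling around `json.loads`.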

### Difference between Basic RAG and Agentic RAG

| Feature            | Basic RAG                          | Agentic RAG                          |
|--------------------|------------------------------------|--------------------------------------|
| **Decision-Making** | Always retrieves before answering  | Decides whether to retrieve (e.g., skips search for simple queries) |
| **Flexibility**    | Linear flow (search → prompt → LLM) | Dynamic loops (iterative queries, multi-source combining) |
| **Memory**         | No history of past actions         | Tracks previous searches/queries to avoid redundancy |
| **Tool Use**       | Single retrieval tool              | Chains multiple tools (search + edit + APIs) |
| **Autonomy**       | Follows fixed instructions         | Makes independent decisions (e.g., stops after max iterations) |



### Agentic search
Agentic search extends RAG with multi-query exploration and iterative refinement.

**How it works:**
- Reformulates queries (e.g., "How to excel in Module 1?" → "Docker best practices").
- Combines results from multiple searches.
- Stops when sufficient context is gathered or the maximum number of iterations is reached.

Advantage: Deeper topic coverage than single-query RAG.
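
The loop described in this section could be sketched as follows. Everything here is hypothetical: `agentic_search`, the `propose_queries` and `is_enough` callables (which an LLM would normally back), and the toy `INDEX` data are assumptions made for illustration.

```python
def agentic_search(question, search, propose_queries, is_enough, max_iterations=3):
    """Run several rounds of search, reformulating queries between rounds."""
    seen_queries = []  # memory of past queries, to avoid repeating them
    results = {}       # doc id -> doc, so duplicates across searches collapse
    queries = [question]
    for _ in range(max_iterations):
        for query in queries:
            if query in seen_queries:
                continue
            seen_queries.append(query)
            for doc in search(query):
                results[doc["id"]] = doc
        # Stop early once the gathered context is judged sufficient.
        if is_enough(question, list(results.values())):
            break
        # Otherwise ask for reformulated queries and go another round.
        queries = propose_queries(question, seen_queries, list(results.values()))
    return list(results.values()), seen_queries


# Toy data and stand-ins for the LLM-backed pieces (all made up).
INDEX = {
    "How to excel in Module 1?": [{"id": 1, "text": "Module 1 introduces Docker."}],
    "Docker best practices": [{"id": 2, "text": "Keep images small."},
                              {"id": 1, "text": "Module 1 introduces Docker."}],
}


def toy_search(query):
    return INDEX.get(query, [])


def toy_propose(question, seen_queries, docs):
    return ["Docker best practices"]


def toy_is_enough(question, docs):
    return len(docs) >= 2


docs, queries_used = agentic_search(
    "How to excel in Module 1?", toy_search, toy_propose, toy_is_enough
)
```

Keying results by document id is what lets the loop combine results from multiple searches without duplicates, and `seen_queries` plays the role of the search history.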

