# Introduction to LlamaIndex

Welcome to the first notebook in our LlamaIndex learning series! This notebook will introduce you to the fundamentals of LlamaIndex and help you understand its core concepts.

## Learning Objectives

By the end of this notebook, you will:
1. Understand what LlamaIndex is and why it's useful
2. Set up your development environment
3. Load your first document
4. Create a simple index
5. Query your data using natural language

---

## 1. What is LlamaIndex?

**LlamaIndex** (formerly GPT Index) is a data framework for connecting Large Language Models (LLMs) to external data sources. Its main building blocks are summarized below.

### Key Capabilities

| Feature | Description |
|---------|-------------|
| **Data Connectors** | Load data from 100+ sources (PDFs, databases, APIs, etc.) |
| **Data Indexes** | Structure your data for efficient LLM consumption |
| **Query Engines** | Natural language interface to your data |
| **Agents** | Autonomous AI that can use tools and make decisions |
| **Workflows** | Complex multi-step orchestration |

### Why LlamaIndex?

LLMs like GPT-4, Claude, and Llama are trained on data up to a cutoff date, so on their own they have no knowledge of:
- Your private documents
- Events that happened after their training cutoff
- Domain-specific information that was absent from their training data

**LlamaIndex bridges this gap** by enabling RAG (Retrieval-Augmented Generation):

```
User Query → Retrieve Relevant Context → Augment LLM Prompt → Generate Response
```
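
The flow above can be sketched in plain Python without any LLM or LlamaIndex dependency. This is a toy illustration only: the keyword-overlap `retrieve` function, the `fake_llm` stub, and the `DOCS` list are all hypothetical stand-ins, and real retrieval in LlamaIndex uses embeddings rather than word overlap.

```python
import re

# Toy knowledge base standing in for private documents (made up for this sketch).
DOCS = [
    "LlamaIndex connects LLMs with external data sources.",
    "RAG retrieves relevant context before the LLM generates a response.",
    "Bananas are rich in potassium.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:top_k]

def augment(query: str, context: list[str]) -> str:
    """Build a prompt that places the retrieved context before the question."""
    context_block = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{context_block}\n\nQuestion: {query}\nAnswer:"

def fake_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; it only reports the prompt size."""
    return f"(an LLM would answer here, given a {len(prompt)}-character prompt)"

query = "What does RAG retrieve?"
context = retrieve(query, DOCS)   # Retrieve Relevant Context
prompt = augment(query, context)  # Augment LLM Prompt
print(prompt)
print(fake_llm(prompt))           # Generate Response
```

The key point is that the model never needs retraining: the relevant text travels inside the prompt at query time.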

## 2. Environment Setup

Let's set up our environment with the necessary imports and API keys.

In [None]:
# Handle async in Jupyter notebooks
import nest_asyncio
nest_asyncio.apply()

# Load environment variables from .env file
from dotenv import load_dotenv
import os

load_dotenv()

# Verify API key is set (don't print the actual key!)
if os.getenv("OPENAI_API_KEY"):
    print("✓ OpenAI API key is configured")
else:
    print("✗ OpenAI API key not found. Please set it in your .env file")

In [None]:
# Core LlamaIndex imports
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Settings,
    Document,
)

# LLM and Embedding imports
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

print("✓ All imports successful!")

## 3. Configure Global Settings

LlamaIndex uses a `Settings` object for global configuration. This includes:
- Which LLM to use for generation
- Which embedding model to use for vector representations
- Chunk sizes and other parameters

### Understanding LLMs vs Embeddings

| Component | Purpose | Example |
|-----------|---------|--------|
| **LLM** | Generate human-like text responses | GPT-4, Claude, Llama |
| **Embedding Model** | Convert text to numerical vectors for similarity search | text-embedding-3-small |
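
To make "similarity search" concrete, here is a minimal sketch of cosine similarity, a common way to compare embedding vectors. The three-dimensional vectors are made up for illustration; a real embedding model returns vectors with far more dimensions, and the values below are not actual model output.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings": related words point in similar directions.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

sim_close = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
sim_far = cosine_similarity(toy_embeddings["cat"], toy_embeddings["invoice"])
print(f"cat vs kitten:  {sim_close:.3f}")
print(f"cat vs invoice: {sim_far:.3f}")
```

Texts with related meaning should score closer to 1.0 than unrelated ones, which is what lets a vector index find relevant chunks for a question.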

In [None]:
# Configure the LLM (Language Model)
# gpt-4o-mini is cost-effective for learning; a larger model such as gpt-4o may suit production better
Settings.llm = OpenAI(
    model="gpt-4o-mini",
    temperature=0.1,  # Lower = more deterministic, Higher = more creative
)

# Configure the Embedding Model
# This converts text into numerical vectors for semantic search
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
)

print("✓ Settings configured!")
print(f"  LLM: {Settings.llm.model}")
print(f"  Embedding: {Settings.embed_model.model_name}")
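
The effect of the `temperature` value set above can be illustrated with the textbook softmax-with-temperature formula. This is a sketch under assumptions: the logits are hypothetical scores for three candidate tokens, and how a given provider implements sampling internally is not something this notebook verifies.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At a low temperature almost all probability mass lands on the top-scoring token, so output is nearly deterministic; higher values spread the mass out. Note that this formula is undefined at exactly `temperature=0`.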

## 4. Loading Documents

LlamaIndex provides multiple ways to load data:

### Method 1: Direct Document Creation
Create documents directly from strings - useful for testing.

In [None]:
# Method 1: Create documents directly from text
doc1 = Document(
    text="LlamaIndex is a data framework for LLM applications. "
         "It helps connect custom data sources to large language models.",
    metadata={"source": "manual", "topic": "introduction"}
)

doc2 = Document(
    text="RAG stands for Retrieval-Augmented Generation. "
         "It combines retrieval of relevant documents with LLM generation "
         "to produce more accurate and contextual responses.",
    metadata={"source": "manual", "topic": "rag"}
)

documents_manual = [doc1, doc2]
print(f"Created {len(documents_manual)} documents manually")

# Inspect document structure
print("\nDocument 1 preview:")
print(f"  Text: {doc1.text[:50]}...")
print(f"  Metadata: {doc1.metadata}")

### Method 2: SimpleDirectoryReader
Load documents from files in a directory - the most common approach.

In [None]:
# Method 2: Load from files using SimpleDirectoryReader
# It picks a parser per file type (txt, pdf, docx, etc.); some formats may need extra packages installed

reader = SimpleDirectoryReader(
    input_dir="../data/sample_docs",
    recursive=True,  # Include subdirectories
)

documents_from_files = reader.load_data()

print(f"Loaded {len(documents_from_files)} documents from files")
for i, doc in enumerate(documents_from_files):
    print(f"  Document {i+1}: {doc.metadata.get('file_name', 'Unknown')}")

## 5. Creating an Index

An **Index** is the core data structure in LlamaIndex. It:
1. Chunks documents into smaller pieces
2. Converts chunks to embeddings (vectors)
3. Stores vectors for efficient similarity search

The most common index type is `VectorStoreIndex`:

```
Documents → Chunks (Nodes) → Embeddings → Vector Store
```
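
The "Chunks" step can be sketched with a simple character-based splitter. This is a deliberate simplification: LlamaIndex's default splitter is sentence- and token-aware, whereas the hypothetical `chunk_text` helper below just slides a fixed-size window over the text. It is meant only to show why chunk size and overlap matter.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "LlamaIndex splits documents into nodes before embedding them."
chunks = chunk_text(sample, chunk_size=25, chunk_overlap=5)
for i, chunk in enumerate(chunks):
    print(f"Node {i}: {chunk!r}")
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence cut at a boundary still appears intact in at least one chunk.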

In [None]:
# Create a VectorStoreIndex from our documents
# This process:
# 1. Splits documents into chunks (nodes)
# 2. Generates embeddings for each chunk
# 3. Stores in an in-memory vector store

print("Creating index... (this may take a moment)")
index = VectorStoreIndex.from_documents(
    documents_from_files,
    show_progress=True  # Show progress bar
)

print("\n✓ Index created successfully!")

### Understanding Nodes (Chunks)

Documents are split into **Nodes** for more granular retrieval. Let's examine them:

In [None]:
# Access the underlying nodes (chunks) through the index's document store
docstore = index.docstore
nodes = list(docstore.docs.values())

print(f"Total nodes created: {len(nodes)}")
print("\n--- Sample Node ---")
if nodes:
    sample_node = nodes[0]
    print(f"Node ID: {sample_node.node_id[:20]}...")
    print(f"Text preview: {sample_node.text[:200]}...")
    print(f"Metadata: {sample_node.metadata}")

## 6. Querying Your Data

Now for the exciting part - querying your data using natural language!

A **Query Engine** provides a simple interface:
1. Takes your question
2. Finds relevant chunks using similarity search
3. Sends chunks + question to the LLM
4. Returns a synthesized response
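
Step 2 above, finding relevant chunks, can be sketched as a top-k search over stored vectors. Everything here is hypothetical: the `store` list, its made-up three-dimensional embeddings, and the query vector stand in for what the embedding model and vector store would provide in practice.

```python
import heapq
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical in-memory "vector store": (chunk text, made-up embedding) pairs.
store = [
    ("Supervised learning uses labeled data.", [0.9, 0.1, 0.1]),
    ("Python is popular for AI development.", [0.1, 0.9, 0.2]),
    ("Unsupervised learning finds structure in unlabeled data.", [0.8, 0.2, 0.2]),
    ("AI raises ethical questions about bias.", [0.1, 0.2, 0.9]),
]

def top_k(query_vec: list[float], k: int) -> list[tuple[float, str]]:
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = ((cosine(query_vec, vec), text) for text, vec in store)
    return heapq.nlargest(k, scored)

# Pretend this is the embedding of a question about types of machine learning.
query_vec = [1.0, 0.0, 0.1]
results = top_k(query_vec, k=2)
for score, text in results:
    print(f"{score:.4f}  {text}")
```

The `k` parameter plays the same role as `similarity_top_k` on the query engine: a larger value gives the LLM more context at the cost of a longer prompt.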

In [None]:
# Create a query engine from the index
query_engine = index.as_query_engine(
    similarity_top_k=3,  # Retrieve the 3 most similar chunks for each query
)

print("✓ Query engine ready!")

In [None]:
# Let's ask our first question!
query = "What is artificial intelligence?"

print(f"Question: {query}")
print("-" * 50)

response = query_engine.query(query)

print(f"Answer:\n{response}")

In [None]:
# Ask more questions!
questions = [
    "What are the main types of machine learning?",
    "How is Python used in AI development?",
    "What are the ethical considerations of AI?",
]

for q in questions:
    print(f"\n{'='*60}")
    print(f"Q: {q}")
    print("-" * 60)
    response = query_engine.query(q)
    print(f"A: {response}")

## 7. Understanding the Response

The response object contains more than just the text. Let's explore it:

In [None]:
# Make a query and examine the full response
query = "What are the applications of deep learning?"
response = query_engine.query(query)

print("=== Response Analysis ===")
print(f"\n1. Response Text:\n{response}")

print("\n2. Source Nodes (chunks used to generate response):")
for i, node in enumerate(response.source_nodes):
    # score can be None for some retrievers, so guard before formatting
    score = f"{node.score:.4f}" if node.score is not None else "N/A"
    print(f"\n   Source {i+1}:")
    print(f"   - Score: {score}")
    print(f"   - File: {node.metadata.get('file_name', 'N/A')}")
    print(f"   - Text preview: {node.text[:100]}...")

## 8. Saving and Loading Indexes

Building an index calls the embedding model for every chunk, which costs time and API usage on large datasets. Persisting the index to disk lets you avoid rebuilding it:

In [None]:
# Save the index to disk
PERSIST_DIR = "./storage/introduction_index"

index.storage_context.persist(persist_dir=PERSIST_DIR)
print(f"✓ Index saved to {PERSIST_DIR}")

In [None]:
# Load the index from disk (useful when restarting your notebook)
from llama_index.core import StorageContext, load_index_from_storage

# Rebuild storage context from persisted data
storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)

# Load the index
loaded_index = load_index_from_storage(storage_context)
print("✓ Index loaded from disk!")

# Verify it works
loaded_query_engine = loaded_index.as_query_engine()
test_response = loaded_query_engine.query("What is Python?")
print(f"\nTest query response: {str(test_response)[:200]}...")
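
A common way to combine the two cells above is a "load if present, otherwise build and save" pattern. The sketch below shows it with a hypothetical generic helper, `load_or_build`, and stand-in functions that write a plain text file, so it runs without an API key. The comments indicate how the LlamaIndex calls from this section could be plugged in; that wiring is an assumption, not something this notebook runs.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")

def load_or_build(
    persist_dir: str,
    build: Callable[[], T],
    load: Callable[[str], T],
    save: Callable[[T, str], None],
) -> T:
    """Load a persisted object if persist_dir has content, else build and save it."""
    if os.path.isdir(persist_dir) and os.listdir(persist_dir):
        return load(persist_dir)
    obj = build()
    os.makedirs(persist_dir, exist_ok=True)
    save(obj, persist_dir)
    return obj

# With LlamaIndex, the callables could wrap the calls used in this section, e.g.:
#   build = lambda: VectorStoreIndex.from_documents(documents_from_files)
#   save  = lambda idx, d: idx.storage_context.persist(persist_dir=d)
#   load  = lambda d: load_index_from_storage(StorageContext.from_defaults(persist_dir=d))

# Stand-in functions so the sketch runs on its own.
calls = []

def build_demo() -> str:
    calls.append("build")
    return "expensive result"

def save_demo(obj: str, directory: str) -> None:
    with open(os.path.join(directory, "data.txt"), "w", encoding="utf-8") as f:
        f.write(obj)

def load_demo(directory: str) -> str:
    calls.append("load")
    with open(os.path.join(directory, "data.txt"), encoding="utf-8") as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "demo_index")
    first = load_or_build(target, build_demo, load_demo, save_demo)
    second = load_or_build(target, build_demo, load_demo, save_demo)

print(calls)
```

The first call builds and saves; the second finds the persisted files and loads them instead. Remember to delete the persist directory when the source documents change, or the stale index will keep being loaded.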

## 9. Summary

Congratulations! You've completed the introduction to LlamaIndex. Let's recap what you learned:

### Key Concepts

| Concept | Description |
|---------|-------------|
| **Document** | Container for your raw data with metadata |
| **Node** | A chunk of a document, the basic retrieval unit |
| **Index** | Data structure for efficient similarity search |
| **Query Engine** | Natural language interface to query your data |
| **Embedding** | Numerical vector representation of text |

### The Basic RAG Flow

```
1. LOAD:    Documents from files/APIs/databases
2. INDEX:   Create vector embeddings, store in index
3. QUERY:   Find relevant chunks, generate response with LLM
```

### Next Steps

In the next notebook (`02_simple_rag.ipynb`), we'll:
- Build a complete RAG pipeline
- Customize chunking strategies
- Work with different document types
- Understand retrieval in more depth

---

## Exercises

Try these exercises to reinforce your learning:

1. **Create your own documents**: Add more text files to the `../data/sample_docs` folder, then re-run the loading and indexing cells

2. **Experiment with questions**: Try asking different types of questions (factual, analytical, comparative)

3. **Tune the LLM**: Change the `temperature` value in the `Settings.llm` cell, re-run it, then re-create the query engine so it picks up the new LLM, and observe how responses change

4. **Examine source nodes**: For each query, look at which chunks were retrieved and their scores

In [None]:
# Exercise space - try your own code here!

# Example: Try a different question
my_question = "Your question here"
# my_response = query_engine.query(my_question)
# print(my_response)