# Lesson 22: RAG Frameworks - Introduction and use of LlamaIndex and LangChain

## Introduction (5 minutes)

Welcome to our lesson on RAG frameworks, focusing on LlamaIndex and LangChain. In this 80-minute session, we'll look at what each framework offers for building Retrieval-Augmented Generation (RAG) systems and build a small working example with each.

## Lesson Objectives

By the end of this lesson, you will be able to:
1. Describe the key features of LlamaIndex and LangChain
2. Set up and use LlamaIndex for document indexing and querying
3. Implement a basic RAG system using LangChain
4. Compare the strengths and use cases of both frameworks

## 1. Introduction to LlamaIndex (5 minutes)

LlamaIndex (formerly GPT Index) is a data framework designed to help developers build applications using large language models (LLMs) with external data sources.

Key features:
- Flexible data ingestion from various sources
- Advanced indexing and querying capabilities
- Integration with popular LLMs
- Support for different types of indices (e.g., vector store, summary (formerly "list"), tree, keyword table)
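
To make the idea of an index type more concrete, here is a minimal pure-Python sketch of a keyword-table-style index. It does not use LlamaIndex and is not how the library implements its keyword table index; the corpus, the document IDs, and the function names are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus; IDs and texts are made up for illustration.
DOCS = {
    "doc1": "LlamaIndex ingests data from files, APIs, and databases.",
    "doc2": "A vector index retrieves text by embedding similarity.",
    "doc3": "A keyword table maps keywords to the text that mentions them.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def build_keyword_table(docs):
    """Map each word to the set of document IDs that contain it."""
    table = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            table[word].add(doc_id)
    return table


def keyword_lookup(table, query):
    """Return the sorted IDs of documents sharing at least one word with the query."""
    hits = set()
    for word in tokenize(query):
        hits |= table.get(word, set())
    return sorted(hits)


table = build_keyword_table(DOCS)
print(keyword_lookup(table, "keyword table"))
```

A vector index, by contrast, compares embeddings rather than exact words, so it can match a query that shares no vocabulary with the relevant text.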

## 2. Setting up and Using LlamaIndex (20 minutes)

Let's implement a basic RAG system using LlamaIndex:

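```python
import os

# Placeholder key: in practice, set OPENAI_API_KEY in your shell rather than in code.
os.environ["OPENAI_API_KEY"] = "your-api-key-here"

# Assumes llama-index >= 0.10, where the core classes live under llama_index.core.
# Older releases used different names (e.g., GPTSimpleVectorIndex, save_to_disk).
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.core.postprocessor import SimilarityPostprocessor

# Load every readable file in the ./data directory
documents = SimpleDirectoryReader("data").load_data()

# Embed the documents and build an in-memory vector index
index = VectorStoreIndex.from_documents(documents)

# Create a query engine: retrieve the 2 most similar chunks,
# then drop any chunk whose similarity score is below 0.7
query_engine = index.as_query_engine(
    similarity_top_k=2,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)

# Query the system
response = query_engine.query("What are the main features of LlamaIndex?")
print(response)

# Save the index to disk, then load it back
index.storage_context.persist(persist_dir="storage")
storage_context = StorageContext.from_defaults(persist_dir="storage")
loaded_index = load_index_from_storage(storage_context)
```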

This example demonstrates how to load documents, create an index, query the system, and save/load the index for future use.
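
The two retrieval settings in the example above, `similarity_top_k=2` and `similarity_cutoff=0.7`, can be illustrated with a small pure-Python sketch. It assumes cosine similarity over hand-written two-dimensional vectors standing in for real embeddings; the chunk names, vectors, and the `retrieve` helper are hypothetical, and the library's actual scoring may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunk_vecs, top_k=2, cutoff=0.7):
    """Keep the top_k most similar chunks, then drop those scoring below cutoff."""
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in chunk_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    top = scored[:top_k]       # retrieval step (top-k)
    # Postprocessing step (similarity cutoff)
    return [(name, round(score, 3)) for score, name in top if score >= cutoff]


# Hypothetical "embeddings": real ones have hundreds or thousands of dimensions.
CHUNKS = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.8, 0.6],
    "chunk_c": [0.0, 1.0],
}

print(retrieve([1.0, 0.0], CHUNKS))              # both of the top 2 pass the cutoff
print(retrieve([1.0, 0.0], CHUNKS, cutoff=0.9))  # a stricter cutoff keeps only one
```

Raising the cutoff trades recall for precision: fewer chunks reach the LLM, but the ones that do are closer to the query.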

## 3. Introduction to LangChain (5 minutes)

LangChain is a framework for developing applications powered by language models, focusing on composability and end-to-end solutions.

Key features:
- Prompt management and optimization
- Chains: composable sequences of LLM calls, tools, and data-processing steps
- Integration with external data sources and APIs
- Memory management for conversational applications
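
As a rough illustration of how prompt management, chaining, and conversational memory fit together, here is a framework-free sketch in plain Python. It does not use LangChain's classes; `PROMPT`, `fake_llm`, `ConversationBuffer`, and `ask` are hypothetical names, and the stub model simply reports the prompt length instead of calling an API.

```python
from string import Template

# Prompt management: one reusable template with named slots.
PROMPT = Template(
    "Conversation so far:\n$history\n\nQuestion: $question\nAnswer:"
)


def fake_llm(prompt):
    """Stand-in for a real model call; it does not generate text."""
    return f"(stub answer to a {len(prompt)}-character prompt)"


class ConversationBuffer:
    """Memory: stores previous turns so later prompts can include them."""

    def __init__(self):
        self.turns = []

    def add(self, question, answer):
        self.turns.append(f"User: {question}\nAssistant: {answer}")

    def render(self):
        return "\n".join(self.turns) if self.turns else "(none)"


def ask(question, memory, llm=fake_llm):
    """A tiny chain: fill the template, call the model, update memory."""
    prompt = PROMPT.substitute(history=memory.render(), question=question)
    answer = llm(prompt)
    memory.add(question, answer)
    return answer


memory = ConversationBuffer()
print(ask("What is RAG?", memory))
print(ask("Which frameworks support it?", memory))
```

The second call sees the first exchange in its prompt, which is the basic mechanism behind conversational memory.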

## 4. Implementing RAG with LangChain (20 minutes)

Now, let's implement a RAG system using LangChain:

```python
import os

# Placeholder key: in practice, set OPENAI_API_KEY in your shell rather than in code.
os.environ["OPENAI_API_KEY"] = "your-api-key-here"

# Assumes langchain 0.2 or 0.3 with the langchain-openai, langchain-community,
# langchain-text-splitters, and faiss-cpu packages installed.
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

# Load document
loader = TextLoader("data/example.txt")
documents = loader.load()

# Split text into chunks of about 1000 characters with no overlap
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

# Create embeddings and vector store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(texts, embeddings)

# Create retrieval chain; "stuff" places all retrieved chunks into a single prompt
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
)

# Query the system; the answer is returned under the "result" key
query = "What are the main features of LangChain?"
result = qa.invoke({"query": query})
print(result["result"])
```

This example shows how to load a document, split it into chunks, create embeddings, set up a vector store, and use a retrieval chain for question-answering.
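
To show what `chunk_size`, `chunk_overlap`, and `chain_type="stuff"` mean, here is a simplified pure-Python sketch. It is an assumption-laden illustration, not LangChain's implementation: `CharacterTextSplitter` splits on a separator before merging pieces, so its chunks will differ from this fixed-width version, and the prompt wording in `build_stuff_prompt` is hypothetical.

```python
def split_text(text, chunk_size, chunk_overlap=0):
    """Cut text into fixed-width chunks; consecutive chunks share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


def build_stuff_prompt(question, chunks):
    """'Stuff' strategy: concatenate all retrieved chunks into one prompt."""
    context = "\n\n".join(chunks)
    return (
        "Use the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


print(split_text("abcdefghij", chunk_size=4))                   # no overlap
print(split_text("abcdefghij", chunk_size=4, chunk_overlap=2))  # 2 shared characters
print(build_stuff_prompt("What is LangChain?", ["chunk one", "chunk two"]))
```

A non-zero overlap helps when a relevant sentence straddles a chunk boundary, at the cost of storing and embedding some text twice. The "stuff" strategy is the simplest option, but it only works while the combined chunks fit in the model's context window.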

## 5. Comparing LlamaIndex and LangChain (5 minutes)

Let's compare these frameworks:

1. Focus:
   - LlamaIndex: Specialized in data indexing and retrieval
   - LangChain: Broader framework for LLM-powered applications

2. Ease of Use:
   - LlamaIndex: Simpler API for basic RAG tasks
   - LangChain: More flexible but potentially more complex

3. Features:
   - LlamaIndex: Advanced indexing strategies
   - LangChain: Comprehensive toolkit for various LLM tasks

4. Integration:
   - Both integrate well with popular LLMs and vector stores

5. Use Cases:
   - LlamaIndex: Ideal for document-heavy applications
   - LangChain: Suitable for a wider range of LLM applications

## Hands-on Exercise (15 minutes)

Let's create a simple comparison by implementing the same task in both frameworks:

Task: Create a QA system over a set of documents about AI ethics.

1. LlamaIndex Implementation:

```python
# Assumes llama-index >= 0.10 and that OPENAI_API_KEY is already set.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load documents
documents = SimpleDirectoryReader("ai_ethics_docs").load_data()

# Create index
index = VectorStoreIndex.from_documents(documents)

# Query through a query engine with default settings
query_engine = index.as_query_engine()
response = query_engine.query("What are the main ethical concerns in AI development?")
print("LlamaIndex Response:", response)
```

2. LangChain Implementation:

```python
# Assumes langchain 0.2 or 0.3 with the langchain-openai, langchain-community,
# langchain-text-splitters, and faiss-cpu packages installed, and OPENAI_API_KEY set.
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

# Load documents. Assumption: the folder holds plain-text files, so TextLoader is
# used instead of the default loader, which requires the `unstructured` package.
loader = DirectoryLoader("ai_ethics_docs", glob="**/*.txt", loader_cls=TextLoader)
documents = loader.load()

# Split text
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

# Create embeddings and vector store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(texts, embeddings)

# Create QA chain
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
)

# Query
result = qa.invoke({"query": "What are the main ethical concerns in AI development?"})
print("LangChain Response:", result["result"])
```

Compare the responses and the implementation process for both frameworks.
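
One possible way to structure the comparison is a small harness that sends the same question to each system and records the answer and the elapsed time. This is only a sketch: `compare_engines` and the two stub functions are hypothetical, and the stubs stand in for the real framework calls so the snippet runs without an API key.

```python
import time


def compare_engines(question, engines):
    """Ask every engine the same question; record its answer and wall-clock time."""
    results = {}
    for name, ask in engines.items():
        start = time.perf_counter()
        answer = ask(question)
        elapsed = time.perf_counter() - start
        results[name] = {"answer": str(answer), "seconds": elapsed}
    return results


# Hypothetical stubs; replace each with a one-argument function that
# wraps the corresponding framework's query call.
def stub_llamaindex(question):
    return f"[LlamaIndex stub] {question}"


def stub_langchain(question):
    return f"[LangChain stub] {question}"


results = compare_engines(
    "What are the main ethical concerns in AI development?",
    {"LlamaIndex": stub_llamaindex, "LangChain": stub_langchain},
)
for name, info in results.items():
    print(f"{name}: {info['seconds']:.4f}s -> {info['answer']}")
```

Timing a single call is noisy, so treat the numbers as a rough indication. Differences in answers often come from chunking and retrieval defaults rather than from the LLM itself.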

## Conclusion and Q&A (5 minutes)

In this lesson, we've explored two powerful RAG frameworks: LlamaIndex and LangChain. We've seen how to implement basic RAG systems using both frameworks and compared their features and use cases.

Are there any questions about LlamaIndex, LangChain, or their applications in RAG systems?

## Additional Resources

1. LlamaIndex Documentation: https://gpt-index.readthedocs.io/
2. LangChain Documentation: https://langchain.readthedocs.io/
3. "Building RAG Applications with LlamaIndex" tutorial: https://medium.com/@jerryjliu98/building-rag-applications-with-llamaindex-54f6c953291a
4. "Getting Started with LangChain" guide: https://python.langchain.com/docs/get_started/introduction.html

In our next lesson, we'll dive deeper into embedding models, a crucial component of RAG systems.