<a href="https://colab.research.google.com/github/arulbenjaminchandru/Python-and-Gen-AI/blob/main/Building_real_AI_Projects.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **LangChain and LlamaIndex**

## **LangChain**
LangChain is a framework designed to facilitate the creation of applications that interact with large language models (LLMs) in a more structured and meaningful way. It focuses on enabling applications that go beyond simple question-answering by adding various capabilities such as retrieval of external data, interaction with different sources, and chaining together different components.



### **Key Features of LangChain:**
1. **Chain of Actions**: LangChain enables you to build sequences of operations, where the output of one component feeds into another. This can include chaining LLM calls, interacting with APIs, or doing computations between different steps.
  
2. **Document Retrieval**: It provides the ability to retrieve information from external documents (such as PDFs, websites, or databases) to inform responses. This feature is essential for applications that need to integrate LLMs with large knowledge bases.

3. **Memory**: LLM calls are stateless, so LangChain can store previous interactions and pass them back to the model, which lets context persist across prompts.

4. **Integration with Tools and APIs**: LangChain makes it easy to connect your language model to external tools or APIs. For example, you can call a weather service or perform calculations as part of your chain of operations.

5. **Agents**: These are advanced chains that use LLMs to decide which action to take, such as querying a database, reading a document, or calling an external API. Agents are useful in applications where the next step is not predetermined.
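
The chaining idea in the list above can be sketched in plain Python. This is not LangChain's API: `make_chain`, `build_prompt`, `fake_llm`, and `tidy` are hypothetical names, and `fake_llm` only stands in for a real model call. The sketch shows nothing more than each step receiving the previous step's output.

```python
from typing import Callable, List

Step = Callable[[str], str]


def make_chain(steps: List[Step]) -> Step:
    """Return a function that runs each step on the previous step's output."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run


# Hypothetical steps for illustration only.
def build_prompt(question: str) -> str:
    return f"Answer briefly: {question}"


def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; it just echoes the prompt.
    return f"[model reply to: {prompt}]"


def tidy(reply: str) -> str:
    # Post-processing step: drop the surrounding brackets.
    return reply.strip("[]")


chain = make_chain([build_prompt, fake_llm, tidy])
result = chain("What is RAG?")
print(result)
```

A real chain would replace `fake_llm` with a model call and could add steps such as an API request or a calculation between the others.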



### **Typical LangChain Use Cases**:
- **Question Answering over Documents**: Combine LLMs with document retrieval systems to enable users to ask questions about specific documents or datasets.
  
- **Personal Assistants**: Create systems where the model can carry out tasks based on user input, from web scraping to scheduling meetings.
  
- **Conversational Agents with Memory**: Build chatbots that can remember previous conversations and improve interactions over time.

- **Data Augmentation**: Use LLMs alongside external data sources, such as databases or web APIs, to enhance the capabilities of the LLM.



### **Components of LangChain**:
1. **Prompt Templates**: Standardized templates for prompts, ensuring that you send the right kind of request to the model.
   
2. **Chains**: A sequence of steps that process user input and return a response. These steps can include calling an LLM, doing calculations, or interacting with external data sources.

3. **Agents and Tools**: Agents make decisions about which actions to take in a sequence, based on the current context.

4. **Document Loaders**: Connectors that allow LangChain to access and retrieve information from various sources such as PDFs, websites, databases, or APIs.

5. **Embeddings**: Used to represent text in a form that is easy to compare and search, essential for tasks like retrieval-augmented generation (RAG).



### **When to Use LangChain**
- When you need more control over how an LLM interacts with external data.
- If you want to build multi-step processes that go beyond simple question-answering.
- For creating complex applications where LLM responses need to be conditioned on previous actions or user inputs.

LangChain is especially powerful when used in retrieval-augmented generation (RAG) systems, personal assistants, and custom NLP pipelines.

## **LlamaIndex**

LlamaIndex, formerly known as GPT Index, is a library designed to help integrate large language models (LLMs) with various data sources and tools, enabling more advanced capabilities like document retrieval, indexing, and complex queries.



### Key Features of LlamaIndex:

1. **Document Indexing**: Allows you to create and manage indices of documents or datasets. This enables efficient search and retrieval of information.

2. **Retrieval-Augmented Generation (RAG)**: Combines retrieval of relevant documents with LLM generation. This helps in providing more accurate and contextually relevant responses by accessing external data.

3. **Integration with LLMs**: Works seamlessly with various LLMs to enable enhanced capabilities like document querying, summarization, and question-answering.

4. **Flexible Querying**: Supports complex querying mechanisms to interact with indexed documents, allowing for precise and detailed information retrieval.

5. **Ease of Use**: Provides a straightforward API for integrating document-based querying and retrieval into your applications.



### Key Components

- **Document**: Represents an individual piece of text that can be indexed and queried.
  
- **VectorStoreIndex**: An index that embeds documents and stores the resulting vectors so that relevant text can be retrieved for a query. It is the index used in the demo later in this notebook.

- **Query Engine**: Combines the index with an LLM to generate responses grounded in the indexed documents. It is created with the index's `as_query_engine` function.



### When to Use LlamaIndex

- **Document-Based Applications**: When you need to build applications that require searching, indexing, or querying large collections of documents.
  
- **Enhanced LLM Capabilities**: To combine document retrieval with LLM responses for more contextually relevant outputs.

- **Complex Queries**: For scenarios where you need to perform complex queries over large datasets or documents.

LlamaIndex is a powerful tool for enhancing the capabilities of LLMs, especially in applications where interacting with large amounts of text or documents is crucial.

# **LlamaIndex Demo with Google Gemini, LlamaIndex and Chroma (Question Answering)**

## Overview

[Gemini](https://ai.google.dev/models/gemini) is a family of generative AI models that lets developers generate content and solve problems. These models are designed and trained to handle both text and images as input.

[LlamaIndex](https://www.llamaindex.ai/) is a simple, flexible data framework that can be used by Large Language Model (LLM) applications to connect custom data sources to LLMs.

[Chroma](https://docs.trychroma.com/) is an open-source embedding database focused on simplicity and developer productivity. Chroma allows users to store embeddings and their metadata, embed documents and queries, and search the embeddings quickly.

In this example, you'll learn how to create an application that answers questions using data from a website with the help of Gemini, LlamaIndex, and Chroma.

## Setup

First, you must install the packages and set the necessary environment variables.

### Installation

Install LlamaIndex's Python library, `llama-index`. Install LlamaIndex's integration package for Gemini, `llama-index-llms-gemini`, and the integration package for the Gemini embedding model, `llama-index-embeddings-gemini`. Next, install LlamaIndex's web page reader, `llama-index-readers-web`. Finally, install ChromaDB's Python client SDK, `chromadb`, and LlamaIndex's Chroma vector store integration, `llama-index-vector-stores-chroma`.

In [1]:
# This guide was tested with 0.10.17, but feel free to try newer versions.
!pip install -q llama-index==0.10.17
!pip install -q llama-index-llms-gemini
!pip install -q llama-index-embeddings-gemini
!pip install -q llama-index-readers-web
!pip install -q llama-index-vector-stores-chroma
!pip install -q chromadb

[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/1.6 MB[0m [31m?[0m eta [36m-:--:--[0m[2K   [91m━━━━━━━━━━━━[0m[90m╺[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.5/1.6 MB[0m [31m15.2 MB/s[0m eta [36m0:00:01[0m[2K   [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m [32m1.6/1.6 MB[0m [31m28.9 MB/s[0m eta [36m0:00:01[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.6/1.6 MB[0m [31m20.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.2/1.2 MB[0m [31m39.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m141.9/141.9 kB[0m [31m6.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m76.4/76.4 kB[0m [31m3.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m77.9/77.9 kB[0m [31m3.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━

## Configure your API key

To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) for an example.


In [2]:
import os
from google.colab import userdata
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY

## Basic steps
LLMs are trained offline on a large corpus of public data. Hence they cannot answer questions based on custom or private data accurately without additional context.

If you want to make use of LLMs to answer questions based on private data, you have to provide the relevant documents as context alongside your prompt. This approach is called Retrieval Augmented Generation (RAG).

You will use this approach to create a question-answering assistant using the Gemini text model integrated through LlamaIndex. The assistant is expected to answer questions about Google's Veo video generation model. To make this possible, you will add more context to the assistant using data from a website.

In this tutorial, you'll implement the two main components in a RAG-based architecture:

1. Retriever

    Based on the user's query, the retriever retrieves relevant snippets that add context from the document. In this tutorial, the document is the website data.
    The relevant snippets are passed as context to the next stage - "Generator".

2. Generator

    The relevant snippets from the website data are passed to the LLM along with the user's query to generate accurate answers.

You'll learn more about these stages in the upcoming sections while implementing the application.
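
Before moving to the real components, the two stages can be sketched with the standard library alone. This is a toy under stated assumptions: `retrieve` ranks snippets by shared words instead of embeddings, `build_generator_prompt` only assembles the text a generator would receive, and both names are hypothetical rather than part of LlamaIndex.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, snippets: List[str], top_k: int = 1) -> List[str]:
    """Rank snippets by shared words with the query (a stand-in for embedding search)."""
    query_words = tokenize(query)
    ranked = sorted(
        snippets,
        key=lambda s: len(query_words & tokenize(s)),
        reverse=True,
    )
    return ranked[:top_k]


def build_generator_prompt(query: str, context: List[str]) -> str:
    """Assemble the prompt that would be sent to the LLM in the generator stage."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"


snippets = [
    "Veo generates high-quality video from text prompts.",
    "Chroma stores embeddings and their metadata.",
]
question = "What does Chroma store?"
context = retrieve(question, snippets)
rag_prompt = build_generator_prompt(question, context)
print(rag_prompt)
```

The tutorial replaces the word-overlap ranking with Gemini embeddings stored in Chroma, and the final prompt is sent to the Gemini model rather than printed.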

## Import the required libraries

In [3]:
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from llama_index.core import Document
from llama_index.core import Settings
from llama_index.core import SimpleDirectoryReader
from llama_index.core import StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader

from llama_index.vector_stores.chroma import ChromaVectorStore

import chromadb
import re

## 1. Retriever

In this stage, you will perform the following steps:

1. Read and parse the website data using LlamaIndex.

2. Create embeddings of the website data.

    Embeddings are numerical representations (vectors) of text, produced so that text with similar meaning has similar embedding vectors. You'll make use of Gemini's embedding model to create the embedding vectors of the website data.

3. Store the embeddings in Chroma's vector store.
    
    Chroma is a vector database. The Chroma vector store helps in the efficient retrieval of similar vectors. Thus, to add context to the prompt for the LLM, the text whose embeddings are closest to the user's question can be retrieved easily using Chroma.

4. Create a Retriever from the Chroma vector store.

    The retriever will be used to pass the relevant website text, found by comparing embeddings, to the LLM along with user queries.
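
To make "similar embedding vectors" concrete, here is a minimal sketch of similarity search using cosine similarity. The three-dimensional vectors are made up for illustration (real embedding models return far more dimensions), and `toy_store` and `top_k` are hypothetical names, not Chroma's API.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for embeddings of three text snippets.
toy_store: Dict[str, List[float]] = {
    "video generation": [0.9, 0.1, 0.0],
    "image watermarking": [0.1, 0.9, 0.2],
    "team acknowledgements": [0.0, 0.2, 0.9],
}

# Pretend embedding of a question about how videos are made.
query_vector = [0.8, 0.2, 0.1]


def top_k(query: List[float], store: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the k snippet names whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda name: cosine_similarity(query, store[name]),
        reverse=True,
    )
    return ranked[:k]


nearest = top_k(query_vector, toy_store)
print(nearest)
```

A vector database performs the same kind of ranking, but uses index structures so that it does not have to compare the query against every stored vector.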

### Read and parse the website data

LlamaIndex provides a wide variety of data loaders. To read the website data as a document, you will use the `SimpleWebPageReader` from LlamaIndex.

To know more about how to read and parse input data from different sources using the data loaders of LlamaIndex, read LlamaIndex's [loading data guide](https://docs.llamaindex.ai/en/stable/understanding/loading/loading.html).

In [27]:
web_documents = SimpleWebPageReader().load_data(
    ["https://deepmind.google/technologies/veo/"]
)

# Extract the content from the website data document
html_content = web_documents[0].text
html_content

'<!doctype html>\n<html lang="en">\n  <head>\n    <meta charset="utf-8">\n    <meta content="initial-scale=1, minimum-scale=1, width=device-width" name="viewport">\n    <title>Veo - Google DeepMind</title>\n    \n  <meta name="description" content="Veo is our most capable video generation model to date. It generates high-quality, 1080p resolution videos that can go beyond a minute, in a wide range of cinematic and visual styles.">\n  <link rel="canonical" href="https://deepmind.google/technologies/veo/">\n\n  <meta property="og:title" content="Veo">\n  <meta property="og:description" content="Veo is our most capable video generation model to date. It generates high-quality, 1080p resolution videos that can go beyond a minute, in a wide range of cinematic and visual styles.">\n  <meta property="og:image" content="https://lh3.googleusercontent.com/4sPA7BgwdC6KEY9o2KQrQYEldPzpitmmZNM0pfe5bx4WZ7hghc8EBlebilkOn60DDxU20OlU-6xE3dxg8DM9ZkxYDCSdRgYIIHz9eHfcKE08zVQxLg=w1200-h630-n-nu">\n  <meta 

You can use a variety of HTML parsers to extract the required text from the HTML content.

In this example, you'll use Python's `BeautifulSoup` library to parse the website data. After processing, the extracted text should be converted back to LlamaIndex's `Document` format.

In [28]:
# Parse the data.
soup = BeautifulSoup(html_content, 'html.parser')
p_tags = soup.find_all('p')
text_content = ""
for each in p_tags:
    text_content += each.text + "\n"

# Convert back to Document format
documents = [Document(text=text_content)]
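
As an illustration of the point that other HTML parsers can do the same job, the sketch below extracts paragraph text with Python's built-in `html.parser` module. `ParagraphExtractor` and `sample_html` are hypothetical names, and the sketch assumes every `<p>` tag is explicitly closed, which real pages do not always guarantee.

```python
from html.parser import HTMLParser
from typing import List


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements."""

    def __init__(self) -> None:
        super().__init__()
        self._depth = 0  # greater than zero while inside a <p> element
        self._current: List[str] = []
        self.paragraphs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Paragraph finished: store its text and reset the buffer.
                self.paragraphs.append("".join(self._current).strip())
                self._current = []

    def handle_data(self, data):
        # Keep text only while inside a paragraph, including nested inline tags.
        if self._depth > 0:
            self._current.append(data)


sample_html = (
    "<html><body><h1>Veo</h1>"
    "<p>First <b>bold</b> paragraph.</p><p>Second.</p>"
    "</body></html>"
)
extractor = ParagraphExtractor()
extractor.feed(sample_html)
extracted_text = "\n".join(extractor.paragraphs)
print(extracted_text)
```

The heading text is skipped because it is outside any `<p>` element, which mirrors what the `BeautifulSoup` cell does when it selects only the `p` tags.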

### Initialize Gemini's embedding model

To create the embeddings from the website data, you'll use Gemini's embedding model, **embedding-001**, which supports creating text embeddings.

To use this embedding model, you have to import `GeminiEmbedding` from LlamaIndex. To know more about the embedding model, read Google AI's [language documentation](https://ai.google.dev/models/gemini).

In [29]:
from llama_index.embeddings.gemini import GeminiEmbedding

gemini_embedding_model = GeminiEmbedding(model_name="models/embedding-001")

### Initialize Gemini

You must import `Gemini` from LlamaIndex to initialize your model.
In this example, you will use **gemini-1.5-flash-latest**, as it supports text summarization. To know more about the text model, read Google AI's [model documentation](https://ai.google.dev/models/gemini).

You can configure the model parameters such as ***temperature*** or ***top_p***,  using the  ***generation_config*** parameter when initializing the `Gemini` LLM.  To learn more about the model parameters and their uses, read Google AI's [concepts guide](https://ai.google.dev/docs/concepts#model_parameters).

In [30]:
from llama_index.llms.gemini import Gemini

# To configure model parameters use the `generation_config` parameter.
# e.g. generation_config = {"temperature": 0.7, "top_p": 0.8, "top_k": 40}
# If you only want to set a custom temperature for the model use the
# "temperature" parameter directly.

llm = Gemini(model_name="models/gemini-1.5-flash-latest")

### Store the data using Chroma

Next, you'll store the embeddings of the website data in Chroma's vector store using LlamaIndex.

First, you have to initialize a Python client in `chromadb`. Since the plan is to save the data to the disk, you will use the `PersistentClient`. You can read more about the different clients in Chroma in the [client reference guide](https://docs.trychroma.com/reference/Client).

After initializing the client, you have to create a Chroma collection. You'll then initialize the `ChromaVectorStore` class in LlamaIndex using the collection created in the previous step.

Next, you have to set `Settings` and create storage contexts for the vector store.

`Settings` is a collection of commonly used resources that are utilized during the indexing and querying phases of a LlamaIndex pipeline. In `Settings`, you can specify the LLM, the embedding model, and other resources that the application will use. To know more about `Settings`, read the [module guide for Settings](https://docs.llamaindex.ai/en/stable/module_guides/supporting_modules/settings.html).

`StorageContext` is an abstraction offered by LlamaIndex around different types of storage. To know more about storage context, read the [storage context API guide](https://docs.llamaindex.ai/en/stable/api_reference/storage.html).

The final step is to load the documents and build an index over them. LlamaIndex offers several indices that help in retrieving relevant context for a user query. Here you'll use the `VectorStoreIndex` since the website embeddings have to be stored in a vector store.

To create the index you have to pass the storage context along with the documents to the `from_documents` function of `VectorStoreIndex`.
The `VectorStoreIndex` uses the embedding model specified in the `Settings` to create embedding vectors from the documents and stores these vectors in the vector store specified in the storage context. To know more about the
`VectorStoreIndex` you can read the [Using VectorStoreIndex guide](https://docs.llamaindex.ai/en/stable/module_guides/indexing/vector_store_index.html).

In [31]:
# Create a client and a new collection
client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = client.get_or_create_collection("quickstart")

# Create a vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Create a storage context
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Set Global settings
Settings.llm = llm
Settings.embed_model = gemini_embedding_model
# Create an index from the documents and save it to the disk.
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

### Create a retriever using Chroma

You'll now create a retriever that can fetch relevant website text from the newly created Chroma vector store by comparing embeddings.

First, initialize the `PersistentClient` with the same path you specified while creating the Chroma vector store. You'll then retrieve the collection `"quickstart"` you created previously from Chroma. You can use this collection to initialize the `ChromaVectorStore` in which you store the embeddings of the website data. You can then use the `from_vector_store` function of `VectorStoreIndex` to load the index.

In [32]:
# Load from disk
load_client = chromadb.PersistentClient(path="./chroma_db")

# Fetch the collection
chroma_collection = load_client.get_collection("quickstart")

# Fetch the vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Get the index from the vector store
index = VectorStoreIndex.from_vector_store(
    vector_store
)

# Check that the retriever works by asking for a summary of the indexed
# content. A relevant, non-empty response means the retriever is
# functioning.
# You can ask questions about your data using a generic interface called
# a query engine. Use the `as_query_engine` function of the index to
# create a query engine, then use its `query` function to query the
# index.
test_query_engine = index.as_query_engine()
response = test_query_engine.query("Summarize the content")
print(response)

This document acknowledges the contributions of numerous individuals in developing and refining key components of a project. It highlights the release of two new technologies: Veo, a model for generating high-definition video, and Imagen 3, a text-to-image model. It also mentions SynthID, a tool for watermarking and identifying AI-generated images. 



## 2. Generator

The Generator prompts the LLM for an answer when the user asks a question. The retriever you created in the previous stage from the Chroma vector store will be used to pass relevant text from the website data to the LLM, giving it more context for the user's query.

You'll perform the following steps in this stage:

1. Create a prompt for answering any question using LlamaIndex.
    
2. Use a query engine to ask a question and prompt the model for an answer.

### Create prompt templates

You'll use LlamaIndex's [PromptTemplate](https://docs.llamaindex.ai/en/stable/module_guides/models/prompts.html) to generate prompts to the LLM for answering questions.

In the `llm_prompt`, the variable `query_str` will be replaced later by the input question, and the variable `context_str` will be replaced by the relevant text from the website retrieved from the Chroma vector store.

In [33]:
from llama_index.core import PromptTemplate

template = (
    """ You are an assistant for question-answering tasks.
Use the following context to answer the question.
If you don't know the answer, just say that you don't know.
Use five sentences maximum and keep the answer concise.\n
Question: {query_str} \nContext: {context_str} \nAnswer:"""
)
llm_prompt = PromptTemplate(template)
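
To see what filling the two placeholders produces, here is a minimal sketch that uses Python's built-in `str.format` on a shortened copy of the template. `sketch_template` and the example values are assumptions for illustration. In the tutorial, the query engine performs this substitution with the retrieved text.

```python
# Shortened copy of the tutorial's template with the same two placeholders.
sketch_template = (
    "Use the following context to answer the question.\n"
    "Question: {query_str}\nContext: {context_str}\nAnswer:"
)

# Hypothetical values: the question comes from the user, the context from retrieval.
filled_prompt = sketch_template.format(
    query_str="What is Veo?",
    context_str="Veo is a video generation model.",
)
print(filled_prompt)
```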

### Prompt the model using Query Engine

You will use the `as_query_engine` function of the `VectorStoreIndex` to create a query engine from the index using the `llm_prompt` passed as the value for the `text_qa_template` argument. You can then use the `query` function of the query engine to prompt the LLM. To know more about custom prompting in LlamaIndex, read LlamaIndex's [prompts usage pattern documentation](https://docs.llamaindex.ai/en/stable/module_guides/models/prompts/usage_pattern.html#defining-a-custom-prompt).

In [35]:
# Query data from the persisted index
query_engine = index.as_query_engine(text_qa_template=llm_prompt)
prompt = input("Enter your question : ")
response = query_engine.query(prompt)
print(response)

Enter your question : what is veo?
Veo is Google's most advanced video generation model. It can create high-quality, minute-long videos in various cinematic styles, accurately capturing the nuances of text prompts. Veo allows for creative control, understanding prompts for cinematic effects like time lapses and aerial shots. It aims to make video production accessible to everyone, from filmmakers to educators.  Veo is currently being tested in VideoFX, a new experimental tool at labs.google, and will be integrated into YouTube Shorts and other products in the future. 



# **LangChain vs LlamaIndex**

LangChain and LlamaIndex (formerly GPT Index) are two libraries designed to extend the capabilities of large language models (LLMs) like GPT-4 by integrating them with external data sources and creating advanced workflows. However, they have different focuses and feature sets, making each more suitable for specific use cases.



### 1. **Primary Focus**
   - **LangChain**: Primarily focuses on chaining different components, such as language models, APIs, document retrieval systems, and tools, to build end-to-end workflows. It's designed to handle complex multi-step operations and workflows involving LLMs.
   - **LlamaIndex (GPT Index)**: Specializes in connecting LLMs to external data sources by indexing documents, making it easier to search, retrieve, and query information from large document sets. It focuses on enabling LLMs to interact with large-scale document retrieval systems.



### 2. **Core Functionality**
   - **LangChain**:
     - Chain multiple operations together (e.g., prompt generation, API calls, document retrieval).
     - Strong integration with external tools (APIs, databases, etc.).
     - Advanced agents that decide which tools or APIs to use based on user input.
     - Memory capabilities to maintain the context of conversations.
   - **LlamaIndex**:
     - Builds, stores, and manages indices for documents.
     - Optimized for searching and retrieving relevant documents to feed into LLMs for tasks like question-answering and summarization.
     - Provides more specialized tools for document chunking, vector embeddings, and efficient retrieval over large datasets.
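
Document chunking, mentioned in the list above, can be sketched as splitting text into overlapping windows so that each piece fits an embedding model and neighbouring pieces share some context. The function below is a simplified word-based illustration with the hypothetical name `chunk_words`. Production splitters typically work on tokens or sentences instead.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 5, overlap: int = 2) -> List[str]:
    """Split text into chunks of `chunk_size` words, repeating `overlap` words between chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


chunks = chunk_words("one two three four five six seven eight", chunk_size=5, overlap=2)
print(chunks)
```

Here the words "four five" appear in both chunks, which is the overlap that helps a retriever when an answer straddles a chunk boundary.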



### 3. **Use Cases**
   - **LangChain**:
     - **Conversational agents**: Building agents that can interact with users, retrieve information, and take actions based on inputs.
     - **API integration**: Connecting LLMs to external APIs for dynamic task execution.
     - **Multi-step workflows**: Creating processes that involve multiple steps, like retrieving data, performing calculations, and making decisions based on output.
     - **Tool use**: Agents that decide when to use specific tools like databases, web scrapers, or calculators.
   - **LlamaIndex**:
     - **Retrieval-Augmented Generation (RAG)**: Enhancing LLM outputs by pulling relevant information from large external datasets.
     - **Question-answering over documents**: When a user wants to ask a model questions about specific documents or knowledge bases.
     - **Text-based applications**: Indexing and querying large sets of unstructured text data, like articles, research papers, or PDFs.
     - **Efficient document retrieval**: Optimizing the way data is chunked, embedded, and queried to scale document processing.



### 4. **Data Sources and Retrieval**
   - **LangChain**:
     - Integrates with external APIs, databases, and file systems.
     - Allows flexible retrieval systems, but is not specifically optimized for large document retrieval out of the box.
   - **LlamaIndex**:
     - Designed specifically for document retrieval and indexing.
     - Offers vector-based search, chunking, and embedding capabilities to efficiently store and retrieve large datasets of text.



### 5. **Complexity**
   - **LangChain**: More general-purpose and flexible, allowing for complex, multi-step workflows. The complexity comes from orchestrating various components (LLMs, APIs, memory, and tools).
   - **LlamaIndex**: Simpler in concept but more specialized. It's easier to use for document-based tasks like retrieval and question-answering but doesn’t have the same level of general workflow capabilities as LangChain.



### 6. **Memory Management**
   - **LangChain**: Offers built-in memory capabilities that allow models to retain context across conversations, useful for interactive applications where continuity is required.
   - **LlamaIndex**: Focuses on document indexing and retrieval; memory in the sense of keeping track of past interactions isn't a core focus.
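
Conversational memory of the kind described above can be sketched as a buffer that keeps the most recent turns and prepends them to each new prompt. This is plain Python rather than LangChain's memory classes: `ConversationBuffer` and its methods are hypothetical names, and the fixed window size is an assumption chosen to show one trade-off.

```python
from collections import deque
from typing import Deque, Tuple


class ConversationBuffer:
    """Keep the last `max_turns` exchanges and replay them in each new prompt."""

    def __init__(self, max_turns: int = 3) -> None:
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, user: str, assistant: str) -> None:
        # When the buffer is full, the oldest turn is discarded automatically.
        self._turns.append((user, assistant))

    def as_prompt(self, new_message: str) -> str:
        history = "".join(
            f"User: {u}\nAssistant: {a}\n" for u, a in self._turns
        )
        return f"{history}User: {new_message}\nAssistant:"


memory = ConversationBuffer(max_turns=2)
memory.add("Hi, I'm Sam.", "Hello Sam!")
memory.add("I like video models.", "Noted.")
memory.add("Veo is one.", "Yes, it is.")  # the first turn is dropped here
memory_prompt = memory.as_prompt("What's my name?")
print(memory_prompt)
```

Because only two turns are kept, the prompt no longer contains the user's name. This illustrates why window size matters when continuity is required.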



### 7. **Integration**
   - **LangChain**: Can integrate with LlamaIndex for document retrieval. LangChain handles the orchestration and agents, while LlamaIndex can be used to manage document indexing and search within the LangChain pipeline.
   - **LlamaIndex**: Can be used as a backend for LangChain’s document retrieval tasks. It’s more focused on data and document processing rather than full multi-component pipelines.



### Comparison Summary

| Feature               | **LangChain**                                                                                          | **LlamaIndex**                                                                                   |
|-----------------------|--------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------|
| **Focus**              | Multi-step workflows and integration of LLMs with external tools and APIs.                             | Optimized for indexing, retrieving, and querying large datasets.                                 |
| **Key Components**     | Chains, agents, prompt templates, memory, tool integration.                                            | Document indexing, chunking, embedding, efficient retrieval.                                     |
| **Use Cases**          | Conversational agents, tool usage, multi-step workflows, API integrations, interactive applications.    | Retrieval-augmented generation (RAG), question-answering over documents, text data indexing.      |
| **Retrieval**          | Supports retrieval but more general; connects to external sources via APIs or databases.               | Specializes in document retrieval, with built-in support for embeddings, chunking, and vector search. |
| **Complexity**         | More flexible and powerful but requires managing multiple components and workflows.                    | Easier to use for document-based tasks but not as flexible for complex workflows.                 |
| **Memory**             | Built-in memory for context retention across interactions.                                             | Focused on document retrieval, not conversational memory.                                        |
| **Integration**        | Can use LlamaIndex for document retrieval tasks as part of a broader workflow.                         | Can be used as a backend for LangChain’s document retrieval.                                      |



### Which One to Choose?
- **Choose LangChain** if:
  - You need to build complex, multi-step workflows involving LLMs and external tools.
  - You want to build conversational agents that can interact with users and perform tasks dynamically.
  - Your application requires memory to retain context across interactions.
  
- **Choose LlamaIndex** if:
  - You’re focused on indexing and querying large document sets.
  - You need efficient retrieval of data from external sources for tasks like question-answering or summarization.
  - You are building a system that requires RAG (retrieval-augmented generation) with large knowledge bases.

In many cases, the two can complement each other, where LangChain handles the logic and LlamaIndex manages document retrieval.