Below is a **structured list** of the major **LlamaIndex concepts**, grouped by category, covering the core abstractions, modules, and problem statements LlamaIndex is designed to solve.
It is suitable for learning, interview preparation, architecture design, and production LLM projects.

---

# 1. **Core Abstractions**

## 1.1 Document

* `Document` (text) and `ImageDocument` (image content)
* Holds raw text or multimodal content.
  **Problem solved:** unify raw data formats before indexing.

## 1.2 Node

* Smallest chunk unit: text + metadata + embeddings.
  **Problem solved:** chunk granularity for search and RAG.

## 1.3 NodeParser

* Text splitter that converts a Document → Nodes.
  **Types:** `SentenceSplitter`, `SentenceWindowNodeParser`, `SemanticSplitterNodeParser`, `HierarchicalNodeParser`.
  **Problem solved:** efficient chunking that preserves context.
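
To make the sentence-window idea concrete, here is a minimal standard-library sketch. It is not the LlamaIndex implementation: the `Node` dataclass, the `sentence_window_nodes` function, and the `window` metadata key are hypothetical names chosen for illustration, and the sentence splitting is a naive regex.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a chunk (hypothetical, not the library class)."""
    text: str
    metadata: dict = field(default_factory=dict)


def sentence_window_nodes(document: str, window: int = 1) -> list[Node]:
    """Emit one node per sentence and keep neighbouring sentences as context."""
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    nodes = []
    for i, sentence in enumerate(sentences):
        lo = max(0, i - window)
        hi = min(len(sentences), i + window + 1)
        # The node text stays small for precise matching; the window carries context.
        nodes.append(Node(sentence, {"window": " ".join(sentences[lo:hi]), "position": i}))
    return nodes


nodes = sentence_window_nodes(
    "Llamas are camelids. They live in the Andes. They eat grass."
)
```

The design point is that retrieval matches on the single sentence, while the wider window is what gets handed to the LLM at synthesis time.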

---

# 2. **Indexes**

## 2.1 Index Types

### a. **Vector Index**

* Embedding-based similarity search.
  **Problem solved:** semantic retrieval.
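
As a rough illustration of embedding-based similarity search, the sketch below ranks nodes by cosine similarity. The three-dimensional vectors are made-up toy values rather than real embeddings, and `cosine`, `top_k`, and `toy_index` are hypothetical names; a real vector index would delegate this to an embedding model and a vector store.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2):
    """Return the k node ids most similar to the query, best first."""
    scored = sorted(
        ((cosine(query_vec, vec), node_id) for node_id, vec in index.items()),
        reverse=True,
    )
    return [(node_id, score) for score, node_id in scored[:k]]


# Toy "embeddings" (assumed values, for illustration only).
toy_index = {
    "pricing": [0.9, 0.1, 0.0],
    "refunds": [0.7, 0.6, 0.1],
    "careers": [0.0, 0.1, 0.9],
}
results = top_k([1.0, 0.2, 0.0], toy_index, k=2)
```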

### b. **Summary Index**

* Stores nodes as a sequential list (formerly called List Index); a query can walk every node.
  **Problem solved:** map-reduce operations; global summarization.

### c. **Document Summary Index**

* Stores an LLM-generated summary per document and retrieves by summary.
  **Problem solved:** long document summarization.

### d. **Keyword Table Index**

* Keyword → node mapping.
  **Problem solved:** keyword-based search when embeddings fail.
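
The keyword → node mapping can be sketched as an inverted index. This is an illustrative standard-library version, not the library's extraction logic: the stopword list, the regex tokenizer, and the function names are assumptions.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (assumed, not the library's).
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and", "for"}


def keywords(text: str) -> set[str]:
    """Lowercased alphanumeric tokens minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def build_keyword_table(nodes: dict[str, str]) -> dict[str, set[str]]:
    """Map each keyword to the ids of the nodes that contain it."""
    table = defaultdict(set)
    for node_id, text in nodes.items():
        for kw in keywords(text):
            table[kw].add(node_id)
    return table


def keyword_retrieve(query: str, table: dict[str, set[str]]) -> list[str]:
    """Rank nodes by how many query keywords they share."""
    hits = defaultdict(int)
    for kw in keywords(query):
        for node_id in table.get(kw, ()):
            hits[node_id] += 1
    return sorted(hits, key=lambda n: (-hits[n], n))


nodes = {
    "n1": "Error code E42 means the license key is expired.",
    "n2": "Renew the license in the billing portal.",
    "n3": "Error code E17 indicates a network timeout.",
}
table = build_keyword_table(nodes)
matches = keyword_retrieve("What is error E42?", table)
```

Exact identifiers such as `E42` are the typical case where a lookup like this helps, because embeddings tend to blur rare tokens.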

### e. **Composable Graph Index**

* Hierarchical index-of-indexes.
  **Problem solved:** large multi-collection retrieval.

---

# 3. **Retrievers**

## 3.1 Retriever Types

* `VectorIndexRetriever`
* `SummaryIndexRetriever` (formerly `ListIndexRetriever`)
* `BM25Retriever`
* `QueryFusionRetriever` (hybrid: fuses results from several retrievers, e.g. BM25 + embedding search)
* `AutoMergingRetriever` (tree-structured merging)
* `RecursiveRetriever`
* `TransformRetriever`

**Problem solved:** choosing the right search strategy based on query type.
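
One common way to combine a keyword ranking with an embedding ranking is reciprocal rank fusion (RRF). The sketch below shows that technique in plain Python; the function name, the node ids, and the constant `k = 60` are illustrative assumptions, and the two input rankings stand in for the output of a BM25 retriever and a vector retriever.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: each appearance adds 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, node_id in enumerate(ranking, start=1):
            scores[node_id] = scores.get(node_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by node id for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))


keyword_ranking = ["n3", "n1", "n7"]  # stand-in for BM25 output
vector_ranking = ["n1", "n5", "n3"]   # stand-in for embedding search output
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
```

Nodes that appear in both lists (`n1`, `n3`) rise to the top, which is the behaviour hybrid retrieval is after.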

---

# 4. **Query Engines**

## 4.1 QueryEngine

* Primary engine to run queries on indexes.

### Types:

* `RetrieverQueryEngine`
* `SubQuestionQueryEngine`
* `NLSQLTableQueryEngine` (natural language to SQL)
* `ComposableGraphQueryEngine`

**Problem solved:** translate user queries into retrieval + synthesis steps.

---

# 5. **Response Synthesizers**

## 5.1 Synthesis Modes

* `CompactAndRefine`
* `Refine`
* `TreeSummarize` (hierarchical, map-reduce-style summarization)
* `Accumulate`
* `SimpleSummarize`

**Problem solved:** how to combine multiple retrieved chunks into the final LLM answer.
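
The difference between the refine and tree-summarize strategies is easiest to see in the control flow. The sketch below uses a fake LLM (a plain function that counts calls) so it runs offline; the prompts, function names, and the `fan_in` parameter are assumptions for illustration and do not mirror the library's prompt templates.

```python
from typing import Callable

LLM = Callable[[str], str]


def refine(question: str, chunks: list[str], llm: LLM) -> str:
    """Sequential strategy: revise one running answer with each new chunk."""
    answer = ""
    for chunk in chunks:
        answer = llm(
            f"Question: {question}\nExisting answer: {answer}\nNew context: {chunk}"
        )
    return answer


def tree_summarize(question: str, chunks: list[str], llm: LLM, fan_in: int = 2) -> str:
    """Hierarchical strategy: summarize groups, then summarize the summaries."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(chunks)
    if not level:
        return ""
    while True:
        level = [
            llm(f"Question: {question}\nContext: " + " | ".join(level[i:i + fan_in]))
            for i in range(0, len(level), fan_in)
        ]
        if len(level) == 1:
            return level[0]


calls: list[str] = []


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a model: records the prompt, returns a numbered draft."""
    calls.append(prompt)
    return f"draft-{len(calls)}"


refined = refine("What changed?", ["c1", "c2", "c3"], fake_llm)
refine_calls = len(calls)
calls.clear()
summarized = tree_summarize("What changed?", ["c1", "c2", "c3", "c4"], fake_llm)
tree_calls = len(calls)
```

Refine makes one call per chunk and each call depends on the previous one; the tree variant makes its calls level by level, so the calls within a level could run in parallel.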

---

# 6. **Prompts & Query Construction**

## Components:

* PromptTemplates
* QueryTransform
* Decomposition (breaking queries into sub-questions)

**Problem solved:** dynamic prompting and query transformation for complex tasks.

---

# 7. **Storage Layer**

## 7.1 StorageContext

* Manages access to:

  * `VectorStore`
  * `DocumentStore`
  * `IndexStore`

**Problem solved:** decouple data storage from retrieval logic.

## 7.2 Persistence

* Save indexes locally or remotely.
  **Problem solved:** load indexes without reprocessing data.
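
The persist-then-reload idea can be sketched with JSON on disk. This is a simplified stand-in, not the library's storage format: the `persist` and `load` helpers, the `index_store.json` file name, and the toy index layout are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def persist(index: dict, persist_dir: Path) -> Path:
    """Write the index to <persist_dir>/index_store.json (file name is illustrative)."""
    persist_dir.mkdir(parents=True, exist_ok=True)
    path = persist_dir / "index_store.json"
    path.write_text(json.dumps(index), encoding="utf-8")
    return path


def load(persist_dir: Path) -> dict:
    """Read the index back without recomputing chunks or embeddings."""
    return json.loads((persist_dir / "index_store.json").read_text(encoding="utf-8"))


# Toy index: node id -> text and a precomputed (made-up) embedding.
index = {"n1": {"text": "hello", "embedding": [0.1, 0.2]}}

with tempfile.TemporaryDirectory() as tmp:
    storage_dir = Path(tmp) / "storage"
    persist(index, storage_dir)
    restored = load(storage_dir)
```

The saving is that the expensive steps (parsing, chunking, embedding) happen once, and later runs only pay for deserialization.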

---

# 8. **Vector Stores**

## 8.1 Supported Vector DBs

* FAISS
* Chroma
* Pinecone
* Weaviate
* MongoDB Atlas Vector Search
* Elasticsearch
* Milvus
* Qdrant
* LanceDB

**Problem solved:** persistent and scalable vector search.

---

# 9. **Data Connectors**

## 9.1 Loaders

* `PDFReader`
* `SimpleWebPageReader` / `BeautifulSoupWebReader` (HTML and web pages)
* `NotionPageReader`
* `SlackReader`
* `GmailReader`
* `GithubRepositoryReader`
* `DatabaseReader` (SQL databases)
* `YoutubeTranscriptReader`

**Problem solved:** get raw data from real systems.

---

# 10. **LLM Integration**

## 10.1 Models

* OpenAI
* HuggingFace
* Anthropic
* Google Gemini
* Local LLMs via llama.cpp, Ollama, vLLM

**Problem solved:** pluggable LLM execution layer.

## 10.2 Embedding Models

* OpenAI embeddings
* Instructor
* Sentence Transformers
* Local embedding models

---

# 11. **Agent Framework (LlamaIndex Agents)**

## 11.1 Core Concepts

* Tools
* Agent
* Function-calling agents (e.g. OpenAI function calling)
* ReAct Agent
* QueryEngine tools

**Problem solved:** tool use + decision-making to answer queries.
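
The tool-use loop behind an agent can be sketched as: decide on an action, run the tool, record the observation, repeat until done. In the sketch below the decision step is a scripted function rather than an LLM, and `run_agent`, `scripted_policy`, and both tools are hypothetical names with canned behaviour, so the control flow can be run offline.

```python
import math
from typing import Callable


def run_agent(
    question: str,
    tools: dict[str, Callable[[str], str]],
    policy: Callable[[str, list[str]], tuple[str, str]],
    max_steps: int = 5,
) -> str:
    """Loop: ask the policy for an action, run the tool, store the observation."""
    observations: list[str] = []
    for _ in range(max_steps):
        action, argument = policy(question, observations)
        if action == "finish":
            return argument
        observations.append(tools[action](argument))
    return "No answer within the step budget."


# Illustrative tools: a canned document lookup and an integer multiplier.
tools = {
    "search_docs": lambda query: "Plan costs 19 USD per seat.",
    "multiply": lambda expr: str(math.prod(int(x) for x in expr.split("*"))),
}


def scripted_policy(question: str, observations: list[str]) -> tuple[str, str]:
    """Stand-in for the LLM's reasoning step (fixed script, assumed for the demo)."""
    if not observations:
        return "search_docs", question
    if len(observations) == 1:
        return "multiply", "19*3"
    return "finish", f"Three seats cost {observations[-1]} USD."


answer = run_agent("How much do three seats cost?", tools, scripted_policy)
```

In a real agent the policy is the model choosing a tool and its arguments; a query engine wrapped as a tool would take the place of `search_docs`.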

---

# 12. **Advanced RAG Techniques**

## 12.1 Context Retrieval Improvements

* Auto-merging retrieval
* Sentence windowing
* Multi-step retrieval
* Query rewriting
* Contextual compression
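
Auto-merging retrieval can be illustrated as follows: when enough sibling chunks of the same parent are retrieved, they are replaced by the parent chunk. This is a simplified sketch of the idea; the function name, the data layout, and the 0.5 threshold are assumptions.

```python
from collections import Counter


def auto_merge(
    retrieved: list[str],
    children_of: dict[str, list[str]],
    threshold: float = 0.5,
) -> list[str]:
    """Swap retrieved children for their parent when most siblings were retrieved."""
    parent_of = {child: parent for parent, kids in children_of.items() for child in kids}
    counts = Counter(parent_of[c] for c in retrieved if c in parent_of)
    merged = {p for p, n in counts.items() if n / len(children_of[p]) > threshold}

    result: list[str] = []
    for node_id in retrieved:
        parent = parent_of.get(node_id)
        target = parent if parent in merged else node_id
        if target not in result:  # keep first-seen order, drop duplicates
            result.append(target)
    return result


# Hypothetical hierarchy: two sections with three and four leaf chunks.
children_of = {"sec1": ["a", "b", "c"], "sec2": ["d", "e", "f", "g"]}
merged_result = auto_merge(["a", "d", "b"], children_of)
```

Here two of the three chunks of `sec1` were retrieved, so the whole section is returned, while the single hit in `sec2` stays a leaf chunk.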

## 12.2 Knowledge Graph RAG

* Build graphs from text
* Query with semantic graph queries

## 12.3 SQL + RAG Hybrid

* SQL QueryEngine + Semantic index

---

# 13. **Workflow / Pipelines**

## Components:

* Node postprocessors
* Transformers
* Routers
* Query pipeline builder

**Problem solved:** building multi-step LLM workflows.
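
Two of these components are simple enough to sketch directly: a node postprocessor that drops low-scoring results, and a router that picks an engine per query. The router below uses keyword rules purely for illustration (an LLM-based selector is the usual choice); the function names, route names, and trigger words are hypothetical.

```python
def similarity_cutoff(
    scored_nodes: list[tuple[str, float]], cutoff: float
) -> list[tuple[str, float]]:
    """Postprocessor: keep only nodes whose score reaches the cutoff."""
    return [(node_id, score) for node_id, score in scored_nodes if score >= cutoff]


def route(query: str, routes: dict[str, list[str]], default: str) -> str:
    """Router: return the first route whose trigger words appear in the query."""
    lowered = query.lower()
    for name, trigger_words in routes.items():
        if any(word in lowered for word in trigger_words):
            return name
    return default


# Illustrative routing table (assumed trigger words).
routes = {
    "sql": ["total", "average", "count"],
    "docs": ["policy", "guide"],
}

kept = similarity_cutoff([("n1", 0.82), ("n2", 0.40), ("n3", 0.75)], cutoff=0.7)
sql_choice = route("Total revenue by region", routes, default="docs")
docs_choice = route("What is the refund policy?", routes, default="docs")
```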

---

# 14. **Evaluation & Monitoring**

## 14.1 LlamaIndex Eval

* Query correctness
* Context relevance
* Faithfulness checks
* Hallucination detection

**Problem solved:** quality control for RAG systems.
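
As a very rough illustration of a faithfulness check, the sketch below measures what fraction of answer sentences are lexically supported by the retrieved context. Production evaluators typically use an LLM as the judge; this token-overlap heuristic, the `support_ratio` name, and the 0.6 threshold are assumptions chosen only to show the shape of the check.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_ratio(answer: str, contexts: list[str], min_overlap: float = 0.6) -> float:
    """Fraction of answer sentences whose tokens mostly appear in the contexts."""
    context_tokens = set().union(*(tokens(c) for c in contexts))
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    if not sentences:
        return 0.0
    supported = 0
    for sentence in sentences:
        sentence_tokens = tokens(sentence)
        overlap = len(sentence_tokens & context_tokens) / max(len(sentence_tokens), 1)
        if overlap >= min_overlap:
            supported += 1
    return supported / len(sentences)


contexts = ["The warranty lasts two years and covers battery defects."]
answer = "The warranty lasts two years. It also includes free shipping worldwide."
ratio = support_ratio(answer, contexts)
```

The second sentence has no support in the context, which is the kind of unsupported claim a faithfulness or hallucination check is meant to flag.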

---

# 15. **Enterprise Features**

## Includes:

* Observability
* Cost tracking
* Callbacks
* Caching
* Remote indexing services

**Problem solved:** production scaling and diagnostics.

---

# 16. Problem Statements LlamaIndex Addresses

### 1. RAG (Retrieval-Augmented Generation)

* Search documents
* Extract answers from large files
* Multi-modal knowledge search

### 2. Enterprise Search

* Organization-wide indexed search
* Email + Confluence + SharePoint integration

### 3. NL → SQL

* Query structured databases using natural language

### 4. Multi-document Question Answering

* Summaries
* Comparison across sources

### 5. Agents With Tools

* Agents that can retrieve docs, run SQL, call APIs

### 6. Knowledge Graph RAG

* Semantic graph-based reasoning

### 7. Chatbots

* Conversational RAG
* Memory-aware chat

### 8. Multi-modal Retrieval

* Images
* PDFs
* Audio transcripts

### 9. Ingestion Pipelines

* Index large corpora
* Smart chunking
* Metadata extraction

### 10. Advanced Summarization

* Map-reduce + hierarchical summarization for long documents

---

Below is a **condensed, structured checklist** of the **major concepts in LlamaIndex**, grouped logically.
It restates the catalogue above in compact form.

---

# **1. Core Data Concepts**

1. **Document**
2. **Node**
3. **Metadata**
4. **NodeParser / TextSplitter**
5. **Multi-modal Nodes** (image, audio, HTML, etc.)

---

# **2. Indexes (Indexing Layer)**

1. **VectorStoreIndex**
2. **SummaryIndex** (formerly **ListIndex**)
3. **DocumentSummaryIndex**
4. **KeywordTableIndex**
5. **ComposableGraph**
6. **SQLStructStoreIndex** (legacy SQL index)
7. **EmptyIndex**

---

# **3. Storage Abstractions**

1. **StorageContext**
2. **DocumentStore**
3. **IndexStore**
4. **VectorStore**
5. **GraphStore**
6. **Persisted Storage** (save/load index)

---

# **4. Embeddings & LLM Layer**

1. **Embedding Models** (OpenAI, HF, Instructor, local models)
2. **LLM Predictors / Chat Models**
3. **Rerank Models**
4. **Local LLM execution** (llama.cpp, Ollama, vLLM)

---

# **5. Retrievers**

1. **VectorIndexRetriever**
2. **BM25Retriever**
3. **SummaryIndexRetriever** (formerly **ListIndexRetriever**)
4. **QueryFusionRetriever** (hybrid: BM25 + Vector)
5. **AutoMergingRetriever**
6. **RecursiveRetriever**
7. **TransformRetriever**
8. **Multi-modal Retriever**

---

# **6. Query Engines**

1. **Base QueryEngine**
2. **RetrieverQueryEngine**
3. **SubQuestionQueryEngine**
4. **MultiStepQueryEngine**
5. **NLSQLTableQueryEngine** (natural language to SQL)
6. **PandasQueryEngine**
7. **ComposableGraphQueryEngine**
8. **RouterQueryEngine**
9. **KnowledgeGraphQueryEngine**

---

# **7. Response Synthesis (Answer Construction)**

1. **ResponseSynthesizer**
2. **Refine**
3. **TreeSummarize** (hierarchical, map-reduce-style)
4. **CompactAndRefine**
5. **Accumulate**
6. **SimpleSummarize**

---

# **8. Data Ingestion Layer**

1. **Document Loaders**

   * PDFReader
   * SimpleWebPageReader (HTML / web pages)
   * NotionPageReader / SlackReader / GithubRepositoryReader
   * YoutubeTranscriptReader
   * CSVReader
   * JSONReader
   * DatabaseReader (SQL databases)
2. **Node Parsers** (SentenceSplitter, SemanticSplitter)
3. **Transform pipelines**
4. **Node Postprocessors** (rerankers, filters, compression)

---

# **9. Advanced RAG Capabilities**

1. **Contextual Compression**
2. **Query Rewriting**
3. **Reranking**
4. **Auto-merging chunks**
5. **Multi-step retrieval**
6. **Graph RAG**
7. **Hybrid RAG** (semantic + keyword)
8. **Agent-driven RAG**

---

# **10. Agents & Tools**

1. **Agent**
2. **ReAct Agent**
3. **ToolExecutor**
4. **OpenAI Function Tooling**
5. **QueryEngine Tools**
6. **MultiToolAgent**
7. **Agent Workflow Graphs**

---

# **11. Multi-modal Support**

1. **Image Documents**
2. **Vision embeddings**
3. **Audio transcription**
4. **Image + text hybrid retrieval**

---

# **12. Workflow & Pipeline Concepts**

1. **Indexing Pipeline**
2. **Query Pipeline**
3. **Composable Graph Pipeline**
4. **Agent pipeline**
5. **Multi-index routing**
6. **Async Pipelines**

---

# **13. Evaluation & Observability**

1. **LlamaIndex Eval module**
2. **Faithfulness checking**
3. **Relevance scoring**
4. **Hallucination detection**
5. **Callback events**
6. **Tracing**

---

# **14. Integration Ecosystem**

1. **LLMs** (OpenAI, Anthropic, Gemini, HF, Mistral, Llama)
2. **Vector Stores** (FAISS, Chroma, Pinecone, Weaviate, Milvus, Qdrant, LanceDB)
3. **Databases** (Postgres, MySQL, SQLite)
4. **Search engines** (Elasticsearch, Vespa)
5. **Cloud integrations** (AWS, Azure, GCP)
6. **Data Lakes & Filesystems**

---

# **15. Utilities**

1. **Caching**
2. **Streaming responses**
3. **Async support**
4. **Settings** (global configuration)
5. **ServiceContext** (legacy; replaced by `Settings` since v0.10)
6. **JSON and Structured Output support**

