<h1>Indices and Composable Graphs</h1>

Think of an index as a single database table, and a composable graph as a relational database that links multiple tables (indices), allowing for more complex queries.

An index is a single storage structure for organising and retrieving data.

A vector index gives each piece of data stored in it an embedding, and retrieves the items whose embeddings are closest to the embedding of the query.
A keyword index maps documents to the keywords found in their text. It returns a document when the prompt matches the keywords related to that document.
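
To make the difference concrete, here is a minimal pure-Python sketch of both ideas. It is not how LlamaIndex implements its indices: the documents, the `embed` function (a simple word-count vector standing in for a real embedding model) and the helper names are all made up for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus
docs = {
    "d1": "quantum computing uses qubits",
    "d2": "classical computing uses bits",
    "d3": "cells are the basic unit of biology",
}

# Keyword index: map each word to the set of documents that contain it
keyword_index = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.lower().split():
        keyword_index[word].add(doc_id)


def keyword_search(query):
    """Return ids of documents that share at least one word with the query."""
    hits = set()
    for word in query.lower().split():
        hits |= keyword_index.get(word, set())
    return hits


# Vector index: store one vector per document and rank by similarity.
# Assumption: a word-count vector stands in for a real embedding model.
vocab = sorted(keyword_index)


def embed(text):
    words = text.lower().split()
    return [words.count(w) for w in vocab]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


vector_index = {doc_id: embed(text) for doc_id, text in docs.items()}


def vector_search(query, top_k=1):
    """Return the top_k document ids whose vectors are closest to the query's."""
    q = embed(query)
    ranked = sorted(
        vector_index, key=lambda d: cosine(q, vector_index[d]), reverse=True
    )
    return ranked[:top_k]
```

The keyword search returns every document with an exact word match, while the vector search ranks all documents by closeness and keeps the best ones.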

A composable graph is a metastructure composed of several indices. When you query a composable graph, it directs the query to the relevant index (or indices), then queries each one as normal. A composable graph can consist of different types of indices.

Example:
A composable graph is like a university with different departments: computer science, physics, math, engineering, biology (each department represents an index). If you ask a question such as "What is quantum computing?", the composable graph directs it to the CS and physics indices, queries both, and returns the relevant documents.
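
The university analogy can be sketched in a few lines of plain Python. This is only an illustration of the route-then-query idea, not LlamaIndex's implementation: the `departments` data, the summaries and the word-overlap matching are all assumptions made up for the example.

```python
# Each "department" is a toy index: a summary used for routing, plus documents.
# All names and contents here are hypothetical.
departments = {
    "cs": {
        "summary": "algorithms programming quantum computing",
        "docs": [
            "Quantum computing runs algorithms on qubits",
            "Sorting algorithms order lists",
        ],
    },
    "physics": {
        "summary": "mechanics relativity quantum",
        "docs": [
            "Quantum mechanics describes superposition",
            "Relativity relates space and time",
        ],
    },
    "math": {
        "summary": "algebra calculus proofs",
        "docs": ["Calculus studies continuous change"],
    },
    "biology": {
        "summary": "cells genetics evolution",
        "docs": ["Genetics studies heredity"],
    },
}


def route(query):
    """Step 1: pick every department whose summary shares a word with the query."""
    words = set(query.lower().split())
    return [
        name
        for name, dept in departments.items()
        if words & set(dept["summary"].split())
    ]


def query_graph(query):
    """Step 2: query only the selected departments and collect matching docs."""
    words = set(query.lower().split())
    results = {}
    for name in route(query):
        results[name] = [
            doc
            for doc in departments[name]["docs"]
            if words & set(doc.lower().split())
        ]
    return results


answer = query_graph("what is quantum computing")
```

Here the question is routed to the "cs" and "physics" departments only; "math" and "biology" are never queried.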

<h1>Storing</h1>

We don't want to spend time re-indexing the data on every run, so we store the indexed data.

By default, indexed data is stored only in memory.

<h2>Persisting to disk</h2>

The .persist() method writes the data to disk at the specified location. It works for any type of index.

In [None]:
# Example: persist an index
index.storage_context.persist(persist_dir="<persist_dir>")

# Example: persist a composable graph (via its root index)
graph.root_index.storage_context.persist(persist_dir="<persist_dir>")


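To load a persisted index, rebuild the storage context from the same directory and pass it to load_index_from_storage:

In [18]:
from llama_index.core import StorageContext, load_index_from_storage

# must be the same directory that was passed to persist()
persist_dir = "<persist_dir>"

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir=persist_dir)

# load index
index = load_index_from_storage(storage_context)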
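
Loading fails with a FileNotFoundError if nothing has been persisted to the directory yet (docstore.json is missing). One way to avoid that is to check for the file first and only build the index when it is absent. The sketch below shows the pattern with a hypothetical helper, `load_or_build`, and stand-in `build` and `load` callables so that it runs without LlamaIndex installed.

```python
import os
import tempfile


def load_or_build(persist_dir, build, load, marker="docstore.json"):
    """Load from persist_dir if `marker` exists there, otherwise build.

    `build` and `load` are caller-supplied callables that take persist_dir.
    `build` is expected to persist its result into persist_dir.
    """
    if os.path.isfile(os.path.join(persist_dir, marker)):
        return load(persist_dir)
    return build(persist_dir)


# Stand-in callables for the demo (assumptions, not LlamaIndex calls)
def build(persist_dir):
    # Pretend to index and persist by writing the marker file
    with open(os.path.join(persist_dir, "docstore.json"), "w") as f:
        f.write("{}")
    return "built"


def load(persist_dir):
    return "loaded"


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_build(tmp, build, load)   # nothing on disk yet
    second = load_or_build(tmp, build, load)  # marker file now exists
```

With LlamaIndex, `load` would wrap the StorageContext.from_defaults(persist_dir=...) and load_index_from_storage calls, and `build` would create the index and then call index.storage_context.persist(persist_dir=...).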


As discussed in indexing, one of the most common types of Index is the VectorStoreIndex. The API calls to create the embeddings in a VectorStoreIndex can be expensive in terms of time and money, so you will want to store them to avoid having to constantly re-index things.

LlamaIndex supports a huge number of vector stores which vary in architecture, complexity and cost. In this example we'll be using Chroma, an open-source vector store.

To use Chroma to store the embeddings from a VectorStoreIndex, you need to:

- initialize the Chroma client
- create a Collection to store your data in Chroma
- assign Chroma as the vector_store in a StorageContext
- initialize your VectorStoreIndex using that StorageContext

In [10]:
import chromadb
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Load documents
documents = SimpleDirectoryReader("./data/starter").load_data()

# Initialize Chroma client with persistence
db = chromadb.PersistentClient(path="./chroma_db")

# Create or get existing collection
chroma_collection = db.get_or_create_collection("quickstart")

# Initialize Hugging Face embedding model
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Assign Chroma as the vector_store to the storage context
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create the index (pass embed_model here)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, embed_model=embed_model
)

# Use a local model served by Ollama as the LLM
from llama_index.llms.ollama import Ollama

llm = Ollama(model="deepseek-r1:8b")

# Create a query engine with the local model and run a query
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("What is the meaning of life?")
print(response)


<think>
Okay, so I need to figure out what the user is asking for. They provided a query asking, "What is the meaning of life?" and they've given me some context information from a file path and several notes.

First, I'll read through the context to see if there's any mention of the meaning of life or related themes. The context includes memories about family, moving to England, writing essays, and various other topics like programming, art, and architecture. There are also some technical notes on software development, computer history, languages like Lisp, and Italian vocabulary.

Looking through each note:

1. Talks about computers skipping batch processing to microcomputers.
2. Discusses Italian language concepts.
3. Describes a walk in Florence.
4. Comments on portrait painting as still life.
5. Mentions a company's downfall due to technology advancements.
6. Refers to art school culture and price of coolness.
7. Talks about a cheap apartment in New York.
8. Discusses launching an

If you've already created and stored your embeddings, you'll want to load them directly without loading your documents or creating a new VectorStoreIndex:

In [13]:
import chromadb
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Use Hugging Face embeddings instead of OpenAI
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Initialize Chroma client with persistence
db = chromadb.PersistentClient(path="./chroma_db")

# Get or create collection
chroma_collection = db.get_or_create_collection("quickstart")

# Assign Chroma as the vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Load your index from stored vectors using Hugging Face embeddings
index = VectorStoreIndex.from_vector_store(
    vector_store, storage_context=storage_context, embed_model=embed_model
)

# Use a local model served by Ollama as the LLM
from llama_index.llms.ollama import Ollama

llm = Ollama(model="deepseek-r1:8b")

# Query the loaded index with the local model
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("What is llama2?")
print(response)


<think>
Okay, so I'm trying to figure out what "llama2" refers to based on the provided context. Let me start by going through each relevant part of the context.

First, in [17], there's a mention of HN, which I assume stands for something like "HN" or perhaps "Hacker News," but it's not clear from this snippet. The problem described relates to a combination of writing essays and running a forum, where interactions can lead to tedious and even disastrous consequences because of the way conversations are assumed to involve the author.

Next, in [18], the context talks about leaving YC, which I take to mean "Y Combinator," a startup accelerator. It mentions that Jessica Livingston was part of the team, and leaving was like pulling up a deeply rooted tree, implying it was a significant decision with deep roots in their work and personal lives.

Looking at [19], the discussion is about distinguishing between invented vs discovered knowledge, using space aliens as an example. It references 

If you've already created an index, you can add new documents to your index using the insert method.

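In [None]:
from llama_index.core import VectorStoreIndex

# Start from an empty index. Pass embed_model (defined above) explicitly;
# otherwise VectorStoreIndex falls back to OpenAI embeddings, which need API quota.
index = VectorStoreIndex([], embed_model=embed_model)

# Insert the documents one at a time
for doc in documents:
    index.insert(doc)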
