In [2]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())
openai_api_key = os.environ["OPENAI_API_KEY"]

## Load private document

In [3]:
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)

documents = SimpleDirectoryReader("data").load_data()



## Create vector database

In [4]:
index = VectorStoreIndex.from_documents(documents)

2025-10-09 18:42:45,399 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


## Ask questions about the private document

In [5]:
query_engine = index.as_query_engine()
response = query_engine.query("summarize the article in 100 words")
print(response)

2025-10-09 18:42:46,118 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-10-09 18:42:50,175 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


The article discusses the importance of creating something people want and not focusing solely on making money in the early stages of a startup. It explores the idea that behaving like a charity can lead to success, using examples like Craigslist and Google. Benevolence is highlighted as a key factor in startup success, improving morale, attracting help from others, and fostering decisiveness. The article emphasizes the value of helping people, building a loyal user base, and the resilience that comes from a sense of mission. Ultimately, being good can lead to long-term success and sustainability in the startup world.
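
The log lines above show the query making one embeddings request followed by one chat completions request: the question is embedded, the most similar chunks are retrieved, and those chunks go to the LLM together with the question. A minimal sketch of that flow, assuming a toy word-count embedding in place of the OpenAI model and made-up chunks. The helpers `embed`, `cosine_similarity` and `retrieve` are hypothetical names for illustration, not LlamaIndex's implementation.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query, best first."""
    query_vector = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical chunks, made up for illustration only.
chunks = [
    "startups should make something people want",
    "the weather in spring is mild and rainy",
    "benevolence improves morale in startups",
]

question = "what should startups make"
context = retrieve(question, chunks, top_k=1)

# In the real pipeline, a prompt like this is what gets sent to the LLM.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(prompt)
```

Raising `top_k` passes more chunks to the LLM, at the cost of a longer prompt.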


## See under the hood

In [6]:
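# Uncomment to log what LlamaIndex does on each call
# (use logging.INFO instead of logging.DEBUG for less noise).
# import logging
# import sys

# logging.basicConfig(
#     stream=sys.stdout,
#     level=logging.DEBUG
# )

# Note: basicConfig already attaches a stdout handler, so adding
# this second handler prints every message twice.
# logging.getLogger().addHandler(
#     logging.StreamHandler(stream=sys.stdout)
# )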

## Save the vector database

In [None]:
import os.path

# the index is persisted to (and reloaded from) this directory
PERSIST_DIR = "./storage"

# check if storage already exists
if not os.path.exists(PERSIST_DIR):
    # load the documents and create the index
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    # store it for later
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)

# either way we can now query the index
query_engine = index.as_query_engine()

response = query_engine.query("According to the author, what is good?")

print(response)


ValueError: Directory 000-data does not exist.

## Customization options
* parse into smaller chunks
* use a different vector store
* retrieve more context when I query
* use a different LLM
* use a different response mode
* stream the response back
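
Three of these options (more context, response mode, streaming) are keyword arguments of `index.as_query_engine()`. A minimal sketch of how they might be combined follows. The values are illustrative assumptions rather than recommendations, and `make_query_engine` is a hypothetical helper, not part of LlamaIndex. Chunk size, vector store and LLM are configured elsewhere and are not covered here.

```python
# Keyword arguments accepted by index.as_query_engine() in LlamaIndex.
# The specific values are illustrative assumptions, not recommendations.
QUERY_ENGINE_OPTIONS = {
    "similarity_top_k": 5,              # retrieve more context per query
    "response_mode": "tree_summarize",  # use a different response mode
    "streaming": True,                  # stream the response back
}


def make_query_engine(index, **overrides):
    """Build a query engine from the defaults above (hypothetical helper)."""
    options = {**QUERY_ENGINE_OPTIONS, **overrides}
    return index.as_query_engine(**options)


# Usage with the index built earlier (needs llama-index and an OpenAI key):
# query_engine = make_query_engine(index, similarity_top_k=3)
# response = query_engine.query("summarize the article in 100 words")
# response.print_response_stream()  # available because streaming=True
```

Keeping the options in one dictionary makes it easy to try a single change at a time and compare the answers.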

## Use cases
* QA
* Chatbot
* Agent
* Structured Data Extraction
* Multimodal

## Optimizing
* Advanced Retrieval Strategies
* Evaluation
* Building performant RAG applications for production

## Other
* LlamaPacks and create-llama are the LlamaIndex counterparts of LangChain templates
* Both are very recent and still in beta
* Very interesting: create-llama allows you to create a Vercel app!
* Very interesting: open-source end-to-end project (SEC Insights)
    * LlamaIndex + React/Next.js (Vercel) + FastAPI + Render + AWS
    * environment setup: LocalStack + Docker
    * monitoring: Sentry
    * load testing: loader.io
    * [web](https://www.secinsights.ai/)
    * [code](https://github.com/run-llama/sec-insights)