### Objective
LlamaIndex is a framework that handles the steps of a RAG (retrieval-augmented generation) pipeline: chunking, embedding, indexing, creating a vector store, and retrieving from that vector store.
This document demonstrates LlamaIndex use cases with sample text documents from the insurance domain. The following examples are included:


*   Q&A over a single document
*   Q&A with multiple questions over multiple documents
*   Persisting the index to local disk so that the documents do not have to be re-indexed every time the program runs
*   Custom settings for chunk size and model used
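
The pipeline steps named above (chunking, embedding, indexing, retrieval) can be illustrated with a toy example. This is a minimal sketch in plain Python and is not how LlamaIndex implements them: the bag-of-words "embedding" stands in for a real embedding model, a Python list stands in for the vector store, and all function names and the sample text are illustrative assumptions.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=12, overlap=3):
    """Split text into chunks of `chunk_size` words that overlap by `overlap` words."""
    words = text.split()
    step = chunk_size - overlap  # requires overlap < chunk_size
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Toy embedding: word counts (a real pipeline would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(text, chunk_size=12, overlap=3):
    """'Index' = list of (chunk, vector) pairs, standing in for a vector store."""
    return [(chunk, embed(chunk)) for chunk in chunk_text(text, chunk_size, overlap)]


def retrieve(index, query, top_k=1):
    """Return the top_k (score, chunk) pairs most similar to the query."""
    query_vector = embed(query)
    scored = [(cosine_similarity(query_vector, vector), chunk) for chunk, vector in index]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


policy_text = (
    "This cyber liability policy covers data breaches and ransomware attacks. "
    "Exclusions include war, nuclear events, and intentional acts by the insured. "
    "Annual premium is $80,000."
)

toy_index = build_index(policy_text)
for score, chunk in retrieve(toy_index, "What are the exclusions for war and nuclear events?", top_k=2):
    print(f"{score:.2f}  {chunk}")
```

In a real RAG pipeline the retrieved chunks would then be passed to an LLM together with the question; LlamaIndex's query engine performs that last step in the examples that follow.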

In [15]:
#!pip install llama-index
!pip install llama-index-embeddings-huggingface

Collecting llama-index-embeddings-huggingface
  Downloading llama_index_embeddings_huggingface-0.6.1-py3-none-any.whl.metadata (458 bytes)
Downloading llama_index_embeddings_huggingface-0.6.1-py3-none-any.whl (8.9 kB)
Installing collected packages: llama-index-embeddings-huggingface
Successfully installed llama-index-embeddings-huggingface-0.6.1
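
The Hugging Face embeddings package installed above relates to the last objective, custom settings for chunk size and model. The following is a minimal sketch, assuming configuration is done through LlamaIndex's global `Settings` object before any index is built. The helper name `configure_llama_index`, the model names `BAAI/bge-small-en-v1.5` and `gpt-4o-mini`, and the chunk sizes are illustrative assumptions, not values taken from this notebook.

```python
def configure_llama_index(chunk_size=512, chunk_overlap=50):
    """Hypothetical helper: set global LlamaIndex defaults before building an index."""
    # Imported lazily: the Hugging Face embedding package pulls in heavy dependencies.
    from llama_index.core import Settings
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.llms.openai import OpenAI

    # Chunking parameters used when documents are split into nodes.
    Settings.chunk_size = chunk_size
    Settings.chunk_overlap = chunk_overlap

    # Local embedding model instead of the default OpenAI embedding model (assumed model name).
    Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

    # LLM used by the query engine to generate responses (assumed model name).
    Settings.llm = OpenAI(model="gpt-4o-mini")
    return Settings
```

Such a helper would be called once, before `VectorStoreIndex.from_documents(...)`. An index built with one embedding model should also be queried with the same model, since vectors from different models are not comparable.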


In [41]:
import os
from llama_index.core import Document, VectorStoreIndex, StorageContext, load_index_from_storage

In [42]:
from google.colab import userdata
openai_api_key = userdata.get("OPENAI_API_KEY")
os.environ["OPENAI_API_KEY"] = openai_api_key

#### Question-Answering Over a Single Document

In [43]:
def basic_use_case():
  """Demonstrate the basic LlamaIndex workflow:
  Step 1: Convert raw text to LlamaIndex's Document type.
  Step 2: Create a VectorStoreIndex, which chunks the text, embeds the chunks with the default
          OpenAI embedding model, and stores the embeddings in an in-memory vector store.
  Step 3: Create a query engine, which embeds the user query -> finds similar chunks ->
          passes them to the LLM -> generates a response.
  Step 4: Ask a question and print the response with its source chunks.
  """

  # sample text for document (could be a .pdf or any other text file)
  raw_text = """
  Policy Number: CYB-2026-1.
  This cyber liability policy covers data breaches, ransomware attacks,
  and business interruption losses up to $5 million per occurrence.
  The retroactive date is February 28, 2026.
  Exclusions include: war, nuclear events, and intentional acts by the insured.
  Annual premium: $80,000. Renewal date: December 31, 2028.
  """

  document = Document(text=raw_text, metadata={"source_document":"CYB-2026-1"})

  # build the index
  # VectorStoreIndex implicitly uses OpenAI's default embedding model to embed the chunks,
  # authenticated with the OPENAI_API_KEY set above.

  index = VectorStoreIndex.from_documents([document])

  # questions are asked to the query engine
  query_engine = index.as_query_engine()

  user_query = "What are the inclusions and exclusions in the policy document?"

  # .query() implicitly uses the default OpenAI LLM to generate the response, using the API key set above.
  response = query_engine.query(user_query)
  print(response)

  # print the sources of the response; each source node represents a retrieved chunk of the document
  for i, node in enumerate(response.source_nodes):
    print(f"Chunk used: {i+1}")
    print(f"Score: {node.score}")
    print(f"Source Document: {node.node.metadata.get('source_document')}")
    print(f"Chunk text: {node.text[:150]}")


#### Question-Answering over Multiple Documents

In [44]:
def multiple_doc_use_case():
  # list of three separate documents
  document = [
      Document(
          text="""Policy Number: CYB-2024-10234.
            Cyber liability policy covering data breaches and ransomware.
            Coverage limit: $5 million per occurrence.
            Exclusions: war, nuclear events, intentional acts.""",
          metadata={"doc_type": "cyber_policy", "client": "ABC Corp"}
          ),
      Document(
          text="""Policy Number: GL-2024-88821.
            General Liability policy for bodily injury and property damage.
            Coverage limit: $1 million per occurrence, $2 million aggregate.
            Covers operations in Singapore, Malaysia, and Indonesia.""",
          metadata={"doc_type": "general_liability", "client": "ABC Corp"}
          ),
      Document(
          text="""Claims History for ABC Corp:
            2022: One cyber claim for $120,000. Phishing attack, resolved.
            2023: One GL claim for $45,000. Slip and fall, settled.
            2024: No claims filed. 3-year loss ratio: 0.68.""",
          metadata={"doc_type": "claims_history", "client": "ABC Corp"}
          )
  ]

  # build a common index for all documents
  index = VectorStoreIndex.from_documents(document)

  # retrieve the top 2 most relevant chunks for each question
  query_engine = index.as_query_engine(similarity_top_k=2)

  questions = ["What is the cyber coverage limit?",
              "Which countries does the GL policy cover?",
              "Has ABC Corp filed any cyber claims before?"
               ]
  for question in questions:
    response = query_engine.query(question)
    print("Question:\n", question)
    print("Response:\n", response)
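
With three short documents and `similarity_top_k=2`, each answer is based on the two highest-scoring chunks. To see which documents were selected without generating an answer, the index can be queried through a retriever instead of a query engine. The sketch below assumes an `index` like the one built in `multiple_doc_use_case()`; `retrieved_sources` is a hypothetical helper name, while `as_retriever` and `retrieve` are LlamaIndex calls. Retrieval still embeds the question, so the embedding model is called, but the LLM is not.

```python
def retrieved_sources(index, question, top_k=2):
    """Return (score, doc_type, text preview) for each chunk retrieved for a question."""
    retriever = index.as_retriever(similarity_top_k=top_k)
    hits = retriever.retrieve(question)  # list of scored nodes, best match first
    return [
        (hit.score, hit.node.metadata.get("doc_type"), hit.node.get_content()[:80])
        for hit in hits
    ]


# Usage, given an `index` built as in multiple_doc_use_case():
# for score, doc_type, preview in retrieved_sources(index, "What is the cyber coverage limit?"):
#     print(score, doc_type, preview)
```

The `doc_type` key comes from the metadata attached to each `Document` above; a chunk without that key is reported as `None`.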

#### Saving the Index to Disk
By default, a LlamaIndex index is stored in memory, so the document(s) are indexed (and embedded) again every time the program runs.
To avoid this repeated work, the index can be persisted to disk and reloaded on later runs.

In [64]:
# directory to save index
PERSIST_DIRECTORY = "saved_index"

In [65]:
def create_and_save_index():
  raw_text = """
   Policy Number: CYB-2024-10234.
   Cyber liability policy. Coverage: $5M. Exclusions: war, nuclear.
  """

  document = Document(text=raw_text)
  index = VectorStoreIndex.from_documents([document])
  # save the index
  # https://developers.llamaindex.ai/python/framework/module_guides/storing/save_load/
  index.storage_context.persist(persist_dir=PERSIST_DIRECTORY)
  print("Index saved to: ", PERSIST_DIRECTORY)
  return index

In [66]:
def load_index():
  # check if the stored index already exists
  if not os.path.exists(PERSIST_DIRECTORY):
    print("Building index from scratch")
    index=create_and_save_index()
  else:
    # load the index
    print("Loading already saved index from: ", PERSIST_DIRECTORY)
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIRECTORY)
    index = load_index_from_storage(storage_context)
    print("Index loaded")
  return index
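
Note that `load_index()` only checks whether the directory exists, so a saved index keeps being reused even if the source text changes later. One possible way to detect that is to store a hash of the source texts next to the index and compare it on load. This is a sketch under stated assumptions, not part of LlamaIndex: the file name `source_fingerprint.json` and all three helper functions are hypothetical, and it assumes an extra file in the persist directory is ignored when the index is loaded.

```python
import hashlib
import json
import os

FINGERPRINT_FILE = "source_fingerprint.json"  # hypothetical file name


def fingerprint(texts):
    """Return a SHA-256 hex digest over the source texts, in order."""
    digest = hashlib.sha256()
    for text in texts:
        digest.update(text.encode("utf-8"))
        digest.update(b"\x00")  # separator so ["ab", "c"] and ["a", "bc"] differ
    return digest.hexdigest()


def save_fingerprint(persist_dir, texts):
    """Write the fingerprint of the indexed texts into the persist directory."""
    os.makedirs(persist_dir, exist_ok=True)
    path = os.path.join(persist_dir, FINGERPRINT_FILE)
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"fingerprint": fingerprint(texts)}, f)


def index_is_stale(persist_dir, texts):
    """True if no fingerprint was saved or the source texts have changed since."""
    path = os.path.join(persist_dir, FINGERPRINT_FILE)
    if not os.path.exists(path):
        return True
    with open(path, encoding="utf-8") as f:
        saved = json.load(f)
    return saved.get("fingerprint") != fingerprint(texts)
```

With these helpers, the index would be rebuilt when `index_is_stale(PERSIST_DIRECTORY, texts)` returns `True`, and `save_fingerprint(PERSIST_DIRECTORY, texts)` would be called right after the index is persisted.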

In [67]:
def persistence_use_case():
  index = load_index()
  query_engine = index.as_query_engine()
  user_question = "What is the coverage limit?"
  response = query_engine.query(user_question)
  print("User question: ", user_question)
  print("Response: ", response)

In [68]:
if __name__ == "__main__":
  #basic_use_case()
  #multiple_doc_use_case()
  persistence_use_case()

Building index from scratch
Index saved to:  saved_index
User question:  What is the coverage limit?
Response:  The coverage limit is $5 million.
