In [1]:
# Import necessary libraries

from dotenv import load_dotenv
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI

In [2]:
# Load the OpenAI API key from the environment (.env file) so it is not hardcoded in this notebook
load_dotenv()

True
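
The `True` above only says that `load_dotenv()` found something to load; it does not confirm that the key itself is set. A minimal sketch of an explicit check follows. It assumes the key is stored under `OPENAI_API_KEY` (the variable the OpenAI client reads by default), and `missing_env_vars` is a hypothetical helper, not part of python-dotenv or LlamaIndex.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names from `names` that are unset or empty in `env`."""
    # Fall back to the real process environment when no mapping is passed in
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Assumption: the API key lives in OPENAI_API_KEY
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running this right after `load_dotenv()` surfaces a missing key before any API request is made.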

In [None]:
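# Create the OpenAI LLM object and register it as the default LLM for LlamaIndex
# We are using 'gpt-4o-mini' because it is cost-effective and capable enough for querying text
# Refer to https://platform.openai.com/docs/models/gpt-4o-mini for more info
# Assigning to Settings.llm makes the query engine created later use this model;
# a bare OpenAI(...) call would create the object and then discard it

from llama_index.core import Settings

Settings.llm = OpenAI(model="gpt-4o-mini")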

In [4]:
# Use SimpleDirectoryReader from LlamaIndex to read the PDF file(s) in the pdf/ folder into LlamaIndex Document objects
# SimpleDirectoryReader can read one or more files and supports various file types such as .csv, .docx, .pdf, and more.
# Refer to https://github.com/run-llama/llama_index/blob/main/docs/docs/module_guides/loading/simpledirectoryreader.md

docs = SimpleDirectoryReader("pdf/").load_data()
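
Before pointing the reader at a folder, it can help to confirm what is actually in it. The sketch below uses only the standard library and makes no claim about which files `SimpleDirectoryReader` will accept; `count_files_by_suffix` is a hypothetical helper, and `pdf/` is the folder name used in the cell above.

```python
from collections import Counter
from pathlib import Path


def count_files_by_suffix(directory):
    """Count the regular files directly inside `directory`, grouped by lowercase suffix."""
    root = Path(directory)
    if not root.is_dir():
        # A missing folder yields an empty count instead of raising
        return Counter()
    return Counter(p.suffix.lower() for p in root.iterdir() if p.is_file())


print(dict(count_files_by_suffix("pdf/")))
```

An empty result means the folder is missing or empty, which is worth catching before the loading step.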

In [6]:
# VectorStoreIndex in LlamaIndex stores, queries, and manages vector embeddings for RAG workflows.
# Refer to https://github.com/run-llama/llama_index/blob/main/docs/docs/module_guides/indexing/vector_store_index.md for more.
# Here we build an in-memory vector index from the documents loaded above.
# This step calls the OpenAI embeddings API, so the account needs available credit;
# otherwise the request fails with "You exceeded your current quota".

vector = VectorStoreIndex.from_documents(docs)

2025-09-02 11:24:06,423 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
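
The log line above shows the document chunks being sent to the embeddings endpoint. To make the role of those embeddings concrete, here is a toy illustration of similarity-based retrieval. It is a sketch of the general idea only, not LlamaIndex's implementation: the three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the ids of the k chunks most similar to the query, best match first."""
    ranked = sorted(
        chunk_vecs,
        key=lambda chunk_id: cosine_similarity(query_vec, chunk_vecs[chunk_id]),
        reverse=True,
    )
    return ranked[:k]


# Made-up vectors standing in for embedded document chunks
toy_chunks = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [0.7, 0.7, 0.0],
}
print(top_k([1.0, 0.1, 0.0], toy_chunks, k=2))
```

At query time the question is embedded in the same way, and the closest chunks are passed to the LLM as context.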


In [7]:
# Create a query engine from the vector index
# as_query_engine() wraps the index so it can be queried in natural language:
# it retrieves the most relevant chunks and passes them to the LLM to compose an answer
# Refer to https://docs.llamaindex.ai/en/stable/module_guides/deploying/query_engine/ for more

query_engine = vector.as_query_engine()

In [9]:
# Perform a query using the query engine created above

response = query_engine.query("Give me a comparison between loading documents using langchain and llamaindex")
print(response)

2025-09-02 12:08:00,967 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-09-02 12:08:02,472 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


LangChain and LlamaIndex differ in their approach to loading documents. LangChain may have a distinct method or process compared to LlamaIndex when it comes to document loading. The two systems likely have unique features or functionalities that set them apart in terms of how they handle document loading tasks.
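
The answer above is quite generic, which may indicate that the retrieved chunks did not contain much relevant material. One way to check is to look at the chunks attached to the response. The sketch below assumes the response exposes `source_nodes`, where each item has a `score` and a `node` with `get_content()` and `metadata`; the `file_name` metadata key is also an assumption, and `summarize_sources` is a hypothetical helper.

```python
def summarize_sources(response, preview_chars=80):
    """Return one line per retrieved chunk: similarity score, file name, text preview."""
    lines = []
    for item in getattr(response, "source_nodes", []):
        # Collapse newlines so each chunk fits on a single line
        text = item.node.get_content().replace("\n", " ")
        name = item.node.metadata.get("file_name", "unknown")
        score = "n/a" if item.score is None else f"{item.score:.3f}"
        lines.append(f"{score}  {name}  {text[:preview_chars]}")
    return lines


# In the notebook, after running the query:
# print("\n".join(summarize_sources(response)))
```

If the previews show little about LangChain or LlamaIndex document loading, the source PDF or the question is the likelier place to adjust, rather than the model.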
