#### Load documents

In [17]:
from langchain_community.document_loaders import TextLoader

docs=TextLoader("../data.txt").load()

#### Split documents into chunks

In [18]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter=RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)

chunks=splitter.split_documents(docs)
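
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a plain sliding-window splitter. It is a simplification and not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as paragraph and line boundaries, so its chunks are usually shorter than 500 characters. The function name `sliding_window_chunks` and the stand-in text are hypothetical.

```python
def sliding_window_chunks(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts 450 chars after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks

sample = "abcdefghij" * 120  # 1200 characters of stand-in text
pieces = sliding_window_chunks(sample)
print(len(pieces), [len(p) for p in pieces])
```

The last 50 characters of each piece repeat as the first 50 of the next, so a sentence cut at a chunk boundary still appears whole in at least one chunk.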

#### Create Embeddings

In [19]:
from langchain_community.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)


#### Store Embeddings in a Vector Store

In [20]:
from langchain_community.vectorstores import FAISS

vectorstore = FAISS.from_documents(chunks, embeddings)


In [21]:
retriever=vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k":4}
)
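
Conceptually, a similarity retriever with `k=4` embeds the query and returns the four stored vectors nearest to it. The sketch below shows that idea with brute-force search over toy 3-dimensional vectors. It assumes Euclidean (L2) distance; the metric the real index uses depends on how it was built. The helper `top_k` and the toy data are hypothetical and are not part of FAISS or LangChain.

```python
import math

def top_k(query_vec, doc_vecs, k=4):
    """Return indices of the k vectors closest to query_vec by Euclidean distance."""
    distances = [
        (math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)
    ]
    distances.sort()  # smallest distance first; ties fall back to the lower index
    return [idx for _, idx in distances[:k]]

# Toy stand-ins for chunk embeddings (real ones have hundreds of dimensions).
toy_vectors = [
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
]
toy_query = [1.0, 0.0, 0.0]
nearest = top_k(toy_query, toy_vectors, k=2)
print(nearest)
```

Note that the search always returns `k` results when at least `k` vectors are stored, even if none is a good match, which is why a vague query can still produce four chunks.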

In [22]:
print("Number of chunks stored:", len(vectorstore.index_to_docstore_id))


Number of chunks stored: 22


#### Testing the retriever directly

In [23]:
query = "What is this document about?"

results = retriever.get_relevant_documents(query)

print("Number of results:", len(results))
print("-" * 50)

for i, doc in enumerate(results, 1):
    print(f"Result {i}:")
    print(doc.page_content)
    print("-" * 50)


Number of results: 4
--------------------------------------------------
Result 1:
technology hubs and space research centers, Karnataka represents a blend of continuity and change.
--------------------------------------------------
Result 2:
Today, Karnataka stands at the intersection of tradition and modernity. It continues to preserve its linguistic and cultural heritage while embracing rapid technological and social change. Challenges such as urban congestion, water management, environmental conservation, and regional inequality remain important policy concerns. At the same time, the state’s human capital, innovation capacity, and cultural depth position it as a key contributor to India’s future development.
--------------------------------------------------
Result 3:
One of the most celebrated historical sites in India, Hampi, served as the capital of the Vijayanagara Empire and is now a UNESCO World Heritage Site. Its temples, markets, and monuments reflect a period when Karna

In [32]:
query = "which all states share border with Karnataka"

results = retriever.get_relevant_documents(query)

for doc in results:
    print(doc.page_content)


Geography and Natural Features
Karnataka covers an area of about 191,791 square kilometers, making it one of the larger Indian states by size. It shares borders with Maharashtra, Goa, Kerala, Tamil Nadu, Andhra Pradesh, and Telangana. To the west lies a coastline of about 320 kilometers along the Arabian Sea, which supports ports, fishing communities, and coastal trade.
Karnataka is a major state in southern India known for its deep historical roots, linguistic richness, cultural diversity, and strong contribution to India’s modern economy. Located on the Deccan Plateau, Karnataka connects the interior of peninsular India with the Arabian Sea coast and has played an important role in shaping political, cultural, and economic developments over many centuries. From ancient dynasties and classical art forms to modern technology hubs and space research centers,
In summary, Karnataka is not defined by a single identity. It is a land of ancient kingdoms and modern startups, classical arts 

In [None]:
# Printing results directly shows the full Document objects, including metadata
print(results)

[Document(page_content='Karnataka is a major state in southern India known for its deep historical roots, linguistic richness, cultural diversity, and strong contribution to India’s modern economy. Located on the Deccan Plateau, Karnataka connects the interior of peninsular India with the Arabian Sea coast and has played an important role in shaping political, cultural, and economic developments over many centuries. From ancient dynasties and classical art forms to modern technology hubs and space research centers,', metadata={'source': '../data.txt'}),
 Document(page_content='Geography and Natural Features\nKarnataka covers an area of about 191,791 square kilometers, making it one of the larger Indian states by size. It shares borders with Maharashtra, Goa, Kerala, Tamil Nadu, Andhra Pradesh, and Telangana. To the west lies a coastline of about 320 kilometers along the Arabian Sea, which supports ports, fishing communities, and coastal trade.', metadata={'source': '../data.txt'}),
 

In [31]:
query="what is capital of Karnataka"

res=retriever.get_relevant_documents(query)

for doc in res:
    print(f"\n{doc.page_content}")


Karnataka is a major state in southern India known for its deep historical roots, linguistic richness, cultural diversity, and strong contribution to India’s modern economy. Located on the Deccan Plateau, Karnataka connects the interior of peninsular India with the Arabian Sea coast and has played an important role in shaping political, cultural, and economic developments over many centuries. From ancient dynasties and classical art forms to modern technology hubs and space research centers,

In summary, Karnataka is not defined by a single identity. It is a land of ancient kingdoms and modern startups, classical arts and digital innovation, rural traditions and global connections. Its historical legacy, cultural richness, and economic dynamism together make Karnataka one of the most significant and influential states in India.

Today, Karnataka stands at the intersection of tradition and modernity. It continues to preserve its linguistic and cultural heritage while embracing rapid 

#### Same pipeline in a single cell, using Google embeddings

In [None]:
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS

docs=TextLoader("../data.txt").load()

splitter=RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)

chunks=splitter.split_documents(docs)

embeddings=GoogleGenerativeAIEmbeddings(model="models/embedding-001")

vectorstore=FAISS.from_documents(chunks,embeddings)

retriever=vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k":4}
)
