In [None]:
### Base Code For RAG Agent ###

<!-- Document stage: loading data from a text file.
Read the file and print only its first 1000 characters. -->

In [1]:
from langchain_community.document_loaders import TextLoader
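file = "./data.txt"
# Decode explicitly as UTF-8 so punctuation such as curly apostrophes is not garbled
text = TextLoader(file, encoding="utf-8").load()
# Preview only the first 1000 characters of the loaded document
print(text[0].page_content[:1000])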

Overview of DevOps:  
Briefly describe the DevOps culture and its goals—emphasizing automation, 
collaboration between development and operations teams, and continuous 
integration/delivery. 
Python in DevOps 
Highlight why Python has become a go-to scripting language for DevOps 
engineers. Its simplicity, flexibility, and vast ecosystem of libraries make it ideal for 
DevOps automation and integrations.
Key Advantages of Python in DevOps - Ease of Use:  
o Python’s readable syntax and ease of learning make it accessible for both 
beginner and experienced DevOps engineers - Cross-Platform Compatibility: 
o  Python scripts run on multiple operating systems, including Linux, 
Windows, and macOS, ensuring consistency across environments. - Rich Library Ecosystem: 
o Python has extensive libraries for interacting with APIs, automating 
infrastructure, and working with data, making it suitable for various 
DevOps use cases. - Community and Support:  
o Python’s strong community suppor

In [None]:
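# Chunking stage - Text splitting
# Convert the large document into many small, overlapping chunks
# chunk_size = 800: each chunk holds at most 800 characters
# chunk_overlap = 100: consecutive chunks share up to 100 characters
# So each chunk contributes roughly 700 new characters, and its last ~100 characters reappear at the start of the next chunk
# add_start_index=True records each chunk's character offset in the source document as metadata
# Print the total number of chunks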
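
The size and overlap arithmetic described above can be sketched in plain Python. This is a simplified fixed-width sliding window, not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as newlines, so its chunks are usually a little shorter than 800 characters and its offsets are not exact multiples of 700. The function name `fixed_width_chunks` and the sample string are hypothetical.

```python
def fixed_width_chunks(text, chunk_size=800, chunk_overlap=100):
    """Return (start_index, chunk) pairs from a plain sliding window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # 700 new characters per chunk
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Hypothetical 2000-character sample document
sample = "".join(str(i % 10) for i in range(2000))
pieces = fixed_width_chunks(sample)
for start, piece in pieces:
    print(start, len(piece))
```

With this sample, the windows start at offsets 0, 700, and 1400, and the last 100 characters of one window are the first 100 characters of the next.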

In [2]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Initialize splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=100,
    add_start_index=True)
# Split documents into chunks
chunks = text_splitter.split_documents(text)
print(len(chunks))
# Inspect one chunk
print(chunks[0])

4
page_content='Overview of DevOps:  
Briefly describe the DevOps culture and its goals—emphasizing automation, 
collaboration between development and operations teams, and continuous 
integration/delivery. 
Python in DevOps 
Highlight why Python has become a go-to scripting language for DevOps 
engineers. Its simplicity, flexibility, and vast ecosystem of libraries make it ideal for 
DevOps automation and integrations.
Key Advantages of Python in DevOps - Ease of Use:  
o Python’s readable syntax and ease of learning make it accessible for both 
beginner and experienced DevOps engineers - Cross-Platform Compatibility: 
o  Python scripts run on multiple operating systems, including Linux, 
Windows, and macOS, ensuring consistency across environments. - Rich Library Ecosystem:' metadata={'source': './data.txt', 'start_index': 0}


In [None]:
# Step 4: Embeddings + Chroma
# Each chunk is converted into a vector and stored in a local, persistent vector database (Chroma)
# Ollama generates the embeddings locally
# model="nomic-embed-text": an embedding model designed for semantic search
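
To make "converted into vectors" more concrete, here is a minimal sketch of how a query vector can be compared with stored vectors. It uses cosine similarity, which is one common choice; the distance metric Chroma actually applies depends on how the collection is configured. The three-dimensional vectors and their labels are invented for illustration, since real embedding vectors have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" standing in for stored chunks
toy_vectors = {
    "python_scripting": [0.9, 0.1, 0.0],
    "ci_cd": [0.1, 0.9, 0.1],
    "databases": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]  # hypothetical embedding of a user query

# Rank stored items from most to least similar to the query
ranked = sorted(
    toy_vectors,
    key=lambda name: cosine_similarity(query_vector, toy_vectors[name]),
    reverse=True,
)
print(ranked)
```

The store always returns its nearest neighbours, even when none of them is a good match, so inspecting similarity scores is a reasonable sanity check for queries the document may not cover.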

In [3]:
# OllamaEmbeddings in langchain_community is deprecated; this assumes the langchain-ollama package is installed
from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma

# Initialize embedding model (local Ollama)
embeddings = OllamaEmbeddings(
    model="nomic-embed-text"
)

# Create Chroma vector store
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="chroma_db"
)

print("Vector store created and persisted successfully!")


Vector store created and persisted successfully!
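
Because `persist_directory` keeps the collection on disk, running the cell above more than once can store the same chunks again, which later shows up as identical search hits. One possible mitigation, sketched here under assumptions, is to derive a deterministic ID for every chunk so that a re-run overwrites existing entries instead of adding new ones. The helper `stable_chunk_id` and the toy chunk dictionaries are hypothetical; in the notebook the same values would come from `chunk.metadata` and `chunk.page_content`, and the list would be passed through the `ids` argument of `Chroma.from_documents`, assuming the installed version accepts it.

```python
import hashlib


def stable_chunk_id(source, start_index, content):
    """Build an ID that is identical on every run for the same chunk."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:16]
    return f"{source}:{start_index}:{digest}"


# Hypothetical stand-ins for the notebook's chunks
toy_chunks = [
    {"source": "./data.txt", "start_index": 0, "content": "first chunk text"},
    {"source": "./data.txt", "start_index": 687, "content": "second chunk text"},
]

ids = [
    stable_chunk_id(c["source"], c["start_index"], c["content"])
    for c in toy_chunks
]
print(ids)
```

Including the content hash means an edited source file yields new IDs for the changed chunks, while unchanged chunks keep theirs.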


In [5]:
# User query
query = "what is dynamodb"

# Perform semantic search
results = vectorstore.similarity_search(
    query=query,
    k=3
)

# Inspect results
print(f"\nTop {len(results)} results:\n")

for i, doc in enumerate(results, 1):
    print(f"--- Result {i} ---")
    print(doc.page_content)
    print("Metadata:", doc.metadata)
    print()


Top 3 results:

--- Result 1 ---
Windows, and macOS, ensuring consistency across environments. - Rich Library Ecosystem: 
o Python has extensive libraries for interacting with APIs, automating 
infrastructure, and working with data, making it suitable for various 
DevOps use cases. - Community and Support:  
o Python’s strong community support ensures fast problem-solving and 
access to many resources and tools.
Metadata: {'source': './data.txt', 'start_index': 687}

--- Result 2 ---
Windows, and macOS, ensuring consistency across environments. - Rich Library Ecosystem: 
o Python has extensive libraries for interacting with APIs, automating 
infrastructure, and working with data, making it suitable for various 
DevOps use cases. - Community and Support:  
o Python’s strong community support ensures fast problem-solving and 
access to many resources and tools.
Metadata: {'start_index': 687, 'source': './data.txt'}

--- Result 3 ---
Windows, and macOS, ensuring consistency across en