## First stage 
### Downloaded the sample stackoverflow data from hugging face with 13224 records.
#### The URL for dataset : hf://datasets/mlfoundations-dev/stackexchange_devops/data/train-00000-of-00001.parquet
> **Following steps are taken:**
> - Create Virtual environment , in this case I used python 3.12
> - From top right select kernel 3.12

In [1]:
pip install datasets

Collecting datasets
  Downloading datasets-3.3.2-py3-none-any.whl.metadata (19 kB)
Collecting pyarrow>=15.0.0 (from datasets)
  Downloading pyarrow-19.0.1-cp312-cp312-macosx_12_0_x86_64.whl.metadata (3.3 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
  Using cached dill-0.3.8-py3-none-any.whl.metadata (10 kB)
Collecting pandas (from datasets)
  Downloading pandas-2.2.3-cp312-cp312-macosx_10_9_x86_64.whl.metadata (89 kB)
Collecting xxhash (from datasets)
  Downloading xxhash-3.5.0-cp312-cp312-macosx_10_9_x86_64.whl.metadata (12 kB)
Collecting multiprocess<0.70.17 (from datasets)
  Downloading multiprocess-0.70.16-py312-none-any.whl.metadata (7.2 kB)
Collecting fsspec<=2024.12.0,>=2023.1.0 (from fsspec[http]<=2024.12.0,>=2023.1.0->datasets)
  Downloading fsspec-2024.12.0-py3-none-any.whl.metadata (11 kB)
Collecting pytz>=2020.1 (from pandas->datasets)
  Downloading pytz-2025.1-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas->datasets)
  Downloading tzdata-

In [3]:
from datasets import load_dataset
import pandas as pd

In [4]:
df = pd.read_parquet("hf://datasets/mlfoundations-dev/stackexchange_devops/data/train-00000-of-00001.parquet")
df.head(5)
len(df)

13224

In [5]:
# Add a new column combining both questions and answers
df["myspec"]="Question: "+df["instruction"]+" Answer:  " + df["completion"]
# check first 4 rows for 3 colums in dataframe df
df.head(4)

Unnamed: 0,instruction,completion,conversations,myspec
0,My build process packages my application in a ...,"Yes, your deployment revision can use a `.nupk...","[{'from': 'human', 'value': 'My build process ...",Question: My build process packages my applica...
1,So here is your job role:\n\nYou help in desig...,"Based on the job role you've described, a good...","[{'from': 'human', 'value': 'So here is your j...",Question: So here is your job role:\n\nYou hel...
2,Amazon S3 has an option of cross-region replic...,Amazon S3's cross-region replication (CRR) is ...,"[{'from': 'human', 'value': 'Amazon S3 has an ...",Question: Amazon S3 has an option of cross-reg...
3,I've had some very interesting conversations t...,"In a DevOps environment, where cross-functiona...","[{'from': 'human', 'value': 'I've had some ver...",Question: I've had some very interesting conve...


In [6]:
## For testing my script, I am minimizing the token cost to embedding model and just using 50 records from dataframe.
## These fifty records will be written into a json file
df50=df[:50]

In [7]:
## Write this dataframe df new created column "myspec" into a json file.
## File name : devops_data.json

json_data=df50["myspec"].to_json(orient='records')
with open ("devops_data50.json","w") as devops_data:
    devops_data.write(json_data)

In [8]:
import json
with open("devops_data50.json","r") as data_f50:
    data_50=json.load(data_f50)
    print(len(data_50))

50


## 2nd stage:
#### Load JSON document with 50 records only, to not exceed quota of embedding model
#### Use the json file created with only 50 records of "myspec" column from dataframe.


#### Use my devops json data to answer related questions.
#### https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints
#### import modules/libraries from rquirements.txt

In [10]:
pip install -r requirements.txt


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49m/Users/badalsingh/Workspace/LLMs/LLMOps/myenv3.12/bin/python -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [13]:
import os
from openai import AzureOpenAI
import sys

from langchain.document_loaders import JSONLoader

In [14]:
json_loader = JSONLoader(file_path="devops_data50.json", jq_schema=".",text_content=False)
json_loaded_data=json_loader.load()
## Read metdata from JSonLoader of langchain
d=json_loaded_data[0]
print(d.metadata)
## print content of this page loaded
print(json_loaded_data[0].page_content[0:5000])

{'source': '/Users/badalsingh/Workspace/LLMs/LLMOps/devops_data50.json', 'seq_num': 1}
["Question: My build process packages my application in a .nupkg instead of a .zip. \nAssuming my .nupkg contains a correctly-implemented appspec.yml and is otherwise appropriately bundled, can my deployment revision use it?\n Answer:  Yes, your deployment revision can use a `.nupkg` file, provided that it contains a correctly-implemented `appspec.yml` file and is otherwise appropriately packaged according to the deployment service's requirements. \n\nFor example, if you are using AWS CodeDeploy, the `.nupkg` file needs to include the `appspec.yml` file, which defines how the application should be deployed, along with any other files needed for the deployment. The key is that the deployment service must support the type of package you are using, and you have to ensure that the contents of the `.nupkg` are correctly structured for deployment.\n\nMake sure to test the deployment with a small revision t

## 3rd stage
#### Split document into smaller chunks

In [None]:
## For testing I am using recursive text splitter, there are other options as well in langchain :
### https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/

In [15]:
from langchain.text_splitter import RecursiveCharacterTextSplitter, CharacterTextSplitter

In [16]:
## Test recursive text splitter with chunk size and chunk overlap
chunk_size=2500
chunk_overlap=250
sample_text = "My name is somthing you won't remember, until you know this."
r_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
r_splitter.split_text(sample_text)

["My name is somthing you won't remember, until you know this."]

In [17]:
r_splitter_new = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap,separators=["Question:","\n\n","\n"])

In [18]:
## Split the data from json file created in stage 2 above.
my_data = r_splitter_new.split_documents(json_loaded_data)
len(my_data)

50

## 4th stage
### Vectorization and embedding
### All chunks of data we obtained from splitting should be indexed , so that we can use the data to answer questions.
### to perform this we will use embedding and vector stores

In [19]:
#from langchain.embeddings.azure_openai import AzureOpenAIEmbeddings
from langchain_openai import AzureOpenAIEmbeddings

In [21]:
%env AZURE_OPENAI_ENDPOINT=https://ai-myraghub....embeddings?api-version=2023-05-15
%env AZURE_OPENAI_API_KEY=B33...ILfZ
az_openai_embedding = AzureOpenAIEmbeddings(model="text-embedding-ada-002")

env: AZURE_OPENAI_ENDPOINT=https://ai-myraghub246415217390.openai.azure.com/openai/deployments/text-embedding-ada-002/embeddings?api-version=2023-05-15
env: AZURE_OPENAI_API_KEY=B33zpvwBj9Y4Lk4PF8RDZpNnsBA6vsKmHpyH376vcYPfJH84x4VFJQQJ99BBACHYHv6XJ3w3AAAAACOGILfZ


In [22]:
#openai_embedding = OpenAIEmbeddings()
az_openai_embedding = AzureOpenAIEmbeddings()

In [23]:
### Sample embeddings
# sample texts for embedding and embedding comparisions
s1="I like fruits."
s2="I like apples."
s3="Sun rises from the east."
emb1 = az_openai_embedding.embed_query(s1)
emb2 = az_openai_embedding.embed_query(s2)
emb3 = az_openai_embedding.embed_query(s3)
import numpy as np
### compare likeliness of embeddings
print(np.dot(emb1,emb2))
print(np.dot(emb1,emb3))
print(np.dot(emb2,emb3))

0.9424148941143409
0.7456183639607952
0.7402937622251249


In [24]:
from langchain.vectorstores import Chroma


In [25]:
db_dir = "./docs/chroma/"
!rm -rf ./docs/chroma


In [26]:
vectordb = Chroma.from_documents(documents=my_data, embedding=az_openai_embedding, persist_directory=db_dir)

In [44]:
print(vectordb._collection.count())

50


In [28]:
docs=vectordb.similarity_search("Amazon S3 has an option of cross-region replication",k=2)
len(docs)
print(docs[0].page_content)

Question: Amazon S3 has an option of cross-region replication which should be pretty fault-tolerant against region/zone outages.\nDoes that mean those who are ranting about the outage did not make use of this aspect?\nOr is that cross-region replication is not completely fool-proof and would not have helped?\n Answer:  Amazon S3's cross-region replication (CRR) is indeed a powerful feature that allows for data to be automatically replicated across different AWS regions. This can enhance fault tolerance and data availability, especially in the case of regional outages. However, there are several considerations to keep in mind regarding its effectiveness during outages:\n\n1. **Configuration**: Users need to enable and correctly configure cross-region replication. If an organization does not have CRR set up, they would not benefit from this feature during an outage.\n\n2. **Replication Lag**: CRR is not instantaneous. There can be replication lag, meaning that any changes made to data ma

### Now time to use Chat GPT LLM model to use for question anwering
### First setup envrionment vraibles for AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT,OPENAI_API_VERSION


In [107]:
%env AZURE_OPENAI_ENDPOINT=https://ai-myrag....api-version=2025-01-01-preview
%env AZURE_OPENAI_API_KEY=B33...LfZ
%env OPENAI_API_VERSION=2025-01-01-preview
%env AZURE_OPENAI_DEPLOYMENT_NAME=gpt-35-turbo

env: AZURE_OPENAI_ENDPOINT=https://ai-myraghub246415217390.openai.azure.com/openai/deployments/gpt-35-turbo/chat/completions?api-version=2025-01-01-preview
env: AZURE_OPENAI_API_KEY=B33zpvwBj9Y4Lk4PF8RDZpNnsBA6vsKmHpyH376vcYPfJH84x4VFJQQJ99BBACHYHv6XJ3w3AAAAACOGILfZ
env: OPENAI_API_VERSION=2025-01-01-preview
env: AZURE_OPENAI_DEPLOYMENT_NAME=gpt-35-turbo


In [103]:
#from langchain.llms import AzureOpenAI
from langchain_openai import AzureChatOpenAI
#from langchain.retrievers.self_query.base import SelfQueryRetriever
#from langchain.chains.query_constructor.base import AttributeInfo

In [111]:
llm = AzureChatOpenAI(temperature=0,name="gpt-35-turbo")

In [139]:
#from langchain.chains import RetrievalQA
from langchain.chains import (create_retrieval_chain,create_history_aware_retriever)
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.mapreduce import MapReduceDocumentsChain
from langchain_core.prompts import (ChatPromptTemplate, MessagesPlaceholder)

In [137]:
llm.invoke("Tell me a joke")

AIMessage(content="Why couldn't the bicycle stand up by itself?\n\nBecause it was two tired!", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 16, 'prompt_tokens': 11, 'total_tokens': 27, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_0165350fbb', 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'jailbreak': {'filtered': False, 'detected': False}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}, id='run-4b

In [122]:
#llm.invoke("Tell me a joke")
#Stuff technique
#need to check other 3 techniques like Map_reduce, Refine and map_rerank


system_prompt = (
    "Use the given context to answer the question. "
    "If you don't know the answer, say you don't know. "
    "Use three sentence maximum and keep the answer concise. "
    "Context: {context}"
)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)
print(vectordb._collection.count())
question_answer_chain = create_stuff_documents_chain(llm, prompt)

retriever=vectordb.as_retriever()
chain = create_retrieval_chain(retriever, question_answer_chain)
query="what info does the doc contain regarding aws s3."
chain.invoke({"input": query})["answer"]


50


'The document contains information about managing data in Amazon S3 (Simple Storage Service) within the context of various use cases and best practices. It covers topics such as HIPAA compliance considerations, strategies for backing up data from S3, ensuring site availability during S3 outages, and securely managing secrets in serverless applications using AWS Lambda. The document provides detailed guidance on encryption, access controls, redundancy, diversification, and fallback mechanisms to enhance the security and reliability of applications utilizing S3.'

In [138]:
pip install -qU langgraph


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49m/Users/badalsingh/Workspace/LLMs/LLMOps/myenv3.12/bin/python -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [151]:
from langchain.memory.buffer import ConversationBufferMemory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
#question_answer_chain_chat = create_stuff_documents_chain(llm, prompt,memory)


### https://python.langchain.com/api_reference/langchain/chains/langchain.chains.conversational_retrieval.base.ConversationalRetrievalChain.html

### Contex
# Contextualize question
contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, just "
    "reformulate it if needed and otherwise return it as is."
)

contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt
)

# Answer question
qa_system_prompt = (
    "You are an assistant for question-answering tasks."
    "Use the following pieces of retrieved context to answer the question."
    "If you don't know the answer, just say that you don't know." 
    "Use three sentences maximum and keep the answer concise."
    "Context: {context}"
) 

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system",qa_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human","{input}"),
    ]
)
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt) 

rag_chain = create_retrieval_chain(
    history_aware_retriever, question_answer_chain
)
#chat_history = []
#query="what info does the doc contain regarding aws s3."
#rag_chain.invoke({"input":query,"chat_history":chat_history})



In [153]:
from langchain_core.messages import HumanMessage

chat_history = []

query="what info does the doc contain regarding aws s3."
resp1 = rag_chain.invoke({"input":query,"chat_history":chat_history})
print(resp1["answer"])

chat_history.extend([HumanMessage(content=question), resp1["answer"]])

second_question = "what strategies it mentions to backup data?"
ai_msg_2 = rag_chain.invoke({"input": second_question, "chat_history": chat_history})

print(ai_msg_2)

The document contains information about managing data in Amazon S3, including strategies for handling HIPAA-compliant data, ordering physical backups using AWS Snowball, ensuring site availability during S3 outages, and securely managing secrets in AWS Lambda functions without committing them to source control. It also discusses best practices for encryption, access controls, monitoring, and redundancy when working with S3 in AWS architectures.


In [150]:
query="what strategies it mentions to backup data."
rag_chain.invoke({"input":query,"chat_history":chat_history})

{'input': 'what strategies it mentions to backup data.',
 'chat_history': [],
 'context': [Document(metadata={'seq_num': 1, 'source': '/Users/badalsingh/Workspace/LLMs/LLMOps/devops_data50.json'}, page_content='Question: What is a good strategy to keep my site online when S3 goes offline?\\nIf S3 US East 1 goes offline, how should I have my app configured/structured to prevent that taking my entire site offline?\\nWhat are the best strategies to diversify in this sort of situation?\\n Answer:  To ensure your site remains online even if Amazon S3 (Simple Storage Service) goes offline\\u2014particularly in the US East (N. Virginia) region\\u2014it\'s important to implement a multi-faceted strategy that includes redundancy, diversification, and fallback mechanisms. Here are some best practices and strategies to consider:\\n\\n### 1. Use Cross-Region Replication\\n- **Set Up Cross-Region Replication**: Configure S3 bucket replication to replicate your data to a different AWS region (e.g., 

In [148]:
#question="Amazon S3 has an option of cross-region replication"
#result = qa_chain({"query": question})
#result["result"]
query="what strategies it mentions to backup data."
chain.invoke({"input": query})["answer"]

'The strategies mentioned to backup data include using Cross-Region Replication in Amazon S3, implementing Local Caching on web servers or in-memory caches, utilizing Backup and Restore Solutions for regular backups, and considering a Multi-Cloud Strategy with multiple cloud storage providers. These strategies aim to ensure data redundancy, availability, and quick recovery in case of outages or failures.'

In [120]:
pip install -U langsmith

Collecting langsmith
  Downloading langsmith-0.3.15-py3-none-any.whl.metadata (14 kB)
Downloading langsmith-0.3.15-py3-none-any.whl (343 kB)
Installing collected packages: langsmith
  Attempting uninstall: langsmith
    Found existing installation: langsmith 0.3.1
    Uninstalling langsmith-0.3.1:
      Successfully uninstalled langsmith-0.3.1
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain-community 0.3.16 requires langchain<0.4.0,>=0.3.16, but you have langchain 0.3.15 which is incompatible.[0m[31m
[0mSuccessfully installed langsmith-0.3.15

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49m/Users/badalsingh/Workspace/LLMs/LLMOps/myenv3.12/bin/python -m pip install --upgrade

In [132]:
#Using Langsmith
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"] = "ls...0" # replace dots with your api key
os.environ["LANGCHAIN_PROJECT"] = "myllmproject1"
os.environ["OPENAI_API_KEY"] = "B33...LfZ"

In [135]:
from langsmith import utils
utils.tracing_is_enabled()

False

In [136]:
print(os.environ.get("OPENAI_API_KEY"))

B33zpvwBj9Y4Lk4PF8RDZpNnsBA6vsKmHpyH376vcYPfJH84x4VFJQQJ99BBACHYHv6XJ3w3AAAAACOGILfZ
