## RAG and QA Chain
* Generate embeddings for the transcript text
* Store and query the embeddings in a vector database (FAISS)
* Set up the retriever for document search
* Set up the Retrieval-Augmented Generation (RAG) pipeline as a QA chain
* Test and validate the pipeline outputs

## Step 1: Import Libraries and Load Transcripts

In [3]:
!pip install faiss-cpu sentence-transformers langchain langchain-openai --quiet
!pip install -U langchain-community --quiet

In [4]:
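# Community integrations live in langchain_community (installed above);
# importing them from the top-level langchain package is deprecated.
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain_openai import OpenAI
from langchain_core.documents import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter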
import os
from google.colab import userdata
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [5]:
transcript_folder = "/content/drive/MyDrive/ServiceNow_Audio_Transcripts/"
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)

documents = []

for filename in os.listdir(transcript_folder):
    if filename.endswith(".txt"):
        path = os.path.join(transcript_folder, filename)
        with open(path, "r", encoding="utf-8") as f:
            text = f.read()
            chunks = text_splitter.split_text(text)
            for chunk in chunks:
                documents.append(Document(page_content=chunk, metadata={"source": filename}))
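
To make `chunk_size=500` and `chunk_overlap=50` concrete, here is a minimal sketch of overlapping character windows. It is a simplification, not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraphs and newlines), and `fixed_window_chunks` is a hypothetical helper name.

```python
def fixed_window_chunks(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-width character windows that overlap.

    Hypothetical helper for illustration only; the real splitter prefers
    to break on separators rather than at fixed character offsets.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window start advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Synthetic 1,000-character sample, not taken from the transcripts.
sample = "".join(chr(ord("a") + i % 26) for i in range(1000))
chunks = fixed_window_chunks(sample)
print([len(c) for c in chunks])
```

Each window repeats the last 50 characters of the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk.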

## Step 2: Embeddings and Vector Store

In [6]:
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

vectorstore = FAISS.from_documents(documents, embedding=embedding_model)

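
Conceptually, the vector store holds one embedding per chunk and returns the chunks whose vectors lie closest to the query vector. The sketch below does this by brute force with cosine similarity on toy 3-dimensional vectors. It is an illustration under assumptions: the index type and distance metric the real FAISS store uses may differ, and `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=4):
    """Return the labels of the k vectors most similar to query_vec.

    indexed is a list of (label, vector) pairs; this scans every pair,
    which is what an exact (non-approximate) index does conceptually.
    """
    scored = [(cosine_similarity(query_vec, vec), label) for label, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:k]]


# Toy vectors standing in for chunk embeddings (not real model output).
toy_index = [
    ("chunk_a", [1.0, 0.0, 0.0]),
    ("chunk_b", [0.0, 1.0, 0.0]),
    ("chunk_c", [0.9, 0.1, 0.0]),
]
print(top_k([1.0, 0.0, 0.0], toy_index, k=2))
```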

## Step 3: LLM, Retriever, and QA Chain

In [7]:
llm = OpenAI(
    temperature=0,
    openai_api_key=userdata.get("OPENAI_API_KEY")
)

In [8]:
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    return_source_documents=True
)
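
With no `chain_type` argument, `RetrievalQA.from_chain_type` uses the "stuff" strategy: the retrieved chunks are concatenated into a single prompt together with the question. The sketch below shows the idea only. The prompt wording is an assumption and is not LangChain's actual template, and `build_stuff_prompt` is a hypothetical helper name.

```python
def build_stuff_prompt(question, chunks):
    """Concatenate retrieved chunks and the question into one prompt string.

    The wording here is illustrative; the library's real template differs.
    """
    context = "\n\n".join(chunks)  # all chunks are "stuffed" into one context
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuff_prompt(
    "What is CMDB in ServiceNow?",
    ["first retrieved chunk", "second retrieved chunk"],
)
print(prompt)
```

Because every retrieved chunk goes into one prompt, the number of chunks returned by the retriever and the chunk size together determine how much of the model's context window is used.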

## Step 4: Validate the Pipeline Outputs

In [9]:
question = "How does ServiceNow use AI in incident management?"
result = qa_chain.invoke({"query": question})  # calling the chain directly is deprecated

print("🧠 Answer:\n", result["result"])
print("\n📚 Sources:")
for doc in result["source_documents"]:
    print("-", doc.metadata["source"])


🧠 Answer:
  ServiceNow uses AI in incident management through their AI Ops solution, which utilizes machine learning and pattern recognition to assist and automate the work of IT operations. This helps move away from an incident creation culture to an alert-based, actionable mindset. The AI is built into the ServiceNow platform and is not acquired from other companies, making it a native and additive part of the platform's innovations and investments.

📚 Sources:
- kQV6g8Vbbfc.txt
- KSWNDuKn9t0.txt
- WyQTP0AA1VU.txt
- fqB-NcZmqXo.txt


In [10]:
question = "What is CMDB in ServiceNow?"
result = qa_chain.invoke({"query": question})  # calling the chain directly is deprecated

print("🧠 Answer:\n", result["result"])
print("\n📚 Sources:")
for doc in result["source_documents"]:
    print("-", doc.metadata["source"])

🧠 Answer:
  CMDB stands for Configuration Management Database and it is a feature in ServiceNow that allows users to track and manage all of the assets and configuration items in their IT environment. It is a central repository for all IT assets and their relationships, providing a comprehensive view of the entire IT infrastructure.

📚 Sources:
- K6z4c256gzI.txt
- kQV6g8Vbbfc.txt
- K6z4c256gzI.txt
- kQV6g8Vbbfc.txt
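
The source list above repeats file names because several retrieved chunks can come from the same transcript. A minimal sketch for listing each file once, in retrieval order, follows; `unique_sources` is a hypothetical helper name.

```python
def unique_sources(source_names):
    """Drop repeated file names while keeping first-seen order."""
    # dict keys preserve insertion order, so this de-duplicates stably.
    return list(dict.fromkeys(source_names))


# File names copied from the output above.
retrieved = [
    "K6z4c256gzI.txt",
    "kQV6g8Vbbfc.txt",
    "K6z4c256gzI.txt",
    "kQV6g8Vbbfc.txt",
]
for name in unique_sources(retrieved):
    print("-", name)
```

In the notebook this could be fed with `[doc.metadata["source"] for doc in result["source_documents"]]`.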


In [11]:
question = "How is AI used in ServiceNow Project Management?"
result = qa_chain.invoke({"query": question})  # calling the chain directly is deprecated

print("🧠 Answer:\n", result["result"])
print("\n📚 Sources:")
for doc in result["source_documents"]:
    print("-", doc.metadata["source"])

🧠 Answer:
  AI is used in ServiceNow Project Management through the AI Ops solution, which assists and automates the work of IT operations using AI fundamentals like machine learning and pattern recognition. It also helps optimize the value of AI across the organization, including within Strategic Portfolio Management. This allows for assisted and autonomous workflows and combines data from different sources to create actions through AI agents.

📚 Sources:
- KSWNDuKn9t0.txt
- eFMeZto6yMg.txt
- kQV6g8Vbbfc.txt
- mSYdZW_D67o.txt
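
The validation cells repeat the same printing code for every question. One way to factor that out is a small formatter over the result dictionary, sketched below. `format_answer` is a hypothetical helper name, and the stub result only mimics the `"result"` and `"source_documents"` keys used in the cells above so the example runs without the chain.

```python
from types import SimpleNamespace


def format_answer(result):
    """Render a QA result dict as the answer followed by its source files."""
    lines = ["Answer:", result["result"].strip(), "", "Sources:"]
    for doc in result["source_documents"]:
        lines.append(f"- {doc.metadata['source']}")
    return "\n".join(lines)


# Stub standing in for a real chain result (placeholder values, not real output).
fake_result = {
    "result": "  stub answer",
    "source_documents": [
        SimpleNamespace(metadata={"source": "example.txt"}, page_content="stub text"),
    ],
}
print(format_answer(fake_result))
```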


In [12]:
question = "How does ServiceNow use AI in incident management?"
result = qa_chain.invoke({"query": question})  # calling the chain directly is deprecated

print("🧠 Answer:\n")
print(result["result"])

print("\n📚 Sources (with snippet preview):\n")
for i, doc in enumerate(result["source_documents"]):
    print(f"{i+1}. 📄 {doc.metadata['source']}")
    snippet = doc.page_content.strip()
    # Truncate at 300 characters; the length check must use the same limit as the slice.
    print("   🔎 Snippet:", snippet[:300] + "..." if len(snippet) > 300 else snippet)
    print()

🧠 Answer:

 ServiceNow uses AI in incident management through their AI Ops solution, which utilizes machine learning and pattern recognition to assist and automate the work of IT operations. This helps move away from an incident creation culture to an alert-based, actionable mindset. The AI is built into the ServiceNow platform and is not acquired from other companies, making it a native and additive part of the platform's innovations and investments.

📚 Sources (with snippet preview):

1. 📄 kQV6g8Vbbfc.txt
   🔎 Snippet: Hello everyone, welcome to the KBDI podcast. We are bringing you a new set of topics and you will see a lot of focus on agentic AI starting this year. So as you probably know, ServiceNow is very deep into the three pillars that comes to workflow data fabric that combines all data from different disp...

2. 📄 KSWNDuKn9t0.txt
   🔎 Snippet: Hi, this is Evans from the Technical Product and Solutions Marketing Team at ServiceNow. Today we're going to talk about ServiceNow I