## RAG

R --> Retrieval
A --> Augmented
G --> Generation

* First, the document is split into chunks, embedded, and stored in a vector database.
* We take a prompt from the user.
* Relevant information is **Retrieved** from the vector database based on the prompt.
* The prompt is **Augmented** with the context from the retrieved chunks.
* The augmented prompt is used to **Generate** the response from the LLM.
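
The five steps above can be sketched without any external library. This is a toy illustration only: `embed` is a bag-of-words stand-in for a neural embedding model, the chunks are hypothetical, and the final LLM call is left out. All names here (`embed`, `cosine`, `retrieve`, `augment`, `toy_chunks`) are assumptions made for the sketch, not part of the notebook.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (a real system uses a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, store, k=2):
    """Retrieval: return the k stored chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def augment(question, retrieved):
    """Augmentation: combine the question and the retrieved context into one prompt."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical chunks standing in for a real document.
toy_chunks = [
    "The meeting was held on Monday to review the budget.",
    "Action items were assigned to the design team.",
    "Lunch was served at noon.",
]
store = [(chunk, embed(chunk)) for chunk in toy_chunks]  # index the document

question = "Which team received the action items?"
toy_prompt = augment(question, retrieve(question, store, k=1))
print(toy_prompt)  # the generation step would send this prompt to an LLM
```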

In [1]:
# Requirements
# pip install faiss-cpu
# pip install langchain-community
# pip install langchain-text-splitters
# pip install sentence-transformers
# pip install pypdf
# pip install google-generativeai

In [4]:
# Import the libraries
import os
import google.generativeai as genai
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from pypdf import PdfReader
from langchain_text_splitters import RecursiveCharacterTextSplitter

import faiss
from langchain_community.vectorstores import FAISS


All support for the `google.generativeai` package has ended. It will no longer be receiving 
updates or bug fixes. Please switch to the `google.genai` package as soon as possible.
See README for more details:

https://github.com/google-gemini/deprecated-generative-ai-python/blob/main/README.md

  import google.generativeai as genai
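
The warning above recommends moving to the `google.genai` package. A minimal sketch of the equivalent calls follows, assuming the newer SDK is installed with `pip install google-genai` and that the same `GOOGLE_API_KEY2` environment variable is used. The helper names `make_client` and `generate_text` are hypothetical, and the rest of this notebook still uses the older package.

```python
import os


def make_client(env_var="GOOGLE_API_KEY2"):
    """Create a client with the newer SDK (assumes `pip install google-genai`)."""
    from google import genai  # imported lazily so generate_text can be tested offline

    return genai.Client(api_key=os.getenv(env_var))


def generate_text(client, prompt_text, model_name="gemini-2.5-flash-lite"):
    """Send one prompt and return the text of the reply."""
    response = client.models.generate_content(model=model_name, contents=prompt_text)
    return response.text
```

With a valid key, `generate_text(make_client(), "Hello")` would play the role that `model.generate_content(...)` plays later in this notebook.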


In [5]:
# Step 1: Configure the generative model
gemini_key = os.getenv('GOOGLE_API_KEY2')
genai.configure(api_key=gemini_key)
model = genai.GenerativeModel('gemini-2.5-flash-lite')

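# Configure the embedding model
# Note: HuggingFaceBgeEmbeddings targets BGE models; HuggingFaceEmbeddings is the
# general-purpose class for other sentence-transformers models such as this one.
embedding_model = HuggingFaceBgeEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')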


In [6]:
# Step 2: Load the document and extract its text

pdf_file = PdfReader(r'D:\python.workspace\Minutes of meeting\MoMGenerator\Minutes of Meeting.pdf')

raw_text = ''
for page in pdf_file.pages:
    text = page.extract_text()
    if text:
        raw_text += text + '\n'

In [7]:
print(raw_text)

Case Study: AI-Powered Modular Minutes of Meeting 
(MoM) Generator 
GitHub Repository: github.com/mukul-mschauhan/Minutes-of-Meeting 
Live App: generate-mom.streamlit.app 
 
Executive Summary 
In the dynamic landscape of construction, civil, and project management domains, 
recording, interpreting, and structuring meeting minutes remains a labor-intensive, error-
prone, and often unstandardized process. With the proliferation of hybrid documentation 
formats such as handwritten notes, scanned PDFs, or mobile-clicked images, teams 
struggle to translate raw information into actionable, uniform, and digitally processable 
records. This case study introduces an innovative solution: an AI-powered Modular MoM 
Generator that leverages advanced vision and generative AI models to streamline, 
standardize, and automate the generation of high-quality Minutes of Meeting across 
multiple formats. 
 
Problem Statement 
Despite the rising adoption of ERP and project collaboration platforms, large-s
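
The printed text shows artifacts typical of PDF extraction: trailing spaces, whitespace-only lines, and compound words split at a line end ("error-" / "prone"). The retrieved chunks further down also contain the private-use bullet character `\uf0b7`. A small cleanup pass before chunking could look like the sketch below. `clean_pdf_text` is a hypothetical helper, and keeping the hyphen when rejoining words is an assumption that suits compounds like "error-prone" but not every document.

```python
import re


def clean_pdf_text(text):
    """Tidy common PDF-extraction artifacts before chunking."""
    # Replace the private-use bullet glyph with a plain dash.
    text = text.replace("\uf0b7", "-")
    # Rejoin words split after a line-end hyphen, keeping the hyphen.
    text = re.sub(r"(\w)-[ \t]*\n(\w)", r"\1-\2", text)
    # Strip trailing spaces and tabs at line ends.
    text = re.sub(r"[ \t]+\n", "\n", text)
    # Collapse runs of three or more newlines into one blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


sample_text = "labor-intensive, error-\nprone process \n\n\n\n\uf0b7 Item"
print(clean_pdf_text(sample_text))
```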

In [8]:
# Step 3: Chunking
# Split the text into chunks of up to 1000 characters, with up to 200 characters of overlap

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(raw_text)


In [9]:
len(chunks)

7
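
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally: that class prefers to break on separators such as paragraph breaks, newlines, and spaces, so its chunk boundaries will differ. The function name `sliding_window_chunks` is an assumption for this sketch.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return pieces


sample = "abcdefghijklmnopqrstuvwxyz"
print(sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4))
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

Each chunk repeats the last four characters of the previous one, which is what lets a sentence that straddles a boundary appear whole in at least one chunk.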

In [10]:
# Step 4: Create the vector database (embed every chunk and index the vectors)
vector_store = FAISS.from_texts(chunks, embedding_model)
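
Conceptually, a vector store keeps one embedding per chunk and, at query time, returns the chunks whose embeddings lie closest to the query embedding. The sketch below does this by brute force with squared Euclidean distance on hand-made 3-dimensional vectors. `TinyVectorStore` and the vectors are hypothetical; a real store would hold the 384-dimensional vectors produced by `all-MiniLM-L6-v2`, and FAISS offers faster index types than this exhaustive scan.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinyVectorStore:
    """Illustrative in-memory store: parallel lists of texts and vectors."""

    def __init__(self):
        self.texts = []
        self.vectors = []

    def add(self, text, vector):
        self.texts.append(text)
        self.vectors.append(vector)

    def search(self, query_vector, k=3):
        """Return the k (text, distance) pairs closest to query_vector."""
        scored = sorted(
            (squared_l2(query_vector, vec), text)
            for vec, text in zip(self.vectors, self.texts)
        )
        return [(text, dist) for dist, text in scored[:k]]


tiny = TinyVectorStore()
tiny.add("chunk about budgets", [1.0, 0.0, 0.0])
tiny.add("chunk about schedules", [0.0, 1.0, 0.0])
tiny.add("chunk about authors", [0.0, 0.0, 1.0])

# The query vector points mostly along the third axis, so "authors" ranks first.
print([text for text, _ in tiny.search([0.1, 0.0, 0.9], k=2)])
```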


In [11]:
# Step 5: Get the prompt from the user

prompt = 'Give the brief introduction of the authors of this report.'


In [12]:
# Step 6: Retrieval (R)
retriever = vector_store.as_retriever(search_kwargs={'k': 3})
retrived_docs = retriever.invoke(prompt)

In [13]:
retrived_docs

[Document(id='ccf2e26c-ec70-4d58-b1d9-b9d5c6eba71b', metadata={}, page_content='standardize, and automate the generation of high-quality Minutes of Meeting across \nmultiple formats. \n \nProblem Statement \nDespite the rising adoption of ERP and project collaboration platforms, large-scale \nindustries such as construction, real estate, civil engineering, and infrastructure still rely \nheavily on manual note-taking and non-standard formats to record meeting discussions. \nKey challenges include: \n\uf0b7 Fragmented Documentation: Teams record updates via notebooks, WhatsApp \nimages, printed PDFs, or Excel sheets—making it hard to extract a coherent \nsummary. \n\uf0b7 Loss of Accountability: Without a structured format, assigning responsibilities, \ndeadlines, or measuring progress becomes cumbersome. \n\uf0b7 Delayed Decision Making: Project delays often occur due to missed \ncommunication or unrecorded discussions in project meetings. \n\uf0b7 Data Silos: Unstructured and siloed M

In [14]:
context = '\n'.join([d.page_content for d in retrived_docs])

In [15]:
context

'standardize, and automate the generation of high-quality Minutes of Meeting across \nmultiple formats. \n \nProblem Statement \nDespite the rising adoption of ERP and project collaboration platforms, large-scale \nindustries such as construction, real estate, civil engineering, and infrastructure still rely \nheavily on manual note-taking and non-standard formats to record meeting discussions. \nKey challenges include: \n\uf0b7 Fragmented Documentation: Teams record updates via notebooks, WhatsApp \nimages, printed PDFs, or Excel sheets—making it hard to extract a coherent \nsummary. \n\uf0b7 Loss of Accountability: Without a structured format, assigning responsibilities, \ndeadlines, or measuring progress becomes cumbersome. \n\uf0b7 Delayed Decision Making: Project delays often occur due to missed \ncommunication or unrecorded discussions in project meetings. \n\uf0b7 Data Silos: Unstructured and siloed MoMs prevent integration with digital \ndashboards or knowledge systems.\n\uf0b7
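
Joining the chunks with a newline works, but the boundaries between passages are lost and the context can grow without limit as `k` increases. One possible refinement, sketched here as an assumption rather than as part of the notebook, numbers each passage and stops at a character budget. `build_context` and `max_chars` are hypothetical names.

```python
def build_context(passages, max_chars=3000):
    """Number each passage and stop before the character budget is exceeded."""
    parts = []
    used = 0
    for i, passage in enumerate(passages, start=1):
        block = f"[{i}] {passage.strip()}"
        if parts and used + len(block) > max_chars:
            break  # always keep at least the top-ranked passage
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)


print(build_context(["  first passage  ", "second passage"], max_chars=100))
```

With the objects in this notebook, the call would be `build_context([d.page_content for d in retrived_docs])`.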

In [None]:
# Step 7: Augmenting (A)
augmented_prompt = f'''
<Role> You are a helpful assistant using RAG.
<Goal> Answer the question asked by the user. Here is the question: {prompt}
<Context> Here are the documents retrieved from the vector database to support the answer you have to generate: {context}'''
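
The same template can be wrapped in a function so that it is reusable across questions. The sketch below also adds an explicit fallback rule, which helps when the retrieved chunks do not contain the answer. The function name `build_rag_prompt` and the `<Rule>` line are assumptions for illustration, not part of the original prompt.

```python
def build_rag_prompt(question, context):
    """Assemble a grounded prompt; the wording is illustrative only."""
    return (
        "<Role> You are a helpful assistant that answers using only the supplied context.\n"
        f"<Goal> Answer the user's question: {question}\n"
        "<Rule> If the context does not contain the answer, say that you do not know.\n"
        f"<Context>\n{context}"
    )


print(build_rag_prompt("Who wrote the report?", "Hypothetical retrieved text."))
```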

In [17]:
# Step 8: Generation (G)
response = model.generate_content(augmented_prompt)
print(response.text)

