# 🔍 Retrieval Layer (RAG): Extracting the Correct Proposal Sections

## 📌 Objective
The Retrieval layer is responsible for **pulling the most relevant text chunks** from proposals so the LLM can generate clean, accurate sections such as:

- Executive Summary  
- Background / Context  
- Problem / Complication  
- Proposed Solution  
- Approach / Methodology  
- Costing & Pricing

For this POC, we generate only the Executive Summary.

Since every proposal has its own structure and formatting, retrieval must be **robust, hybrid, and section-aware**.
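
One way to picture "hybrid and section-aware" retrieval is sketched below. It is not what this notebook implements (the cells that follow use a plain vector retriever): it assumes a keyword ranking and a vector ranking already exist, keeps only chunks tagged with the wanted section, and merges the two lists with reciprocal rank fusion. The chunk ids, section labels, and function names are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one ranking (best first)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every chunk it contains.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def section_aware_hybrid(keyword_ranking, vector_ranking, sections, wanted_section, top_n=3):
    """Drop chunks from other sections, then fuse the two remaining rankings."""
    def keep(ranking):
        return [cid for cid in ranking if sections.get(cid) == wanted_section]

    fused = reciprocal_rank_fusion([keep(keyword_ranking), keep(vector_ranking)])
    return fused[:top_n]


# Hypothetical chunk ids and section labels, for illustration only.
sections = {
    "p1-exec": "executive_summary",
    "p2-exec": "executive_summary",
    "p3-exec": "executive_summary",
    "p1-cost": "costing",
}
keyword_ranking = ["p1-cost", "p2-exec", "p3-exec", "p1-exec"]
vector_ranking = ["p1-exec", "p2-exec", "p1-cost", "p3-exec"]

top_chunks = section_aware_hybrid(
    keyword_ranking, vector_ranking, sections, "executive_summary"
)
print(top_chunks)
```

Filtering before fusion keeps a strongly matching costing chunk from crowding out executive-summary chunks, which is the point of being section-aware.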

From the vector store, we pull the most relevant executive summaries from real proposals, which show the LLM how an executive summary is actually written. We also pass in the client's own executive summary (or any other relevant section) so the LLM can pick up the firm's tone and maintain it.

We are trying to achieve two things here:

1) Learn from actual proposals how each section is written, and
2) Maintain the firm's tone so that the output does not read as if it were generated by a general-purpose LLM.

In [None]:
import os
from dotenv import load_dotenv
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_groq import ChatGroq

In [None]:
load_dotenv()

In [None]:
bge_embedding = HuggingFaceBgeEmbeddings(
    model_name='BAAI/bge-large-en-v1.5',
    model_kwargs={"device": "cpu"},
)

In [None]:
vector_store = Chroma(
    persist_directory="./bge_chroma_store",
    embedding_function=bge_embedding
)

In [None]:
def format_docs(docs):
    # Join the retrieved chunks with a blank line between them.
    return "\n\n".join(doc.page_content for doc in docs)
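
The helper can be sanity-checked without touching the vector store. The sketch below assumes only that each retrieved document exposes a `page_content` string; `FakeDoc` is a hypothetical stand-in for the real document class, and the separator is a blank line (`"\n\n"`).

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    # Hypothetical stand-in for a retrieved document; only page_content is used.
    page_content: str


def format_docs(docs):
    # Join chunk texts with a blank line so each reads as a separate passage.
    return "\n\n".join(doc.page_content for doc in docs)


sample = [FakeDoc("First executive summary."), FakeDoc("Second executive summary.")]
joined = format_docs(sample)
print(joined)
```

Printing the result makes it easy to confirm that the passages are separated by a real blank line rather than literal separator characters.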

In [None]:
retriever = vector_store.as_retriever()

In [None]:
llm = ChatGroq(
    model="llama-3.1-8b-instant",
    api_key=os.getenv('GROQ_API_KEY'),
    temperature=0,
    model_kwargs={
        "top_p": 1    
    }
)

In [None]:
prompt = PromptTemplate(input_variables=["firm_style", "context"], template="""
You are an expert proposal writer. Your task is to generate the “Executive Summary” section for a Digital Marketing proposal.

Use the following two inputs:
1. CONTEXT: Executive summaries retrieved from real proposals related to digital marketing.
2. FIRM_STYLE: Real proposal examples from the consulting firm. These may be unrelated to Digital Marketing, but must be used to infer tone and writing patterns.

--- STRICT INSTRUCTIONS ---

• First, analyze FIRM_STYLE to understand:
  - Tone (formal, concise, business-oriented, narrative, etc.)
  - Writing structure (bullets vs. paragraphs, storytelling patterns)
  - Vocabulary and phrasing patterns

• Then generate the Executive Summary for Digital Marketing by:
  - Learning the section length (short/long, number of paragraphs) from the given CONTEXT
  - Learning from CONTEXT how different firms actually write an executive summary
  - Applying the tone and structure learned from FIRM_STYLE

• Do NOT copy text from FIRM_STYLE. Only learn style and format.

• Executive Summary must:
  - Clearly articulate client’s situation, challenges, and desired outcomes
  - Present a high-level view of the digital marketing solution
  - Reflect the same level of polish as a real firm proposal
  - Match the expected length based on CONTEXT and FIRM_STYLE

• If CONTEXT lacks some information, keep the narrative general without inventing specifics.

--- OUTPUT FORMAT ---
Write only the “Executive Summary” section.
Do not include headings like CONTEXT ANALYSIS or STYLE ANALYSIS in your output.

CONTEXT:
{context}

FIRM_STYLE:
{firm_style}
""")

In [None]:
chain = (RunnableParallel({
    'firm_style': lambda x: x['firm_style'],
    'context': lambda x: format_docs(retriever.invoke(x['about']))
})
| prompt
| llm
| StrOutputParser())
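
The data flow of this chain can be traced in plain Python. The sketch below is not LangChain code: `fake_retrieve`, `fake_llm`, `TEMPLATE`, and `run_chain` are hypothetical stand-ins for the retriever, the chat model, the prompt template, and the piped chain, so each of the four stages can be seen in isolation.

```python
def fake_retrieve(query):
    # Stand-in for retriever.invoke(query); returns chunk texts directly.
    return ["Example summary A.", "Example summary B."]


def fake_llm(prompt_text):
    # Stand-in for the chat model; reports the prompt size instead of generating text.
    return {"content": f"[model saw {len(prompt_text)} characters]"}


# Shortened stand-in for the real prompt template.
TEMPLATE = "CONTEXT:\n{context}\n\nFIRM_STYLE:\n{firm_style}\n"


def run_chain(inputs):
    # Stage 1: build both prompt variables from the same input dict.
    variables = {
        "firm_style": inputs["firm_style"],
        "context": "\n\n".join(fake_retrieve(inputs["about"])),
    }
    # Stage 2: fill the template placeholders.
    prompt_text = TEMPLATE.format(**variables)
    # Stage 3: call the model.
    message = fake_llm(prompt_text)
    # Stage 4: unwrap the message into a plain string.
    return prompt_text, message["content"]


prompt_text, answer = run_chain(
    {"firm_style": "Concise and formal.", "about": "A digital marketing firm."}
)
print(prompt_text)
print(answer)
```

Note that `about` is used only as the retrieval query; it never appears in the prompt itself, whereas `firm_style` is passed through unchanged.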

In [None]:
about = "We are a performance-focused digital marketing firm specializing in end-to-end growth strategies across SEO, paid media, content, and analytics. Our team blends creative storytelling with data intelligence to drive measurable ROI and long-term brand visibility. We work closely with clients to understand their audience, optimize digital activation, and accelerate engagement across channels. Our solutions are tailored, scalable, and aligned with each client’s evolving market objectives."
firm_style = "Our firm partners with clients to navigate complex business challenges through data-driven insights and structured problem-solving. We emphasize clarity, measurable outcomes, and a collaborative approach that aligns strategic priorities with operational execution. Our proposals maintain a concise narrative supported by evidence-based recommendations. Each engagement is designed to deliver sustainable value with a clear roadmap for implementation."

In [None]:
ans = chain.invoke({'firm_style': firm_style, 'about': about})
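
Since the prompt asks the model not to copy text from FIRM_STYLE, a rough post-check can flag verbatim reuse. The sketch below is one possible heuristic, not part of the original pipeline: it looks for shared runs of six consecutive words. The helper names, the window size, and the sample strings are assumptions chosen for illustration.

```python
def word_ngrams(text, n):
    # Lowercase, whitespace-split word n-grams; punctuation stays attached to words.
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def copied_ngrams(generated, reference, n=6):
    # N-grams that appear verbatim in both texts.
    return word_ngrams(generated, n) & word_ngrams(reference, n)


# Illustrative strings only; in the notebook these would be the model output
# and the firm_style sample.
reference = ("We emphasize clarity, measurable outcomes, and a collaborative "
             "approach that aligns strategic priorities.")
copied = ("Our plan will emphasize clarity, measurable outcomes, and a "
          "collaborative approach for every campaign.")
original = ("The engagement focuses on search visibility, paid media "
            "efficiency, and transparent reporting.")

print(len(copied_ngrams(copied, reference)))
print(len(copied_ngrams(original, reference)))
```

With the notebook's variables the call would be `copied_ngrams(ans, firm_style)`, assuming `ans` is a plain string. A non-empty result suggests the output lifted phrasing rather than only style.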