# Retrieval Augmented Generation

Table of Contents
1. Idea
2. Naive Implementation
3. Graph Implementation

In [1]:
import pandas as pd
from language_models.proxy_client import BTPProxyClient
from language_models.agents.react import ReActAgent
from language_models.agents.chain import AgentChain
from language_models.tools.tool import Tool
from language_models.models.llm import OpenAILanguageModel
from language_models.models.embedding import SentenceTransformerEmbeddingModel
from language_models.retrievers import BasicRetriever, ContextualCompressionRetriever
from language_models.retrievers.utils import split_documents
from language_models.vector_stores import FAISSVectorStore, DistanceMetric
from language_models.settings import settings
from langchain_core.documents import Document
from pydantic import BaseModel, Field
from numpy import dot
from numpy.linalg import norm
from pathlib import Path
from utils import load_docs_from_json, save_docs_to_json
from pprint import pprint

02/06/24 20:38:25 INFO Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2
02/06/24 20:38:31 INFO Use pytorch device_name: mps
02/06/24 20:38:31 INFO Loading faiss.
02/06/24 20:38:31 INFO Successfully loaded faiss.


In [2]:
proxy_client = BTPProxyClient(
    client_id=settings.CLIENT_ID,
    client_secret=settings.CLIENT_SECRET,
    auth_url=settings.AUTH_URL,
    api_base=settings.API_BASE,
)

In [3]:
path = Path("./data/jobs")
filenames = [file.name for file in path.iterdir() if file.is_file()]

documents = []
for filename in filenames:
    file_path = path / filename
    with open(file_path, "r", encoding="utf-8", errors="replace") as file:
        content = file.read()
        documents.append(Document(page_content=content, metadata={"source": file_path}))

In [4]:
system_prompt = """Take the following job and extract data about the job.

Respond with the following extracted data:
- job_title: The job title."""

llm = OpenAILanguageModel(
    proxy_client=proxy_client,
    model='gpt-4',
    max_tokens=128,
    temperature=0.7,
)

class Job(BaseModel):
    job_title: str = Field(description="The job title.")

job_data_agent = ReActAgent.create(
    llm=llm,
    system_prompt=system_prompt,
    task_prompt="Job description:\n{job}",
    task_prompt_variables=["job"],
    tools=None,
    output_format=Job,
    iterations=5,
)

In [5]:
def extract_job_titles(documents: list[Document]) -> list[Document]:
    for document in documents:
        response = job_data_agent.invoke({"job": document.page_content})
        job_title = response.final_answer["job_title"]
        document.metadata["job_title"] = job_title
        job_data_agent.reset()
    return documents

In [6]:
try:
    documents = load_docs_from_json('./data/jobs.json')
except Exception:  # fall back to extracting titles if the cached file cannot be loaded
    documents = extract_job_titles(documents[:30])
    save_docs_to_json(documents, './data/jobs.json')

In [7]:
documents = split_documents(documents, separators=["\n\n", "\n", " ", ""], chunk_size=1000, chunk_overlap=100)

## Idea

Retrieval-augmented generation (RAG) lets an LLM consult a knowledge base outside its training data before it generates a response. This is particularly useful for documents or content specific to your business that the LLM has never seen. Grounding the LLM in this external information enables it to answer questions on those topics. How well it does so depends on how retrieval is implemented.

![rag](../img/rag.png)

In [8]:
llm = OpenAILanguageModel(
    proxy_client=proxy_client,
    model="gpt-4",
    max_tokens=256,
    temperature=0.7,
)

As a company, part of our operations involves hiring individuals for a variety of positions. To facilitate the hiring process and improve efficiency, we've incorporated the use of an LLM to assist us. Our hiring process entails the collaboration of various entities, including recruiters, hiring managers, engineers, applicants, departments, and the specific job roles themselves. For this demonstration, we'll use the LLM to respond to inquiries related to our recruitment procedures and overall business operations.

In [9]:
system_prompt = "You are an expert in job postings. Respond with the most accurate information about the job."

class Output(BaseModel):
    content: str = Field(description="The final answer.")

agent = ReActAgent.create(
    llm=llm,
    system_prompt=system_prompt,
    task_prompt="{question}",
    task_prompt_variables=["question"],
    output_format=Output,
    iterations=5,
)

In this example, we ask the LLM about the salary range for one of our job positions, an airport engineer. Since the LLM has no specific information about our company, it falls back on the general knowledge it acquired during training.

In [10]:
response = agent.invoke({"question": "What is the salary range of an airport engineer."})

02/06/24 20:38:34 INFO Prompt:
What is the salary range of an airport engineer.
02/06/24 20:38:40 INFO Raw response:
{
  "thought": "I need to provide information on the salary range of an airport engineer.",
  "tool": "Final Answer",
  "tool_input": {
    "content": "The salary range of an airport engineer can vary widely depending on the location, years of experience, specific duties, and other factors. On average, the salary can range from $65,000 to over $100,000 per year in the United States. However, this can be higher in areas with a higher cost of living or for more specialized roles."
  }
}
02/06/24 20:38:40 INFO Thought:
I need to provide information on the salary range of an airport engineer.
02/06/24 20:38:40 INFO Final answer:
{'content': 'The salary range of an airport engineer can vary widely depending on the location, years of experience, specific duties, and other factors. On average, the salary can range from $65,000 to over $100,000 per year in the United States. How

In [11]:
pprint(response.final_answer["content"])

('The salary range of an airport engineer can vary widely depending on the '
 'location, years of experience, specific duties, and other factors. On '
 'average, the salary can range from $65,000 to over $100,000 per year in the '
 'United States. However, this can be higher in areas with a higher cost of '
 'living or for more specialized roles.')


Here, we provide the necessary information about the job from our document, which grounds the LLM so it can give the user an accurate response.

In [12]:
question = """What is the salary range of an airport engineer.

Use this context to answer the question:
AIRPORT ENGINEER
Class Code:       7256
Open Date:  07-06-18
(Exam Open to All, including Current City Employees)

ANNUAL SALARY

$105,005 to $153,509 and $111,854 to $163,532."""

response = agent.invoke({"question": question})

02/06/24 20:38:40 INFO Prompt:
What is the salary range of an airport engineer.

Use this context to answer the question:
AIRPORT ENGINEER
Class Code:       7256
Open Date:  07-06-18
(Exam Open to All, including Current City Employees)

ANNUAL SALARY

$105,005 to $153,509 and $111,854 to $163,532.
02/06/24 20:38:43 INFO Raw response:
{
  "thought": "The salary range for the airport engineer position is provided in the job posting.",
  "tool": "Final Answer",
  "tool_input": {
    "content": "The annual salary for the Airport Engineer position ranges from $105,005 to $163,532."
  }
}
02/06/24 20:38:43 INFO Thought:
The salary range for the airport engineer position is provided in the job posting.
02/06/24 20:38:43 INFO Final answer:
{'content': 'The annual salary for the Airport Engineer position ranges from $105,005 to $163,532.'}


In [13]:
pprint(response.final_answer["content"])

('The annual salary for the Airport Engineer position ranges from $105,005 to '
 '$163,532.')


To find relevant documents automatically, we need a way to compare the user's question with our documents. We do this with embeddings: an embedding model converts text into a numerical representation, a vector, that captures its semantic meaning.

![embedding](../img/embedding.png)

In [14]:
embedding_model = SentenceTransformerEmbeddingModel(model="all-MiniLM-L6-v2")

In [15]:
query = "What is the salary range of an airport engineer."
embedding1 = embedding_model.embed_query(query)

In [16]:
print(embedding1)

[0.061040811240673065, 0.006013249978423119, 0.012034785002470016, 0.09349054843187332, -0.03897635266184807, -0.041639544069767, 0.052157122641801834, 0.04001327604055405, -0.040530331432819366, -0.0388326458632946, -0.057182084769010544, -0.07547193765640259, -0.04045605659484863, 0.005403607618063688, -0.053804319351911545, 0.0300196073949337, -0.019651353359222412, -0.0824994146823883, 0.030851002782583237, -0.10073095560073853, 0.07586315274238586, 0.032726261764764786, 0.07222713530063629, -0.0612051896750927, 0.1382749229669571, -0.014632114209234715, 0.10441509634256363, 0.034444037824869156, 0.07069538533687592, 0.02793906442821026, 0.0020828156266361475, 0.039086028933525085, -0.02007865719497204, 0.019540050998330116, 0.043947625905275345, 0.01898784562945366, -0.025768116116523743, -0.002511907834559679, 0.05089978873729706, 0.005661378148943186, -0.03936994448304176, 0.018286826089024544, 0.010302698239684105, -0.017223015427589417, -0.061629876494407654, -0.02500708401203

Next, we also convert our documents into vectors so we can compare them to the user's question. By embedding both the question and the documents, we project the user's question into the vector space of the documents to identify the closest neighbors, such as the 5 most similar documents.

![vector-space](../img/vector-space.png)
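As a minimal sketch of the nearest-neighbor lookup described above, the snippet below ranks a few made-up 3-dimensional vectors by cosine similarity to a query vector. The vectors, the labels in the comments, and the helper `top_k_neighbors` are illustrative assumptions; real embeddings from `all-MiniLM-L6-v2` have 384 dimensions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot_product = sum(x * y for x, y in zip(a, b))
    return dot_product / (math.hypot(*a) * math.hypot(*b))


def top_k_neighbors(query: list[float], doc_vectors: list[list[float]], k: int = 5) -> list[tuple[int, float]]:
    """Return (index, similarity) pairs for the k vectors most similar to the query."""
    scored = [(i, cosine(query, vec)) for i, vec in enumerate(doc_vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (hypothetical values, not produced by a real model).
toy_docs = [
    [0.9, 0.1, 0.0],  # 0: airport engineer posting
    [0.8, 0.3, 0.1],  # 1: airport police captain posting
    [0.0, 0.2, 0.9],  # 2: gallery attendant posting
]
toy_query = [1.0, 0.1, 0.0]

neighbors = top_k_neighbors(toy_query, toy_docs, k=2)
print([index for index, _ in neighbors])
```

Note that the second-closest neighbor in this toy example is a different job that merely shares the airport topic, which hints at the retrieval problem discussed later in this notebook.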

In [17]:
document = """AIRPORT ENGINEER
Class Code:       7256
Open Date:  07-06-18
(Exam Open to All, including Current City Employees)

ANNUAL SALARY

$105,005 to $153,509 and $111,854 to $163,532."""

embedding2 = embedding_model.embed_query(document)

To compute the similarities between vectors, we can use mathematical formulas such as cosine similarity, euclidean distance, or inner product. The output will indicate how similar the vectors are.

The metrics differ in their value ranges and in how to read them:

- **Euclidean distance:** Measures the straight-line distance between two points. Smaller values indicate greater similarity.
- **Cosine similarity:** Measures the cosine of the angle between two vectors, ranging from -1 to 1. Values closer to 1 indicate greater similarity.
- **Inner product:** Measures the dot product of two vectors. Higher values typically indicate greater similarity.
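To make the differences concrete, here is a small sketch that computes all three metrics for two hand-picked vectors that point in the same direction but differ in length. The vectors are arbitrary example values, not embeddings.

```python
import math

a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]  # same direction as a, twice the length

# Inner product: grows with both alignment and vector length.
inner_product = sum(x * y for x, y in zip(a, b))
# Euclidean distance: straight-line distance, sensitive to length differences.
euclidean_distance = math.dist(a, b)
# Cosine similarity: depends only on the angle between the vectors.
cosine_similarity = inner_product / (math.hypot(*a) * math.hypot(*b))

print(f"Inner product:      {inner_product}")
print(f"Euclidean distance: {euclidean_distance}")
print(f"Cosine similarity:  {cosine_similarity}")
```

Cosine similarity treats the two vectors as identical in direction, while the Euclidean distance is nonzero because their lengths differ. For vectors normalized to unit length, the inner product equals the cosine similarity.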

In [18]:
cosine_similarity = dot(embedding1, embedding2) / (norm(embedding1) * norm(embedding2))
print(f"Cosine similarity: {cosine_similarity}")

Cosine similarity: 0.7345214541574135


## Naive Implementation

For the simplest implementation, we split our documents into smaller chunks where necessary, embed the chunks, and store them in a vector store. For this demonstration, we'll use FAISS.
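The chunking step can be pictured as a sliding window with overlap. The sketch below is a simplified, character-based stand-in and not the implementation of `split_documents` used earlier, which additionally takes a list of separators; `chunk_text` is a hypothetical helper.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

The overlap means that a sentence cut at a chunk boundary still appears, at least partly, in the neighboring chunk.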

In [19]:
try:
    vector_store = FAISSVectorStore.load_local("./data", "job_embeddings")
except Exception:  # build and save the index if no saved index can be loaded
    vector_store = FAISSVectorStore.from_documents(
        documents=documents,
        embedding_model=embedding_model,
        distance_metric=DistanceMetric.COSINE_SIMILARITY,
    )
    vector_store.save_local("./data", "job_embeddings")

Adding context from documents to every user question increases cost and can even worsen answer quality when the LLM doesn't need external knowledge. Instead, we'll expose the RAG functionality as a tool, so the LLM can decide autonomously when to search the documents.

In [20]:
tool_name = "Search"
tool_description = "Use this tool to search job postings."

class Search(BaseModel):
    user_text: str = Field(description="The user question/prompt/text.")
    fetch_k: int = Field(5, description="The number of documents to return.")

In our basic retriever, we've implemented everything discussed so far. When a user asks a question, the LLM uses the search tool to find relevant documents and then provides an answer based on the information from those documents.

In [21]:
basic_retriever = BasicRetriever(
    vector_store=vector_store,
    score_threshold=0.0
)

basic_retriever_tool = Tool(
    func=basic_retriever.get_relevant_documents,
    name=tool_name,
    description=tool_description,
    args_schema=Search,
)

In [22]:
system_prompt = """You are an expert in job postings. Respond with the most accurate information about the job.

Use the search tool to answer the user's question."""

class Output(BaseModel):
    content: str = Field(description="The final answer.")

basic_retriever_agent = ReActAgent.create(
    llm=llm,
    system_prompt=system_prompt,
    task_prompt="{question}",
    task_prompt_variables=["question"],
    output_format=Output,
    tools=[basic_retriever_tool],
    iterations=5,
)

In [23]:
response = basic_retriever_agent.invoke({"question": "Give me the job description of an airport engineer."})

02/06/24 20:38:46 INFO Prompt:
Give me the job description of an airport engineer.
02/06/24 20:38:49 INFO Raw response:
{
  "thought": "I need to find a job description for an airport engineer.",
  "tool": "Search",
  "tool_input": {"user_text": "airport engineer job description", "fetch_k": 5}
}
02/06/24 20:38:49 INFO Thought:
I need to find a job description for an airport engineer.
02/06/24 20:38:49 INFO Tool:
Search
02/06/24 20:38:49 INFO Tool input:
{'user_text': 'airport engineer job description', 'fetch_k': 5}
02/06/24 20:38:49 INFO Tool response:
The examination will consist entirely of an evaluation of experience and personal qualifications by interview.  In the interview, the following competencies may be evaluated: Judgment and Decision Making, Initiative, Conscientiousness, Innovation, Emotional Maturity, Credibility, Leadership, and Job Knowledge, including knowledge of: Los Angeles World Airport organizational structure; local, state, and federal laws and regulations gove

In [24]:
pprint(response.final_answer["content"])

('An Airport Engineer performs professional engineering work in the planning, '
 'design, construction, maintenance, and operation of landside facilities, '
 'structures, pavement and support systems at an airport. They also apply '
 'sound supervisory principles and techniques in building and maintaining an '
 'effective work force and fulfill equal employment opportunity '
 'responsibilities. Their job knowledge includes engineering principles and '
 'practices related to civil, structural, electrical, mechanical, and/or '
 'communications; technical requirements and regulations related to the '
 'Federal Aviation Administration (FAA) and Transportation Security '
 'Administration (TSA) standards; layout, functions and components of airfield '
 'operational areas (AOA); and layout, functions, and components of airport '
 'terminal facility operations.')


Another popular approach is contextual compression, which uses two LLM roles. The first evaluates the documents retrieved from the vector store one at a time and filters out those that are irrelevant to the user's question. The remaining documents are passed to the second LLM, which interacts directly with the user and provides the final answer. In this demonstration, both roles use the same `llm` instance.
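The filtering step can be sketched as follows. This is not the implementation of `ContextualCompressionRetriever`; `compress_context` is a hypothetical helper, and `keyword_judge` is a deliberately trivial stand-in for the LLM call that judges relevance, so that the example runs without a model.

```python
from typing import Callable


def compress_context(
    question: str,
    documents: list[str],
    is_relevant: Callable[[str, str], bool],
) -> list[str]:
    """Keep only the documents that the judge considers relevant to the question."""
    kept = []
    for document in documents:  # documents are judged one at a time
        if is_relevant(question, document):
            kept.append(document)
    return kept


def keyword_judge(question: str, document: str) -> bool:
    """Stand-in for the filtering LLM: relevant if the document mentions 'engineer'."""
    return "engineer" in document.lower()


retrieved = [
    "AIRPORT ENGINEER: plans and designs landside facilities.",
    "AIRPORT POLICE CAPTAIN: directs law enforcement at the airport.",
]
filtered = compress_context("What does an airport engineer do?", retrieved, keyword_judge)
print(filtered)
```

In a real setup, each call to the judge is an LLM request, so the filtering cost grows with the number of retrieved documents.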

In [25]:
contextual_compression_retriever = ContextualCompressionRetriever(
    llm=llm,
    vector_store=vector_store,
    score_threshold=0.0,
)

contextual_compression_retriever_tool = Tool(
    func=contextual_compression_retriever.get_relevant_documents,
    name=tool_name,
    description=tool_description,
    args_schema=Search,
)

In [26]:
system_prompt = """You are an expert in job postings. Respond with the most accurate information about the job.

Use the search tool to answer the user's question."""

class Output(BaseModel):
    content: str = Field(description="The final answer.")

contextual_compression_retriever_agent = ReActAgent.create(
    llm=llm,
    system_prompt=system_prompt,
    task_prompt="{question}",
    task_prompt_variables=["question"],
    output_format=Output,
    tools=[contextual_compression_retriever_tool],
    iterations=5,
)

In [27]:
response = contextual_compression_retriever_agent.invoke({"question": "Give me the job description of an airport engineer."})

02/06/24 20:38:55 INFO Prompt:
Give me the job description of an airport engineer.
02/06/24 20:38:58 INFO Raw response:
{
  "thought": "I need to find the job description for an airport engineer.",
  "tool": "Search",
  "tool_input": {"user_text": "airport engineer job description", "fetch_k": 5}
}
02/06/24 20:38:58 INFO Thought:
I need to find the job description for an airport engineer.
02/06/24 20:38:58 INFO Tool:
Search
02/06/24 20:38:58 INFO Tool input:
{'user_text': 'airport engineer job description', 'fetch_k': 5}
02/06/24 20:39:05 INFO Tool response:
AIRPORT ENGINEER
Class Code:       7256
Open Date:  07-06-18
(Exam Open to All, including Current City Employees)

ANNUAL SALARY 
 
$105,005 to $153,509 and $111,854 to $163,532     

NOTES:

1. For information regarding reciprocity between the City of Los Angeles departments and LADWP, go to http://per.lacity.org/Reciprocity_CityDepts_and_DWP.pdf.
2. The current salary range is subject to change. You may confirm the starting salar

In [28]:
pprint(response.final_answer["content"])

('An Airport Engineer performs professional engineering work in the planning, '
 'design, construction, maintenance, and operation of landside facilities, '
 'structures, pavement and support systems at an airport. The duties also '
 'include applying sound supervisory principles and techniques in building and '
 'maintaining an effective work force, and fulfilling equal employment '
 'opportunity responsibilities. The job requires knowledge of engineering '
 'principles and practices related to civil, structural, electrical, '
 'mechanical, and/or communications; technical requirements and regulations '
 'related to the Federal Aviation Administration (FAA) and Transportation '
 'Security Administration (TSA) standards; layout, functions and components of '
 'airfield operational areas (AOA); layout, functions, and components of '
 'airport terminal facility operations; and other necessary knowledge, skills, '
 'and abilities.')


Every naive RAG implementation, including the two demonstrated above, runs into the same challenge.

Naive RAG is adequate when the vector space covers a single, narrowly defined subject, for instance only documents about earthquakes or about one particular car model. Beyond that, it cannot guarantee that the retrieved documents contain the context needed to answer the query. When we project an embedding into the vector space and take the 5 nearest neighbors, nothing ensures that all of them relate to the object of interest.

![naive-rag-vector-space](../img/naive-rag-vector-space.png)

In our basic demonstration, the vector space contains only documents about the job listings, yet the LLM's responses remain subpar. The logs of the LLM's chain of thought show that the search tool retrieves irrelevant documents, such as those about unrelated roles like an airport police captain. Whether and how the information about the airport police captain ends up in the answer is left to the LLM. We can improve the process by excluding such irrelevant documents from the start.

Now, envision a scenario where our document repository extends beyond job-related materials to encompass various entities integral to our recruitment process. Applicants provide documents such as resumes and cover letters; recruiters, engineers, and managers maintain candidate-related notes from interviews; departments house documents outlining team compositions and project details for prospective hires. If all these documents were incorporated into the vector space, it would worsen the issue, leading to further deterioration in the LLM's responses.

This challenge arises from the ambiguity in document usage and accessibility. For instance, when querying about an airport engineer position, applicant documents may mention prior experience in the role, recruiters/engineers may record interactions related to airport engineering roles, and departments may outline their need for such positions. This lack of control over document utilization and access complicates the task and contributes to the degradation of the LLM's responses. For improved outcomes, we can use knowledge graphs.

## Graph Implementation

Implementing knowledge graph-based RAG presents a pragmatic solution. When considering knowledge graphs, we often envision structures similar to the figure below.

![knowledge-graph](../img/knowledge-graph.png)

Nevertheless, we need to rethink what a knowledge graph is in the context of LLMs. Ideally, we create a digital twin of our business processes, in which nodes represent objects and edges represent the links between them. From the use case introduced earlier, we already know the entities involved in the hiring process:

- **Applicants:** Provide documents such as cover letters and resumes, along with metadata like name, birthday, degrees, and responses to application questions.
- **Recruiters, engineers, and hiring managers:** May maintain notes about applicants from interviews, alongside metadata such as name, birthday, and department affiliation.
- **Departments:** Provide both metadata and documents.
- **Job postings:** Contain metadata such as job title, salary range, department, and required technical skills.

The figure below provides a basic outline of the process. While it may not depict every detail accurately, it conveys the overall concept.

![knowledge-graph-vector-space](../img/knowledge-graph-vector-space.png)

Now that we've established the graph, we can utilize traversal methods like depth-first search and breadth-first search. This enables us to initially locate the relevant object or entity related to the question. Moreover, we can leverage the available metadata to refine our search process, effectively reducing the pool of documents to be searched. Essentially, this allows us to exclude entirely irrelevant documents and focus solely on those that are highly relevant. This structure also provides enhanced control and security. We can decide how the graph is traversed and specify which data, including metadata, the LLM is permitted to access.
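As a rough sketch of this idea, the snippet below runs a breadth-first search over a tiny hand-written graph and collects only the document nodes reachable from a starting entity within a given number of hops. The node names, the metadata, and `reachable_documents` are hypothetical and only illustrate the principle.

```python
from collections import deque

# Hypothetical toy graph: nodes carry metadata, edges link related entities.
nodes = {
    "job:airport_engineer": {"type": "job", "job_title": "AIRPORT ENGINEER"},
    "doc:posting_1": {"type": "document"},
    "dept:airports": {"type": "department"},
    "doc:team_notes": {"type": "document"},
    "job:gallery_attendant": {"type": "job", "job_title": "GALLERY ATTENDANT"},
    "doc:posting_2": {"type": "document"},
}
edges = {
    "job:airport_engineer": ["doc:posting_1", "dept:airports"],
    "dept:airports": ["doc:team_notes"],
    "job:gallery_attendant": ["doc:posting_2"],
}


def reachable_documents(start: str, max_depth: int) -> list[str]:
    """Breadth-first search from start, returning document nodes within max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if nodes[node]["type"] == "document":
            found.append(node)
        if depth == max_depth:
            continue  # do not expand beyond the allowed depth
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return found


print(reachable_documents("job:airport_engineer", max_depth=1))
print(reachable_documents("job:airport_engineer", max_depth=2))
```

Documents attached to other jobs are never visited, and the depth limit is one simple way to control how much of the graph the LLM is allowed to see. A similarity search would then run only over the documents returned here.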

In [29]:
def create_dataset(documents: list[Document]) -> pd.DataFrame:
    data = []
    for document in documents:
        embedding = embedding_model.embed_query(document.page_content)
        data.append({
            "job_title": document.metadata.get("job_title") or "",
            "text": document.page_content,
            "embedding": embedding,
            "source": document.metadata.get("source") or "",
        })
    return pd.DataFrame(data)

In [30]:
df = create_dataset(documents)
data = {"jobs": df}

In our simple scenario, we work only with job-related data and retain only the job title as metadata. We implement a chain of two LLM agents. The first identifies the exact job title from the question. The second uses that job title as a metadata filter to narrow the document pool and then runs a similarity search over the remaining chunks to find the most relevant passages, here those for the airport engineer role.

| Input |
|------|
| question |

<br/>

| LLM that finds the job title |
|------|
| `Input` question |
| `Output` job title |

<br/>

| LLM that answers the user's question |
|------|
| `Input` job title |
| `Output` content |

<br/>

| Output |
|------|
| content |

In [31]:
def get_available_job_titles() -> list[str]:
    return data["jobs"].job_title.unique().tolist()

get_jobs_tool = Tool(
    func=get_available_job_titles,
    name="Get Available Job Titles",
    description="Use this tool to get the job titles.",
    args_schema=None,
)

In [32]:
class Job(BaseModel):
    job_title: str = Field(description="The job title.")

job_agent = ReActAgent.create(
    llm=llm,
    system_prompt="",
    task_prompt="{question} \n\nRespond with the job title.",
    task_prompt_variables=["question"],
    output_format=Job,
    tools=[get_jobs_tool],
    iterations=10,
)

In [33]:
class Search(BaseModel):
    user_text: str = Field(description="The user question/prompt/text.")
    fetch_k: int = Field(5, description="The number of documents to return.")
    job_title: str = Field(description="The job title to filter for. Must be all caps.")

def search(user_text: str, fetch_k: int, job_title: str) -> str:

    def calculate_cosine_similarity(user_text_embedding, embedding):
        cosine_similarity = dot(user_text_embedding, embedding) / (norm(user_text_embedding) * norm(embedding))
        return cosine_similarity

    user_text_embedding = embedding_model.embed_query(user_text)
    df = data["jobs"]
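    # Compare case-insensitively: some stored titles (e.g. 'Gallery Attendant') are not all caps.
    df = df.loc[df["job_title"].str.upper() == job_title.upper()].copy()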
    df["cosine_similarity"] = df.embedding.apply(lambda embedding: calculate_cosine_similarity(user_text_embedding, embedding))
    df = df.sort_values(by="cosine_similarity", ascending=False)
    df = df.iloc[:fetch_k]
    documents = "\n\n".join(df.text.tolist())
    return f"Context:\n\n{documents}"


search_tool = Tool(
    func=search,
    name=tool_name,
    description=tool_description,
    args_schema=Search,
)

In [34]:
system_prompt = """You are an expert in job postings. Respond with the most accurate information about the job.

Use the Search tool to find the job description."""

class Output(BaseModel):
    content: str = Field(description="The final answer.")

agent = ReActAgent.create(
    llm=llm,
    system_prompt=system_prompt,
    task_prompt="{job_title}",
    task_prompt_variables=["job_title"],
    output_format=Output,
    tools=[search_tool],
    iterations=10,
)

In [35]:
chain = AgentChain(
    chain=[job_agent, agent],
    chain_variables=["question"],
)

In [36]:
response = chain.invoke({"question": "Give me the job description of an airport engineer."})

02/06/24 20:39:50 INFO Prompt:
Give me the job description of an airport engineer. 

Respond with the job title.
02/06/24 20:39:52 INFO Raw response:
{
  "thought": "I need to find the job title for an 'airport engineer'. I should use the Get Available Job Titles tool to find the correct job title.",
  "tool": "Get Available Job Titles",
  "tool_input": {}
}
02/06/24 20:39:52 INFO Thought:
I need to find the job title for an 'airport engineer'. I should use the Get Available Job Titles tool to find the correct job title.
02/06/24 20:39:52 INFO Tool:
Get Available Job Titles
02/06/24 20:39:52 INFO Tool input:
{}
02/06/24 20:39:52 INFO Tool response:
['SENIOR HOUSING INSPECTOR', 'LEGISLATIVE ASSISTANT', 'DISTRICT SUPERVISOR ANIMAL SERVICES', 'LEGISLATIVE REPRESENTATIVE', 'Gallery Attendant', 'TRUCK AND EQUIPMENT DISPATCHER', 'COMMUNICATIONS CABLE SUPERVISOR', 'SHIP CARPENTER', 'PRINCIPAL SECURITY OFFICER', 'PRINCIPAL CIVIL ENGINEERING DRAFTING TECHNICIAN', 'AIRPORT ENGINEER', 'BENEFITS S

In [37]:
pprint(response.final_answer["content"])

('An Airport Engineer is responsible for the planning, design, construction, '
 'maintenance, and operation of landside facilities, structures, pavement and '
 'support systems at an airport. This role also includes supervisory '
 'responsibilities. The salary range is from $105,005 to $163,532. The minimum '
 'qualifications include two to five years of full-time paid professional '
 'experience in civil, structural, mechanical, electrical, or communication '
 'engineering in the design, construction, management or engineering of '
 'airport/aviation projects or programs; and a valid license as a Professional '
 'Engineer with the California State Board of Registration for Professional '
 'Engineers.')


As the improved response shows, restricting the model to documents about the airport engineer improves the answer. When the solution is scaled to multiple entities and many more documents, the same filtering removes irrelevant content before the similarity search, which raises the likelihood that the LLM receives high-quality documents for the task. If RAG doesn't improve the LLM's answer quality, consider taking an additional step: fine-tuning a model specifically for the domain and optionally combining it with RAG.

Passion leads to results. Happy hacking :)