### 1. Defining your Problem and Audience

1.1 Write a succinct 1-sentence description of the problem.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

Travelers face challenges in identifying optimal destinations aligned with their personal preferences, constructing efficient itineraries, and obtaining authentic pre-trip insights due to information overload, fragmented travel information, uncertainty regarding on-site conditions (including climate, atmosphere, and footfall / crowd volume), and unclear cost burden.

</span>
</div>

1.2 Write 1-2 paragraphs on why this is a problem for your specific user.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b>   

<b> 1. Lack of an unbiased, genuine source of information</b> : For the digital-savvy, upwardly mobile but budget-conscious traveler segment, planning a trip can be frustrating and time-consuming. While there are popular online itinerary planners from industry leaders such as Expedia, these tools primarily serve as vendor-driven marketplaces rather than objective travel advisors. Search results and recommendations are often skewed by sponsored listings that push specific hotels, flights, and experiences, which may not match travelers' interests. Users are thus left to browse an overwhelming volume of content themselves.

<b> 2. Decision fatigue and suboptimal travel experience</b> : Despite hyperconnectivity, travelers face FOMO (fear of missing out) because of the sheer influx of content across social media and travel websites, which makes it difficult to determine what is truly worth experiencing. Existing tools provide generic information but fail to give real-time, contextual insights into destinations, such as actual crowd levels, local weather conditions, or a location's current vibe. Travelers are therefore uncertain whether their chosen spots will match their expectations. Price-sensitive groups also find it hard to get a realistic estimate of their total trip costs.

</span>
</div>

### 2. Propose a Solution

2.1 Write 1-2 paragraphs on your proposed solution.  How will it look and feel to the user?

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

The solution is a next-generation AI travel assistant powered by a RAG (Retrieval-Augmented Generation) + multi-agent system architecture that aggregates, analyzes, and personalizes travel information from diverse sources. Unlike traditional itinerary planners that primarily serve vendor interests, this AI assistant will act as an unbiased, intelligent advisor, helping users discover destinations, build optimized itineraries, and make data-driven travel decisions based on their unique preferences, constraints, and real-time conditions.

The system will leverage multi-agent coordination to collect and process data from multiple sources, including real-time weather reports, footfall density analysis, cost estimations, and user-generated reviews. Using vector embeddings and a scalable knowledge base, the assistant will provide hyper-personalized recommendations, avoiding the noise of paid promotions and instead focusing on authentic, data-backed insights. It will predict potential trip expenses based on dynamic pricing trends and offer realistic previews of destinations—including expected crowd levels, atmosphere, and local conditions—ensuring users make informed choices. By eliminating information overload and uncertainty, this AI assistant will help budget-conscious travelers maximize their experience while staying within their financial limits.

<b> To the user, the final product will look and feel like a state-of-the-art digital tool that presents its outputs through dynamic, interactive visualizations tailored to the user's preferences, such as timing, budget, group travel, and leisure preferences. </b> 

</span>
</div>

2.2 Describe the tools you plan to use in each part of your stack.  Write one sentence on why you made each tooling choice. 

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

<b>1. LLM - OpenAI GPT-3.5</b>: Chosen for its cost-effective yet powerful natural language understanding and generation capabilities, balancing efficiency and affordability for handling user queries.

<b>2. Embedding Model - OpenAIEmbeddings</b>: Selected for its optimized balance of performance and speed in generating vector embeddings, ensuring efficient retrieval of relevant travel information.

<b>3. Orchestration - LangGraph</b>: Used for managing multi-agent workflows, allowing structured interactions between different agents in a RAG-based system to enhance contextual understanding and response generation.

<b>4. Vector Database - FAISS</b>: Chosen for its high-speed and scalable similarity search capabilities, enabling quick retrieval of relevant travel data from large-scale embeddings.

<b>5. Monitoring - LangSmith</b>: Selected for its ability to track, debug, and analyze interactions in LLM-based applications, ensuring transparency, reliability, and continuous improvement in system performance.

<b>6. Evaluation - RAGAS</b>: Used for assessing the quality of retrieval-augmented generation outputs, helping to fine-tune responses by measuring factual accuracy, relevance, and coherence.

<b>7. User Interface - Hugging Face + Streamlit</b>: Hugging Face is chosen for hosting and deploying AI models, while Streamlit provides an interactive and conversational UI for seamless user experience in itinerary planning.
    
</span>
</div>
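
To make the embedding-plus-vector-store pairing concrete, here is a minimal sketch of top-k similarity retrieval in plain Python. It is not the project's code: the toy three-dimensional vectors stand in for OpenAIEmbeddings output, the brute-force cosine search stands in for a FAISS index, and the names `cosine`, `top_k`, and the sample document ids are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Toy "embeddings" with made-up values; real embedding vectors are far longer.
index = {
    "lisbon_budget_hostels": [0.9, 0.1, 0.0],
    "kyoto_temple_itinerary": [0.1, 0.9, 0.1],
    "lisbon_food_tour": [0.7, 0.2, 0.3],
}

# Pretend this is the embedding of "cheap places to stay in Lisbon".
query = [0.8, 0.1, 0.1]
results = top_k(query, index, k=2)
print(results)
```

A vector database such as FAISS performs the same ranking over a much larger collection, using index structures that avoid comparing the query against every stored vector.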

2.3 Where will you use an agent or agents?  What will you use “agentic reasoning” for in your app?

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

The next-gen AI travel assistant is structured around four key agents, each responsible for a core aspect of the travel planning experience. These agents work collaboratively using LangGraph to ensure a seamless, data-driven, and highly personalized itinerary-building process.

<b> In addition, a router agent reasons about each incoming user query and determines which specialized agent should handle it. </b>

<b> 1. Itinerary Agent </b> – This agent curates a personalized travel itinerary by optimizing sightseeing, activities, and experiences based on user preferences, real-time weather, crowd density, and travel feasibility. It ensures that users get a well-balanced plan that maximizes their time while considering FOMO-inducing experiences.

<b> 2. Accommodation Agent </b> – Responsible for retrieving and recommending hotels, hostels, or vacation rentals that fit the user’s budget, preferences, and proximity to key attractions. It prioritizes unbiased, data-driven suggestions over vendor-sponsored listings, ensuring travelers get the best value for their stay.

<b> 3. Information Agent</b> – Gathers and synthesizes relevant travel advisory data, including visa requirements, local safety updates, cultural norms, and real-time conditions like weather, footfall density, and local events. This agent ensures travelers have a realistic expectation of their destination.

<b> 4. Flight Agent</b>  – Searches and compares flight options from multiple airlines and aggregators, considering factors like pricing trends, layovers, baggage policies, and hidden costs. It helps users find the most cost-effective and convenient flight options based on their travel dates and preferences.

These four agents work in sync within the multi-agent RAG architecture, ensuring travelers receive accurate, contextual, and unbiased travel planning assistance while reducing decision fatigue.
    
</span>
</div>
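
The routing step can be sketched as a small dispatcher. This is only an illustration under loud assumptions: the keyword lists, the `route` and `handle` functions, and the placeholder agents are hypothetical, and a keyword match stands in for the LLM-based reasoning that the router agent is described as performing.

```python
# Hypothetical keyword lists; the real router reasons over the query with an LLM.
ROUTES = {
    "flight": ("flight", "airline", "layover", "baggage"),
    "accommodation": ("hotel", "hostel", "rental", "stay"),
    "itinerary": ("itinerary", "plan", "sightseeing", "activities"),
}


def route(query: str) -> str:
    """Return the name of the agent that should handle the query.

    Falls back to the information agent when no keyword matches.
    """
    text = query.lower()
    for agent, keywords in ROUTES.items():
        if any(word in text for word in keywords):
            return agent
    return "information"


def make_agent(name: str):
    """Build a placeholder agent that only reports which agent was chosen."""
    return lambda query: f"[{name} agent] handling: {query}"


AGENTS = {
    name: make_agent(name)
    for name in ("itinerary", "accommodation", "information", "flight")
}


def handle(query: str) -> str:
    """Route the query, then delegate it to the selected agent."""
    return AGENTS[route(query)](query)


print(handle("Find a cheap flight to Tokyo with one layover"))
print(handle("Do I need a visa for Vietnam?"))
```

In a graph-based orchestrator the same decision would typically be expressed as a conditional edge from the router node to one of the four agent nodes.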

### 3. Dealing with the Data

3.1 Describe all of your data sources and external APIs, and describe what you’ll use them for.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

I have used two data sources in the current submission:

<b> a) External API: Amadeus API (Amadeus for Developers)</b> – Offers structured data on flights, hotels, travel restrictions, and itinerary recommendations via RESTful APIs, making it easy to fetch real-time travel insights.  
<b> b) WikiVoyage</b> – Open-source travel guide with structured destination information, making it ideal for automated text extraction.

Going forward, I want to make this travel AI system more robust with the addition of the following data sources:

<b> (1) External Travel APIs</b> : These provide structured data that can be easily integrated into Python-based data pipelines using libraries like requests, pandas, and asyncio for efficient retrieval and processing. A few examples:
- Skyscanner API (RapidAPI Skyscanner) – Provides comprehensive flight, hotel, and car rental data in a structured JSON format, enabling seamless price comparisons.  
- TripAdvisor API (TripAdvisor Data API) – Supplies user-generated reviews, ratings, and destination information that can be parsed and used for sentiment analysis or recommendation models.  
- OpenWeatherMap API (OpenWeatherMap) – Offers real-time and forecasted weather data via an easy-to-use REST API, helping to adjust travel recommendations dynamically.  
- Rome2Rio API (Rome2Rio Transport API) – Provides structured multi-modal transport data (flights, trains, buses, ferries), making route planning efficient for itinerary recommendations.

<b> (2) Other web sources</b> : These provide structured and semi-structured text data that can be scraped or accessed using Python libraries like BeautifulSoup, Scrapy, or newspaper3k. A few examples:
- Lonely Planet – Travel guides and expert insights that can be parsed and structured for itinerary recommendations.  
- Nomadic Matt – Budget travel tips and cost breakdowns, useful for trip cost estimation models.  
- Culture Trip – Provides local experiences and cultural insights that can enhance destination recommendations.  
- Travel Stack Exchange – A structured Q&A platform with expert travel advice that can be processed using NLP models for FAQ generation.
    
</span>
</div>
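
As one possible way to pull destination text from WikiVoyage, the sketch below builds a request against the MediaWiki API's plain-text extracts and parses the JSON response. This is an assumption about the access method, not a description of the project's actual pipeline; the helper names `build_extract_url`, `parse_extract`, and `fetch_guide`, and the User-Agent string, are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIVOYAGE_API = "https://en.wikivoyage.org/w/api.php"


def build_extract_url(title: str) -> str:
    """Build a MediaWiki API URL that requests a page's plain-text extract."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,
        "titles": title,
        "format": "json",
    }
    return f"{WIKIVOYAGE_API}?{urlencode(params)}"


def parse_extract(payload: dict) -> str:
    """Join the 'extract' text of every page in a query response."""
    pages = payload.get("query", {}).get("pages", {})
    return "\n".join(page.get("extract", "") for page in pages.values())


def fetch_guide(title: str, timeout: float = 10.0) -> str:
    """Download and parse one guide (needs network access; not called here)."""
    request = Request(
        build_extract_url(title),
        headers={"User-Agent": "travel-ai-example/0.1"},  # hypothetical UA
    )
    with urlopen(request, timeout=timeout) as response:
        return parse_extract(json.load(response))


# Offline demonstration with a hand-written response of the expected shape.
sample = {
    "query": {
        "pages": {"123": {"title": "Lisbon", "extract": "Lisbon is the capital of Portugal."}}
    }
}
print(build_extract_url("Lisbon"))
print(parse_extract(sample))
```

The returned text can then be passed to the chunking step before embedding.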

3.2 Describe the default chunking strategy that you will use.  Why did you make this decision?

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER: </b> 

The default chunking strategy I have used is the RecursiveCharacterTextSplitter from LangChain with the following parameters:

1. Chunk size of 4000 characters - A relatively large chunk size, chosen to preserve context within each chunk.  
2. Chunk overlap of 400 characters - This creates a 10% overlap between adjacent chunks to maintain continuity and context across them.  
3. Multiple separators in hierarchical order ("\n\n", "\n", " ", "") - This is to ensure that splitting happens at natural text boundaries whenever possible, preserving semantic coherence.  

This strategy is appropriate for building the RAG pipeline and its assessment (via RAGAS or some other tool) because:  

1. Requirements for evaluation: To perform well on metrics of retrieval quality, faithfulness, and context relevance, I needed large chunks to ensure adequate context preservation.  
2. Semantic coherence: Larger chunks contain more complete thoughts, arguments, and explanations than smaller chunks, which makes the information easier for LLMs to use. This reduces fragmentation and preserves contextual relationships.  
3. Balancing retrieval granularity with context: The chosen chunk size strikes a balance between chunks small enough for precise retrieval and chunks large enough to maintain sufficient context.  

</span>
</div>
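
The behaviour described above (a 4000-character limit, a 400-character overlap, and a preference for natural boundaries) can be illustrated with a simplified, dependency-free chunker. This is a sketch of the idea only, not LangChain's `RecursiveCharacterTextSplitter` implementation, and the function name `chunk_text` is hypothetical. It looks backwards from the size limit for the coarsest available separator and falls back to a hard cut when none is found.

```python
def chunk_text(text, chunk_size=4000, overlap=400, separators=("\n\n", "\n", " ")):
    """Split text into overlapping chunks of at most chunk_size characters.

    Each cut is placed just after the last occurrence of the highest-priority
    separator inside the current window; if no separator is found, the chunk
    is cut at exactly chunk_size characters.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        cut = end
        if end < len(text):
            for sep in separators:
                # Search only past the overlap region so every step moves forward.
                idx = text.rfind(sep, start + overlap, end)
                if idx != -1:
                    cut = idx + len(sep)
                    break
        chunks.append(text[start:cut])
        if cut >= len(text):
            break
        # The next chunk re-reads the last `overlap` characters of this one.
        start = cut - overlap
    return chunks


text = "word " * 2000  # 10,000 characters of dummy content
chunks = chunk_text(text)
print(len(chunks), [len(c) for c in chunks])
```

Every chunk except the last ends at a separator when one is available, and the first 400 characters of each chunk repeat the last 400 characters of the previous one, which is the continuity the overlap is meant to provide.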

3.3 [Optional] Will you need specific data for any other part of your application?   If so, explain.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

I want to add a distinctive visualization layer that is interactive and updates in real time: fully sketched-out itineraries with suggested spots and the expected costs at each point of the itinerary (commute, rental, food, etc.). Some multi-modal data sources could also be ingested into the multi-agent pipeline to support this.  
Therefore, I will explore adding multi-modal intelligence to the final output layer to enhance the contextual richness of the outputs.

</span>
</div>
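
One way such a visualization layer could receive its data is a small itinerary structure that carries per-stop costs by category. The sketch below is purely illustrative: the `Stop` and `Itinerary` classes, the category names, and the amounts are hypothetical and not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Stop:
    """One point in the itinerary with its expected costs by category."""
    name: str
    costs: Dict[str, float] = field(default_factory=dict)

    def total(self) -> float:
        return sum(self.costs.values())


@dataclass
class Itinerary:
    stops: List[Stop]

    def total(self) -> float:
        """Expected cost of the whole trip."""
        return sum(stop.total() for stop in self.stops)

    def by_category(self) -> Dict[str, float]:
        """Expected cost per category, summed across all stops."""
        totals: Dict[str, float] = {}
        for stop in self.stops:
            for category, amount in stop.costs.items():
                totals[category] = totals.get(category, 0.0) + amount
        return totals


# Made-up example values.
trip = Itinerary(
    stops=[
        Stop("Old Town walk", {"commute": 5.0, "food": 20.0}),
        Stop("Coastal drive", {"rental": 60.0, "food": 15.0}),
    ]
)
print(trip.total())
print(trip.by_category())
```

A front end could render each `Stop` as a point on the itinerary and use `by_category` for a cost breakdown chart.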

### 4. Building a Quick End-to-End Prototype

4.1 Build an end-to-end prototype and deploy it to a Hugging Face Space (or other endpoint).

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

<b>Public deployment of Travel AI Agent System on Hugging Face</b>: https://huggingface.co/spaces/dabraldeepti25/travel-ai-midterm-submission

</span>
</div>

### 5. Creating a Golden Test Data Set

5.1 Assess your pipeline using the RAGAS framework including key metrics faithfulness, response relevance, context precision, and context recall. Provide a table of your output results.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

Travel AI Agent Performance Summary:

<b> Performance of Travel AI agent (RAG + multi-agent system) </b>:

|                 | Faithfulness | Answer Relevancy | Context   Precision | Context Recall |
|-----------------|:------------:|:----------------:|:-------------------:|:--------------:|
| Travel AI agent |      16%     |        63%       |         85%         |       68%      |

<b> Sub agent wise performance </b>:

|               | Faithfulness | Answer Relevancy | Context   Precision | Context Recall |
|---------------|:------------:|:----------------:|:-------------------:|:--------------:|
| Accommodation |      13%     |        15%       |         80%         |       71%      |
| Itinerary     |      17%     |        89%       |         95%         |       74%      |
| Information   |      12%     |        73%       |         86%         |       56%      |
| Flight        |      24%     |        84%       |         80%         |       64%      |

</span>
</div>

5.2 What conclusions can you draw about the performance and effectiveness of your pipeline with this information?

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

Based on the performance metrics of the Travel AI agent (RAG + multi-agent system) and its sub-agents, here are the key conclusions:

<b>1. Overall Performance of the Travel AI Agent </b>
- Context precision is good at 85%, meaning the system retrieves mostly relevant data.
- Context recall at 68% is decent, indicating a fair level of information completeness.
- However, faithfulness at 16% is critically low, indicating a high likelihood of frequent hallucinations and inaccurate responses.
- Answer relevancy at 63% is suboptimal, implying users are not getting sufficiently useful responses.
- Thus, the system retrieves relevant and precise context but struggles with faithfulness, meaning it may generate responses that the retrieved context does not support and that may therefore be factually incorrect.

<b>2. Sub-Agent Performance Breakdown: </b>  

- Accommodation is the worst-performing sub-agent, with extremely low faithfulness (13%) and answer relevancy (15%): it generates mostly incorrect and irrelevant responses.  
- Itinerary has very high answer relevancy (89%) but struggles with hallucination (17% faithfulness).
- Information has the worst recall (56%), meaning the retrieved context often misses information needed for a complete answer; it also has the lowest faithfulness (12%).  
- Flight is the best sub-agent in terms of faithfulness (24%), but still suffers from significant hallucinations.

</span>
</div>
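
The sub-agent comparison can be reproduced mechanically from the scores in the table in 5.1. The snippet below is a small sketch rather than part of the evaluation notebook; the dictionary layout and the `weakest` helper are hypothetical, while the numbers are copied from the table.

```python
# Sub-agent scores (in percent) copied from the RAGAS results table in 5.1.
scores = {
    "Accommodation": {"faithfulness": 13, "answer_relevancy": 15,
                      "context_precision": 80, "context_recall": 71},
    "Itinerary": {"faithfulness": 17, "answer_relevancy": 89,
                  "context_precision": 95, "context_recall": 74},
    "Information": {"faithfulness": 12, "answer_relevancy": 73,
                    "context_precision": 86, "context_recall": 56},
    "Flight": {"faithfulness": 24, "answer_relevancy": 84,
               "context_precision": 80, "context_recall": 64},
}


def weakest(scores, metric):
    """Return the sub-agents (sorted by name) with the lowest score on a metric."""
    lowest = min(row[metric] for row in scores.values())
    return sorted(name for name, row in scores.items() if row[metric] == lowest)


for metric in ("faithfulness", "answer_relevancy", "context_precision", "context_recall"):
    print(metric, weakest(scores, metric))
```

On these numbers, Information is lowest on faithfulness and context recall, Accommodation is lowest on answer relevancy, and Accommodation and Flight tie for the lowest context precision.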

### 6. Fine-Tuning Open-Source Embeddings

6.1 Swap out your existing embedding model for the new fine-tuned version. Provide a link to your fine-tuned embedding model on the Hugging Face Hub.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

<b>Public link of fine-tuned embedding model (model card and files) on Hugging Face</b>: https://huggingface.co/dabraldeepti25/embedding-model-midterm-submission-updated

</span>
</div>

### 7. Assessing Performance

7.1 How does the performance compare to your original RAG application? Test the fine-tuned embedding model using the RAGAS frameworks to quantify any improvements. Provide results in a table.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

Travel AI Agent System Performance Summary with the Fine-Tuned Embedding Model:

<b> Performance of Travel AI agent (RAG + multi-agent system) </b>:

|                 | Faithfulness | Answer Relevancy | Context   Precision | Context Recall |
|-----------------|:------------:|:----------------:|:-------------------:|:--------------:|
| Travel AI agent |      19%     |        85%       |         83%         |       65%      |


<b> Sub agent wise performance </b>:

|               | Faithfulness | Answer Relevancy | Context   Precision | Context Recall |
|---------------|:------------:|:----------------:|:-------------------:|:--------------:|
| Accommodation |      22%     |        67%       |         80%         |       78%      |
| Itinerary     |      7%      |        89%       |         86%         |       69%      |
| Information   |      16%     |        93%       |         86%         |       58%      |
| Flight        |      17%     |        92%       |         80%         |       72%      |

  
<b> Using the fine-tuned embedding model has significantly improved answer relevancy but has not resolved the hallucination issues. </b>

|       Metric      | Original RAG | Fine-Tuned Embeddings | Change (percentage points) |
|:-----------------:|:------------:|:---------------------:|:--------------------------:|
| Faithfulness      | 16%          | 19%                   | +3 ✅  |
| Answer Relevancy  | 63%          | 85%                   | +22 ✅  |
| Context Precision | 85%          | 83%                   | -2 ❌     |
| Context Recall    | 68%          | 65%                   | -3 ❌     |

Key points to note:  
- Faithfulness increased by 3 percentage points, but it is still very low and remains a serious concern.  
- Answer relevancy jumped by 22 percentage points, meaning responses are much more useful.  
- Context precision and recall have dropped slightly (by 2 and 3 percentage points respectively), indicating that the fine-tuned embeddings traded a marginal amount of retrieval quality for better answer relevancy.

</span>
</div>
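
Because the scores are percentages, it helps to separate the absolute change (in percentage points) from the relative change against the baseline. The short sketch below recomputes both from the overall scores reported above; the `compare` helper and the dictionary layout are hypothetical.

```python
# Overall scores (in percent) before and after swapping in the fine-tuned embeddings.
baseline = {"Faithfulness": 16, "Answer Relevancy": 63,
            "Context Precision": 85, "Context Recall": 68}
fine_tuned = {"Faithfulness": 19, "Answer Relevancy": 85,
              "Context Precision": 83, "Context Recall": 65}


def compare(before, after):
    """Return the percentage-point and relative change for each metric."""
    rows = {}
    for metric, old in before.items():
        new = after[metric]
        rows[metric] = {
            "pp_change": new - old,
            "relative_change_pct": round(100 * (new - old) / old, 1),
        }
    return rows


changes = compare(baseline, fine_tuned)
for metric, row in changes.items():
    print(f"{metric}: {row['pp_change']:+d} pp ({row['relative_change_pct']:+.1f}% relative)")
```

Seen this way, the 3-point rise in faithfulness is a relative gain of roughly 19% over a baseline of 16%, yet the absolute level remains far too low for a travel advisor.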

7.2 Articulate the changes that you expect to make to your app in the second half of the course. How will you improve your application?

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

In the next sprint, I want to make the following key improvements to the AI travel assistant to enhance its accuracy, efficiency, and user experience.

<b>1. Expand Data Sources to 10+ </b> – Currently, the system relies on two data sources (one external API and one public online travel guide). I will integrate additional structured and unstructured data sources, including government tourism APIs, local transit systems, and real-time pricing aggregators, to provide more comprehensive and unbiased travel insights.  

<b>2. Improve Model Performance </b> – By optimizing prompt engineering, retrieval strategies, and response formatting, I intend to improve the faithfulness, accuracy, and contextual relevance of AI-generated travel recommendations. This direction follows directly from the analysis of system performance tabulated above. 

<b>3. Fine-tune LLM on Targeted Parameters </b>– While the current setup uses a general-purpose LLM, I will conduct larger-scale fine-tuning iterations focused on travel itinerary structuring and user preference adaptation. I hope this will improve the model’s ability to generate more precise outputs. 

<b>4. Expand Synthetic Dataset for Agent Testing </b>– To ensure robust agent interactions, I will increase the size and diversity of synthetic datasets used for testing. This will allow stress-testing of the agent system at a larger scale, leading to better generalization in real-world use cases.

</span>
</div>

### 8. Public GitHub Repo

8.1 A 5-minute (OR LESS) loom video of a live demo of your application that also describes the use case.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

<b> Link of Loom video</b>: https://www.loom.com/share/e7e88a35d1ac4ee88d29ed0fbf7556f5?from_recorder=1&focus_title=1

</span>
</div>

8.2 A written document addressing each deliverable and answering each question.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

This markdown file codifies answers to all the listed questions and has links to all key deliverables.

</span>
</div>

8.3 All relevant code

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

<b> 1. Wall of imports</b>: https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/00_wall_of_imports.ipynb

<b> 2. Data sourcing</b>: https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/01_data_pipeline_creation.ipynb  

<b> 3. RAG + Multi-agent set-up using LangGraph</b>: https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/02_agent_setup.ipynb  

<b> 4. Test set generation and application assessment using RAGAS</b>: https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/03_testset_generation.ipynb

<b> 5. Fine-tuning embedding model</b>: https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/04_fine_tuning_embedding.ipynb 

<b> 6. Agent rerun using fine-tuned embedding model and rechecking its performance</b>:  https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/05_agent_rerun_with_fine_tuned_embedding.ipynb

<b> 7. Streamlit app</b>: https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/Travel_AI_Application/app_streamlit.py

</span>
</div>

### 9. Hugging Face Deliverables

9.1 Share a public (or otherwise shared) link to the final version of your public application on Hugging Face (or other).

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

<b>Public deployment of Travel AI Agent System on Hugging Face</b>: https://huggingface.co/spaces/dabraldeepti25/travel-ai-midterm-submission

</span>
</div>

9.2 Share the public link to your fine-tuned embedding model on Hugging Face.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

<b>Public link of fine-tuned embedding model (model card and files) on Hugging Face</b>: https://huggingface.co/dabraldeepti25/embedding-model-midterm-submission-updated

</span>
</div>