### 1. Defining your Problem and Audience

1.1 Write a succinct 1-sentence description of the problem.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 
Travelers struggle to choose the best destinations, plan efficient itineraries, and get realistic insights into their trips due to overwhelming content, fragmented travel information, uncertainty about location experiences (weather, atmosphere, crowd levels), and unclear cost expectations.

</span>
</div>

1.2 Write 1-2 paragraphs on why this is a problem for your specific user.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 
For digital-savvy, upwardly mobile but budget-conscious travelers, planning a trip is often a frustrating and time-consuming experience. While they rely on online itinerary planners from platforms like Expedia, these tools primarily serve as vendor-driven marketplaces rather than objective travel advisors. The search results and recommendations are often skewed by sponsored listings, pushing specific hotels, flights, and experiences that may not align with the traveler’s true preferences or budget. As a result, users are left sifting through an overwhelming volume of content, struggling to distinguish between genuine value and paid promotions, leading to decision fatigue and a suboptimal travel experience.

Additionally, these travelers face FOMO due to the sheer influx of content across social media and travel websites, making it difficult to determine what is truly worth experiencing. Existing tools provide generic information but fail to give real-time, contextual insights into destinations—such as actual crowd levels, local weather conditions, or a location's current vibe—leaving travelers unsure if their chosen spots will match their expectations. Budget-conscious users also find it hard to get a realistic estimate of their total trip costs, as expenses like local transportation, food, and hidden fees are often not factored into standard itinerary planners. This lack of transparency and personalization results in frustration, unnecessary expenses, and missed opportunities to optimize their travel experience.  
</span>
</div>

### 2. Propose a Solution

2.1 Write 1-2 paragraphs on your proposed solution.  How will it look and feel to the user?

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

The solution is a next-generation AI travel assistant powered by a RAG (Retrieval-Augmented Generation) + multi-agent system architecture that aggregates, analyzes, and personalizes travel information from diverse sources. Unlike traditional itinerary planners that primarily serve vendor interests, this AI assistant will act as an unbiased, intelligent advisor, helping users discover destinations, build optimized itineraries, and make data-driven travel decisions based on their unique preferences, constraints, and real-time conditions.

The system will leverage multi-agent coordination to collect and process data from multiple sources, including real-time weather reports, footfall density analysis, cost estimations, and user-generated reviews. Using vector embeddings and a scalable knowledge base, the assistant will provide hyper-personalized recommendations, avoiding the noise of paid promotions and instead focusing on authentic, data-backed insights. It will predict potential trip expenses based on dynamic pricing trends and offer realistic previews of destinations—including expected crowd levels, atmosphere, and local conditions—ensuring users make informed choices. By eliminating information overload and uncertainty, this AI assistant will help budget-conscious travelers maximize their experience while staying within their financial limits.

To the user, the final product will look and feel like a state-of-the-art digital tool that offers dynamic, interactive visualizations of travel-planning options, optimized for the user's preferences such as timing, budget, group travel, and preferred activities.

</span>
</div>

2.2 Describe the tools you plan to use in each part of your stack.  Write one sentence on why you made each tooling choice. 

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

1. LLM - OpenAI GPT-3.5: Chosen for its cost-effective yet powerful natural language understanding and generation capabilities, balancing efficiency and affordability for handling user queries.

2. Embedding Model - OpenAI text-embedding-3-small: Selected for its optimized balance of performance and speed in generating vector embeddings, ensuring efficient retrieval of relevant travel information.

3. Orchestration - LangGraph: Used for managing multi-agent workflows, allowing structured interactions between different agents in a RAG-based system to enhance contextual understanding and response generation.

4. Vector Database - FAISS: Chosen for its high-speed and scalable similarity search capabilities, enabling quick retrieval of relevant travel data from large-scale embeddings.

5. Monitoring - LangSmith: Selected for its ability to track, debug, and analyze interactions in LLM-based applications, ensuring transparency, reliability, and continuous improvement in system performance.

6. Evaluation - RAGAS: Used for assessing the quality of retrieval-augmented generation outputs, helping to fine-tune responses by measuring factual accuracy, relevance, and coherence.

7. User Interface - Hugging Face + Streamlit: Hugging Face is chosen for hosting and deploying AI models, while Streamlit provides an interactive and conversational UI for seamless user experience in itinerary planning.
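
To make the retrieval step in this stack concrete, here is a minimal, dependency-free sketch of nearest-neighbour search over embeddings. It is a simplified stand-in, not the project's code: the vectors are hypothetical three-dimensional toys (real `text-embedding-3-small` vectors have far more dimensions), and the brute-force scan only illustrates what an exact FAISS index computes conceptually. The names `cosine`, `top_k`, and `toy_index` are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=2):
    """Return the k (score, text) pairs most similar to query_vec.

    `index` is a list of (vector, text) pairs. A brute-force scan like this
    is conceptually what an exact (flat) vector index does, minus the speed.
    """
    scored = [(cosine(query_vec, vec), text) for vec, text in index]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

# Hypothetical toy "embeddings"; in the real stack these would come from
# the embedding model and be stored in the vector database.
toy_index = [
    ([0.9, 0.1, 0.0], "Lisbon: mild winters, hilly old town"),
    ([0.1, 0.9, 0.0], "Budget hostels near Alfama"),
    ([0.0, 0.2, 0.9], "Direct flights from Boston to Lisbon"),
]

# A query vector that points mostly in the "destination" direction.
results = top_k([0.8, 0.2, 0.1], toy_index, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The retrieved texts would then be passed to the LLM as context, which is the step RAGAS later scores for precision and recall.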
    
</span>
</div>

2.3 Where will you use an agent or agents?  What will you use “agentic reasoning” for in your app?

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

The next-gen AI travel assistant will be structured around four key agents, each responsible for a core aspect of the travel planning experience. These agents work collaboratively using LangGraph to ensure a seamless, data-driven, and highly personalized itinerary-building process.

1. Itinerary Agent – This agent curates a personalized travel itinerary by optimizing sightseeing, activities, and experiences based on user preferences, real-time weather, crowd density, and travel feasibility. It ensures that users get a well-balanced plan that maximizes their time while considering FOMO-inducing experiences.

2. Accommodation Agent – Responsible for retrieving and recommending hotels, hostels, or vacation rentals that fit the user’s budget, preferences, and proximity to key attractions. It prioritizes unbiased, data-driven suggestions over vendor-sponsored listings, ensuring travelers get the best value for their stay.

3. Information Agent – Gathers and synthesizes relevant travel advisory data, including visa requirements, local safety updates, cultural norms, and real-time conditions like weather, footfall density, and local events. This agent ensures travelers have a realistic expectation of their destination.

4. Flight Agent – Searches and compares flight options from multiple airlines and aggregators, considering factors like pricing trends, layovers, baggage policies, and hidden costs. It helps users find the most cost-effective and convenient flight options based on their travel dates and preferences.

These four agents work in sync within the multi-agent RAG architecture, ensuring travelers receive accurate, contextual, and unbiased travel planning assistance while reducing decision fatigue.
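
The hand-off between the four agents can be sketched as a routing decision. The example below is a hedged illustration only: it uses a simple keyword heuristic with hypothetical keyword lists, whereas the actual application would more plausibly make this decision with an LLM call wired into a LangGraph conditional edge. The names `AGENT_KEYWORDS` and `route` are assumptions, not taken from the project code.

```python
# Hypothetical keyword lists for three of the four agents; the information
# agent acts as the fallback when nothing else matches.
AGENT_KEYWORDS = {
    "flight": ("flight", "airline", "layover", "baggage"),
    "accommodation": ("hotel", "hostel", "rental", "stay"),
    "itinerary": ("itinerary", "plan", "sightseeing", "day trip"),
}

def route(query: str) -> str:
    """Pick the agent whose keyword list has the most matches in the query.

    Matching is a plain substring check on the lowercased query, so it is
    deliberately crude. Returns "information" when no keyword matches.
    """
    text = query.lower()
    hits = {
        agent: sum(word in text for word in words)
        for agent, words in AGENT_KEYWORDS.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "information"

for q in (
    "Find a cheap flight with a short layover",
    "Suggest a hostel near the old town",
    "Plan three days of sightseeing in Kyoto",
    "Do I need a visa for Japan?",
):
    print(f"{route(q):>13}  <-  {q}")
```

A heuristic like this is easy to test, but it cannot handle queries that span several agents (for example, flights plus hotels), which is where agentic reasoning over the full query becomes useful.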
    
</span>
</div>

### 3. Dealing with the Data

3.1 Describe all of your data sources and external APIs, and describe what you’ll use them for.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

The current submission uses two data sources:

- External API: Amadeus API (Amadeus for Developers) – Offers structured data on flights, hotels, travel restrictions, and itinerary recommendations via RESTful APIs, making it easy to fetch real-time travel insights.
- WikiVoyage – Open-source travel guide with structured destination information, making it ideal for automated text extraction.

Going forward, I want to make this travel AI system more robust by adding the following data sources:

External travel APIs, which provide structured data that can be integrated into Python-based data pipelines using libraries such as requests, pandas, and asyncio for efficient retrieval and processing:
- Skyscanner API (RapidAPI Skyscanner) – Provides comprehensive flight, hotel, and car rental data in a structured JSON format, enabling seamless price comparisons.  
- TripAdvisor API (TripAdvisor Data API) – Supplies user-generated reviews, ratings, and destination information that can be parsed and used for sentiment analysis or recommendation models.  
- OpenWeatherMap API (OpenWeatherMap) – Offers real-time and forecasted weather data via an easy-to-use REST API, helping to adjust travel recommendations dynamically.  
- Rome2Rio API (Rome2Rio Transport API) – Provides structured multi-modal transport data (flights, trains, buses, ferries), making route planning efficient for itinerary recommendations.

Other web sources, which provide structured and semi-structured text data that can be scraped or accessed using Python libraries such as BeautifulSoup, Scrapy, or newspaper3k:  
- Lonely Planet – Travel guides and expert insights that can be parsed and structured for itinerary recommendations.  
- Nomadic Matt – Budget travel tips and cost breakdowns, useful for trip cost estimation models.  
- Culture Trip – Provides local experiences and cultural insights that can enhance destination recommendations.  
- Travel Stack Exchange – A structured Q&A platform with expert travel advice that can be processed using NLP models for FAQ generation.
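
As an illustration of how a text source such as WikiVoyage could be pulled into the pipeline, the sketch below builds a request URL and parses a response offline. It assumes access through the MediaWiki action API with plain-text extracts (`prop=extracts`); the write-up does not state how WikiVoyage was actually accessed, so treat the endpoint choice, the helper names `build_extract_url` and `parse_extract`, and the hand-written sample payload as assumptions.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: the English Wikivoyage MediaWiki action API.
WIKIVOYAGE_API = "https://en.wikivoyage.org/w/api.php"

def build_extract_url(title: str) -> str:
    """Build a URL requesting the plain-text extract of one page."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,
        "titles": title,
        "format": "json",
    }
    return f"{WIKIVOYAGE_API}?{urlencode(params)}"

def parse_extract(payload: str) -> str:
    """Return the extract of the first page in a query response, or ""."""
    pages = json.loads(payload)["query"]["pages"]
    return next(iter(pages.values())).get("extract", "")

# Offline example: a hand-written payload shaped like the API response,
# so the parsing logic can be checked without a network call.
sample = json.dumps(
    {"query": {"pages": {"123": {"title": "Lisbon",
                                 "extract": "Lisbon is the capital of Portugal."}}}}
)
url = build_extract_url("Lisbon")
text = parse_extract(sample)
print(url)
print(text)
```

In a live pipeline the URL would be fetched with a library such as `requests`, and the returned text would be handed to the chunking step described in 3.2.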
    
</span>
</div>

3.2 Describe the default chunking strategy that you will use.  Why did you make this decision?

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

The default chunking strategy is LangChain's RecursiveCharacterTextSplitter with the following parameters:

1. Chunk size of 4000 characters - This is significantly larger than typical chunk sizes (often around 1000-1500 characters) and prioritizes context preservation within each chunk.  
2. Chunk overlap of 400 characters (10% of the chunk size) - Adjacent chunks share text, which maintains continuity and context across chunk boundaries.  
3. Multiple separators in hierarchical order ("\n\n", "\n", " ", "") - Splitting happens at natural text boundaries (paragraphs, then lines, then words) whenever possible, preserving semantic coherence.  

I chose this strategy with the RAGAS evaluation in mind, for several reasons:  

1. Evaluation needs: RAGAS measures faithfulness, answer relevancy, context precision, and context recall, and I expect these metrics to benefit from larger chunks that preserve more context.  
2. Semantic coherence: Larger chunks are more likely to contain complete thoughts, arguments, or explanations, making it easier for the LLM to understand and use the information.  
3. Reducing fragmentation: Larger chunks reduce the risk of important contextual relationships being broken across chunk boundaries.  
4. Balancing retrieval granularity with context: The 4000-character size aims to keep chunks small enough for reasonably precise retrieval but large enough to carry sufficient context.  

The code comment "Larger chunks for RAGAS" records that this decision was made specifically for evaluation purposes, on the expectation that RAGAS metrics score better on larger, more contextually complete chunks.  
The trade-off is that smaller chunks would improve retrieval precision and reduce token usage.  
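
To show what the size and overlap parameters mean in practice, here is a simplified sliding-window sketch. It is not the LangChain splitter: it ignores the separator hierarchy and cuts at fixed character offsets, so it only illustrates the 4000/400 arithmetic. The function name `sliding_chunks` and the synthetic document are assumptions made for illustration.

```python
def sliding_chunks(text: str, chunk_size: int = 4000, overlap: int = 400):
    """Split text into fixed-size windows that share `overlap` characters.

    Simplification: a real recursive splitter would first try to break on
    paragraph, line, and word boundaries before falling back to raw offsets.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    stride = chunk_size - overlap  # 3600 new characters per chunk by default
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks

# Synthetic 10,000-character document made of repeating digits.
doc = "".join(str(i % 10) for i in range(10_000))
chunks = sliding_chunks(doc)
print([len(c) for c in chunks])
print(chunks[0][-400:] == chunks[1][:400])
```

With these defaults each new chunk contributes 3600 fresh characters, so a 10,000-character page yields three chunks, and the last 400 characters of one chunk reappear at the start of the next.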

</span>
</div>

3.3 [Optional] Will you need specific data for any other part of your application?   If so, explain.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

I want to add a distinctive visualization layer that is interactive and updates in real time for users: fully sketched-out itineraries with suggested spots and expected costs at each point of the itinerary (commute, rental, food, etc.). Some multi-modal data sources could therefore be ingested into the multi-agent pipeline as additional enhancements.  
I will explore adding multi-modal intelligence to the final output layer to enhance the contextual richness of the outputs.

</span>
</div>

### 4. Building a Quick End-to-End Prototype

4.1 Build an end-to-end prototype and deploy it to a Hugging Face Space (or other endpoint).

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

The app is available at https://huggingface.co/spaces/dabraldeepti25/travel-ai-midterm-submission.

</span>
</div>

### 5. Creating a Golden Test Data Set

5.1 Assess your pipeline using the RAGAS framework including key metrics faithfulness, response relevance, context precision, and context recall. Provide a table of your output results.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

<b> Performance of Travel AI agent (RAG + multi-agent system) </b>:

|                 | Faithfulness | Answer Relevancy | Context Precision | Context Recall |
|-----------------|:------------:|:----------------:|:-----------------:|:--------------:|
| Travel AI agent |     21%      |       68%        |        86%        |      69%       |

<b> Sub agent wise performance </b>:

|               | Faithfulness | Answer Relevancy | Context Precision | Context Recall |
|---------------|:------------:|:----------------:|:-----------------:|:--------------:|
| Accommodation |     26%      |       15%        |        96%        |      67%       |
| Itinerary     |      9%      |       87%        |        96%        |      68%       |
| Information   |     27%      |       71%        |        82%        |      72%       |
| Flight        |     20%      |       92%        |        72%        |      62%       |

</span>
</div>

5.2 What conclusions can you draw about the performance and effectiveness of your pipeline with this information?

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

Based on the RAGAS metrics for the Travel AI agent (RAG + multi-agent system) and its sub-agents, the key conclusions are:

1. Overall Performance of the Travel AI Agent

- Faithfulness (21%) is very low, indicating that a large share of the claims in generated answers are not supported by the retrieved context (hallucinated or incorrect information).
- Answer Relevancy (68%) is moderate, meaning that while responses are somewhat relevant, there is still room for improvement.
- Context Precision (86%) is high, showing that the chunks the system retrieves are mostly relevant to the question.
- Context Recall (69%) is moderate, indicating that the system does not retrieve all relevant information in some cases.

👉 Overall, the system retrieves relevant and precise context but struggles with faithfulness, meaning it may generate factually incorrect responses.

2. Sub-Agent Performance Breakdown

(a) Accommodation Agent

- Faithfulness (26%) is higher than the overall system but still low.
- Answer Relevancy (15%) is poor, meaning answers often do not address the question.
- Context Precision (96%) is excellent, suggesting the retrieved context is highly relevant.
- Context Recall (67%) is moderate.

✅ Conclusion: The retrieved context is relevant but does not lead to relevant answers. This suggests issues in answer synthesis rather than retrieval.

(b) Itinerary Agent

- Faithfulness (9%) is the lowest among all sub-agents, meaning most of its claims are not grounded in the retrieved context.
- Answer Relevancy (87%) is high, so responses address the question even when they are not grounded.
- Context Precision (96%) is excellent.
- Context Recall (68%) is moderate.

✅ Conclusion: This sub-agent finds precise but incomplete information and generates responses that are relevant but mostly unfaithful (hallucinated).

(c) Information Agent

- Faithfulness (27%) is the highest among all sub-agents.
- Answer Relevancy (71%) is fairly strong.
- Context Precision (82%) is good.
- Context Recall (72%) is the best among all sub-agents.

✅ Conclusion: This is the best-performing sub-agent in terms of faithfulness and recall. However, faithfulness still needs improvement.

(d) Flight Agent

- Faithfulness (20%) is low.
- Answer Relevancy (92%) is very high, the best among all sub-agents.
- Context Precision (72%) is decent but the lowest among all sub-agents.
- Context Recall (62%) is the lowest among all sub-agents.

✅ Conclusion: Answers are highly relevant but unfaithful. The low recall suggests it is missing important information, possibly leading to hallucinated responses.
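
The per-agent comparisons above can be re-derived mechanically from the sub-agent table in 5.1. The sketch below copies those percentages into a dictionary and looks up the weakest metric per agent and the best or worst agent per metric. The helper names `weakest_metric` and `extreme` are illustrative assumptions, not part of the project code.

```python
# RAGAS scores (percent) copied from the sub-agent table in section 5.1.
SCORES = {
    "Accommodation": {"faithfulness": 26, "answer_relevancy": 15,
                      "context_precision": 96, "context_recall": 67},
    "Itinerary":     {"faithfulness": 9,  "answer_relevancy": 87,
                      "context_precision": 96, "context_recall": 68},
    "Information":   {"faithfulness": 27, "answer_relevancy": 71,
                      "context_precision": 82, "context_recall": 72},
    "Flight":        {"faithfulness": 20, "answer_relevancy": 92,
                      "context_precision": 72, "context_recall": 62},
}

def weakest_metric(agent: str) -> str:
    """Name of the lowest-scoring metric for one agent."""
    metrics = SCORES[agent]
    return min(metrics, key=metrics.get)

def extreme(metric: str, pick=max) -> str:
    """Agent with the highest (pick=max) or lowest (pick=min) score on a metric."""
    return pick(SCORES, key=lambda agent: SCORES[agent][metric])

for agent in SCORES:
    print(f"{agent}: weakest metric is {weakest_metric(agent)}")
print("Lowest faithfulness:", extreme("faithfulness", min))
print("Highest context recall:", extreme("context_recall"))
```

Faithfulness is the weakest metric for three of the four agents, which points at answer generation (prompting and grounding) rather than retrieval as the first thing to fix.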

</span>
</div>

### 6. Fine-Tuning Open-Source Embeddings

6.1 Swap out your existing embedding model for the new fine-tuned version. Provide a link to your fine-tuned embedding model on the Hugging Face Hub.

https://huggingface.co/dabraldeepti25/embedding-model-midterm-submission-updated

### 7. Assessing Performance

7.1 How does the performance compare to your original RAG application? Test the fine-tuned embedding model using the RAGAS frameworks to quantify any improvements. Provide results in a table.

7.2 Articulate the changes that you expect to make to your app in the second half of the course. How will you improve your application?

In the next sprint, I want to make the following key improvements to the AI travel assistant to enhance its accuracy, efficiency, and user experience.

1. Expand Data Sources to 10+ – Currently, the system relies on a few key APIs and travel resources. I will integrate additional structured and unstructured data sources, including government tourism APIs, local transit systems, and real-time pricing aggregators, to provide more comprehensive and unbiased travel insights.  

2. Improve Model Performance – By optimizing prompt engineering, retrieval strategies, and response formatting, I aim to enhance the accuracy and contextual relevance of AI-generated travel recommendations. This includes tuning query decomposition and improving multi-agent coordination in LangGraph. 

3. Fine-Tune LLM on Targeted Parameters – While the current setup uses a general-purpose LLM, I will conduct larger-scale fine-tuning iterations, focusing on entity recognition, travel itinerary structuring, and user preference adaptation. This will improve the model’s ability to generate more precise, structured, and dynamic travel plans. 

4. Expand Synthetic Dataset for Agent Testing – To ensure robust agent interactions, I will increase the size and diversity of synthetic datasets used for testing. This will help in evaluating and stress-testing agent performance across different travel scenarios, leading to better generalization and fewer errors in real-world use cases.

### 8. Public Github Repo

8.1 A 5-minute (OR LESS) loom video of a live demo of your application that also describes the use case.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

https://www.loom.com/share/e7e88a35d1ac4ee88d29ed0fbf7556f5?from_recorder=1&focus_title=1

</span>
</div>

8.2 A written document addressing each deliverable and answering each question.

<div style="background-color:rgb(245, 230, 250); padding: 10px; border-radius: 5px;">
<span style="color: black;">

<b> ANSWER:  </b> 

This markdown file codifies answers to all the listed questions and has links to all key deliverables.

</span>
</div>

8.3 All relevant code

1. Wall of imports: https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/00_wall_of_imports.ipynb

2. Data sourcing: https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/01_data_pipeline_creation.ipynb  

3. RAG + Multi-agent set-up using LangGraph: https://github.com/deeptidabral/AIE5/blob/main/Mid-Term%20Project/02_agent_setup.ipynb  

4. Test set generation and application assessment using RAGAS: 

5. Fine-tuning embedding model: 

6. Agent rerun using fine-tuned embedding model and rechecking its performance:  

7. Streamlit app: app_streamlit.py

### 9. Hugging Face Deliverables

9.1 Share a public (or otherwise shared) link to the final version of your public application on Hugging Face (or other).

https://huggingface.co/spaces/dabraldeepti25/travel-ai-midterm-submission

9.2 Share the public link to your fine-tuned embedding model on Hugging Face.

https://huggingface.co/dabraldeepti25/embedding-model-midterm-submission-updated