# Touristic Tour Recommendation Application
This notebook outlines the steps involved in creating an algorithm that generates a one-week itinerary for tourists in Algeria. The itinerary is optimized for user preferences, proximity, and travel cost. Several search techniques, including **uninformed search algorithms** (BFS, DFS), **A\*** and **Hill Climbing**, are used to build the itinerary.


## Data Collection & Research
We gathered clean data on **100+ Algerian tourist attractions**, with the following attributes:
- **Attraction Name**
- **Type of Attraction** (museum, nature, beach, etc.)
- **City**
- **Cost** (entry fee)
- **Rating** (user rating)
- **GPS Coordinates** (latitude, longitude)
- **Description** (short description)


In [18]:
import json
from collections import Counter

DATA_PATH = "../Data/attractions.json"

with open(DATA_PATH, "r", encoding="utf-8") as f:
    attractions_data = json.load(f)

if not isinstance(attractions_data, list):
    raise ValueError("The JSON file does not contain a list of attractions.")

print("Number of attractions:", len(attractions_data))

# Count attractions per city
city_counts = Counter(attraction.get("city", "Unknown") for attraction in attractions_data)

# Count attractions per category
category_counts = Counter(attraction.get("category", "Unknown") for attraction in attractions_data)

print("\nNumber of attractions per city:")
for city, count in city_counts.items():
    print(f"{city}: {count}")

print("\nNumber of attractions per category:")
for category, count in category_counts.items():
    print(f"{category}: {count}")


Number of attractions: 74

Number of attractions per city:
Algiers: 18
Tipaza: 2
Blida: 1
Médéa: 3
Oran: 12
Tlemcen: 8
Batna: 1
Ghardaïa: 2
Bejaia: 1
Constantine: 5
Djanet: 1
Sétif: 4
Annaba: 4
Guelma: 2
El Tarf: 1
Tamanrasset: 2
Béchar: 1
Bouira: 1
Brezina, El Bayadh: 1
Khenchela: 1
Biskra: 3

Number of attractions per category:
Garden: 3
Museum: 7
Cultural: 14
Historical: 14
Religious: 6
Amusement Park: 3
Port: 1
Shopping Mall: 4
Nature: 22


## Distance and Proximity Calculation
To build the itinerary, we need the distance between every pair of attractions. We use the **Haversine formula**, which gives the great-circle distance between two points from their GPS coordinates. This is a straight-line distance over the Earth's surface, so actual road distances will be longer.


In [2]:
import math

# Haversine formula to calculate distance between two points (in km)
def haversine(lat1, lon1, lat2, lon2):
    R = 6371  # Radius of the Earth in kilometers
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2)**2 + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2)) * math.sin(dlon / 2)**2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return R * c

# e.g.
distance = haversine(36.7452, 3.0750, 36.5894, 2.4477)
distance


58.570295060336306
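
Because the search algorithms query the same distances many times, it can help to precompute them once. The following is a minimal sketch, assuming each attraction is a dict with hypothetical `name`, `lat` and `lon` keys (the actual field names in `attractions.json` may differ). `haversine` is repeated so the block runs on its own, and `build_distance_matrix` is an illustrative helper, not part of the project code.

```python
import math
from itertools import combinations

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    R = 6371  # Radius of the Earth in kilometers
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(dlon / 2) ** 2)
    return 2 * R * math.atan2(math.sqrt(a), math.sqrt(1 - a))

def build_distance_matrix(places):
    """Return {(name_a, name_b): km} for every ordered pair of distinct places."""
    dist = {}
    for a, b in combinations(places, 2):
        d = haversine(a["lat"], a["lon"], b["lat"], b["lon"])
        dist[(a["name"], b["name"])] = d
        dist[(b["name"], a["name"])] = d  # distances are symmetric
    return dist

# Placeholder names; coordinates are the two points from the example call above
sample = [
    {"name": "A", "lat": 36.7452, "lon": 3.0750},
    {"name": "B", "lat": 36.5894, "lon": 2.4477},
]
matrix = build_distance_matrix(sample)
print(matrix[("A", "B")])
```

For n attractions this stores n·(n−1) entries, which stays small for a dataset of this size.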

## User Preferences Simulation 
The application will take inputs (from the website interface) such as:
- **Starting location**: Algiers
- **Preferred attractions**: Museums, Historical Sites
- **Budget**: 1500 DZD
- **Hotel rating**: 3 stars
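
One possible way to carry these inputs through the notebook is a small container object. This is only a sketch: the class name `UserPreferences`, its field names, and the `matches` helper are assumptions made for illustration, and the category strings are taken from the dataset summary printed earlier.

```python
from dataclasses import dataclass, field

@dataclass
class UserPreferences:
    """Hypothetical container for the inputs listed above."""
    start_city: str = "Algiers"
    preferred_categories: list = field(
        default_factory=lambda: ["Museum", "Historical"]
    )
    budget_dzd: float = 1500
    min_hotel_stars: int = 3

    def matches(self, attraction):
        """True if the attraction's category is one the user asked for."""
        return attraction.get("category") in self.preferred_categories

prefs = UserPreferences()
print(prefs.matches({"category": "Museum"}))   # category the user prefers
print(prefs.matches({"category": "Nature"}))   # category the user did not list
```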


## Problem Formulation
**1. Define the Problem Components** 

We model the itinerary planning both as a **search problem** and as a **constraint satisfaction problem (CSP)**.  
- **States**: Partial itineraries (sequence of attractions).  
- **Actions**: Adding a new attraction to the itinerary.  
- **Goal**: A 7-day itinerary that maximizes user satisfaction and minimizes cost/time.  
- **Constraints**: No repeated attractions, budget limits, and proximity between consecutive days.

**2. State Representation**

Define the "state" as a sequence of attractions grouped by day.
Example State: [Day1: Algiers Citadel, Day2: Tassili n'Ajjer, ...]

In [14]:
class ItineraryState:
    def __init__(self, days, attractions):
        self.days = days  # List of attractions per day (e.g., [ [attr1], [attr2], ... ])
        self.attractions = attractions  # List of all attractions
    
    def __repr__(self):
        return f"Itinerary: {[day[0]['Name'] for day in self.days if len(day) > 0]}"

**3. Actions**

Actions are the valid attractions that can be added to the itinerary based on constraints.

In [16]:
def get_actions(state, user_prefs, max_daily_cost=10000):
    """
    Returns possible attractions to add to the itinerary.
    Constraints:
    - No repeated attractions.
    - Daily cost ≤ max_daily_cost.
    - Proximity (attractions must be within 200 km of the previous day's location).
    """
    # Mock implementation: assumes attraction dicts with 'Name', 'Latitude',
    # 'Longitude' and 'Entry_Cost' keys. user_prefs is not used yet.
    if len(state.days) >= 7:
        return []  # Itinerary is complete

    # Last scheduled attraction, or None on the first day
    last_attraction = state.days[-1][0] if state.days and state.days[-1] else None
    visited = {a['Name'] for day in state.days for a in day}

    valid_actions = []
    for attr in state.attractions:
        # Constraint 1: No repeats
        if attr['Name'] in visited:
            continue

        # Constraint 2: Proximity (if not first day)
        distance = 0.0  # No travel on the first day
        if last_attraction:
            distance = haversine(
                last_attraction['Latitude'], last_attraction['Longitude'],
                attr['Latitude'], attr['Longitude']
            )
            if distance > 200:  # 200 km max travel per day
                continue

        # Constraint 3: Cost (entry + estimated travel cost)
        travel_cost = distance * 50  # Assume 50 DZD/km
        total_cost = attr['Entry_Cost'] + travel_cost
        if total_cost > max_daily_cost:
            continue

        valid_actions.append(attr)

    return valid_actions
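
An action only names the attraction to add, so the search also needs a transition step that produces the next state. Below is a minimal sketch, assuming states are treated as immutable and each day holds a single attraction. `apply_action` is a hypothetical helper, and `ItineraryState` is repeated in reduced form so the block runs on its own.

```python
class ItineraryState:
    """Same shape as the class defined above, repeated for a standalone run."""
    def __init__(self, days, attractions):
        self.days = days                # List of per-day attraction lists
        self.attractions = attractions  # List of all attractions

def apply_action(state, attraction):
    """Return a new state with `attraction` scheduled on the next day.

    The original state is left untouched, so several branches of the
    search tree can safely share it.
    """
    return ItineraryState(state.days + [[attraction]], state.attractions)

# Names reused from the example state above; other fields omitted
catalog = [{"Name": "Algiers Citadel"}, {"Name": "Tassili n'Ajjer"}]
s0 = ItineraryState([], catalog)
s1 = apply_action(s0, catalog[0])
print(len(s0.days), len(s1.days))
```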

**4. Goal Test**

Check if the itinerary is complete (7 days) and meets all constraints.

In [18]:
def is_goal(state, user_budget=50000):
    """
    Goal test: 
    - 7-day itinerary.
    - Total cost ≤ user budget.
    """
    # Mock implementation: counts entry fees only (travel cost is ignored)
    # and assumes one attraction per day with an 'Entry_Cost' key.
    if len(state.days) != 7:
        return False

    total_cost = 0
    for day in state.days:
        if len(day) == 0:
            return False  # Incomplete day
        total_cost += day[0]['Entry_Cost']

    return total_cost <= user_budget

## Search Algorithm Implementations

In this project, we use a combination of **uninformed search algorithms** (BFS, DFS) and **informed search algorithms** (A\* Search, Hill Climbing) to generate the optimal itinerary based on cost, proximity, and user preferences.

Each search algorithm has its strengths and weaknesses. In this section, we will implement and compare their performance in solving the itinerary optimization problem.


### BFS
Explores all states at the current depth before moving on to the next level. It finds the path with the fewest steps, which is the cheapest path only when every step has the same cost.


In [8]:
from collections import deque

def bfs(start, goal, attractions, distance_function):  # extend as needed
    """
    Perform a BFS to find the shortest path from start to goal.
    
    Args:
    - start: Starting location (current location).
    - goal: Desired goal (final destination).
    - attractions: List of all attractions (places to visit).
    - distance_function: Function to calculate the distance between locations.
    - maybe more, depends on the one implementing the algorithm
    
    Returns:
    - List of attractions in the optimal path.
    """
    pass

# maybe other functions here
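
As an illustration of how BFS could be applied here, the sketch below searches over partial itineraries rather than over single locations. It assumes a precomputed distance lookup `dist` keyed by name pairs, a fixed number of days, and a 200 km limit between consecutive days. The function name `bfs_itinerary`, its parameters, and the toy distances are all hypothetical.

```python
from collections import deque

def bfs_itinerary(names, dist, start, n_days, max_km=200):
    """Return the first n_days-long itinerary found level by level, or None."""
    frontier = deque([(start,)])  # FIFO queue of partial itineraries
    while frontier:
        path = frontier.popleft()
        if len(path) == n_days:
            return list(path)
        for nxt in names:
            # Skip repeats and attractions too far from the previous day
            if nxt not in path and dist[(path[-1], nxt)] <= max_km:
                frontier.append(path + (nxt,))
    return None

# Toy symmetric distances in km (invented for the example)
pairs = {("A", "B"): 50, ("A", "C"): 300, ("A", "D"): 120,
         ("B", "C"): 90, ("B", "D"): 250, ("C", "D"): 60}
dist = {}
for (a, b), d in pairs.items():
    dist[(a, b)] = d
    dist[(b, a)] = d

result = bfs_itinerary(["A", "B", "C", "D"], dist, "A", 3)
print(result)
```

In this formulation every complete itinerary sits at the same depth, so BFS returns a feasible itinerary but says nothing about its cost.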

### DFS
Explores each path as deep as possible before backtracking. It uses little memory, but the first solution it finds is not necessarily the best one.

In [9]:
def dfs(start, goal, attractions, distance_function):  # extend as needed
    """
    Perform a DFS to find a path from start to goal.
    
    Args:
    - start: Starting location.
    - goal: Desired goal.
    - attractions: List of all attractions.
    - distance_function: Function to calculate distance.
    - maybe more, depends on the one implementing the algorithm
    
    Returns:
    - List of attractions in the path, or None if no path is found.
    """
    pass

# maybe other functions here
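
A recursive version of DFS over partial itineraries could look like the sketch below. It makes the same assumptions as before: a hypothetical `dist` lookup keyed by name pairs, a fixed number of days, and a 200 km limit between consecutive days. The name `dfs_itinerary` and the toy distances are illustrative only.

```python
def dfs_itinerary(names, dist, path, n_days, max_km=200):
    """Extend `path` depth-first; return a full itinerary or None."""
    if len(path) == n_days:
        return list(path)
    for nxt in names:
        # Skip repeats and attractions too far from the previous day
        if nxt in path or dist[(path[-1], nxt)] > max_km:
            continue
        found = dfs_itinerary(names, dist, path + [nxt], n_days, max_km)
        if found is not None:
            return found  # First complete itinerary wins
    return None  # Dead end: the caller backtracks

# Toy symmetric distances in km (invented for the example)
pairs = {("A", "B"): 50, ("A", "C"): 300, ("A", "D"): 120,
         ("B", "C"): 90, ("B", "D"): 250, ("C", "D"): 60}
dist = {}
for (a, b), d in pairs.items():
    dist[(a, b)] = d
    dist[(b, a)] = d

result = dfs_itinerary(["A", "B", "C", "D"], dist, ["A"], 4)
print(result)
```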

### A\*
Expands the state with the lowest f(n) = g(n) + h(n), where g(n) is the cost accumulated so far and h(n) is a heuristic estimate of the remaining cost to the goal.

In [10]:
import heapq

def a_star_search(start, goal, attractions, distance_function, heuristic_function):  # extend as needed
    """
    Perform A* search to find the optimal path from start to goal.
    
    Args:
    - start: Starting point.
    - goal: Goal point.
    - attractions: List of all attractions.
    - distance_function: Function to calculate distance between locations.
    - heuristic_function: Heuristic function to estimate the cost from a point to the goal.
    - maybe more, depends on the one implementing the algorithm
    
    Returns:
    - List of attractions in the optimal path.
    """
    pass
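
The sketch below shows one way A\* could minimize total travel distance over a fixed number of days. It is not the project's implementation. The heuristic chosen here, remaining legs multiplied by the shortest distance in the table, is an assumption; it never overestimates, so it is admissible. `a_star_itinerary`, `dist` and the toy distances are hypothetical.

```python
import heapq

def a_star_itinerary(names, dist, start, n_days, max_km=200):
    """Return (itinerary, total_km) with minimal total distance, or None."""
    min_edge = min(dist.values())  # Lower bound on the cost of any leg
    # Heap entries are (f, g, path) with f = g + h
    heap = [((n_days - 1) * min_edge, 0.0, (start,))]
    while heap:
        f, g, path = heapq.heappop(heap)
        if len(path) == n_days:
            return list(path), g  # Goal test on expansion keeps optimality
        for nxt in names:
            if nxt in path:
                continue
            step = dist[(path[-1], nxt)]
            if step > max_km:
                continue
            new_path = path + (nxt,)
            new_g = g + step
            h = (n_days - len(new_path)) * min_edge
            heapq.heappush(heap, (new_g + h, new_g, new_path))
    return None

# Toy symmetric distances in km (invented for the example)
pairs = {("A", "B"): 50, ("A", "C"): 300, ("A", "D"): 120,
         ("B", "C"): 90, ("B", "D"): 250, ("C", "D"): 60}
dist = {}
for (a, b), d in pairs.items():
    dist[(a, b)] = d
    dist[(b, a)] = d

result = a_star_itinerary(["A", "B", "C", "D"], dist, "A", 3)
print(result)
```

A heuristic that also accounts for entry fees or user ratings would need a cost function combining those terms, which this sketch leaves out.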


### Hill-Climbing
A simple greedy approach that evaluates only the immediate next state and chooses the best option available without considering future paths.

In [11]:
def hill_climb_search(start, attractions, distance_function, heuristic_function):  # extend as needed
    """
    Perform Hill Climbing to find a locally optimal path based on local evaluations.
    
    Args:
    - start: Starting location.
    - attractions: List of all attractions.
    - distance_function: Function to calculate distance.
    - heuristic_function: Function to evaluate the "goodness" of a path.
    - maybe more, depends on the one implementing the algorithm
    
    Returns:
    - List of attractions in the path, or None if no solution is found.
    """
    pass
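
A common variant of hill climbing starts from a complete itinerary and repeatedly applies the best improving change. The sketch below uses that variant, with "swap two days" as the neighborhood and total distance as the objective. Both choices are assumptions for illustration, and the proximity and budget constraints are left out to keep it short. `route_length`, `hill_climb` and the toy distances are hypothetical.

```python
from itertools import combinations

def route_length(route, dist):
    """Total distance of visiting the route in order."""
    return sum(dist[(a, b)] for a, b in zip(route, route[1:]))

def hill_climb(route, dist):
    """Swap pairs of days (start stays fixed) until no swap shortens the route."""
    current = list(route)
    while True:
        best, best_len = None, route_length(current, dist)
        for i, j in combinations(range(1, len(current)), 2):
            neighbor = current[:]
            neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
            length = route_length(neighbor, dist)
            if length < best_len:
                best, best_len = neighbor, length
        if best is None:
            return current, best_len  # Local optimum reached
        current = best

# Toy symmetric distances in km (invented for the example)
pairs = {("A", "B"): 50, ("A", "C"): 300, ("A", "D"): 120,
         ("B", "C"): 90, ("B", "D"): 250, ("C", "D"): 60}
dist = {}
for (a, b), d in pairs.items():
    dist[(a, b)] = d
    dist[(b, a)] = d

result = hill_climb(["A", "C", "B", "D"], dist)
print(result)
```

On larger inputs the result depends on the starting route, which is why random restarts are often added.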


## CSP Approach
We also model the itinerary planning as a **Constraint Satisfaction Problem (CSP)**. Each day of the trip is a variable, and the domain of each variable is the set of attractions that can be visited on that day. Constraints include:
  1. **One destination per day**: Each day gets exactly one destination.
  2. **Proximity and time**: Destinations must be scheduled in a way that minimizes travel time and adheres to time constraints.
  3. **Cost**: The total cost of the trip should be within the user’s budget.
  4. **Preferences**: Ensure that the user’s preferences for types of attractions are respected.


In [12]:
# Define constraints for the CSP
def check_constraints(itinerary, user_preferences):
    pass
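
To make the CSP view concrete, here is a minimal backtracking sketch in which days are assigned in order. It checks three constraints: no repeated attraction, at most 200 km between consecutive days, and total entry cost within the budget. Preference constraints are omitted. `solve_csp`, the `{name: entry_cost}` input shape, and the toy data are assumptions for illustration.

```python
def solve_csp(n_days, attractions, dist, budget, max_km=200):
    """attractions: {name: entry_cost}. Returns a list of names or None."""
    assignment = []  # assignment[i] is the attraction chosen for day i

    def consistent(name):
        if name in assignment:
            return False  # No repeated attractions
        if assignment and dist[(assignment[-1], name)] > max_km:
            return False  # Too far from the previous day
        spent = sum(attractions[a] for a in assignment)
        return spent + attractions[name] <= budget

    def backtrack():
        if len(assignment) == n_days:
            return True
        for name in attractions:
            if consistent(name):
                assignment.append(name)
                if backtrack():
                    return True
                assignment.pop()  # Undo and try the next value
        return False

    return list(assignment) if backtrack() else None

# Toy data (invented): entry costs in DZD and symmetric distances in km
costs = {"A": 200, "B": 500, "C": 0, "D": 300}
pairs = {("A", "B"): 50, ("A", "C"): 300, ("A", "D"): 120,
         ("B", "C"): 90, ("B", "D"): 250, ("C", "D"): 60}
dist = {}
for (a, b), d in pairs.items():
    dist[(a, b)] = d
    dist[(b, a)] = d

result = solve_csp(3, costs, dist, budget=600)
print(result)
```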


## Comparative Evaluation of Algorithms 

We will compare the performance of the following search algorithms:

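1. **BFS**: Finds an itinerary with the fewest steps, but memory and time grow quickly as the number of attractions increases.
2. **DFS**: Uses little memory and often reaches a solution quickly, but is not guaranteed to find the optimal one.
3. **A\***: Returns an optimal solution when the heuristic is admissible, and expands fewer states the more accurate the heuristic is.
4. **Hill Climbing**: Greedy and fast but may get stuck in local optima.
5. **CSP Approach**: Finds an assignment that satisfies all constraints; we compare its solution with those of the search-based methods.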

We will evaluate the following metrics:
- **Execution time**: How long each algorithm takes to generate the itinerary.
- **Path optimality**: How close the solution is to the optimal itinerary in terms of cost, proximity, and user satisfaction.
- **Complexity**: The computational complexity of each approach.


In [5]:
def evaluate_solution(itinerary, total_cost, time_taken):
    # Compare the total cost, time efficiency, and other factors
    pass  # To be implemented
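
Execution time can be measured with a small harness that runs each algorithm on the same input. The sketch below assumes every solver is a callable taking the same arguments. `benchmark` is a hypothetical helper, and the two stand-in solvers exist only to make the example runnable.

```python
import time

def benchmark(solvers, *args):
    """Run each solver once; return {name: (result, seconds)}."""
    report = {}
    for name, solver in solvers.items():
        t0 = time.perf_counter()
        result = solver(*args)
        report[name] = (result, time.perf_counter() - t0)
    return report

# Stand-in solvers; the real search functions would be passed instead
solvers = {
    "sorted": sorted,
    "reversed": lambda xs: list(reversed(xs)),
}
report = benchmark(solvers, [3, 1, 2])
for name, (result, seconds) in report.items():
    print(f"{name}: {result} in {seconds:.6f} s")
```

A single run is noisy, so averaging over several runs gives a fairer comparison.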


## Visualizations

We will include several visualizations to help understand the results:
- **Route Map**: Visualize the itinerary on a map.
- **Cost Breakdown**: Graph showing the total cost per destination.
- **Satisfaction**: Bar chart comparing user satisfaction for different search strategies.


In [6]:
import matplotlib.pyplot as plt

# Example plot for the cost breakdown
def plot_cost_breakdown(itinerary, attractions):
    costs = [attractions[a]['cost'] for a in itinerary]
    plt.bar(range(len(itinerary)), costs)
    plt.xlabel('Attraction')
    plt.ylabel('Cost')
    plt.title('Cost Breakdown')
    plt.show()


## Demo

In this section, we will showcase the working prototype of the **Touristic Tour Recommendation Application**. This demo will cover:
- The **interactive input** from the user (e.g., preferences, current location, etc.).
- Displaying the **optimized itinerary** generated by the selected search algorithm.
- Visualizations of the **travel route**, **cost breakdown**, and **satisfaction level**.

In [13]:
pass

## Conclusion

In this notebook, we explored different search algorithms (A\*, Hill Climbing, BFS, DFS) and a CSP approach to generate optimal itineraries for travelers in Algeria. 
We found that **.....** performed well in terms of solution quality, but it was more computationally expensive compared to **.....**. 
Future work could involve integrating real-time weather data and optimizing routes based on current traffic conditions.
