# Programming Geospatial_DATS_6450_81 Project: Group 3 

`SELECTING IDEAL HOTELS IN THE DC AREA`

**Project Overview:**

This project uses geospatial analysis with Python, leveraging ArcPy, to identify ideal hotels near metro stations based on user preferences for travel distance, price, and rating, while also highlighting nearby landmarks for each hotel.

Users provide inputs such as their travel distance limit, the importance of price and ratings, and a selected metro station. The program processes geospatial data using Manhattan distance and raster analysis to filter hotels within the specified range. It then employs multi-criteria evaluation (MCE) to score hotels based on user-defined priorities and ranks the top options. Finally, it lists nearby landmarks for each recommended hotel.


**Data Source**

Please keep the files in the following directory structure.

```
/Project_group3/
    └── Data/
        └── data.gdb
             └── Dc_raster_grid
             └── Historic_Landmarks_Points
             └── Hotels
             └── Metro_Stations_in_DC
    └── Project_Group3_code.ipynb
```

- Dc_raster_grid - A raster dataset created by converting the shapefile of Washington, DC, into a grid with a 100-meter cell size.
- Historic_Landmarks_Points - A point shapefile containing officially designated historic landmarks within the District of Columbia.
- Hotels - A point shapefile created from a CSV file, featuring hotel details such as name, price range, and rating.
- Metro_Stations_in_DC - A point shapefile representing the locations of metro stations across Washington, DC.
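
Before running the notebook, the layout above can be checked with a few lines of plain Python. This is a minimal sketch, not part of the project code: the helper name `missing_project_paths` is hypothetical, and it only checks the two items visible on disk (the geodatabase folder and the notebook), not the datasets inside `data.gdb`.

```python
from pathlib import Path


def missing_project_paths(project_root):
    """Return the expected project paths that are absent under project_root."""
    root = Path(project_root)
    expected = [
        root / "Data" / "data.gdb",          # a file geodatabase is a folder on disk
        root / "Project_Group3_code.ipynb",  # the notebook itself
    ]
    return [str(path) for path in expected if not path.exists()]


# Check the current working directory, which the notebook treats as the project root
missing = missing_project_paths(Path.cwd())
print("Layout looks complete." if not missing else f"Missing: {missing}")
```

An empty list means both items were found; the feature classes inside the geodatabase still need to be verified in ArcGIS.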

**User Inputs**

- Maximum Travel Distance: A numeric value (in meters) indicating how far the user is willing to travel from a selected metro station.
- Price Priority: A scale (1-3) setting how much weight price carries in the score. Because higher price ranges score higher, a larger value reflects greater tolerance for expensive hotels.
- Rating Priority: A scale (1-5) representing the importance of hotel ratings, with 5 being the most important.
- Metro Station Selection: The user should select a metro station in Washington, DC, as the focal point for finding hotels nearby.

Additionally, the program uses pre-loaded geospatial data (hotels, metro stations, landmarks, raster grids) for analysis.   

**Outputs**

- Top Hotel Recommendations: A list of up to five hotels ranked by their MCE scores, including details like price level, ratings, distance from the metro, and score.
- Nearby Landmarks: A list of landmarks located within a scaled search distance (0.8 times the user-defined travel distance) of each recommended hotel.
- Completion Notification: The program notifies the user when the analysis is complete and asks if they want to rerun it with new preferences.

**Detailed workflow involves:**

- Setting up the workspace and initializing variables.
- Loading the required data layers (hotels, metro stations, landmarks, raster grid).
- Capturing user preferences (metro station, maximum travel distance, price priority, and rating priority).
- Performing geospatial computations and filtering hotels based on the user preferences.

    - Creating a Euclidean distance raster from the selected metro station and keeping the cells within the specified distance.
    - Calculating the Manhattan distance from each remaining cell centroid to the metro station and selecting the cells whose Manhattan distance is within the user-defined limit.
    - Filtering hotels located within the selected cells.
    - Calculating the exact Manhattan distance for each hotel from the selected metro station inside the filtered raster area.
    - Calculating the Multi-Criteria Evaluation (MCE) score for each selected hotel based on user preferences for price and rating.
        
        - MCE score is calculated using a weighted sum of the normalized price and rating scores, with weights based on the user's preferences. 
        - If price_priority is higher, more weight is given to price, and if rating_priority is higher, more weight is given to ratings.
    - Getting the list of landmarks near each selected hotel using Euclidean distance, with a scaling factor applied to account for walking distance.
- Displaying the details of the top 5 hotels with the highest MCE scores and the landmarks near them.
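
The MCE scoring step in the workflow can be illustrated without ArcPy. The sketch below follows the same arithmetic as the notebook's `calculate_mce_score` (price range divided by 3, rating divided by 5, weights normalized to sum to 1); the function name `mce_score` and the three hotels are hypothetical and exist only for illustration.

```python
def mce_score(price_level, rating, price_priority, rating_priority):
    """Weighted sum of normalized price and rating, mirroring the notebook's formula."""
    total_weight = price_priority + rating_priority
    price_weight = price_priority / total_weight
    rating_weight = rating_priority / total_weight
    price_score = price_level / 3   # price range 1-3
    rating_score = rating / 5       # rating 1-5
    return price_score * price_weight + rating_score * rating_weight


# Hypothetical hotels: (name, price range, rating)
hotels = [("Hotel A", 3, 4.0), ("Hotel B", 1, 5.0), ("Hotel C", 2, 3.0)]

# Price priority 2 and rating priority 4 give weights of 1/3 and 2/3
ranked = sorted(hotels, key=lambda h: mce_score(h[1], h[2], 2, 4), reverse=True)
for name, price_level, rating in ranked:
    print(f"{name}: {mce_score(price_level, rating, 2, 4):.2f}")
```

With these inputs, Hotel A ranks first: a higher price range raises the score, so the price weight rewards expensive hotels rather than penalizing them.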

In [1]:
# importing packages
import arcpy
import pandas as pd
import numpy as np
import os
from typing import Dict, List, Tuple

In [2]:
class HotelSelector:
    def __init__(self, workspace_path: str):
        """Initialize the Hotel Selector with workspace and required datasets."""
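        # Set up the workspace
        arcpy.env.workspace = workspace_path
        arcpy.env.overwriteOutput = True
        # EucDistance, Con, and ExtractByMask require the Spatial Analyst extension
        arcpy.CheckOutExtension("Spatial")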
        
        # Class variables for data paths
        self.hotels_data = None
        self.metro_stations = None
        self.landmarks = None
        self.raster_grid = None

In [3]:
def load_data(self, hotels_path: str, metro_stations_path: str, landmarks_path: str, raster_path: str) -> None:
    """Load all required datasets."""
    try:
        # Load the datasets
        self.hotels_data = arcpy.MakeFeatureLayer_management(hotels_path, "hotels_layer")
        self.metro_stations = arcpy.MakeFeatureLayer_management(metro_stations_path, "metro_layer")
        self.landmarks = arcpy.MakeFeatureLayer_management(landmarks_path, "landmarks_layer")
        self.raster_grid = raster_path
        
        print("Data loaded successfully!")
        
    except arcpy.ExecuteError:
        print(f"Error loading data: {arcpy.GetMessages(2)}")
        raise

HotelSelector.load_data = load_data

In [4]:
def get_metro_station_options(self):
    """
        Retrieve and display available metro stations names.
    """
    try:
        metro_stations_list = []
        # Copy the metro stations into a separate layer for listing
        list_metro_layer = "list_metro_layer"
        arcpy.MakeFeatureLayer_management(self.metro_stations, list_metro_layer)
        # Retrieve metro station names
        with arcpy.da.SearchCursor(list_metro_layer, ["NAME"]) as cursor:
            for row in cursor:
                metro_stations_list.append(row[0])
        
        # Display the metro stations in a concise tabular format
        print("\n================== Available Metro Stations =======================")

        col_width = max(len(name) for name in metro_stations_list) + 2  # Calculate column width
        num_columns = 3  # Number of columns to display

        for i, station in enumerate(metro_stations_list, 1):
            print(f"{i}. {station:<{col_width}}", end="")
            if i % num_columns == 0:  # Start a new line after every num_columns entries
                print()
        if len(metro_stations_list) % num_columns != 0:
            print()  # Ensure the output ends with a newline
        
        print("======================================================================")
        
        return metro_stations_list
    
    except arcpy.ExecuteError:
        print(f"Error retrieving metro stations (get_metro_station_options function): {arcpy.GetMessages(2)}")
        raise
    
HotelSelector.get_metro_station_options = get_metro_station_options

`get_user_preferences` function:
- gathers user inputs for price priority, rating priority, maximum travel distance, and a selected metro station.
- returns a dictionary of user preferences.

A higher price value indicates greater tolerance for expensive hotels, and a higher rating value reflects a stronger preference for highly rated hotels.
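
The range checks described above can be factored into one small helper. This is a sketch under stated assumptions rather than the notebook's code: `parse_choice` is a hypothetical name, and it takes the raw text as an argument instead of calling `input()` so it can be tried outside an interactive session.

```python
def parse_choice(raw, low, high, label):
    """Convert raw text to an int within [low, high], or raise ValueError."""
    try:
        value = int(raw.strip())
    except ValueError:
        raise ValueError(f"{label} must be a whole number") from None
    if not low <= value <= high:
        raise ValueError(f"{label} must be between {low} and {high}")
    return value


# Hypothetical station list; the notebook reads the real names from the metro layer
stations = ["Takoma", "Friendship Heights", "Fort Totten"]

price_priority = parse_choice("2", 1, 3, "Price priority")
rating_priority = parse_choice("4", 1, 5, "Rating priority")
station = stations[parse_choice("3", 1, len(stations), "Station number") - 1]
print(price_priority, rating_priority, station)
```

Raising `ValueError` for every invalid entry lets a single `except ValueError` branch report the problem and restart the prompts.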

In [5]:
def get_user_preferences(self, metro_stations_list: List[str]):
    """
        Get user preferences with input validation.
    """
    while True:
        try:
            # Get distance preference
            distance = float(input("How far are you willing to travel from metro to get to a hotel? (in meters); Please consider at least 100 meters: "))
            if distance <= 0:
                raise ValueError("Distance must be positive")
            
            # Get price priority
            price_priority = int(input("How important is price to you (1-3, where 3 is very important)? "))
            if not 1 <= price_priority <= 3:
                raise ValueError("Price priority must be between 1 and 3")
            
            # Get rating priority
            rating_priority = int(input("How important is rating to you (1-5, where 5 is very important)? "))
            if not 1 <= rating_priority <= 5:
                raise ValueError("Rating priority must be between 1 and 5")
            
            # Get metro station; an invalid entry raises ValueError and restarts the prompts
            station_index = int(input("\nSelect a metro station by entering its number: ")) - 1
            if not 0 <= station_index < len(metro_stations_list):
                raise ValueError("Invalid station number")
            metro_station = metro_stations_list[station_index]

            # Verify metro station exists (apostrophes in the name are escaped for SQL)
            station_count = int(arcpy.GetCount_management(
                arcpy.SelectLayerByAttribute_management(
                    self.metro_stations,
                    "NEW_SELECTION",
                    "NAME = '{}'".format(metro_station.replace("'", "''"))
                )
            ).getOutput(0))
            
            if station_count == 0:
                raise ValueError("Metro station not found")
            
            return {
                "distance": distance,
                "price_priority": price_priority,
                "rating_priority": rating_priority,
                "metro_station": metro_station
            }
            
        except ValueError as e:
            print(f"Invalid input: {str(e)}. Please try again.")

HotelSelector.get_user_preferences = get_user_preferences

`select_hotels_raster_method` function:
- calculates a Euclidean distance raster around the selected metro station, using the cell size of the raster grid.
- keeps the raster cells within the user's preferred distance.
- converts those cells into points, where each point is a cell centroid, and stores each centroid's Manhattan distance to the station.
- uses the centroids whose Manhattan distance is within the preferred distance as a mask, converts the extracted cells into polygons, and buffers them slightly (10 meters) to include hotels near the boundary.
- returns the hotels located within the buffered polygon, along with the station coordinates.
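
The two-stage filter described above (Euclidean first, then Manhattan) can be sketched in plain Python. This is an illustration only: `cells_within_manhattan` is a hypothetical helper, and it assumes cell centroids sit on a regular grid aligned with the station, which the real raster does not guarantee.

```python
import math


def cells_within_manhattan(station_xy, limit, cell_size):
    """Return grid centroids whose Manhattan distance to the station is <= limit."""
    sx, sy = station_xy
    steps = int(limit // cell_size) + 1
    selected = []
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            cx, cy = sx + i * cell_size, sy + j * cell_size
            # Stage 1: Euclidean pre-filter (the distance raster and mask in the notebook)
            if math.hypot(cx - sx, cy - sy) > limit:
                continue
            # Stage 2: Manhattan distance check on the cell centroid
            if abs(cx - sx) + abs(cy - sy) <= limit:
                selected.append((cx, cy))
    return selected


cells = cells_within_manhattan((0, 0), limit=300, cell_size=100)
print(len(cells))           # number of centroids that pass both stages
print((200, 200) in cells)  # this corner cell passes stage 1 but fails stage 2
```

The Euclidean circle always contains the Manhattan diamond, so the first stage only narrows the candidates; diagonal cells such as (200, 200) are the ones the second stage removes.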

In [6]:
def select_hotels_raster_method(self, preferences: Dict):
    """
        Select hotels using raster-based spatial analysis method.
    """
    try:        
        # Select the target metro station
        metro_station_layer = "metro_station_layer"
        arcpy.MakeFeatureLayer_management(self.metro_stations, metro_station_layer)
        station_name = preferences['metro_station'].replace("'", "''")  # escape apostrophes for SQL
        arcpy.SelectLayerByAttribute_management(metro_station_layer, "NEW_SELECTION", f"Name = '{station_name}'")

        # Calculate cell size
        cell_size = arcpy.Describe(self.raster_grid).meanCellWidth

        # Calculate Euclidean distance raster
        euclidean_distance_raster = arcpy.sa.EucDistance(
            metro_station_layer, 
            cell_size=cell_size
        )
        
        euclidean_distance_raster.save("euclidean_distance_raster")

        # Mask raster cells within user's distance
        selected_cells = arcpy.sa.Con(euclidean_distance_raster <= preferences['distance'], 1)
        selected_cells.save("selected_cells")

        # Convert raster to points, where points will be cell centroid.
        cell_centroids = "cell_centroids"
        arcpy.RasterToPoint_conversion("selected_cells", cell_centroids, "VALUE")

        # Check if any points were selected
        if int(arcpy.GetCount_management(cell_centroids)[0]) == 0:
            print("\nThe specified distance is too short. Consider increasing the distance.")
            return None, None
        
        # Get metro station coordinates
        metro_station_coords = None
        with arcpy.da.SearchCursor(metro_station_layer, ["SHAPE@XY"]) as cursor:
            for row in cursor:
                metro_station_coords = row[0]

        # Add Manhattan distance field
        arcpy.AddField_management(cell_centroids, "Manhattan_Distance", "DOUBLE")

        # Calculate Manhattan distance from each cell centroid to given metro station and store it.
        with arcpy.da.UpdateCursor(cell_centroids, ["SHAPE@XY", "Manhattan_Distance"]) as cursor:
            for row in cursor:
                cell_x, cell_y = row[0]
                manhattan_distance = abs(cell_x - metro_station_coords[0]) + abs(cell_y - metro_station_coords[1])
                row[1] = manhattan_distance
                cursor.updateRow(row)

        # Select points with Manhattan distance <= user preferred distance.
        selected_points = "selected_points"
        arcpy.MakeFeatureLayer_management(cell_centroids, "selected_points_layer")
        arcpy.SelectLayerByAttribute_management("selected_points_layer", "NEW_SELECTION", f"Manhattan_Distance <= {preferences['distance']}")
        arcpy.CopyFeatures_management("selected_points_layer", selected_points)

        # Check if any selected points exist
        if int(arcpy.GetCount_management(selected_points)[0]) == 0:
            print("\nThe specified distance is too short. Consider increasing the distance.")
            return None, None

        # Extract raster by mask from the selected points.
        extracted_raster = "extracted_raster"
        arcpy.sa.ExtractByMask(selected_cells, selected_points).save(extracted_raster)

        # Convert raster to polygon
        raster_polygon = "raster_polygon"
        arcpy.RasterToPolygon_conversion(extracted_raster, raster_polygon, "NO_SIMPLIFY")

        # Buffer the polygon created from the raster to include hotels close to the boundary.
        buffer_distance = 10
        buffered_polygon = "buffered_polygon"
        arcpy.Buffer_analysis(
            in_features=raster_polygon,
            out_feature_class=buffered_polygon,
            buffer_distance_or_field=f"{buffer_distance} Meters",
            line_side="FULL",
            dissolve_option="NONE"
        )

        # Select hotels inside the buffered polygon
        selected_hotels_layer = "selected_hotels_layer"
        arcpy.MakeFeatureLayer_management(self.hotels_data, selected_hotels_layer)
        arcpy.SelectLayerByLocation_management(selected_hotels_layer, "INTERSECT", buffered_polygon)

        return selected_hotels_layer, metro_station_coords

    except arcpy.ExecuteError:
        print(f"Error in raster-based hotel selection (select_hotels_raster_method function): {arcpy.GetMessages(2)}")
        raise

HotelSelector.select_hotels_raster_method = select_hotels_raster_method            

In [7]:
def calculate_manhattan_distance(self, hotel_layer: str, metro_station_coords):
    """
        Calculate the Manhattan distance between each selected hotel and the metro station.
    """
    try:        
        # Add field for Manhattan distance
        arcpy.AddField_management(hotel_layer, "manhatDist", "DOUBLE")
        
        # Get station coordinates
        station_x, station_y = metro_station_coords[0], metro_station_coords[1]
        
        # Calculate Manhattan distance for each selected hotel
        with arcpy.da.UpdateCursor(hotel_layer, ["SHAPE@XY", "manhatDist"]) as cursor:
            for row in cursor:
                hotel_x, hotel_y = row[0]
                manhattan_dist = abs(hotel_x - station_x) + abs(hotel_y - station_y)
                row[1] = manhattan_dist
                cursor.updateRow(row)
                
    except arcpy.ExecuteError:
        print(f"Error calculating Manhattan distance (calculate_manhattan_distance function): {arcpy.GetMessages(2)}")
        raise

HotelSelector.calculate_manhattan_distance = calculate_manhattan_distance  

In [8]:
def calculate_mce_score(self, preferences: Dict):
    """
        Calculate the MCE (Multi-Criteria Evaluation) score for each selected hotel.
    """
    try:
        # Add field for MCE score
        arcpy.AddField_management("hotels_layer", "mce_score", "DOUBLE")
        
        # Normalize weights
        total_weight = preferences['price_priority'] + preferences['rating_priority']
        price_weight = preferences['price_priority'] / total_weight
        rating_weight = preferences['rating_priority'] / total_weight
        
        # Calculate MCE score
        with arcpy.da.UpdateCursor("selected_hotels_layer", ["pricing", "ratings", "mce_score"]) as cursor:
            for row in cursor:
                # Normalize price: a higher price range yields a higher score
                price_score = row[0] / 3  # Price range (1-3) scaled to at most 1
                # Normalize rating: a higher rating is better
                rating_score = row[1] / 5  # Rating (1-5) scaled to at most 1
                
                # Calculates a composite score for hotel selection
                mce_score = (price_score * price_weight + rating_score * rating_weight)
                # updating score to the mce_score field.
                row[2] = mce_score
                cursor.updateRow(row)
                
    except arcpy.ExecuteError:
        print(f"Error calculating MCE score (calculate_mce_score function): {arcpy.GetMessages(2)}")
        raise

HotelSelector.calculate_mce_score = calculate_mce_score 

`find_nearby_landmarks` function:
- applies a scaling factor (0.8) to the user's travel distance to get the landmark search distance.
- selects the landmarks within that distance of each recommended hotel using SelectLayerByLocation_management and retrieves their names.
- returns a dictionary where the keys are hotel names and the values are lists of nearby landmarks.

While hotels are identified using raster-based filtering and Manhattan distance for precise proximity analysis, nearby landmarks are found using Euclidean distance adjusted by a scaling factor to account for realistic walking distances.
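
The scaled Euclidean search can be shown with plain coordinates. This is a minimal sketch assuming landmarks are given as a name-to-coordinate dictionary; `landmarks_near` and the sample landmarks are hypothetical, while the 0.8 scaling factor comes from the notebook.

```python
import math


def landmarks_near(hotel_xy, landmarks, distance, scaling_factor=0.8):
    """Return names of landmarks within distance * scaling_factor of the hotel."""
    radius = distance * scaling_factor
    hx, hy = hotel_xy
    return [
        name
        for name, (lx, ly) in landmarks.items()
        if math.hypot(lx - hx, ly - hy) <= radius  # straight-line distance
    ]


# Hypothetical landmark coordinates in meters, relative to the hotel
sample_landmarks = {
    "Landmark A": (300, 0),
    "Landmark B": (0, 450),
    "Landmark C": (100, 100),
}

# A 500 m travel distance becomes a 400 m search radius
print(landmarks_near((0, 0), sample_landmarks, distance=500))
```

Landmark B lies within the 500-meter travel distance but outside the scaled 400-meter radius, so it is left out.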

In [9]:
def find_nearby_landmarks(self, hotel_layer: str, distance: float):
    """
        Find landmarks near selected hotels.
    """
    landmarks_dict = {}
    try:
        scaling_factor = 0.8 
        # For each hotel, find nearby landmarks
        with arcpy.da.SearchCursor(hotel_layer, ["NAME", "SHAPE@"]) as hotels:
            # Loop through each hotel in the feature layer
            for hotel in hotels:   
                hotel_name = hotel[0]    # Extract the hotel name 
                hotel_geometry = hotel[1]  # Extract the hotel's geometry 
                
                # Scale the user's travel distance by 0.8 to get a shorter search radius
                # that accounts for walking distance.
                adjusted_distance = distance * scaling_factor

                # Select landmarks within distance
                nearby_landmarks = arcpy.SelectLayerByLocation_management(
                    self.landmarks,
                    "WITHIN_A_DISTANCE",
                    hotel_geometry,  # Selection features (target hotel)
                    adjusted_distance,  # Distance with scaling factor applied
                    "NEW_SELECTION"
                )
                
                # Get landmark names
                landmarks_list = []
                field_name = "NAME"
                # converting featureclass to numpy array to get all the values of specified field.
                table_array = arcpy.da.FeatureClassToNumPyArray(nearby_landmarks, [field_name]) 
                landmarks_list = table_array[field_name].tolist()
                        
                landmarks_dict[hotel_name] = landmarks_list
                
        return landmarks_dict
        
    except arcpy.ExecuteError:
        print(f"Error finding nearby landmarks (find_nearby_landmarks function): {arcpy.GetMessages(2)}")
        raise

HotelSelector.find_nearby_landmarks = find_nearby_landmarks 

In [10]:
def display_results(self, hotels_in_raster, landmarks_dict: Dict[str, List[str]]):
    """
        Display the results in a formatted, user-friendly manner
    """
    try:
        print("\nTop Hotels Based on Your Preferences:")
        print("-" * 120)
        
        # Sort hotels by MCE score
        arcpy.MakeFeatureLayer_management(hotels_in_raster, "hotels_layer2")
        temp_table = "in_memory/sorted_hotels"
        arcpy.Sort_management("hotels_layer2", temp_table, [["mce_score", "DESCENDING"]])

        # Displaying detail information about top 5 hotels
        with arcpy.da.SearchCursor(temp_table, ["NAME", "pricing", "ratings", "manhatDist", "mce_score"]) as cursor:
            print(f"{'Rank':<5} {'Hotel Name':<50} {'Price Level':<15} {'Rating':<10} {'Manhattan Distance (m)':<23} {'MCE Score'}")
            print("-" * 120)
            for i, row in enumerate(cursor, 1):
                if i > 5:  # Show top 5 hotels
                    break

                price_level = "$" * row[1]  # representing price range with "$"
                print(f"{i:<5} {row[0]:<50} {price_level:<15} {row[2]:<10} {row[3]:<23.1f} {row[4]:<10.2f}")

                # Display nearby landmarks
                if row[0] in landmarks_dict:
                    print(f"\n   Nearby Landmarks:")
                    for landmark in landmarks_dict[row[0]]:
                        print(f"   - {landmark}")
                print("-" * 120)
                
    except arcpy.ExecuteError:
        print(f"Error displaying results (display_results function): {arcpy.GetMessages(2)}")
        raise

HotelSelector.display_results = display_results 

In [11]:
def cleanup_temp_files(self):
    """
        Removes temporary files and layers created during analysis.
    """
    try:
        # Temporary datasets and layers created during the analysis
        temp_files = [
            "euclidean_distance_raster", "metro_station_layer", "selected_cells",
            "cell_centroids", "selected_points_layer", "selected_points",
            "extracted_raster", "raster_polygon", "buffered_polygon",
            "selected_hotels_layer",
        ]
        # Intermediate outputs left behind by the Euclidean distance tool, if any
        eucdist_files = arcpy.ListFiles("EucDist_Metr*") or []
        # Delete every temporary item that still exists
        for temp_file in eucdist_files + temp_files:
            if arcpy.Exists(temp_file):
                arcpy.Delete_management(temp_file)
        
        # Clean up in_memory workspace
        arcpy.Delete_management("in_memory")
        
        print("\nTemporary files cleaned up successfully")
        
    except arcpy.ExecuteError as e:
        print(f"Error during cleanup (cleanup_temp_files function): {str(e)}")

HotelSelector.cleanup_temp_files = cleanup_temp_files 

In [12]:
def run_search(self):
    """
        Main method to run the hotel selection workflow.
    """
    # Get metro station options
    metro_station_list = self.get_metro_station_options()
    # Allows multiple iterations of hotel search
    while True:
        try:            
            # Get user preferences
            preferences = self.get_user_preferences(metro_station_list)
            
            # Select hotels using raster method
            hotels_in_raster, metro_station_coords = self.select_hotels_raster_method(preferences)

            # Check whether a hotel layer was returned.
            if not hotels_in_raster:
                print("Please adjust your preferences and try again.")
                repeat = input("Do you want to try again? Input 'yes' or 'no': ").lower()
                if repeat != 'yes':
                    break
                continue
            else:
                # Check if hotels_in_raster contains any data
                hotel_count = int(arcpy.management.GetCount(hotels_in_raster)[0])
                if hotel_count == 0:
                    print("No hotels were found within the specified distance. Please adjust your preferences and try again.")
                    repeat = input("Do you want to try again? Input 'yes' or 'no': ").lower()
                    if repeat != 'yes':
                        break
                    continue

            # Calculate Manhattan distance
            self.calculate_manhattan_distance(hotels_in_raster, metro_station_coords)
            
            # Calculate MCE score
            self.calculate_mce_score(preferences)
            
            # Find nearby landmarks
            landmarks_dict = self.find_nearby_landmarks(hotels_in_raster, preferences['distance'])
            
            # Display results
            self.display_results(hotels_in_raster, landmarks_dict)
            
            # Ask if user wants to try again
            repeat = input("\nDo you want to try again? Input 'yes' or 'no': ").lower()

            # Clean up temporary files from this run
            self.cleanup_temp_files()

            if repeat != 'yes':
                break
                
        except Exception as e:
            print(f"An error occurred: {str(e)}")
            repeat = input("\nAn error occurred. Do you want to try again? Input 'yes' or 'no': ").lower()
            if repeat != 'yes':
                break

HotelSelector.run_search = run_search 

In [13]:
def main():

    current_dir = os.getcwd()
    workspace_path = os.path.join(current_dir, "Data", "data.gdb")

    # Initialize hotel selector
    selector = HotelSelector(workspace_path)
    
    # Loads data
    selector.load_data(
        hotels_path="Hotels",
        metro_stations_path="Metro_Stations_in_DC",
        landmarks_path="Historic_Landmarks_Points",
        raster_path = "Dc_raster_grid"
    )

    # Runs the analysis
    selector.run_search()

    # List of feature layers to clean up
    layers_to_clean = ["hotels_layer", "metro_layer", "landmarks_layer"]
    
    # Clean up feature layers
    for layer in layers_to_clean:
        if arcpy.Exists(layer):
            arcpy.Delete_management(layer)

    print("Script completed!")

if __name__ == "__main__":
    main()

Data loaded successfully!

1. Takoma                                        2. Friendship Heights                            3. Fort Totten                                   
4. Tenleytown-AU                                 5. Van Ness-UDC                                  6. Georgia Ave Petworth                          
7. Cleveland Park                                8. Brookland-CUA                                 9. Columbia Heights                              
10. Woodley Park-Zoo Adams Morgan                 11. Rhode Island Ave                              12. U St/African-Amer Civil War Memorial/Cardozo  
13. Shaw-Howard Univ                              14. Dupont Circle                                 15. Deanwood                                      
16. NoMa - Gallaudet U                            17. Mt Vernon Sq - 7th St Convention Center       18. Farragut North                                
19. McPherson Sq                                  20. Farragut West         

Steps to Execute:

- Click the Run All button at the top.
- View the metro station numbers in the output.
- Provide inputs to the questions asked.

    `Sample User Input:`

    - How far are you willing to travel from metro to get to a hotel? (in meters): `500`
    - How important is price to you (1-3, where 3 is very important)?: `2`
    - How important is rating to you (1-5, where 5 is very important)?: `4`
    - Select a metro station by entering its number: `21`
    - Do you want to try again? Input 'yes' or 'no': `no`
    
    <br>
- To end, type 'no' when prompted to try again.
- The program takes some time to end because the temporary files are cleaned up first.
- Click the text editor link at the end to view the entire output.

` Output of sample input:`

[Output_doc](<Sample_Output_of_Prog_geo project.docx>)

**Note:**

Price and Rating Values:
The data uses randomly assigned price and rating values, so there is no guarantee that higher prices will correlate with better ratings.
- $ (range 1), $$ (range 2), $$$ (range 3)

Buffer on Raster Polygon:
A small buffer is added to the raster polygon to include corner hotels that might otherwise be excluded due to partial overlaps.
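
The effect of the buffer can be illustrated for a single square cell. The sketch below is not the notebook's method: it replaces `Buffer_analysis` with a direct point-to-rectangle distance test, and `point_within_buffered_cell` is a hypothetical name. The 100-meter cell size and 10-meter buffer match the values used in the project.

```python
import math


def point_within_buffered_cell(point_xy, cell_center_xy, cell_size, buffer_distance=0.0):
    """True if the point lies inside the square cell or within buffer_distance of it."""
    px, py = point_xy
    cx, cy = cell_center_xy
    half = cell_size / 2
    # Distance from the point to the cell along each axis (0 when inside that extent)
    dx = max(abs(px - cx) - half, 0.0)
    dy = max(abs(py - cy) - half, 0.0)
    return math.hypot(dx, dy) <= buffer_distance


hotel = (55, 0)  # hypothetical hotel 5 m outside a 100 m cell centered at the origin
print(point_within_buffered_cell(hotel, (0, 0), 100))       # no buffer
print(point_within_buffered_cell(hotel, (0, 0), 100, 10))   # 10 m buffer
```

A hotel just past the cell edge is missed without the buffer and captured with it, which is the situation the note describes.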

**Key Geospatial Techniques Used**

- Raster analysis
- Euclidean and Manhattan distance calculations
- Spatial selection
- Multi-criteria evaluation
- Landmark proximity analysis