In [12]:
from IPython.display import display, Markdown
import pandas as pd
import requests

In [14]:
# Load the clustered Airbnb listings (path is relative to this notebook)
df = pd.read_csv("../ML-Exam/data/clustered_airbnb.csv")
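
The cells below read the columns `City`, `cluster`, `realSum`, `bedrooms`, `dist`, `metro_dist`, and `guest_satisfaction_overall`. A minimal sketch of an early check for them, assuming a hypothetical helper `missing_columns` (not part of the original notebook) and a small stand-in frame in place of the real CSV:

```python
import pandas as pd

# Columns referenced by format_listings and the recommend_* functions.
REQUIRED_COLUMNS = [
    "City", "cluster", "realSum", "bedrooms",
    "dist", "metro_dist", "guest_satisfaction_overall",
]

def missing_columns(frame: pd.DataFrame, required=REQUIRED_COLUMNS) -> list:
    """Return the required column names that are absent from the frame."""
    return [name for name in required if name not in frame.columns]

# Stand-in data with illustrative values, deliberately lacking 'metro_dist'.
sample = pd.DataFrame({
    "City": ["Athens"], "cluster": [3], "realSum": [69.6],
    "bedrooms": [1], "dist": [1.4], "guest_satisfaction_overall": [100],
})
print(missing_columns(sample))
```

Running the same check on the loaded `df` would surface a renamed or missing column before any prompt is built.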

In [36]:
# Define the LLaMA 3 client
class Llama3Client:
    def __init__(self, model="llama3", host="http://localhost:11434"):
        self.model = model
        self.api_url = f"{host}/api/chat"

    def ask(self, prompt: str) -> str:
        payload = {
            "model": self.model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False
        }
        try:
            # A timeout keeps the notebook from hanging if the local server is unreachable
            response = requests.post(self.api_url, json=payload, timeout=120)
            response.raise_for_status()
            return response.json()['message']['content'].strip()
        except requests.RequestException as e:
            return f"Error: {e}"

# Format listings for display in the prompt
def format_listings(listings_df, nights):
    listings = []
    for _, row in listings_df.iterrows():
        total_price = row['realSum'] * nights
        listing = (
            f"{row['City']} (Cluster {row['cluster']}) - €{row['realSum']}/night, "
            f"{row['bedrooms']} bedrooms, "
            f"{row['dist']}km to center, "
            f"{row['metro_dist']}km to metro, "
            f"{row['guest_satisfaction_overall']} guest satisfaction, "
            f"Total: €{total_price:.2f}"
        )
        listings.append(listing)
    return "\n".join(listings)
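
If `realSum` or the distance columns carry more decimals than are useful in a prompt, the numbers can be rounded while formatting. The following is a sketch of such a variant, not the notebook's own function; the name `format_listings_rounded` and the sample values are assumptions for illustration:

```python
import pandas as pd

def format_listings_rounded(listings_df, nights):
    """Like format_listings, but with fixed decimal places for every number."""
    lines = []
    for _, row in listings_df.iterrows():
        nightly = float(row["realSum"])
        # The total uses the unrounded nightly rate, so it can differ by a few
        # cents from (rounded rate * nights).
        total_price = nightly * nights
        lines.append(
            f"{row['City']} (Cluster {row['cluster']}) - €{nightly:.2f}/night, "
            f"{row['bedrooms']} bedrooms, "
            f"{float(row['dist']):.1f}km to center, "
            f"{float(row['metro_dist']):.1f}km to metro, "
            f"{float(row['guest_satisfaction_overall']):.0f} guest satisfaction, "
            f"Total: €{total_price:.2f}"
        )
    return "\n".join(lines)

# Illustrative row with made-up, overly precise values.
sample = pd.DataFrame({
    "City": ["Athens"], "cluster": [3], "realSum": [69.604321],
    "bedrooms": [1], "dist": [1.4127], "metro_dist": [0.5873],
    "guest_satisfaction_overall": [100.0],
})
text = format_listings_rounded(sample, nights=5)
print(text)
```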

In [38]:
# 1. Function for best value (all cities) using format_listings
def recommend_best_value(df, budget, nights=5, max_results=5):
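    # Add the total on a copy so the caller's DataFrame is not modified
    candidates = df.assign(total_price=df['realSum'] * nights)
    filtered = candidates[candidates['total_price'] <= budget].sort_values(by='guest_satisfaction_overall', ascending=False).head(max_results)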

    if filtered.empty:
        return "No listings found within your budget."
    
    context = format_listings(filtered, nights)
    prompt = f"""
You are a travel assistant helping users choose the best Airbnb option from clustered data.

The user has a budget of €{budget} for {nights} nights. Below are some available listings:

{context}

For each listing, consider:
- Total price
- Distance to city center and metro
- Number of bedrooms
- Guest satisfaction
- Cluster number

Which listing would you recommend and why? Include all these aspects in your answer.
"""

    client = Llama3Client()
    return client.ask(prompt)

nights = 5
budget = 500

response_1 = recommend_best_value(df, budget, nights)

In [39]:
# Display results
display(Markdown("### Recommendation (Best Value Overall):"))
display(Markdown(response_1))

### Recommendation (Best Value Overall):

After analyzing the available listings, I would recommend the Athens (Cluster 3) option.

Here's a breakdown of the reasons:

* **Total price**: With a budget of €500 for 5 nights, this option fits perfectly within that limit, with a total cost of €348.00.
* **Distance to city center and metro**: The listing is only 1.4km away from the city center and 0.6km away from the metro, making it a convenient location for exploring Athens.
* **Number of bedrooms**: It has 1 bedroom, which should be sufficient for most travelers.
* **Guest satisfaction**: All listings have a guest satisfaction rating of 100%, indicating that previous guests were extremely happy with their stay.

The only reason I wouldn't choose this option is because it's part of Cluster 3. If I had more information about the specific cluster or neighborhood, I might be able to make a more informed decision. However, since there are no significant differences in terms of total price, distance, number of bedrooms, and guest satisfaction among the different clusters, I would still recommend this option.

It's worth noting that the other options are not significantly better than this one, but they do have some drawbacks. For example, Lisbon (Cluster 1) is farther from the city center and metro, while Athens (Cluster 2) has a slightly higher price point. Budapest (Cluster 4) is the farthest from the city center and metro, making it less convenient for exploring.

Overall, I believe that the Athens (Cluster 3) option provides the best balance of factors within your budget.
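
The response above notes that every listing shown has a guest satisfaction of 100, so sorting on satisfaction alone leaves their relative order arbitrary. A minimal sketch of one way to break such ties, assuming lower total price and then shorter distance to the center are acceptable tie-breakers (that ordering, the helper name `rank_listings`, and the sample values are assumptions, not part of the original notebook):

```python
import pandas as pd

def rank_listings(frame, budget, nights=5, max_results=5):
    """Filter by budget, then rank by satisfaction, price, and distance."""
    candidates = frame.assign(total_price=frame["realSum"] * nights)
    within_budget = candidates[candidates["total_price"] <= budget]
    return within_budget.sort_values(
        by=["guest_satisfaction_overall", "total_price", "dist"],
        ascending=[False, True, True],  # best rating first, then cheapest, then closest
    ).head(max_results)

# Illustrative rows; every listing ties on satisfaction.
sample = pd.DataFrame({
    "City": ["Lisbon", "Athens", "Athens", "Budapest"],
    "realSum": [88.18, 69.60, 78.97, 120.00],
    "dist": [2.3, 1.4, 1.2, 11.9],
    "guest_satisfaction_overall": [100, 100, 100, 100],
})
ranked = rank_listings(sample, budget=500)
print(ranked[["City", "total_price"]].to_string(index=False))
```

With these rows the Budapest listing falls outside the €500 budget and the remaining three are ordered by total price.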

In [44]:
# 2. Function for a specific city
def recommend_in_city(df, city, budget, nights=5, max_results=5):
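    # Add the total on a copy so the caller's DataFrame is not modified
    candidates = df.assign(total_price=df['realSum'] * nights)
    filtered = candidates[(candidates['City'].str.lower() == city.lower()) & (candidates['total_price'] <= budget)]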
    filtered = filtered.sort_values(by='guest_satisfaction_overall', ascending=False).head(max_results)

    if filtered.empty:
        return f"No listings found in {city.title()} within your budget."

    context = format_listings(filtered, nights)
    prompt = f"""
You are a travel assistant helping users choose the best Airbnb option in {city.title()} from clustered data.

The user has a budget of €{budget} for {nights} nights. Below are some available listings in {city.title()}:

{context}

For each listing, consider:
- Total price
- Distance to city center and metro
- Number of bedrooms
- Guest satisfaction
- Cluster number

Based on value, distance, and cluster, which listing would you recommend and why?
"""
    client = Llama3Client()
    return client.ask(prompt)

# Example usage (you can replace these with input() calls or widgets in a notebook)
nights = 5
budget = 500
city = "Berlin"

response_2 = recommend_in_city(df, city, budget, nights)

display(Markdown(f"### Recommendation (City: {city.title()}):"))
display(Markdown(response_2))


### Recommendation (City: Berlin):

After analyzing the data, I'd like to recommend Berlin (Cluster 2) - €85.77/night, 1 bedroom, 6.5km to center, 0.3km to metro, 100 guest satisfaction, Total: €428.85.

Here's why:

* **Total price**: This option falls within the user's budget of €500 for 5 nights.
* **Distance to city center and metro**: The listing is relatively close to the city center (6.5km) and has a short distance to the metro (0.3km), making it convenient for exploring Berlin.
* **Number of bedrooms**: This option offers one bedroom, which should provide sufficient space for the user.
* **Guest satisfaction**: The guest satisfaction rating is 100%, indicating that previous guests have been very satisfied with their stay.

The only slightly negative aspect is the price, which is slightly higher than some of the other options. However, considering the overall value, distance, and guest satisfaction, I believe this option provides the best combination for the user.

It's worth noting that Berlin (Cluster 4) - €84.84/night, 1 bedrooms, 16.4km to center, 4.1km to metro, 100 guest satisfaction, Total: €424.20 is another viable option, but it has a longer distance to the city center and metro compared to the recommended listing.

Overall, I think Berlin (Cluster 2) - €85.77/night, 1 bedroom, 6.5km to center, 0.3km to metro, 100 guest satisfaction, Total: €428.85 is an excellent choice for this user.
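
Because the model restates prices in its answer, the quoted totals are worth checking against the nightly rates. A small sketch that uses only the figures quoted in the response above (the variable names are illustrative):

```python
# (label, nightly rate in EUR, total quoted in the response)
quoted = [
    ("Berlin (Cluster 2)", 85.77, 428.85),
    ("Berlin (Cluster 4)", 84.84, 424.20),
]
nights = 5
budget = 500

checks = []
for label, nightly, quoted_total in quoted:
    computed = round(nightly * nights, 2)
    total_matches = abs(computed - quoted_total) < 0.005  # tolerate float rounding
    within_budget = computed <= budget
    checks.append((label, total_matches, within_budget))
    print(f"{label}: computed €{computed:.2f}, matches={total_matches}, within budget={within_budget}")
```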

In [18]:
# Updated function for best value using .to_string()
def recommend_best_value(df, budget, nights=5, max_results=5):
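    # Add the total on a copy so the caller's DataFrame is not modified
    candidates = df.assign(total_price=df['realSum'] * nights)
    filtered = candidates[candidates['total_price'] <= budget].sort_values(by='guest_satisfaction_overall', ascending=False).head(max_results)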

    if filtered.empty:
        return "No listings found within your budget."
    
    context = filtered.to_string(index=False)

    prompt = f"""
You are a travel assistant helping users choose the best Airbnb option from clustered data.

The user has a budget of €{budget} for {nights} nights. Below are some available listings as a table:

{context}

Please evaluate each listing and recommend the best one. Be sure to consider:
- Total price
- Distance to city center and metro (dist & metro_dist)
- Number of bedrooms (bedrooms)
- Guest Satisfaction (guest_satisfaction_overall)
- Cluster

Explain your reasoning clearly using all available details and remember to mention which city the airbnb is located in.
"""
    client = Llama3Client()
    return client.ask(prompt)

In [20]:
response_1 = recommend_best_value(df, budget=500, nights=5)
display(Markdown(response_1))


Based on the given data, I will evaluate each listing and recommend the best one considering total price, distance to city center and metro, number of bedrooms, guest satisfaction overall, and cluster.

**Listing 1:**

* ID: 2267
* City: Athens
* Total Price: €348.00 (€69.60 per night for 5 nights)
* Distance to city center: 1.4 km
* Metro distance: 0.6 km
* Bedrooms: 1
* Guest satisfaction overall: 100%
* Cluster: 3

This listing is a good option considering the total price and distance to the city center and metro. However, it's a small space with only one bedroom.

**Listing 2:**

* ID: 2046
* City: Lisbon
* Total Price: €440.90 (€88.18 per night for 5 nights)
* Distance to city center: 2.3 km
* Metro distance: Not available
* Bedrooms: 1
* Guest satisfaction overall: 100%
* Cluster: 1

This listing is more expensive than the first one, and its distance to the city center is greater. Additionally, there's no information about metro distances.

**Listing 3:**

* ID: 2338
* City: Athens
* Total Price: €394.85 (€78.97 per night for 5 nights)
* Distance to city center: 1.2 km
* Metro distance: Not available
* Bedrooms: 1
* Guest satisfaction overall: 100%
* Cluster: 2

This listing is a good option considering the total price and distance to the city center. However, there's no information about metro distances.

**Listing 4:**

* ID: 1934
* City: Budapest
* Total Price: €405.35 (€81.07 per night for 5 nights)
* Distance to city center: 11.9 km (outside the city center)
* Metro distance: Not available
* Bedrooms: 1
* Guest satisfaction overall: 100%
* Cluster: 4

This listing is more expensive than the first three, and its distance to the city center is significantly greater.

**Listing 5:**

* ID: 2361
* City: Athens
* Total Price: €434.70 (€86.94 per night for 5 nights)
* Distance to city center: 1.6 km
* Metro distance: Not available
* Bedrooms: 1
* Guest satisfaction overall: 100%
* Cluster: 1

This listing is more expensive than the first two, and its distance to the city center is slightly greater.

Considering all factors, I recommend **Listing 1** (ID: 2267) in Athens. It's a good balance between total price, distance to the city center and metro, and guest satisfaction overall. Additionally, it's part of cluster 3, which might indicate that it's a popular option among travelers.

Keep in mind that this recommendation is based on the provided data, and I encourage you to further evaluate each listing by reading reviews and checking the host's profile before making a decision.
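
Four of the five listings above are reported with "Metro distance: Not available". One possible cause, which is an assumption and not verified here, is that the full-width table produced by `to_string` is hard for the model to read. A minimal sketch of passing only the columns the prompt asks about, as compact CSV text (the helper name `build_context` and the sample row are hypothetical):

```python
import pandas as pd

# Columns the prompt asks the model to consider.
PROMPT_COLUMNS = [
    "City", "cluster", "total_price", "dist",
    "metro_dist", "bedrooms", "guest_satisfaction_overall",
]

def build_context(listings: pd.DataFrame, columns=PROMPT_COLUMNS) -> str:
    """Keep only prompt-relevant columns and render them as CSV text."""
    present = [name for name in columns if name in listings.columns]
    return listings[present].round(2).to_csv(index=False)

# Illustrative row with one extra column that should not reach the prompt.
sample = pd.DataFrame({
    "City": ["Athens"], "cluster": [3], "total_price": [348.0],
    "dist": [1.4], "metro_dist": [0.6], "bedrooms": [1],
    "guest_satisfaction_overall": [100], "extra_col": ["ignored"],
})
context = build_context(sample)
print(context)
```

Whether CSV or a trimmed `to_string` table works better for a given model would need to be compared on real responses.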

In [94]:
# Updated function for specific city using .to_string()
def recommend_in_city(df, city, budget, nights=5, max_results=5):
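    # Add the total on a copy so the caller's DataFrame is not modified
    candidates = df.assign(total_price=df['realSum'] * nights)
    filtered = candidates[
        (candidates['City'].str.lower() == city.lower()) &
        (candidates['total_price'] <= budget)
    ].sort_values(by='guest_satisfaction_overall', ascending=False).head(max_results)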

    if filtered.empty:
        return f"No listings found in {city.title()} within your budget."
    
    context = filtered.to_string(index=False)

    prompt = f"""
You are a travel assistant helping users choose the best Airbnb option in {city.title()}.

The user has a budget of €{budget} for {nights} nights. Below are the available listings in table format:

{context}

Please evaluate the listings and recommend the best one. Include reasoning based on:
- Total cost
- Distance to city center and metro (dist & metro_dist)
- Bedrooms
- Guest satisfaction
- Cluster number
"""
    client = Llama3Client()
    return client.ask(prompt)
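
The filtering step can be exercised without a running model server if it is separated from the call to `client.ask`. A sketch under that assumption, with a hypothetical helper `filter_city_listings` and made-up sample rows:

```python
import pandas as pd

def filter_city_listings(frame, city, budget, nights=5, max_results=5):
    """Return the top-rated listings in a city whose total stays within budget."""
    candidates = frame.assign(total_price=frame["realSum"] * nights)
    in_city = candidates["City"].str.lower() == city.lower()  # case-insensitive match
    in_budget = candidates["total_price"] <= budget
    return (
        candidates[in_city & in_budget]
        .sort_values(by="guest_satisfaction_overall", ascending=False)
        .head(max_results)
    )

# Illustrative rows: mixed casing, one other city, one listing over budget.
sample = pd.DataFrame({
    "City": ["Berlin", "berlin", "Athens", "Berlin"],
    "realSum": [85.77, 84.84, 69.60, 150.00],
    "guest_satisfaction_overall": [100, 98, 100, 95],
})
original_columns = list(sample.columns)
shortlist = filter_city_listings(sample, "BERLIN", budget=500)
print(shortlist.to_string(index=False))
```

The prompt-building code could then take `shortlist` as input, which keeps the data logic checkable on its own.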