## RAG-based Place Recommendation Chatbot with PostgreSQL/TimescaleDB

This notebook demonstrates a chatbot that recommends places in Warsaw. The project uses vector search to retrieve places by semantic similarity to the user's query.

There are two main custom classes:
- `VectorStore` manages vector operations and database interactions using a **Timescale** database with the **pgvector** extension. It stores high-dimensional embeddings for places and embeds incoming queries, so semantic search runs directly within **PostgreSQL**.
- `Synthesizer` generates natural-language responses to the user's query. It takes the retrieved context (the most relevant places) and produces a coherent, context-aware answer, optionally including its thought process and reasoning steps.
  
The data was collected from the Google Places API, and vectors were created from reviews, place types, and other relevant information to enable rich semantic search and recommendations.  
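
As a rough illustration of how reviews, place types, and other fields might be combined into one text before embedding, here is a minimal sketch. The `Place` class, its field names, and the `build_embedding_text` helper are hypothetical and not taken from the project; the sample place is made up.

```python
from dataclasses import dataclass, field


@dataclass
class Place:
    """Hypothetical, simplified record for one place (not the project's schema)."""
    name: str
    types: list[str]
    rating: float
    reviews: list[str] = field(default_factory=list)


def build_embedding_text(place: Place, max_reviews: int = 3) -> str:
    """Join the fields that should influence semantic search into one string."""
    parts = [
        f"Name: {place.name}",
        f"Types: {', '.join(place.types)}",
        f"Rating: {place.rating}",
    ]
    # Cap the number of reviews so the text stays within the embedding model's input limit.
    for review in place.reviews[:max_reviews]:
        parts.append(f"Review: {review.strip()}")
    return "\n".join(parts)


# Made-up example data, for illustration only.
place = Place(
    name="Example Bakery",
    types=["bakery", "cafe"],
    rating=4.5,
    reviews=["Great eclairs. ", "Friendly staff."],
)
text = build_embedding_text(place)
print(text)
```

The resulting string would then be sent to the embedding model, and the returned vector stored alongside the place's metadata.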

Typical use cases include food and place recommendations, context-aware responses, and advanced filtering based on metadata predicates.

In [45]:
import sys
sys.path.append('/Users/marysiapacocha/Desktop/projects/warsaw-places-chatbot/app')

In [46]:
from database.vector_store import VectorStore
from services.synthesizer import Synthesizer
from timescale_vector import client
import pandas as pd
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

In [22]:
vec = VectorStore()

**1. Relevant question**: Seeking recommendations for some specific food with no filtering

Let's start with a simple use case: asking the chatbot for recommendations without applying any filters.  
This demonstrates the default semantic search capability, where the system retrieves the most relevant places based purely on the meaning of the user's query.  
No metadata or predicate-based filtering is applied, so the results reflect the closest semantic matches in the database.

In [33]:
relevant_question = "Can you recommend a top-rated bakery in Warsaw for delicious pastries and cakes?"
results = vec.search(relevant_question, limit=3)
display(results.drop(columns=[col for col in ['content', 'embedding'] if col in results.columns]))

2025-06-26 16:11:42,333 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-06-26 16:11:42,339 - INFO - Embedding generated in 0.464 seconds
2025-06-26 16:11:42,411 - INFO - Vector search completed in 0.071 seconds


Unnamed: 0,id,distance,name,rating,address,created_at,user_rating_count
0,ce230fe6-51fc-11f0-94ac-7e2e219475ab,0.197435,LUKULLUS | Mokotowska,4.4,"Mokotowska 52A, 00-543 Warszawa, Poland",2025-06-25T21:44:28.050021,143.0
1,4ecda77e-51fc-11f0-8cd4-11ca5710082c,0.234214,Kukułka / Mokotowska,4.5,"Mokotowska 52, 00-543 Warszawa, Poland",2025-06-25T21:40:54.419535,301.0
2,cdd4ee48-51fb-11f0-bf4e-64f46607cda2,0.254992,Cukiernia i piekarnia Semifreddo,4.1,"Grójecka 65a, 02-093 Warszawa, Poland",2025-06-25T21:37:18.041106,183.0


When `vec.search(...)` is called, the query is embedded into a vector, and a similarity search is performed directly in TimescaleDB using the pgvector extension. This allows fast and efficient semantic search over all stored places.
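
To make the `distance` column more concrete, the following sketch ranks a few toy vectors by cosine distance, which is one of the metrics pgvector supports. It assumes the store is configured for cosine distance (the notebook does not say which metric is used), and the three-dimensional vectors and their labels are invented for illustration; real embeddings have far more dimensions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, items, k=3):
    """Rank items by ascending distance to the query and keep the k closest."""
    scored = [(name, cosine_distance(query, vec)) for name, vec in items.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy vectors standing in for stored place embeddings (assumption).
toy_vectors = {
    "bakery": [0.9, 0.1, 0.0],
    "museum": [0.0, 0.2, 0.9],
    "cafe": [0.7, 0.6, 0.1],
}
query_vector = [1.0, 0.2, 0.0]

ranked = top_k(query_vector, toy_vectors, k=2)
for name, distance in ranked:
    print(f"{name}: {distance:.3f}")
```

Smaller distances mean closer matches, which is why the results table above is sorted by ascending `distance`. In the database the same ranking is done by the vector index rather than by a full scan in Python.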

Now let's generate a response:

In [34]:
response = Synthesizer.generate_response(question=relevant_question, context=results)

print(f"\n{response.answer}")
print("\nThought process:")
for thought in response.thought_process:
    print(f"- {thought}")
print(f"\nContext: {response.enough_context}")

2025-06-26 16:12:35,309 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



I recommend visiting Kukułka / Mokotowska for delicious pastries and cakes in Warsaw. It is highly rated with a score of 4.5 from 301 reviews and is praised for its phenomenal pastries, including whortleberry pastries and coconut cake. The bakery is located at Mokotowska 52, 00-543 Warszawa, Poland.

Thought process:
- I have retrieved information about three bakeries in Warsaw, each with detailed reviews and ratings.
- LUKULLUS | Mokotowska is highly praised for its wonderful cakes, cheesecakes, eclairs, and seasonal cakes. It has a rating of 4.4 based on 143 reviews.
- Kukułka / Mokotowska is noted for its phenomenal pastries and cakes, with a rating of 4.5 from 301 reviews. It is particularly recommended for its whortleberry pastries and coconut cake.
- Cukiernia i piekarnia Semifreddo is known for its cheesecake, Napoleon cake, and fresh desserts, with a rating of 4.1 from 183 reviews.
- Based on the ratings and reviews, Kukułka / Mokotowska seems to be the top-rated bakery among 

**2. Irrelevant question**: e.g., asking about the weather.

In this example, we test how the system handles a query that is unrelated to places or recommendations.   
Even though the vector search may still return some results due to semantic similarity, the response generator (`Synthesizer`) is designed to recognize irrelevant queries and respond appropriately, indicating that the question does not match the available context.

In [32]:
irrelevant_question = "What is the weather in Poland?"
results = vec.search(irrelevant_question, limit=3)
display(results.drop(columns=[col for col in ['content', 'embedding'] if col in results.columns]))

2025-06-26 16:05:56,862 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-06-26 16:05:56,866 - INFO - Embedding generated in 0.584 seconds
2025-06-26 16:05:56,910 - INFO - Vector search completed in 0.043 seconds


Unnamed: 0,id,distance,name,rating,address,created_at,user_rating_count
0,566f1d3c-51fc-11f0-b8d2-a1d9f46ee1d6,0.533617,Pikanteria,4.5,"Walecznych 68A, 03-920 Warszawa, Poland",2025-06-25T21:41:07.221836,3449.0
1,d41fe8b0-51fc-11f0-8ce9-deb888ef4638,0.54399,Bistro KEN,4.5,"Aleja Komisji Edukacji Narodowej 48/lok. U13, ...",2025-06-25T21:44:38.095671,761.0
2,d9647026-51fb-11f0-8ca8-7215048e3fcc,0.547128,Bistro Bielany,4.7,"Efraima Schroegera 89, 01-845 Warszawa, Poland",2025-06-25T21:37:37.436555,1195.0


Generating response:

In [26]:
response = Synthesizer.generate_response(question=irrelevant_question, context=results)

print(f"\n{response.answer}")
print("\nThought process:")
for thought in response.thought_process:
    print(f"- {thought}")
print(f"\nContext: {response.enough_context}")

2025-06-26 13:58:01,788 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



I'm sorry, but I don't have the current weather information for Poland. You might want to check a reliable weather website or app for the most up-to-date information.

Thought process:
- The user is asking about the current weather in Poland.
- The retrieved context does not contain any information about the weather in Poland.
- The context is focused on restaurants and food places in Warsaw, Poland.
- I need to inform the user that there is insufficient information to answer their question about the weather.

Context: False
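
The printed output suggests that the response object carries three fields: `answer`, `thought_process`, and `enough_context`. The sketch below models such an object and shows one way calling code could use the flag. The class name `SynthesizedResponse` and the helpers `render` and `should_show_places` are hypothetical; only the three field names come from the notebook.

```python
from dataclasses import dataclass, field


@dataclass
class SynthesizedResponse:
    """Hypothetical stand-in for the object returned by the synthesizer."""
    answer: str
    thought_process: list[str] = field(default_factory=list)
    enough_context: bool = False


def render(response: SynthesizedResponse) -> str:
    """Format a response the same way the notebook cells print it."""
    lines = [response.answer, "", "Thought process:"]
    lines += [f"- {thought}" for thought in response.thought_process]
    lines += ["", f"Context: {response.enough_context}"]
    return "\n".join(lines)


def should_show_places(response: SynthesizedResponse) -> bool:
    """Only surface the retrieved places when the model judged them sufficient."""
    return response.enough_context


weather_response = SynthesizedResponse(
    answer="I'm sorry, but I don't have the current weather information for Poland.",
    thought_process=[
        "The user is asking about the current weather in Poland.",
        "The retrieved context does not contain any information about the weather.",
    ],
    enough_context=False,
)
rendered = render(weather_response)
print(rendered)
```

A caller could check `should_show_places(...)` before displaying the retrieved places, so that loosely related matches such as the bistros above are not presented as answers to an off-topic question.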


**3. Advanced filtering using Predicates**

In many scenarios, you may need to refine your search results using additional criteria, such as minimum rating or popularity.  
The system supports advanced filtering through the use of **Predicates**, which allow you to specify complex conditions on metadata fields (e.g., rating, user rating count, district).  
Predicates can be combined using logical operators to create powerful, fine-grained queries, enabling more targeted recommendations based on user preferences.

In [27]:
df = pd.read_csv("../data/places.csv", sep=",")
median_popularity = float(df["user_rating_count"].median())

In [35]:
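# Single condition: keep only places rated 3.75 or higher.
predicates = client.Predicates("rating", ">=", 3.75)
results = vec.search(relevant_question, limit=3, predicates=predicates)

# Combined condition: rating >= 3.75 AND at least median popularity.
# Note: this overwrites `predicates` and `results` from the search above.
predicates = client.Predicates("rating", ">=", 3.75) & client.Predicates(
    "user_rating_count", ">=", median_popularity
)

results = vec.search(relevant_question, limit=3, predicates=predicates)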
display(results.drop(columns=[col for col in ['content', 'embedding'] if col in results.columns]))

2025-06-26 16:15:51,723 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-06-26 16:15:51,730 - INFO - Embedding generated in 0.587 seconds
2025-06-26 16:15:51,776 - INFO - Vector search completed in 0.046 seconds
2025-06-26 16:15:52,159 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-06-26 16:15:52,161 - INFO - Embedding generated in 0.368 seconds
2025-06-26 16:15:52,182 - INFO - Vector search completed in 0.020 seconds


Unnamed: 0,id,distance,name,rating,address,created_at,user_rating_count
0,4ecda77e-51fc-11f0-8cd4-11ca5710082c,0.234214,Kukułka / Mokotowska,4.5,"Mokotowska 52, 00-543 Warszawa, Poland",2025-06-25T21:40:54.419535,301.0
1,e9d6b1f6-51fd-11f0-b888-2a38e86c0c0f,0.265441,Cukiernia Irena,4.5,"Zakopiańska 20, 03-943 Warszawa, Poland",2025-06-25T21:52:24.022434,650.0
2,604898ba-51fc-11f0-8718-5aaf5946268a,0.273645,Blacha pracownia wypieków,4.8,"Jana Kasprowicza 48, 01-871 Warszawa, Poland",2025-06-25T21:41:23.746532,284.0


With predicates applied, only places that meet every specified condition can appear in the results. Compared with the unfiltered search in section 1, LUKULLUS | Mokotowska (143 ratings) and Cukiernia i piekarnia Semifreddo (183 ratings) no longer appear, presumably because their rating counts fall below the median, and the next-closest matches that satisfy both conditions take their places. The default semantic search considers only the meaning of the query; predicates add hard constraints on top of it.
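
The logic of combining two conditions with AND can be sketched in plain Python over the three rows returned by the unfiltered search in section 1. This is an in-memory illustration only: the real filtering happens inside the database before the top results are chosen, the helpers `make_predicate` and `all_of` are hypothetical, and the popularity threshold of 200 is an assumed value because the actual median is not shown in the notebook.

```python
import operator

# Map comparison symbols to the functions that implement them.
OPS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
}


def make_predicate(field_name, op, value):
    """Build a function that tests one metadata field of a row."""
    compare = OPS[op]
    return lambda row: compare(row[field_name], value)


def all_of(*predicates):
    """Combine predicates with logical AND."""
    return lambda row: all(p(row) for p in predicates)


# Rows copied from the unfiltered bakery search results.
rows = [
    {"name": "LUKULLUS | Mokotowska", "rating": 4.4, "user_rating_count": 143.0},
    {"name": "Kukułka / Mokotowska", "rating": 4.5, "user_rating_count": 301.0},
    {"name": "Cukiernia i piekarnia Semifreddo", "rating": 4.1, "user_rating_count": 183.0},
]

HYPOTHETICAL_MIN_REVIEWS = 200.0  # assumption; stands in for the dataset median

keep = all_of(
    make_predicate("rating", ">=", 3.75),
    make_predicate("user_rating_count", ">=", HYPOTHETICAL_MIN_REVIEWS),
)
filtered = [row["name"] for row in rows if keep(row)]
print(filtered)
```

Because the database applies the conditions while searching rather than afterwards, the filtered query still returns three places: the slots freed by excluded places are filled by the next-closest matches that satisfy the predicates.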

Generating response:

In [36]:
response = Synthesizer.generate_response(question=relevant_question, context=results)

print(f"\n{response.answer}")
print("\nThought process:")
for thought in response.thought_process:
    print(f"- {thought}")
print(f"\nContext: {response.enough_context}")

2025-06-26 16:17:03,584 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



I recommend trying Kukułka on Mokotowska street, which is highly praised for its phenomenal pastries and cakes, including eclairs and coconut cake. Another excellent choice is Cukiernia Irena, known for its friendly service and a brilliant selection of sweet treats like florentynki cakes and jabłuszko pastries. Lastly, Blacha pracownia wypieków is highly rated for its delicious kolaches and seasonal fruit pastries. All three bakeries offer delightful options for pastries and cakes in Warsaw.

Thought process:
- I have retrieved information about three bakeries in Warsaw with high ratings and positive reviews.
- Kukułka on Mokotowska street is praised for its phenomenal pastries and cakes, with a particular mention of their eclairs and coconut cake.
- Cukiernia Irena is noted for its friendly staff and a brilliant selection of sweet treats, including florentynki cakes and jabłuszko pastries.
- Blacha pracownia wypieków is highly rated for its delicious kolaches and other baked goods, w

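**4. Other examples:**

The queries below reuse the combined `predicates` filter defined in section 3 (rating of at least 3.75 and at least median popularity).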

In [40]:
question = "I wanna find an excellent restaurant in district Mokotów to eat asian food like ramen, sushi or pad thai. Can you help me?"
results = vec.search(question, limit=3, predicates=predicates)
display(results.drop(columns=[col for col in ['content', 'embedding'] if col in results.columns]))
response = Synthesizer.generate_response(question=question, context=results)
print(f"\n{response.answer}")

2025-06-26 16:24:16,364 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-06-26 16:24:16,366 - INFO - Embedding generated in 0.440 seconds
2025-06-26 16:24:16,398 - INFO - Vector search completed in 0.031 seconds


Unnamed: 0,id,distance,name,rating,address,created_at,user_rating_count
0,b64b20e4-51fb-11f0-955f-0c18e0965feb,0.365567,Taste of Asia,4.6,"Puławska 74/80, 02-603 Warszawa, Poland",2025-06-25T21:36:38.550546,367.0
1,daa78a30-51fc-11f0-ac5b-dbc0fee4d7eb,0.372609,Restauracja Pakczoj ramen & udon,4.7,"Wielicka 43, 02-657 Warszawa, Poland",2025-06-25T21:44:49.050842,326.0
2,cd4b7b20-51fd-11f0-ae08-dcfc3173c5c1,0.380995,Azia Restaurants - Centrum Praskie Koneser,4.6,"Plac Konesera 4, 05-077 Warszawa, Poland",2025-06-25T21:51:36.133909,1508.0


2025-06-26 16:24:24,149 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



Here are two excellent Asian restaurants in the Mokotów district of Warsaw that you might enjoy:

1. **Taste of Asia**
   - **Address:** Puławska 74/80, 02-603 Warszawa, Poland
   - **Rating:** 4.6 out of 5 (based on 367 reviews)
   - **Highlights:** This restaurant is highly recommended for its Miso Ramen and Pork Momo. They offer a variety of dishes, including Pad Thai and Japanese Katsu Curry, in a cozy atmosphere with friendly staff.

2. **Restauracja Pakczoj ramen & udon**
   - **Address:** Wielicka 43, 02-657 Warszawa, Poland
   - **Rating:** 4.7 out of 5 (based on 326 reviews)
   - **Highlights:** Known for its delicious ramen and udon dishes, this restaurant offers generous portions and a warm atmosphere. The staff is accommodating and fluent in English, making it a pleasant dining experience.

Both places offer a great selection of Asian dishes and have received positive reviews for their food and service.


In [41]:
question = "I'm looking for a place to have a high-quality coffee in the center of Warsaw. Can you suggest something?"
results = vec.search(question, limit=3, predicates=predicates)
display(results.drop(columns=[col for col in ['content', 'embedding'] if col in results.columns]))
response = Synthesizer.generate_response(question=question, context=results)
print(f"\n{response.answer}")

2025-06-26 16:28:21,956 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-06-26 16:28:21,963 - INFO - Embedding generated in 0.715 seconds
2025-06-26 16:28:22,034 - INFO - Vector search completed in 0.070 seconds


Unnamed: 0,id,distance,name,rating,address,created_at,user_rating_count
0,9c813fae-51fb-11f0-ad82-895e8f6bab84,0.309538,relax cafe bar centrum,4.7,"Złota 8a, 00-019 Warszawa, Poland",2025-06-25T21:35:55.284364,919.0
1,6e8772ac-51fc-11f0-a145-d02b84770800,0.316854,Wrzenie Świata,4.7,"Konstantego Ildefonsa Gałczyńskiego 7, 00-362 ...",2025-06-25T21:41:47.646567,1740.0
2,de1a30a4-51fd-11f0-bf0b-0d89d9058d56,0.320047,Coffee Karma,4.0,"Mokotowska 17, 00-640 Warszawa, Poland",2025-06-25T21:52:04.332147,1823.0


2025-06-26 16:28:30,632 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



Here are some highly recommended places for high-quality coffee in the center of Warsaw:

1. **Relax Cafe Bar Centrum**
   - Address: Złota 8a, 00-019 Warszawa, Poland
   - Rating: 4.7
   - Description: Known for its excellent service and impressive selection of coffee, Relax Cafe Bar Centrum offers a charming atmosphere with locally roasted Brazilian beans. It's a great spot for coffee enthusiasts.

2. **Wrzenie Świata**
   - Address: Konstantego Ildefonsa Gałczyńskiego 7, 00-362 Warszawa, Poland
   - Rating: 4.7
   - Description: This cozy cafe is also a bookstore, providing a relaxed vibe with amazing coffee and pastries. It's a perfect place for meetings or remote work.

3. **Coffee Karma**
   - Address: Mokotowska 17, 00-640 Warszawa, Poland
   - Rating: 4.0
   - Description: Located in the heart of Warsaw, Coffee Karma offers a great selection of coffees and a friendly atmosphere. It's a bustling place ideal for people-watching and enjoying carefully crafted fare.


In [42]:
question = "I'd like to go to a museum in Warsaw that has a good rating and is not too crowded. Can you recommend one"
results = vec.search(question, limit=3, predicates=predicates)
display(results.drop(columns=[col for col in ['content', 'embedding'] if col in results.columns]))
response = Synthesizer.generate_response(question=question, context=results)
print(f"\n{response.answer}")

2025-06-26 16:29:48,925 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-06-26 16:29:48,929 - INFO - Embedding generated in 0.493 seconds
2025-06-26 16:29:48,964 - INFO - Vector search completed in 0.034 seconds


Unnamed: 0,id,distance,name,rating,address,created_at,user_rating_count
0,aac2cfc8-51fc-11f0-9854-6dcbd51c9cbb,0.350508,National Museum in Warsaw,4.6,"Al. Jerozolimskie 3, 00-495 Warszawa, Poland",2025-06-25T21:43:28.698930,20949.0
1,a52b41a8-51fc-11f0-8ce9-0195d3593edb,0.366174,Museum of Warsaw,4.6,"Rynek Starego Miasta 42, 00-272 Warszawa, Poland",2025-06-25T21:43:19.317095,3353.0
2,9ad6b7f6-51fb-11f0-bcfd-61bdc3e4fcf7,0.368929,Museum of King Jan III's Palace at Wilanów,4.7,"Stanisława Kostki Potockiego 10/16, 02-958 War...",2025-06-25T21:35:52.489062,28574.0


2025-06-26 16:30:02,078 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



I recommend visiting the Museum of Warsaw. It has a good rating of 4.6 and is described as a hidden gem, which might mean it is less crowded. The museum is located in the Old Town Market Square and offers a unique and immersive experience focused on the history and culture of Warsaw. It is a great place to explore if you're interested in the city's past and its residents' stories.


In [44]:
question = "I feel like going to a park to relax and read a book. Than i want to go to a restaurant for dinner. Can you suggest me both park and a restaurant that are close to each other? Give me two options to pick from."
results = vec.search(question, limit=6, predicates=predicates)
display(results.drop(columns=[col for col in ['content', 'embedding'] if col in results.columns]))
response = Synthesizer.generate_response(question=question, context=results)
print(f"\n{response.answer}")

2025-06-26 16:34:03,783 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-06-26 16:34:03,790 - INFO - Embedding generated in 1.627 seconds
2025-06-26 16:34:03,827 - INFO - Vector search completed in 0.036 seconds


Unnamed: 0,id,distance,name,rating,address,created_at,user_rating_count
0,69670a08-51fc-11f0-b0c6-1e251717e95b,0.524312,Simon Hill,4.7,"aleja Brzóz, 05-500 Piaseczno, Poland",2025-06-25T21:41:39.045622,3393.0
1,e7fd8f08-51fd-11f0-9c59-d2a103b2d057,0.537978,Birch Wood Park,4.3,"Henryka Świątkowskiego 2, 02-797 Warszawa, Poland",2025-06-25T21:52:20.921715,405.0
2,63972810-51fc-11f0-9e05-8c71a499b59a,0.54642,Park Sady Żoliborskie,4.7,"01-626 Warsaw, Poland",2025-06-25T21:41:29.294617,1735.0
3,ec0c6e30-51fc-11f0-96ab-641095626285,0.548702,General Gustaw Orlicz-Dreszer Park,4.6,"Puławska, 02-605 Warszawa, Poland",2025-06-25T21:45:18.233331,3135.0
4,aa9f706e-51fc-11f0-8919-8b8cf5b03a83,0.549116,Park nad stawem Służewieckim,4.7,"Służew, 00-001 Warszawa, Poland",2025-06-25T21:43:28.467136,459.0
5,14b2494e-51fe-11f0-be56-ef66908b3037,0.549143,Discoverer's Park,4.6,"Wybrzeże Kościuszkowskie 20, 00-390 Warszawa, ...",2025-06-25T21:53:35.925871,497.0


2025-06-26 16:34:12,263 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



Here are two options for parks and nearby restaurants in Warsaw where you can relax and then enjoy a meal:

1. **General Gustaw Orlicz-Dreszer Park**
   - **Park Features**: This park is known for its scenic paths, fountains, and a monument. It offers a peaceful environment with shaded areas, making it ideal for reading and relaxation.
   - **Nearby Restaurant**: There is a restaurant called Zielnik located by the main alley in the park, offering convenient dining options.

2. **Park Sady Żoliborskie**
   - **Park Features**: This park is quiet and green, with beautiful fruit trees and a calming atmosphere. It's perfect for a leisurely walk or reading.
   - **Nearby Restaurant**: While the context does not specify a restaurant within the park, the area is known for trendy cafe-bars, which are likely nearby for dining options.

These options provide a combination of relaxation in a park setting followed by a convenient dining experience.
