# Sentiment Analysis of Italian Restaurant Reviews

The project aims to analyze customer sentiment towards Italian restaurants in France and Italy. The analysis is based on textual reviews collected from two different data sources: OpenTable and Yelp.

By employing sentiment analysis, we aim to uncover trends and compare how Italian cuisine and the general quality of the related businesses are perceived across the two regions.

## OpenWeb Ninja Red Flower Business Data API

The OpenWeb Ninja Red Flower Business Data API provides access to Yelp business data, business details, and reviews. It returns up-to-date business/POI and review information from Yelp in real time.

The query below returns a list of the 10 highest-rated Italian restaurants in Rome.
This list is then used to query each business for its reviews.

In [10]:
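import os
import requests

# Read the RapidAPI key from the environment instead of hard-coding it in the notebook
# (the variable name RAPIDAPI_KEY is an assumption; any name works)
API_KEY = os.environ["RAPIDAPI_KEY"]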
BASE_URL = "https://red-flower-business-data.p.rapidapi.com"
SEARCH_URL = f"{BASE_URL}/business-search"
REVIEWS_URL = f"{BASE_URL}/business-reviews"

# Define the search parameters
headers = {"X-RapidAPI-Key": API_KEY}
params = {
    "query": " ",
    "location": "Rome, Italy",
    "categories": "italian restaurants",
    "limit": 10,
    "sort_by": "HIGHEST_RATED"
}

# Make the request
response = requests.get(SEARCH_URL, headers=headers, params=params)

# Parse the response
if response.status_code == 200:
    data = response.json()
    for business in data['data']:
        # Request the first page of reviews for this business
        review_params = {"business_id": business['id'], "page": 1, "page_size": 10}
        response_business = requests.get(REVIEWS_URL, headers=headers, params=review_params)
        if response_business.status_code == 200:
            data_business = response_business.json()['data']
            review_count = data_business.get('total')
            # Fall back to an empty list when the response carries no reviews
            reviews = data_business.get("reviews") or []
            if reviews:
                first_review = reviews[0]['review_text']
                print(f"Name: {business['name']}")
                print(f"Rating: {business['rating']}")
                print(f"Review count: {review_count}")
                print(f"First Review: {first_review[:40]}...")
                print("-" * 40)
else:
    print(f"Error: {response.status_code} - {response.text}")


Name: gelato del teatro
Rating: 5
Review count: 9
First Review: Dare I say it?  Is it possible?  The bes...
----------------------------------------
Name: Coney Island Street Food
Rating: 5
Review count: 9
First Review: This place was AMAZING!!! The burgers we...
----------------------------------------
Name: Pane e olio
Rating: 5
Review count: 5
First Review: Probably the best italian food I have tr...
----------------------------------------
Name: Mr.Baguette
Rating: 5
Review count: 8
First Review: Don't walk too fast or you'll miss it! (...
----------------------------------------
Name: L’officina della Pizza
Rating: 5
Review count: 8
First Review: I love this little spot for lunch before...
----------------------------------------
Name: Zia Rilla
Rating: 5
Review count: 9
First Review: This restaurant is amazing for the food ...
----------------------------------------
Name: Alice
Rating: 5
Review count: 4
First Review: Awesome pizza and great customer service...
----------------------------------------
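
The cell above only prints the first review of the first page. For the later sentiment analysis it may be more convenient to gather every review into flat records. The following is a minimal sketch, not the notebook's actual pipeline: `iter_review_pages`, `collect_reviews`, and the `fetch_page` callable are hypothetical names, and it assumes that `total` is the overall number of reviews for the business and that pages are numbered from 1, as the `page=1` request above suggests.

```python
import math
from typing import Callable, Dict, Iterator, List


def iter_review_pages(fetch_page: Callable[[int], Dict], page_size: int = 10) -> Iterator[List[Dict]]:
    """Yield the review list of each page, using `total` to decide how many pages to request."""
    first = fetch_page(1)
    total = first.get("total") or 0  # assumption: `total` counts all reviews of the business
    yield first.get("reviews") or []
    last_page = math.ceil(total / page_size)
    for page in range(2, last_page + 1):
        yield fetch_page(page).get("reviews") or []


def collect_reviews(business: Dict, fetch_page: Callable[[int], Dict], page_size: int = 10) -> List[Dict]:
    """Flatten all reviews of one business into records that are easy to load into a table."""
    rows = []
    for reviews in iter_review_pages(fetch_page, page_size):
        for review in reviews:
            rows.append({
                "business_id": business["id"],
                "business_name": business["name"],
                "business_rating": business["rating"],
                "review_text": review["review_text"],
            })
    return rows
```

Here `fetch_page` is expected to return the `data` object of one reviews response for a given page number, for example a small wrapper around the `requests.get` call to `REVIEWS_URL` shown above. Passing it in as a callable keeps the pagination logic independent of the network, so it can be tried with canned responses.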

## OpenTable
The code below queries the OpenTable website with these parameters:
* search term: Paris
* cuisine: Italian
* sort by: rating

This query returns a list of the 30 top-rated Italian restaurants in Paris (at the time of the search).
This list is then used to open each restaurant's page and scrape its reviews. All the reviews shown on the page are captured.

In [None]:
%pip install selenium --quiet

In [62]:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# scroll through the page step by step so that the lazily loaded lists (restaurants and reviews) are rendered
def scroll_down_page(driver, speed=8):
    current_scroll_position, new_height = 0, 1
    while current_scroll_position <= new_height:
        current_scroll_position += speed
        driver.execute_script("window.scrollTo(0, {});".format(current_scroll_position))
        new_height = driver.execute_script("return document.body.scrollHeight")

# obtain the list of restaurants based on the predefined criteria
def scrape_opentable_restaurants(keep_open=False, max_restaurants=10):
    # queries OpenTable restaurants in 'Paris' under the 'Italian' cuisine category, ordered by rating
    url = "https://www.opentable.com/s?term=paris&cuisineIds%5B%5D=48e9d049-40cf-4cb9-98d9-8c47d0d58986&sortBy=rating"

    # open the browser
    driver = webdriver.Chrome()
    driver.get(url)
    time.sleep(3)
    # scroll down the page so that all the restaurants are loaded
    scroll_down_page(driver)
    time.sleep(3)

    restaurants = []

    try:
        # Extract restaurant elements
        restaurant_elements = driver.find_elements(By.CLASS_NAME, 'qCITanV81-Y-')

        for restaurant_element in restaurant_elements:
            try:
                restaurant_name = restaurant_element.text # name
                restaurant_link = restaurant_element.get_attribute('href') # link
                restaurant_link = restaurant_link.split('?', 1)[0] # remove query parameters
                restaurants.append({'restaurant_name': restaurant_name, 'restaurant_link': restaurant_link})

            except Exception as e:
                print(f"Error extracting restaurants: {e}")
                continue

    except Exception as e:
        print(f"Error during scraping: {e}")
    finally:
        if not keep_open:
            driver.quit()

    return restaurants, driver

# scrape reviews from the given restaurant
def scrape_opentable_reviews(driver, url, keep_open=False, max_reviews=10):

    # open the browser if not already open
    if driver is None:
        driver = webdriver.Chrome()
    driver.get(url)
    time.sleep(3)

    reviews = []

    try:
        # scroll down the page so that all the reviews are loaded
        scroll_down_page(driver)
        time.sleep(1)
        # Extract review elements
        review_elements = driver.find_elements(By.CLASS_NAME, 'afkKaa-4T28-')

        for review_element in review_elements:
            try:
                review_text = review_element.find_element(By.CLASS_NAME, '_6rFG6U7PA6M-').text # review text
                review_date = review_element.find_element(By.CLASS_NAME, 'iLkEeQbexGs-').text # review date
                reviews.append({'review_text': review_text, 'review_date': review_date})
                
            except Exception as e:
                print(f"Error extracting review: {e}")
                continue

    except Exception as e:
        print(f"Error during scraping: {e}")
    finally:
        if not keep_open:
            driver.quit()

    return reviews
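
The `review_date` values captured by `scrape_opentable_reviews` are display strings such as "Dined 3 days ago" or "Dined on November 11, 2024", so comparing reviews over time would first require turning them into real dates. The sketch below shows one possible way to do that; `parse_review_date` is a hypothetical helper, it only covers the two formats just mentioned, and it assumes English month names (the `%B` directive depends on the active locale).

```python
import re
from datetime import date, datetime, timedelta
from typing import Optional


def parse_review_date(text: str, today: Optional[date] = None) -> Optional[date]:
    """Convert an OpenTable-style 'Dined ...' string into a date, or None if the format is unknown."""
    today = today or date.today()
    text = text.strip()

    # Relative form: "Dined 3 days ago" (also accepts "Dined 1 day ago")
    match = re.fullmatch(r"Dined (\d+) days? ago", text)
    if match:
        return today - timedelta(days=int(match.group(1)))

    # Absolute form: "Dined on November 11, 2024"
    match = re.fullmatch(r"Dined on (.+)", text)
    if match:
        try:
            return datetime.strptime(match.group(1), "%B %d, %Y").date()
        except ValueError:
            return None

    # Anything else (including truncated strings) is left unparsed
    return None
```

Relative dates are resolved against `today`, so it is worth passing the date on which the page was scraped rather than relying on the default when the data is processed later.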

### For demonstration purposes, print the restaurants and their reviews

In [63]:
restaurants, driver = scrape_opentable_restaurants(keep_open=True)

print("restaurants count: " + str(len(restaurants)) + '\n')

for n, restaurant in enumerate(restaurants):
    print(f"Restaurant: {restaurant['restaurant_name']}")
    # keep the browser open until the last restaurant has been scraped
    reviews = scrape_opentable_reviews(driver, restaurant['restaurant_link'], keep_open=(n < len(restaurants) - 1))
    print(f"Reviews count: {len(reviews)}")

    for idx, review in enumerate(reviews, 1):
        print(f"Review {idx}:")
        print(f"Date: {review['review_date']}")
        print(f"Text: {review['review_text'][:20]}...\n")

restaurants count: 30
Restaurant: Mamamia
Reviews count: 10
Review 1:
Date: Dined 3 days ago
Text: Service wa...

Review 2:
Date: Dined 4 days ago
Text: Restaurant...

Review 3:
Date: Dined on November 11, 2024
Text: Having boo...

Review 4:
Date: Dined on November 9, 2024
Text: Oublié de ...

Review 5:
Date: Dined on October 24, 2024
Text: The burrat...

Review 6:
Date: Dined on October 19, 2024
Text: Staff at e...

Review 7:
Date: Dined on October 5, 2024
Text: The bounce...

Review 8:
Date: Dined on September 28, 2024
Text: My FAVOURI...

Review 9:
Date: Dined on August 11, 2024
Text: Its a good...

Review 10:
Date: Dined on April 18, 2024
Text: This place...

Restaurant: Truffes Folies Paris 7
Reviews count: 10
Review 1:
Date: Dined on November 25, 2024
Text: Nice resta...

Review 2:
Date: Dined on November 8, 2024
Text: Wonderful ...

Review 3:
Date: Dined on October 23, 2024
Text: We randoml...

Review 4:
Date: Dined on October 18, 2024
Text: Buena comi...

Review 5:
Date: Dined 