## Business Understanding
### Introduction
Dining out has become an integral part of our social fabric, and choosing a restaurant can be both exciting and overwhelming. With options ranging from quaint bistros to exotic eateries, the sheer number of choices makes the decision harder than ever. Traditional restaurant websites rely on filters for amenities, location, or cuisine type, which still leave users with a long list of options to sift through. As the restaurant industry evolves and culinary landscapes expand, the need for a more refined and personalized approach to restaurant discovery has become evident. Restaurant recommendation systems meet this need: they go beyond filtering on amenities and use data science, machine learning, and user preferences to deliver dining suggestions tailored to each user's tastes.

In a world where time is precious and choices are abundant, restaurant recommendation systems enhance the dining experience in ways that traditional filters cannot. This project explores how these systems work, why they matter, and how they change the way diners discover restaurants as their preferences evolve.

### Problem Statement
This project addresses the challenge individuals face in choosing restaurants by developing a user-friendly restaurant recommendation system that helps them make informed dining decisions and ultimately enhances their overall restaurant experience.

### Main Objective
To develop an interactive and user-friendly restaurant recommendation system.

### Specific Objectives
- Analyze the key factors behind restaurant ratings: use data analysis techniques to identify and evaluate the attributes that significantly influence restaurant ratings and customer preferences.

- Develop content-based recommendation algorithms that generate personalized restaurant recommendations from user-defined text, restaurant names, and other user preferences.

- Integrate an interactive map into the recommendation system, allowing users to explore geographic trends in restaurant recommendations and discover dining options by location in a visually engaging way.

- Build an interactive user interface that allows users to easily access and interact with the restaurant recommendation system.
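As a rough illustration of the content-based objective above, the sketch below ranks restaurants by TF-IDF cosine similarity between a user's free-text query and each restaurant's review text. The notebook imports scikit-learn's `TfidfVectorizer` and `cosine_similarity` for this kind of work; the version here uses only the standard library so the mechanics stay visible. The restaurant names, review snippets, and helper names (`tokenize`, `build_idf`, `tfidf`, `cosine`, `recommend`) are hypothetical, and the smoothed IDF formula is one common choice rather than the project's confirmed method.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: restaurant name -> concatenated review text.
RESTAURANTS = {
    "Luigi's Trattoria": "wood fired pizza fresh pasta cozy italian dinner",
    "Sakura House": "fresh sushi ramen quiet japanese lunch",
    "Smoky Joe's": "slow smoked brisket ribs barbecue big portions",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_idf(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf(tokens, idf):
    """TF-IDF weights for one document; terms unseen in the corpus are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(query, restaurants, top_n=2):
    """Return up to top_n restaurant names with non-zero similarity to the query."""
    names = list(restaurants)
    docs = [tokenize(restaurants[name]) for name in names]
    idf = build_idf(docs)
    query_vec = tfidf(tokenize(query), idf)
    scores = [(cosine(query_vec, tfidf(doc, idf)), name) for name, doc in zip(names, docs)]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [name for score, name in scores[:top_n] if score > 0]


print(recommend("fresh pasta and pizza", RESTAURANTS))
```

A query that shares no vocabulary with any review, such as `recommend("tacos", RESTAURANTS)`, returns an empty list, which is one reason a fallback ranking is worth having for sparse input.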

### Metric of Success
To consider our project successful, we will focus on the following key metrics:

- Cold start handling: the model should effectively address the "cold start problem", providing meaningful recommendations even for new users or restaurants without extensive review data.

- Geographical coverage: users from various regions and cities should be able to access relevant restaurant recommendations.

- Deployment: the recommendation model should be successfully deployed, accessible to users, responsive, and capable of generating real-time recommendations.
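One common way to soften the cold start problem for new users is to fall back to a popularity ranking when no personal history exists. The sketch below is a minimal illustration under that assumption, not the project's confirmed approach. It ranks restaurants by a damped average, so that a venue with two 5-star reviews does not outrank one with hundreds of 4.5-star reviews. The sample data, the `prior_weight` value, and all function names are hypothetical.

```python
# Hypothetical sample: (name, average stars, review_count).
RESTAURANTS = [
    ("Corner Cafe", 5.0, 2),
    ("Harbor Grill", 4.5, 300),
    ("Noodle Bar", 3.8, 120),
]


def damped_rating(stars, review_count, global_mean, prior_weight=10):
    """Shrink a restaurant's average toward the global mean when it has few reviews."""
    return (review_count * stars + prior_weight * global_mean) / (review_count + prior_weight)


def popular_fallback(restaurants, top_n=2, prior_weight=10):
    """Rank restaurants for a user with no history by damped average rating."""
    total_reviews = sum(count for _, _, count in restaurants)
    global_mean = sum(stars * count for _, stars, count in restaurants) / total_reviews
    ranked = sorted(
        restaurants,
        key=lambda r: damped_rating(r[1], r[2], global_mean, prior_weight),
        reverse=True,
    )
    return [name for name, _, _ in ranked[:top_n]]


def recommend(user_history, restaurants, personalised=None):
    """Use the personalised recommender when history exists, else the fallback."""
    if user_history and personalised is not None:
        return personalised(user_history)
    return popular_fallback(restaurants)


print(recommend([], RESTAURANTS))
```

With this sample, the heavily reviewed "Harbor Grill" ranks above "Corner Cafe" even though its raw average is lower, because the two-review average is pulled toward the global mean.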

## Data Understanding
The dataset used in this project was extracted from the Yelp Restaurant database, which is publicly available and contains a large number of reviews across various restaurants and locations. It contains 908,915 tips/reviews by 1,987,897 users on 131,930 businesses, together with business attributes such as hours, parking, availability, and ambiance, and check-ins aggregated over time for each business. The dataset consists of five JSON files, namely business.json, checkin.json, review.json, tips.json and user.json, but only two of them were found to contain the required information:

- business.json: contains data on businesses spread across different US states, together with their relevant attributes.

- review.json: contains the reviews written by users about the businesses that served them.

Because the dataset is large, we extracted only 54,380 rows and 14 columns, which are enough for our analysis. The two JSON files above were merged and only the relevant columns were kept, namely:

- user_id: A unique identifier for each user who submitted a review

- business_id: A unique identifier for each business being reviewed

- name: string, the business's name

- address: string, the full address of the business

- stars: the rating given by the user in stars (e.g., 1.0, 2.0, 3.0, 4.0, 5.0)

- text: the actual text content of the review

- review_count: number of reviews the business has received

- city: string, the city, e.g. "San Francisco"

- state: string, 2-character state code, if applicable, e.g. "CA"

- latitude: float, latitude of the business

- longitude: float, longitude of the business

- attributes: business attributes and features

- categories: a list of the business categories

- hours: the hours during which the business is open, given on a 24-hour clock
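The merge described above can be sketched as follows. The Yelp files store one JSON object per line, so each line is parsed separately. The two in-memory samples stand in for business.json and review.json, and the helper names are hypothetical. Both files carry a `stars` field; the sketch keeps the review-level value to match the column description above, which is an assumption about how the original merge resolved the name clash.

```python
import io
import json

# Columns kept after the merge, as listed in the text.
COLUMNS = [
    "user_id", "business_id", "name", "address", "stars", "text", "review_count",
    "city", "state", "latitude", "longitude", "attributes", "categories", "hours",
]

# Tiny stand-ins for business.json and review.json (one JSON object per line).
BUSINESS_LINES = io.StringIO(
    '{"business_id": "b1", "name": "Harbor Grill", "address": "1 Pier Rd", '
    '"city": "San Francisco", "state": "CA", "latitude": 37.8, "longitude": -122.4, '
    '"stars": 4.5, "review_count": 300, "attributes": null, '
    '"categories": ["Seafood", "Restaurants"], "hours": null}\n'
)
REVIEW_LINES = io.StringIO(
    '{"review_id": "r1", "user_id": "u1", "business_id": "b1", '
    '"stars": 5.0, "text": "Great fish."}\n'
    '{"review_id": "r2", "user_id": "u2", "business_id": "b9", '
    '"stars": 2.0, "text": "Closed."}\n'
)


def read_json_lines(handle):
    """Parse a file-like object holding one JSON object per line."""
    return [json.loads(line) for line in handle if line.strip()]


def merge_reviews_with_businesses(reviews, businesses):
    """Inner-join reviews to businesses on business_id and keep COLUMNS only."""
    by_id = {b["business_id"]: b for b in businesses}
    merged = []
    for review in reviews:
        business = by_id.get(review["business_id"])
        if business is None:
            continue  # review for a business outside the extract
        row = {**business, **review}  # review fields win, so "stars" is the user's rating
        merged.append({col: row.get(col) for col in COLUMNS})
    return merged


rows = merge_reviews_with_businesses(
    read_json_lines(REVIEW_LINES), read_json_lines(BUSINESS_LINES)
)
print(len(rows), rows[0]["name"], rows[0]["stars"])
```

For the real files, the `io.StringIO` samples would be replaced by open file handles; reading a file of this size line by line also makes it easy to stop after a fixed number of rows.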

To download the dataset and view the complete documentation for all of its files, see the Link.

The information in this dataset about business attributes and user reviews will be used to train the models behind the restaurant recommendation system.

In [1]:
# importing necessary packages
import collections
import folium
import json
import numpy as np
from nltk.corpus import stopwords
from nltk.stem.snowball import SnowballStemmer
from nltk.tokenize import RegexpTokenizer
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import string
import pickle
from surprise import Reader, Dataset
from tabulate import tabulate
from surprise.model_selection import cross_validate
from surprise.prediction_algorithms import SVD
from surprise.prediction_algorithms import KNNWithMeans, KNNBasic, KNNBaseline
from surprise.model_selection import GridSearchCV
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import models, layers, optimizers, losses, regularizers, metrics
# third-party package; install with `pip install wordcloud` if the import fails
from wordcloud import WordCloud


# plotting styles
plt.style.use("fivethirtyeight")
%matplotlib inline

ModuleNotFoundError: No module named 'wordcloud'