#  Level 3 - Task 1: Predictive Modeling

# Overview of Problem Statement

Predicting a restaurant's aggregate rating from attributes it already has, such as price, vote count, and service options, helps platforms and businesses improve decision-making and user experience.



# Objective

1. Build regression models to predict the aggregate rating using features such as price, votes, and service options.

2. Split the dataset into training and testing sets.

3. Compare model performance (Linear Regression, Decision Tree, Random Forest) using evaluation metrics.



# Importing necessary libraries

In [1]:
##  Import Libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import warnings
warnings.filterwarnings('ignore')

#  Data collection

In [2]:
##  Load Dataset
df = pd.read_csv("Dataset .csv")
df.dropna(subset=['Cuisines'], inplace=True)

In [3]:
# Feature Engineering: encode the Yes/No service columns as 1/0 flags
df['Has_Table_Booking_Flag'] = df['Has Table booking'].apply(lambda x: 1 if x.lower() == 'yes' else 0)
df['Has_Online_Delivery_Flag'] = df['Has Online delivery'].apply(lambda x: 1 if x.lower() == 'yes' else 0)

In [4]:
# Select features and target
features = ['Average Cost for two', 'Price range', 'Votes', 'Has_Table_Booking_Flag', 'Has_Online_Delivery_Flag']
target = 'Aggregate rating'

In [5]:
X = df[features]
y = df[target]

#  Split Data

In [6]:
##  Split Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
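
To make the hold-out step concrete, here is a minimal plain-Python sketch of what an 80/20 split with a fixed seed does. The helper `simple_train_test_split` is hypothetical and written only for illustration. It is not scikit-learn's implementation and will not reproduce the exact rows that `train_test_split` selects.

```python
import math
import random


def simple_train_test_split(rows, test_size=0.2, seed=42):
    """Hold out a fraction of rows for testing (hypothetical helper, not scikit-learn's code)."""
    indices = list(range(len(rows)))
    # A seeded generator makes the shuffle repeatable, in the spirit of random_state=42.
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(len(rows) * test_size)  # hold-out size, rounded up in this sketch
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [rows[i] for i in train_idx]
    test = [rows[i] for i in test_idx]
    return train, test


rows = list(range(10))  # stand-in for 10 restaurant records
train_rows, test_rows = simple_train_test_split(rows)
print(len(train_rows), len(test_rows))  # 8 2
```

Because the seed is fixed, calling the helper again returns the same partition, which is what makes the reported metrics repeatable from run to run.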

In [7]:
##  Initialize Models
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(random_state=42)
}

#  Train and Evaluate Models

In [8]:
##  Train and Evaluate Models
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print(f"\n {name} Results:")
    print("MAE:", mean_absolute_error(y_test, y_pred))
    print("MSE:", mse)
    # RMSE is the square root of MSE; the squared=False argument is deprecated in recent scikit-learn releases
    print("RMSE:", mse ** 0.5)
    print("R2 Score:", r2_score(y_test, y_pred))


 Linear Regression Results:
MAE: 1.0495250136543057
MSE: 1.6275791725010238
RMSE: 1.2757661119895856
R2 Score: 0.2892767308287971

 Decision Tree Results:
MAE: 0.2907853323694694
MSE: 0.20431741306022874
RMSE: 0.45201483721248437
R2 Score: 0.9107796768278703

 Random Forest Results:
MAE: 0.23567716914785236
MSE: 0.13104481072995472
RMSE: 0.3620011197910232
R2 Score: 0.9427759964839091


# Conclusion


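1. All three models were trained on 80% of the data and evaluated on the held-out 20% test set.

2. The Random Forest Regressor performed best, with the lowest error (MAE ≈ 0.24, RMSE ≈ 0.36) and the highest R² score (≈ 0.94). The Decision Tree followed closely (R² ≈ 0.91), while Linear Regression lagged well behind (R² ≈ 0.29), which suggests the relationship between these features and the rating is largely non-linear.

3. The model can be used to estimate ratings from restaurant features, aiding platform recommendations and business insights.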

