## Restaurant Customer Satisfaction Analysis and Gradient Boosting Classifier Model Report

## 1. Introduction

This report provides an in-depth analysis of the restaurant customer satisfaction dataset and the implementation of a Gradient Boosting Classifier model. The objective is to analyze the dataset, perform preprocessing, build a classification model, and evaluate its performance.

## 2. Libraries Used

The following libraries were used in this project:

In [6]:
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder, StandardScaler

## 3. Dataset Overview

The dataset used in this project is related to customer satisfaction in restaurants. It includes multiple attributes that may influence a customer's satisfaction level.

### 3.1 Number of Columns  

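The dataset's columns fall into three groups: **customer details** (for example `Age`, `Gender`, `Income`), **dining experience** (for example `VisitFrequency`, `AverageSpend`, `WaitTime`), and **satisfaction levels** (the `HighSatisfaction` target).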

### 3.2 Relationship Between Columns  

- **Demographic information** such as age and gender may influence customer satisfaction.  
- **Service quality and food quality** may have a direct impact on the satisfaction level.  
- **Wait time, pricing, and ambiance** are also key factors affecting customer experience.  
- The target variable, **HighSatisfaction**, indicates whether a customer had a high satisfaction level or not.  
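
One way to probe these relationships is to compare feature averages between the two satisfaction groups and to look at each feature's correlation with the target. The sketch below uses a small made-up frame: the column names follow the dataset's `WaitTime`, `AverageSpend`, and `HighSatisfaction` columns, but every value is invented for illustration.

```python
import pandas as pd

# Hypothetical toy data; the values are invented, not taken from the dataset.
toy = pd.DataFrame({
    "WaitTime": [10.0, 45.0, 12.0, 50.0, 15.0, 40.0],
    "AverageSpend": [80.0, 30.0, 95.0, 25.0, 70.0, 35.0],
    "HighSatisfaction": [1, 0, 1, 0, 1, 0],
})

# Mean of each numeric feature within each satisfaction group.
group_means = toy.groupby("HighSatisfaction").mean()
print(group_means)

# Pearson correlation of each feature with the target.
target_corr = toy.corr()["HighSatisfaction"].drop("HighSatisfaction")
print(target_corr)
```

On the real data, the same two calls would show whether the expected effects (for example, longer waits going with lower satisfaction) are actually present.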


## 4. Basic Analysis

The dataset was loaded and examined using the following functions:

In [8]:
data = pd.read_csv(r"C:\Users\devad\Downloads\restaurant_customer_satisfaction.csv")
print(data.head())
print(data.tail())
print(data.info())
print(data.describe())

   CustomerID  Age  Gender  Income VisitFrequency  AverageSpend  \
0         654   35    Male   83380         Weekly     27.829142   
1         655   19    Male   43623         Rarely    115.408622   
2         656   41  Female   83737         Weekly    106.693771   
3         657   43    Male   96768         Rarely     43.508508   
4         658   55  Female   67937        Monthly    148.084627   

  PreferredCuisine TimeOfVisit  GroupSize DiningOccasion  MealType  \
0          Chinese   Breakfast          3       Business  Takeaway   
1         American      Dinner          1         Casual   Dine-in   
2         American      Dinner          6    Celebration   Dine-in   
3           Indian       Lunch          1    Celebration   Dine-in   
4          Chinese   Breakfast          1       Business  Takeaway   

   OnlineReservation  DeliveryOrder  LoyaltyProgramMember   WaitTime  \
0                  0              1                     1  43.523929   
1                  0            

### 4.1 Checking for Null Values

In [None]:
print(data.isnull().sum())

The dataset contains missing values in some columns.
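
The report does not show how the missing values were filled. One possible approach, sketched here as an assumption, uses the `SimpleImputer` class imported in Section 2: the median for numeric columns and the most frequent value for categorical ones. The toy frame and its values are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy frame with one gap per column.
toy = pd.DataFrame({
    "Age": [35.0, np.nan, 41.0, 43.0],
    "Gender": ["Male", "Female", np.nan, "Male"],
})

num_cols = ["Age"]
cat_cols = ["Gender"]

# Numeric gaps: fill with the column median.
toy[num_cols] = SimpleImputer(strategy="median").fit_transform(toy[num_cols])

# Categorical gaps: fill with the most frequent label.
cat_imputer = SimpleImputer(strategy="most_frequent")
toy[cat_cols] = cat_imputer.fit_transform(toy[cat_cols].astype(object))

print(toy)
print(toy.isnull().sum())
```

In a full pipeline the imputers would normally be fitted on the training split only and then applied to the test split, so that test data does not leak into the fill values.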

## 5. Data Preprocessing

Since the dataset contains categorical variables such as `Gender`, `VisitFrequency`, and `PreferredCuisine`, we used Label Encoding to convert them into numeric values:

In [None]:
# Encode every text (object-dtype) column with its own LabelEncoder.
categorical_cols = data.select_dtypes(include='object').columns
for col in categorical_cols:
    data[col] = LabelEncoder().fit_transform(data[col])
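
`LabelEncoder` assigns integer codes in sorted order of the labels, which for strings is alphabetical rather than any natural ranking. The short sketch below illustrates this with `VisitFrequency` values seen in the preview above; the list itself is a hypothetical sample.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample of VisitFrequency labels.
visit_frequency = ["Weekly", "Rarely", "Weekly", "Rarely", "Monthly"]

le = LabelEncoder()
codes = le.fit_transform(visit_frequency)

# classes_ holds the labels in sorted order; the index is the code.
mapping = {label: code for code, label in enumerate(le.classes_)}
print(mapping)

# inverse_transform recovers the original labels from the codes.
restored = list(le.inverse_transform(codes))
print(restored)
```

Here `Monthly` gets a lower code than `Rarely`, so the codes do not reflect how often a customer visits. Tree-based models can often cope with such arbitrary codes, but keeping a fitted encoder per column makes it possible to map the codes back later.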

## 6. Model Building  

The target variable **HighSatisfaction** is separated from the feature set.  

```python
x = data.drop(['HighSatisfaction'], axis=1)
y = data['HighSatisfaction']
```


### 6.1 Splitting Data into Training and Testing Sets  

```python
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
```
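
The split above does not pass `stratify`. If the two satisfaction classes are imbalanced, a stratified split keeps the class ratio the same in the training and test parts. The following is a sketch on made-up labels, offered as an option and not as what the report ran.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 20 positives, 80 negatives.
x_toy = np.arange(100).reshape(-1, 1)
y_toy = np.array([1] * 20 + [0] * 80)

# stratify=y_toy preserves the 20/80 ratio in both parts.
x_tr, x_te, y_tr, y_te = train_test_split(
    x_toy, y_toy, test_size=0.2, random_state=0, stratify=y_toy
)

print(y_tr.mean(), y_te.mean())
```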


### 6.2 Applying Gradient Boosting Classifier

In [None]:
gbc = GradientBoostingClassifier(learning_rate=0.3, n_estimators=100, max_depth=3, min_samples_split=2, min_samples_leaf=1, ccp_alpha=10)
model = gbc.fit(x_train, y_train)
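
The `ccp_alpha` argument sets the threshold for minimal cost-complexity pruning of each tree, and scikit-learn's default is `0.0` (no pruning). A value of 10 is very large relative to typical node impurities, so it may prune each tree down to a single node, in which case the model would predict the majority class for every row. The sketch below, run on synthetic data from `make_classification` (an assumption, not the restaurant dataset), compares tree sizes and test accuracy against a majority-class baseline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with an imbalanced target.
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.85, 0.15], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)


def fit_and_summarize(ccp_alpha):
    """Fit a classifier and return (test accuracy, largest tree node count)."""
    gbc = GradientBoostingClassifier(
        learning_rate=0.3, n_estimators=100, max_depth=3,
        ccp_alpha=ccp_alpha, random_state=0,
    )
    gbc.fit(X_tr, y_tr)
    max_nodes = max(est.tree_.node_count for est in gbc.estimators_.ravel())
    return gbc.score(X_te, y_te), max_nodes


acc_pruned, nodes_pruned = fit_and_summarize(10)
acc_default, nodes_default = fit_and_summarize(0.0)

# Accuracy of always predicting the more common test class.
baseline = max(np.mean(y_te), 1 - np.mean(y_te))

print(f"ccp_alpha=10: accuracy={acc_pruned:.3f}, max nodes={nodes_pruned}")
print(f"ccp_alpha=0 : accuracy={acc_default:.3f}, max nodes={nodes_default}")
print(f"majority-class baseline: {baseline:.3f}")
```

If `nodes_pruned` comes out as 1 and the pruned accuracy matches the baseline, the accuracy mostly reflects the class balance. The same check can be run on `x_train` and `y_train` from Section 6.1.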

## 7. Model Evaluation

### 7.1 Predictions

In [None]:
y_pred = model.predict(x_test)

### 7.2 Accuracy Score

In [None]:
acc = accuracy_score(y_test, y_pred) * 100
print(f"Accuracy_Score = {acc}")

Accuracy_Score = 86.66666666666667

### 7.3 Classification Report

In [None]:
print(classification_report(y_test, y_pred))
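
Accuracy alone hides which kind of error the model makes. `confusion_matrix`, imported in Section 2, breaks the predictions down by true and predicted class. The sketch below uses invented labels (hypothetical, not the model's actual output) and shows how accuracy follows from the matrix.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels.
y_true_toy = [0, 0, 0, 1, 1, 0, 1, 0]
y_pred_toy = [0, 0, 1, 1, 0, 0, 1, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true_toy, y_pred_toy)
print(cm)

# For a binary problem, ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = cm.ravel()
acc_from_cm = (tn + tp) / cm.sum()
print(acc_from_cm, accuracy_score(y_true_toy, y_pred_toy))
```

On the real model, `confusion_matrix(y_test, y_pred)` would give the same breakdown for the test set.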

## 8. Conclusion

- The dataset was analyzed and checked for missing values.
- Categorical variables were encoded using **Label Encoding**.  
- A **Gradient Boosting Classifier** was implemented to predict customer satisfaction.  
- The model achieved an **accuracy of 86.67%**.
- Future improvements can be made by **tuning hyperparameters** or trying alternative classification algorithms.  

This report provides a comprehensive analysis and evaluation of the restaurant customer satisfaction dataset using a Gradient Boosting Classifier.

