<a href="https://www.kaggle.com/code/swish9/mcdonald-s-store-reviews-sentiment-analysis?scriptVersionId=134988699" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

<h1>Introduction</h1>

<p>In the era of online reviews, understanding customer sentiments has become crucial for businesses to enhance customer satisfaction. In this project, we delve into sentiment analysis of McDonald's store reviews, using machine learning techniques to predict the sentiment expressed in textual data.

Using a dataset of McDonald's store reviews obtained from Kaggle, we label each review with a lexicon-based sentiment score, transform the text into TF-IDF features, and train a support vector classifier to classify reviews as positive, negative, or neutral.

A small prediction function lets users enter their own text and obtain a sentiment prediction, and a web-based interface is planned as a next step. This project aims to provide insight into customer perceptions of McDonald's that the company could use to improve its offerings and dining experience.

In summary, this project combines the power of machine learning and customer reviews to enable McDonald's and other businesses to gain a deeper understanding of customer sentiments, paving the way for data-driven improvements and enhanced customer satisfaction.
</p>

![Header image](https://images.unsplash.com/photo-1587361144243-03f7925381ca?ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D&auto=format&fit=crop&w=1035&q=80)

# Sentiment Analysis:

<p>In the sentiment analysis section, I imported the dataset and examined its columns, first rows, a random sample of records, and summary information to gain a better understanding of the data. I then used NLTK's SentimentIntensityAnalyzer, a lexicon-based (VADER) sentiment analysis tool, to calculate sentiment scores for each review.

Based on these sentiment scores, I classified each review into a sentiment category. A review whose compound score was at least 0.05 was labeled positive, one whose compound score was at most -0.05 was labeled negative, and anything in between was labeled neutral. This process allowed me to categorize the dataset by sentiment and gain insights into the overall sentiment distribution.</p>

<h2>Importing Libraries</h2>

**Basic ML Libraries**

In [1]:
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

**For Sentiment Analysis**

In [2]:
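import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

# SentimentIntensityAnalyzer needs the VADER lexicon; fetch it if it is not already installed
nltk.download("vader_lexicon", quiet=True)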



**For Building Model**

In [3]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

**For Ignoring Warnings**

In [4]:
import warnings
warnings.filterwarnings("ignore")

* <h2>Dataset Import:</h2>
I imported the dataset into my notebook to access the review text and the associated ratings. The file is read with `latin-1` encoding.

In [5]:
mcd = pd.read_csv("/kaggle/input/mcdonalds-store-reviews/McDonald_s_Reviews.csv", encoding="latin-1")

* <h2>Data Exploration:</h2>
To understand the dataset, I examined its columns, inspected the first few rows using the head() function, and reviewed a sample of records. This exploration provided insights into the dataset's structure and contents.

In [6]:
mcd.columns

Index(['reviewer_id', 'store_name', 'category', 'store_address', 'latitude ',
       'longitude', 'rating_count', 'review_time', 'review', 'rating'],
      dtype='object')
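The output above shows that the `latitude ` column name carries a trailing space, which is easy to miss when selecting the column by name. A minimal sketch of normalizing the column names, using a small stand-in DataFrame (`df` is illustrative, not the real dataset):

```python
import pandas as pd

# Stand-in frame with the same quirk as the real data: 'latitude ' has a trailing space
df = pd.DataFrame({"latitude ": [30.460718], "longitude": [-97.792874]})

# Strip surrounding whitespace from every column name
df.columns = df.columns.str.strip()

print(list(df.columns))
```

The same line applied to `mcd.columns` would let later cells refer to `mcd['latitude']` without the trailing space.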

<h3>Short description of each column</h3>

* reviewer_id: Unique identifier for each reviewer (anonymized)
* store_name: Name of the McDonald's store
* category: Category or type of the store
* store_address: Address of the store
* latitude: Latitude coordinate of the store's location
* longitude: Longitude coordinate of the store's location
* rating_count: Number of ratings/reviews for the store
* review_time: Timestamp of the review
* review: Textual content of the review
* rating: Rating provided by the reviewer

In [7]:
mcd.head(10)

Unnamed: 0,reviewer_id,store_name,category,store_address,latitude,longitude,rating_count,review_time,review,rating
0,1,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,Why does it look like someone spit on my food?...,1 star
1,2,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,It'd McDonalds. It is what it is as far as the...,4 stars
2,3,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,Made a mobile order got to the speaker and che...,1 star
3,4,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,a month ago,My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...,5 stars
4,5,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,2 months ago,"I repeat my order 3 times in the drive thru, a...",1 star
5,6,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 weeks ago,I work for door dash and they locked us all ou...,1 star
6,7,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,If I could give this location a zero on custo...,1 star
7,8,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,a year ago,Came in and ordered a Large coffee w/no ice. T...,1 star
8,9,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,Went thru drive thru. Ordered. Getting home no...,1 star
9,10,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,"I'm not really a huge fan of fast food, but I ...",4 stars


In [8]:
mcd.sample(10)

Unnamed: 0,reviewer_id,store_name,category,store_address,latitude,longitude,rating_count,review_time,review,rating
29110,29111,McDonald's,Fast food restaurant,"1415 E State Rd, Fern Park, FL 32730, United S...",28.65535,-81.342692,1618,a year ago,Great service thank you,5 stars
6054,6055,McDonald's,Fast food restaurant,"262 Canal St, New York, NY 10013, United States",40.718514,-74.001168,3196,4 years ago,Delicious fast food!,5 stars
22315,22316,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,11 months ago,Slowest fast food joint in the world. Took 45 ...,2 stars
11286,11287,McDonald's,Fast food restaurant,"1121 Garnet Ave, San Diego, CA 92109, United S...",32.797661,-117.24947,1159,4 years ago,Even though they renovated this place a few mo...,2 stars
2617,2618,McDonald's,Fast food restaurant,"72-69 Kissena Blvd, Queens, NY 11367, United S...",40.727401,-73.81246,2193,2 years ago,Good ï¿½ï¿½ï¿,5 stars
31757,31758,McDonald's,Fast food restaurant,"632 S R L Thornton Freeway Service Rd, Dallas,...",32.744596,-96.812286,2658,3 years ago,Great setup for a nice day at the zoo,3 stars
4218,4219,McDonald's,Fast food restaurant,"724 Broadway, New York, NY 10003, United States",40.729126,-73.993264,1670,4 years ago,Good luck getting service. I tried to order at...,1 star
11515,11516,McDonald's,Fast food restaurant,"1121 Garnet Ave, San Diego, CA 92109, United S...",32.797661,-117.24947,1159,a year ago,Ordered mcchicken biscuit. Got just a biscuit....,1 star
2755,2756,McDonald's,Fast food restaurant,"72-69 Kissena Blvd, Queens, NY 11367, United S...",40.727401,-73.81246,2193,5 years ago,Fast service,4 stars
68,69,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,a year ago,So Uber eats was having connectivity issues an...,1 star


In [9]:
mcd.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 33396 entries, 0 to 33395
Data columns (total 10 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   reviewer_id    33396 non-null  int64  
 1   store_name     33396 non-null  object 
 2   category       33396 non-null  object 
 3   store_address  33396 non-null  object 
 4   latitude       32736 non-null  float64
 5   longitude      32736 non-null  float64
 6   rating_count   33396 non-null  object 
 7   review_time    33396 non-null  object 
 8   review         33396 non-null  object 
 9   rating         33396 non-null  object 
dtypes: float64(2), int64(1), object(7)
memory usage: 2.5+ MB
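The summary above lists `rating` as an `object` column, and the `head()` output shows values such as `1 star` and `4 stars`. If a numeric rating is needed later, one possible conversion is sketched below on stand-in values (the `ratings` and `rating_num` names are assumptions for illustration):

```python
import pandas as pd

# Stand-in values in the same format as the 'rating' column shown by head()
ratings = pd.Series(["1 star", "4 stars", "5 stars"])

# Pull out the leading digits and convert them to integers
rating_num = ratings.str.extract(r"(\d+)", expand=False).astype(int)

print(rating_num.tolist())
```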


* <h2>Sentiment Score Calculation:</h2>
I used NLTK's SentimentIntensityAnalyzer to calculate sentiment scores for each review. Its `polarity_scores` method returns a dictionary with `neg`, `neu`, and `pos` scores plus an overall `compound` score for the text.

In [10]:
sia = SentimentIntensityAnalyzer()

In [11]:
# Perform sentiment analysis on each review (one score dictionary per review)
sentiments = [sia.polarity_scores(review) for review in mcd['review']]
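Each entry in `sentiments` is a dictionary of scores, so the whole list can be turned into a table for inspection. A minimal sketch with hand-written stand-in values (the numbers in `example_scores` are hypothetical, not real analyzer output):

```python
import pandas as pd

# Hypothetical score dictionaries in the shape returned by polarity_scores
example_scores = [
    {"neg": 0.0, "neu": 0.4, "pos": 0.6, "compound": 0.7},
    {"neg": 0.5, "neu": 0.5, "pos": 0.0, "compound": -0.6},
]

# One row per review, one column per score
scores_df = pd.DataFrame(example_scores)

print(scores_df.columns.tolist())
```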

* <h2>Sentiment Classification:</h2>
Based on the compound score, I classified the reviews into sentiment categories. If the compound score was at or above 0.05, I labeled the review as positive. If it was at or below -0.05, I labeled it as negative. Reviews with compound scores between those two thresholds were considered neutral.


In [12]:
sentiment_labels = []
for sentiment in sentiments:
    compound_score = sentiment['compound']
    if compound_score >= 0.05:
        sentiment_labels.append('Positive')
    elif compound_score <= -0.05:
        sentiment_labels.append('Negative')
    else:
        sentiment_labels.append('Neutral')
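The same rule can be written as a small reusable function, which makes the thresholds explicit and easy to test at the boundaries. This is a sketch: the function name and its keyword parameters are assumptions for illustration, and the defaults mirror the cut-offs in the loop above.

```python
def label_from_compound(compound, pos_threshold=0.05, neg_threshold=-0.05):
    """Map a compound score to a label using the same cut-offs as the loop above."""
    if compound >= pos_threshold:
        return "Positive"
    if compound <= neg_threshold:
        return "Negative"
    return "Neutral"


# Check a few scores, including the boundary values
for score in (0.7, 0.05, 0.0, -0.05, -0.6):
    print(score, label_from_compound(score))
```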

In [13]:
# Add the sentiment labels to the DataFrame
mcd['sentiment'] = sentiment_labels

In [14]:
mcd[['review', 'sentiment']]

Unnamed: 0,review,sentiment
0,Why does it look like someone spit on my food?...,Positive
1,It'd McDonalds. It is what it is as far as the...,Positive
2,Made a mobile order got to the speaker and che...,Negative
3,My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...,Neutral
4,"I repeat my order 3 times in the drive thru, a...",Negative
...,...,...
33391,They treated me very badly.,Negative
33392,The service is very good,Positive
33393,To remove hunger is enough,Negative
33394,"It's good, but lately it has become very expen...",Positive
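To look at the overall sentiment distribution, the labels can be counted per category. A minimal sketch on a stand-in Series (the `labels` values are hypothetical and stand in for `mcd['sentiment']`):

```python
import pandas as pd

# Hypothetical labels standing in for mcd['sentiment']
labels = pd.Series(["Positive", "Positive", "Negative", "Neutral"])

counts = labels.value_counts()                # absolute count per label
shares = labels.value_counts(normalize=True)  # fraction of reviews per label

print(counts)
print(shares)
```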


# Machine Learning

1. <h2>Dataset Splitting:</h2>
I divided the dataset into training and test sets to evaluate the performance of my model on unseen data. The training set was used to train the machine learning model, while the test set served as a benchmark for assessing its accuracy.

In [15]:
X = mcd['review']
y = mcd['sentiment']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
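The split above is purely random. When the classes are imbalanced, passing `stratify` to `train_test_split` keeps the label proportions the same in both sets. The sketch below uses toy data (the `texts` and `labels` lists are illustrative only) and is an optional variation, not the split used in this notebook:

```python
from sklearn.model_selection import train_test_split

# Toy data: 10 reviews with an imbalanced 6/4 label split (illustrative only)
texts = [f"review {i}" for i in range(10)]
labels = ["Positive"] * 6 + ["Negative"] * 4

# stratify=labels preserves the 60/40 class ratio in both halves
X_tr, X_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.5, random_state=42, stratify=labels
)

print(sorted(y_te))
```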

2. <h2>Vectorization:</h2>
I used TF-IDF vectorization to convert the review text into a numerical representation suitable for machine learning algorithms. The vectorizer is fitted on the training reviews only and then applied unchanged to the test reviews, so the test set does not influence the learned vocabulary or weights.

In [16]:
vectorizer = TfidfVectorizer()
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)
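A small sketch of what `fit_transform` and `transform` do, on a toy corpus (the documents and variable names below are made up for illustration). Words that never occurred in the training text are simply ignored at transform time:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

train_docs = ["great service", "terrible service", "great food"]
test_docs = ["great burger"]  # 'burger' never appears in the training text

vec = TfidfVectorizer()
train_matrix = vec.fit_transform(train_docs)  # learns the vocabulary and IDF weights
test_matrix = vec.transform(test_docs)        # reuses them; unseen words are dropped

print(sorted(vec.vocabulary_))
print(train_matrix.shape, test_matrix.shape)
```

Both matrices have one column per vocabulary word learned from the training documents, which is why the model can be applied to the test matrix directly.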

3. <h2>Model Training (Support Vector Classifier):</h2>
I used scikit-learn's Support Vector Classifier (SVC) with its default settings to train my sentiment analysis model. SVC is a machine learning algorithm commonly used for classification tasks. The targets are the labels produced in the Sentiment Classification step, so the model learns to predict those labels from the TF-IDF features of each review.

In [17]:
# Train an SVM model
model = SVC()
model.fit(X_train_tfidf, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test_tfidf)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print("Classification Report:")
print(classification_report(y_test, y_pred))

Accuracy: 0.88937125748503
Classification Report:
              precision    recall  f1-score   support

    Negative       0.82      0.87      0.85      1922
     Neutral       0.88      0.83      0.86      1245
    Positive       0.93      0.92      0.93      3513

    accuracy                           0.89      6680
   macro avg       0.88      0.87      0.88      6680
weighted avg       0.89      0.89      0.89      6680
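To put the accuracy in context, it helps to compare it with the accuracy of always predicting the most frequent class, and to see how the macro and weighted averages are formed. The sketch below only re-does the arithmetic on the figures printed in the report above; the variable names are illustrative.

```python
# Per-class figures copied from the classification report above
support = {"Negative": 1922, "Neutral": 1245, "Positive": 3513}
recall = {"Negative": 0.87, "Neutral": 0.83, "Positive": 0.92}

total = sum(support.values())

# Accuracy of always predicting the most frequent class
majority_baseline = max(support.values()) / total

# Macro average: plain mean over classes; weighted average: mean weighted by support
macro_recall = sum(recall.values()) / len(recall)
weighted_recall = sum(recall[c] * support[c] for c in support) / total

print(round(majority_baseline, 3), round(macro_recall, 2), round(weighted_recall, 2))
```

The majority-class baseline works out to roughly 0.526, and the recomputed macro and weighted recall round to the 0.87 and 0.89 shown in the report.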



4. <h2>Sentiment Prediction Function:</h2>
To enhance usability, I created a function that takes a review as input and predicts its sentiment. This function utilizes the trained SVC model to analyze the input review's features and classify it as positive, negative, or neutral. The function provides the sentiment prediction as the output.


In [18]:
def predict_sentiment(review):
    """Return the predicted sentiment label for a single review string."""
    # No extra preprocessing was applied during training, so the raw text
    # goes straight into the fitted TF-IDF vectorizer
    review_tfidf = vectorizer.transform([review])

    # Predict the sentiment; predict() returns an array with one label
    sentiment = model.predict(review_tfidf)

    return sentiment[0]
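The function above relies on the global `vectorizer` and `model` staying in sync. One alternative, sketched here under the assumption that a scikit-learn pipeline is acceptable, bundles both steps into a single object. The toy training data and the names `sentiment_pipeline` and `predict_sentiment_pipeline` are hypothetical:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Toy training data (illustrative, not from the dataset)
texts = ["great food", "excellent service", "terrible food", "awful service"]
labels = ["Positive", "Positive", "Negative", "Negative"]

# The pipeline fits the vectorizer and the classifier together
sentiment_pipeline = make_pipeline(TfidfVectorizer(), SVC())
sentiment_pipeline.fit(texts, labels)


def predict_sentiment_pipeline(review):
    """Return the predicted label for a single raw review string."""
    return sentiment_pipeline.predict([review])[0]


print(predict_sentiment_pipeline("great service"))
```

With this design the raw text is the only input, so the vectorizer cannot be forgotten or swapped for a different one at prediction time.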

5. <h2>Sample Testing:</h2>
To spot-check the model, I ran the prediction function on a few hand-written reviews.

In [19]:
new_review = "This restaurant has excellent service and delicious food."
predicted_sentiment = predict_sentiment(new_review)
print("Predicted sentiment:", predicted_sentiment)

Predicted sentiment: Positive


In [20]:
new_review2 = "This restaurant sucks."
predicted_sentiment = predict_sentiment(new_review2)
print("Predicted sentiment:", predicted_sentiment)

Predicted sentiment: Negative


In [21]:
new_review3 = "This is fine"
predicted_sentiment = predict_sentiment(new_review3)
print("Predicted sentiment:", predicted_sentiment)

Predicted sentiment: Positive


In [22]:
new_review4 = "This is dull"
predicted_sentiment = predict_sentiment(new_review4)
print("Predicted sentiment:", predicted_sentiment)

Predicted sentiment: Neutral


In [23]:
new_review5 = "its bad"
predicted_sentiment = predict_sentiment(new_review5)
print("Predicted sentiment:", predicted_sentiment)

Predicted sentiment: Negative


In [24]:
pip install git+https://github.com/Swish78/strkitN.git

Collecting git+https://github.com/Swish78/strkitN.git
  Cloning https://github.com/Swish78/strkitN.git to /tmp/pip-req-build-fpvvnqua
  Running command git clone --filter=blob:none --quiet https://github.com/Swish78/strkitN.git /tmp/pip-req-build-fpvvnqua
  Resolved https://github.com/Swish78/strkitN.git to commit 2f311fd59d12e509eec4d385d027562073f9eb63
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: strkitN
  Building wheel for strkitN (setup.py) ... done
  Created wheel for strkitN: filename=strkitN-0.1.0-py2.py3-none-any.whl size=3706 sha256=217337dc122eed4af40a2c3fa2307c57639f05924bf818d56e00f6d923e1f9b7
  Stored in directory: /tmp/pip-ephem-wheel-cache-klsvk5x6/wheels/fc/87/6a/f64b4beb4deaa56f93821e265bac475f45bef14f2212788ad2
Successfully built strkitN
Installing collected packages: strkitN
Successfully installed strkitN-0.1.0
Note: you may need to restart the kernel to use updated packages.


# Web-Based GUI Development

**In the near future, I plan to add a web-based GUI to my project using frameworks like Django, Flask, or Streamlit. This GUI will enable users to input their text, such as a review, and receive real-time sentiment predictions, enhancing accessibility and usability.**