# **📌 Sentiment Analysis on Swiggy Reviews using Simple RNN 🏆**

# 📖 Overview
This notebook trains a Simple RNN to classify Swiggy restaurant reviews as Positive, Negative, or Neutral. The labels are derived from the `Avg Rating` column: ratings of 4 and above are Positive, ratings of 2 and below are Negative, and everything in between is Neutral.

# **1️⃣ Load the dataset and select relevant columns (Review & Avg Rating).**

In [24]:
import pandas as pd
import numpy as np
import re
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Embedding

# Load dataset
url = "https://media.geeksforgeeks.org/wp-content/uploads/20250213152158779318/swiggy.csv"
data = pd.read_csv(url)

In [25]:
# Display column names
print(data.columns)

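# Select only relevant columns; .copy() avoids SettingWithCopyWarning on later assignments
data = data[['Review', 'Avg Rating']].copy()
# 'sentiment' holds the numeric rating until it is encoded in step 3
data.columns = ['review', 'sentiment']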

# Display column names after renaming
print(data.columns)

Index(['ID', 'Area', 'City', 'Restaurant Price', 'Avg Rating', 'Total Rating',
       'Food Item', 'Food Type', 'Delivery Time', 'Review'],
      dtype='object')
Index(['review', 'sentiment'], dtype='object')


# **2️⃣ Preprocess text data by cleaning special characters.**

In [13]:
# Preprocessing function
def clean_text(text):
    text = text.lower()
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)  # Remove special characters
    return text

# Apply preprocessing
data['review'] = data['review'].astype(str).apply(clean_text)
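
To make the effect of the cleaning step concrete, the sketch below reapplies the same `clean_text` logic to a made-up review (the sample string is an assumption, not taken from the dataset). Punctuation is deleted rather than replaced by a space, so tokens joined by punctuation get merged.

```python
import re

def clean_text(text):
    # Same logic as the notebook's preprocessing function
    text = text.lower()
    return re.sub(r'[^a-zA-Z0-9\s]', '', text)

# Hypothetical sample review, for illustration only
sample = "The food was AMAZING!!! 10/10, would re-order :)"
cleaned = clean_text(sample)
print(repr(cleaned))
```

Here "10/10" becomes "1010" and "re-order" becomes "reorder", which may or may not be desirable depending on how much such tokens matter for the task.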

# **3️⃣ Convert ratings into sentiment labels (Positive, Negative, Neutral).**

In [14]:
def encode_sentiment(rating):
    if rating >= 4:
        return 1  # Positive
    elif rating <= 2:
        return 0  # Negative
    else:
        return 2  # Neutral

data['sentiment'] = data['sentiment'].apply(encode_sentiment)
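
The thresholds are easiest to verify at their boundaries. The following sketch repeats `encode_sentiment` and runs it on a few hypothetical ratings, including a missing value, to show where each one lands.

```python
import math
from collections import Counter

def encode_sentiment(rating):
    # Same thresholds as the notebook: >= 4 Positive, <= 2 Negative, else Neutral
    if rating >= 4:
        return 1
    elif rating <= 2:
        return 0
    else:
        return 2

# Hypothetical ratings, chosen to hit the boundaries and a missing value
ratings = [4.5, 4.0, 3.9, 2.0, 1.2, math.nan]
labels = [encode_sentiment(r) for r in ratings]
counts = Counter(labels)
print(labels)
print(counts)
```

A missing rating (`NaN`) fails both comparisons and therefore falls through to Neutral. On the real data, `data['sentiment'].value_counts()` after encoding shows how balanced the three classes are.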

# **4️⃣ Tokenize and pad sequences for input to the RNN model.**

In [15]:
# Split data
X_train, X_test, y_train, y_test = train_test_split(
    data['review'], data['sentiment'], test_size=0.2, random_state=42
)

# Tokenization
max_words = 5000  # Limit vocabulary size
tokenizer = Tokenizer(num_words=max_words, oov_token="<OOV>")
tokenizer.fit_on_texts(X_train)

X_train_seq = tokenizer.texts_to_sequences(X_train)
X_test_seq = tokenizer.texts_to_sequences(X_test)

# Padding
max_len = 100  # Set max length for reviews
X_train_pad = pad_sequences(X_train_seq, maxlen=max_len, padding='post')
X_test_pad = pad_sequences(X_test_seq, maxlen=max_len, padding='post')
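
The choice of `max_len = 100` is easier to justify once the distribution of review lengths is known. A minimal sketch, assuming whitespace splitting approximates the tokenizer's word count (the helper `length_coverage` and the sample texts are hypothetical):

```python
def length_coverage(texts, max_len):
    """Return the fraction of texts with at most max_len whitespace-separated words."""
    lengths = [len(t.split()) for t in texts]
    return sum(n <= max_len for n in lengths) / len(lengths)

# Hypothetical cleaned reviews, for illustration only
texts = ["good food", "late delivery and cold food", "ok"]
print(length_coverage(texts, max_len=4))
```

In the notebook this could be called as `length_coverage(X_train, max_len)`. Reviews longer than `max_len` are truncated by `pad_sequences`, which by default drops tokens from the start of the sequence (`truncating='pre'`).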

# **5️⃣ Build a Simple RNN for sentiment classification.**

In [27]:
model = Sequential([
    Embedding(input_dim=max_words, output_dim=64),
    SimpleRNN(64, return_sequences=False),
    Dense(32, activation='relu'),
    Dense(3, activation='softmax')  # 3 output classes (Positive, Negative, Neutral)
])
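
As a sanity check on the architecture, the trainable parameter count can be worked out by hand from the layer sizes above. The sketch below uses the standard formulas for embedding, simple recurrent, and dense layers; the variable names are illustrative, and the total should match what `model.summary()` reports once the model is built.

```python
# Layer sizes taken from the model definition above
max_words, embed_dim, rnn_units, dense_units, n_classes = 5000, 64, 64, 32, 3

# Embedding: one vector of size embed_dim per vocabulary index
embedding = max_words * embed_dim
# SimpleRNN: input weights + recurrent weights + bias
simple_rnn = rnn_units * (embed_dim + rnn_units + 1)
# Dense layers: weights + bias
dense_hidden = rnn_units * dense_units + dense_units
dense_out = dense_units * n_classes + n_classes

total = embedding + simple_rnn + dense_hidden + dense_out
print(embedding, simple_rnn, dense_hidden, dense_out, total)
```

Almost all of the parameters sit in the embedding table, so `max_words` and `output_dim` dominate the model size.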

# **6️⃣ Compile, train, and evaluate the model on test data.**

In [17]:
# Compile model
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train model; the test split doubles as validation data, so val_accuracy is measured on the test set
model.fit(X_train_pad, y_train, epochs=5, batch_size=32, validation_data=(X_test_pad, y_test))

# Evaluate model
loss, accuracy = model.evaluate(X_test_pad, y_test)
print(f'Test Accuracy: {accuracy * 100:.2f}%')

Epoch 1/5
200/200 ━━━━━━━━━━━━━━━━━━━━ 9s 32ms/step - accuracy: 0.6786 - loss: 0.7009 - val_accuracy: 0.7225 - val_loss: 0.5925
Epoch 2/5
200/200 ━━━━━━━━━━━━━━━━━━━━ 5s 26ms/step - accuracy: 0.7136 - loss: 0.6031 - val_accuracy: 0.7225 - val_loss: 0.5921
Epoch 3/5
200/200 ━━━━━━━━━━━━━━━━━━━━ 10s 25ms/step - accuracy: 0.7181 - loss: 0.6003 - val_accuracy: 0.7225 - val_loss: 0.5925
Epoch 4/5
200/200 ━━━━━━━━━━━━━━━━━━━━ 6s 31ms/step - accuracy: 0.7161 - loss: 0.6009 - val_accuracy: 0.7225 - val_loss: 0.6037
Epoch 5/5
200/200 ━━━━━━━━━━━━━━━━━━━━ 5s 24ms/step - accuracy: 0.7188 - loss: 0.6006 - val_accuracy: 0.7225 - val_loss: 0.5908
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.7287 - loss: 0.5848
Test Accuracy: 72.25%
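
The validation accuracy is identical (0.7225) in every epoch, which is the pattern one would expect if the model were predicting a single class for every test review. One way to check, sketched below with a hypothetical helper `majority_baseline` and made-up labels, is to compare the reported accuracy with the share of the most common class in the test labels.

```python
from collections import Counter

def majority_baseline(labels):
    """Return (most_common_label, its share of all labels)."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

# Hypothetical encoded labels, for illustration only
y_example = [1, 1, 1, 0, 2, 1, 0, 1]
label, share = majority_baseline(y_example)
print(label, share)
```

In the notebook, `majority_baseline(y_test)` would give the figure to compare against 72.25%. If the two match, the network has likely not learned anything beyond the class prior. Commonly suggested remedies to try include `padding='pre'` or `mask_zero=True` on the `Embedding` layer, so that the recurrent state is not computed over trailing padding, though their effect on this dataset is untested here.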


# **7️⃣ Predict sentiment of new reviews using the `predict_sentiment()` function.**

In [18]:
# Predict sentiment function
def predict_sentiment(review_text):
    review_text = clean_text(review_text)
    seq = tokenizer.texts_to_sequences([review_text])
    padded = pad_sequences(seq, maxlen=max_len, padding='post')
    prediction = model.predict(padded)
    sentiment_labels = {0: "Negative", 1: "Positive", 2: "Neutral"}
    return sentiment_labels[np.argmax(prediction)]
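
`predict_sentiment` returns only the winning label. When the predicted probability matters too, the decoding step can be separated out. A minimal sketch, where `decode_prediction` is a hypothetical helper and the probability row is made up rather than produced by the model:

```python
SENTIMENT_LABELS = {0: "Negative", 1: "Positive", 2: "Neutral"}

def decode_prediction(prediction):
    """Map one row of class probabilities, shape (1, 3), to (label, probability)."""
    probs = [float(p) for p in prediction[0]]
    # Index of the largest probability, equivalent to argmax over the row
    best = max(range(len(probs)), key=probs.__getitem__)
    return SENTIMENT_LABELS[best], probs[best]

# Made-up probabilities standing in for model.predict(padded)
example = [[0.10, 0.75, 0.15]]
label, prob = decode_prediction(example)
print(label, prob)
```

The helper accepts a nested list as well as the array returned by `model.predict(padded)`, so it could replace the last two lines of `predict_sentiment` if a confidence value is wanted alongside the label.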


In [26]:
# Test the prediction function on a sample review
sample_review1 = "The food was amazing and delivered on time!"
print(f'Sample Review: "{sample_review1}"')
print(f'Predicted Sentiment: {predict_sentiment(sample_review1)}')

Sample Review: "The food was amazing and delivered on time!"
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 38ms/step
Predicted Sentiment: Positive
