In [2]:
!pip install tensorflow 

Collecting tensorflow
  Downloading tensorflow-2.15.0-cp311-cp311-macosx_12_0_arm64.whl.metadata (3.6 kB)
Collecting tensorflow-macos==2.15.0 (from tensorflow)
  Downloading tensorflow_macos-2.15.0-cp311-cp311-macosx_12_0_arm64.whl.metadata (4.2 kB)
Collecting absl-py>=1.0.0 (from tensorflow-macos==2.15.0->tensorflow)

In [10]:
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer

# Load the dataset (first 5000 rows)
data = pd.read_csv('/Users/shobhitdhanyakumardiggikar/Downloads/restaurant_reviews_az.csv', nrows=5000)

In [11]:
# Vectorize the text reviews to numerical data
tfidf = TfidfVectorizer(max_features=5000)
X = tfidf.fit_transform(data['text']).toarray()
y = data['Sentiment']  

In [12]:
# Define the ANN model
model = Sequential([
    Dense(2000, input_dim=5000, activation='relu'),
    Dense(1000, activation='relu'),
    Dense(1, activation='sigmoid')  # Output layer for binary classification
])

# Compile the model
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

# Display the model's architecture
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_3 (Dense)             (None, 2000)              10002000  
                                                                 
 dense_4 (Dense)             (None, 1000)              2001000   
                                                                 
 dense_5 (Dense)             (None, 1)                 1001      
                                                                 
Total params: 12004001 (45.79 MB)
Trainable params: 12004001 (45.79 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
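
The parameter counts in the summary can be reproduced by hand: a `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. Below is a small sketch of that arithmetic. `dense_params` is a helper name introduced here for illustration, and the 4 bytes per parameter assumes float32 weights.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# Layer widths taken from the model definition above: 5000 -> 2000 -> 1000 -> 1
layer_sizes = [5000, 2000, 1000, 1]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

# Assuming float32 (4 bytes per parameter); 1 MB is taken as 2**20 bytes
size_mb = total * 4 / 2**20

print(per_layer)          # [10002000, 2001000, 1001]
print(total)              # 12004001
print(round(size_mb, 2))  # 45.79
```

Almost all of the parameters sit in the first layer, because its size scales with the 5,000 TF-IDF features.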


In [13]:
from tensorflow.keras.callbacks import ModelCheckpoint

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Set up a checkpoint to save the best model
checkpoint = ModelCheckpoint('best_model.h5', monitor='val_accuracy', save_best_only=True)

# Fit the model
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=2, batch_size=8, callbacks=[checkpoint])

Epoch 1/2
Epoch 2/2
 11/469 [..............................] - ETA: 5s - loss: 0.6133 - accuracy: 0.6932
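
One thing to watch in the cells above is that the TF-IDF vocabulary and IDF weights are fitted on all 5,000 rows before the split, so the validation rows influence the features. The sketch below shows the alternative order, splitting first and fitting the vectorizer on the training rows only. The toy `texts` and `labels` are made up for illustration and stand in for `data['text']` and `data['Sentiment']`; the names `vectorizer`, `X_tr`, and `X_te` are also introduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Toy stand-ins for data['text'] and data['Sentiment'] (illustrative only)
texts = [
    "great food and friendly staff",
    "terrible service and cold food",
    "lovely place with clean tables",
    "dirty tables and rude staff",
    "delicious food and quick service",
    "slow service and bland food",
    "friendly staff and great location",
    "rude waiter and long wait",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

# Split the raw text first so the test rows stay unseen
text_tr, text_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.25, random_state=42
)

# Fit the vocabulary and IDF weights on the training rows only
vectorizer = TfidfVectorizer(max_features=5000)
X_tr = vectorizer.fit_transform(text_tr).toarray()

# Reuse the fitted vectorizer for the test rows (transform, not fit_transform)
X_te = vectorizer.transform(text_te).toarray()

print(X_tr.shape, X_te.shape)
```

Words that appear only in the test rows are simply ignored by `transform`, which mirrors what happens with genuinely new reviews at prediction time.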





In [15]:
# Example reviews (replace these with the actual reviews you have)
reviews = [
    "The service is good but location is hard to find. Sanitation is not very good with old facilities.",
    "The restaurant is definitely one of my favorites. The place is clean and the food is delicious!",
    "I appreciated the friendly staff. The food was good not amazing. The service was not prompt."
]

# Transform reviews using the same TF-IDF vectorizer
reviews_transformed = tfidf.transform(reviews).toarray()

# Predict using the trained model
predictions = model.predict(reviews_transformed)
print(predictions)

[[0.7141115 ]
 [0.71894896]
 [0.7089834 ]]
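
The values printed above are sigmoid outputs, which can be read as the model's estimated probability of the positive class. Here is a minimal sketch of turning them into class labels, assuming a 0.5 decision threshold and that `1` encodes positive sentiment (neither is stated in the notebook):

```python
import numpy as np

# Probabilities copied from the output above
predictions = np.array([[0.7141115], [0.71894896], [0.7089834]])

# Assumed threshold of 0.5; 1 is assumed to mean positive sentiment
labels = (predictions.ravel() >= 0.5).astype(int)
print(labels)  # [1 1 1]

# Spread between the highest and lowest score across the three reviews
spread = float(predictions.max() - predictions.min())
print(spread)
```

All three reviews get the same label and the scores differ by only about 0.01, even though the reviews differ noticeably in tone.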


# Text Cell 8 - Comparison of ANN Models with Different Word Representation Methods (TF-IDF vs. Word Embedding)
### TF-IDF Representation:
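The Artificial Neural Network (ANN) trained on TF-IDF features reached approximately 71% accuracy on both the training and validation sets. The model appears to have picked up some patterns in the text, but the three example reviews above all receive nearly the same score (about 0.71) despite their different tones, so it may be doing little more than predicting the overall share of positive labels; comparing against a majority-class baseline would settle this. TF-IDF vectors are also sparse and ignore word order, which limits how well the model can capture relationships between words.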
### Word Embedding Representation:
The ANN trained on the tokenized representation ran into trouble: the loss and accuracy became NaN, which indicates numerical instability during training. Note that the notebook feeds padded integer token indices directly into the `Dense` layers rather than passing them through an `Embedding` layer, so the inputs are unscaled integers as large as 4,999. Values of that magnitude can produce very large activations and parameter updates under plain SGD, which is a likely cause of the NaNs. True word embeddings give dense, continuous representations that let a model capture more nuanced relationships than TF-IDF, but this experiment does not actually test them, so the result says little about embeddings themselves.
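
For contrast, a word-embedding model would pass the integer token indices through an `Embedding` layer, which looks up a trainable dense vector for each index before any `Dense` layer sees it. The following is a rough sketch, not the notebook's model: the embedding width of 32, the average pooling, the hidden width of 64, and the gradient clipping are all assumptions chosen for illustration, and `build_embedding_model` is a name introduced here.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 5000   # matches Tokenizer(num_words=5000)
MAX_LEN = 50        # matches pad_sequences(maxlen=50)

def build_embedding_model(embed_dim: int = 32) -> tf.keras.Model:
    """Small classifier over token indices; all sizes are illustrative."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(MAX_LEN,)),
        # Maps each integer index to a trainable dense vector
        layers.Embedding(input_dim=VOCAB_SIZE, output_dim=embed_dim),
        # Averages the word vectors into one vector per review
        layers.GlobalAveragePooling1D(),
        layers.Dense(64, activation='relu'),
        layers.Dense(1, activation='sigmoid'),
    ])
    # clipnorm caps the gradient norm, one common guard against NaN losses
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, clipnorm=1.0)
    model.compile(optimizer=optimizer, loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

embedding_model = build_embedding_model()

# Random indices stand in for padded reviews, just to check the shapes
dummy_batch = np.random.randint(0, VOCAB_SIZE, size=(4, MAX_LEN))
dummy_probs = embedding_model.predict(dummy_batch, verbose=0)
print(dummy_probs.shape)  # (4, 1)
```

Because the lookup replaces each index with a small learned vector, the magnitude of the index no longer reaches the dense layers.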

In [16]:
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Reload the first 5000 rows of the data
data = pd.read_csv('/Users/shobhitdhanyakumardiggikar/Downloads/restaurant_reviews_az.csv', nrows=5000)

# Tokenize and pad the sequences to a fixed length
tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(data['text'])
sequences = tokenizer.texts_to_sequences(data['text'])
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))

# Pad sequences to ensure uniform input size
data_pad = pad_sequences(sequences, maxlen=50)

Found 13383 unique tokens.
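
With its default settings, `pad_sequences` pads short sequences with zeros at the front and truncates long ones from the front, keeping the last `maxlen` tokens. The pure-Python function below imitates that default behavior for a single sequence. It is an illustrative re-implementation under that assumption, not the Keras code itself, and `pad_pre` is a name made up here.

```python
def pad_pre(seq: list[int], maxlen: int, value: int = 0) -> list[int]:
    """Left-pad with `value`, or keep only the last `maxlen` items."""
    trimmed = seq[-maxlen:]  # drop tokens from the front if too long
    return [value] * (maxlen - len(trimmed)) + trimmed

print(pad_pre([7, 12, 3], maxlen=5))             # [0, 0, 7, 12, 3]
print(pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5))  # [3, 4, 5, 6, 7]
```

With `maxlen=50`, any review longer than 50 tokens loses its beginning, and shorter reviews are filled with the padding index 0.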


In [17]:
# input_dim must match the padded sequence length (maxlen=50 above)
model = Sequential([
    Dense(2000, input_dim=50, activation='relu'),
    Dense(1000, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

# Split data into training and testing sets (labels taken from the reloaded data)
y = data['Sentiment']
X_train, X_test, y_train, y_test = train_test_split(data_pad, y, test_size=0.25, random_state=42)

# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=2, batch_size=8)

Epoch 1/2
Epoch 2/2


<keras.src.callbacks.History at 0x2a09cfed0>

In [18]:
from tensorflow.keras.layers import SimpleRNN

# Define RNN model: 50 time steps, each carrying one feature (the raw token index)
model = Sequential([
    SimpleRNN(2000, input_shape=(50, 1), activation='relu'),
    Dense(1000, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=2, batch_size=8)

Epoch 1/2
Epoch 2/2


<keras.src.callbacks.History at 0x28ae20d90>
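
`SimpleRNN` with `input_shape=(50, 1)` describes three-dimensional input of shape `(samples, time steps, features)`, while `pad_sequences` returns a two-dimensional array. Below is a minimal sketch of making that trailing feature axis explicit with NumPy, using a small made-up array in place of `X_train`:

```python
import numpy as np

# Stand-in for X_train: 4 padded reviews of length 50 (illustrative values)
X_demo = np.zeros((4, 50), dtype="int32")

# Add a trailing axis so each time step carries exactly one feature
X_demo_3d = X_demo[..., np.newaxis]

print(X_demo.shape)     # (4, 50)
print(X_demo_3d.shape)  # (4, 50, 1)
```

With this layout the single feature at each step is the raw token index, so the recurrent layer sees index values, not learned word vectors.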

In [20]:
from tensorflow.keras.layers import LSTM

# Define LSTM model: same input layout as the RNN (50 time steps, one feature each)
model = Sequential([
    LSTM(2000, input_shape=(50, 1), activation='relu'),
    Dense(1000, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=2, batch_size=8)

Epoch 1/2
Epoch 2/2


<keras.src.callbacks.History at 0x2a1b01390>

# Text Cell 11 - Comparison of Deep Learning Models (ANN vs. RNN vs. LSTM)
### Artificial Neural Network (ANN):
The ANN performed reasonably well on TF-IDF features, reaching approximately 71% accuracy on both the training and validation sets. When trained on the padded token index sequences instead, it became numerically unstable, and both loss and accuracy turned into NaN.

### Recurrent Neural Network (RNN):
The RNN performed comparably to the ANN, with an accuracy of around 72% on both the training and validation sets. RNNs are well suited to sequential data such as text because they process tokens in order, but a one-point difference after two epochs on a single split is too small to count as a demonstrated improvement over the ANN.

### Long Short-Term Memory (LSTM):
The LSTM performed similarly to the ANN and RNN, reaching around 72% accuracy on both the training and validation sets. LSTMs are a specialized type of RNN designed to capture long-range dependencies in sequential data, which helps with text that has extensive context. That all three architectures land at nearly the same accuracy may mean the task does not depend heavily on long-range dependencies, but it is also consistent with the models not having learned much beyond the label distribution in two epochs of training, so the comparison should be read with caution.
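
Accuracies in the low 70s are easier to interpret next to a majority-class baseline, that is, the accuracy obtained by always predicting the most common label. Below is a short sketch of that check. `majority_baseline` is a helper introduced here and the toy labels are invented; in the notebook one would pass `y_test` instead.

```python
from collections import Counter

def majority_baseline(labels) -> float:
    """Accuracy of always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)

# Invented labels for illustration: 3 positive, 1 negative
toy_labels = [1, 1, 1, 0]
print(majority_baseline(toy_labels))  # 0.75
```

If the baseline on `y_test` turns out to be close to the reported accuracies, the models have not yet learned more than the label distribution.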

# Acknowledgement
I used ChatGPT for help with syntax and understanding the code, and worked alongside a friend.