#### 1. Sentiment analysis
#### Build a model that can analyze the sentiment of text data, such as customer reviews or social media posts. Use techniques like bag-of-words, word embeddings, or transformers to classify text as positive, negative, or neutral sentiment.

##### Step-by-step guide for building an end-to-end sentiment analysis model:

##### Data Collection
##### Data Preprocessing
##### Feature Extraction
##### Model Training
##### Model Evaluation
##### Predictions on New Data
##### This example uses a dataset of movie reviews and libraries such as nltk, sklearn, and transformers.

#### Data Collection
##### Use the nltk library to download the movie reviews dataset.



In [1]:
import nltk
nltk.download('movie_reviews')
from nltk.corpus import movie_reviews
import pandas as pd

# Load the dataset
def load_movie_reviews():
    reviews = []
    for fileid in movie_reviews.fileids():
        category = movie_reviews.categories(fileid)[0]
        review = movie_reviews.raw(fileid)
        reviews.append((review, category))
    return pd.DataFrame(reviews, columns=['review', 'sentiment'])

df = load_movie_reviews()
df.head()


[nltk_data] Downloading package movie_reviews to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping corpora\movie_reviews.zip.


Unnamed: 0,review,sentiment
0,"plot : two teen couples go to a church party ,...",neg
1,the happy bastard's quick movie review \ndamn ...,neg
2,it is movies like these that make a jaded movi...,neg
3,""" quest for camelot "" is warner bros . ' firs...",neg
4,synopsis : a mentally unstable man undergoing ...,neg


#### Data Preprocessing
##### Next, we'll preprocess the text data. This step includes tokenization, removing stopwords, and converting text to lowercase.

In [2]:
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
import string

nltk.download('punkt')
nltk.download('stopwords')

def preprocess_text(text):
    # Tokenize the text
    tokens = word_tokenize(text)
    # Convert to lower case
    tokens = [word.lower() for word in tokens]
    # Remove punctuation and stopwords
    table = str.maketrans('', '', string.punctuation)
    tokens = [word.translate(table) for word in tokens]
    stop_words = set(stopwords.words('english'))
    tokens = [word for word in tokens if word.isalpha() and word not in stop_words]
    return ' '.join(tokens)

df['cleaned_review'] = df['review'].apply(preprocess_text)
df.head()


[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Unnamed: 0,review,sentiment,cleaned_review
0,"plot : two teen couples go to a church party ,...",neg,plot two teen couples go church party drink dr...
1,the happy bastard's quick movie review \ndamn ...,neg,happy bastard quick movie review damn bug got ...
2,it is movies like these that make a jaded movi...,neg,movies like make jaded movie viewer thankful i...
3,""" quest for camelot "" is warner bros . ' firs...",neg,quest camelot warner bros first featurelength ...
4,synopsis : a mentally unstable man undergoing ...,neg,synopsis mentally unstable man undergoing psyc...
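One side effect of stopword removal is worth seeing on a small input: NLTK's English stopword list includes negation words such as "not", so they are dropped along with the filler words. The sketch below uses only the standard library and a tiny, hand-picked stopword set (`STOP_WORDS` and `NEGATIONS` are illustrative assumptions, not the NLTK list) to show the effect, together with one possible variant that keeps negations. The `keep_negations` flag is hypothetical and is not part of the notebook's `preprocess_text`.

```python
import string

# Illustrative subset only; the notebook uses NLTK's full English stopword list.
STOP_WORDS = {"it", "was", "an", "a", "the", "but", "not", "too"}
# Hypothetical set of negation words we may want to keep.
NEGATIONS = {"not", "no", "never"}

def clean(text, keep_negations=False):
    """Lowercase, strip punctuation, and drop stopwords (optionally keeping negations)."""
    stop_words = STOP_WORDS - NEGATIONS if keep_negations else STOP_WORDS
    table = str.maketrans('', '', string.punctuation)
    tokens = [word.translate(table) for word in text.lower().split()]
    return ' '.join(w for w in tokens if w.isalpha() and w not in stop_words)

review = "It was not too bad, but not great either."
without_negations = clean(review)
with_negations = clean(review, keep_negations=True)
print(without_negations)
print(with_negations)
```

With negations removed, "not too bad" and "bad" reduce to the same tokens, which can matter for short or mixed reviews.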


##### Feature Extraction
##### We'll use TF-IDF (Term Frequency-Inverse Document Frequency) for feature extraction.

##### TF-IDF measures how relevant a word is to a document within a corpus. A word's weight increases with the number of times it appears in the document but is offset by how frequently the word occurs across the corpus (dataset).
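To make the weighting concrete, here is a minimal pure-Python sketch of TF-IDF on three toy documents. It assumes the smoothed IDF variant `ln((1 + n) / (1 + df)) + 1` followed by L2 normalisation, which is what scikit-learn documents as the `TfidfVectorizer` default; the function name `tfidf` and the toy documents are made up for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document (smoothed IDF, L2-normalised)."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        raw = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Scale each document vector to unit length (guard against empty documents).
        norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
        weights.append({term: w / norm for term, w in raw.items()})
    return weights

docs = ["great movie great cast", "terrible movie", "boring movie weak plot"]
scores = tfidf(docs)
for doc, score in zip(docs, scores):
    print(doc, {term: round(w, 3) for term, w in score.items()})
```

In the second document, "movie" occurs in every document and therefore receives a lower weight than "terrible", which occurs in only one.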

In [3]:
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer(max_features=5000)
X = vectorizer.fit_transform(df['cleaned_review']).toarray()
y = df['sentiment'].apply(lambda x: 1 if x == 'pos' else 0).values

print(X.shape, y.shape)


(2000, 5000) (2000,)


#### Model Training
##### Train a simple logistic regression model.

In [4]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print('Classification Report:')
print(classification_report(y_test, y_pred))


Accuracy: 0.83
Classification Report:
              precision    recall  f1-score   support

           0       0.83      0.82      0.83       199
           1       0.83      0.84      0.83       201

    accuracy                           0.83       400
   macro avg       0.83      0.83      0.83       400
weighted avg       0.83      0.83      0.83       400
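The figures in a classification report follow directly from counts of true positives, false positives, and false negatives. The sketch below recomputes them by hand on a small set of made-up labels; `binary_report` is a hypothetical helper written for illustration, not a scikit-learn function.

```python
def binary_report(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one class from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
report = binary_report(y_true, y_pred)
print(report)
```

Precision answers "of the reviews predicted positive, how many were positive?", while recall answers "of the positive reviews, how many were found?".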



##### Model Testing 
#### Predictions on New Data
##### Test the model on some new data.

In [5]:
def predict_sentiment(review):
    cleaned_review = preprocess_text(review)
    vectorized_review = vectorizer.transform([cleaned_review]).toarray()
    prediction = model.predict(vectorized_review)[0]
    return 'positive' if prediction == 1 else 'negative'

# Example reviews
reviews = [
    "I loved this movie, it was fantastic!",
    "This was a terrible movie, I hated it.",
    "It was an average movie, not too bad but not great either."
]

for review in reviews:
    print(f'Review: {review}')
    print(f'Sentiment: {predict_sentiment(review)}\n')


Review: I loved this movie, it was fantastic!
Sentiment: positive

Review: This was a terrible movie, I hated it.
Sentiment: negative

Review: It was an average movie, not too bad but not great either.
Sentiment: negative
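The classifier here is binary, so a lukewarm review like the third example can only come out as positive or negative. One possible way to surface a neutral label is to look at the predicted probability instead of the hard class and treat values close to 0.5 as neutral. The sketch below assumes the probability of the positive class is already available (for a scikit-learn logistic regression it could come from `model.predict_proba(vectorized_review)[0][1]`); the helper name `label_with_neutral` and the `margin` value are hypothetical and untuned.

```python
def label_with_neutral(prob_positive, margin=0.15):
    """Map a positive-class probability to 'positive', 'negative', or 'neutral'.

    Probabilities within `margin` of 0.5 are treated as neutral.
    """
    if not 0.0 <= prob_positive <= 1.0:
        raise ValueError("prob_positive must be between 0 and 1")
    if prob_positive >= 0.5 + margin:
        return 'positive'
    if prob_positive <= 0.5 - margin:
        return 'negative'
    return 'neutral'

# Made-up probabilities standing in for model output.
for prob in (0.9, 0.55, 0.1):
    print(prob, label_with_neutral(prob))
```

A band around 0.5 is only a heuristic; training on data that carries an explicit neutral label would be the more principled route.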



#### The Complete Code 

In [6]:
# 1. Data Collection
import nltk
nltk.download('movie_reviews')
from nltk.corpus import movie_reviews
import pandas as pd

def load_movie_reviews():
    reviews = []
    for fileid in movie_reviews.fileids():
        category = movie_reviews.categories(fileid)[0]
        review = movie_reviews.raw(fileid)
        reviews.append((review, category))
    return pd.DataFrame(reviews, columns=['review', 'sentiment'])

df = load_movie_reviews()

# 2. Data Preprocessing
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
import string

nltk.download('punkt')
nltk.download('stopwords')

def preprocess_text(text):
    tokens = word_tokenize(text)
    tokens = [word.lower() for word in tokens]
    table = str.maketrans('', '', string.punctuation)
    tokens = [word.translate(table) for word in tokens]
    stop_words = set(stopwords.words('english'))
    tokens = [word for word in tokens if word.isalpha() and word not in stop_words]
    return ' '.join(tokens)

df['cleaned_review'] = df['review'].apply(preprocess_text)

# 3. Feature Extraction
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer(max_features=5000)
X = vectorizer.fit_transform(df['cleaned_review']).toarray()
y = df['sentiment'].apply(lambda x: 1 if x == 'pos' else 0).values

# 4. Model Training
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print('Classification Report:')
print(classification_report(y_test, y_pred))

# 5. Predictions on New Data
def predict_sentiment(review):
    cleaned_review = preprocess_text(review)
    vectorized_review = vectorizer.transform([cleaned_review]).toarray()
    prediction = model.predict(vectorized_review)[0]
    return 'positive' if prediction == 1 else 'negative'

reviews = [
    "I loved this movie, it was fantastic!",
    "This was a terrible movie, I hated it.",
    "It was an average movie, not too bad but not great either."
]

for review in reviews:
    print(f'Review: {review}')
    print(f'Sentiment: {predict_sentiment(review)}\n')


[nltk_data] Downloading package movie_reviews to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package movie_reviews is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Accuracy: 0.83
Classification Report:
              precision    recall  f1-score   support

           0       0.83      0.82      0.83       199
           1       0.83      0.84      0.83       201

    accuracy                           0.83       400
   macro avg       0.83      0.83      0.83       400
weighted avg       0.83      0.83      0.83       400

Review: I loved this movie, it was fantastic!
Sentiment: positive

Review: This was a terrible movie, I hated it.
Sentiment: negative

Review: It was an average movie, not too bad but not great either.
Sentiment: negative



#### Enhancing the sentiment analysis project by incorporating techniques like Bag-of-Words (BoW), Word Embeddings, and Transformers.

##### 1. Bag-of-Words (BoW) Approach
##### Using the BoW approach with CountVectorizer from sklearn.

##### Bag of words is a text vectorization technique that converts text into fixed-length vectors. 
##### The BoW model is easy to implement and understand. It has a few drawbacks, such as ignoring word order, which more advanced techniques can overcome.
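As a rough illustration of what a bag-of-words vector looks like, the sketch below builds one by hand with the standard library. It is a simplified stand-in for what `CountVectorizer` does (whitespace tokenization only, no `max_features` cap); the function name `bag_of_words` and the toy sentences are made up.

```python
from collections import Counter

def bag_of_words(docs):
    """Return the sorted vocabulary and one count vector per document."""
    vocabulary = sorted({word for doc in docs for word in doc.split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        # One slot per vocabulary word, so every vector has the same length.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

docs = ["dog bites man", "man bites dog", "good good movie"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
for doc, vector in zip(docs, vectors):
    print(doc, vector)
```

The first two sentences mean different things but produce identical vectors, because the representation keeps only word counts and discards word order.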

##### 1.1 Data Collection and Preprocessing

In [12]:
# Import necessary libraries
import nltk
nltk.download('movie_reviews')
from nltk.corpus import movie_reviews
import pandas as pd
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
import string

# Load the dataset
def load_movie_reviews():
    reviews = []
    for fileid in movie_reviews.fileids():
        category = movie_reviews.categories(fileid)[0]
        review = movie_reviews.raw(fileid)
        reviews.append((review, category))
    return pd.DataFrame(reviews, columns=['review', 'sentiment'])

df = load_movie_reviews()

# Download necessary NLTK data
nltk.download('punkt')
nltk.download('stopwords')

# Preprocess the text data
def preprocess_text(text):
    tokens = word_tokenize(text)
    tokens = [word.lower() for word in tokens]
    table = str.maketrans('', '', string.punctuation)
    tokens = [word.translate(table) for word in tokens]
    stop_words = set(stopwords.words('english'))
    tokens = [word for word in tokens if word.isalpha() and word not in stop_words]
    return ' '.join(tokens)

df['cleaned_review'] = df['review'].apply(preprocess_text)


[nltk_data] Downloading package movie_reviews to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package movie_reviews is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!



##### 1.2 Feature Extraction using BoW

In [14]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Extract features using CountVectorizer
vectorizer = CountVectorizer(max_features=5000)
X = vectorizer.fit_transform(df['cleaned_review']).toarray()
y = df['sentiment'].apply(lambda x: 1 if x == 'pos' else 0).values

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print('Classification Report:')
print(classification_report(y_test, y_pred))


Accuracy: 0.82
Classification Report:
              precision    recall  f1-score   support

           0       0.82      0.81      0.82       199
           1       0.82      0.83      0.82       201

    accuracy                           0.82       400
   macro avg       0.82      0.82      0.82       400
weighted avg       0.82      0.82      0.82       400



##### 1.3 Predictions on New Data

In [17]:
def predict_sentiment(review):
    cleaned_review = preprocess_text(review)
    vectorized_review = vectorizer.transform([cleaned_review]).toarray()
    prediction = model.predict(vectorized_review)[0]
    return 'positive' if prediction == 1 else 'negative'

# Example reviews
reviews = [
    "I loved this movie, it was fantastic!",
    "This was a terrible movie, I hated it.",
    "It was an average movie, not too bad but not great either."
]

for review in reviews:
    print(f'Review: {review}')
    print(f'Sentiment: {predict_sentiment(review)}\n')


Review: I loved this movie, it was fantastic!
Sentiment: negative

Review: This was a terrible movie, I hated it.
Sentiment: negative

Review: It was an average movie, not too bad but not great either.
Sentiment: negative



##### 2. Word Embeddings Approach
##### Using pre-trained GloVe embeddings to represent our text data.

##### 2.1 Load GloVe Embeddings
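GloVe text files store one word per line followed by its vector components, separated by spaces. The sketch below parses that layout from a small in-memory sample and then fills an embedding matrix, leaving a zero row for padding (index 0) and for any word missing from the embeddings. The sample vectors, the `word_index` mapping, and the helper name `parse_embeddings` are made up for illustration; real GloVe vectors have 50 to 300 dimensions.

```python
import io

# Made-up 3-dimensional vectors in the same "word v1 v2 ..." layout as a GloVe file.
SAMPLE = "good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n"

def parse_embeddings(handle):
    """Read 'word v1 v2 ...' lines into a {word: [floats]} dictionary."""
    index = {}
    for line in handle:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        index[parts[0]] = [float(x) for x in parts[1:]]
    return index

embeddings = parse_embeddings(io.StringIO(SAMPLE))
dim = len(next(iter(embeddings.values())))

# Hypothetical tokenizer vocabulary; index 0 is reserved for padding.
word_index = {"good": 1, "bad": 2, "meh": 3}

matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
for word, i in word_index.items():
    vector = embeddings.get(word)
    if vector is not None:
        matrix[i] = vector  # words without an embedding keep the zero row

for row in matrix:
    print(row)
```

Words that are absent from the embedding file all share the same zero vector, so the model cannot tell them apart.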

In [22]:
import numpy as np
import os

# Load GloVe embeddings
def load_glove_embeddings(file_path):
    embeddings_index = {}
    with open(file_path, 'r', encoding='utf-8') as f:
        for line in f:
            values = line.split()
            word = values[0]
            coefs = np.asarray(values[1:], dtype='float32')
            embeddings_index[word] = coefs
    return embeddings_index

glove_file = r'C:\Users\hp\Documents\internships\Ignite\glove.6B.100d.txt'  # Path to the downloaded GloVe file
embeddings_index = load_glove_embeddings(glove_file)


##### 2.2 Create Embedding Matrix and Pad Sequences
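Padding turns token sequences of different lengths into rows of equal length so they can be stacked into one array. The sketch below mimics, in plain Python, what the Keras `pad_sequences` defaults do as far as I understand them: zeros are added at the front (`padding='pre'`) and overly long sequences lose tokens from the front (`truncating='pre'`). The helper name `pad_pre` is made up for illustration.

```python
def pad_pre(sequence, maxlen, value=0):
    """Left-pad (or left-truncate) a list of token ids to exactly `maxlen` items."""
    if maxlen <= 0:
        raise ValueError("maxlen must be positive")
    trimmed = sequence[-maxlen:]  # keep the last `maxlen` tokens
    return [value] * (maxlen - len(trimmed)) + trimmed

short = pad_pre([1, 2, 3], maxlen=5)
long = pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short)
print(long)
```

Because index 0 is used as the padding value, real words should start at index 1, which is why the embedding matrix is sized `len(word_index) + 1`.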

In [27]:
pip install tensorflow

Note: you may need to restart the kernel to use updated packages.


In [29]:
import numpy as np
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

# Toy DataFrame for demonstration; note that it overwrites the movie-review df loaded earlier
df = pd.DataFrame({
    'cleaned_review': ["This is a great movie", "I did not like the film", "An excellent watch"],
    'sentiment': ["pos", "neg", "pos"]
})

# Tokenize the data
tokenizer = Tokenizer()
tokenizer.fit_on_texts(df['cleaned_review'])
sequences = tokenizer.texts_to_sequences(df['cleaned_review'])
word_index = tokenizer.word_index

# Pad sequences
max_length = 100
X = pad_sequences(sequences, maxlen=max_length)

# Load GloVe embeddings
def load_glove_embeddings(glove_file):
    embeddings_index = {}
    with open(glove_file, encoding="utf-8") as f:
        for line in f:
            values = line.split()
            word = values[0]
            coefs = np.asarray(values[1:], dtype='float32')
            embeddings_index[word] = coefs
    return embeddings_index

glove_file = r'C:\Users\hp\Documents\internships\Ignite\glove.6B.100d.txt'
embeddings_index = load_glove_embeddings(glove_file)

# Create the embedding matrix
embedding_dim = 100
embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector

# Prepare the labels
y = df['sentiment'].apply(lambda x: 1 if x == 'pos' else 0).values

# Define the model
model = Sequential()
model.add(Embedding(input_dim=len(word_index) + 1, 
                    output_dim=embedding_dim, 
                    weights=[embedding_matrix], 
                    input_length=max_length, 
                    trainable=False))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))

# Compile the model
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Print model summary
model.summary()




In [32]:

# Assuming X and y are your features and labels
model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)


Epoch 1/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step - accuracy: 0.5000 - loss: 0.6971 - val_accuracy: 0.0000e+00 - val_loss: 0.7149
Epoch 2/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 546ms/step - accuracy: 0.5000 - loss: 0.6951 - val_accuracy: 0.0000e+00 - val_loss: 0.7285
Epoch 3/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 119ms/step - accuracy: 1.0000 - loss: 0.6237 - val_accuracy: 0.0000e+00 - val_loss: 0.7278
Epoch 4/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 107ms/step - accuracy: 1.0000 - loss: 0.5405 - val_accuracy: 0.0000e+00 - val_loss: 0.7270
Epoch 5/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 115ms/step - accuracy: 1.0000 - loss: 0.5217 - val_accuracy: 0.0000e+00 - val_loss: 0.7224
Epoch 6/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 120ms/step - accuracy: 1.0000 - loss: 0.4998 - val_accuracy: 0.0000e+00 - val_loss: 0.7121
Epoch 7/10
1/1

<keras.src.callbacks.history.History at 0x1839dd81c90>

##### 2.3 Build and Train the LSTM Model

In [30]:
import numpy as np
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

# Sample DataFrame for demonstration
df = pd.DataFrame({
    'cleaned_review': ["This is a great movie", "I did not like the film", "An excellent watch"],
    'sentiment': ["pos", "neg", "pos"]
})

# Tokenize the data
tokenizer = Tokenizer()
tokenizer.fit_on_texts(df['cleaned_review'])
sequences = tokenizer.texts_to_sequences(df['cleaned_review'])
word_index = tokenizer.word_index

# Print some values from tokenizer
print("Word Index:\n", list(word_index.items())[:10])  # Print first 10 word indices

# Pad sequences
max_length = 100
X = pad_sequences(sequences, maxlen=max_length)

# Print shape and some values of X
print("\nShape of X:", X.shape)
print("Sample values from X:\n", X[:5])

# Load GloVe embeddings
def load_glove_embeddings(glove_file):
    embeddings_index = {}
    with open(glove_file, encoding="utf-8") as f:
        for line in f:
            values = line.split()
            word = values[0]
            coefs = np.asarray(values[1:], dtype='float32')
            embeddings_index[word] = coefs
    return embeddings_index

glove_file = r'C:\Users\hp\Documents\internships\Ignite\glove.6B.100d.txt'
embeddings_index = load_glove_embeddings(glove_file)

# Print some values from embeddings_index
print("\nSample GloVe embeddings:\n", {k: embeddings_index[k] for k in list(embeddings_index)[:5]})

# Create the embedding matrix
embedding_dim = 100
embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector

# Print shape and some values of embedding matrix
print("\nShape of Embedding Matrix:", embedding_matrix.shape)
print("Sample values from Embedding Matrix:\n", embedding_matrix[:5])

# Prepare the labels
y = df['sentiment'].apply(lambda x: 1 if x == 'pos' else 0).values

# Print shape and some values of y
print("\nShape of y:", y.shape)
print("Sample values from y:\n", y[:5])

# Define the model
model = Sequential()
model.add(Embedding(input_dim=len(word_index) + 1, 
                    output_dim=embedding_dim, 
                    weights=[embedding_matrix], 
                    input_length=max_length, 
                    trainable=False))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))

# Compile the model
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Print model summary
model.summary()

# Assuming X and y are your features and labels
# model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)


Word Index:
 [('this', 1), ('is', 2), ('a', 3), ('great', 4), ('movie', 5), ('i', 6), ('did', 7), ('not', 8), ('like', 9), ('the', 10)]

Shape of X: (3, 100)
Sample values from X:
 [[ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1
   2  3  4  5]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  6  7
   8  9 10 11]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

##### 2.4 Predictions on New Data

In [35]:
def predict_sentiment(review):
    cleaned_review = preprocess_text(review)
    sequence = tokenizer.texts_to_sequences([cleaned_review])
    padded_sequence = pad_sequences(sequence, maxlen=max_length)
    prediction = model.predict(padded_sequence)[0][0]
    return 'positive' if prediction > 0.5 else 'negative'

# Example reviews
reviews = [
    "I loved this movie, it was fantastic!",
    "This was a terrible movie, I hated it.",
    "It was an average movie, not too bad but not great either."
]

for review in reviews:
    print(f'Review: {review}')
    print(f'Sentiment: {predict_sentiment(review)}\n')


Review: I loved this movie, it was fantastic!
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 440ms/step
Sentiment: positive

Review: This was a terrible movie, I hated it.
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step
Sentiment: positive

Review: It was an average movie, not too bad but not great either.
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 44ms/step
Sentiment: positive

