In [2]:
import json, os
import pandas as pd
import numpy as np
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Merged Dataset

In [3]:
df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/project/3rd project_recommender_system/data_preparation/final.csv')
df.shape

(69269, 13)

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 69269 entries, 0 to 69268
Data columns (total 13 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   business_id    69269 non-null  object 
 1   name           69269 non-null  object 
 2   state          69269 non-null  object 
 3   stars_x        69269 non-null  float64
 4   is_open        69269 non-null  int64  
 5   attributes     69269 non-null  object 
 6   categories     69269 non-null  object 
 7   user_id        69269 non-null  object 
 8   useful         69269 non-null  int64  
 9   funny          69269 non-null  int64  
 10  cool           69269 non-null  int64  
 11  text           69269 non-null  object 
 12  review_tokens  69269 non-null  object 
dtypes: float64(1), int64(4), object(8)
memory usage: 6.9+ MB


In [5]:
df = df.drop(index=df[df['is_open'] == 0].index)
df.shape

(51609, 13)

# Model Selection

This section covers model selection for the classical (non-deep-learning) approaches.

**Collaborative Filtering** recommends items based on similar users' preferences.<br>
**Content-based Filtering** recommends items based on the attributes of the items.<br>
**Hybrid Filtering** combines both Collaborative and Content-based filtering to improve the performance.

- I need to first create the necessary data structures to store the required information.

1. **Collaborative Filtering**: 
I will use the Surprise library to implement collaborative filtering. Surprise is a Python library for building and analyzing recommender systems that deal with explicit rating data.

2. **Content-based Filtering**:
I will use the TF-IDF matrix that I created earlier. I can compute the pairwise cosine similarity between all the items using the cosine_similarity function from scikit-learn.

3. **Hybrid Filtering**:
Once I have the necessary data structures, I can use them to build a hybrid recommendation system. The basic idea behind hybrid filtering is to combine the predictions from collaborative and content-based filtering models.

In [6]:
!pip install scikit-surprise

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [7]:
# Collaborative Filtering
import surprise
from surprise import Reader, Dataset, SVD
from surprise.model_selection import cross_validate

# Define a reader to read in the rating data
reader = Reader(rating_scale=(1, 5))

# Load the rating data into a Surprise dataset
data = Dataset.load_from_df(df[['user_id', 'business_id', 'stars_x']], reader)

# Build the training set from all ratings (no hold-out split is made here)
trainset = data.build_full_trainset()

# Define a collaborative filtering model using SVD
model_collab = SVD()

# Train the model on the rating data
model_collab.fit(trainset)

<surprise.prediction_algorithms.matrix_factorization.SVD at 0x7f6b1c31b190>

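SVD can be used to perform Latent Semantic Analysis (LSA) in NLP: it transforms a document-term matrix into a document-concept matrix, where the concepts are derived from the singular values and vectors of the document-term matrix. This is useful for text classification, information retrieval, and topic modeling. Note that the SVD class in Surprise applies the idea differently: it is a matrix-factorization model that learns latent user and item factors from the rating data, not from text.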
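To make the LSA idea concrete, here is a minimal NumPy sketch on a made-up 4x4 document-term matrix. The terms, counts, and the choice of k = 2 are assumptions for illustration only and are not taken from the review data.

```python
import numpy as np

# Toy document-term count matrix (rows: documents, columns: terms); values are illustrative
terms = ["pizza", "pasta", "sushi", "ramen"]
dtm = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
], dtype=float)

# Thin SVD: dtm = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(dtm, full_matrices=False)

# Keep the k largest singular values to get a document-concept matrix
k = 2
doc_concept = U[:, :k] * s[:k]  # shape (n_docs, k)

# Documents 0 and 1 share vocabulary, so they land on the same concept vector;
# documents 0 and 2 share no terms, so their concept vectors are orthogonal
print(doc_concept.shape)
print(doc_concept[0] @ doc_concept[1], doc_concept[0] @ doc_concept[2])
```

On real review text the same factorization would be applied to the (sparse) TF-IDF matrix, usually with a truncated solver instead of a full SVD.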




In [7]:
from sklearn.feature_extraction.text import TfidfVectorizer

# Join the preprocessed text into a single string
# df['review_text'] = df['review_tokens'].apply(lambda x: ' '.join(x))

# Define the vectorizer
tfidf = TfidfVectorizer()
# ngram_range=(1,2), max_df=0.75, min_df=5, max_features=5000

# Fit and transform the reviews
tfidf_matrix = tfidf.fit_transform(df['text'])

# Content-based Filtering
from sklearn.metrics.pairwise import cosine_similarity

# tfidf_matrix is already a SciPy sparse matrix, so no conversion is needed.
# Compute the pairwise cosine similarity between all rows (one row per review).
# The result is a dense n_rows x n_rows array, so memory use grows quadratically.
cosine_sim = cosine_similarity(tfidf_matrix)

Here, I will use a simple linear combination of the two models' predictions, where the weight given to each model is determined by a parameter alpha.
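The linear combination can be sketched as follows. This is only an illustration under stated assumptions: blend_scores is a hypothetical helper, alpha is taken to weight the collaborative term, and cosine similarities in [0, 1] are rescaled onto the 1 to 5 rating scale so that the two terms are comparable.

```python
import numpy as np

def blend_scores(collab_pred, content_sim, alpha=0.5, rating_scale=(1.0, 5.0)):
    """Blend predicted ratings with content similarities (hypothetical helper)."""
    low, high = rating_scale
    # Map similarities from [0, 1] onto the rating scale (an assumption of this sketch)
    content_pred = low + (high - low) * np.asarray(content_sim, dtype=float)
    # alpha weights the collaborative term, (1 - alpha) the content-based term
    return alpha * np.asarray(collab_pred, dtype=float) + (1 - alpha) * content_pred

collab = [4.0, 3.0, 5.0]  # made-up collaborative predictions
sims = [0.5, 1.0, 0.0]    # made-up cosine similarities
blended = blend_scores(collab, sims, alpha=0.5)
print(blended)
```

With alpha = 1 the result reduces to the collaborative prediction alone, and with alpha = 0 to the rescaled content similarity.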

In [30]:
import numpy as np
# Hybrid Filtering
def hybrid_recommendations(user_id, business_id, alpha=0.5):
    """
    Get hybrid recommendations for a user and a business.
    
    Parameters:
        - user_id: ID of the user for whom to make recommendations
        - business_id: ID of the reference business for the content-based term
          (unused while that term is disabled)
        - alpha: Weight given to the collaborative filtering model (default: 0.5)
    
    Returns:
        - A list of (business name, predicted rating) tuples for the top 5 businesses
    """
    
    # Keep one row per business so that every candidate is scored exactly once
    businesses = df.drop_duplicates(subset='business_id')
    
    # Compute the collaborative filtering prediction for every candidate business
    collab_preds = np.array([
        model_collab.predict(user_id, candidate_id).est
        for candidate_id in businesses['business_id']
    ])
    
    # Compute the content-based filtering predictions
    # (disabled: cosine_sim is too large to hold in RAM)
    # business_idx = np.where(df['business_id'] == business_id)[0][0]
    # content_preds = cosine_sim[business_idx]
    
    # Combine the predictions from both models
    hybrid_preds = collab_preds
    # hybrid_preds = alpha * collab_preds + (1 - alpha) * content_preds
    
    # Get the top 5 recommended business names and their predicted ratings
    top_positions = np.argsort(hybrid_preds)[::-1][:5]
    top_business_names = businesses.iloc[top_positions]['name'].tolist()
    top_business_ratings = hybrid_preds[top_positions].tolist()
    
    return list(zip(top_business_names, top_business_ratings))

In [None]:
# Set the user ID and business ID
user_id = '0q2W3-ieBUJWD5TTLKi3Ug'
business_id = 'MTSW4McQd7CbVtyjqoe9mw'

# Call the hybrid_recommendations() function with alpha=0.5
recommendations = hybrid_recommendations(user_id, business_id)

# Print the recommendations
print(recommendations)

### Challenge

I want to make hybrid recommendations. However, computing the full cosine similarity matrix with cosine_similarity exhausts the available RAM and crashes the session. With 51,609 rows, the dense result has 51,609 x 51,609 entries, which is roughly 21 GB as 64-bit floats.

So the model used here is collaborative filtering with SVD only, without content-based filtering.
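One possible way around the memory limit, not used in this notebook, is to keep only the top-k neighbours of each row and compute the similarities in row chunks, so that the full matrix is never held in memory. The sketch below assumes SciPy is available. top_k_similar, k, and chunk_size are hypothetical names and values.

```python
import numpy as np
from scipy import sparse

def top_k_similar(matrix, k=5, chunk_size=1000):
    """Return, for each row, the indices and cosine similarities of its k most similar rows."""
    matrix = sparse.csr_matrix(matrix, dtype=np.float64)
    # L2-normalize the rows so that dot products equal cosine similarities
    norms = np.sqrt(np.asarray(matrix.multiply(matrix).sum(axis=1)).ravel())
    norms[norms == 0] = 1.0  # leave all-zero rows unchanged
    normalized = sparse.csr_matrix(sparse.diags(1.0 / norms) @ matrix)

    n_rows = normalized.shape[0]
    top_idx = np.empty((n_rows, k), dtype=np.int64)
    top_sim = np.empty((n_rows, k), dtype=np.float64)
    for start in range(0, n_rows, chunk_size):
        stop = min(start + chunk_size, n_rows)
        # Only a (chunk rows x n_rows) block is dense at any time
        block = (normalized[start:stop] @ normalized.T).toarray()
        # Exclude each row's similarity with itself
        block[np.arange(stop - start), np.arange(start, stop)] = -np.inf
        order = np.argsort(-block, axis=1)[:, :k]
        top_idx[start:stop] = order
        top_sim[start:stop] = np.take_along_axis(block, order, axis=1)
    return top_idx, top_sim

# Tiny made-up example: rows 0 and 1 are parallel, as are rows 2 and 3
toy = np.array([[1, 0], [2, 0], [0, 1], [0, 3]], dtype=float)
idx, sim = top_k_similar(toy, k=1, chunk_size=2)
print(idx.ravel(), sim.ravel())
```

With chunk_size=1000 and 51,609 rows, each dense block holds about 51.6 million floats (roughly 0.4 GB), and the stored result is only n_rows x k.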

# Others

### Converting reviews into numerical representations

Count-Based Representation

> TF-IDF Matrix



In [None]:
# from sklearn.feature_extraction.text import TfidfVectorizer

# # Join the preprocessed text into a single string
# df['review_text'] = df['review_tokens'].apply(lambda x: ' '.join(x))

# # Define the vectorizer
# tfidf = TfidfVectorizer()
# # ngram_range=(1,2), max_df=0.75, min_df=5, max_features=5000
# # Fit and transform the reviews
# tfidf_matrix = tfidf.fit_transform(df['review_text'])

# # # Create DTM
# # dtm_tfidf = pd.DataFrame(tfidf_matrix.todense(), columns=tfidf.get_feature_names_out())

# # # Print the shape of the TF-IDF matrix
# # print(tfidf_matrix.shape)
# # display(dtm_tfidf)

I've added some additional parameters to the TfidfVectorizer:

- ngram_range=(1,2): The vectorizer builds features from both unigrams and bigrams.
- max_df=0.75: Terms that appear in more than 75% of the documents are excluded from the vocabulary.
- min_df=5: Terms that appear in fewer than 5 documents are excluded from the vocabulary.
- max_features=5000: The vocabulary keeps at most 5000 features (i.e., the 5000 terms with the highest frequency across the corpus).
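The effect of these parameters can be checked on a toy corpus. The four documents below are made up, and min_df is lowered to 2 (an assumption for this sketch) because a 4-document corpus cannot satisfy min_df=5.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up mini corpus, one string per "review"
docs = [
    "great pizza great service",
    "great pasta slow service",
    "great sushi fresh fish",
    "great ramen fresh fish",
]

# min_df is scaled down to 2 for this 4-document toy corpus
vectorizer = TfidfVectorizer(ngram_range=(1, 2), max_df=0.75, min_df=2, max_features=5000)
matrix = vectorizer.fit_transform(docs)

# "great" is in every document (more than 75%), so max_df removes it;
# terms seen in only one document are removed by min_df;
# the bigram "fresh fish" occurs in two documents and is kept
print(sorted(vectorizer.vocabulary_))
print(matrix.shape)
```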

The following cells show how to split the data into training and test sets, train a recommender system using TF-IDF, generate recommendations for each user in the test set, and calculate the MAP score.

In [None]:
# from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
# from nltk.tokenize import WordPunctTokenizer

# # df['review_text'] = df['review_tokens'].apply(clean_text)
# df['review_text'] = df['review_tokens'].apply(lambda x: ' '.join(x))

# # Split data into train and test sets
# train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

# vec = CountVectorizer(ngram_range=(1,2), max_df=0.75, min_df=5, max_features=5000)

# # Fit and transform the reviews
# vec_train = vec.fit_transform(train_df['review_text'])
# print(vec_train.shape)

In [None]:
# from sklearn.model_selection import train_test_split
# from sklearn.metrics.pairwise import cosine_similarity
# from sklearn.feature_extraction.text import TfidfVectorizer,CountVectorizer

# Join the preprocessed text into a single string
# df['review_text'] = df['review_tokens'].apply(clean_text)
# df['review_text'] = df['review_tokens'].apply(lambda x: ' '.join(x))

# # Split data into train and test sets
# train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

# # Define the vectorizer
# vec = CountVectorizer(ngram_range=(1,2), max_df=0.75, min_df=5, max_features=5000)

# # Fit and transform the reviews
# # vec_matrix_train = vec.fit_transform(train_df['review_text'])

# # Initialize cosine similarity matrix
# cosine_sim = cosine_similarity(vec_matrix_train, vec_matrix_train)

# # Define function to get top recommendations for each user
# def recommendations(user_id, cosine_sim, df, top_n=5):
#     # Get index of user_id in df
#     user_index = df[df['user_id'] == user_id].index[0]

#     # Get cosine similarity scores for user_index
#     sim_scores = list(enumerate(cosine_sim[user_index]))

#     # Sort the list of sim_scores in descending order
#     sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)

#     # Get indices of top_n similar users
#     top_similar_users = [i[0] for i in sim_scores[1:top_n+1]]

#     # Get the restaurant recommendations for the top_n similar users
#     recommended_restaurants = df.iloc[top_similar_users][['name', 'categories', 'stars_x']]

#     return recommended_restaurants

# user_id = 'fJ3iKa2YmdNMOOy4L_R9kQ'
# top_n = 5
# recommended_restaurants = recommendations(user_id, cosine_sim, df, top_n)
# recommended_restaurants

In [None]:
# from sklearn.metrics.pairwise import cosine_similarity
# from sklearn.model_selection import train_test_split

# # Split data into train and test sets
# train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

# # Create TF-IDF vectorizer
# tfidf = TfidfVectorizer(stop_words='english', ngram_range=(1,2), max_df=0.75, min_df=5, max_features=5000)

# # Fit and transform the training data
# tfidf_matrix_train = tfidf.fit_transform(train_df['text'])

# # Initialize cosine similarity matrix
# cosine_sim = cosine_similarity(tfidf_matrix_train, tfidf_matrix_train)

# # Get a list of restaurants reviewed by a given user
# def get_user_reviews(user_id, df):
#     return df[df['user_id'] == user_id]['business_id'].tolist()

# # Find similar restaurants for a given restaurant
# def get_similar_restaurants(restaurant_id, tfidf_matrix, df):
#     restaurant_idx = df[df['business_id'] == restaurant_id].index[0]
#     similarity_scores = cosine_similarity(tfidf_matrix[restaurant_idx], tfidf_matrix)
#     similar_restaurants = list(enumerate(similarity_scores[0]))
#     similar_restaurants = sorted(similar_restaurants, key=lambda x: x[1], reverse=True)
#     return similar_restaurants[1:]

# # Make recommendations for a given user
# def recommend_restaurants(user_id, tfidf_matrix, df, top_n=5):
#     user_reviews = get_user_reviews(user_id, df)
#     restaurant_scores = {}
#     for restaurant_id in user_reviews:
#         similar_restaurants = get_similar_restaurants(restaurant_id, tfidf_matrix, df)
#         for i, (idx, score) in enumerate(similar_restaurants):
#             if idx in restaurant_scores:
#                 restaurant_scores[idx] += score * (0.95 ** i)
#             else:
#                 restaurant_scores[idx] = score * (0.95 ** i)
#     restaurant_scores = sorted(restaurant_scores.items(), key=lambda x: x[1], reverse=True)
#     recommended_restaurants = [df.iloc[idx]['business_id'] for idx, score in restaurant_scores[:top_n]]
#     return recommended_restaurants

# user_id = 'fJ3iKa2YmdNMOOy4L_R9kQ'
# top_n = 5
# recommended_restaurants = recommend_restaurants(user_id, tfidf_matrix, df, top_n)
# recommended_restaurants

In [None]:
# # Define function to calculate Average Precision (AP) for a given user
# def apk(actual, predicted, k=10):
#     if len(predicted) > k:
#         predicted = predicted[:k]

#     score = 0.0
#     num_hits = 0.0

#     for i, p in enumerate(predicted):
#         if p in actual and p not in predicted[:i]:
#             num_hits += 1.0
#             score += num_hits / (i+1.0)

#     if not actual:
#         return 0.0

#     return score / min(len(actual), k)

# # Get unique user_ids in test_df
# users_test = test_df['user_id'].unique()

# # Initialize list to store AP scores for each user
# ap_scores = []

# # Generate recommendations and calculate AP for each user in test_df
# for user in users_test:
#     # Get recommendations for user
#     recommendations = get_recommendations(user, cosine_sim, train_df)

#     # Get actual restaurants rated by user in test_df
#     actual = test_df[test_df['user_id'] == user]['name'].tolist()

#     # Calculate AP for user
#     ap = apk(actual, recommendations['name'].tolist())

#     # Append AP score to ap_scores
#     ap_scores.append(ap)

# # Calculate MAP score
# map_score = sum(ap_scores) / len(ap_scores)

# print("MAP Score: ", map_score)
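
As a worked example of the metric, the sketch below computes AP@k and MAP on made-up IDs. It uses the same normalization as the commented cell above, dividing by min(len(actual), k). The function names and the data are illustrative only.

```python
def average_precision_at_k(actual, predicted, k=10):
    """AP@k: mean of precision values at the ranks where a relevant item appears."""
    if not actual:
        return 0.0
    relevant = set(actual)
    seen = set()
    hits = 0
    score = 0.0
    for rank, item in enumerate(predicted[:k], start=1):
        # Count each relevant item once, at the rank where it first appears
        if item in relevant and item not in seen:
            hits += 1
            score += hits / rank
        seen.add(item)
    return score / min(len(actual), k)

def mean_average_precision(actual_by_user, predicted_by_user, k=10):
    """MAP@k: average of AP@k over all users that have ground truth."""
    scores = [
        average_precision_at_k(actual, predicted_by_user.get(user, []), k)
        for user, actual in actual_by_user.items()
    ]
    return sum(scores) / len(scores)

# Made-up ground truth and recommendations for two users
actual_by_user = {"u1": ["a", "b", "c"], "u2": ["d"]}
predicted_by_user = {"u1": ["a", "x", "b", "y"], "u2": ["x", "d"]}

# u1: hits at ranks 1 and 3 -> (1/1 + 2/3) / 3; u2: hit at rank 2 -> (1/2) / 1
map_at_4 = mean_average_precision(actual_by_user, predicted_by_user, k=4)
print(map_at_4)
```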

The main difference between CountVectorizer and TfidfVectorizer is that CountVectorizer simply counts how many times each word appears in a document, while TfidfVectorizer also takes into account how many documents in the corpus contain that word.

As a result, TfidfVectorizer gives more weight to words that are distinctive for a given document and less to words that appear almost everywhere, while CountVectorizer treats all words equally.
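The difference can be reproduced by hand on a tiny made-up corpus. The sketch below uses the smoothed IDF and L2 row normalization that scikit-learn applies by default. The documents and variable names are assumptions for illustration.

```python
import numpy as np

# Made-up tokenized documents; "good" appears in every one of them
docs = [["good", "pizza"], ["good", "sushi"], ["good", "good", "ramen"]]
vocab = sorted({word for doc in docs for word in doc})  # ['good', 'pizza', 'ramen', 'sushi']

# Count-based representation: raw term counts per document
counts = np.array([[doc.count(word) for word in vocab] for doc in docs], dtype=float)

# Document frequency and smoothed IDF: ln((1 + n) / (1 + df)) + 1
n_docs = len(docs)
doc_freq = (counts > 0).sum(axis=0)
idf = np.log((1 + n_docs) / (1 + doc_freq)) + 1

# TF-IDF representation with L2-normalized rows
tfidf = counts * idf
tfidf /= np.linalg.norm(tfidf, axis=1, keepdims=True)

good, pizza = vocab.index("good"), vocab.index("pizza")
print(counts[0, good], counts[0, pizza])  # equal raw counts in document 0
print(tfidf[0, good], tfidf[0, pizza])    # "pizza" now outweighs "good"
```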

Refer to https://arxiv.org/pdf/2004.13851.pdf



In [None]:
# user_id = 'your_user_id_here'
# top_n = 5
# recommended_restaurants = recommend_restaurants(user_id, tfidf_matrix, df, top_n)

In [None]:
# import numpy as np

# def calculate_apk(actual, predicted, k=10):
#     """
#     Calculates the Average Precision at k for a single user.
    
#     Args:
#     actual (list): A list of the actual restaurant IDs (as strings).
#     predicted (list): A list of the predicted restaurant IDs (as strings).
#     k (int): The number of recommendations to consider.
    
#     Returns:
#     apk (float): The Average Precision at k for the user.
#     """
#     if len(predicted) > k:
#         predicted = predicted[:k]
    
#     score = 0.0
#     num_hits = 0.0
    
#     for i, p in enumerate(predicted):
#         if p in actual and p not in predicted[:i]:
#             num_hits += 1.0
#             score += num_hits / (i+1.0)
            
#     if not actual:
#         return 0.0
    
#     return score / min(len(actual), k)

# def calculate_map(actual_dict, predicted_dict, k=10):
#     """
#     Calculates the Mean Average Precision for a set of recommendations.
    
#     Args:
#     actual_dict (dict): A dictionary mapping user IDs to lists of actual restaurant IDs.
#     predicted_dict (dict): A dictionary mapping user IDs to lists of predicted restaurant IDs.
#     k (int): The number of recommendations to consider.
    
#     Returns:
#     map (float): The Mean Average Precision for the recommendations.
#     """
#     apks = []
#     for user_id in actual_dict.keys():
#         actual = actual_dict[user_id]
#         predicted = predicted_dict.get(user_id, [])
#         apk = calculate_apk(actual, predicted, k=k)
#         apks.append(apk)
        
#     return np.mean(apks)

Two classes: "liked by the user" and "not liked by the user".<br/>
This makes it a binary classification problem.
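
A minimal sketch of how star ratings could be turned into the two classes. The threshold of 4 stars for "liked" and the helper name to_binary_labels are assumptions, not something fixed by the dataset.

```python
import numpy as np

def to_binary_labels(stars, threshold=4.0):
    """Return 1 ("liked") where stars >= threshold and 0 ("not liked") otherwise."""
    return (np.asarray(stars, dtype=float) >= threshold).astype(int)

labels = to_binary_labels([5, 3.5, 4, 1])
print(labels)
```

A different threshold changes the class balance, so it is worth checking the label distribution before training.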

Distributed-Based Representation


> Embedding Matrix: Pretrained word embedding GloVe



This code uses GloVe pre-trained word embeddings to create an embedding matrix, and builds a deep learning model using LSTM layers to predict restaurant ratings. It then trains the model on the training data, evaluates the model on the test data, and makes restaurant recommendations based on the user's preferences.

In [None]:
# import gensim.downloader as api

# # Download and load the pre-trained GloVe word embeddings
# word_vectors = api.load("glove-wiki-gigaword-100")



Gensim is an open-source Python library designed to process and analyze large-scale collections of text data. It provides implementations of several state-of-the-art algorithms for natural language processing (NLP), including topic modeling, document similarity analysis, and word embedding.

GloVe (Global Vectors for Word Representation) is a popular unsupervised algorithm for generating word embeddings, which are dense vector representations of words that capture their semantic and syntactic meaning. The GloVe algorithm is typically used to train word embeddings on large text corpora.

While GloVe itself is not directly integrated into Gensim, Gensim provides an interface for loading pre-trained GloVe embeddings and using them for downstream NLP tasks. This makes it easy to incorporate GloVe embeddings into your own NLP models built with Gensim.
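
One simple way to use such embeddings downstream is to average the word vectors of a review into a single fixed-length vector. The sketch below uses a hypothetical 3-dimensional dictionary in place of real GloVe vectors. The object returned by api.load should support the same in and [] lookups, but that substitution is not exercised here.

```python
import numpy as np

# Hypothetical 3-dimensional vectors standing in for real GloVe embeddings
toy_vectors = {
    "good": np.array([1.0, 0.0, 0.0]),
    "pizza": np.array([0.0, 1.0, 0.0]),
    "bad": np.array([-1.0, 0.0, 0.0]),
}

def review_vector(tokens, vectors, dim=3):
    """Average the vectors of the known tokens; return zeros if none are known."""
    known = [vectors[token] for token in tokens if token in vectors]
    if not known:
        return np.zeros(dim)
    return np.mean(known, axis=0)

# Out-of-vocabulary tokens are skipped
vec = review_vector(["good", "pizza", "unknownword"], toy_vectors)
print(vec)
```

Averaging discards word order, which is one reason the cells that follow keep the full sequence of vectors for the recurrent model instead.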

## Splitting the Data

In [None]:
from sklearn.model_selection import train_test_split

# Split data into training and testing sets
train_data, test_data = train_test_split(df, test_size=0.2, random_state=42)

In [None]:
# import json, os
# import pandas as pd
# import numpy as np
# from time import sleep, time
# import numpy as np
# from tqdm import tqdm
# import pickle
# from sklearn.metrics.pairwise import cosine_similarity
# from sklearn.feature_extraction.text import CountVectorizer
# import psycopg2
# import gensim
# import gensim.downloader as api
# from gensim.models.doc2vec import Doc2Vec, TaggedDocument
# import re
# from collections import namedtuple

# doc_vectorizer = Doc2Vec(dm=1, vector_size=300, window=5, alpha=0.025, min_alpha=0.025, seed=1111)

# # Assume that the tokenized reviews are stored as a list of strings in the 'review_tokens' column of the DataFrame
# # Convert the tokenized reviews to sentences
# df['review_text'] = df['review_tokens'].apply(lambda x: ' '.join([word.replace("'", "").replace(",", "") for word in x]))
# df['review_text'] = df['review_tokens'].apply(lambda x: ' '.join([' '.join(tokens) for tokens in df['review_text']]))

# # Print the 'review_text' column to verify the result
# print(df['review_text'])

# # # Store the pre-tokenized reviews in the 'review_tokens' column of the DataFrame
# # df['review_text'] = reviews_str

# agg = df[['name', 'review_text']]

# tagged_train_docs = [[TaggedDocument(words=c, tags=[d]) for d, c in agg[['name', 'review_text']].values]]
# tagged_train_docs

# # doc_vectorizer.build_vocab(tagged_train_docs)

# # print(str(doc_vectorizer))

## Model Selection



In [None]:
# from keras.preprocessing.text import Tokenizer

# # Create a tokenizer
# tokenizer = Tokenizer()

# # Fit the tokenizer on the list of review tokens
# tokenizer.fit_on_texts(df['review_tokens'])

# # Convert the list of tokenized reviews to sequences
# sequences = tokenizer.texts_to_sequences(df['review_tokens'])

In [None]:
# from tensorflow.keras.preprocessing.sequence import pad_sequences
# import nltk
# from nltk.sentiment.vader import SentimentIntensityAnalyzer
# import gc

# nltk.download('vader_lexicon')

# # Initialize the VADER sentiment analyzer
# sid = SentimentIntensityAnalyzer()

# # Define a function to convert each review to a sequence of vectors
# def reviews_to_vectors_with_sentiment(reviews, tokenizer, max_length, sid, embedding_model):
#     # Convert the tokenized reviews to sequences of word indices
#     sequences = tokenizer.texts_to_sequences(reviews)

#     # Convert the sequences of word indices to sequences of word embeddings
#     vectors = []
#     for seq in sequences:
#         vector_seq = []
#         for word_index in seq:
#             word = tokenizer.index_word[word_index]
#             if word in embedding_model:
#                 vector_seq.append(embedding_model[word])
#         vectors.append(vector_seq)

#     # Pad the sequences with zeros so that all reviews have the same length
#     padded_vectors = pad_sequences(vectors, maxlen=max_length, padding='post')

#     # Calculate sentiment scores for each review
#     sentiment_scores = []
#     for i, review in enumerate(reviews):
#         ss = sid.polarity_scores(review)
#         sentiment_scores.append(ss['compound'])
#         if i % 1000 == 0:
#             gc.collect()  # Call the garbage collector every 1000 reviews to free up memory
    
#     sentiment_scores = np.array(sentiment_scores)
#     return padded_vectors, sentiment_scores

# # Convert the training and testing data to sequences of vectors
# max_length = 100  # Set the maximum length of a review to 100 words
# train_vectors, train_sentiment = reviews_to_vectors_with_sentiment(train_data['text'], tokenizer, max_length, sid, word_vectors)
# test_vectors, test_sentiment = reviews_to_vectors_with_sentiment(test_data['text'], tokenizer, max_length, sid, word_vectors)

# # Split the data into input and output arrays
# X_train = train_vectors
# y_train = train_sentiment
# X_test = test_vectors
# y_test = test_sentiment

I can use unsupervised sentiment analysis techniques such as sentiment lexicons to estimate the sentiment of the reviews. A sentiment lexicon is a collection of words and their associated sentiment scores (e.g., positive, negative, or neutral).

The polarity_scores method of the VADER analyzer returns a dictionary (with the keys neg, neu, pos, and compound), which cannot be directly encoded by the LabelEncoder or any other encoder. The reviews_to_vectors_with_sentiment function therefore keeps only the compound score, a single number between -1 and 1, for each review.
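
The lexicon idea can be sketched in a few lines of plain Python. The mini-lexicon, the function names, and the threshold below are all hypothetical. VADER's real lexicon is far larger, and it also handles negation and intensifiers, which this sketch does not.

```python
# Hypothetical mini-lexicon mapping words to sentiment scores
LEXICON = {"great": 1.0, "good": 0.5, "slow": -0.5, "terrible": -1.0}

def lexicon_score(text, lexicon=LEXICON):
    """Mean score of the words found in the lexicon; 0.0 if none are found."""
    scores = [lexicon[word] for word in text.lower().split() if word in lexicon]
    return sum(scores) / len(scores) if scores else 0.0

def to_label(score, threshold=0.0):
    """Turn a continuous score into a binary label (1 = positive, 0 = not positive)."""
    return 1 if score > threshold else 0

score = lexicon_score("Great food but slow service")  # (1.0 - 0.5) / 2
print(score, to_label(score))
```

Thresholding a continuous score like this is one way to obtain the binary labels that a sigmoid output layer expects.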

In [None]:
# # Build deep learning model
# model = Sequential()
# model.add(Embedding(input_dim=num_words, output_dim=100, weights=[embedding_matrix], input_length=max_sequence_length, trainable=False))
# model.add(LSTM(64))
# model.add(Dense(10, activation='softmax'))
# model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# # Train model on training data
# model.fit(X_train_padded, y_train, epochs=10, batch_size=32, validation_split=0.2)

# # Evaluate model on test data
# score = model.evaluate(X_test_padded, y_test, batch_size=32)

# # Make restaurant recommendations
# user_preferences = 'I want a cheap Italian restaurant near downtown'
# user_preferences = ' '.join([stemmer.stem(token) for token in tokenizer(user_preferences.lower())])
# user_sequence = tokenizer.texts_to_sequences([user_preferences])
# user_padded = pad_sequences(user_sequence, maxlen=max_sequence_length)
# predicted_ratings = model.predict(user_padded)
# top_restaurants = df.loc[df['stars'].isin(np.argsort(predicted_ratings)[-10:])]

In [None]:
# print(train_vectors.shape)
# print(y_train.shape)
# print(train_vectors.dtype)
# print(y_train.dtype)

## Model Training

I will train the deep neural network using the training set and validate it using the testing set.

In [None]:
# import tensorflow as tf
# from tensorflow.keras.layers import Embedding, SimpleRNN, Dense
# from tensorflow.keras.models import Sequential

# # Define the RNN model architecture
# model = Sequential()
# model.add(Embedding(input_dim=len(word2vec_model.wv.vocab), output_dim=100, weights=[tfidf], input_length=max_length))
# model.add(SimpleRNN(units=128))
# model.add(Dense(units=1, activation='sigmoid'))

This code defines a simple RNN model with an Embedding layer, a SimpleRNN layer, and a Dense output layer. The Embedding layer is initialized with the pre-trained word vectors, and the weights are fixed during training. The SimpleRNN layer processes the sequence of word vectors and produces a final output, which is passed through a Dense layer with a sigmoid activation function to produce a binary classification output.

I chose an RNN with an Embedding layer, a SimpleRNN layer, and a Dense output layer because it is a popular and effective architecture for sequential data such as text. The Embedding layer maps each word index to its pre-trained dense vector, turning a review into a sequence of vectors that the RNN can process. The SimpleRNN layer reads that sequence and captures the relationships between the words. Finally, the Dense output layer predicts the sentiment of the restaurant review from the representation produced by the SimpleRNN layer.

The RNN is a good fit for this problem because it captures the contextual relationships between the words in a sentence. It suits text classification tasks such as sentiment analysis, where word order matters for the sentiment of a sentence. The SimpleRNN layer is also computationally cheap, although it mainly captures short-term dependencies in the input sequence.
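
To show what the SimpleRNN and Dense layers compute, here is a NumPy sketch of the forward pass only. The weights are random and untrained, and the shapes and names are assumptions chosen for illustration. It mirrors the standard recurrence h_t = tanh(x_t W_x + h_{t-1} W_h + b) followed by a sigmoid output.

```python
import numpy as np

def simple_rnn_forward(inputs, W_x, W_h, b, w_out, b_out):
    """Run a simple RNN over inputs of shape (timesteps, input_dim); return a probability."""
    h = np.zeros(W_h.shape[0])
    for x_t in inputs:
        # The new hidden state depends on the current input and the previous state
        h = np.tanh(x_t @ W_x + h @ W_h + b)
    # Dense layer with sigmoid activation applied to the final hidden state
    logit = h @ w_out + b_out
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)
timesteps, input_dim, units = 5, 4, 3
inputs = rng.normal(size=(timesteps, input_dim))  # stands in for a sequence of word vectors
W_x = rng.normal(scale=0.1, size=(input_dim, units))
W_h = rng.normal(scale=0.1, size=(units, units))
b = np.zeros(units)
w_out = rng.normal(scale=0.1, size=units)

prob = simple_rnn_forward(inputs, W_x, W_h, b, w_out, 0.0)
print(prob)
```

Because the hidden state is overwritten at every step, information from early words fades quickly, which is the short-term-dependency limitation mentioned above.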

In [None]:
# # Compile the model
# model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# # Train the model
# history = model.fit(train_vectors, train_labels, epochs=10, batch_size=64, validation_data=(test_vectors, test_labels))

# # Evaluate the model
# loss, accuracy = model.evaluate(test_vectors, test_labels, verbose=False)
# print(f'Test accuracy: {accuracy:.3f}')

The model is then compiled with an Adam optimizer and binary cross-entropy loss function.

For binary classification problems, where the goal is to assign an input to one of two classes, binary cross-entropy is a common loss function for neural networks. It measures how far the predicted probabilities are from the true class labels, and the optimizer then updates the model weights to reduce it. It is often chosen because it is differentiable, penalizes confident wrong predictions heavily, and has been shown to work well in practice.
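
For reference, binary cross-entropy can be written out in a few lines of NumPy. This is a sketch of the formula, not of Keras's internal implementation, and the clipping constant eps is an assumption added to avoid log(0).

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y * log(p) + (1 - y) * log(1 - p)] over all examples."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])  # equals -ln(0.9)
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])  # equals -ln(0.1)
print(confident_right, confident_wrong)
```

The second value is more than twenty times the first, which illustrates how heavily confident wrong predictions are penalized.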

As for the metrics, accuracy is a commonly used metric for classification problems. It measures the proportion of correctly classified examples out of all the examples. In this case, it tells us the percentage of reviews that are correctly classified as positive or negative. By using accuracy as the metric, we can easily evaluate the performance of the model and compare it to other models that use the same metric.

The model is trained on the train_vectors and train_labels arrays for 10 epochs with a batch size of 64. Finally, it is evaluated on the test_vectors and test_labels arrays, and the test accuracy is printed.