## Project 2
## 2021315566 안훈규
  
  
## 1. Generate movie reviews
First, I imported the necessary libraries and loaded the tokenizer of the pre-trained GPT-2 model.

In [2]:
import os
from transformers import GPT2Tokenizer, GPT2LMHeadModel, AdamW
import torch
from torch.utils.data import Dataset, DataLoader, random_split, Subset
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
import evaluate
import pickle
import tensorflow
import tqdm

tokenizer = GPT2Tokenizer.from_pretrained('heegyu/gpt2-emotion')

Then I checked the PyTorch version and whether CUDA is available.

In [2]:
print(torch.__version__)
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')
print(device)

2.2.2+cu121
cuda


Define a dataset class that loads the raw review texts from the positive and negative directories.

In [3]:
class TextDataset(Dataset):
    def __init__(self, pos_dir, neg_dir):
        self.samples = []
        self.labels = []

        # Load positive samples
        for filename in os.listdir(pos_dir):
            with open(os.path.join(pos_dir, filename), 'r', encoding='utf-8') as file:
                text = file.read()
                self.samples.append(text)

        # Load negative samples
        for filename in os.listdir(neg_dir):
            with open(os.path.join(neg_dir, filename), 'r', encoding='utf-8') as file:
                text = file.read()
                self.samples.append(text)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        sample = self.samples[idx]
        
        return sample

Define a class that can handle tokenized data.

In [4]:
class IMDBDataset(Dataset):
    def __init__(self, texts, tokenizer, max_length=512):
        self.samples = []
        self.tokenizer = tokenizer
        self.max_length = max_length

        # Store the raw texts (tokenization happens lazily in __getitem__)
        for text in texts:
            self.samples.append(text)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        sample = self.samples[idx]
        
        encoding = self.tokenizer(
            sample,
            truncation=True,
            max_length=self.max_length,
            padding='max_length',
            return_tensors='pt'
        )
        input_ids = encoding['input_ids'].squeeze()
        attention_mask = encoding['attention_mask'].squeeze()
        
        return input_ids, attention_mask
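
To make the effect of `truncation=True` and `padding='max_length'` concrete, here is a minimal pure-Python sketch of what the tokenizer call produces for one sample. The helper name `pad_or_truncate`, the token ids, and the padding id `0` are hypothetical stand-ins, and right-side padding is assumed; the real tokenizer uses its own padding token.

```python
def pad_or_truncate(token_ids, max_length, pad_id):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    ids = token_ids[:max_length]        # corresponds to truncation=True
    mask = [1] * len(ids)               # 1 marks a real token
    n_pad = max_length - len(ids)
    ids = ids + [pad_id] * n_pad        # corresponds to padding='max_length'
    mask = mask + [0] * n_pad           # 0 marks a padding position
    return ids, mask

# Hypothetical token ids; 0 is used as a stand-in padding id
short_ids, short_mask = pad_or_truncate([11, 22, 33], max_length=5, pad_id=0)
long_ids, long_mask = pad_or_truncate([1, 2, 3, 4, 5, 6, 7], max_length=5, pad_id=0)

print(short_ids, short_mask)  # [11, 22, 33, 0, 0] [1, 1, 1, 0, 0]
print(long_ids, long_mask)    # [1, 2, 3, 4, 5] [1, 1, 1, 1, 1]
```

Every sample therefore has the same length, which is what lets the DataLoader stack samples into a batch, and the mask records which positions should be ignored later.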

These functions save and load the indices of the dataset splits, because loading the datasets is quite slow.

In [5]:
def save_dataset_splits(train_dataset, val_dataset, test_dataset, train_indices_file, val_indices_file, test_indices_file):
    # Save the indices of the training, validation, and test sets.
    # Each dataset is expected to be a Subset, which exposes its indices,
    # so the function no longer depends on global variables.
    with open(train_indices_file, 'wb') as f:
        pickle.dump(list(train_dataset.indices), f)

    with open(val_indices_file, 'wb') as f:
        pickle.dump(list(val_dataset.indices), f)

    with open(test_indices_file, 'wb') as f:
        pickle.dump(list(test_dataset.indices), f)

def load_dataset_splits(train_indices_file, val_indices_file, test_indices_file):
    # Load the indices of the training, validation, and test sets
    with open(train_indices_file, 'rb') as f:
        train_indices = pickle.load(f)
    
    with open(val_indices_file, 'rb') as f:
        val_indices = pickle.load(f)

    with open(test_indices_file, 'rb') as f:
        test_indices = pickle.load(f)
    
    return train_indices, val_indices, test_indices
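
As a quick sanity check of the save/load idea, the following sketch round-trips a list of indices through pickle in a temporary directory. The index values and the file name are hypothetical; only the standard library is used.

```python
import os
import pickle
import tempfile

indices = [17, 3, 42, 8]  # hypothetical subset indices

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'train_indices.pkl')

    # Write the indices in binary mode, as the save function does
    with open(path, 'wb') as f:
        pickle.dump(indices, f)

    # Read them back, as the load function does
    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == indices)  # True
```

Only the indices are stored, not the texts, so the files stay small; the texts still have to be read from disk when the datasets are recreated.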

split_dataset is a function that splits a dataset into training and validation sets at an 8:2 ratio.

In [6]:
def split_dataset(dataset, train_size=0.8):
    train_size = int(len(dataset) * train_size)
    val_size = len(dataset) - train_size
    return random_split(dataset, [train_size, val_size])

Check whether the saved index files already exist. If they do, load them and recreate the datasets.  
If not, create the datasets and split the training data into training and validation sets.  
The resulting split indices are also saved using pickle.  
The dataset is large, so training and the forward passes take a long time.  
So I used only 1% of each split.

In [7]:
# File names for saved datasets
train_indices_file = './save/train_indices.pkl'
val_indices_file = './save/val_indices.pkl'
test_indices_file = './save/test_indices.pkl'

# Directories for text data
train_pos_dir = './train/pos'
train_neg_dir = './train/neg'
test_pos_dir = './test/pos'
test_neg_dir = './test/neg'

# Check if the indices files already exist
if os.path.exists(train_indices_file) and os.path.exists(val_indices_file) and os.path.exists(test_indices_file):
    # Load the saved datasets and recreate the DataLoaders
    train_indices, val_indices, test_indices = load_dataset_splits(train_indices_file, val_indices_file, test_indices_file)
    train_dataset = Subset(TextDataset(train_pos_dir, train_neg_dir), train_indices)
    val_dataset = Subset(TextDataset(train_pos_dir, train_neg_dir), val_indices)
    test_dataset = Subset(TextDataset(test_pos_dir, test_neg_dir), test_indices)
    tokenized_train_dataset = IMDBDataset(texts=train_dataset, tokenizer=tokenizer)
    tokenized_val_dataset = IMDBDataset(texts=val_dataset, tokenizer=tokenizer)
    tokenized_test_dataset = IMDBDataset(texts=test_dataset, tokenizer=tokenizer)

else:
    # Create training dataset
    full_train_dataset = TextDataset(train_pos_dir, train_neg_dir)

    # Split the training dataset into training and validation sets (80% and 20%)
    train_split, val_split = split_dataset(full_train_dataset, train_size=0.8)

    # Keep only 1% of each split
    train_small, _ = split_dataset(train_split, 0.01)
    val_small, _ = split_dataset(val_split, 0.01)

    # The 1% subsets index into the 80%/20% splits, not into the full dataset,
    # so map them back to indices of the full training dataset
    train_indices = [train_split.indices[i] for i in train_small.indices]
    val_indices = [val_split.indices[i] for i in val_small.indices]

    # Create test dataset and keep 1% of it
    full_test_dataset = TextDataset(test_pos_dir, test_neg_dir)
    test_small, _ = split_dataset(full_test_dataset, 0.01)
    test_indices = list(test_small.indices)

    # Build the subsets on the full datasets so the saved indices stay valid after reloading
    train_dataset = Subset(full_train_dataset, train_indices)
    val_dataset = Subset(full_train_dataset, val_indices)
    test_dataset = Subset(full_test_dataset, test_indices)

    # Save dataset splits
    save_dataset_splits(train_dataset, val_dataset, test_dataset, train_indices_file, val_indices_file, test_indices_file)

    tokenized_train_dataset = IMDBDataset(texts=train_dataset, tokenizer=tokenizer)
    tokenized_val_dataset = IMDBDataset(texts=val_dataset, tokenizer=tokenizer)
    tokenized_test_dataset = IMDBDataset(texts=test_dataset, tokenizer=tokenizer)

# Create DataLoaders for training, validation, and test sets
train_loader = DataLoader(tokenized_train_dataset, batch_size=4, shuffle=True)
val_loader = DataLoader(tokenized_val_dataset, batch_size=4, shuffle=False)
test_loader = DataLoader(tokenized_test_dataset, batch_size=4, shuffle=False)
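
The sizes that result from the 8:2 split followed by the 1% reduction can be worked out with a few lines of arithmetic. This sketch assumes 25,000 reviews each in the training and test folders (the size of the standard IMDB review dataset, which is an assumption here because the notebook does not print the dataset length); `split_sizes` is a hypothetical helper that mirrors the size computation in split_dataset.

```python
import math

def split_sizes(n, fraction):
    """Mirror split_dataset: the first part gets int(n * fraction) items."""
    first = int(n * fraction)
    return first, n - first

n_train_full = 25000  # assumption: size of the training folder
n_test_full = 25000   # assumption: size of the test folder
batch_size = 4

train_80, val_20 = split_sizes(n_train_full, 0.8)
train_small, _ = split_sizes(train_80, 0.01)
val_small, _ = split_sizes(val_20, 0.01)
test_small, _ = split_sizes(n_test_full, 0.01)

for name, n in [('train', train_small), ('val', val_small), ('test', test_small)]:
    print(f'{name}: {n} samples, {math.ceil(n / batch_size)} batches')
```

Under these assumptions the validation loader yields 13 batches, which is consistent with the 13 `loss:` lines printed per epoch in the training output.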

Define a custom loss function.

• shift_logits: Logits excluding the last position along the sequence dimension. This is done so that each token predicts the next token.  
• shift_labels: Labels excluding the first token. This corresponds to the actual next token that needs to be predicted.  
• shift_attention_mask: Attention mask excluding the first token. This indicates whether each prediction is valid or not.

In [8]:
def custom_loss_function(outputs, labels, attention_mask):
    # Shift so that tokens < n predict n
    shift_logits = outputs.logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    shift_attention_mask = attention_mask[..., 1:].contiguous()

    # Flatten the tokens
    loss_fct = torch.nn.CrossEntropyLoss(reduction='none')
    loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))

    # Apply the attention mask
    loss = loss.view(shift_labels.size())
    loss = loss * shift_attention_mask

    # Calculate the mean loss
    loss = loss.sum() / shift_attention_mask.sum()
    
    return loss
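
The same shift-and-mask computation can be traced by hand on a tiny example. The sketch below re-implements it in pure Python for a single sequence; the function names, the 3-token vocabulary, and the logits are hypothetical and only illustrate the idea, they are not the PyTorch code path used above.

```python
import math

def token_nll(logits, target):
    """Cross-entropy of one prediction: -log softmax(logits)[target]."""
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[target]

def masked_next_token_loss(logits, labels, attention_mask):
    """Mean next-token loss over one sequence, ignoring padded targets."""
    shift_logits = logits[:-1]       # position t predicts token t + 1
    shift_labels = labels[1:]        # the tokens to be predicted
    shift_mask = attention_mask[1:]  # 0 where the target is padding
    total = sum(m * token_nll(l, y)
                for l, y, m in zip(shift_logits, shift_labels, shift_mask))
    return total / sum(shift_mask)

# Toy sequence: 4 positions, vocabulary of 3 tokens, last position is padding
logits = [[0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 5.0, 0.0],   # confident but wrong guess for the padded target
          [0.0, 0.0, 0.0]]
labels = [0, 1, 2, 0]
attention_mask = [1, 1, 1, 0]

loss = masked_next_token_loss(logits, labels, attention_mask)
print(loss)  # equals log(3), about 1.0986
```

The third prediction would contribute a large loss, but its target is padding, so the mask removes it and the result is the average over the two real targets only.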

Finally, I fine-tuned the GPT-2 model using the prepared training dataset and the custom loss function.  
Then I saved the fine-tuned model.

In [9]:
# Initialize the GPT-2 model
model = GPT2LMHeadModel.from_pretrained('heegyu/gpt2-emotion')
model.resize_token_embeddings(len(tokenizer))
model = model.to(device)

# Training parameters
epochs = 3
learning_rate = 0.001
optimizer = AdamW(model.parameters(), lr=learning_rate)

model.to(device)

# Training loop
for epoch in range(epochs):
    model.train()
    train_loss = 0
    for batch in train_loader:
        input_ids, attention_mask = [x.to(device) for x in batch]
        labels = input_ids.clone()

        optimizer.zero_grad()
        outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
        
        loss = custom_loss_function(outputs, labels, attention_mask)
        loss.backward()
        optimizer.step()

        train_loss += loss.item()

    avg_train_loss = train_loss / len(train_loader)

    model.eval()
    val_loss = 0
    with torch.no_grad():
        for batch in val_loader:
            input_ids, attention_mask = [x.to(device) for x in batch]
            labels = input_ids.clone()

            outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
            loss = custom_loss_function(outputs, labels, attention_mask)

            val_loss += loss.item()
            print(f"loss: {val_loss}")

    avg_val_loss = val_loss / len(val_loader)

    print(f'Epoch {epoch+1}/{epochs}, Training Loss: {avg_train_loss}, Validation Loss: {avg_val_loss}')

# Save the fine-tuned model
model.save_pretrained('./save/gpt2-imdb-finetuned')




loss: 4.33152961730957
loss: 7.879004716873169
loss: 12.007280111312866
loss: 15.692971467971802
loss: 19.65224027633667
loss: 23.609177112579346
loss: 27.502862215042114
loss: 31.676389932632446
loss: 35.85482907295227
loss: 39.903517961502075
loss: 43.703545808792114
loss: 47.883047342300415
loss: 51.828887701034546
Epoch 1/3, Training Loss: 4.382549562454224, Validation Loss: 3.9868375154641957
loss: 4.635772705078125
loss: 8.426736831665039
loss: 12.81605577468872
loss: 16.697688579559326
loss: 20.879366397857666
loss: 25.091663360595703
loss: 29.2716646194458
loss: 33.81170129776001
loss: 38.3359317779541
loss: 42.667853355407715
loss: 46.73302698135376
loss: 51.26340961456299
loss: 55.56542921066284
Epoch 2/3, Training Loss: 3.109851765632629, Validation Loss: 4.274263785435603
loss: 5.190962791442871
loss: 9.58488655090332
loss: 14.55141830444336
loss: 18.9681339263916
loss: 23.751218795776367
loss: 28.48411464691162
loss: 33.22608661651611
loss: 38.34773111343384
loss: 43.42200
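
Note that each `loss:` line in the output above is the running total of the validation loss within an epoch, not the loss of a single batch. A small check, using only numbers copied from that output, shows how the per-epoch average is obtained from the last running total:

```python
# Last running total printed in each completed epoch (copied from the output above)
running_totals = {
    1: 51.828887701034546,
    2: 55.56542921066284,
}
num_val_batches = 13  # number of `loss:` lines printed per epoch

for epoch, total in running_totals.items():
    average = total / num_val_batches
    print(f'Epoch {epoch}: average validation loss = {average}')
```

The averages match the reported validation losses of about 3.99 and 4.27. Since the training loss fell from about 4.38 to 3.11 over the same two epochs, this may be a sign of overfitting on the small 1% subset, although two epochs are too few to be sure.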

Load the fine-tuned GPT-2 model.  
I wrote 10 prompts and generated 3 reviews per prompt, for 30 movie reviews in total.  
The generated reviews were saved to a text file.

In [10]:
# Load the fine-tuned model
model = GPT2LMHeadModel.from_pretrained('./save/gpt2-imdb-finetuned')
model = model.to(device)

# Set of prompts
prompts = [
    "This movie was",
    "The actors in the film",
    "The plot of the movie",
    "The director did",
    "The film's soundtrack",
    "The cinematography in",
    "Overall, the film",
    "One thing I loved about",
    "A memorable scene in",
    "The ending of the movie"
]

# Generate reviews
generated_reviews = []

for prompt in prompts:
    for _ in range(3):  # Generate 3 reviews per prompt to get 30 reviews in total
        input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)
        output = model.generate(
            input_ids,
            max_length=100,
            num_return_sequences=1,
            no_repeat_ngram_size=2,
            top_p=0.95,
            top_k=50,
            temperature=1.0,
            num_beams=3,
            do_sample=True,
            early_stopping=True
        )

        review = tokenizer.decode(output[0], skip_special_tokens=True)
        generated_reviews.append(review)

# Save the generated reviews to a text file
with open('./save/generated_reviews.txt', 'w', encoding='utf-8') as f:
    for review in generated_reviews:
        f.write(review + "\n\n")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
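
The generation call combines `top_k`, `top_p`, and `do_sample=True`. As a rough illustration of what these two filters do to a next-token distribution, here is a simplified pure-Python sketch. It is not the library's implementation; the function name, the toy vocabulary, and the probabilities are hypothetical, and beam search (`num_beams=3`) is left out entirely.

```python
def top_k_top_p_filter(probs, top_k, top_p):
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative (renormalised) probability reaches top_p, and renormalise."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total_k = sum(p for _, p in ranked)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p / total_k
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Hypothetical next-token distribution
probs = {'good': 0.5, 'bad': 0.25, 'fun': 0.125, 'long': 0.0625, 'odd': 0.0625}
filtered = top_k_top_p_filter(probs, top_k=4, top_p=0.75)
print(filtered)  # only 'good' and 'bad' remain, with probabilities 2/3 and 1/3
```

Sampling then draws the next token from the filtered distribution, which removes the unlikely tail while still allowing some variety between the three reviews generated for the same prompt.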

From each review in the test dataset, I extracted the first 10 words as a prompt and used it to generate text.  
I then compared the generated texts with the original reviews and computed the BLEU score.  
The BLEU function suggested by the professor reported only the first decimal place, so I used NLTK's sentence_bleu with smoothing instead.

In [11]:
# Prepare the prompts and references
prompts = []
references = []

for text in test_dataset:
    words = text.split()
    prompt = " ".join(words[:10])
    prompts.append(prompt)
    references.append(text)

# Generate text using the model
generated_texts = []

for prompt in prompts:
    input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)
    output = model.generate(
        input_ids,
        max_length=100,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        top_p=0.95,
        top_k=50,
        temperature=1.0,
        num_beams=3,
        do_sample=True,
        early_stopping=True
    )
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    generated_text = generated_text[len(prompt):].strip()  # Remove the prompt from the generated text
    generated_texts.append(prompt + " " + generated_text)

# Compute BLEU scores
bleu_scores = []
smoothie = SmoothingFunction().method4

for reference, generated_text in zip(references, generated_texts):
    reference = [reference.split()]  # BLEU expects a list of reference texts
    generated_text = generated_text.split()
    bleu_score = sentence_bleu(reference, generated_text, smoothing_function=smoothie)
    bleu_scores.append(bleu_score)

# Calculate the mean BLEU score
mean_bleu_score = sum(bleu_scores) / len(bleu_scores)

print(f'Mean BLEU Score: {mean_bleu_score}')

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.

Mean BLEU Score: 0.05329482489521875
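
To see why the mean BLEU score is low, it helps to look at the two ingredients of BLEU: clipped n-gram precision and the brevity penalty. The sketch below computes both in pure Python for one hypothetical sentence pair. It is a simplified illustration, not NLTK's implementation: it uses only unigrams and bigrams and no smoothing, whereas sentence_bleu uses up to 4-grams by default.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(reference, candidate, n):
    """Share of candidate n-grams found in the reference, counting each
    n-gram at most as often as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / max(sum(cand_counts.values()), 1)

def brevity_penalty(reference, candidate):
    """Penalise candidates that are shorter than the reference."""
    if len(candidate) >= len(reference):
        return 1.0
    return math.exp(1 - len(reference) / len(candidate))

# Hypothetical reference and generated text
reference = 'the movie was fun to watch and the acting was good'.split()
candidate = 'the movie was fun but the story was bad'.split()

p1 = clipped_precision(reference, candidate, 1)  # 6 of 9 unigrams match
p2 = clipped_precision(reference, candidate, 2)  # 3 of 8 bigrams match
bp = brevity_penalty(reference, candidate)
bleu2 = bp * math.exp(0.5 * (math.log(p1) + math.log(p2)))
print(p1, p2, bp, bleu2)
```

In the experiment above, the generated text shares its first 10 words with the reference but is unlikely to reproduce longer n-grams after that, and it is capped at 100 tokens while the reference is the full review, so both the higher-order precisions and the brevity penalty plausibly pull the score down.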


## 2. Sentiment classification with BERT  
  
I loaded a pre-trained BERT model and classified the reviews generated in Task 1-d.  
I applied softmax to the logits to get the probability of the predicted class (printed as 'Accuracy' in the output), and printed 'NEGATIVE' or 'POSITIVE' for each review.

In [1]:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Set the device
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')
print(device)


# Load the tokenizer and model for sentiment analysis
tokenizer = AutoTokenizer.from_pretrained('textattack/bert-base-uncased-SST-2')
model = AutoModelForSequenceClassification.from_pretrained('textattack/bert-base-uncased-SST-2')
model = model.to(device)

# Read the movie reviews from the file
with open('./save/generated_reviews.txt', 'r', encoding='utf-8') as file:
    reviews = file.readlines()

# Define a function to classify sentiment
def classify_sentiment(review):
    # Tokenize the input text
    inputs = tokenizer(review, return_tensors="pt", truncation=True, padding=True).to(device)
    
    # Forward pass through the model
    with torch.no_grad():
        outputs = model(**inputs)
    
    # Get the logits and apply softmax to get probabilities
    logits = outputs.logits
    probabilities = torch.nn.functional.softmax(logits, dim=-1)
    
    # Get the predicted label
    predicted_class = torch.argmax(probabilities, dim=-1).item()
    
    # Map the predicted label to a human-readable string
    labels = ['NEGATIVE', 'POSITIVE']
    sentiment = labels[predicted_class]
    probability = probabilities[0, predicted_class].item()
    
    return sentiment, probability

# Classify the sentiment of each review and print the results
for review in reviews:
    if not review == "\n":
        sentiment, probability = classify_sentiment(review)
        print(f"Review: {review.strip()}")
        print(f"Sentiment: {sentiment}, Accuracy: {probability:.4f}\n")


cuda



Review: This movie was fun to watch. I thought that this was a good movie. The acting was good, but the story was told very beautifully. It is a story about a girl called Jess who is trying to achieve her life long dream to become a soccer player. She is determined and strives. Her character development was right on the mark and she shows a range of emotions from love to fear. This movie also shows that she has a family with her oldest son and aunts. When the kids come
Sentiment: POSITIVE, Accuracy: 0.9997

Review: This movie was fun to watch. I thought I was wrong.<br /><br />The story is about a girl called Jess who is trying to achieve her life long dream to become a pro soccer player and she has done that. It is a fun story where she meets a guy who likes her and helps her get kicked off the skateboard team. He is good looking and has a great voice. His looks really good and the way the movie switches from good to great is how the team switches
Sentiment: POSITIVE, Accuracy: 0.9994
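
The value printed as 'Accuracy' is the softmax probability of the predicted class, that is, how confident the classifier is in its own prediction. It is not accuracy in the usual sense, which would require comparing predictions against true labels. A minimal pure-Python sketch of that step, with hypothetical logits for one review:

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = ['NEGATIVE', 'POSITIVE']
logits = [-3.5, 4.0]  # hypothetical classifier output for one review

probabilities = softmax(logits)
predicted = max(range(len(labels)), key=lambda i: probabilities[i])
print(labels[predicted], round(probabilities[predicted], 4))
```

A large gap between the two logits gives a probability close to 1, as for most reviews in the listing, while a probability near 0.5 means the two logits were nearly equal and the classifier was close to undecided.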

Review 1: This movie was fun to watch. I thought that this was a good movie. The acting was good, but the story was told very beautifully. It is a story about a girl called Jess who is trying to achieve her life long dream to become a soccer player. She is determined and strives. Her character development was right on the mark and she shows a range of emotions from love to fear. This movie also shows that she has a family with her oldest son and aunts. When the kids come  
Sentiment: POSITIVE, Accuracy: 0.9997  
My Opinion: POSITIVE  

Review 2: This movie was fun to watch. I thought I was wrong. The story is about a girl called Jess who is trying to achieve her life long dream to become a pro soccer player and she has done that. It is a fun story where she meets a guy who likes her and helps her get kicked off the skateboard team. He is good looking and has a great voice. His looks really good and the way the movie switches from good to great is how the team switches  
Sentiment: POSITIVE, Accuracy: 0.9994  
My Opinion: POSITIVE  

Review 3: This movie was fun to watch. I thought I was wrong. Firstly, let me start off by saying that I enjoyed watching it. The acting was good, the story was very well developed, and the characters were very likable. i was hoping for everything to be tied together at the end, but it was just not that easy to do that. It's just that people can understand that love can be hard to work and that this movie is a light heart wrenching  
Sentiment: POSITIVE, Accuracy: 0.9993  
My Opinion: POSITIVE  

Review 4: The actors in the film have a lot to do to make their mark. The story is about a girl called Jess who is trying to achieve her life long dream to become a pro soccer player. She is disqualified because she has broken her heart, but she is determined and strives. It is a story of friendships, love, dreams and mortality. There are a few setbacks along the way that she faces but all she's determined to be the best and she makes the team. This movie is one of  
Sentiment: POSITIVE, Accuracy: 0.9992  
My Opinion: POSITIVE  

Review 5: The actors in the film have a lot of crazy people working for each other, trying to make things good, and keeping the viewer entertained. But it doesn't mean that they don't make mistakes, it means that their acting is better than usual. The main girl is a bit old, but she's the best actress I've seen in a long film. She's just not that sort of likable, because she is just a different person to the rest of the cast. It's fun to  
Sentiment: POSITIVE, Accuracy: 0.9984  
My Opinion: POSITIVE  

Review 6: The actors in the film are good looking and the story is told a thousand times before. If you liked "Dave" with Kevin Klein, you'll get a kick out of him. He is great in this role, especially when he gets to the beach with his girlfriend. The beach is beautiful, the breeze is just right, and there is a beautiful lady who turns out to be a pillar of the community. She is the mother of Kevin, who is OK with encouraging  
Sentiment: POSITIVE, Accuracy: 0.9996  
My Opinion: POSITIVE  

Review 7: The plot of the movie is about a group of criminals who escape from penal colonies. They fly to the Moon in a space-age dustbin carrier, and hijack the penal colony to escape. After a daring mission, they terrorise the dust-coated dust particles that are charged with dangerous technology. This movie tells a story that begins with a colorful cast cast and a good soundtrack soundtrack. Then it goes bad from there, especially when the credits close. You will  
Sentiment: NEGATIVE, Accuracy: 0.5717  
My Opinion: POSITIVE  

Review 8: The plot of the movie is wonderfully plotted and the story is told a thousand times before. The main girl is a maid who likes to have sex with her boyfriend, but she gets addicted to him because she is cheating on him. She is hired to play him in a Broadway production, and when the money comes in, she finds that she has a party with boyfriend and boyfriend. They hook up at the beach, where she goes to get drunk and reckless. It is fun  
Sentiment: POSITIVE, Accuracy: 0.9991  
My Opinion: POSITIVE  

Review 9: The plot of the movie is about a group of criminals who escape from penal colonies. They fly to the Moon in a space-age dustbin carrier with a hologram of a naked female criminal attorney, who is assigned to him by the penal colony. When there, they terrorise a star witness, and hijack the carrier to rescue him. It is a good movie, especially when you see the naked male criminal go to war-like stardom outgrew.  
Sentiment: POSITIVE, Accuracy: 0.9956  
My Opinion: POSITIVE  

Review 10: The director did an excellent job of portraying the man who survived from a plane crash in the Amazon and survived. The story mirrors life. It is a twisted morality play, based on the moral of love and friendship. This movie tells the story of a girl who was sent to him by the head of the island nation of Parador's penal colony, Vera Ngassa, after she has been given the award for her courage in handling the wrath of her nation's political opposition  
Sentiment: POSITIVE, Accuracy: 0.9982  
My Opinion: POSITIVE  

Review 11: The director did not mean to write the screenplay for this film, nor did he mean for it to be a commercial rip-off of the earlier pulp magazine/paperback films that were banned from the UK. Instead, he and his crew of criminals and their wierdo helpers try to make their way into the web of criminal justice. In one of those things they do is send threatening letters to a friend of theirs, telling him that he is a criminal attorney and that they are too old for  
Sentiment: NEGATIVE, Accuracy: 0.9888  
My Opinion: NEGATIVE  

Review 12: The director did an excellent job of portraying the different facets of Hong Kong cinema, from the aesthetics of the rooms, to the manner in which the audience sees the colourful characters and their unique features. The story is told a thousand times before, but rarely (never?) more than once. The story mirrors life and art. It is the story that changes people's lives, so they are made to feel better about themselves and are inspired by what people are missing out on. This  
Sentiment: POSITIVE, Accuracy: 0.9993  
My Opinion: POSITIVE  

Review 13: The film's soundtrack is one of the best in the world. It's so atmospheric that viewers can feel the joy and wonder about the performances of actors and actresses. This is the story of a girl called Jess who is trying to achieve her life long dream to become a pro soccer player and finally gets the chance when offered a position on a local team. Her journey is complicated but she is determined and strives in all areas of her competitive life. This film is a light heart  
Sentiment: POSITIVE, Accuracy: 0.9997  
My Opinion: POSITIVE  

Review 14: The film's soundtrack really good. The story is told a thousand times before, and each time it's like a different story. The acting is good, but the movie is just not that bad. It's just that people don't fake their hearts out on TV, that they aren't capable of holding their noses in the direction of a man who likes a good sports movie and should do something beside groan during this movie. This is a movie about sports movies that involve  
Sentiment: POSITIVE, Accuracy: 0.9902  
My Opinion: POSITIVE  

Review 15: The film's soundtrack is so well done that you'd be hard to miss out on watching it. This is one of the reasons why people rave about Hong Kong cinema. The story is simple: it is a light heart wrenching tale with many twists and turns that are designed to make you feel better about yourself and your loved characters. The story mirrors life. It is about finding love and knowing that she has found the path to happiness and finding happiness is the key to finding  
Sentiment: POSITIVE, Accuracy: 0.9997  
My Opinion: POSITIVE  

Review 16: The cinematography in this part of the film is very different from that of other independent, independent films. This one is more economical to rent, and more importantly, it shows that the people who are willing to pay good money to sit thru it. It is also more efficient to send people to sleep with their hearts out, because there is less fatigue than usual when people talk to each other in a language other than their phones.  There is no way to spoil this movie  
Sentiment: POSITIVE, Accuracy: 0.9911  
My Opinion: POSITIVE  

Review 17: The cinematography in this first part of the film is quite good. The story is a twisted morality play played by a young girl called Jess who is supposed to be a pillar of virtue. This movie is about a girl's journey to discover where she has come from and found happiness. She finds that she is unable to support her sons, namely her oldest son, and youngest son. But she struggles to redeem herself, which results in her being kicked off the force for her  
Sentiment: POSITIVE, Accuracy: 0.9023  
My Opinion: POSITIVE  

Review 18: The cinematography in this field is one of the finest filmmakers in the world. This is the story of a young girl who is sent to him for evacuation, after she has been found out that she is not the same as her brother, but is much better suited to acting than to directing. The story is told a thousand times before, and each time it is a different story. This is an outstanding movie. It is so beautifully filmed that viewers can feel the emotions of  
Sentiment: POSITIVE, Accuracy: 0.9993  
My Opinion: POSITIVE  

Review 19: Overall, the film is about a group of criminals who escape from penal colonies. They fly to the Moon and terrorise the dustbin men who work on their Moon pad. This movie is a light heart wrenching film. The moral of the tale is that criminals escape and are penalized for crimes they commit. This movie also shows the difference between what is happening in each penal colony and what has happened to each one. It shows that each time a criminal escapes, he  
Sentiment: POSITIVE, Accuracy: 0.9993  
My Opinion: POSITIVE  

Review 20: Overall, the film has more nudity than most of the other independent horror films of recent times. The story is about a group of criminals who escape from penal colonies and escapees. They fly to the Moon for a space dustbin carrier to check in on their penal colony. After a daring journey, they terrorise a star witness who witnesses a brutal murder deep in the earth. The movie also shows that some criminals escape, and are penalized for what they did. It  
Sentiment: POSITIVE, Accuracy: 0.9861  
My Opinion: NEUTRAL  

Review 21: Overall, the film is fun to watch. The actors are good and the story is told a thousand times before. There are some good moments in this film. First of the movie is when Sonia Braga, who plays her oldest son, is the mother's lawyer, and second of it is a great joy to see her. She is very good value and she is in love with him. It is an amazing feeling when she gets to the end of a heart wrenching  
Sentiment: POSITIVE, Accuracy: 0.9997  
My Opinion: POSITIVE  

Review 22: One thing I loved about this film was the director's brilliant editing and the story is told a thousand times before. But this one is more about telling stories that are told before they are killed. It is a different story and there are different characters and situations thrown in. The story mirrors life and death situations. This one tells a story that is about finding love and knowing that she is OK and that the best way to do it is to find out she has found happiness and wants to be with her  
Sentiment: POSITIVE, Accuracy: 0.9992  
My Opinion: POSITIVE  

Review 23: One thing I loved about this movie was that it is fast-paced, with lots of action and lots to be done. I thought it would be a good movie, but I was wrong. Firstly, let me start off by saying that I didn't watch the movie at first because it was not recommended to do so. Secondly, I think that Chris Rockburne was AW AWESOME as was Liam Neeson, so I don't think I can understand why  
Sentiment: POSITIVE, Accuracy: 0.9770  
My Opinion: NEGATIVE  

Review 24: One thing I loved about this film is that it is the first time I've seen it in a movie theater in Switzerland. It's fast-paced, with good actors and good music. The story is told a thousand times before, and each time it's like a different story. There is a girl who is working for him and there is an accident that causes her to be killed. When she is killed, she goes to his friend and starts selling drugs. Then there  
Sentiment: POSITIVE, Accuracy: 0.9981  
My Opinion: POSITIVE  

Review 25: A memorable scene in the film. It is a real pity that people didn't waste their time on this film, especially when it is made for them. Firstly, let me start off by saying that I LOVE IT!!!!! It teaches good old fashioned values in a fun way, without over-tteasing too much. This is one of the reasons why people don't like movies. Secondly, I would like to recommend it to those of you who want a nice light movie  
Sentiment: POSITIVE, Accuracy: 0.9977  
My Opinion: POSITIVE  

Review 26: A memorable scene in the first season of MOH. It is one of the most underrated masterpieces of all time. The character development is also top notch. This is the story of a girl who is trying to achieve her life long dream to become a good soccer player and she is determined to do it. She sets off for a friend and decides that she has a heart attack when they leave the stadium. When there, she meets a guy who likes her and tries to get her kicked off the  
Sentiment: POSITIVE, Accuracy: 0.9995  
My Opinion: POSITIVE  

Review 27: A memorable scene in the opening credits. It's one of those movies that I've never seen before. If you're looking for something totally original, look no further. Entertainment at it's peak. This one is truly heart wrenching. Performance of actors is also top notch. The story is told a thousand times before, and each time it changes one's life. This movie is a real winner. I'm a true fan of sports movies and I thought Chris Rock is  
Sentiment: POSITIVE, Accuracy: 0.9997  
My Opinion: POSITIVE  

Review 28: The ending of the movie is fantastic. It is one of those movies where you feel that you are going to write a helpful review. If you liked "Dave" with Kevin Klein, you will get a kick out of him. His acting is just right, he is a good actor and he does a great job of portraying him. The story is about a group of people who decide to pool their resources in answer to an ad for a month rental of a villa in  
Sentiment: POSITIVE, Accuracy: 0.9992  
My Opinion: POSITIVE  

Review 29: The ending of the movie is very well done. It is a great story with lots of twists and turns. The characters are fine, but the story is told very beautifully. There are so many different facets and nuances that each individual has his or her unique role. I would love to see him do a good job of portraying the different aspects of Hong Kong cinema. He is perfectly portrayed by director Enrique Eguilez.  The story mirrors life. This movie shows that  
Sentiment: POSITIVE, Accuracy: 0.9997  
My Opinion: POSITIVE  

Review 30: The ending of the movie is not very good, but the story is a good one. It tells a story that is told a thousand times before and never before. The characters are fine and there are some good moments in there. But all in all there is nothing fancy about them. The acting is just right and good. I mean in one of those movies where you can see people do things well and things that are not easy to do. So in this one you will  
Sentiment: POSITIVE, Accuracy: 0.9994  
My Opinion: POSITIVE  

Proportion of predicted sentiment  
POSITIVE: 28/30 = 93.33%, NEGATIVE: 2/30 = 6.67%  
  
Proportion of my own sentiment labels  
POSITIVE: 27/30 = 90.00%, NEGATIVE: 2/30 = 6.67%, NEUTRAL: 1/30 = 3.33%
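
The percentages above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `label_proportions` is hypothetical, and the label lists are rebuilt from the reported totals rather than taken from the original notebook variables.

```python
from collections import Counter


def label_proportions(labels):
    """Return {label: (count, percent)} for a list of sentiment labels."""
    total = len(labels)
    counts = Counter(labels)
    return {label: (n, round(100 * n / total, 2)) for label, n in counts.items()}


# Assumption: lists rebuilt from the totals reported above (review order is not preserved).
predicted = ["POSITIVE"] * 28 + ["NEGATIVE"] * 2
my_opinion = ["POSITIVE"] * 27 + ["NEGATIVE"] * 2 + ["NEUTRAL"] * 1

for name, labels in [("predicted", predicted), ("my opinion", my_opinion)]:
    for label, (n, pct) in label_proportions(labels).items():
        print(f"{name} {label}: {n}/{len(labels)} = {pct:.2f}%")
```

Counting with `Counter` keeps the sketch independent of which labels appear, so a NEUTRAL label needs no special handling.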

I selected these three interesting reviews:  

Review 7: The plot of the movie is about a group of criminals who escape from penal colonies. They fly to the Moon in a space-age dustbin carrier, and hijack the penal colony to escape. After a daring mission, they terrorise the dust-coated dust particles that are charged with dangerous technology. This movie tells a story that begins with a colorful cast cast and a good soundtrack soundtrack. Then it goes bad from there, especially when the credits close. You will  
Sentiment: NEGATIVE, Accuracy: 0.5717  
My Opinion: POSITIVE  
  
I think review 7 is quite positive because it mentions 'a colorful cast' and 'a good soundtrack'. The BERT model probably predicted NEGATIVE because the review contains negative words such as 'criminals', 'hijack', and 'terrorise'.  
  
  
Review 20: Overall, the film has more nudity than most of the other independent horror films of recent times. The story is about a group of criminals who escape from penal colonies and escapees. They fly to the Moon for a space dustbin carrier to check in on their penal colony. After a daring journey, they terrorise a star witness who witnesses a brutal murder deep in the earth. The movie also shows that some criminals escape, and are penalized for what they did. It  
Sentiment: POSITIVE, Accuracy: 0.9861  
My Opinion: NEUTRAL  
  
I think review 20 cannot be categorized as POSITIVE or NEGATIVE because it contains no positive or negative words about the movie, so I labeled it NEUTRAL. I do not know why the BERT model predicted POSITIVE for this review.  
  
Review 23: One thing I loved about this movie was that it is fast-paced, with lots of action and lots to be done. I thought it would be a good movie, but I was wrong. Firstly, let me start off by saying that I didn't watch the movie at first because it was not recommended to do so. Secondly, I think that Chris Rockburne was AW AWESOME as was Liam Neeson, so I don't think I can understand why  
Sentiment: POSITIVE, Accuracy: 0.9770  
My Opinion: NEGATIVE  
  
I labeled review 23 NEGATIVE because of the sentence 'I thought it would be a good movie, but I was wrong.', which expresses a negative opinion of the movie. The BERT model predicted POSITIVE, probably because of phrases such as 'One thing I loved about this movie' and 'it would be a good movie'.  
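
The three cases above can also be picked out programmatically by comparing the model label with the manual label. The following is a rough sketch, not the notebook's actual code: the function name `disagreements` and the `low_score` threshold of 0.6 are hypothetical, and it assumes the value printed as 'Accuracy' is the classifier's confidence score.

```python
# (review number, model label, model score, manual label),
# copied from the three selected reviews above.
rows = [
    (7, "NEGATIVE", 0.5717, "POSITIVE"),
    (20, "POSITIVE", 0.9861, "NEUTRAL"),
    (23, "POSITIVE", 0.9770, "NEGATIVE"),
]


def disagreements(rows, low_score=0.6):
    """Return (id, model label, manual label, is_low_score) where the labels differ.

    low_score is an illustrative threshold, not a value from the original project.
    """
    result = []
    for review_id, model_label, score, manual_label in rows:
        if model_label != manual_label:
            result.append((review_id, model_label, manual_label, score < low_score))
    return result


for review_id, model_label, manual_label, is_low in disagreements(rows):
    note = " (low score)" if is_low else ""
    print(f"Review {review_id}: model={model_label}, mine={manual_label}{note}")
```

With this threshold, only review 7 is flagged as a low-score prediction; reviews 20 and 23 are disagreements where the reported score is high.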