## Steps of WP2

#### 1. Data Preparation
#### 2. Text Preprocessing
#### 3. Identify Adverbs
#### 4. Sentiment Analysis with SentiWordNet
#### 5. Labeling Reviews
#### 6. Machine Learning (Optional)

##### Libraries

In [11]:
import os
import nltk
from nltk.corpus import sentiwordnet as swn
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.tag import pos_tag
from nltk.sentiment.util import mark_negation

In [9]:
nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\aurel\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping taggers\averaged_perceptron_tagger.zip.


True

#### Loading the movie review dataset

In [13]:
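def load_reviews_from_directory(directory):
    """Read every .txt review in `directory`; the label comes from the folder name."""
    # Derive the label once, so reviews and labels always have the same length
    if directory.endswith("pos"):
        label = "positive"
    elif directory.endswith("neg"):
        label = "negative"
    else:
        raise ValueError(f"Cannot infer a label from directory: {directory}")

    reviews = []
    labels = []
    # sorted() makes the file order reproducible across platforms
    for filename in sorted(os.listdir(directory)):
        if filename.endswith(".txt"):
            with open(os.path.join(directory, filename), "r", encoding="utf-8") as file:
                reviews.append(file.read())
                labels.append(label)

    return reviews, labels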

# Load positive and negative reviews
positive_reviews, positive_labels = load_reviews_from_directory("C:/Users/aurel/OneDrive - De Vinci/ONE DRIVE PC/A5/NLP/TD2/txt_sentoken/pos")
negative_reviews, negative_labels = load_reviews_from_directory("C:/Users/aurel/OneDrive - De Vinci/ONE DRIVE PC/A5/NLP/TD2/txt_sentoken/neg")

# Combine positive and negative reviews and labels
all_reviews = positive_reviews + negative_reviews
all_labels = positive_labels + negative_labels

# Example: print the first positive and negative reviews
print("Positive Review:")
print(all_reviews[0])
print("Label:", all_labels[0])

print("\nNegative Review:")
print(all_reviews[len(positive_reviews)])
print("Label:", all_labels[len(positive_labels)])


Positive Review:
films adapted from comic books have had plenty of success , whether they're about superheroes ( batman , superman , spawn ) , or geared toward kids ( casper ) or the arthouse crowd ( ghost world ) , but there's never really been a comic book like from hell before . 
for starters , it was created by alan moore ( and eddie campbell ) , who brought the medium to a whole new level in the mid '80s with a 12-part series called the watchmen . 
to say moore and campbell thoroughly researched the subject of jack the ripper would be like saying michael jackson is starting to look a little odd . 
the book ( or " graphic novel , " if you will ) is over 500 pages long and includes nearly 30 more that consist of nothing but footnotes . 
in other words , don't dismiss this film because of its source . 
if you can get past the whole comic book thing , you might find another stumbling block in from hell's directors , albert and allen hughes . 
getting the hughes brothers to direct this
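
The loader above is called once per folder with an absolute, machine-specific path. As a minimal sketch of an alternative, assuming the same `pos`/`neg` folder layout, the function below takes a single base directory and walks both subfolders. The name `load_labeled_reviews` and the `label_dirs` parameter are hypothetical and not part of the original notebook.

```python
from pathlib import Path


def load_labeled_reviews(base_dir, label_dirs=None):
    """Read every .txt file under base_dir/<subdir> and label it by subfolder.

    `label_dirs` maps a subfolder name to the label assigned to its files
    (hypothetical parameter; defaults to the pos/neg layout used above).
    """
    if label_dirs is None:
        label_dirs = {"pos": "positive", "neg": "negative"}

    reviews, labels = [], []
    for subdir, label in label_dirs.items():
        # sorted() keeps the file order reproducible across platforms
        for path in sorted((Path(base_dir) / subdir).glob("*.txt")):
            reviews.append(path.read_text(encoding="utf-8"))
            labels.append(label)
    return reviews, labels
```

With this variant, the dataset root is the only path that changes between machines, for example `load_labeled_reviews("txt_sentoken")` when the notebook sits next to the data folder.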

### Download and prepare SentiWordNet


In [2]:
nltk.download('sentiwordnet')
nltk.download('punkt')

[nltk_data] Downloading package sentiwordnet to
[nltk_data]     C:\Users\aurel\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping corpora\sentiwordnet.zip.
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\aurel\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

### Execution

In [66]:
# Average the SentiWordNet positive and negative scores over a list of adverbs
def get_sentiment_score(adverbs):
    sentiment_scores = {'pos': 0, 'neg': 0}
    count = 0
    
    for adverb in adverbs:
        # Look up the word in SentiWordNet; only the first sense returned
        # (of any part of speech) is used
        synsets = list(swn.senti_synsets(adverb))
        if synsets:
            pos_score = synsets[0].pos_score()
            neg_score = synsets[0].neg_score()
            sentiment_scores['pos'] += pos_score
            sentiment_scores['neg'] += neg_score
            count += 1
    
    # Avoid division by zero
    if count == 0:
        return sentiment_scores
    
    # Calculate average sentiment scores
    sentiment_scores['pos'] /= count
    sentiment_scores['neg'] /= count
    
    return sentiment_scores


def classify_review_sentiment(review, threshold=0.0):
    sentences = sent_tokenize(review)
    adverbs = []
    
    # Extract adverbs from all sentences; the tag 'RB' covers base adverbs only,
    # not comparative (RBR) or superlative (RBS) forms
    for sentence in sentences:
        words = word_tokenize(sentence)
        tagged_words = pos_tag(words)
        adverbs += [word for word, pos in tagged_words if pos == 'RB']
    
    # Calculate sentiment scores for adverbs
    sentiment_scores = get_sentiment_score(adverbs)
    
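    # Average positive and negative adverb scores for the review
    total_pos_score = sentiment_scores['pos']
    total_neg_score = sentiment_scores['neg']
    
    # Positive only if the positive average exceeds the negative one by more than
    # the threshold; ties and reviews with no scored adverbs fall through to 'Negative'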
    if total_pos_score > total_neg_score + threshold:
        return 'Positive'
    else:
        return 'Negative'

# Small hand-written examples for quick experiments (not used below)
reviews = [
    "This movie is extremely entertaining. It's very well done.",
    "The acting was not convincing. The plot was poorly executed.",
    "The film lacks depth."
]
t = ["The movie was boring and bad.", "The movie was good."]

# Classify the first loaded review based on adverb sentiment scores
threshold = 0.0  # Adjust the threshold as needed
result = classify_review_sentiment(all_reviews[0], threshold)

print("Review Sentiments:", result)

Review Sentiments: Negative
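
`get_sentiment_score` reads only the first sense of each word. One possible variation, sketched below under stated assumptions, averages over every sense of a word before averaging over words, then applies the same threshold rule. The `lookup` callable, the function names, and the toy lexicon with made-up scores are all hypothetical; they stand in for SentiWordNet so the sketch runs without NLTK data.

```python
from statistics import mean


def average_sense_scores(words, lookup):
    """Average (pos, neg) scores over all senses of each word, then over words.

    `lookup(word)` is a hypothetical callable returning a list of
    (pos_score, neg_score) pairs, one pair per sense. Words with no
    senses are skipped.
    """
    pos, neg = [], []
    for word in words:
        senses = lookup(word)
        if not senses:
            continue
        pos.append(mean(p for p, _ in senses))
        neg.append(mean(n for _, n in senses))
    if not pos:
        # No word could be scored: return zeros instead of dividing by zero
        return {"pos": 0.0, "neg": 0.0}
    return {"pos": mean(pos), "neg": mean(neg)}


def decide(scores, threshold=0.0):
    """Same rule as in the cell above: ties fall through to 'Negative'."""
    return "Positive" if scores["pos"] > scores["neg"] + threshold else "Negative"


# Toy lexicon with made-up scores, for illustration only
TOY_LEXICON = {
    "well": [(0.5, 0.0), (0.25, 0.0)],
    "poorly": [(0.0, 0.5)],
    "very": [(0.0, 0.0)],
}


def toy_lookup(word):
    return TOY_LEXICON.get(word, [])


print(decide(average_sense_scores(["very", "well"], toy_lookup)))
print(decide(average_sense_scores(["poorly"], toy_lookup)))
```

To try this against SentiWordNet, `lookup` could wrap `swn.senti_synsets(word, 'r')`, which limits the senses to adverbs, and return `(s.pos_score(), s.neg_score())` for each sense. Whether this changes the label of a given review would need to be checked on the data.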


In [64]:
all_reviews[0]

'films adapted from comic books have had plenty of success , whether they\'re about superheroes ( batman , superman , spawn ) , or geared toward kids ( casper ) or the arthouse crowd ( ghost world ) , but there\'s never really been a comic book like from hell before . \nfor starters , it was created by alan moore ( and eddie campbell ) , who brought the medium to a whole new level in the mid \'80s with a 12-part series called the watchmen . \nto say moore and campbell thoroughly researched the subject of jack the ripper would be like saying michael jackson is starting to look a little odd . \nthe book ( or " graphic novel , " if you will ) is over 500 pages long and includes nearly 30 more that consist of nothing but footnotes . \nin other words , don\'t dismiss this film because of its source . \nif you can get past the whole comic book thing , you might find another stumbling block in from hell\'s directors , albert and allen hughes . \ngetting the hughes brothers to direct this seem
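
This review comes from the `pos` folder, yet the classifier above labelled it 'Negative', so a single example says little about overall quality. A minimal sketch of an evaluation over the whole labelled set follows. The function name `evaluate` is hypothetical. It lowercases both sides because the loader produces "positive"/"negative" while the classifier returns "Positive"/"Negative".

```python
from collections import Counter


def evaluate(reviews, labels, classify):
    """Return (accuracy, confusion counts) for a classifier over labelled reviews.

    `classify` is any callable that maps a review string to a label string.
    The confusion counts are keyed by (gold_label, predicted_label).
    """
    counts = Counter()
    for review, gold in zip(reviews, labels):
        predicted = classify(review).lower()  # normalise 'Positive' -> 'positive'
        counts[(gold.lower(), predicted)] += 1

    total = sum(counts.values())
    correct = sum(n for (gold, pred), n in counts.items() if gold == pred)
    accuracy = correct / total if total else 0.0
    return accuracy, counts
```

In the notebook this could be called as `accuracy, confusion = evaluate(all_reviews, all_labels, classify_review_sentiment)`. The confusion counts show whether errors concentrate on one class, which is likely here since ties default to 'Negative'.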