# Text Sentiment Analysis
This activity aims to identify the sentiment of sentences. **A key goal is to find a way to classify sentences that mix positive and negative clauses.**

## Approach
- Split the compound sentences into simple sentences with conjunctions as pivots
- Evaluate the constituent clauses for their sentiment

## Advantages
- In feedback analysis systems, this helps identify exactly what made customers happy and what made them unhappy.

### Imports

In [1]:
from nltk.tokenize import sent_tokenize
from textblob import TextBlob
import nltk
import re

In [2]:
nltk.download('punkt')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\navee\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

### Splitting Compound Sentences

In [6]:
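def split_compound_sentences(text):
    """Split each sentence in `text` into clauses, using conjunctions as pivots."""
    conjunctions = ['and', 'but', 'or', 'yet', 'so', 'for', 'nor', 'although', 'though']
    # The capturing group makes re.split keep the matched conjunctions,
    # so `parts` alternates: clause, conjunction, clause, ...
    pattern = re.compile(r'\b(' + '|'.join(conjunctions) + r')\b')

    split_sentences = []
    for sentence in sent_tokenize(text):
        parts = pattern.split(sentence)
        current_clause = parts[0].strip()
        for i in range(1, len(parts), 2):
            conjunction = parts[i]
            next_clause = parts[i + 1].strip()

            # Emit the finished clause; the conjunction starts the next one.
            split_sentences.append(current_clause)
            current_clause = f"{conjunction} {next_clause}"

        split_sentences.append(current_clause)

    # Drop empty clauses, e.g. when a sentence starts with a conjunction.
    return [s for s in split_sentences if s]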
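
To see what the regular-expression step inside `split_compound_sentences` produces, the sketch below runs the same kind of pattern on its own, without NLTK. The sample sentences and the name `CONJUNCTION_PATTERN` are illustrative assumptions, not part of the notebook.

```python
import re

# Same conjunction list as in split_compound_sentences.
conjunctions = ['and', 'but', 'or', 'yet', 'so', 'for', 'nor', 'although', 'though']

# Hypothetical name for the compiled pattern; the capturing group makes
# re.split keep each matched conjunction in the result list.
CONJUNCTION_PATTERN = re.compile(r'\b(' + '|'.join(conjunctions) + r')\b')

parts = CONJUNCTION_PATTERN.split("The food was delicious, but the service was slow.")
print(parts)
# ['The food was delicious, ', 'but', ' the service was slow.']

# Caveat: the match is purely lexical, so "for" used as a preposition
# is treated as a pivot too.
preposition_parts = CONJUNCTION_PATTERN.split("Thanks for the help.")
print(preposition_parts)
# ['Thanks ', 'for', ' the help.']
```

The second call suggests that words such as `for` and `so` can over-split sentences where they are not acting as conjunctions, so the list may need tuning for a given dataset.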


### Sentiment Analysis

In [None]:
with open("sentiment_data.txt", 'r') as f:
    lines = f.readlines()

for line in lines:
    clauses = split_compound_sentences(line)
    for clause in clauses:
        # TextBlob polarity ranges from -1.0 (negative) to 1.0 (positive).
        polarity = TextBlob(clause).sentiment.polarity
        label = "neutral"
        if polarity > 0.1:
            label = "positive" if polarity > 0.7 else "slightly positive"
        elif polarity < -0.1:
            label = "negative" if polarity < -0.7 else "slightly negative"
        print(f"{clause:<40} {polarity:<10.3f} {label:<8}")
    print("=======" * 5)

The weather is beautiful today,          0.850      positive
but yesterday's storm was terrible.      -1.000     negative
The concert was thrilling,               0.250      slightly positive
yet the ticket prices were outrageous.   -1.000     negative
She excels at her job,                   0.000      neutral 
though her punctuality is poor.          -0.400     slightly negative
The food was delicious,                  1.000      positive
but the service was slow.                -0.300     slightly negative
The movie had amazing special effects,   0.479      slightly positive
although the plot was predictable.       -0.200     slightly negative
His presentation was insightful,         0.000      neutral 
but his delivery was dull.               -0.292     slightly negative
The book's story was captivating, even if the ending was disappointing. -0.050     neutral 
Her singing voice is angelic,            0.000      neutral 
but her dancing needs improvement.       0.000      neutral 
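
The polarity thresholds used in the sentiment loop can also be factored into a small helper, which makes them easier to test and adjust. This is a minimal sketch: the function name `label_polarity` is hypothetical, and the sample scores are taken from the output above.

```python
def label_polarity(polarity: float) -> str:
    """Map a polarity score in [-1.0, 1.0] to a coarse sentiment label."""
    if polarity > 0.1:
        return "positive" if polarity > 0.7 else "slightly positive"
    if polarity < -0.1:
        return "negative" if polarity < -0.7 else "slightly negative"
    return "neutral"


# Scores copied from the notebook output, one per label.
for score in (0.85, 0.25, 0.0, -0.4, -1.0):
    print(f"{score:<10.3f} {label_polarity(score)}")
```

Scores between -0.1 and 0.1 inclusive fall into the neutral band, which is why the unsplit "even if" sentence (polarity -0.050) is reported as neutral.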
