<a href="https://colab.research.google.com/github/TCU-DCDA/WRIT20833-2025/blob/main/notebooks/exercises/Review_07_Text_Sentiment_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# WRIT 20833 Review 07: Text Analysis & Sentiment Analysis

Analyze cultural texts and measure emotional sentiment computationally.

**Make a copy:** File > Save a copy in Drive

## Exercise 1: Setting Up Text Analysis Tools
Install and import libraries for text and sentiment analysis.

In [None]:
# Install required packages (in Colab, uncomment the next line and run it once per session)
# !pip install vaderSentiment

# Import libraries - only the ones covered in CodeAlongs
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# Create analyzer like in CodeAlongs
analyzer = SentimentIntensityAnalyzer()
print("VADER Sentiment Analyzer loaded successfully")
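
Before importing, it can help to confirm that the package is available in the current runtime. The sketch below shows one way to check using Python's standard `importlib` module. The helper name `is_installed` is made up for this example and is not part of VADER or the CodeAlongs.

```python
import importlib.util

def is_installed(package_name):
    """Return True if Python can find a top-level package with this name."""
    # find_spec returns None when the package cannot be found
    return importlib.util.find_spec(package_name) is not None

# 'vaderSentiment' is the package named in the pip command above
if is_installed("vaderSentiment"):
    print("vaderSentiment is installed")
else:
    print("vaderSentiment is missing: run !pip install vaderSentiment first")
```

If the check reports the package as missing, running the install line and then re-running the import cell should resolve it.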

## Exercise 2: Basic Text Analysis Functions
Create functions to analyze text characteristics.

In [None]:
# Text analysis functions (simplified to match CodeAlongs patterns)
def analyze_text_basics(text):
    """Perform basic text analysis""" 
    
    # Basic counts
    words = text.split()
    char_count = len(text)
    word_count = len(words)
    # Rough estimate: counts every '.', '!' and '?', so "..." counts as three
    sentence_count = text.count('.') + text.count('!') + text.count('?')
    if sentence_count == 0:
        sentence_count = 1
    
    # Calculate averages (strip the same punctuation as get_basic_word_info)
    total_chars = 0
    for word in words:
        clean_word = word.strip('.,!?;:"()')
        total_chars = total_chars + len(clean_word)
    
    if word_count > 0:
        avg_word_length = total_chars / word_count
        avg_sentence_length = word_count / sentence_count
    else:
        avg_word_length = 0
        avg_sentence_length = 0
    
    return {
        'char_count': char_count,
        'word_count': word_count,
        'sentence_count': sentence_count,
        'avg_word_length': round(avg_word_length, 2),
        'avg_sentence_length': round(avg_sentence_length, 2)
    }

def get_basic_word_info(text):
    """Get basic word information""" 
    
    words = text.split()
    longest_word = ""
    
    for word in words:
        clean_word = word.strip('.,!?;:"()')
        if len(clean_word) > len(longest_word):
            longest_word = clean_word
    
    return {
        'total_words': len(words),
        'longest_word': longest_word,
        'longest_length': len(longest_word)
    }

# Test with a literary example
shakespeare_quote = "To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer"

print("Text Analysis of Shakespeare's Hamlet:")
analysis = analyze_text_basics(shakespeare_quote)
for key in analysis:
    value = analysis[key]
    print(key.replace("_", " ").title() + ": " + str(value))

print()
print("Word Analysis:")
word_info = get_basic_word_info(shakespeare_quote)
for key in word_info:
    value = word_info[key]
    print(key.replace("_", " ").title() + ": " + str(value))
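
Counting every `.`, `!`, and `?` separately overcounts sentences when marks appear together, as in `...` or `?!`. The sketch below is one possible refinement, not the CodeAlongs method: it uses the standard `re` module to treat each run of ending marks as a single boundary. The function name `count_sentences` is an assumption for this example.

```python
import re

def count_sentences(text):
    """Estimate sentence count, treating runs like '...' or '?!' as one ending."""
    # Split wherever one or more sentence-ending marks appear together
    pieces = re.split(r"[.!?]+", text)

    # Count only the pieces that contain actual text
    count = 0
    for piece in pieces:
        if piece.strip() != "":
            count = count + 1

    # Match the exercise's convention: never report fewer than 1
    if count == 0:
        count = 1
    return count

sample = "Wait... what?! That cannot be right."
simple_count = sample.count('.') + sample.count('!') + sample.count('?')
print("Punctuation count: " + str(simple_count))
print("Regex estimate: " + str(count_sentences(sample)))
```

This is still an estimate: abbreviations such as "Dr." or "U.S." would be counted as sentence endings.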

## Exercise 3: Sentiment Analysis of Cultural Texts
Analyze emotional sentiment in various cultural works.

In [None]:
# Function to analyze sentiment (using VADER patterns from CodeAlongs)
def analyze_sentiment(text, title="Text"):
    """Analyze sentiment of text using VADER like in CodeAlongs""" 
    
    # Use VADER to get sentiment scores
    scores = analyzer.polarity_scores(text)
    compound = scores['compound']  # This is our main score (-1 to 1)
    
    # Determine overall sentiment like in CodeAlongs
    if compound >= 0.05:
        overall = 'Positive'
    elif compound <= -0.05:
        overall = 'Negative'
    else:
        overall = 'Neutral'
    
    return {
        'title': title,
        'positive': scores['pos'],
        'neutral': scores['neu'],
        'negative': scores['neg'],
        'compound': compound,
        'overall': overall
    }

# Cultural text samples for analysis
cultural_texts = {
    "MLK Dream": "I have a dream that one day this nation will rise up and live out the true meaning of its creed: We hold these truths to be self-evident, that all men are created equal.",
    
    "Poe Raven": "Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore. 'Tis some visitor,' I muttered, 'tapping at my chamber door Only this and nothing more.'",
    
    "Austen Pride": "It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.",
    
    "Orwell 1984": "It was a bright cold day in April, and the clocks were striking thirteen. Winston Smith slipped quickly through the glass doors of Victory Mansions."
}

# Analyze sentiment for each text
print("SENTIMENT ANALYSIS OF CULTURAL TEXTS")
print("=" * 50)

sentiment_results = []
for title in cultural_texts:
    text = cultural_texts[title]
    result = analyze_sentiment(text, title)
    sentiment_results.append(result)
    
    print()
    print(title + ":")
    print("  Overall Sentiment: " + result['overall'])
    print("  Compound Score: " + str(round(result['compound'], 3)))
    print("  Positive: " + str(round(result['positive'], 2)) + " | Neutral: " + str(round(result['neutral'], 2)) + " | Negative: " + str(round(result['negative'], 2)))

# Simple summary
print()
print("SENTIMENT SUMMARY:")
positive_count = 0
negative_count = 0
neutral_count = 0

for result in sentiment_results:
    if result['overall'] == 'Positive':
        positive_count = positive_count + 1
    elif result['overall'] == 'Negative':
        negative_count = negative_count + 1
    else:
        neutral_count = neutral_count + 1

print("Positive texts: " + str(positive_count))
print("Negative texts: " + str(negative_count))
print("Neutral texts: " + str(neutral_count))
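
To see how the cutoffs in `analyze_sentiment` behave at the edges, the labeling step can be pulled out and tried on its own. This is a minimal sketch: the function name `label_compound` and the `threshold` parameter are assumptions for illustration, and the scores are made-up values, not VADER output.

```python
def label_compound(compound, threshold=0.05):
    """Turn a compound score (-1 to 1) into a sentiment label."""
    if compound >= threshold:
        return 'Positive'
    elif compound <= -threshold:
        return 'Negative'
    else:
        return 'Neutral'

# Made-up scores chosen to sit on and around the cutoffs (not real VADER output)
for score in [0.6, 0.05, 0.049, 0.0, -0.05, -0.6]:
    print(str(score) + " -> " + label_compound(score))
```

Scores that land exactly on 0.05 or -0.05 are labeled Positive or Negative, so only values strictly between the two cutoffs count as Neutral.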

## Exercise 4: Comparative Text Analysis
Compare multiple texts across different dimensions.

In [None]:
# Comprehensive analysis function (simplified for CodeAlongs patterns)
def comprehensive_text_analysis(texts_dict):
    """Perform comprehensive analysis combining text stats and sentiment"""
    
    results = []
    
    for title in texts_dict:
        text = texts_dict[title]
        
        # Basic text analysis
        basic_stats = analyze_text_basics(text)
        
        # Sentiment analysis
        sentiment = analyze_sentiment(text, title)
        
        # Combine analysis
        result = {
            'title': title,
            'word_count': basic_stats['word_count'],
            'sentence_count': basic_stats['sentence_count'],
            'avg_word_length': basic_stats['avg_word_length'],
            'sentiment_overall': sentiment['overall'],
            'sentiment_score': sentiment['compound']
        }
        
        results.append(result)
    
    return results

# Perform comprehensive analysis
comprehensive_results = comprehensive_text_analysis(cultural_texts)

print("COMPREHENSIVE TEXT ANALYSIS")
print("=" * 50)

for result in comprehensive_results:
    print()
    print("Title: " + result["title"])
    print("  Word count: " + str(result["word_count"]))
    print("  Sentences: " + str(result["sentence_count"]))
    print("  Avg word length: " + str(result["avg_word_length"]))
    print("  Sentiment: " + result["sentiment_overall"])
    print("  Sentiment score: " + str(round(result["sentiment_score"], 3)))

# Simple statistics
print()
print("ANALYSIS SUMMARY:")
total_word_count = 0
total_sentiment = 0
positive_count = 0
negative_count = 0
neutral_count = 0

for result in comprehensive_results:
    total_word_count = total_word_count + result["word_count"]
    total_sentiment = total_sentiment + result["sentiment_score"]
    
    if result["sentiment_overall"] == "Positive":
        positive_count = positive_count + 1
    elif result["sentiment_overall"] == "Negative":
        negative_count = negative_count + 1
    else:
        neutral_count = neutral_count + 1

print("Average word count: " + str(round(total_word_count / len(comprehensive_results), 1)))
print("Average sentiment score: " + str(round(total_sentiment / len(comprehensive_results), 3)))
print()
print("Sentiment distribution:")
print("Positive: " + str(positive_count))
print("Negative: " + str(negative_count))
print("Neutral: " + str(neutral_count))
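
Once several texts share the same result format, ranking them is a natural next comparison. The sketch below sorts a list of result dictionaries by sentiment score using Python's built-in `sorted()`. The titles and scores are placeholders invented for this example, and the helper `get_score` is a hypothetical name.

```python
# Hypothetical results in the same shape as comprehensive_results.
# The titles and scores are placeholders, not real VADER output.
example_results = [
    {'title': 'Text A', 'sentiment_score': 0.25},
    {'title': 'Text B', 'sentiment_score': -0.40},
    {'title': 'Text C', 'sentiment_score': 0.70},
]

def get_score(result):
    """Return the value to sort by."""
    return result['sentiment_score']

# sorted() returns a new list; reverse=True puts the highest score first
ranked = sorted(example_results, key=get_score, reverse=True)

position = 1
for result in ranked:
    print(str(position) + ". " + result['title'] + " (" + str(result['sentiment_score']) + ")")
    position = position + 1
```

Passing `comprehensive_results` in place of `example_results` should rank the cultural texts the same way, since each entry has a `sentiment_score` key.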

## Exercise 5: Emotional Word Analysis
Identify and categorize emotional language in texts.

In [None]:
# Simple emotional word analysis (using basic patterns from CodeAlongs)
def analyze_emotions(text, title="Text"):
    """Analyze emotional content in text using simple word matching""" 
    
    # Simple emotion word lists
    positive_words = ['happy', 'joy', 'good', 'great', 'wonderful', 'excellent', 'amazing', 'love', 'beautiful', 'bright']
    negative_words = ['sad', 'bad', 'terrible', 'awful', 'horrible', 'hate', 'angry', 'dark', 'evil', 'pain']
    
    # Convert text to lowercase and split into words
    words = text.lower().split()
    
    # Clean words (remove punctuation)
    clean_words = []
    for word in words:
        clean_word = word.strip('.,!?;:"()')
        clean_words.append(clean_word)
    
    # Count emotional words
    positive_count = 0
    negative_count = 0
    
    positive_found = []
    negative_found = []
    
    for word in clean_words:
        if word in positive_words:
            positive_count = positive_count + 1
            if word not in positive_found:
                positive_found.append(word)
        elif word in negative_words:
            negative_count = negative_count + 1
            if word not in negative_found:
                negative_found.append(word)
    
    # Determine dominant emotion
    if positive_count > negative_count:
        dominant_emotion = 'positive'
    elif negative_count > positive_count:
        dominant_emotion = 'negative'
    else:
        dominant_emotion = 'neutral'
    
    total_emotion_words = positive_count + negative_count
    
    return {
        'title': title,
        'positive_count': positive_count,
        'negative_count': negative_count,
        'positive_words': positive_found,
        'negative_words': negative_found,
        'dominant_emotion': dominant_emotion,
        'total_emotion_words': total_emotion_words
    }

# Analyze emotions in cultural texts
print("EMOTIONAL WORD ANALYSIS")
print("=" * 40)

emotion_results = []
for title in cultural_texts:
    text = cultural_texts[title]
    emotion_analysis = analyze_emotions(text, title)
    emotion_results.append(emotion_analysis)
    
    print()
    print(title + ":")
    print("  Dominant emotion: " + emotion_analysis['dominant_emotion'])
    print("  Total emotion words: " + str(emotion_analysis['total_emotion_words']))
    print("  Positive words (" + str(emotion_analysis['positive_count']) + "): " + str(emotion_analysis['positive_words']))
    print("  Negative words (" + str(emotion_analysis['negative_count']) + "): " + str(emotion_analysis['negative_words']))

# Summary statistics
print()
print()
print("EMOTIONAL SUMMARY:")
print("=" * 20)

total_positive = 0
total_negative = 0

for result in emotion_results:
    total_positive = total_positive + result['positive_count']
    total_negative = total_negative + result['negative_count']

print("Total positive words across all texts: " + str(total_positive))
print("Total negative words across all texts: " + str(total_negative))

# Most emotional text
most_emotional_count = 0
most_emotional_title = ""

for result in emotion_results:
    if result['total_emotion_words'] > most_emotional_count:
        most_emotional_count = result['total_emotion_words']
        most_emotional_title = result['title']

print()
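if most_emotional_title != "":
    print("Text with the most emotion words: " + most_emotional_title + " (" + str(most_emotional_count) + " emotion words)")
else:
    print("No emotion words from the lists were found in any text.")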
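
Raw counts favor longer texts, so a length-adjusted measure can make comparisons fairer. The sketch below computes emotion words per 100 words. The function name `emotion_density` and the sample counts are assumptions for illustration only.

```python
def emotion_density(emotion_word_count, word_count):
    """Return emotion words per 100 words, or 0 for an empty text."""
    if word_count == 0:
        return 0
    return round(emotion_word_count / word_count * 100, 1)

# Placeholder counts for illustration only (not taken from the texts above)
print(emotion_density(2, 50))    # short text with 2 emotion words
print(emotion_density(2, 500))   # long text with the same 2 emotion words
```

In the notebook, the two arguments could come from `analyze_emotions(text)['total_emotion_words']` and `analyze_text_basics(text)['word_count']`.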

## Exercise 6: Creating Your Own Text Analysis
Apply text analysis to your own cultural texts of interest.

In [None]:
# TODO: Add your own texts for analysis
# Consider: Song lyrics, poems, speeches, book excerpts, movie quotes, etc.

your_texts = {
    # TODO: Replace these with texts that interest you
    "Sample Text 1": "Replace this with a text you want to analyze. This could be song lyrics, a poem, a speech excerpt, or any cultural text that interests you.",
    
    "Sample Text 2": "Add another text here for comparison. Consider choosing texts from different genres, time periods, or cultural contexts.",
    
    "Sample Text 3": "A third text allows for richer comparison. You might want to include texts that you hypothesize will have different emotional tones."
}

# TODO: Perform your analysis
print("YOUR TEXT ANALYSIS PROJECT")
print("=" * 40)

# Check if user has customized the texts
sample_text = "Replace this with a text you want to analyze. This could be song lyrics, a poem, a speech excerpt, or any cultural text that interests you."

# Look through the values so this still works if you rename the keys
if sample_text not in your_texts.values():
    # Perform analysis on user's texts
    print("Basic Analysis Results:")
    
    for title in your_texts:
        text = your_texts[title]
        
        # Basic text analysis
        basic_stats = analyze_text_basics(text)
        sentiment_result = analyze_sentiment(text, title)
        emotion_result = analyze_emotions(text, title)
        
        print()
        print(title + ":")
        print("  Word count: " + str(basic_stats['word_count']))
        print("  Sentiment: " + sentiment_result['overall'])
        print("  Sentiment score: " + str(round(sentiment_result['compound'], 3)))
        print("  Dominant emotion: " + emotion_result['dominant_emotion'])
        print("  Emotion words: " + str(emotion_result['total_emotion_words']))
    
else:
    print("Please customize the 'your_texts' dictionary above with your own cultural texts!")
    print("Consider analyzing:")
    print("- Song lyrics from different genres")
    print("- Poems from different time periods")
    print("- Political speeches")
    print("- Movie or book quotes")
    print("- Social media posts")
    print("- Historical documents")

# Research questions to explore
print()
print("=" * 50)
print("RESEARCH QUESTIONS TO EXPLORE:")
print("- How does sentiment vary across different cultural genres?")
print("- What emotional patterns distinguish different historical periods?")
print("- How do authors use emotional language to achieve different effects?")
print("- What can computational analysis reveal about cultural texts?")
print("- What are the limitations of automated sentiment analysis?")
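
One limitation is easy to see with the word-list approach from Exercise 5: matching single words ignores context such as negation. The sketch below uses a shortened placeholder word list and a hypothetical helper named `count_positive_words` to show that "good" and "not good" receive the same count.

```python
# Shortened placeholder list for illustration
positive_words = ['happy', 'good', 'great', 'love']

def count_positive_words(text):
    """Count positive words by simple matching, ignoring context."""
    count = 0
    for word in text.lower().split():
        clean_word = word.strip('.,!?;:"()')
        if clean_word in positive_words:
            count = count + 1
    return count

# Both sentences contain "good", so both get the same count
print(count_positive_words("The ending was good."))
print(count_positive_words("The ending was not good."))
```

VADER includes rules for negation, which is one reason its scores can disagree with simple word counts on the same text.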

## Summary

You explored:
- Setting up and using sentiment analysis tools (VADER)
- Creating functions for basic text analysis (word counts, average word and sentence length)
- Analyzing sentiment in classic cultural texts
- Comparing multiple texts across various dimensions
- Identifying emotional language patterns
- Applying analysis to your own texts of interest
- Critical evaluation of computational text analysis

**Key Skills:**
- Text preprocessing and cleaning
- Sentiment analysis with compound scores
- Word-level statistics (counts, average lengths, longest word)
- Emotional content categorization
- Comparative text analysis
- Statistical summary of text features

**Key Insights:**
- Computational tools provide quantitative perspectives on cultural texts
- Sentiment analysis can reveal patterns across large collections
- Emotional language density can vary between genres and authors, though short excerpts like these are too small to generalize from
- Automated analysis complements but doesn't replace careful human interpretation
- Cultural context and literary devices require careful consideration

**Next:** Review 08 will cover data visualization for cultural analysis.

---
 