# Text Analysis of Jill Scott Lyrics

## Overview

In this notebook, we’ll perform basic text analysis on a snippet of lyrics from a Jill Scott song. It shows how Jupyter Notebooks and Python can be used to explore creative texts such as song lyrics or poetry.

### Learning Objectives

- Tokenize and clean text
- Count word frequency
- Group and categorize words based on themes
- Visualize word counts


## Step 1: Load and View the Lyrics

In [None]:
lyrics = """You had me at hello
I loved you from the start
You gave me butterflies, something in my heart
But then you changed
And now I'm left to rearrange
"""
print(lyrics)


## Step 2: Basic Cleanup and Tokenization

In [None]:
import re

# Convert to lowercase and remove punctuation.
# Note: apostrophes are removed too, so "I'm" becomes "im".
cleaned = re.sub(r"[^\w\s]", "", lyrics.lower())
words = cleaned.split()
print(words)
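
The cleanup step above strips every punctuation character, so contractions lose their apostrophes. One possible variant, shown here as a minimal sketch, extracts tokens with `re.findall` and keeps apostrophes that sit inside a word. The helper name `tokenize_keep_contractions` is illustrative and not part of the original notebook.

```python
import re

def tokenize_keep_contractions(text):
    """Lowercase the text and return word tokens, keeping internal apostrophes."""
    # [a-z0-9]+ matches a run of letters or digits; (?:'[a-z0-9]+)* allows
    # apostrophe-joined continuations such as "i'm" or "don't".
    # Assumption: the text is plain ASCII English; accented letters would be split.
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())

sample = "And now I'm left to rearrange"
print(tokenize_keep_contractions(sample))
# ['and', 'now', "i'm", 'left', 'to', 'rearrange']
```

Whether to keep contractions depends on the analysis: keeping them preserves the original wording, while stripping them makes matching against plain word lists simpler.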


## Step 3: Word Frequency Analysis

In [None]:
from collections import Counter

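# Count how often each token appears
word_counts = Counter(words)
# Show the 10 most frequent words (displayed as the cell's last expression)
word_counts.most_common(10)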
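
In this snippet the most frequent tokens are function words such as "you" and "me", which say little about theme. A common follow-up is to filter out stop words before counting. The sketch below assumes a small hand-picked stop-word list (`STOP_WORDS` is hypothetical and chosen only for this example) and an illustrative helper named `content_word_counts`.

```python
from collections import Counter

# Hypothetical stop-word list, hand-picked for this example only.
# A real analysis would likely use a fuller list.
STOP_WORDS = {"you", "me", "i", "the", "at", "from", "in", "my",
              "to", "and", "but", "then", "now"}

def content_word_counts(tokens, stop_words=STOP_WORDS):
    """Count only the tokens that are not in the stop-word set."""
    return Counter(t for t in tokens if t not in stop_words)

# Tokens from the first two lines of the lyrics, already cleaned
tokens = "you had me at hello i loved you from the start".split()
counts = content_word_counts(tokens)
print(counts.most_common(3))
```

With the function words removed, the remaining counts come from content words such as "hello", "loved", and "start".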


## Step 4: Grouping by Theme (Example Categories)

In [None]:
# Matching is exact, so include "loved", the form that actually appears in the lyrics
love_words = {"love", "loved", "heart", "butterflies"}
change_words = {"changed", "rearrange"}

categorized = {"love": 0, "change": 0, "other": 0}
for word in words:
    if word in love_words:
        categorized["love"] += 1
    elif word in change_words:
        categorized["change"] += 1
    else:
        categorized["other"] += 1

categorized
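
The if/elif chain above works for two themes but grows with each new category. As a minimal sketch, the same counting can be driven by a dictionary that maps each theme to its word set. The names `count_themes` and `themes` are illustrative and not part of the original notebook.

```python
from collections import Counter

def count_themes(tokens, themes):
    """Count tokens per theme; tokens matching no theme are counted as 'other'."""
    # Invert {theme: words} into {word: theme} so each token needs one lookup.
    # If a word is listed under several themes, the last theme listed wins.
    word_to_theme = {w: theme for theme, ws in themes.items() for w in ws}
    counts = Counter({theme: 0 for theme in themes})
    counts["other"] = 0
    for token in tokens:
        counts[word_to_theme.get(token, "other")] += 1
    return counts

themes = {
    "love": {"love", "loved", "heart", "butterflies"},
    "change": {"changed", "rearrange"},
}
tokens = "i loved you from the start but then you changed".split()
print(dict(count_themes(tokens, themes)))
# {'love': 1, 'change': 1, 'other': 8}
```

Adding a theme then only requires a new entry in `themes`, with no change to the counting logic.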


## Step 5: Visualization (Optional)

In [None]:
import matplotlib.pyplot as plt

# Convert the dict views to lists so plt.bar receives plain sequences
labels = list(categorized.keys())
values = list(categorized.values())

plt.bar(labels, values)
plt.title("Categorized Word Counts in Jill Scott Lyrics")
plt.ylabel("Count")
plt.show()
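
The chart above plots the theme categories. The most frequent individual words can be plotted in the same way. The sketch below assumes two illustrative helpers, `top_words` and `plot_top_words`, that are not part of the original notebook; matplotlib is imported inside the plotting function so the data preparation can run without it.

```python
from collections import Counter

def top_words(tokens, n=5):
    """Return (labels, values) for the n most frequent tokens, ready for plt.bar."""
    common = Counter(tokens).most_common(n)
    labels = [word for word, _ in common]
    values = [count for _, count in common]
    return labels, values

def plot_top_words(tokens, n=5):
    """Draw a bar chart of the n most frequent tokens."""
    import matplotlib.pyplot as plt  # lazy import: only needed when plotting
    labels, values = top_words(tokens, n)
    plt.bar(labels, values)
    plt.title(f"Top {len(labels)} Words")
    plt.ylabel("Count")
    plt.show()

tokens = "you had me at hello i loved you from the start you gave me".split()
labels, values = top_words(tokens, n=2)
print(labels, values)
# ['you', 'me'] [3, 2]
```

In a notebook cell, calling `plot_top_words(tokens, n=2)` would draw the chart for the same data.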


## Wrap-Up

This notebook showed how to:
- Process and clean raw song lyrics
- Count and categorize words
- Begin interpreting creative content using simple computational tools

This approach could be extended to entire albums, spoken word, or student-written reflections, which makes it useful in digital humanities, literature, and social science contexts.
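
As a rough sketch of that extension, the same cleanup and counting can be applied to several texts at once and then combined. The `songs` dictionary, its placeholder titles, and the helper names `tokenize` and `album_word_counts` are all hypothetical; the sample text reuses lines from the snippet in Step 1.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase, strip punctuation, and split on whitespace (same cleanup as Step 2)."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()

def album_word_counts(songs):
    """Return (per_song, total): a Counter per title and one combined Counter."""
    per_song = {title: Counter(tokenize(text)) for title, text in songs.items()}
    # Adding Counters merges them by summing the counts of shared words
    total = sum(per_song.values(), Counter())
    return per_song, total

# Placeholder titles; the text is split from the Step 1 snippet for illustration
songs = {
    "track_1": "You had me at hello\nI loved you from the start",
    "track_2": "But then you changed\nAnd now I'm left to rearrange",
}
per_song, total = album_word_counts(songs)
print(total.most_common(1))
# [('you', 3)]
```

Keeping the per-text counters alongside the total makes it possible to compare individual songs against the collection as a whole.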
