# Web Mining and Applied NLP (44-620)

## Requests, JSON, and NLP

### Student Name: Brandon 

### Repo Link: https://github.com/brandonjbbb/p4

### Sentiment Analysis of Song Lyrics

This project involves performing sentiment analysis on song lyrics to understand their emotional tone. Using the lyrics.ovh API, we retrieved lyrics for several songs, saved them as JSON files, and then analyzed their sentiment using spaCyTextBlob. The polarity score of each song’s lyrics, ranging from -1.0 (very negative) to 1.0 (very positive), indicates the overall emotional leaning of the lyrics. Below are the results and an analysis of how well the polarity score aligns with our understanding of each song.


#### Function to Analyze Lyrics Sentiment

The function `get_lyrics_sentiment(filename)` takes the name of a file containing lyrics in JSON format, loads the lyrics, and performs sentiment analysis. It returns the polarity score, which can be interpreted as follows:

- Polarity > 0: Positive sentiment
- Polarity < 0: Negative sentiment
- Polarity ≈ 0: Neutral or mixed sentiment
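
The interpretation rules above can be turned into a small labeling helper. This is a minimal sketch: the function name `label_polarity` and the `neutral_band` width of 0.05 are assumptions chosen for illustration, not values taken from this project.

```python
def label_polarity(polarity, neutral_band=0.05):
    """Map a polarity score in [-1.0, 1.0] to a coarse sentiment label.

    neutral_band is a hypothetical cutoff: scores within +/- neutral_band
    of zero are treated as neutral or mixed.
    """
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral or mixed"


# The example scores quoted in this notebook
for score in (0.65, -0.05, -0.45):
    print(score, "->", label_polarity(score))
```

A wider or narrower band changes how many songs land in the "neutral or mixed" bucket, so the cutoff is worth tuning against lyrics whose tone is already known.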

#### Results and Analysis

Here are the polarity scores for three songs, along with an interpretation of how well the scores align with the expected sentiment based on each song's themes and lyrics.

1. The following code accesses the [lyrics.ovh](https://lyricsovh.docs.apiary.io/#reference/0/lyrics-of-a-song/search) public API, searches for the lyrics of a song, and stores the result in a dictionary object. Write the resulting JSON to a file (either a JSON file or a pickle file; the choice is yours). You will read in the contents of this file for future questions so we do not need to access the API frequently.

#### 1. Hey Jude by The Beatles

- Polarity Score: 0.65 (example)
- Expected Sentiment: Positive. The song is uplifting and encouraging, with lyrics that aim to provide comfort and reassurance.
- Analysis: The positive polarity score aligns well with the optimistic and supportive tone of the lyrics. The song encourages “Jude” to take a negative situation and make it better, which is generally perceived as uplifting.

In [1]:
import requests
import json

# Access the API and get the lyrics
result = json.loads(requests.get('https://api.lyrics.ovh/v1/They Might Be Giants/Birdhouse in your soul').text)

# Save the result to a JSON file
with open('lyrics_data.json', 'w') as file:
    json.dump(result, file)

print("Lyrics saved to 'lyrics_data.json'")

# To load the data from the JSON file
with open('lyrics_data.json', 'r') as file:
    loaded_data = json.load(file)

print("Loaded data:", loaded_data)


Lyrics saved to 'lyrics_data.json'
Loaded data: {'lyrics': "I'm your only friend \nI'm not your only friend \nBut I'm a little glowing friend \nBut really I'm not actually your friend \nBut I am \n\n\nBlue canary in the outlet by the light switch \n\nWho watches over you \n\nMake a little birdhouse in your soul \n\nNot to put too fine a point on it \n\nSay I'm the only bee in your bonnet \n\nMake a little birdhouse in your soul \n\n\n\nI have a secret to tell \n\nFrom my electrical well \n\nIt's a simple message and I'm leaving out the whistles and bells \n\nSo the room must listen to me \n\nFilibuster vigilantly \n\nMy name is blue canary one note* spelled l-i-t-e \n\nMy story's infinite \n\nLike the Longines Symphonette it doesn't rest \n\n\n\nBlue canary in the outlet by the light switch \n\nWho watches over you \n\nMake a little birdhouse in your soul \n\nNot to put too fine a point on it \n\nSay I'm the only bee in your bonnet \n\nMake a little birdhouse in your soul \n\n\n\nI'm y
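
The request URL in the cell above contains spaces in the artist and song names. The request evidently succeeded, since `requests` percent-encodes such characters when sending, but encoding each path segment explicitly makes the final URL predictable and also handles names that contain a slash. A minimal sketch, assuming a hypothetical helper named `build_lyrics_url`:

```python
from urllib.parse import quote

# Base URL taken from the request in the cell above
BASE_URL = "https://api.lyrics.ovh/v1"


def build_lyrics_url(artist, song):
    """Return the lyrics.ovh URL with both path segments percent-encoded.

    safe="" also encodes "/" so a name such as "AC/DC" stays one segment.
    """
    return f"{BASE_URL}/{quote(artist, safe='')}/{quote(song, safe='')}"


print(build_lyrics_url("They Might Be Giants", "Birdhouse in your soul"))
```

The returned string can be passed directly to `requests.get`.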

2. Read in the contents of your file. Print the lyrics of the song (not the entire dictionary!) and use spaCyTextBlob to perform sentiment analysis on the lyrics. Print the polarity score of the sentiment analysis. Given that the range of the polarity score is `[-1.0, 1.0]`, which corresponds to how negative or positive the text in question is, do you think the lyrics have a more positive or negative connotation? Answer this question in a comment in your code cell.

#### 2. Bohemian Rhapsody by Queen

- Polarity Score: -0.05 (example)
- Expected Sentiment: Neutral to slightly negative. The lyrics express complex emotions, including confusion, regret, and tragedy.
- Analysis: The nearly neutral score reflects the emotional complexity of the song. Its multifaceted narrative contains moments of both melancholy and drama, leading to a mix that averages closer to neutral. A simple sentiment score may not fully capture the intricate shifts in mood but provides a general sense of its darker themes.

In [6]:
import json
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

# Load the spaCy model; the SpacyTextBlob component is added to its pipeline below
nlp = spacy.load("en_core_web_sm")

# Add SpacyTextBlob only if it is not already added to the pipeline
if 'spacytextblob' not in nlp.pipe_names:
    nlp.add_pipe('spacytextblob', last=True)

def get_lyrics_sentiment(filename):
    """
    Loads a JSON file containing song lyrics, performs sentiment analysis,
    and returns the polarity score.
    
    Parameters:
    - filename (str): The name of the file containing the song lyrics in JSON format.
    
    Returns:
    - float: The polarity score of the song's lyrics.
    """
    # Load the lyrics from the file
    with open(filename, 'r') as file:
        data = json.load(file)
        
    lyrics = data.get('lyrics', '')
    if lyrics:
        # Perform sentiment analysis on the lyrics
        doc = nlp(lyrics)
        # Check if polarity is available in the extension
        if hasattr(doc._, 'polarity'):
            return doc._.polarity
        else:
            print("Polarity attribute not found. Ensure SpacyTextBlob is correctly installed.")
            return None
    else:
        print(f"No lyrics found in {filename}")
        return None

# Testing the function with three of the saved files
songs = {
    "Hey Jude": "hey_jude_lyrics.json",
    "Bohemian Rhapsody": "bohemian_rhapsody_lyrics.json",
    "Hello": "hello_lyrics.json"
}

for song, file in songs.items():
    polarity = get_lyrics_sentiment(file)
    print(f"Polarity score for '{song}': {polarity}")


Polarity attribute not found. Ensure SpacyTextBlob is correctly installed.
Polarity score for 'Hey Jude': None
Polarity attribute not found. Ensure SpacyTextBlob is correctly installed.
Polarity score for 'Bohemian Rhapsody': None
Polarity attribute not found. Ensure SpacyTextBlob is correctly installed.
Polarity score for 'Hello': None
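
Every call above returned `None` because `doc._.polarity` was not found. One likely cause, stated here as an assumption rather than a confirmed diagnosis, is a version difference: to my knowledge, spacytextblob 4.0 and later expose results through a `doc._.blob` extension (for example `doc._.blob.polarity`), while earlier releases registered `doc._.polarity` directly. The hypothetical helper `extract_polarity` below tries both locations. It is demonstrated with stand-in objects so that it runs without spaCy installed; in the notebook, the real `Doc` returned by `nlp(lyrics)` would be passed in instead.

```python
from types import SimpleNamespace


def extract_polarity(doc):
    """Return the polarity stored on doc._, or None if neither layout is present.

    Assumed layouts: doc._.blob.polarity (newer spacytextblob) and
    doc._.polarity (older spacytextblob).
    """
    extensions = doc._
    blob = getattr(extensions, "blob", None)
    if blob is not None:
        return blob.polarity
    return getattr(extensions, "polarity", None)


# Stand-in objects that mimic the two assumed layouts (not real spaCy Docs)
newer_style = SimpleNamespace(_=SimpleNamespace(blob=SimpleNamespace(polarity=0.25)))
older_style = SimpleNamespace(_=SimpleNamespace(polarity=-0.5))
no_extension = SimpleNamespace(_=SimpleNamespace())

for fake_doc in (newer_style, older_style, no_extension):
    print(extract_polarity(fake_doc))
```

Inside `get_lyrics_sentiment`, the `hasattr` check could then be replaced with `return extract_polarity(doc)`. Checking the installed spacytextblob version would confirm which layout applies.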


3. Write a function that takes an artist, song, and filename, accesses the lyrics.ovh API to get the song lyrics, and writes the results to the specified filename. Test this function by getting the lyrics to any four songs of your choice and storing them in different files.

#### 3. Hello by Adele

- Polarity Score: -0.45 (example)
- Expected Sentiment: Negative. The song is reflective and melancholic, dealing with themes of regret and longing.
- Analysis: The negative score aligns well with the song's somber tone, as the lyrics delve into feelings of remorse and past regrets. This is expected, given the emotionally heavy themes that characterize the song.

In [3]:
import requests
import json

def save_lyrics_to_file(artist, song, filename):
    """
    Fetches lyrics from the lyrics.ovh API for the specified artist and song,
    then saves the result to the specified filename in JSON format.
    
    Parameters:
    - artist (str): Name of the artist
    - song (str): Name of the song
    - filename (str): The name of the file where the lyrics will be saved
    
    Returns:
    - None
    """
    try:
        # Access the API
        response = requests.get(f'https://api.lyrics.ovh/v1/{artist}/{song}')
        result = response.json()
        
        # Check if lyrics are available
        if 'lyrics' in result:
            # Save the lyrics to a file
            with open(filename, 'w') as file:
                json.dump(result, file)
            print(f"Lyrics for '{song}' by {artist} saved to {filename}")
        else:
            print(f"Lyrics for '{song}' by {artist} not found.")
    except Exception as e:
        print(f"An error occurred: {e}")

# Testing the function with four songs
save_lyrics_to_file("The Beatles", "Hey Jude", "hey_jude_lyrics.json")
save_lyrics_to_file("Queen", "Bohemian Rhapsody", "bohemian_rhapsody_lyrics.json")
save_lyrics_to_file("Adele", "Hello", "hello_lyrics.json")
save_lyrics_to_file("Eminem", "Lose Yourself", "lose_yourself_lyrics.json")


Lyrics for 'Hey Jude' by The Beatles saved to hey_jude_lyrics.json
Lyrics for 'Bohemian Rhapsody' by Queen saved to bohemian_rhapsody_lyrics.json
Lyrics for 'Hello' by Adele saved to hello_lyrics.json
Lyrics for 'Lose Yourself' by Eminem saved to lose_yourself_lyrics.json
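
The four filenames above follow one pattern: the song title in lowercase, words joined by underscores, followed by `_lyrics.json`. Deriving the name from the title avoids typing it twice. A minimal sketch, assuming a hypothetical helper named `lyrics_filename`:

```python
import re


def lyrics_filename(song):
    """Build a filename such as 'hey_jude_lyrics.json' from a song title."""
    # Collapse every run of non-alphanumeric characters into one underscore
    slug = re.sub(r"[^a-z0-9]+", "_", song.lower()).strip("_")
    return f"{slug}_lyrics.json"


for title in ("Hey Jude", "Bohemian Rhapsody", "Hello", "Lose Yourself"):
    print(lyrics_filename(title))
```

With this helper, a call could be written as `save_lyrics_to_file("Adele", "Hello", lyrics_filename("Hello"))`.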


4. Write a function that takes the name of a file that contains song lyrics, loads the file, performs sentiment analysis, and returns the polarity score. Use this function to print the polarity scores (with the name of the song) of the three files you created in question 3. Does the reported polarity match your understanding of the song's lyrics? Why do you think that is or is not the case? Answer the questions in either a comment in the code cell or a markdown cell under the code cell.

The example polarity scores generally align with my understanding of each song's lyrics, though sentiment analysis has limits in capturing the emotional depth of complex songs. The run below returned `None` for all three files because the polarity attribute was not found, so this discussion relies on the example scores listed with each song earlier. Here’s a breakdown:

"Hey Jude":

- Expected vs. Reported Sentiment: The positive polarity score matches the song's uplifting message. "Hey Jude" is straightforwardly encouraging and supportive, so the analysis captures its positivity.
- Explanation: The song’s consistent language of encouragement makes it easy for sentiment analysis to detect a positive tone.

"Bohemian Rhapsody":

- Expected vs. Reported Sentiment: While the analysis may yield a neutral or slightly negative polarity, this doesn’t fully capture the song’s emotional complexity. "Bohemian Rhapsody" contains a range of emotions, including confusion, regret, and drama.
- Explanation: Sentiment analysis tools typically score individual words or phrases and then average them, which works for simpler lyrics but may miss the narrative shifts and layers in complex songs. The result is an averaged polarity that doesn’t entirely match the nuanced sentiment of this piece.

"Hello":

- Expected vs. Reported Sentiment: The negative polarity score aligns with the melancholic, regretful tone of "Hello." The lyrics are consistently somber, making the negative sentiment easy for analysis tools to capture.
- Explanation: The song’s straightforward expression of sorrow and regret suits sentiment analysis methods that detect consistent, clear tones in language.
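
The averaging effect described for "Bohemian Rhapsody" can be illustrated with a toy example. This is a deliberately simplified sketch: the lexicon, its scores, and the function `average_polarity` are made up for illustration, and TextBlob's actual analyzer is more involved (it also accounts for things like intensifiers and negation).

```python
# Hypothetical word scores, invented for this illustration only
TOY_LEXICON = {"happy": 0.75, "love": 0.5, "sad": -0.5, "regret": -0.75}


def average_polarity(words, lexicon=TOY_LEXICON):
    """Average the scores of the words found in the lexicon (0.0 if none)."""
    scores = [lexicon[word] for word in words if word in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


mixed = "happy regret love sad".split()   # strong but opposing emotions
somber = "sad regret".split()             # consistently negative

print(average_polarity(mixed))
print(average_polarity(somber))
```

The mixed text averages out to neutral even though every word is emotionally charged, while the consistently somber text keeps a clearly negative score. This mirrors why a song with shifting moods can score near zero.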

In [7]:
import json
from spacytextblob.spacytextblob import SpacyTextBlob
import spacy

# Load spaCy model and add SpacyTextBlob for sentiment analysis
nlp = spacy.load("en_core_web_sm")

# Ensure SpacyTextBlob is added to the pipeline
if 'spacytextblob' not in nlp.pipe_names:
    nlp.add_pipe('spacytextblob', last=True)
    print("SpacyTextBlob added to pipeline.")
else:
    print("SpacyTextBlob is already in the pipeline.")

def get_lyrics_sentiment(filename):
    """
    Loads a JSON file containing song lyrics, performs sentiment analysis,
    and returns the polarity score.
    
    Parameters:
    - filename (str): The name of the file containing the song lyrics in JSON format.
    
    Returns:
    - float: The polarity score of the song's lyrics.
    """
    # Load the lyrics from the file
    with open(filename, 'r') as file:
        data = json.load(file)
        
    lyrics = data.get('lyrics', '')
    if lyrics:
        # Perform sentiment analysis on the lyrics
        doc = nlp(lyrics)
        
        # Check if polarity attribute exists before attempting to access it
        if hasattr(doc._, 'polarity'):
            return doc._.polarity
        else:
            print("Polarity attribute not found on doc._. Ensure SpacyTextBlob is correctly installed and configured.")
            return None
    else:
        print(f"No lyrics found in {filename}")
        return None

# Testing the function with three of the saved files
songs = {
    "Hey Jude": "hey_jude_lyrics.json",
    "Bohemian Rhapsody": "bohemian_rhapsody_lyrics.json",
    "Hello": "hello_lyrics.json"
}

for song, file in songs.items():
    polarity = get_lyrics_sentiment(file)
    print(f"Polarity score for '{song}': {polarity}")


SpacyTextBlob added to pipeline.
Polarity attribute not found on doc._. Ensure SpacyTextBlob is correctly installed and configured.
Polarity score for 'Hey Jude': None
Polarity attribute not found on doc._. Ensure SpacyTextBlob is correctly installed and configured.
Polarity score for 'Bohemian Rhapsody': None
Polarity attribute not found on doc._. Ensure SpacyTextBlob is correctly installed and configured.
Polarity score for 'Hello': None
