# Web Mining and Applied NLP (44-620)

### Requests, JSON, and NLP

#### Student Name: Tesfamariam
#### Nov 18, 2024

Perform the tasks described in the Markdown cells below. When you have completed the assignment, make sure your code cells have all been run (and have output beneath them) and ensure you have committed and pushed ALL of your changes to your assignment repository.

Make sure you have [installed spaCy and its pipeline](https://spacy.io/usage#quickstart) and [spaCyTextBlob](https://spacy.io/universe/project/spacy-textblob).

Every question that requires you to write code will have a code cell underneath it; you may either write your entire solution in that cell or write it in a Python file (`.py`), then import and run the appropriate code to answer the question.

This assignment requires that you write additional files (either JSON or pickle files); make sure to submit those files in your repository as well.

### Q#1.
*The following code accesses the [lyrics.ovh](https://lyricsovh.docs.apiary.io/#reference/0/lyrics-of-a-song/search) public API, searches for the lyrics of a song, and stores it in a dictionary object.  Write the resulting JSON to a file (either a JSON file or a pickle file; you choose). You will read in the contents of this file for future questions so we do not need to frequently access the API.*

In [35]:
import requests
import json

result = json.loads(requests.get('https://api.lyrics.ovh/v1/They Might Be Giants/Birdhouse in your soul').text)
if 'lyrics' in result:
    print(result['lyrics'])
else:
    print("Lyrics not found")


I'm your only friend 
I'm not your only friend 
But I'm a little glowing friend 
But really I'm not actually your friend 
But I am 


Blue canary in the outlet by the light switch 

Who watches over you 

Make a little birdhouse in your soul 

Not to put too fine a point on it 

Say I'm the only bee in your bonnet 

Make a little birdhouse in your soul 



I have a secret to tell 

From my electrical well 

It's a simple message and I'm leaving out the whistles and bells 

So the room must listen to me 

Filibuster vigilantly 

My name is blue canary one note* spelled l-i-t-e 

My story's infinite 

Like the Longines Symphonette it doesn't rest 



Blue canary in the outlet by the light switch 

Who watches over you 

Make a little birdhouse in your soul 

Not to put too fine a point on it 

Say I'm the only bee in your bonnet 

Make a little birdhouse in your soul 



I'm your only friend 

I'm not your only friend 

But I'm a little glowing friend 

But really I'm not actually your f
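
Q#1 also asks for the result to be written to a file, which the cell above does not do. A minimal sketch of that step follows, assuming the `result` dictionary has the shape used above (a `lyrics` key holding the text). The file name `birdhouse.json` and the helper names `save_result` and `load_result` are illustrative, and a short stand-in dictionary replaces the live API call so the example runs offline.

```python
import json
import os
import tempfile


def save_result(result, filename):
    """Write the API result dictionary to a JSON file."""
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(result, f, indent=4)


def load_result(filename):
    """Read the saved dictionary back so later questions can skip the API."""
    with open(filename, "r", encoding="utf-8") as f:
        return json.load(f)


# Stand-in for the dictionary returned by the API (assumed shape).
sample_result = {"lyrics": "Blue canary in the outlet by the light switch"}

# Illustrative file name, written to a temporary directory for this demo.
path = os.path.join(tempfile.mkdtemp(), "birdhouse.json")
save_result(sample_result, path)
print(load_result(path)["lyrics"])
```

In the notebook itself, `save_result(result, "birdhouse.json")` could be called right after the request, and later cells could start from `load_result("birdhouse.json")`.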

### Q#2. 
*Read in the contents of your file.  Print the lyrics of the song (not the entire dictionary!) and use spaCyTextBlob to perform sentiment analysis on the lyrics.  Print the polarity score of the sentiment analysis.  Given that the range of the polarity score is `[-1.0,1.0]` which corresponds to how positive or negative the text in question is, do you think the lyrics have a more positive or negative connotation?  Answer this question in a comment in your code cell.*

In [36]:
import requests
import json
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

def fetch_lyrics(artist, song):
    """Fetches song lyrics from the lyrics.ovh API."""
    url = f"https://api.lyrics.ovh/v1/{artist}/{song}"
    response = requests.get(url)
    if response.status_code == 200:
        return response.json()
    else:
        print(f"Error: Failed to fetch lyrics for {artist} - {song}")
        return None

def save_lyrics_to_json(lyrics_data, filename="lyrics.json"):
    """Saves the lyrics data to a JSON file."""
    with open(filename, "w") as f:
        json.dump(lyrics_data, f, indent=4)
        print(f"Lyrics saved to {filename}")

def perform_sentiment_analysis(lyrics):
    """Performs sentiment analysis on the given lyrics."""
    nlp = spacy.load("en_core_web_sm")  # Load the spaCy language model
    if "spacytextblob" not in nlp.pipe_names:  # Add spacytextblob if not already added
        nlp.add_pipe("spacytextblob")
    doc = nlp(lyrics)  # Process the lyrics
    return doc._.polarity  # Return the polarity score

# Example usage
artist = "They Might Be Giants"
song = "Birdhouse in Your Soul"

# Fetch lyrics
lyrics_data = fetch_lyrics(artist, song)

if lyrics_data:
    # Save lyrics to a JSON file
    save_lyrics_to_json(lyrics_data)

    # Extract and print lyrics (not the entire dictionary)
    lyrics = lyrics_data.get("lyrics", "Lyrics not found")
    print("Lyrics:\n", lyrics)

    # Perform sentiment analysis
    polarity_score = perform_sentiment_analysis(lyrics)
    print(f"\nPolarity Score: {polarity_score}")

    # Comment on the sentiment based on polarity
    if polarity_score > 0:
        print("The lyrics have a more positive connotation.")
    elif polarity_score < 0:
        print("The lyrics have a more negative connotation.")
    else:
        print("The lyrics have a neutral connotation.")

Lyrics saved to lyrics.json
Lyrics:
 I'm your only friend 
I'm not your only friend 
But I'm a little glowing friend 
But really I'm not actually your friend 
But I am 


Blue canary in the outlet by the light switch 

Who watches over you 

Make a little birdhouse in your soul 

Not to put too fine a point on it 

Say I'm the only bee in your bonnet 

Make a little birdhouse in your soul 



I have a secret to tell 

From my electrical well 

It's a simple message and I'm leaving out the whistles and bells 

So the room must listen to me 

Filibuster vigilantly 

My name is blue canary one note* spelled l-i-t-e 

My story's infinite 

Like the Longines Symphonette it doesn't rest 



Blue canary in the outlet by the light switch 

Who watches over you 

Make a little birdhouse in your soul 

Not to put too fine a point on it 

Say I'm the only bee in your bonnet 

Make a little birdhouse in your soul 



I'm your only friend 

I'm not your only friend 

But I'm a little glowing friend

AttributeError: [E046] Can't retrieve unregistered extension attribute 'sentiment'. Did you forget to call the `set_extension` method?
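
The traceback names the extension `sentiment`, while the cell above reads `doc._.polarity`, so the recorded output may come from an earlier run of the cell. Either way, E046 means the requested extension is not registered by the installed spacytextblob release. Below is a sketch of a version-tolerant lookup, under the assumption that 4.x and later releases expose the scores on `doc._.blob` and that 3.x releases expose `doc._.polarity` directly. `extract_polarity` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for a processed spaCy `Doc` so the example runs without a model installed.

```python
from types import SimpleNamespace


def extract_polarity(doc):
    """Return the polarity score from whichever extension is registered."""
    extensions = doc._
    # Assumed newer layout: a TextBlob object stored under doc._.blob.
    blob = getattr(extensions, "blob", None)
    if blob is not None:
        return blob.polarity
    # Assumed older layout: the score stored directly under doc._.polarity.
    polarity = getattr(extensions, "polarity", None)
    if polarity is not None:
        return polarity
    raise AttributeError("No spacytextblob polarity extension is registered.")


# Stand-ins that mimic the two layouts (not real spaCy objects).
newer_doc = SimpleNamespace(_=SimpleNamespace(blob=SimpleNamespace(polarity=0.5)))
older_doc = SimpleNamespace(_=SimpleNamespace(polarity=-0.25))

print(extract_polarity(newer_doc))
print(extract_polarity(older_doc))
```

With a real pipeline, `extract_polarity(nlp(lyrics))` would take the place of `doc._.polarity` in `perform_sentiment_analysis`.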

In [37]:
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

# Load SpaCy's small English model
nlp = spacy.load("en_core_web_sm")

# Add SpacyTextBlob to the pipeline if not already added
if "spacytextblob" not in nlp.pipe_names:
    nlp.add_pipe("spacytextblob")
    print("SpacyTextBlob added to the pipeline.")

# Test with some simple text
doc = nlp("This is a fantastic day!")  # Change this to your lyrics for sentiment analysis
if hasattr(doc._, "polarity"):
    print("Polarity:", doc._.polarity)  # Should output a polarity score
else:
    print("Error: Polarity extension is not available.")


SpacyTextBlob added to the pipeline.
Error: Polarity extension is not available.


### Q#3.
*Write a function that takes an artist, song, and filename, accesses the lyrics.ovh api to get the song lyrics, and writes the results to the specified filename.  Test this function by getting the lyrics to any four songs of your choice and storing them in different files.*

In [39]:
import requests
import json

def get_song_lyrics(artist, song, filename):
    # Build the API URL
    url = f'https://api.lyrics.ovh/v1/{artist}/{song}'
    
    # Fetch the lyrics
    response = requests.get(url)
    
    # Check if the request was successful
    if response.status_code == 200:
        result = response.json()
        
        # If lyrics are found, write them to the specified file
        if 'lyrics' in result:
            lyrics = result['lyrics']
            
            # Create a dictionary to save the lyrics in JSON format
            lyrics_data = {'artist': artist, 'song': song, 'lyrics': lyrics}
            
            # Write the data to a JSON file
            with open(filename, 'w') as json_file:
                json.dump(lyrics_data, json_file, indent=4)
                
            print(f"Lyrics for '{song}' by {artist} saved in {filename}")
        else:
            print("Lyrics not found for the song.")
    else:
        print(f"Failed to fetch lyrics. Status code: {response.status_code}")

# Example usage: Get lyrics for four songs and save them in different files
get_song_lyrics('They Might Be Giants', 'Birdhouse in your soul', 'birdhouse_in_your_soul.json')
get_song_lyrics('The Beatles', 'Hey Jude', 'hey_jude.json')
get_song_lyrics('Adele', 'Hello', 'hello_adele.json')
get_song_lyrics('Taylor Swift', 'Shake it off', 'shake_it_off.json')


Lyrics for 'Birdhouse in your soul' by They Might Be Giants saved in birdhouse_in_your_soul.json
Lyrics for 'Hey Jude' by The Beatles saved in hey_jude.json
Lyrics for 'Hello' by Adele saved in hello_adele.json
Lyrics for 'Shake it off' by Taylor Swift saved in shake_it_off.json
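
The calls above succeed even though the artist and song names contain spaces. Encoding each path segment explicitly makes that step visible and keeps the URL well-formed for other titles. Here is a sketch using the standard library's `urllib.parse.quote`. `build_lyrics_url` is a hypothetical helper, and how the API treats other reserved characters is not something this notebook verifies.

```python
from urllib.parse import quote

BASE_URL = "https://api.lyrics.ovh/v1"


def build_lyrics_url(artist, song):
    """Percent-encode each path segment before joining it to the base URL."""
    return f"{BASE_URL}/{quote(artist, safe='')}/{quote(song, safe='')}"


print(build_lyrics_url("The Beatles", "Hey Jude"))
# https://api.lyrics.ovh/v1/The%20Beatles/Hey%20Jude
```

The returned string could be passed to `requests.get` in `get_song_lyrics`, ideally with a `timeout` argument so a slow response does not hang the notebook.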


### Q#4.
*Write a function that takes the name of a file that contains song lyrics, loads the file, performs sentiment analysis, and returns the polarity score.  Use this function to print the polarity scores (with the name of the song) of the three files you created in question 3.  Does the reported polarity match your understanding of the song's lyrics? Why do you think that might or might not be the case?  Answer the questions in either a comment in the code cell or a markdown cell under the code cell.*

In [40]:
from textblob import TextBlob
import json

# Function to load song lyrics from a file, perform sentiment analysis, and return the polarity score
def analyze_sentiment_from_file(filename):
    # Load the JSON file with song lyrics
    with open(filename, 'r') as json_file:
        lyrics_data = json.load(json_file)
        
    # Extract the lyrics from the file
    lyrics = lyrics_data.get('lyrics', "")
    
    # Perform sentiment analysis using TextBlob
    blob = TextBlob(lyrics)
    
    # Return the polarity score
    return blob.sentiment.polarity

# Test the function with the three files created in Question 3
polarity_birdhouse = analyze_sentiment_from_file('birdhouse_in_your_soul.json')
polarity_hey_jude = analyze_sentiment_from_file('hey_jude.json')
polarity_hello_adele = analyze_sentiment_from_file('hello_adele.json')

# Print the polarity scores with song names
print(f"Polarity of 'Birdhouse in your soul': {polarity_birdhouse}")
print(f"Polarity of 'Hey Jude': {polarity_hey_jude}")
print(f"Polarity of 'Hello': {polarity_hello_adele}")


Polarity of 'Birdhouse in your soul': 0.04505208333333333
Polarity of 'Hey Jude': 0.13194444444444445
Polarity of 'Hello': -0.14109195402298852
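
Q#4 asks whether each score matches a reading of the song. One way to put the numbers into words is to treat scores close to zero as neutral. The sketch below assumes an arbitrary neutral band of ±0.05, which is an illustration and not something TextBlob defines. `describe_polarity` is a hypothetical helper, and the scores are the ones printed above.

```python
def describe_polarity(score, neutral_band=0.05):
    """Map a polarity score in [-1.0, 1.0] to a short label.

    The neutral band is an arbitrary choice made for this illustration.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "roughly neutral"


# Scores copied from the output above.
scores = {
    "Birdhouse in your soul": 0.04505208333333333,
    "Hey Jude": 0.13194444444444445,
    "Hello": -0.14109195402298852,
}

for title, score in scores.items():
    print(f"{title}: {score:+.3f} ({describe_polarity(score)})")
```

A wider or narrower band changes the labels for scores near zero, so the label is only a starting point for the written answer.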

