# Web Mining and Applied NLP (44-620)

## Requests, JSON, and NLP

### Student Name: Alexandra Coffin
#### Repository Link: https://github.com/accoffin12/json_sentement_analysis/tree/master

Perform the tasks described in the Markdown cells below.  When you have completed the assignment make sure your code cells have all been run (and have output beneath them) and ensure you have committed and pushed ALL of your changes to your assignment repository.

Make sure you have [installed spaCy and its pipeline](https://spacy.io/usage#quickstart) and [spaCyTextBlob](https://spacy.io/universe/project/spacy-textblob)

Every question that requires you to write code will have a code cell underneath it; you may either write your entire solution in that cell or write it in a python file (`.py`), then import and run the appropriate code to answer the question.

This assignment requires that you write additional files (either JSON or pickle files); make sure to submit those files in your repository as well.

In [17]:
# Create and activate a Python virtual environment. 
# Before starting the project, try all these imports FIRST
# Address any errors you get running this code cell 
# by installing the necessary packages into your active Python environment.
# Try to resolve issues using your materials and the web.
# If that doesn't work, ask for help in the discussion forums.
# You can't complete the exercises until you import these - start early! 
# We also import json and pickle (included in the Python Standard Library).

import json
import pickle

import requests
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

print('All prereqs installed.')
!pip list

All prereqs installed.
Package                  Version
------------------------ --------
anyio                    3.7.0
appdirs                  1.4.4
argon2-cffi              21.3.0
argon2-cffi-bindings     21.2.0
arrow                    1.2.3
asgiref                  3.7.2
asttokens                2.2.1
attrs                    23.1.0
backcall                 0.2.0
beautifulsoup4           4.12.2
black                    23.7.0
bleach                   6.0.0
blis                     0.7.9
bokeh                    3.2.1
branca                   0.6.0
catalogue                2.0.8
certifi                  2023.5.7
cffi                     1.15.1
charset-normalizer       3.1.0
click                    8.1.3
colorama                 0.4.6
colorcet                 3.0.1
comm                     0.1.2
confection               0.1.0
contextvars              2.4
contourpy                1.1.0
cycler                   0.11.0
cymem                    2.0.7
debugpy                  1.6.6
dec

1. The following code accesses the lyrics.ovh public api, searches for the lyrics of a song, and stores it in a dictionary object. Write the resulting json to a file (either a JSON file or a pickle file; you choose). You will read in the contents of this file for future questions so we do not need to frequently access the API.

In [18]:
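import json
import os

import lyricsgenius

# Genius client. The access token is read from an environment variable
# (GENIUS_ACCESS_TOKEN is an assumed name) instead of being hardcoded here.
genius = lyricsgenius.Genius(os.environ['GENIUS_ACCESS_TOKEN'])

# Look up the artist and list the first three songs, sorted by title
artist = genius.search_artist("They Might Be Giants", max_songs=3, sort="title")
print(artist.songs)

# Fetch the song and its lyrics
song = artist.song("25 O'Clock")
lyrics = song.lyrics

# Collect the result in a dictionary
song_dict = {
    'artist': 'They Might Be Giants',
    'title': "25 O'Clock",
    'lyrics': lyrics
}

# Write the dictionary to a JSON file so later questions do not need the API
with open('25_OClock_lyrics.json', 'w') as new_file:
    json.dump(song_dict, new_file)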

Searching for songs by They Might Be Giants...

Song 1: "200 Sbemails (for Homestar Runner)"
Song 2: "2082"
Song 3: "25 O’Clock"

Reached user-specified song limit (3).
Done. Found 3 songs.
[Song(id, artist, ...), Song(id, artist, ...), Song(id, artist, ...)]
Searching for "25 O'Clock" by They Might Be Giants...
Done.
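
Question 1 allows either a JSON file or a pickle file. As a minimal sketch of the difference, the example below writes a small dictionary both ways and reads it back. The dictionary contents, the file names, and the `save_and_reload` helper are hypothetical placeholders, not the files produced above.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for the dictionary built above; the lyrics are a placeholder.
example_song = {
    "artist": "They Might Be Giants",
    "title": "25 O'Clock",
    "lyrics": "placeholder text",
}


def save_and_reload(song, directory):
    """Write the dict as JSON and as pickle, then read both copies back."""
    json_path = Path(directory) / "example_lyrics.json"
    pickle_path = Path(directory) / "example_lyrics.pkl"

    # JSON is text: open in text mode
    with open(json_path, "w", encoding="utf-8") as json_file:
        json.dump(song, json_file)
    # pickle is binary: open in binary mode
    with open(pickle_path, "wb") as pickle_file:
        pickle.dump(song, pickle_file)

    with open(json_path, "r", encoding="utf-8") as json_file:
        from_json = json.load(json_file)
    with open(pickle_path, "rb") as pickle_file:
        from_pickle = pickle.load(pickle_file)
    return from_json, from_pickle


with tempfile.TemporaryDirectory() as tmp_dir:
    from_json, from_pickle = save_and_reload(example_song, tmp_dir)
    print(from_json == example_song, from_pickle == example_song)
```

JSON stays human-readable and can be opened by other tools, while pickle is Python-specific and should only be loaded from trusted sources.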


2. Read in the contents of your file.  Print the lyrics of the song (not the entire dictionary!) and use spaCyTextBlob to perform sentiment analysis on the lyrics.  Print the polarity score of the sentiment analysis.  Given that the range of the polarity score is `[-1.0,1.0]`, which corresponds to how positive or negative the text in question is, do you think the lyrics have a more positive or negative connotation?  Answer this question in a comment in your code cell.

In [None]:
# Read the lyrics back from the file written in question 1
import json

import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

with open('25_OClock_lyrics.json', 'r') as file:
    song_data = json.load(file)

lyrics = song_data['lyrics']

print("Lyrics:")
print(lyrics)

# Set up the spaCy pipeline with the spaCyTextBlob component
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe('spacytextblob')
doc = nlp(lyrics)

# Get the sentiment polarity
print('Polarity Score: ', doc._.blob.polarity)

Lyrics:
4 Contributors25 O’Clock LyricsThe urge to take you grows more strong
For time had made me wait too long
Each watch I smash apart
Just adding to my power
Each watch I smash apart
Just bringing near the hour
Of 25 o'clock, that's when you're going to be mine
25 o'clock, we'll be together 'til the end of time
At 25 o'clock

The ticking seconds hear them call
My spell of hours will make you fall
Each timer that I break
Will halt the flowing sands
Each timer that I break
Will put you in my hands
At 25 o'clock, that's when you're going to be mine
25 o'clock, we'll be together 'til the end of time
At 25 o'clock

Each watch I smash apart
Just adding to my power
Each watch I smash apart
Just bringing near the hour
Of 25 o'clock, that's when you're going to be mine
25 o'clock, we'll be together 'til the end of time
Of 25 o'clock, that's when you're going to be mine
25 o'clock, we'll be together 'til the end of time
'Til the end of time
'Til the end
'Til the end of timeSee They Might Be 
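
The first line of the printed lyrics is fused with a page header ("4 Contributors25 O’Clock Lyrics"), and that text is passed to the sentiment analysis along with the song itself. One possible cleanup, sketched below, strips that prefix with a regular expression. The pattern and the `strip_genius_header` helper are assumptions based only on the single output shown above; they may not cover other songs, and the trailing text is left alone because it is cut off in the output.

```python
import re

# Assumed shape of the prefix seen in the output above:
# "<number> Contributors<song title> Lyrics" fused to the first lyric line.
HEADER_PATTERN = re.compile(r"^\d+\s+Contributors.*?Lyrics")


def strip_genius_header(raw_lyrics):
    """Remove the leading header, if present; otherwise return the text unchanged."""
    return HEADER_PATTERN.sub("", raw_lyrics, count=1)


sample = "4 Contributors25 O’Clock LyricsThe urge to take you grows more strong"
print(strip_genius_header(sample))
```

Because the match is non-greedy, a song whose title itself contains the word "Lyrics" would be cut too early, so the result is worth checking by eye.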

3. Write a function that takes an artist, song, and filename, accesses the lyrics.ovh api to get the song lyrics, and writes the results to the specified filename.  Test this function by getting the lyrics to any four songs of your choice and storing them in different files.

In [None]:
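import json
import os

import lyricsgenius


def get_lyrics(artist_name, song_title, file_stem):
    """Look up a song on Genius and write its lyrics to '<file_stem> _lyrics.json'."""
    # The access token is read from an environment variable
    # (GENIUS_ACCESS_TOKEN is an assumed name) instead of being hardcoded.
    genius = lyricsgenius.Genius(os.environ['GENIUS_ACCESS_TOKEN'])
    artist = genius.search_artist(artist_name, max_songs=1, get_full_info=False)
    song = artist.song(song_title)
    # The suffix keeps its leading space so the file names match those read in question 4.
    with open(file_stem + ' _lyrics.json', 'w') as lyrics_file:
        json.dump(song.lyrics, lyrics_file)


get_lyrics("ACDC", "Back in Black", "Back_in_Black")
get_lyrics("Maneskin", "Fear for Nobody", "Fear_for_Nobody")
get_lyrics("Supertramp", "Goodbye Stranger", "Goodbye_Stranger")
get_lyrics("The Score", "Legend", "Legend")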

Searching for songs by ACDC...

Changing artist name to 'AC/DC'
Song 1: "Back in Black"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for songs by Maneskin...

Changing artist name to 'Måneskin'
Song 1: "Beggin’"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for "Fear for Nobody" by Måneskin...
Done.
Searching for songs by Supertramp...

Song 1: "Breakfast in America"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for "Goodbye Stranger" by Supertramp...
Done.
Searching for songs by The Score...

Song 1: "Unstoppable"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for "Legend" by The Score...
Done.


4. Write a function that takes the name of a file that contains song lyrics, loads the file, performs sentiment analysis, and returns the polarity score.  Use this function to print the polarity scores (with the name of the song) of the three files you created in question 3.  Does the reported polarity match your understanding of the song's lyrics? Why or why not do you think that might be?  Answer the questions in either a comment in the code cell or a markdown cell under the code cell.

In [None]:
import json

import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

# Load the pipeline once instead of on every function call
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe('spacytextblob')


def sentiment_analysis(filename):
    """Load lyrics from a JSON file and return their polarity score."""
    with open(filename, 'r') as lyrics_file:
        lyrics = json.load(lyrics_file)
    doc = nlp(lyrics)
    return doc._.blob.polarity


lyric_files = [
    "Back_in_Black _lyrics.json",
    "Fear_for_Nobody _lyrics.json",
    "Goodbye_Stranger _lyrics.json",
    "Legend _lyrics.json",
]

for filename in lyric_files:
    print(f'Polarity Scores of {filename}: {sentiment_analysis(filename)}')

Polarity Scores of Back_in_Black _lyrics.json: -0.009043893959148194
Polarity Scores of Fear_for_Nobody _lyrics.json: -0.06666666666666667
Polarity Scores of Goodbye_Stranger _lyrics.json: 0.18011363636363634
Polarity Scores of Legend _lyrics.json: 0.47500000000000003
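
To make the raw numbers easier to compare, the sketch below maps each reported score onto a coarse label. The scores are copied from the output above; the `label_polarity` helper and its `neutral_band` cutoff of 0.05 are arbitrary assumptions for illustration, not part of spaCyTextBlob.

```python
# Scores copied from the output above.
reported_scores = {
    "Back in Black": -0.009043893959148194,
    "Fear for Nobody": -0.06666666666666667,
    "Goodbye Stranger": 0.18011363636363634,
    "Legend": 0.47500000000000003,
}


def label_polarity(score, neutral_band=0.05):
    """Return a coarse label; neutral_band is a hypothetical cutoff around zero."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


for title, score in reported_scores.items():
    print(f"{title}: {score:.3f} ({label_polarity(score)})")
```

A different cutoff would move Fear for Nobody into the neutral group, which is a reminder that scores this close to zero carry little signal either way.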


## Results of Sentiment Analysis:
I wasn't surprised by the sentiment analysis for Back in Black, especially since the song is a tribute to the band's lead vocalist, Bon Scott, who died at 33. Essentially, the song reflects on how the band has returned while mourning Bon and will continue in his memory. A negative score that is close to neutral isn't a surprise.

Fear for Nobody was also not a surprise, as it revolves around the idea that we shouldn't be afraid of anyone or anything and should face our fears head on. I agree with the slightly negative score because the lyrics revolve around begging and are conflicting in nature.

Goodbye Stranger was a surprise. The song has many conflicting interpretations; for example, some believe it is about the freedom of youth, whereas others believe it is about two people reaching the end of their relationship and realizing that what they were seeking no longer aligned. The lyrics themselves are interesting, as they are light, freeing, and encouraging. I had expected this song to score a little higher. It would appear that "Goodbye", which is used throughout the song, is considered negative and outweighs other lyrics such as "feel no sorrow" and "feel no shame".

Legend is one of those songs that always has a place on a playlist. The theme of the song is a protagonist who wants to be famous and make history; essentially, we are gearing up to work continuously toward becoming legends. This song received the highest score, which is predictable, as generally positive emotions are associated with its driving message of overcoming obstacles.
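
The guess above about Goodbye Stranger, that individual words pull the score down even when the surrounding phrase is encouraging, can be illustrated with a toy word-averaging scorer. This is only a sketch: `TOY_LEXICON`, its values, and `toy_polarity` are invented for illustration and are not the lexicon or the algorithm that spaCyTextBlob uses.

```python
# Hypothetical word scores, invented for this illustration only.
TOY_LEXICON = {"goodbye": -0.5, "sorrow": -0.5, "shame": -0.5, "sweet": 0.5}


def toy_polarity(text):
    """Average the scores of the words found in the toy lexicon (0.0 if none)."""
    words = [word.strip(".,!?\"'").lower() for word in text.split()]
    scores = [TOY_LEXICON[word] for word in words if word in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


# "no" is not in the lexicon, so the negation is ignored and the phrase
# is scored on "sorrow" alone.
print(toy_polarity("feel no sorrow"))
print(toy_polarity("Goodbye stranger, feel no shame"))
```

A scorer like this reads "feel no sorrow" as negative even though a listener hears it as reassuring, which is one plausible reason a word-level score can sit below the song's overall mood.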
