# Web Mining and Applied NLP (44-620)

## Requests, JSON, and NLP

### Student Name: Amanda Hanway
### Repository: https://github.com/mandi1120/json-sentiment

Perform the tasks described in the Markdown cells below.  When you have completed the assignment make sure your code cells have all been run (and have output beneath them) and ensure you have committed and pushed ALL of your changes to your assignment repository.

Make sure you have [installed spaCy and its pipeline](https://spacy.io/usage#quickstart) and [spaCyTextBlob](https://spacy.io/universe/project/spacy-textblob)

Every question that requires you to write code will have a code cell underneath it; you may either write your entire solution in that cell or write it in a python file (`.py`), then import and run the appropriate code to answer the question.

This assignment requires that you write additional files (either JSON or pickle files); make sure to submit those files in your repository as well.

### QUESTION 1

1. The following code accesses the [lyrics.ovh](https://lyricsovh.docs.apiary.io/#reference/0/lyrics-of-a-song/search) public api, searches for the lyrics of a song, and stores it in a dictionary object.  Write the resulting json to a file (either a JSON file or a pickle file; you choose). You will read in the contents of this file for future questions so we do not need to frequently access the API.

In [176]:
import requests
import json, xmltodict

# Note: lyrics.ovh no longer works. Using chartLyrics API for this exercise.

# chartlyrics api result is xml format
result = requests.get('http://api.chartlyrics.com/apiv1.asmx/SearchLyricDirect?artist=tracy%20chapman&song=fast%20car').text
print("Chartlyrics api result:", type(result))
print(result)
print("\n-------------------------------------\n")
      
# convert xml format to dict
o = xmltodict.parse(result)
xmlstr = json.dumps(o)  
res = json.loads(xmlstr)
print("Converted type:",type(res))
print(res)

# Write the resulting jsonDict to a file  
with open("lyrics.json", "w") as outfile:
    json.dump(res, outfile)   
    

#result = json.loads(requests.get('https://poetrydb.org/author,title/Edgar Allan Poe;A Dream Within A Dream').text)
#print(type(result), result)
# convert list object to dictionary
#jsonDict = {}
#for i in result: jsonDict.update(i)
#print(jsonDict)
# Write the resulting jsonDict to a file  
#with open("poem.json", "w") as outfile:
#    json.dump(jsonDict, outfile)                                 
                                

Chartlyrics api result: <class 'str'>
<?xml version="1.0" encoding="utf-8"?>
<GetLyricResult xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://api.chartlyrics.com/">
  <TrackId>0</TrackId>
  <LyricChecksum>d06ed0b5e5eab7810b693a10a292bb46</LyricChecksum>
  <LyricId>4963</LyricId>
  <LyricSong>Fast Car</LyricSong>
  <LyricArtist>Tracy Chapman</LyricArtist>
  <LyricUrl>http://www.chartlyrics.com/fIEpEYpIlkCAwXf8GxB8kw/Fast+Car.aspx</LyricUrl>
  <LyricCovertArtUrl>http://ec1.images-amazon.com/images/P/B000002H5I.01.MZZZZZZZ.jpg</LyricCovertArtUrl>
  <LyricRank>9</LyricRank>
  <LyricCorrectUrl>http://www.chartlyrics.com/app/correct.aspx?lid=NAA5ADYAMwA=</LyricCorrectUrl>
  <Lyric>You've got a fast car
I wanna a ticket to anywhere
Maybe we make a deal
Maybe together we can get somewhere
Any place is better
Starting from zero, got nothing to lose
Maybe we'll make something
Me, myself, I've got nothing to prove

You've got a fast 
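
A hand-written query string is easy to get wrong: a space must be written as `%20` (or `+`), and a `%` followed by two hex digits is read as an escape sequence. The sketch below shows one way to build the URL with the standard library instead. The helper name `build_lyrics_url` is hypothetical; the endpoint is the one used in the cell above.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the request in the cell above.
BASE_URL = "http://api.chartlyrics.com/apiv1.asmx/SearchLyricDirect"


def build_lyrics_url(artist, song):
    """Return the search URL with artist and song percent-encoded."""
    # quote_via=quote encodes spaces as %20 instead of the default '+'.
    query = urlencode({"artist": artist, "song": song}, quote_via=quote)
    return f"{BASE_URL}?{query}"


print(build_lyrics_url("tracy chapman", "fast car"))
```

Passing the same dictionary to `requests.get(BASE_URL, params=...)` should have a similar effect, since `requests` encodes the parameters itself.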

### QUESTION 2

2. Read in the contents of your file.  Print the lyrics of the song (not the entire dictionary!) and use spaCyTextBlob to perform sentiment analysis on the lyrics.  Print the polarity score of the sentiment analysis.  Given that the range of the polarity score is `[-1.0,1.0]` which corresponds to how positive or negative the text in question is, do you think the lyrics have a more positive or negative connotation?  Answer this question in a comment in your code cell.

In [177]:
import json
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

# Opening JSON file
with open('lyrics.json') as openfile:
    # Reading from json file
    json_object = json.load(openfile)
    
#print(type(json_object), json_object)
print("Lyrics: ", json_object['GetLyricResult']['Lyric'])

# use spaCyTextBlob to perform sentiment analysis on the lyrics.  
nlp = spacy.load('en_core_web_sm')
text = str(json_object['GetLyricResult']['Lyric'])
nlp.add_pipe("spacytextblob")
doc = nlp(text)

# Print the polarity score of the sentiment analysis.  
print("")
print("Polarity score:", doc._.blob.polarity)
polarity = doc._.blob.polarity
# A score of exactly 0 is neither positive nor negative
if polarity > 0:
    sentiment = "Positive"
elif polarity < 0:
    sentiment = "Negative"
else:
    sentiment = "Neutral"
print("Sentiment:", sentiment)

# Given that the range of the polarity score is `[-1.0,1.0]` 
# which corresponds to how positive or negative the text in question is, 
# do you think the lyrics have a more positive or negative connotation?  
# Answer this question in a comment in your code cell.
######## This song has a positive sentiment with a polarity score of 0.15995011086474506
print("Result: This song has a", sentiment, "connotation with a polarity score of", polarity)


Lyrics:  You've got a fast car
I wanna a ticket to anywhere
Maybe we make a deal
Maybe together we can get somewhere
Any place is better
Starting from zero, got nothing to lose
Maybe we'll make something
Me, myself, I've got nothing to prove

You've got a fast car
I've got a plan to get us out of here
Been working at the convenience store
Managed to save just a little bit of money
Won't have to drive too far
Just cross the border and into the city
You and I can both get jobs
And finally see what it means to be living

See my old man's got a problem
Live with the bottle, that's the way it is
He says his body's too old for working
His body's too young, to look like his
When mama went off and left him
She wanted more from life than he could give
I said somebody's got to take care of him
So I quit school and that's what I did

You've got a fast car
Is it fast enough so we can fly away?
We gotta make a decision
Leave tonight or live and die this way

Say remember when we were driving, drivi
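
A polarity score close to zero carries little signal, so a strict positive/negative split can overstate weak results. As a minimal sketch, the score could be mapped to a label with a neutral band around zero. The function name `label_polarity` and the `tolerance` value of 0.05 are assumptions chosen for illustration, not part of spaCyTextBlob.

```python
def label_polarity(score, tolerance=0.05):
    """Map a polarity score in [-1.0, 1.0] to a coarse label.

    `tolerance` is a hypothetical cutoff chosen for illustration only.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"polarity out of range: {score}")
    if score > tolerance:
        return "Positive"
    if score < -tolerance:
        return "Negative"
    return "Neutral"


# Score reported for the lyrics above.
print(label_polarity(0.15995011086474506))
```

With this cutoff the lyrics above would still be labeled positive, while a score such as -0.01 would be reported as neutral rather than negative.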

### QUESTION 3

3. Write a function that takes an artist, song, and filename, accesses the lyrics.ovh api to get the song lyrics, and writes the results to the specified filename.  Test this function by getting the lyrics to any four songs of your choice and storing them in different files.

In [178]:
import requests
import json
import xmltodict

def lyric_to_file(artist, song, filename):     
    
    # chartlyrics api result is xml format
    # pass artist and song as params so requests URL-encodes them
    result = requests.get(
        'http://api.chartlyrics.com/apiv1.asmx/SearchLyricDirect',
        params={'artist': artist, 'song': song},
    ).text
    
    # convert xml format to json dict
    o = xmltodict.parse(result)
    xmlstr = json.dumps(o)  
    res = json.loads(xmlstr)

    # Write the resulting jsonDict to a file  
    with open(filename, "w") as outfile:
        json.dump(res, outfile)   
       
lyric_to_file("tracy chapman", "fast car", "fast_car.json")
lyric_to_file("cranberries", "zombie", "zombie.json")
lyric_to_file("red hot chili peppers", "soul to squeeze", "soul_to_squeeze.json")
lyric_to_file("no doubt", "spiderwebs", "spiderwebs.json")
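
The saved files are only useful if the response actually contained lyrics. The sketch below assumes, without having verified it against the API, that a failed search can leave the `Lyric` field empty or null; in that case wrapping the value in `str()` would pass the literal text `None` to the sentiment analysis. The helper name `load_lyrics` is hypothetical, and the key names follow the saved response shown earlier.

```python
import json
import os
import tempfile


def load_lyrics(filename):
    """Load a saved response and return its lyrics, or raise ValueError."""
    with open(filename) as infile:
        data = json.load(infile)
    # Key names follow the saved chartlyrics response shown earlier.
    lyric = (data.get("GetLyricResult") or {}).get("Lyric")
    if not lyric or not lyric.strip():
        raise ValueError(f"no lyrics found in {filename}")
    return lyric


# Demo with a hypothetical response for a song that was not found.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "missing.json")
    with open(path, "w") as outfile:
        json.dump({"GetLyricResult": {"LyricSong": None, "Lyric": None}}, outfile)
    try:
        load_lyrics(path)
    except ValueError as err:
        print("Skipped:", err)
```

Raising an error keeps a missing song from silently producing a polarity score for a placeholder string.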


### QUESTION 4

4. Write a function that takes the name of a file that contains song lyrics, loads the file, performs sentiment analysis, and returns the polarity score.  Use this function to print the polarity scores (with the name of the song) of the three files you created in question 3.  Does the reported polarity match your understanding of the song's lyrics? Why or why not do you think that might be?  Answer the questions in either a comment in the code cell or a markdown cell under the code cell.

In [189]:
import json
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

def sentimentAnalysis(filename):     
   
    # Opening & reading JSON file
    with open(filename) as openfile:
        json_object = json.load(openfile)

    # use spaCyTextBlob to perform sentiment analysis on the lyrics.  
    nlp = spacy.load('en_core_web_sm')
    text = str(json_object['GetLyricResult']['Lyric'])
    nlp.add_pipe("spacytextblob")
    doc = nlp(text)

    # Report the polarity score of the sentiment analysis.
    polarity = doc._.blob.polarity
    if polarity > 0:
        sentiment = "Positive"
    elif polarity < 0:
        sentiment = "Negative"
    else:
        sentiment = "Neutral"

    print("Song:", json_object['GetLyricResult']['LyricSong'])
    print("Artist:", json_object['GetLyricResult']['LyricArtist'])
    print("Polarity Score:", polarity)
    print("Sentiment Result:", sentiment)
    print("")

    # Return the score, as the question asks
    return polarity

# Call inside a loop so the notebook does not echo the last return value
for lyrics_file in ["fast_car.json", "zombie.json", "soul_to_squeeze.json", "spiderwebs.json"]:
    sentimentAnalysis(lyrics_file)

Song: Fast Car
Artist: Tracy Chapman
Polarity Score: 0.15995011086474506
Sentiment Result: Positive

Song: Zombie
Artist: The Cranberries
Polarity Score: -0.075
Sentiment Result: Negative

Song: Soul to Squeeze
Artist: Red Hot Chili Peppers
Polarity Score: 0.13059523809523813
Sentiment Result: Positive

Song: Spiderwebs
Artist: No Doubt
Polarity Score: -0.014880952380952384
Sentiment Result: Negative



##### Does the reported polarity match your understanding of the song's lyrics? Why or why not do you think that might be?  Answer the questions in either a comment in the code cell or a markdown cell under the code cell.

Song: Fast Car <br/>
The score of 0.16 matches my understanding of the song's lyrics. <br/>
Fast Car is about starting over after facing challenges. The lyrics are not extremely positive, but give a slight sense of hope that aligns with the score.

Song: Zombie<br/>
The polarity score of -0.075 matches my understanding of the song's lyrics. <br/> 
The song "Zombie" is about violence, so it makes sense that it would have a negative score and sentiment.<br/>

Song: Soul to Squeeze<br/>
The polarity score of 0.13 is higher than I would expect for this song.  <br/>
The lyrics refer to mental illness and the song is written about addiction. I would have expected a negative score, and it is interesting the result was on the positive side.

Song: Spiderwebs<br/>
The polarity score of -0.015 matches my understanding of the lyrics.<br/>
The song is about a stalker and about screening phone calls to avoid that person. The slightly negative score matches the song's intention.
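
One possible reason a score can disagree with a song's meaning is that lexicon-based scoring rates individual words and combines them, so context such as a story about addiction is not captured. The sketch below is a toy model of that idea only: the lexicon and its scores are made up, and it is not TextBlob's actual algorithm or data.

```python
# Toy lexicon with made-up scores; NOT TextBlob's real lexicon.
TOY_LEXICON = {"fast": 0.2, "better": 0.5, "old": -0.1, "problem": -0.4}


def toy_polarity(text):
    """Average the scores of known words; unknown words are ignored."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


# A line with no lexicon words scores 0.0 regardless of its meaning.
print(toy_polarity("So I quit school and that's what I did"))
# One mildly positive adjective is enough to tip the whole line positive.
print(toy_polarity("You've got a fast car"))
```

Under this toy model, a sad lyric written in plain words stays near zero, while a few positive adjectives can pull the overall score above zero.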


