# Web Mining and Applied NLP (44-620)

## Requests, JSON, and NLP

### Student Name: Hanna Anenia

Perform the tasks described in the Markdown cells below. When you have completed the assignment, make sure your code cells have all been run (and have output beneath them) and that you have committed and pushed ALL of your changes to your assignment repository.

Make sure you have [installed spaCy and its pipeline](https://spacy.io/usage#quickstart) and [spaCyTextBlob](https://spacy.io/universe/project/spacy-textblob).

Every question that requires you to write code will have a code cell underneath it; you may either write your entire solution in that cell or write it in a python file (`.py`), then import and run the appropriate code to answer the question.

This assignment requires that you write additional files (either JSON or pickle files); make sure to submit those files in your repository as well.
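
As a rough illustration of the two file formats mentioned above, the sketch below round-trips a small dictionary through both JSON and pickle. The `record` data and the file names are hypothetical placeholders, not part of the assignment.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical sample data standing in for an API response.
record = {"title": "Example", "lines": ["first line", "second line"]}

# Work in a temporary directory so the sketch leaves nothing in the repository.
workdir = Path(tempfile.mkdtemp())

# JSON is human-readable text, so the file is opened in text mode.
json_path = workdir / "record.json"
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(record, f)
with open(json_path, "r", encoding="utf-8") as f:
    from_json = json.load(f)

# pickle is a Python-specific binary format, so the file is opened in binary mode.
pickle_path = workdir / "record.pkl"
with open(pickle_path, "wb") as f:
    pickle.dump(record, f)
with open(pickle_path, "rb") as f:
    from_pickle = pickle.load(f)

# Both copies should compare equal to the original dictionary.
print(from_json == record, from_pickle == record)
```

JSON is usually the safer choice when the file may be read by other tools or people; pickle files should only be loaded from sources you trust.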

In [1]:
# Create and activate a Python virtual environment. 
# Before starting the project, try all these imports FIRST
# Address any errors you get running this code cell 
# by installing the necessary packages into your active Python environment.
# Try to resolve issues using your materials and the web.
# If that doesn't work, ask for help in the discussion forums.
# You can't complete the exercises until you import these - start early! 
# We also import json and pickle (included in the Python Standard Library).

import json
import pickle

import requests
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

print('All prereqs installed.')
!pip list


All prereqs installed.
Package                   Version
------------------------- ---------------
annotated-types           0.6.0
anyio                     4.2.0
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.3.0
asttokens                 2.4.1
async-lru                 2.0.4
attrs                     23.2.0
Babel                     2.14.0
beautifulsoup4            4.12.3
bleach                    6.1.0
blis                      0.7.11
catalogue                 2.0.10
certifi                   2024.2.2
cffi                      1.16.0
charset-normalizer        3.3.2
click                     8.1.7
cloudpathlib              0.16.0
colorama                  0.4.6
comm                      0.2.1
confection                0.1.4
contourpy                 1.2.0
cycler                    0.12.1
cymem                     2.0.8
debugpy                   1.8.0
deck                      3.0.0
decorator                 5.1.1
defusedxml              




# Question 1

1. The following code accesses the [lyrics.ovh](https://lyricsovh.docs.apiary.io/#reference/0/lyrics-of-a-song/search) public API, searches for the lyrics of a song, and stores the result in a dictionary object. Write the resulting JSON to a file (either a JSON file or a pickle file; you choose). You will read in the contents of this file for future questions so that you do not need to access the API repeatedly.
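
One possible way to avoid repeated API calls is a small "load from disk, otherwise fetch and save" helper. The sketch below is only an illustration: `load_or_fetch`, `fake_fetch`, and `cache.json` are hypothetical names, and the fetch step is replaced by a stand-in function so the example runs without network access.

```python
import json
import tempfile
from pathlib import Path


def load_or_fetch(path, fetch):
    """Return JSON data cached at `path`; call `fetch()` only if the file is missing."""
    path = Path(path)
    if path.exists():
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    data = fetch()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data


# Stand-in for a real API request; it records how often it is called.
calls = []


def fake_fetch():
    calls.append(1)
    return [{"title": "Example", "lines": ["line one", "line two"]}]


cache_file = Path(tempfile.mkdtemp()) / "cache.json"
first = load_or_fetch(cache_file, fake_fetch)   # file missing: fetches and saves
second = load_or_fetch(cache_file, fake_fetch)  # file present: reads from disk
print(len(calls))
```

In a real notebook, `fetch` could be a function that performs the `requests.get` call and returns the parsed response.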

In [9]:
import requests
import json

AUTHOR = 'Oscar Wilde'
POEM = 'CHARMIDES'

URL = f'https://poetrydb.org/author,title/{AUTHOR};{POEM}'
response = requests.get(URL, timeout=10)
response.raise_for_status()  # stop early if the request failed
result = response.json()

# write the result to a JSON file
with open('poem.json', 'w') as file:
    json.dump(result, file)

# Print the poem to get results on the screen
poem = '\n'.join(result[0]['lines'])
print(poem)


I.

He was a Grecian lad, who coming home
With pulpy figs and wine from Sicily
Stood at his galley's prow, and let the foam
Blow through his crisp brown curls unconsciously,
And holding wave and wind in boy's despite
Peered from his dripping seat across the wet and stormy night.

Till with the dawn he saw a burnished spear
Like a thin thread of gold against the sky,
And hoisted sail, and strained the creaking gear,
And bade the pilot head her lustily
Against the nor'west gale, and all day long
Held on his way, and marked the rowers' time with measured song.

And when the faint Corinthian hills were red
Dropped anchor in a little sandy bay,
And with fresh boughs of olive crowned his head,
And brushed from cheek and throat the hoary spray,
And washed his limbs with oil, and from the hold
Brought out his linen tunic and his sandals brazen-soled,

And a rich robe stained with the fishers' juice
Which of some swarthy trader he had bought
Upon the sunny quay at Syracuse,
And was with Tyrian 

# Question 2

2. Read in the contents of your file. Print the lyrics of the song (not the entire dictionary!) and use spaCyTextBlob to perform sentiment analysis on the lyrics. Print the polarity score of the sentiment analysis. Given that the range of the polarity score is `[-1.0, 1.0]`, which corresponds to how positive or negative the text in question is, do you think the lyrics have a more positive or negative connotation? Answer this question in a comment in your code cell.
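
To make the `[-1.0, 1.0]` range concrete, here is a minimal sketch of one way to turn a polarity score into a rough label. The helper name `describe_polarity` and the `neutral_band` threshold of 0.05 are assumptions chosen for illustration; they are not defined by spaCyTextBlob.

```python
def describe_polarity(score, neutral_band=0.05):
    """Map a polarity score in [-1.0, 1.0] to a rough label.

    The width of the neutral band is an arbitrary, illustrative choice.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("polarity must lie in [-1.0, 1.0]")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


# A few hypothetical scores and the labels they receive.
for value in (0.3, 0.0, -0.3):
    print(value, describe_polarity(value))
```

Scores close to zero are best read as weak or mixed sentiment rather than as a firm verdict either way.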

In [2]:
import json
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

# Initialize spaCy
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe('spacytextblob')

# Read in the contents of the JSON file
with open('poem.json', 'r') as file:
    result = json.load(file)

# Extract the poem's text (lyrics)
poem_lines = result[0]['lines']
poem_text = '\n'.join(poem_lines)

# Print the lyrics of the song (poem)
print("Poem Lyrics:\n", poem_text)

# Perform sentiment analysis on the lyrics
doc = nlp(poem_text)
polarity_score = doc._.polarity

# Print the polarity score of the sentiment analysis
print("\nPolarity Score:", polarity_score)

# Given that the polarity score of 0.055555555555555546 falls slightly above the midpoint of 0. This indicates that the tex

Poem Lyrics:
 I.

He was a Grecian lad, who coming home
With pulpy figs and wine from Sicily
Stood at his galley's prow, and let the foam
Blow through his crisp brown curls unconsciously,
And holding wave and wind in boy's despite
Peered from his dripping seat across the wet and stormy night.

Till with the dawn he saw a burnished spear
Like a thin thread of gold against the sky,
And hoisted sail, and strained the creaking gear,
And bade the pilot head her lustily
Against the nor'west gale, and all day long
Held on his way, and marked the rowers' time with measured song.

And when the faint Corinthian hills were red
Dropped anchor in a little sandy bay,
And with fresh boughs of olive crowned his head,
And brushed from cheek and throat the hoary spray,
And washed his limbs with oil, and from the hold
Brought out his linen tunic and his sandals brazen-soled,

And a rich robe stained with the fishers' juice
Which of some swarthy trader he had bought
Upon the sunny quay at Syracuse,
And wa

# Question 3

3. Write a function that takes an artist, song, and filename, accesses the lyrics.ovh API to get the song lyrics, and writes the results to the specified filename. Test this function by getting the lyrics to any four songs of your choice and storing them in different files.
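
Two pieces of this task can be sketched without touching the network: building the request URL and writing a response to disk. In the sketch below, `build_lyrics_url` and `save_payload` are hypothetical helper names, the payload is made-up sample data, and percent-encoding the artist and title is an assumption about what the API accepts rather than documented behavior.

```python
import json
import tempfile
from pathlib import Path
from urllib.parse import quote

BASE_URL = "https://api.lyrics.ovh/v1"


def build_lyrics_url(artist, song):
    """Build the request URL, percent-encoding spaces and other special characters."""
    return f"{BASE_URL}/{quote(artist, safe='')}/{quote(song, safe='')}"


def save_payload(payload, filename):
    """Write a response dictionary to `filename` as JSON."""
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False)


url = build_lyrics_url("Whitney Houston", "I wanna dance with somebody")
print(url)

# Hypothetical payload standing in for a real response.
sample = {"lyrics": "la la la"}
out_file = Path(tempfile.mkdtemp()) / "sample_lyrics.json"
save_payload(sample, out_file)

with open(out_file, "r", encoding="utf-8") as f:
    restored = json.load(f)
```

Saving the whole dictionary as JSON keeps the file format consistent with a `.json` extension; writing only the lyrics string would be better suited to a `.txt` file.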

In [5]:
import requests

# Song by Whitney Houston
artist1 = 'Whitney Houston'
song_title1 = 'I wanna dance with somebody'

# Song by Cyndi Lauper
artist2 = 'Cyndi Lauper'
song_title2 = 'Girls just wanna to have fun'

# Song by Mariah Carey
artist3 = 'Mariah Carey'
song_title3 = 'I Want To Know What Love Is'

# Function to get song lyrics and optionally save them to a file
def get_song_from_api_to_json(artist, song_title, filename=None):
    # Construct the URL for the lyrics.ovh API
    url = f'https://api.lyrics.ovh/v1/{artist}/{song_title}'

    # Fetch lyrics (the timeout keeps a stalled request from hanging the cell)
    response = requests.get(url, timeout=10)

    # Check if the request was successful (status code 200)
    if response.status_code == 200:
        # Parse the JSON response
        data = response.json()

        # Check if lyrics are available
        if 'lyrics' in data:
            lyrics = data['lyrics']

            # If a filename is provided, write the lyrics to it as plain text
            # (despite the function name, the file holds raw lyrics, not JSON)
            if filename:
                with open(filename, 'w', encoding='utf-8') as file:
                    file.write(lyrics)
                print(f"Lyrics saved to {filename}")
            else:
                # If no filename is provided, print the lyrics
                print(f"Lyrics for {song_title} by {artist}:\n{lyrics}\n")
        else:
            print(f"Lyrics not found for the song {song_title} by {artist}.")
    else:
        print("Error:", response.status_code)

# Get lyrics for songs by Whitney Houston, Cyndi Lauper , and Mariah Carey
get_song_from_api_to_json(artist1, song_title1)
get_song_from_api_to_json(artist2, song_title2)
get_song_from_api_to_json(artist3, song_title3)

# Save the lyrics to plain-text files (the function writes raw lyrics, not JSON)
get_song_from_api_to_json("Whitney Houston", "I wanna dance with somebody", "I_wanna_dance_with _somebody_lyrics.txt")
get_song_from_api_to_json("Cyndi Lauper", "Girls just wanna to have fun", "Girls_just_wanna_to_have_fun_lyrics.txt")
get_song_from_api_to_json("Mariah Carey", "I Want To Know What Love Is", "I_Want_To_Know_What_Love_Is_lyrics.txt")

Lyrics for I wanna dance with somebody by Whitney Houston:
Paroles de la chanson I Wanna Dance With Somebody par Whitney Houston
Clock strikes upon the hour
and the sun begins to fade
Still enough time to figure out
how to chase my blues away

I've done alright up till now
it's the light of day that shows me how
And when the night falls
loneliness calls

Oh I wanna dance with somebody
I wanna feel the heat with somebody
Yeah I wanna dance with somebody
With somebody who loves me
Oh I wanna dance with somebody
I wanna feel the heat with somebody

Yeah I wanna dance with somebody
With somebody who loves me

I've been in love
and lost my senses
spinning through the town
Sooner or later the fever ends
and I wind up feeling down

I need a man who'll take a chance
on a love that burns hot enough to last
So when the night falls
my lonely heart calls

Oh I wanna dance with somebody
I wanna feel the heat with somebody
Yeah I wanna dance with somebody
With somebody who loves me

Oh I wanna dance

# Question 4

4. Write a function that takes the name of a file that contains song lyrics, loads the file, performs sentiment analysis, and returns the polarity score. Use this function to print the polarity scores (with the name of the song) of the three files you created in question 3. Does the reported polarity match your understanding of the song's lyrics? Why do you think that might or might not be the case? Answer the questions in either a comment in the code cell or a markdown cell under the code cell.
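
Since the scores should be printed with the name of the song, one option is to derive a readable title from each filename. The sketch below assumes filenames of the form `Some_Title_lyrics.txt`; `song_name_from_filename` and `format_report` are hypothetical helpers, and a dictionary of made-up scores stands in for the real sentiment function so the example runs without spaCy.

```python
from pathlib import Path


def song_name_from_filename(filename, suffix="_lyrics"):
    """Turn 'Some_Title_lyrics.txt' into 'Some Title'."""
    stem = Path(filename).stem
    if stem.endswith(suffix):
        stem = stem[: -len(suffix)]
    # Replace underscores with spaces and collapse any repeated whitespace.
    return " ".join(stem.replace("_", " ").split())


def format_report(filenames, score_file):
    """Return one 'Song name: score' line per file, using `score_file` to score each."""
    return [
        f"{song_name_from_filename(name)}: {score_file(name):.3f}"
        for name in filenames
    ]


# Made-up scores standing in for real sentiment analysis results.
fake_scores = {"Song_One_lyrics.txt": 0.25, "Song_Two_lyrics.txt": -0.1}

for line in format_report(fake_scores, fake_scores.get):
    print(line)
```

In the notebook, `score_file` could be the function that loads a lyrics file and returns its polarity.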

In [7]:
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

# Load the spaCy model and add the SpacyTextBlob component once,
# rather than on every call to the function
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe('spacytextblob')

def get_sentiment_score(filename):
    # Read the lyrics from the file (UTF-8, matching how they were written)
    with open(filename, 'r', encoding='utf-8') as file:
        lyrics = file.read()

    # Process the lyrics with spaCy; the pipeline runs the sentiment analysis
    doc = nlp(lyrics)

    # Return the polarity score of the whole document
    return doc._.polarity

# Test the function with the filenames of the songs
files = ["I_wanna_dance_with _somebody_lyrics.txt", "Perfect_lyrics.txt", "I_Want_To_Know_What_Love_Is_lyrics.txt"]
for file in files:
    polarity_score = get_sentiment_score(file)
    print(f"Polarity score for '{file}': {polarity_score}")

Polarity score for 'I_wanna_dance_with _somebody_lyrics.txt': 0.13629629629629628
Polarity score for 'Perfect_lyrics.txt': 0.46133540372670806
Polarity score for 'I_Want_To_Know_What_Love_Is_lyrics.txt': 0.30959595959595954
