Pip Install Commands

In [1]:
%pip install shapely


Note: you may need to restart the kernel to use updated packages.


Libraries

In [2]:
import json
import requests
import numpy as np
import pandas as pd
import networkx as nx
from shapely.prepared import prep
from shapely.geometry import mapping, shape, Point

Const Values

In [3]:
YEAR_COLUMN_NAME = "year"
DECADE_COLUMN_NAME = "decade"
SONG_TITLE_COLUMN_NAME = "song_title"
COUNTRY_COLUMN_NAME = "country"
ARTIST_LONGITUDE_COLUMN_NAME = "artist_longitude"
ARTIST_LATITUDE_COLUMN_NAME = "artist_latitude"
ARTIST_LOCATION_COLUMN_NAME = "artist_location"

Loading Songs Dataset

In [4]:
raw_songs_dataset = pd.read_csv("../Data/songs_dataset.csv")

In [5]:
raw_songs_dataset.isna().sum()

song_id                    0
song_title                 2
year                  484270
release                    7
tempo                      0
loudness                   0
duration                   0
song_hotttnesss       417782
artist_id                  0
artist_name                0
artist_latitude       641766
artist_longitude      641766
artist_location       487546
artist_hotttnesss         12
artist_familiarity       185
dtype: int64

In [6]:
raw_songs_dataset.isna().sum().sum()

2673336

Shartil: For now I am going to delete all rows with missing data.<br>
This is an initial approach; let's discuss it together with Elisa.

In [7]:
songs_dataset = raw_songs_dataset.dropna()

In [8]:
len(songs_dataset)

126910
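
Note: a narrower variant of the drop step, sketched on made-up rows. It drops a row only when a column that the later cells read (title, year, coordinates) is missing. The toy frame and the `REQUIRED_COLUMNS` list are assumptions for illustration, not part of the dataset or a decision already taken.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the songs dataset (all values are invented).
toy = pd.DataFrame({
    "song_title": ["A", "B", None, "D"],
    "year": [1994.0, np.nan, 2001.0, 2006.0],
    "artist_latitude": [59.3, 51.5, 40.7, 52.5],
    "artist_longitude": [18.1, -0.1, -74.0, 13.4],
    "song_hotttnesss": [np.nan, 0.4, 0.6, 0.7],
})

# Hypothetical list: only the columns the later steps depend on.
REQUIRED_COLUMNS = ["song_title", "year", "artist_latitude", "artist_longitude"]

# Any missing value in any column removes the row.
dropped_all = toy.dropna()

# Only missing values in the required columns remove the row.
dropped_subset = toy.dropna(subset=REQUIRED_COLUMNS)

print(len(dropped_all), len(dropped_subset))
```

On the toy rows the subset variant keeps one more row, because a missing `song_hotttnesss` no longer disqualifies a song. Whether that trade-off suits the real data is still open.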

Shartil: Adding decade column to dataset

In [9]:
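# The callable receives the whole DataFrame, so the arithmetic runs on the full year column at once.
songs_dataset = songs_dataset.assign(
    **{DECADE_COLUMN_NAME: lambda df: (df[YEAR_COLUMN_NAME].astype(int) // 10) * 10}
)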

In [10]:
min_decade = songs_dataset[DECADE_COLUMN_NAME].min()
max_decade = songs_dataset[DECADE_COLUMN_NAME].max()

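# Step through the decades in increments of 10 so every value is a real decade,
# regardless of how many decades the data spans.
decade_array = np.arange(min_decade, max_decade + 10, 10, dtype=int)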
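
Note: a quick check of the decade arithmetic on made-up years. It also compares two ways of enumerating decades: a fixed step of 10, and a fixed number of evenly spaced points, which lands on non-decade values for most spans. The sample years are invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented release years, stored as floats like the dataset's year column.
years = pd.Series([1994.0, 2006.0, 1998.0, 1959.0])

# Integer-divide by 10 and multiply back to floor each year to its decade.
decades = (years.astype(int) // 10) * 10

# Fixed step: one entry per decade between the minimum and maximum.
step_based = np.arange(decades.min(), decades.max() + 10, 10)

# Fixed count: always 10 points, whatever the span is.
count_based = np.linspace(decades.min(), decades.max(), 10, dtype=int)

print(decades.tolist())
print(step_based.tolist())
print(count_based.tolist())
```

Here the span is 1950 to 2000, so the step-based array has six entries, while the count-based array contains values that are not multiples of 10.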

Najeeb: Introducing a new column "country" based on Latitude and Longitude.

In [11]:
# Load the GeoJSON country boundaries from a local file
with open("../Data/countries.geojson.json", "r", encoding="utf-8") as file:
    geojson_data = json.load(file)

countries = {}
for feature in geojson_data["features"]:
    geom = feature["geometry"]
    country = feature["properties"]["ADMIN"]
    countries[country] = prep(shape(geom))

# Function to get country name from latitude and longitude
def get_country(lon, lat):
    point = Point(lon, lat)
    for country, geom in countries.items():
        if geom.contains(point):
            return country

    return "unknown"

# Apply the function to create a new 'country' column
songs_dataset[COUNTRY_COLUMN_NAME] = songs_dataset.apply(
    lambda row: get_country(
        row[ARTIST_LONGITUDE_COLUMN_NAME],
        row[ARTIST_LATITUDE_COLUMN_NAME],
    ),
    axis=1,
)
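
Note: a minimal sketch of the point-in-polygon lookup on two invented square "countries", to show how the prepared geometries behave. The feature names, coordinates, and the `lookup_country` helper are assumptions for illustration; only the `ADMIN` property key mirrors the cell above.

```python
from shapely.geometry import Point, shape
from shapely.prepared import prep

# Two invented square "countries" in GeoJSON form; coordinates are (lon, lat).
toy_features = [
    {"properties": {"ADMIN": "Westland"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]]}},
    {"properties": {"ADMIN": "Eastland"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[10, 0], [20, 0], [20, 10], [10, 10], [10, 0]]]}},
]

# Preparing a geometry speeds up repeated containment tests against it.
toy_countries = {
    feature["properties"]["ADMIN"]: prep(shape(feature["geometry"]))
    for feature in toy_features
}

def lookup_country(lon, lat, prepared_countries):
    """Return the first country whose polygon contains the point, else "unknown"."""
    point = Point(lon, lat)  # x is longitude, y is latitude
    for name, geom in prepared_countries.items():
        if geom.contains(point):
            return name
    return "unknown"

print(lookup_country(5, 5, toy_countries))
print(lookup_country(15, 5, toy_countries))
print(lookup_country(10, 5, toy_countries))
```

The last call uses a point that sits exactly on the shared border. `contains` excludes boundary points, so the lookup falls through to "unknown"; coordinates on a coastline or border in the real data may behave the same way.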

Shartil: Deleting redundant columns 

In [12]:
songs_dataset = songs_dataset.drop(
    [
        ARTIST_LATITUDE_COLUMN_NAME,
        ARTIST_LONGITUDE_COLUMN_NAME,
        ARTIST_LOCATION_COLUMN_NAME
    ], 
    axis=1)

songs_dataset.head()

Unnamed: 0,song_id,song_title,year,release,tempo,loudness,duration,song_hotttnesss,artist_id,artist_name,artist_hotttnesss,artist_familiarity,decade,country
1,SOGTUKN12AB017F4F1,No One Could Ever,2006.0,Butter,177.768,-2.06,138.97098,0.617871,ARGEKB01187FB50750,Hudson Mohawke,0.437504,0.643681,2000,United Kingdom
14,SOSDCFG12AB0184647,006,1998.0,Lena 20 År,122.332,-3.925,262.26893,0.212045,ARSB5591187B99A848,Lena Philipsson,0.410229,0.529819,1990,Sweden
15,SOBARPM12A8C133DFF,(Looking For) The Heart Of Saturday,1994.0,Cover Girl,99.214,-14.379,216.47628,0.270776,ARDW5AW1187FB55708,Shawn Colvin,0.446733,0.685503,1990,United States of America
16,SOKOVRQ12A8C142811,Ethos of Coercion,2009.0,Descend Into Depravity,189.346,-6.366,196.0224,0.614766,ARGWPP11187B9AEF43,Dying Fetus,0.511976,0.734471,2000,United States of America
32,SOOLRHW12A8C142643,All of the same blood,2001.0,Violent revolution,191.665,-6.663,372.4273,0.788727,AR79L0D1187FB3AFB6,Kreator,0.472691,0.740252,2000,Germany


Shartil: Now I am going to create the graph

In [13]:
music_graph = nx.DiGraph()

In [14]:
music_graph.add_nodes_from(decade_array.tolist())
music_graph.add_nodes_from(songs_dataset[COUNTRY_COLUMN_NAME].unique().tolist())
music_graph.add_nodes_from(songs_dataset[SONG_TITLE_COLUMN_NAME].tolist())

In [15]:
relationships = []
for index, row in songs_dataset.iterrows():
    current_song_title = row[SONG_TITLE_COLUMN_NAME]
    current_decade = row[DECADE_COLUMN_NAME]
    current_country = row[COUNTRY_COLUMN_NAME]

    relationships.append((current_decade, current_song_title, {"label": "release_decade"}))
    relationships.append((current_country, current_song_title, {"label": "release_country"}))

music_graph.add_edges_from(relationships)

In [16]:
print(music_graph)

DiGraph with 107405 nodes and 230895 edges
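
Note: the node count above is smaller than the number of rows, which would be consistent with songs that share a title collapsing into one node. The sketch below reproduces that effect on invented rows and compares it with keying song nodes by `song_id`. The toy data and the `build_graph` helper are assumptions for illustration, not the notebook's approach.

```python
import networkx as nx
import pandas as pd

# Invented rows: two different songs share the title "Intro".
toy = pd.DataFrame({
    "song_id": ["S1", "S2", "S3"],
    "song_title": ["Intro", "Intro", "Waterloo"],
    "decade": [1990, 2000, 1970],
    "country": ["Sweden", "Germany", "Sweden"],
})

def build_graph(frame, song_key):
    """Link each decade and country node to its songs, keyed by the given column."""
    graph = nx.DiGraph()
    for row in frame.itertuples(index=False):
        song = getattr(row, song_key)
        graph.add_edge(int(row.decade), song, label="release_decade")
        graph.add_edge(row.country, song, label="release_country")
    return graph

by_title = build_graph(toy, "song_title")
by_id = build_graph(toy, "song_id")

print(by_title.number_of_nodes(), by_id.number_of_nodes())

# With title keys, "Intro" is reachable from 1990 (song S1) and from Germany (song S2).
print(set(by_title[1990]) & set(by_title["Germany"]))
print(set(by_id[1990]) & set(by_id["Germany"]))
```

With title keys, a decade and country query can match a title even when no single song satisfies both. Keying by `song_id` and storing the title as a node attribute would avoid that, at the cost of an extra lookup when printing results.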


In [17]:
def get_songs_by_criteria(music_graph, given_criteria):
    # Songs are the successors of a decade or country node in the directed graph.
    return list(music_graph.successors(given_criteria))

In [18]:
decade_input = 1990

decade_songs = get_songs_by_criteria(music_graph, decade_input)
decade_songs

['006',
 '(Looking For) The Heart Of Saturday',
 'One Little Too Little',
 'Wonderful Stash',
 'Mule Boogie',
 'AcroyearII',
 'Day',
 "When You're Sick With The Blues",
 "Don't Stop Honey (feat. Cedric Burnside)",
 'Sunrise (Album)',
 "You Can't Move Into My House",
 'Moonlight',
 'Never Knew Love',
 "Don't You (Forget About Me) (Album Version)",
 'She Wishes I Were You',
 "Love Don't Go Through No Changes On Me",
 'Alguien La Vió Partir',
 'Come To The Bower',
 'Salt & Velvet',
 'Passages',
 'Awakening',
 'Liberation',
 'Crush',
 'Logan Braes',
 'La fuga de Ruben',
 'Country Music / Mexican Cowboy / Tough Cowboy',
 'Under Your Spell',
 'Blue Moon Nights',
 'Breathtaker',
 'K.T.',
 'How Important Can It Be',
 'How Can You Live',
 'Give Me Your Word',
 'All The Time',
 'Years',
 'Peaceful Day',
 'No More Seances',
 'Alone',
 'Why Took Your Advice',
 'Tainted Past',
 'Motherless Child',
 'Für immer und ewig',
 'The First One To Love You',
 'Love Is A Lonely Street',
 'All the things you 

In [19]:
country_input = "Sweden"

country_songs = get_songs_by_criteria(music_graph, country_input)
country_songs

['006',
 'Day',
 'En Sten Vid En Sjö I En Skog',
 'Grand finale',
 'Oh My God What Have I Done?',
 'Yours To Keep',
 'Heading north (intro)',
 "Stealing Notes From The Devil's Notebook",
 'Greed',
 "Let's Get Bleeped Tonight",
 'Microphone',
 'Bleed',
 'Lemuria',
 'Spine',
 'My Love (Song for a Butterfly)',
 'Swedish Sin',
 'Torn',
 'När Ska Jag Få Se Dig Naken?',
 'Pissed and Poor',
 'Eating Me Slowly',
 'The Pretty Ones',
 'Waterloo',
 'Anorak Christmas (Alexander Robotnick Remix)',
 'Different Sound',
 'Strings Of Grass',
 'Embraced',
 'A Window',
 "I've Been Having Some Strange Dreams",
 'Science',
 "Peter's Dream",
 "We'Re Not Gonna Take It",
 'Do You Remember The Riots?',
 'Iron cage',
 'The Contaminated Void',
 'I Saw You on TV',
 'For My Demons',
 'Into Deep Sleep',
 'Bister Verklighet (No Security Cover)',
 'Idiots',
 'Clean Today',
 'The Promise Of Deceit',
 'Underground Radio (Album Version)',
 'In control',
 'Lost',
 'Openings To Stories',
 'At The Gates',
 'Blood Of The Su

Shartil: Now let's get the intersection of the lists<br>
This code was taken from this [StackOverflow answer](https://stackoverflow.com/a/3697438/9609586)

In [20]:
print("The songs from Sweden that were released in the 1990s:")

result_list = list(set(decade_songs) & set(country_songs))
print(result_list)

The songs from Sweden that were released in the 1990s:
['Carnal tomb', 'Dancing December', 'Tonight', 'Come On', 'Elephant', 'Sweet Thing', 'Take Me Away', 'Skin', 'Future Breed Machine', 'A suburb to hell', 'My Friend', 'Paralyzing Ignorance', 'Beneath', 'Bonus Tracks: (Live In Stockholm Recorded By P3 Live 1993) Carnal Tomb', 'In A Dream', 'The games we play', '12', 'Concatenation', 'Bonus Tracks: (Live In Stockholm Recorded By P3 Live 1993) Reborn In Blasphemy', 'Coolidge (Album Version)', 'Blinded By Fear', 'The Goodbye Look', '(Kom Så Ska Vi) Leva Livet', 'Lucy', 'Reborn in blasphemy', 'Untitled', '...Have Another One', 'Saw You Drown', 'No Hope_ No Future_ No Second Chance', "Fazil's Friend (Album Version)", 'Runaway', 'Hellsbells', "Inside what's within behind", 'The Red In The Sky Is Ours / The Season To Come', 'Unto Others', 'Intro', 'Ceremonial comedy', 'Get Over It', 'Clouds', 'Cold Ways', 'Come To Me', 'Standing In My Rain', 'Kid', 'Leona (Album Version)', '(You)', 'Prey', 'Diya
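
Note: a small sketch of the same intersection written against the graph directly, so that any number of criteria can be combined in one call. The toy graph and the `songs_matching_all` helper are hypothetical names introduced for illustration.

```python
import networkx as nx

# Invented graph: decade and country nodes point at song nodes.
toy_graph = nx.DiGraph()
toy_graph.add_edges_from([
    (1990, "Song A"), ("Sweden", "Song A"),
    (1990, "Song B"), ("Germany", "Song B"),
    (2000, "Song C"), ("Sweden", "Song C"),
])

def songs_matching_all(graph, *criteria):
    """Return the songs that are successors of every criterion node, sorted."""
    song_sets = [set(graph.successors(criterion)) for criterion in criteria]
    if not song_sets:
        return []
    return sorted(set.intersection(*song_sets))

print(songs_matching_all(toy_graph, 1990, "Sweden"))
```

Sorting the result keeps the output stable between runs, since set ordering is not guaranteed.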