Pip Install Commands

In [82]:
%pip install shapely
%pip install node2vec

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


Libraries

In [83]:
import os
import json
import requests
import numpy as np
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from shapely.prepared import prep
from shapely.geometry import mapping, shape, Point
from node2vec import Node2Vec
from sklearn.cluster import KMeans

Const Values

In [84]:
YEAR_COLUMN = "year"
TEMPO_COLUMN = "tempo"
LOUDNESS_COLUMN = "loudness"
DURATION_COLUMN = "duration"
SONG_HOTTTNESSS_COLUMN = "song_hotttnesss"
ARTIST_HOTTTNESSS_COLUMN = "artist_hotttnesss"
ARTIST_FAMILIARITY_COLUMN = "artist_familiarity"
DECADE_COLUMN = "decade"

NUMERIC_COLUMNS_LIST = [
    YEAR_COLUMN,
    TEMPO_COLUMN,
    LOUDNESS_COLUMN,
    DURATION_COLUMN,
    SONG_HOTTTNESSS_COLUMN,
    ARTIST_HOTTTNESSS_COLUMN,
    ARTIST_FAMILIARITY_COLUMN,
    DECADE_COLUMN
]

SONG_TITLE_COLUMN = "song_title"
COUNTRY_COLUMN = "country"
ARTIST_LONGITUDE_COLUMN = "artist_longitude"
ARTIST_LATITUDE_COLUMN = "artist_latitude"
ARTIST_LOCATION_COLUMN = "artist_location"
ARTIST_ID_COLUMN = "artist_id"
SONG_ID_COLUMN = "song_id"

UNKNOWN_COUNTRY_VALUE = "unknown"
MUSIC_DATA_FOLDER_PATH = "../Music Data/"
MODELS_FOLDER_PATH = "../models/"

In [85]:
def get_attribute_node_name(node_type, node_value):
    return f"{node_type} {node_value}"

Loading Songs & Artists datasets

In [86]:
raw_songs_dataset = pd.read_csv("../Data/songs_dataset.csv")
raw_artists_dataset = pd.read_csv("../Data/artist_terms.csv")

Riaz: Merging the datasets on artist_id.
In the following cell I remove duplicate rows based on `artist_id`, keeping only the first record for each artist.

In [87]:
# Remove duplicates from the artist dataset based on artist_id
raw_artists_dataset = raw_artists_dataset.drop_duplicates(subset=ARTIST_ID_COLUMN, keep='first')

# Merge the datasets on artist_id
raw_music_dataset = pd.merge(raw_songs_dataset, raw_artists_dataset, on=ARTIST_ID_COLUMN, how='left')

In the cell above I merged the datasets on artist_id using a left join.
With how='left', every key from the left dataframe (here, raw_songs_dataset) is kept in the merged dataframe, and the right dataframe (here, raw_artists_dataset) contributes data only for matching keys.

In other words:

- All rows from the left dataframe (raw_songs_dataset) are retained.
- If a matching key (here, artist_id) exists in the right dataframe (raw_artists_dataset), the corresponding data from the right dataframe is added to the merged row.
- If there is no matching key in the right dataframe, the corresponding columns in the merged dataframe are filled with NaN (missing values).
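
The left join described above can be sketched on a toy example. The two small dataframes and their values below are hypothetical stand-ins for the songs and artists tables; `indicator=True` is an optional pandas argument, not used in the notebook, that adds a `_merge` column showing which rows found a match.

```python
import pandas as pd

# Hypothetical miniature versions of the songs and artists tables.
songs = pd.DataFrame({
    "song_id": ["S1", "S2", "S3"],
    "artist_id": ["A1", "A2", "A9"],  # A9 has no entry in the artists table
})
artists = pd.DataFrame({
    "artist_id": ["A1", "A1", "A2"],  # A1 appears twice
    "term": ["pop rock", "indie", "cumbia"],
})

# Keep only the first row per artist so the join cannot multiply song rows.
artists = artists.drop_duplicates(subset="artist_id", keep="first")

# Left join: every song is kept; songs without a matching artist get NaN.
merged = pd.merge(songs, artists, on="artist_id", how="left", indicator=True)
print(merged)
```

Dropping duplicates first matters here: without it, song S1 would appear once per matching artist row and the merged table would have more rows than the songs table.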

In [88]:
raw_music_dataset.head()

Unnamed: 0,song_id,song_title,year,release,tempo,loudness,duration,song_hotttnesss,artist_id,artist_name,artist_latitude,artist_longitude,artist_location,artist_hotttnesss,artist_familiarity,term
0,SOVFVAK12A8C1350D9,Tanssi vaan,1995.0,Karkuteillä,150.778,-10.555,156.55138,0.299877,ARMVN3U1187FB3A1EB,Karkkiautomaatti,,,,0.356992,0.439604,pop rock
1,SOGTUKN12AB017F4F1,No One Could Ever,2006.0,Butter,177.768,-2.06,138.97098,0.617871,ARGEKB01187FB50750,Hudson Mohawke,55.8578,-4.24251,"Glasgow, Scotland",0.437504,0.643681,broken beat
2,SOBNYVR12A8C13558C,Si Vos Querés,2003.0,De Culo,87.433,-4.654,145.05751,,ARNWYLR1187B9B2F9C,Yerba Brava,,,,0.372349,0.448501,cumbia
3,SOHSBXH12A8C13B0DF,Tangle Of Aspens,,Rene Ablaze Presents Winter Sessions,140.035,-7.806,514.29832,,AREQDTE1269FB37231,Der Mystic,,,,0.0,0.0,hard trance
4,SOZVAPQ12A8C13B63C,"Symphony No. 1 G minor ""Sinfonie Serieuse""/All...",,Berwald: Symphonies Nos. 1/2/3/4,90.689,-21.42,816.53506,,AR2NS5Y1187FB5879D,David Montgomery,,,,0.109626,0.361287,ragtime


In [89]:
raw_music_dataset.isna().sum()

song_id                    0
song_title                 2
year                  484270
release                    7
tempo                      0
loudness                   0
duration                   0
song_hotttnesss       417782
artist_id                  0
artist_name                0
artist_latitude       641766
artist_longitude      641766
artist_location       487546
artist_hotttnesss         12
artist_familiarity       185
term                    3767
dtype: int64

In [90]:
raw_music_dataset.isna().sum().sum()

2677103

Shartil: For now I am going to delete all rows with missing data.

In [91]:
music_dataset = raw_music_dataset.dropna()

In [92]:
len(music_dataset)

126903

Shartil: Evenly selecting just over 1000 songs and saving them as the final dataframe

In [93]:
print(music_dataset.shape)

music_dataset = music_dataset.iloc[::120] # keeps every 120th row, giving a dataframe of shape (1058, 16)

print(music_dataset.shape)

(126903, 16)
(1058, 16)
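
As a rough illustration of the stride-based selection, the sketch below applies `iloc[::step]` to a hypothetical 10-row frame and checks the row count against the ceiling of `len / step`. The same arithmetic reproduces the 1058 rows reported above.

```python
import pandas as pd

# Hypothetical 10-row frame standing in for the cleaned dataset.
frame = pd.DataFrame({"value": range(10)})
step = 3

# Take every `step`-th row, starting from the first one.
sampled = frame.iloc[::step]

# A stride of `step` over n rows keeps ceil(n / step) rows.
expected_rows = -(-len(frame) // step)
print(sampled["value"].tolist(), expected_rows)

# Same formula for the notebook's numbers: 126903 rows with a step of 120.
print(-(-126903 // 120))
```

Note that this picks rows at regular positions rather than at random, so the sample inherits whatever ordering the source file has.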


Najeeb: Introducing a new column "country" based on Latitude and Longitude.

This code loads geographical data from a local GeoJSON file, processes it, and determines which country a given pair of latitude and longitude coordinates falls into. The country names extracted from the GeoJSON file are then inserted into a new column in the dataset.

- The 'json.load' function reads the file and converts it into a Python dictionary ('geojson_data').

- An empty dictionary named 'countries' is created to store the geometry associated with each country.

- The script iterates over each feature in the 'features' array of 'geojson_data'. Each feature represents a country.

- For each feature, the geometry ('geom') and the administrative name of the country are extracted.

- 'shape(geom)' builds a shapely geometry from the GeoJSON geometry, and 'prep' wraps it in a prepared geometry that speeds up repeated spatial queries. The prepared geometry is stored in the 'countries' dictionary with the country name as the key.

- A function 'get_country' is defined which takes a longitude ('lon') and a latitude ('lat') as arguments and creates a 'Point' object from the coordinates.

- It then iterates over the 'countries' dictionary and checks whether the point is contained within any of the country geometries using the 'contains' method of the geometry.

- If a containing country is found, the function returns the country's name. If no containing country is found, it returns 'UNKNOWN_COUNTRY_VALUE'.

- A new column in the dataset ('music_dataset') is populated by applying the 'get_country' function to each row. In this way the country column is added to music_dataset based on the latitude and longitude columns.
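
The lookup steps listed above can be tried without the real GeoJSON file. The sketch below is a minimal stand-in: `toy_geojson`, the two square "countries" and the helper name `lookup_country` are all hypothetical, and only the `ADMIN` property name and the shapely calls (`shape`, `prep`, `Point`, `contains`) mirror the notebook.

```python
from shapely.geometry import Point, shape
from shapely.prepared import prep

# Hypothetical GeoJSON with two adjacent squares instead of real borders.
toy_geojson = {
    "features": [
        {"properties": {"ADMIN": "Leftland"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}},
        {"properties": {"ADMIN": "Rightland"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]]}},
    ]
}

# Map each country name to a prepared geometry for fast repeated queries.
toy_countries = {
    feature["properties"]["ADMIN"]: prep(shape(feature["geometry"]))
    for feature in toy_geojson["features"]
}

def lookup_country(lon, lat, default="unknown"):
    """Return the first country whose polygon contains the point."""
    point = Point(lon, lat)  # shapely expects (x, y), i.e. (lon, lat)
    for name, geom in toy_countries.items():
        if geom.contains(point):
            return name
    return default

print(lookup_country(0.5, 0.5), lookup_country(1.5, 0.5), lookup_country(5, 5))
```

The argument order is the detail most worth checking: shapely points take longitude first, so swapping the two columns would silently place most artists in the wrong country or in none.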


In [94]:
# Fetch and process the geojson data from a local file
with open(r'../Data/countries.geojson.json', 'r') as file:
    geojson_data = json.load(file)

countries = {}
for feature in geojson_data["features"]:
    geom = feature["geometry"]
    country = feature["properties"]["ADMIN"]
    countries[country] = prep(shape(geom))

# Function to get country name from latitude and longitude
def get_country(lon, lat):
    point = Point(lon, lat)
    for country, geom in countries.items():
        if geom.contains(point):
            return country

    return UNKNOWN_COUNTRY_VALUE

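# Work on an explicit copy so that adding a column does not trigger pandas' SettingWithCopyWarning
music_dataset = music_dataset.copy()

# Apply the function to create a new 'country' column
music_dataset[COUNTRY_COLUMN] = music_dataset.apply(
    lambda row: get_country(row[ARTIST_LONGITUDE_COLUMN], row[ARTIST_LATITUDE_COLUMN]),
    axis=1
)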

Shartil: Deleting redundant columns 

In [95]:
music_dataset = music_dataset.drop(
    [
        ARTIST_LATITUDE_COLUMN,
        ARTIST_LONGITUDE_COLUMN,
        ARTIST_LOCATION_COLUMN,
        SONG_ID_COLUMN,
        ARTIST_ID_COLUMN
    ], 
    axis=1)

Shartil: Deleting all rows with "unknown" as the country value

In [96]:
music_dataset = music_dataset[music_dataset[COUNTRY_COLUMN] != UNKNOWN_COUNTRY_VALUE]

In [97]:
music_dataset.reset_index(drop=True, inplace=True)
music_dataset.head()

Unnamed: 0,song_title,year,release,tempo,loudness,duration,song_hotttnesss,artist_name,artist_hotttnesss,artist_familiarity,term,country
0,No One Could Ever,2006.0,Butter,177.768,-2.06,138.97098,0.617871,Hudson Mohawke,0.437504,0.643681,broken beat,United Kingdom
1,Don't Save It All For Christmas Day,2004.0,Merry Christmas With Love,127.397,-9.149,273.08363,0.732281,Clay Aiken,0.500596,0.8521,teen pop,United States of America
2,White Lies,2006.0,Rocinate,92.103,-9.323,388.80608,0.417314,Ester Drang,0.330889,0.525616,shoegaze,United States of America
3,Guess Who I Saw In Paris,1999.0,Sugar Me,105.054,-18.484,170.31791,0.368414,Claudine Longet,0.377489,0.563184,easy listening,France
4,No More Birthdays (Phil Spector Folk) / San Fr...,2006.0,Born To Please,95.658,-6.141,280.45016,0.0,Sound Team,0.368423,0.590111,art rock,United States of America


In [98]:
music_dataset.shape

(1035, 12)

Shartil: making sure the final dataset contains 1000 songs

In [99]:
music_dataset = music_dataset.iloc[:1000] # returns dataframe with 1000 songs

print(music_dataset.shape)

(1000, 12)


Shartil: Adding decade column to dataset

In [100]:
# assign() passes the whole dataframe to the lambda, so the parameter is named df
music_dataset = music_dataset.assign(decade=lambda df: (df[YEAR_COLUMN].astype(int) // 10) * 10)
music_dataset.head()

Unnamed: 0,song_title,year,release,tempo,loudness,duration,song_hotttnesss,artist_name,artist_hotttnesss,artist_familiarity,term,country,decade
0,No One Could Ever,2006.0,Butter,177.768,-2.06,138.97098,0.617871,Hudson Mohawke,0.437504,0.643681,broken beat,United Kingdom,2000
1,Don't Save It All For Christmas Day,2004.0,Merry Christmas With Love,127.397,-9.149,273.08363,0.732281,Clay Aiken,0.500596,0.8521,teen pop,United States of America,2000
2,White Lies,2006.0,Rocinate,92.103,-9.323,388.80608,0.417314,Ester Drang,0.330889,0.525616,shoegaze,United States of America,2000
3,Guess Who I Saw In Paris,1999.0,Sugar Me,105.054,-18.484,170.31791,0.368414,Claudine Longet,0.377489,0.563184,easy listening,France,1990
4,No More Birthdays (Phil Spector Folk) / San Fr...,2006.0,Born To Please,95.658,-6.141,280.45016,0.0,Sound Team,0.368423,0.590111,art rock,United States of America,2000


In [101]:
min_decade = music_dataset[DECADE_COLUMN].min()
max_decade = music_dataset[DECADE_COLUMN].max()
decade_array = np.arange(min_decade, max_decade + 10, 10, dtype=int)
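
To make the decade arithmetic concrete, here is a small sketch of the same floor-to-decade rule and the `np.arange` call that lists every decade between the earliest and latest one. The helper name `decade_of` and the sample years are hypothetical.

```python
import numpy as np

def decade_of(year):
    """Floor a year to the start of its decade, e.g. 1999 -> 1990."""
    return (int(year) // 10) * 10

# Hypothetical years, not taken from the dataset.
sample_years = [1956.0, 1999.0, 2006.0, 2010.0]
sample_decades = [decade_of(year) for year in sample_years]

# Adding 10 to the upper bound makes the last decade inclusive.
all_decades = np.arange(min(sample_decades), max(sample_decades) + 10, 10, dtype=int)
print(sample_decades, all_decades.tolist())
```

Because the range is filled in steps of 10, a decade with no songs in the data would still appear in the array.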

Shartil: Saving music_dataset as a CSV file

In [102]:
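# Create the output folder if it is missing (no error if it already exists)
os.makedirs(MUSIC_DATA_FOLDER_PATH, exist_ok=True)

# os.path.join avoids the doubled slash produced by concatenating a path that already ends in "/"
music_dataset.to_csv(os.path.join(MUSIC_DATA_FOLDER_PATH, "music_dataset.csv"), mode='w+')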

Shartil: I will normalize the numeric columns using min max normalization.

In [103]:
def min_max_normalize_column(df, column_name):
    min_val = df[column_name].min()
    max_val = df[column_name].max()
    
    if min_val == max_val:
        raise ValueError("Cannot normalize column when all values are the same.")
    
    df[column_name] = (df[column_name] - min_val) / (max_val - min_val)

In [104]:
normalized_music_dataset = music_dataset.copy()

for numeric_column in NUMERIC_COLUMNS_LIST:
    min_max_normalize_column(normalized_music_dataset, numeric_column)

In [105]:
normalized_music_dataset.head()

Unnamed: 0,song_title,year,release,tempo,loudness,duration,song_hotttnesss,artist_name,artist_hotttnesss,artist_familiarity,term,country,decade
0,No One Could Ever,0.925926,Butter,0.694103,0.999149,0.076051,0.686709,Hudson Mohawke,0.532471,0.612163,broken beat,United Kingdom,0.833333
1,Don't Save It All For Christmas Day,0.888889,Merry Christmas With Love,0.455336,0.804487,0.164387,0.813866,Clay Aiken,0.609258,0.889696,teen pop,United States of America,0.833333
2,White Lies,0.925926,Rocinate,0.288036,0.799709,0.24061,0.463807,Ester Drang,0.402714,0.454948,shoegaze,United States of America,0.833333
3,Guess Who I Saw In Paris,0.796296,Sugar Me,0.349426,0.548151,0.096698,0.409459,Claudine Longet,0.459429,0.504973,easy listening,France,0.666667
4,No More Birthdays (Phil Spector Folk) / San Fr...,0.925926,Born To Please,0.304888,0.887086,0.169239,0.0,Sound Team,0.448395,0.54083,art rock,United States of America,0.833333
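
As a worked example of min-max normalization, x' = (x - min) / (max - min), the sketch below rescales a few years. The helper `min_max_normalized` is a hypothetical variant that returns a new Series instead of modifying the dataframe in place. The bounds 1956 and 2010 are an assumption inferred from the normalized values shown above, not read from the data.

```python
import pandas as pd

def min_max_normalized(series):
    """Return a copy of `series` rescaled linearly to the range [0, 1]."""
    low, high = series.min(), series.max()
    if low == high:
        raise ValueError("Cannot normalize a column whose values are all equal.")
    return (series - low) / (high - low)

# Assumed bounds (1956, 2010) plus three years that appear in the table above.
years = pd.Series([1956.0, 1999.0, 2004.0, 2006.0, 2010.0])
scaled = min_max_normalized(years)
print(scaled.round(6).tolist())
```

Under that assumption, 1999, 2004 and 2006 map to 0.796296, 0.888889 and 0.925926, which matches the year column in the normalized table.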


Shartil: Now I am going to create the graph

In [106]:
music_graph = nx.Graph()

In [107]:
for current_index, current_row in normalized_music_dataset.iterrows():
    node_data_dict = {}

    for current_column in NUMERIC_COLUMNS_LIST:
        node_data_dict[current_column] = current_row[current_column]

    music_graph.add_node(str(current_index), **node_data_dict)

In [108]:
# Shartil: adding attribute-free nodes for the decades, which act as hub nodes,
# i.e., all the songs from the 1950s will be connected to the "decade 1950" node.
# This avoids the complex logic of connecting every pair of songs from the same decade and keeps the resulting graph simpler.
for current_decade in decade_array:
    node_name = get_attribute_node_name(DECADE_COLUMN, current_decade)
    music_graph.add_node(node_name)

Shartil: for now, the graph only has decade nodes & song nodes that contain their matching ID in the dataframe

In [109]:
for index, row in music_dataset.iterrows():
    current_decade = row[DECADE_COLUMN]
    node_name = get_attribute_node_name(DECADE_COLUMN, current_decade)
    music_graph.add_edge(node_name, str(index))

print(music_graph)

Graph with 1007 nodes and 1000 edges
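
The shape of this graph can be seen on a toy version. In the sketch below, the three songs and their decades are hypothetical; the node naming follows the notebook (`str(index)` for songs, `"decade <value>"` for hubs). Since each song has exactly one edge, to its decade hub, two songs from the same decade are two hops apart and songs from different decades are not connected at all.

```python
import networkx as nx

# Hypothetical songs: node name -> decade.
toy_songs = {"0": 1960, "1": 1960, "2": 2000}

toy_graph = nx.Graph()
for song_node, decade in toy_songs.items():
    toy_graph.add_node(song_node)
    # One edge per song, pointing at its decade hub.
    toy_graph.add_edge(f"decade {decade}", song_node)

same_decade_hops = nx.shortest_path_length(toy_graph, "0", "1")
cross_decade_connected = nx.has_path(toy_graph, "0", "2")
component_count = nx.number_connected_components(toy_graph)
print(toy_graph, same_decade_hops, cross_decade_connected, component_count)
```

One likely consequence is that random walks on such a graph never leave a decade, so the learned embeddings would mainly encode decade membership. That would be consistent with the recommendation tables later in the notebook, where every returned song shares the query song's decade.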


Riaz: Node Embedding using node2vec

In [110]:
# Precompute probabilities and generate walks - **ON WINDOWS ONLY WORKS WITH workers=1**
node2vec = Node2Vec(music_graph, dimensions=64, walk_length=10, num_walks=200, workers=4)  # Use temp_folder for big graphs

# Embed nodes
model = node2vec.fit(window=10, min_count=1, batch_words=4)  # Any keywords acceptable by gensim.Word2Vec can be passed, `dimensions` and `workers` are automatically passed (from the Node2Vec constructor)


Saving embedding and model into models folder

In [111]:

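# Create the models folder if it is missing, otherwise saving fails
os.makedirs(MODELS_FOLDER_PATH, exist_ok=True)

# Save embeddings for later use
model.wv.save_word2vec_format(os.path.join(MODELS_FOLDER_PATH, "embedding"))

# Save model for later use
model.save(os.path.join(MODELS_FOLDER_PATH, "node2vec_model"))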


Shartil: get recommendations for given song number in the music dataset, as a test

In [112]:
from sklearn.neighbors import NearestNeighbors

In [113]:
user_song = "166" # song id
user_song_embed = model.wv[user_song]
user_song_embed

array([-0.34169364, -0.86400956,  0.16767538, -0.17801948,  0.4248336 ,
       -0.60456276,  0.7704428 ,  0.00364914, -0.45487988,  1.0203265 ,
        0.50941265,  0.3919508 , -0.40184215,  0.03467988,  0.25006166,
        0.60527587, -0.3454495 ,  0.26351973, -0.4899496 ,  0.697027  ,
        0.5888879 ,  0.28077477,  0.6314045 , -0.539949  ,  0.17636526,
       -0.06273375, -1.0085181 ,  0.06046623,  0.2546999 , -0.80250627,
       -0.3561426 , -0.6014718 ,  0.28601807, -1.0407907 , -0.15811919,
        0.68086964,  0.5099545 , -0.25826266,  0.67081743, -0.8796059 ,
       -0.07466778,  0.2972698 , -0.42388636, -0.5706273 , -0.40145177,
        0.6532367 ,  0.24157041, -0.52275956, -0.14222768, -0.3681192 ,
        0.4605497 ,  0.33238563,  0.44158286,  0.05206081,  0.9527995 ,
       -0.1263529 ,  0.0337824 , -0.5245092 , -0.14589703, -0.62318575,
        0.10469976, -0.15936254, -0.8470526 ,  0.1783036 ], dtype=float32)

In [114]:
nodes = list(music_graph.nodes())
print(len(nodes)) # print for debug reasons

embeddings = np.array([model.wv[str(node)] for node in nodes])
print(embeddings.shape) # print for debug reasons

1007
(1007, 64)


In [115]:
# current song + 5 other
nearestNeighborSelector = NearestNeighbors(n_neighbors=6).fit(embeddings)
distances, indices = nearestNeighborSelector.kneighbors([user_song_embed])

# print for debug reasons
print(distances)
print(indices)

[[0.         0.11209637 0.11573787 0.12141739 0.12750072 0.12871668]]
[[166 958 503 821 145 770]]


In [116]:
song_output = []
for current_index in indices[0]:
    current_row = music_dataset.loc[current_index]
    song_output.append(current_row)

pd.concat(song_output, axis=1).T

Unnamed: 0,song_title,year,release,tempo,loudness,duration,song_hotttnesss,artist_name,artist_hotttnesss,artist_familiarity,term,country,decade
166,GILDED LAMP OF THE COSMOS,1968.0,Behold & See,95.465,-7.527,182.04689,0.60962,Ultimate Spinach,0.430077,0.492727,psychedelic rock,United States of America,1960
958,Reputation (Remastered Album Version),1967.0,Insight Out,161.477,-8.829,158.53669,0.30417,The Association,0.420325,0.596901,rock 'n roll,United States of America,1960
503,Now Twist,1962.0,Jelly Roll King,124.273,-12.762,113.05751,0.238246,Frank Frost,0.334564,0.422169,soul blues,United States of America,1960
821,My Babe,1969.0,Poppa Willie - The Hi Years / 1962-74,195.309,-8.067,144.22159,0.334707,Willie Mitchell,0.354165,0.51103,memphis soul,United States of America,1960
145,Fur Elise/Moonlight Sonata,1968.0,The Beat Goes On,97.238,-15.433,396.12036,0.265861,Vanilla Fudge,0.367345,0.562902,psychedelic rock,United States of America,1960
770,Teachers,1967.0,Songs of L. Cohen / Songs of love and hate / N...,101.936,-12.113,182.54322,0.662568,Leonard Cohen,0.564623,0.800189,folk rock,United States of America,1960
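
One caveat with the lookup above: the positions returned by `kneighbors` index into the `nodes` list, which also contains the decade hub nodes, so a returned position is not guaranteed to be a valid dataframe row. The sketch below shows one possible way to map positions back to node names and keep only songs. The toy nodes, the 2-D embeddings and the helper `neighbor_song_ids` are hypothetical, and it assumes song nodes are the ones whose names are purely numeric.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical node list: three songs plus one decade hub.
toy_nodes = ["0", "1", "2", "decade 1960"]
toy_embeddings = np.array([
    [0.0, 0.0],   # song 0
    [0.0, 1.0],   # song 1
    [5.0, 5.0],   # song 2
    [0.1, 0.0],   # hub, deliberately the closest point to song 0
])

def neighbor_song_ids(query_node, k):
    """Return up to k nearest song ids, skipping hubs and the query itself."""
    hub_count = sum(not name.isdigit() for name in toy_nodes)
    # Request extra neighbours so that k songs remain after filtering.
    n_requested = min(len(toy_nodes), k + 1 + hub_count)
    selector = NearestNeighbors(n_neighbors=n_requested).fit(toy_embeddings)
    query_vector = toy_embeddings[toy_nodes.index(query_node)]
    _, positions = selector.kneighbors([query_vector])
    names = [toy_nodes[position] for position in positions[0]]
    songs = [name for name in names if name.isdigit() and name != query_node]
    return [int(name) for name in songs[:k]]

print(neighbor_song_ids("0", 2))
```

The returned integers can then be passed to `music_dataset.loc[...]`, since the song node names were created from the dataframe index.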


Functions used for user input handling

In [119]:
def get_song_details(query, query_type):
    if query_type in ['decade']:
        query = int(query) 
    filtered = music_dataset[music_dataset[query_type].astype(str).str.lower() == str(query).lower()]
    return filtered

def recommend_songs(song_id):
    song_embedding = model.wv[str(song_id)]
    distances, indices = nearestNeighborSelector.kneighbors([song_embedding])
    similar_songs = music_dataset.iloc[indices[0]]
    return similar_songs

# Function to find the song ID based on the song title
def get_song_id(song_title):
    matching_songs = music_dataset[music_dataset[SONG_TITLE_COLUMN].str.lower() == song_title.lower()]
    if not matching_songs.empty:
        return matching_songs.iloc[0].name  # Return DataFrame index as ID
    return None

def recommend_songs_by_year(year):
    """Recommend similar songs from a specific year."""
    try:
        year = int(year)  # Ensure the year is an integer
    except ValueError:
        print("Please enter a valid year.")
        return
    
    # Filter songs from the specified year
    songs_from_year = music_dataset[music_dataset['year'] == year]
    if songs_from_year.empty:
        print("No songs found from the specified year.")
        return
    
    print(f"\n\nSongs from the year {year}:")
    display(songs_from_year)
    
    # Ask the user to pick one of the listed songs, then look up its nearest neighbours
    song_index = int(input("Enter the index of the song you are interested in: "))
    if song_index in songs_from_year.index:
        user_song_embed = model.wv[str(song_index)]
        distances, indices = nearestNeighborSelector.kneighbors([user_song_embed])
        print("Similar songs:")
        similar_songs_df = music_dataset.loc[indices[0]]
        #print(similar_songs_df[['song_title', 'year', 'release', 'tempo', 'loudness', 'duration', 'song_hotttnesss', 'artist_name', 'artist_hotttnesss', 'artist_familiarity', 'term', 'country', 'decade']].to_string(index=True))
        display(similar_songs_df)
    else:
        print("Invalid song index.")

User Interface

In [126]:
# User interface
print("Query by: song_title, artist name, decade, country, term (genre), or year")
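# Replace spaces with underscores so that "artist name" maps to the artist_name column
query_type = input("Enter your query type: ").strip().lower().replace(" ", "_")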
query = input(f"Enter the {query_type}: ")

if query_type == 'song_title':
    # Interact with the user to get the song title
    user_song_title = query
    song_id = get_song_id(user_song_title)

    if song_id is not None:
        user_song_embed = model.wv[str(song_id)]
        distances, indices = nearestNeighborSelector.kneighbors([user_song_embed])
        print("Similar songs:")
        similar_songs_df = music_dataset.loc[indices[0]]
        #print(similar_songs_df[['song_title', 'year', 'release', 'tempo', 'loudness', 'duration', 'song_hotttnesss', 'artist_name', 'artist_hotttnesss', 'artist_familiarity', 'term', 'country', 'decade']].to_string(index=True))
        display(similar_songs_df)
    else:
        print("Song not found in the dataset.")
elif query_type == 'year':
    recommend_songs_by_year(query)
else:
    # Fetching songs based on query other than song_title
    results = get_song_details(query, query_type)
    if results.empty:
        print("No songs found.")
    else:
        print("Songs found:")
        #print(results[[SONG_TITLE_COLUMN, 'artist_name', query_type]])  # Adjusting based on dataset's column names
        display(results)

        # user to pick a song
        song_index = int(input("Enter the index of the song you are interested in: "))
        if song_index in results.index:
            print("\n\nFinding similar songs...")
            similar_songs = recommend_songs(song_index)
            #print(similar_songs[[SONG_TITLE_COLUMN, 'artist_name']])  # Adjusting based on dataset
            display(similar_songs)
        else:
            print("Invalid song index.")

Query by: song_title, artist name, decade, country, term (genre), or year
Songs found:


Unnamed: 0,song_title,year,release,tempo,loudness,duration,song_hotttnesss,artist_name,artist_hotttnesss,artist_familiarity,term,country,decade
39,Young God (Album),2009.0,Life on Earth,58.286,-21.331,292.72771,0.687363,Tiny Vipers,0.40751,0.635931,folk rock,United States of America,2000
49,I Can't Help But Wonder Where I'm Bound (LP Ve...,1964.0,Ramblin' Boy,166.77,-16.971,223.76444,0.45763,Tom Paxton,0.354905,0.204065,folk rock,United States of America,1960
69,Just A Little Bit Of Rain,1967.0,The Stone Poneys,149.048,-16.587,142.10567,0.358977,The Stone Poneys,0.316696,0.399974,folk rock,United States of America,1960
81,Kiss My Lips,2007.0,Cavalier,154.588,-17.051,296.93342,0.266955,Tom Brosseau,0.35759,0.563581,folk rock,United States of America,2000
116,Forgiven,2001.0,Of Joy & Sorrow,90.679,-11.452,223.18975,0.434838,Denison Witmer,0.480679,0.631127,folk rock,United States of America,2000
158,Once I Was,2009.0,The Village,126.94,-17.262,269.13914,0.21508,Cowboy Junkies_ Various Artists,0.516461,0.69432,folk rock,Canada,2000
171,After Jane,2006.0,Pretty Little Stranger,95.928,-10.633,302.88934,0.385559,Joan Osborne,0.452451,0.728044,folk rock,United States of America,2000
179,Roll On,2002.0,Golden Age Of Radio,71.731,-9.086,259.16036,0.559547,Josh Ritter,0.489464,0.729852,folk rock,United States of America,2000
218,Keep It Part 2 (Inferiority Part 1),1996.0,It Was Like This,93.324,-10.578,233.16853,0.0,Dexy's Midnight Runners,0.489458,0.679794,folk rock,United Kingdom,1990
221,Cicily,2008.0,I Worked On The Ships,220.088,-5.246,234.68363,0.46349,Ballboy,0.36175,0.577875,folk rock,United Kingdom,2000




Finding similar songs...


Unnamed: 0,song_title,year,release,tempo,loudness,duration,song_hotttnesss,artist_name,artist_hotttnesss,artist_familiarity,term,country,decade
171,After Jane,2006.0,Pretty Little Stranger,95.928,-10.633,302.88934,0.385559,Joan Osborne,0.452451,0.728044,folk rock,United States of America,2000
705,Don't Look Down,2005.0,COSMIC TROUBADOUR,110.224,-7.918,220.42077,0.377532,Billy Sheehan,0.428103,0.668202,heavy metal,United States of America,2000
180,Kill Kill Kill,2006.0,New Waves,67.779,-9.601,87.45751,0.531101,The Tough Alliance,0.426898,0.632801,pop rap,Sweden,2000
603,Snakeskin,2005.0,Barabajagal,103.186,-12.665,161.27955,0.37253,Donovan,0.475182,0.697721,folk-pop,United Kingdom,2000
192,Krazy Krush,2002.0,A Little Deeper,91.004,-8.412,222.45832,0.732203,Ms. Dynamite,0.383864,0.589695,grime,United Kingdom,2000
287,Work It Out,2004.0,True To Yourself,162.27,-4.836,258.01098,0.531722,Albert Cummings,0.408246,0.545184,electric blues,United States of America,2000
