# Eurovision Song Picker
## Made by: Jake Simon
## Databases used were exported from: eschome.net

### Welcome to the "Eurovision Song Picker"! This page collects every single Eurovision song (as of 2019) into a single database, which records the country, year, title, and performer of each song.

The purpose of the program is to give users a list of randomly selected Eurovision songs, drawn either from the entire list of Eurovision songs or from a smaller, customizable list filtered by year and country.

Personally, I am an avid Eurovision history buff, and I enjoy sharing my interest in the contest with those around me. This program is one fun and entertaining way of doing that!

In [1]:
import pandas as pd
import numpy as np

# Importing all the data

In [3]:
# creates data frame that contains every single Eurovision song ever!
first_year = 1956
last_year = 2019

yearly_frames = []

for year in range(first_year, last_year+1):
    name = 'database/Eurovision Song Contest Database ' + str(year) + '.html'
    data = pd.read_html(name)[1].loc[:,'Country':'Song'].copy()

    # stores the year of this contest for all entries in the 'Year' column
    data['Year'] = year

    yearly_frames.append(data)

# DataFrame.append was removed in pandas 2.0, so the yearly tables are joined with pd.concat.
# ignore_index=True gives each song a unique id number (0, 1, 2, ...)
all_data = pd.concat(yearly_frames, ignore_index=True, sort=False)
all_data = all_data[['Country', 'Year', 'Song', 'Performer']]

# Correct any typos in the data here:
all_data.loc[394,'Country'] = 'Morocco' # Originally, it was misspelled as 'Marocco'
#################################

# prints the entire data frame
print_entire_data_frame = False
if print_entire_data_frame:
    print(all_data)
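
The import step boils down to tagging each yearly table with its year and stacking the tables under one running index. Here is a minimal sketch of that idea on two small hand-made tables, with rows copied from the outputs shown in this notebook. The names `songs_1983`, `songs_1986`, and `combined` are made up for illustration, and the real HTML files are assumed to have more columns and rows. It uses `pd.concat`, which also works on pandas 2.0 and later, where `DataFrame.append` is no longer available.

```python
import pandas as pd

# Two tiny hand-made tables standing in for the per-year HTML files.
# The rows are copied from outputs shown in this notebook.
songs_1983 = pd.DataFrame({
    'Country': ['Austria', 'Belgium'],
    'Performer': ['Westend', 'Pas de Deux'],
    'Song': ['Hurricane', 'Rendez-vous'],
})
songs_1986 = pd.DataFrame({
    'Country': ['Ireland', 'Germany'],
    'Performer': ['Luv Bug', 'Ingrid Peters'],
    'Song': ['You Can Count On Me', "Über Die Brücke Geh'n"],
})

yearly_frames = []
for year, data in [(1983, songs_1983), (1986, songs_1986)]:
    data = data.copy()
    data['Year'] = year  # the same year is written into every row
    yearly_frames.append(data)

# ignore_index=True renumbers the rows 0, 1, 2, ... across all years,
# so the index doubles as a unique song id.
combined = pd.concat(yearly_frames, ignore_index=True)
combined = combined[['Country', 'Year', 'Song', 'Performer']]
print(combined)
```

Collecting the tables in a list and concatenating once at the end also avoids rebuilding the full data frame on every pass through the loop.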

# Subsetting (modifying) your Eurovision songs

In [7]:
########## CUSTOM SETTINGS ################################################################

########## Timespan (Years) ##########
timespan = True
# If timespan is true, then the song selection will be between a range of years:
min_year = 1983 # The oldest year you want in your range of songs
max_year = 1993 # The newest year you want in your range of songs

# Otherwise (if timespan==False), songs will be selected from a specific list of years:
range_of_years = [1981, 1983]

########## Countries ##########
limit_countries = False # If you only want a select group of countries, set to 'True'.
# If limit_countries==True, then list the countries you want to select here:
countries = ['Morocco', 'Switzerland']

###########################################################################################
# Implementation of custom settings through code:

# only selects songs between a range of years (given above):
if timespan == True:
    range_of_years = [i for i in range(min_year, max_year+1)]
    
subset_data = all_data.loc[(all_data['Year'].isin(range_of_years)),:]

# limits which countries you want to choose from (if limit_countries is True):
if limit_countries == True:
    subset_data = subset_data.loc[subset_data['Country'].isin(countries),:]

###########################################################################################

# puts original IDs into an actual column and replaces them with indices relative to 
# the newly-made subset
subset_data['ID'] = subset_data.index # Triggers pandas' SettingWithCopyWarning (shown in the output below), but doesn't cause any problems here.
subset_data = subset_data.reset_index(drop=True)

###########################################################################################

# Here's what your modified Eurovision database looks like:
subset_data

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


| | Country | Year | Song | Performer | ID |
|---|---|---|---|---|---|
| 0 | Austria | 1983 | Hurricane | Westend | 441 |
| 1 | Belgium | 1983 | Rendez-vous | Pas de Deux | 442 |
| 2 | Cyprus | 1983 | I Agapi Akoma Zi | Stavros and Constantina | 443 |
| 3 | Denmark | 1983 | Kloden Drejer | Gry Johansen | 444 |
| 4 | Finland | 1983 | Fantasiaa | Ami Aspelund | 445 |
| 5 | France | 1983 | Vivre | Guy Bonnet | 446 |
| 6 | Germany | 1983 | Rücksicht | Hoffmann und Hoffmann | 447 |
| 7 | Greece | 1983 | Mou Les | Christie | 448 |
| 8 | Israel | 1983 | Hi | Ofra Haza | 449 |
| 9 | Italy | 1983 | Per Lucia | Riccardo Fogli | 450 |
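
The warning printed before the table appears because a column is added to a data frame that pandas regards as a slice of `all_data`. One way to avoid it, sketched here under assumptions, is to take an explicit `.copy()` of the filtered rows before adding the `ID` column. The helper `subset_songs` and the toy data frame `toy` are hypothetical and not part of the original notebook; the toy rows and their ids are copied from the outputs shown in this notebook.

```python
import pandas as pd

def subset_songs(all_data, years, countries=None):
    """Hypothetical helper: filter songs by year and, optionally, by country."""
    mask = all_data['Year'].isin(list(years))
    if countries is not None:
        mask &= all_data['Country'].isin(list(countries))
    # .copy() makes the subset an independent data frame, so adding a
    # column to it afterwards does not trigger SettingWithCopyWarning.
    subset = all_data.loc[mask, :].copy()
    subset['ID'] = subset.index           # keep the original ids as a column
    return subset.reset_index(drop=True)  # renumber the rows 0..len-1

# Toy stand-in for all_data, indexed by the original song ids.
toy = pd.DataFrame({
    'Country': ['Austria', 'Italy', 'Ireland', 'Italy'],
    'Year': [1983, 1985, 1986, 1993],
    'Song': ['Hurricane', 'Magic, Oh Magic', 'You Can Count On Me', "Sole D'europa"],
    'Performer': ['Westend', 'Al Bano and Romina Power', 'Luv Bug', 'Enrico Ruggeri'],
}, index=[441, 490, 507, 664])

italy_80s = subset_songs(toy, range(1980, 1990), countries=['Italy'])
print(italy_80s)
```

Passing `countries=None` plays the role of `limit_countries = False` in the settings above.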


# Randomly select songs from your database of choice!

In [9]:
# This function picks n random songs from the given list of songs.
# Note: each song is drawn independently, so the same song can be picked more than once.
def random_songs(subset_data, num_songs):
    # generates n random indices from the subset_data data frame:
    rand_ind = np.random.randint(0, len(subset_data), num_songs)

    # rand_songs holds the songs with those random indices:
    rand_songs = subset_data.loc[rand_ind,:]
    return rand_songs
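
Because `np.random.randint` draws each index independently, `random_songs` can return the same song twice. If repeats are unwanted, one possible alternative is pandas' `DataFrame.sample`, which draws rows without replacement by default. The function name `random_songs_unique`, its `seed` parameter, and the toy data below are assumptions made for this sketch, not part of the original program.

```python
import pandas as pd

def random_songs_unique(songs, num_songs, seed=None):
    """Hypothetical variant of random_songs that never repeats a song."""
    # Cap n so that asking for more songs than exist does not raise an error.
    n = min(num_songs, len(songs))
    # DataFrame.sample draws rows without replacement by default;
    # random_state makes the draw reproducible when a seed is given.
    return songs.sample(n=n, random_state=seed)

# Toy stand-in for subset_data (rows copied from outputs in this notebook).
toy_songs = pd.DataFrame({
    'Country': ['Austria', 'Cyprus', 'Ireland', 'Israel'],
    'Year': [1987, 1983, 1986, 1991],
    'Song': ['Nur Noch Gefühl', 'I Agapi Akoma Zi', 'You Can Count On Me', 'Kan'],
    'Performer': ['Gary Lux', 'Stavros and Constantina', 'Luv Bug', 'Duo Datz'],
})

picks = random_songs_unique(toy_songs, 3, seed=0)
print(picks)
```

Unlike the original function, this version does not rely on the index running from 0 to n-1, so it can be used on a filtered data frame before the index is reset.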

Now, you can randomly pick your songs! But before you run the random_songs function, make sure you run the "subsetting" section above if you want the randomly selected songs to be from a customized list.

In [10]:
num_songs = 10 # The number of songs you want to randomly pick (n)

# If you want to randomly pick songs from the entire database, set all_songs to "True"
all_songs = False

if all_songs == True:
    rand_songs = random_songs(all_data, num_songs)
else:
    rand_songs = random_songs(subset_data, num_songs)

rand_songs # This prints out the songs!

| | Country | Year | Song | Performer | ID |
|---|---|---|---|---|---|
| 78 | Austria | 1987 | Nur Noch Gefühl | Gary Lux | 519 |
| 178 | Malta | 1991 | Could It Be | Paul Giordimaina and Georgina | 619 |
| 49 | Italy | 1985 | Magic, Oh Magic | Al Bano and Romina Power | 490 |
| 66 | Ireland | 1986 | You Can Count On Me | Luv Bug | 507 |
| 114 | Portugal | 1988 | Voltarei | Dora | 555 |
| 223 | Italy | 1993 | Sole D'europa | Enrico Ruggeri | 664 |
| 220 | Iceland | 1993 | Þá Veistu Svarið | Inga | 661 |
| 2 | Cyprus | 1983 | I Agapi Akoma Zi | Stavros and Constantina | 443 |
| 64 | Germany | 1986 | Über Die Brücke Geh'n | Ingrid Peters | 505 |
| 175 | Israel | 1991 | Kan | Duo Datz | 616 |
