# About this module

This module scrapes Eurovision data from https://eschome.net/index.html, the main site that hosts it.

The site presents various slices of the data to the user as HTML tables. No direct access to the underlying data is available.

At first glance, eschome.net seems to render the data dynamically in a way that would make it hard to scrape. On closer investigation, you can reverse-engineer the HTTP POST requests that render each page. So even though the actual database calls are hidden in PHP code, you can treat the resulting pages as static HTML.
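Once a POST response has been saved as text, it can be inspected like any other static page. The sketch below is one possible way to find out which table on a page holds the data before handing it to pandas. It uses only the standard library; `TableSummary`, `summarize_tables`, and the `sample` markup are hypothetical names and made-up HTML for illustration, not the actual layout of eschome.net.

```python
from html.parser import HTMLParser


class TableSummary(HTMLParser):
    """Collect the row count and first-row cell text of every <table>."""

    def __init__(self):
        super().__init__()
        self.tables = []       # one dict per table, in document order
        self._open = []        # stack of indexes of tables currently open
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append({"rows": 0, "first_row": []})
            self._open.append(len(self.tables) - 1)
        elif tag == "tr" and self._open:
            self.tables[self._open[-1]]["rows"] += 1
        elif tag in ("td", "th") and self._open:
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "table" and self._open:
            self._open.pop()
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Record cell text only for the first row of the innermost open table
        if self._in_cell and self._open:
            table = self.tables[self._open[-1]]
            text = data.strip()
            if table["rows"] == 1 and text:
                table["first_row"].append(text)


def summarize_tables(html_text):
    """Return a list of {'rows': int, 'first_row': [str, ...]} dicts."""
    parser = TableSummary()
    parser.feed(html_text)
    return parser.tables


# Made-up markup standing in for a saved POST response (assumption:
# a small layout table followed by the data table)
sample = """
<table><tr><td>Navigation</td></tr></table>
<table>
  <tr><th>Year</th><th>Country</th><th>City</th></tr>
  <tr><td>1956</td><td>Switzerland</td><td>Lugano</td></tr>
</table>
"""

for index, table in enumerate(summarize_tables(sample)):
    print(index, table["rows"], table["first_row"])
```

On a real page, the table whose first row matches the expected column names is the one to pick out of the list that `pd.read_html` returns.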

The data of interest for this presentation is:

* List of every Eurovision final (reference)

* List of years, the countries that participated in the finals that year, the order in which they performed, and how they placed (to allow analysis of how much the running order matters)

* List of years, participant countries, and how they voted (to allow analysis of bloc voting, like "all the Baltic countries vote together")

# Get list of all finals

In [10]:
import requests
import pandas as pd
from io import StringIO as SIO

# url generated by eschome.net when you click on "List of all Final Events" (no details)
url = 'https://eschome.net/databaseoutput410.php'

# get the full page text, stopping early if the server returns an HTTP error
page = requests.post(url)
page.raise_for_status()

# create a list of tables
list_of_tables = pd.read_html(SIO(page.text), header = 0)

# the second table on the page holds the data; keep only the interesting columns
df = list_of_tables[1]
all_cols = df[['Year', 'Country', 'City', 'Location', 'Broadcaster', 'Date']]
print(f"Imported {len(all_cols)} finals records.")
all_cols.to_csv('all_finals.csv', index=False)

Imported 67 finals records.
