# Backend Database (Part 3: Venue Data)

The foundation of our backend is an extensive list of artists and the information relevant to them. We collect this information from the following five sources (three APIs and two scraped websites); this part covers the step in bold:

1. Last.fm API to gather a long list of artists, mainly those popular in the US.
2. SeatGeek API to gather data on upcoming concerts, particularly ticket pricing and concert size.
3. **Scrape the SeatGeek website for capacity information on concert venues.**
4. Songkick API to retrieve data on historical concerts.
5. Scrape the Billboard website for recent successful concerts, including information such as revenue and attendance.

In [1]:
from bs4 import BeautifulSoup 
import requests
import re
import MySQLdb as mdb
import sys

In [2]:
url = "https://seatgeek.com/tba/articles/capacities-sizes/#heading-274897014459836830"

In [3]:
# Functions to scrape and format information from the website

def venue_scrapper(url):
    '''Scrape the venue table from SeatGeek's blog post. The post contains
    SeatGeek's internal data on venues and their capacities.'''
    venues = requests.get(url)
    venues.raise_for_status()  # fail early on an HTTP error
    venues_soup = BeautifulSoup(venues.text, 'html.parser')
    table = []
    # Each <tr> is a table row; skip the first one, which holds the headers
    for tr in venues_soup.find_all('tr')[1:]:
        # Each <td> is a cell; get_text() drops the tags and any attributes
        row = [td.get_text(strip=True) for td in tr.find_all('td')]
        table.append(row)
    return table

def name_change(table):
    '''Lowercase each venue name and replace spaces with hyphens.
    The venue name is used to match rows against the other tables, so the
    names must be formatted consistently.'''
    for row in table:
        row[0] = row[0].lower().replace(" ", "-")
    return table

In [4]:
#Creating the table
table = venue_scrapper(url)
table = name_change(table)

table[0]

['may-day-stadium', 'Pyongyang', 'North Korea', '150000', 'n/a']
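
Because the venue name serves as a join key, stray whitespace in the scraped text could produce keys that fail to match (for example a trailing hyphen or a double hyphen). The sketch below shows one possible stricter normalization. `normalize_venue_name` is a hypothetical helper, not part of the notebook, and it assumes that whitespace is the only inconsistency that needs handling.

```python
import re

def normalize_venue_name(name):
    """Hypothetical helper: lowercase, trim, and hyphenate a venue name."""
    # Trim the ends, lowercase, then collapse every run of whitespace
    # into a single hyphen so that differently spaced names share one key.
    return re.sub(r"\s+", "-", name.strip().lower())

print(normalize_venue_name("  May Day  Stadium "))  # may-day-stadium
```

For names that contain only single spaces and no leading or trailing whitespace, this gives the same result as `name_change`.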

## Creating Venue Database

In [5]:
#Connecting to MySQL database
con = mdb.connect(host='localhost',
                  user='root',
                  passwd='<password>',
                  charset='utf8', use_unicode=True)

In [6]:
# Run a query to create a database that will hold the data
db_name = 'Project'
create_db_query = "CREATE DATABASE IF NOT EXISTS {db} DEFAULT CHARACTER SET 'utf8'".format(db=db_name)

# Create a database
cursor = con.cursor()
cursor.execute(create_db_query)
cursor.close()


In [12]:
# Create the venues table (static data)
cursor = con.cursor()
table_name = 'venues'
# {db} and {table} are placeholders filled in by the format(...) call below
create_table_query = '''CREATE TABLE IF NOT EXISTS {db}.{table} 
                                (venue varchar(250), 
                                city varchar(250),
                                state varchar(250),
                                capacity int,
                                PRIMARY KEY(venue, city)
                                )'''.format(db=db_name, table=table_name)
cursor.execute(create_table_query)
cursor.close()
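
The composite primary key `(venue, city)` means two rows with the same venue and city cannot both be stored, and the `INSERT IGNORE` statement used in the insert step silently skips the later one. The sketch below illustrates that behavior under a stated substitution: it uses an in-memory SQLite database from the standard library so it runs without a MySQL server, and SQLite's `INSERT OR IGNORE` stands in for MySQL's `INSERT IGNORE`. The second and third rows are made-up illustrative values, not scraped data.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server (assumption)
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE venues (
    venue TEXT, city TEXT, state TEXT, capacity INTEGER,
    PRIMARY KEY (venue, city))""")

rows = [
    ("may-day-stadium", "Pyongyang", "North Korea", 150000),
    ("may-day-stadium", "Pyongyang", "North Korea", 1),  # made-up duplicate key
    ("example-arena", "Springfield", "IL", 10000),       # made-up venue
]

# INSERT OR IGNORE is SQLite's counterpart of MySQL's INSERT IGNORE
con.executemany("INSERT OR IGNORE INTO venues VALUES (?, ?, ?, ?)", rows)
con.commit()

stored = con.execute(
    "SELECT venue, city, capacity FROM venues ORDER BY venue").fetchall()
print(stored)
con.close()
```

Only two rows are stored, and the first row seen for a given key is the one that is kept. Re-running the insert cell therefore does not create duplicates, but it also does not update a capacity that has changed.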

In [13]:
# Insert the scraped venue rows into the venues table
table_name = 'venues'

query_template = '''INSERT IGNORE INTO {db}.{table}(venue, 
                                            city,
                                            state,
                                            capacity) 
                                            VALUES (%s, %s, %s, %s)'''.format(db=db_name, table=table_name)

cursor = con.cursor()

 
for row in table:
    venue = row[0]
    city = row[1]
    state = row[2]
    capacity = int(row[3])
    
    query_parameters = (venue, city, state, capacity)
    cursor.execute(query_template, query_parameters)

con.commit()
cursor.close()
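
The insert loop calls `int(row[3])` directly, which raises a `ValueError` if a capacity cell ever contains something like `n/a` or a thousands separator. A minimal sketch of a more defensive cleaning step follows. `parse_capacity` and `clean_rows` are hypothetical helpers, and every sample row except the first is made up for illustration.

```python
def parse_capacity(raw):
    """Hypothetical helper: return the capacity as an int, or None."""
    try:
        # Drop thousands separators and surrounding whitespace first
        return int(raw.replace(",", "").strip())
    except ValueError:
        return None

def clean_rows(table):
    """Keep rows that have at least four fields and a numeric capacity."""
    cleaned = []
    for row in table:
        if len(row) < 4:
            continue  # skip malformed or empty rows
        capacity = parse_capacity(row[3])
        if capacity is not None:
            cleaned.append((row[0], row[1], row[2], capacity))
    return cleaned

sample = [
    ['may-day-stadium', 'Pyongyang', 'North Korea', '150000', 'n/a'],
    ['example-hall', 'Springfield', 'IL', 'n/a', 'n/a'],       # made-up row
    ['example-arena', 'Springfield', 'IL', '12,500', 'n/a'],   # made-up row
]
rows = clean_rows(sample)
print(rows)
```

The resulting list of tuples has the same shape as `query_parameters`, so it could be passed to `cursor.executemany(query_template, rows)` in place of the row-by-row loop.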

