# Scraping Concerts - Lab

## Introduction

Now that you've seen how to scrape a simple website, it's time to practice those skills again on a full-fledged site!
In this lab, you'll scrape event listings from a music website: https://www.residentadvisor.net.

## Objectives

You will be able to:
* Scrape events from a website
* Follow links to those events to retrieve further information
* Clean and store scraped data

## View the Website

For this lab, you'll be scraping the https://www.residentadvisor.net website. Start by navigating to the events page [here](https://www.residentadvisor.net/events) in your browser.

<img src="images/ra.png">

In [1]:
#Load the https://www.residentadvisor.net/events page in your browser.

## Open the Inspect Element Feature

Next, open the inspect element feature from your web browser in order to preview the underlying HTML associated with the page.

In [None]:
#Open the inspect element feature in your browser

## Write a Function to Scrape All of the Events on a Given Events Page

The function should return a pandas DataFrame with columns for the Event_Name, Venue, Event_Date and Number_of_Attendees.

In [260]:
import re
import numpy as np
import pandas as pd
import requests
from bs4 import BeautifulSoup
import time

In [5]:
html_page = requests.get('https://www.residentadvisor.net/events').text
soup = BeautifulSoup(html_page, 'lxml')

In [70]:
print(soup.prettify())

<!DOCTYPE html>
<html lang="en,ja,es">
 <head id="_x1">
  <title>
   RA: Events in New York, United States of America
  </title>
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
  <meta content="en,ja,es" http-equiv="content-language"/>
  <meta content="RA: Resident Advisor" name="Description"/>
  <meta content="RA, residentadvisor, resident, advisor, music, ra, events, in, new, york, united, states, america" name="Keywords"/>
  <meta content="Resident Advisor" name="Author"/>
  <meta content="Resident Advisor" property="og:site_name"/>
  <meta content="712773712080127" property="fb:app_id"/>
  <link href="/bundles/default-css?v=73_zn4f444Ms1nbtnaddvbDUe15CsJN6vhoNK7oQovg1" rel="stylesheet"/>
  <meta content="app-id=981952703, app-argument=ra-guide://search" name="apple-itunes-app"/>
  <link href="/bundles/cat-listings-css?v=w7DJdRHlwvlSlvivLjU2DnToUsYFU7IYixebCORYtxw1" rel="stylesheet"/>
  <link href="/favicon.ico" rel="icon" type="image/vnd.microsoft.icon"/>
  <

In [15]:
event_list = soup.find('div', {'id':"event-listing"})

In [265]:
e_list_class = event_list.findAll('article')

In [266]:
print(e_list_class)

[<article class="event-item clearfix tickets-bkg-logo" itemscope="" itemtype="http://data-vocabulary.org/Event"><a href="/events/1265305#tickets"><img class="nohide" src="https://residentadvisor.net/images/ra-tix.png" style="height: 23px; width: 40px; right: 0px; position: absolute; top: 1px;"/></a><span style="display:none;"><time datetime="2019-05-16T00:00" itemprop="startDate">2019-05-16T00:00</time></span><a href="/events/1265305"><img height="76" src="/images/events/flyer/2019/5/us-0516-1265305-list.jpg" width="152"/></a><div class="bbox"><h1 class="event-title" itemprop="summary"><a href="/events/1265305" itemprop="url" title="Event details of Morgana [free entry]: DJ Three, Doc Martin, Forever Jung">Morgana [free entry]: DJ Three, Doc Martin, Forever Jung</a> <span>at <a href="/club.aspx?id=105938">Brooklyn Mirage</a></span></h1><div class="grey event-lineup">DJ Three, Doc Martin, Forever Jung</div><p class="attending"><span>712</span> Attending</p></div></article>, <article cla

In [376]:
e_list_class[0].findAll('a')[-1].text

'Brooklyn Mirage'

In [264]:
for article in e_list_class:
    print(article.findAll('a')[0].text)

Thu, 16 May 2019 /






Micro Vision x Wex: Seemingly Normal People
Jlin [Live Set]
Red Bull Music Festival & Pioneer Works present Holly Herndon: Proto
Let's Play House: L'amour Existe Encore
Kiosk Man • Buskko • Samuel
Kontravoid, Multiple Man, Confines
Nycxdesign Urban Imprint Launch Party
Dark Dance II
Modal Form 29: Olga / M Parent / Kohl
The Kitchen Spring Gala After-Party with Total Freedom
Sistaspin Residency with Robb Mazz, Kai The Black Angel, Deviantart Heaux and More
yes&yay Pres: The Den - Marcelo Garzozi, Nico Kass
Leo Leonski
Trpl Blk Thursdays
Submit event
Fri, 17 May 2019 /















Call Super + CCL
Egyptian Lover
Eamon & Justin, Soul Summit, Analog Soul, and More
Satori & The Band From Space: Live in Concert
Nycxdesign Urban Imprint Launch Party
Sleepy & Boo, Dpak Manny Digz, Flowmingo - Nublu Classic
Origins NY: 012
Loveless Records and Friends with The Dance Pit, Montepiedra and More
Felipe From BK x Bryce David x Champs
Technofeminism
Chunk
Sidewalks and Ske

In [407]:
def scrape_events(events_page_url):
    """Return a DataFrame with one row per event listed on the given page."""
    html_page = requests.get(events_page_url).text
    soup = BeautifulSoup(html_page, 'lxml')
    event_list = soup.find('div', {'id': 'event-listing'})
    rows = []
    for article in event_list.findAll('article'):
        # The heading text has the form "<event name> at <venue>".
        event_name = article.find('h1', class_='event-title').text
        event_date = article.find('time').text
        # find() returns None when the article has no <p> tag (no attendance count),
        # so reading .text raises AttributeError.
        try:
            n_attendees = article.find('p').text
        except AttributeError:
            n_attendees = np.nan
        # Treat the last link in the article as the venue; if the venue is not
        # a link, this may pick up a different link instead.
        venue = article.findAll('a')[-1].text
        rows.append([event_name, event_date, venue, n_attendees])

    return pd.DataFrame(rows, columns=["Event_Name", "Event_Date", "Venue", "N_Attendees"])


In [414]:
events_page_url = 'https://www.residentadvisor.net/events/us/newyork/week/2019-05-17'

In [415]:
scrape_events(events_page_url)

| | Event_Name | Event_Date | Venue | N_Attendees |
|---|---|---|---|---|
| 0 | Innervisions New York at Knockdown Center | 2019-05-17T00:00 | Knockdown Center | 706 Attending |
| 1 | Headless Horseman Live / Vatican Shadow / Volv... | 2019-05-17T00:00 | BASEMENT | 255 Attending |
| 2 | Friday: PLO Man All Night at Nowadays | 2019-05-17T00:00 | Nowadays | 92 Attending |
| 3 | ReSolute w Move D & Flabbergast at TBA - New York | 2019-05-17T00:00 | ReSolute w Move D & Flabbergast | 62 Attending |
| 4 | Material 17: Nico Laa at Hart bar | 2019-05-17T00:00 | Hart bar | 24 Attending |
| 5 | Full Moon with Sébastien Léger at House Of Yes | 2019-05-17T00:00 | House Of Yes | 19 Attending |
| 6 | Pete Rock at Analog Bkny | 2019-05-17T00:00 | Analog Bkny | 12 Attending |
| 7 | Museum of Love (DJ set), L&l&l Record Club Plu... | 2019-05-17T00:00 | Good Room | 11 Attending |
| 8 | Just Blaze, Matt FX and Trillnatured at Elsewhere | 2019-05-17T00:00 | Elsewhere | NaN |
| 9 | Rendezvous with Sons of Immigrants, Arvi, CGC ... | 2019-05-17T00:00 | TBA Brooklyn | NaN |
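
In the output above, the row for "ReSolute w Move D & Flabbergast at TBA - New York" lists the event's own name as its venue, which suggests that taking the last `<a>` tag fails when the venue is not a link. The following is a minimal sketch of a more targeted extraction, assuming the markup matches the `<article>` printed earlier. `parse_article` and `first_article` are illustrative helper names, and `unlinked_html` (including its `/events/0` link) is a hypothetical guess at how an unlinked venue might be rendered, not markup copied from the site.

```python
import re
from bs4 import BeautifulSoup

def parse_article(article):
    """Extract one event's fields from an <article class="event-item"> tag."""
    title_tag = article.find('h1', class_='event-title')
    # The first link in the heading holds the event name without the venue.
    name = title_tag.find('a').get_text(strip=True)
    # The venue is the <span> inside the heading, prefixed with "at ".
    venue_span = title_tag.find('span')
    venue = None
    if venue_span is not None:
        venue = re.sub(r'^at\s+', '', venue_span.get_text().strip())
    time_tag = article.find('time')
    date = time_tag.get('datetime') if time_tag is not None else None
    # The attendance count is the <span> inside <p class="attending">, if present.
    attending = article.find('p', class_='attending')
    count = None
    if attending is not None and attending.find('span') is not None:
        count = int(attending.find('span').get_text(strip=True))
    return {'Event_Name': name, 'Event_Date': date,
            'Venue': venue, 'N_Attendees': count}

def first_article(html):
    """Parse an HTML snippet and return its first <article> tag."""
    return BeautifulSoup(html, 'html.parser').find('article')

# Trimmed copy of the article printed earlier in this lab.
linked_html = (
    '<article class="event-item"><span style="display:none;">'
    '<time datetime="2019-05-16T00:00" itemprop="startDate">2019-05-16T00:00</time></span>'
    '<div class="bbox"><h1 class="event-title" itemprop="summary">'
    '<a href="/events/1265305" itemprop="url">Morgana [free entry]: '
    'DJ Three, Doc Martin, Forever Jung</a> '
    '<span>at <a href="/club.aspx?id=105938">Brooklyn Mirage</a></span></h1>'
    '<p class="attending"><span>712</span> Attending</p></div></article>'
)
# Hypothetical markup: a venue with no link and no attendance count.
unlinked_html = (
    '<article class="event-item">'
    '<time datetime="2019-05-17T00:00">2019-05-17T00:00</time>'
    '<h1 class="event-title"><a href="/events/0">ReSolute w Move D &amp; Flabbergast</a> '
    '<span>at TBA - New York</span></h1></article>'
)

linked = parse_article(first_article(linked_html))
unlinked = parse_article(first_article(unlinked_html))
print(linked)
print(unlinked)
```

Returning the count as an integer (or `None`) rather than the string "712 Attending" also makes the later sorting step simpler.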


## Write a Function to Retrieve the URL for the Next Page

In [None]:
def next_page(url):
    #Your code here
    return next_page_url
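
One possible approach is sketched below. It assumes that weekly listing URLs end in an ISO date, as in the `events_page_url` used above (`.../week/2019-05-17`), and that the following page is the same URL with the date moved forward seven days. That URL scheme is inferred from a single example and is not confirmed against the site; reading the "next" link out of the page's HTML would be the alternative.

```python
import re
from datetime import datetime, timedelta

def next_page(url):
    """Return the URL for the week after the one encoded at the end of `url`."""
    # Look for a trailing YYYY-MM-DD date, optionally followed by a slash.
    match = re.search(r'(\d{4}-\d{2}-\d{2})/?$', url)
    if match is None:
        raise ValueError(f'No trailing YYYY-MM-DD date found in {url!r}')
    current = datetime.strptime(match.group(1), '%Y-%m-%d')
    following = current + timedelta(days=7)
    # Splice the new date into the original URL, keeping everything else.
    next_page_url = (url[:match.start(1)]
                     + following.strftime('%Y-%m-%d')
                     + url[match.end(1):])
    return next_page_url

print(next_page('https://www.residentadvisor.net/events/us/newyork/week/2019-05-17'))
```

Working on the date rather than the HTML means no extra request is needed to find the next page, at the cost of depending on the assumed URL pattern.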

## Scrape the Next 1000 Events for Your Area

Display the data sorted by the number of attendees. If there is a tie for the number attending, sort by event date.

In [None]:
#Your code here
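
A rough sketch of how the pieces could fit together, under several assumptions: `collect_events` and `sort_events` are hypothetical helper names, `scrape_fn` and `next_fn` stand in for `scrape_events` and `next_page`, the `delay` between requests and the `max_pages` cap are illustrative safety choices, and "sorted by the number of attendees" is read as largest first (the lab does not specify a direction).

```python
import time
import pandas as pd

def collect_events(start_url, scrape_fn, next_fn,
                   n_events=1000, delay=1.0, max_pages=200):
    """Scrape successive pages until at least n_events rows are gathered."""
    frames, total, url = [], 0, start_url
    for _ in range(max_pages):  # hard cap so a broken page cannot loop forever
        page = scrape_fn(url)
        frames.append(page)
        total += len(page)
        if total >= n_events:
            break
        url = next_fn(url)
        time.sleep(delay)  # pause between requests to avoid hammering the site
    return pd.concat(frames, ignore_index=True).head(n_events)

def sort_events(df):
    """Clean the attendee counts and sort by attendees, then by event date."""
    out = df.copy()
    # "706 Attending" -> 706; rows without a count become NaN.
    digits = out['N_Attendees'].astype(str).str.extract(r'(\d+)', expand=False)
    out['N_Attendees'] = pd.to_numeric(digits, errors='coerce')
    out['Event_Date'] = pd.to_datetime(out['Event_Date'])
    # Most attended first; ties broken by the earlier date; NaN counts last.
    return out.sort_values(['N_Attendees', 'Event_Date'],
                           ascending=[False, True],
                           na_position='last').reset_index(drop=True)
```

With the functions from this lab defined, a call such as `sort_events(collect_events(events_page_url, scrape_events, next_page))` would produce the sorted table.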

## Summary 

Congratulations! In this lab, you successfully scraped a website for concert event information!