# Scraping Concerts - Lab

## Introduction

Now that you've seen how to scrape a simple website, it's time to practice those skills on a full-fledged site!
In this lab, you'll practice your scraping skills on a music website: https://www.residentadvisor.net.

## Objectives

You will be able to:
* Scrape events from a website
* Follow links to those events to retrieve further information
* Clean and store scraped data

## View the Website

For this lab, you'll be scraping https://www.residentadvisor.net. Start by opening the [events page](https://www.residentadvisor.net/events) in your browser.

<img src="images/ra.png">

In [None]:
#Load the https://www.residentadvisor.net/events page in your browser.

## Open the Inspect Element Feature

Next, open your browser's Inspect Element tool to view the underlying HTML of the page.

In [None]:
#Open the inspect element feature in your browser

## Write a Function to Scrape All of the Events on the Given Events Page

The function should return a pandas DataFrame with the columns `Event_Name`, `Venue`, `Event_Date` and `Number_of_Attendees`.

In [31]:
import pandas as pd
import re
import numpy as np
import requests
from bs4 import BeautifulSoup
import time

In [32]:
response = requests.get('https://www.residentadvisor.net/events')
soup = BeautifulSoup(response.content, 'html.parser')

In [33]:
soup

<!DOCTYPE html>

<html lang="en,ja,es">
<head id="_x1"><title>
	RA: Events in Manchester, United Kingdom
</title><meta content="text/html; charset=utf-8" http-equiv="Content-Type"/><meta content="en,ja,es" http-equiv="content-language"/><meta content="RA: Resident Advisor" name="Description"/><meta content="RA, residentadvisor, resident, advisor, music, ra, events, in, manchester, united, kingdom" name="Keywords"/><meta content="Resident Advisor" name="Author"/><meta content="Resident Advisor" property="og:site_name"/><meta content="712773712080127" property="fb:app_id"/><link href="/bundles/default-css?v=73_zn4f444Ms1nbtnaddvbDUe15CsJN6vhoNK7oQovg1" rel="stylesheet"/>
<meta content="app-id=981952703, app-argument=ra-guide://search" name="apple-itunes-app"/><link href="/bundles/cat-listings-css?v=w7DJdRHlwvlSlvivLjU2DnToUsYFU7IYixebCORYtxw1" rel="stylesheet"/>
<link href="/favicon.ico" rel="icon" type="image/vnd.microsoft.icon"/><link color="#000000" href="/images/ra_icon.svg" rel="mas

In [86]:
event_listings = soup.findAll('h1')

In [87]:
event_listings

[<h1>Events</h1>,
 <h1 class="listing-heading heading">Popular events in Manchester</h1>,
 <h1>
 Making Faces with DEBONAIR/Jon K/12th Isle/Godspeed You Peter Andre
 </h1>,
 <h1>
 Sankeys25: The 25th Anniversary Festival
 </h1>,
 <h1>
 Andrew Weatherall [all-night]
 </h1>,
 <h1>
 HMS High Emotion Boat Party
 </h1>,
 <h1>
 Hospital Productions [at] The White Hotel (PT1)
 </h1>,
 <h1>
 Ordinary Friends presents Call Super
 </h1>,
 <h1>
 Joy Orbison / Rahim / Lyster
 </h1>,
 <h1>
 Supernature Does Saturday
 </h1>,
 <h1 class="event-title" itemprop="summary"><a href="/events/1260365" itemprop="url" title="Event details of Making Faces with DEBONAIR/Jon K/12th Isle/Godspeed You Peter Andre">Making Faces with DEBONAIR/Jon K/12th Isle/Godspeed You Peter Andre</a> <span>at <a href="/club.aspx?id=112509">The White Hotel</a></span></h1>,
 <h1 class="event-title" itemprop="summary"><a href="/events/1274586" itemprop="url" title="Event details of Transmute: Rabit &amp; Maxwell Sterling / Croww / A

In [85]:
event_listings.findAll(class_="title")

AttributeError: ResultSet object has no attribute 'findAll'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
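
The error above arises because `findAll` (also spelled `find_all`) returns a `ResultSet`, which behaves like a list of tags and has no search methods of its own. The sketch below runs against a small inline HTML fragment modeled on the output above rather than the live page, and shows two usual ways around the problem: search inside each tag in a loop, or put the filter into the original search. The names `demo_soup`, `headings` and `titles` are made up for illustration.

```python
from bs4 import BeautifulSoup

# Inline fragment modeled on the listing output above (not fetched from the live site).
html = """
<h1>Events</h1>
<h1 class="event-title"><a href="/events/1260365">Making Faces</a>
<span>at <a href="/club.aspx?id=112509">The White Hotel</a></span></h1>
"""
demo_soup = BeautifulSoup(html, 'html.parser')

# find_all returns a ResultSet: a list-like collection of Tag objects.
headings = demo_soup.find_all('h1')

# Option 1: search inside each Tag individually.
links_per_heading = [h1.find_all('a') for h1 in headings]

# Option 2: filter by class in the original search instead of chaining.
titles = demo_soup.find_all('h1', class_='event-title')
first_link_text = titles[0].find('a').text
```

Only single `Tag` objects (and the `soup` itself) support `find` and `find_all`, so any chained search has to start from one of those.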

In [45]:
#finding date
entries = event_listings.findAll('li')
entries

[<li><p class="eventDate date"><a href="/events.aspx?ai=344&amp;v=day&amp;mn=6&amp;yr=2019&amp;dy=28"><span>Fri, 28 Jun 2019 /</span></a></p></li>,
 <li class=""><article class="event-item clearfix tickets-bkg-logo" itemscope="" itemtype="http://data-vocabulary.org/Event"><a href="/events/1260365#tickets"><img class="nohide" src="https://residentadvisor.net/images/ra-tix.png" style="height: 23px; width: 40px; right: 0px; position: absolute; top: 1px;"/></a><span style="display:none;"><time datetime="2019-06-28T00:00" itemprop="startDate">2019-06-28T00:00</time></span><a href="/events/1260365"><img height="76" src="/images/events/flyer/2019/6/uk-0628-1260365-list.jpg" width="152"/></a><div class="bbox"><h1 class="event-title" itemprop="summary"><a href="/events/1260365" itemprop="url" title="Event details of Making Faces with DEBONAIR/Jon K/12th Isle/Godspeed You Peter Andre">Making Faces with DEBONAIR/Jon K/12th Isle/Godspeed You Peter Andre</a> <span>at <a href="/club.aspx?id=112509">

In [50]:
date = entries[0].find('p', class_='eventDate date')
date.find('span').text

'Fri, 28 Jun 2019 /'
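
The scraped date still carries a trailing ` /` and is plain text, which would sort alphabetically rather than chronologically. One possible cleaning step, sketched here with the standard library only, strips the separator and parses the rest. The helper name `parse_event_date` is hypothetical, and the format string assumes English day and month abbreviations as in the output above.

```python
from datetime import datetime

def parse_event_date(raw):
    """Convert text like 'Fri, 28 Jun 2019 /' into a datetime.date."""
    # Drop surrounding whitespace and the trailing '/' separator.
    cleaned = raw.strip().rstrip('/').strip()
    # %a and %b are locale-dependent; this assumes an English locale.
    return datetime.strptime(cleaned, '%a, %d %b %Y').date()

event_date = parse_event_date('Fri, 28 Jun 2019 /')
print(event_date.isoformat())  # 2019-06-28
```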

In [59]:
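def scrape_events(events_page_url):
    # Fetch the page that was passed in rather than a hard-coded URL
    response = requests.get(events_page_url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # NOTE: this selector matched no <li> elements (see the empty output below)
    event_listings = soup.findAll('li', id="event-listing")
    rows = []

    for event in event_listings:
        date = event.find('p', class_='eventDate date')
        if date is None:
            continue  # not a date row
        final_date = date.find('span').text
        rows.append(final_date)

    # Return the collected dates, not the raw tags
    return rows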

In [60]:
scrape_events('https://www.residentadvisor.net/events')

[]
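
The `<li>` output shown earlier mixes two kinds of rows: date rows (`<p class="eventDate date">`) and event rows (`<h1 class="event-title">`). A minimal sketch of one way to turn that into the requested DataFrame is to remember the most recent date row and attach it to each event that follows. It runs on an inline fragment, not the live page. The function name `rows_from_listing` is hypothetical, and the `<p class="attending">` markup is an assumption for illustration, since the attendee count does not appear in the output above.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Inline fragment modeled on the <li> output shown earlier.
# ASSUMPTION: the <p class="attending"> markup is hypothetical, not taken from the page.
SAMPLE_HTML = """
<ul>
<li><p class="eventDate date"><a href="#"><span>Fri, 28 Jun 2019 /</span></a></p></li>
<li><article class="event-item"><h1 class="event-title">
<a href="/events/1260365">Making Faces</a>
<span>at <a href="/club.aspx?id=112509">The White Hotel</a></span></h1>
<p class="attending"><span>120</span> Attending</p></article></li>
</ul>
"""

COLUMNS = ['Event_Name', 'Venue', 'Event_Date', 'Number_of_Attendees']

def rows_from_listing(listing_html):
    """Build a DataFrame from listing HTML, carrying each date row forward."""
    soup = BeautifulSoup(listing_html, 'html.parser')
    rows, current_date = [], None
    for li in soup.find_all('li'):
        date_tag = li.find('p', class_='eventDate')
        if date_tag is not None:
            # Date rows hold no event; remember the date for the events that follow.
            current_date = date_tag.find('span').text.strip(' /')
            continue
        title = li.find('h1', class_='event-title')
        if title is None:
            continue  # neither a date row nor an event row
        venue_span = title.find('span')
        venue_link = venue_span.find('a') if venue_span is not None else None
        attending = li.find('p', class_='attending')
        rows.append({
            'Event_Name': title.find('a').text.strip(),
            'Venue': venue_link.text.strip() if venue_link is not None else None,
            'Event_Date': current_date,
            # Missing counts default to 0 here; that is a design choice, not a site rule.
            'Number_of_Attendees': int(attending.find('span').text) if attending is not None else 0,
        })
    return pd.DataFrame(rows, columns=COLUMNS)

sample_df = rows_from_listing(SAMPLE_HTML)
```

Checking every lookup for `None` before using it keeps one unusual row from aborting the whole scrape.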

## Write a Function to Retrieve the URL for the Next Page

In [None]:
def next_page(url):
    #Your code here
    return next_page_url
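
Links in the output earlier, such as `/events/1260365`, are relative, so a next-page `href` will probably need to be resolved against the site address before `requests.get` can use it. How the next-page link is marked up is not shown in this lab, so the sketch below covers only the URL-joining step, using the standard library. The helper name `absolute_url` is hypothetical.

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.residentadvisor.net/events'

def absolute_url(href, base=BASE_URL):
    """Resolve a relative href from the page against the page's own URL."""
    return urljoin(base, href)

print(absolute_url('/events/1260365'))
# https://www.residentadvisor.net/events/1260365
```

`urljoin` also leaves an already absolute URL unchanged, so the same helper is safe to apply to every link found on the page.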

## Scrape the Next 1000 Events for Your Area

Display the data sorted by the number of attendees. If there is a tie for the number attending, sort by event date.

In [None]:
#Your code here
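
The two-level sort described above can be expressed with a single `sort_values` call that takes both columns. The sketch below uses made-up rows, not scraped data, and assumes that "sorted by attendees" means most attended first with ties broken by the earlier date; flip the `ascending` flags if the opposite order is wanted.

```python
import pandas as pd

# Made-up rows for illustration only; real rows would come from the scraper.
events = pd.DataFrame({
    'Event_Name': ['A', 'B', 'C'],
    'Event_Date': pd.to_datetime(['2019-06-29', '2019-06-28', '2019-06-28']),
    'Number_of_Attendees': [50, 120, 50],
})

# Primary key: attendees (descending). Tie-breaker: event date (ascending).
ranked = events.sort_values(
    ['Number_of_Attendees', 'Event_Date'],
    ascending=[False, True],
).reset_index(drop=True)

print(ranked['Event_Name'].tolist())  # ['B', 'C', 'A']
```

Converting `Event_Date` to real datetimes first matters: left as strings such as `'Fri, 28 Jun 2019'`, the tie-breaker would sort alphabetically by weekday name.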

## Summary 

Congratulations! In this lab, you successfully scraped a website for concert event information!