# Scraping Concerts - Lab

## Introduction

Now that you've seen how to scrape a simple website, it's time to again practice those skills on a full-fledged site!
In this lab, you'll practice your scraping skills on a music website: https://www.residentadvisor.net.
## Objectives

You will be able to:
* Scrape events from a website
* Follow links to those events to retrieve further information
* Clean and store scraped data

## View the Website

For this lab, you'll be scraping the https://www.residentadvisor.net website. Start by navigating to the events page [here](https://www.residentadvisor.net/events) in your browser.

<img src="images/ra.png">

In [1]:
#Load the https://www.residentadvisor.net/events page in your browser.

## Open the Inspect Element Feature

Next, open the inspect element feature from your web browser in order to preview the underlying HTML associated with the page.

In [2]:
#Open the inspect element feature in your browser

## Write a Function to Scrape all of the Events on the Given Page Events Page

The function should return a Pandas DataFrame with columns for the Event_Name, Venue, Event_Date and Number_of_Attendees.

In [3]:
def scrape_event(event_url, headers):
    #Your code here
    url_stem = 'https://www.residentadvisor.net'
    html = requests.get(url_stem+event_url, headers=headers)
    soup = BeautifulSoup(html.text, 'html.parser')
    
    event_name = soup.find('h1').string
    venue = soup(text='Venue /')[0].parent.next_sibling.string
    event_date = [tag.string for tag in soup(text='Date /')[0].parent.next_siblings if tag.name == 'a'][0]
    number_of_attendees = int(soup.find('h1', {'id':'MembersFavouriteCount'}).string.strip())
    
    row = [[event_name, venue, event_date, number_of_attendees]]
    df = pd.DataFrame(row, dtype=object, columns=["Event_Name", "Venue", "Event_Date", "Number_of_Attendees"])
    return df

## Write a Function to Retrieve the URL for the Next Page

In [4]:
def next_page_url(soup):
    url_stem = 'https://www.residentadvisor.net'
    return url_stem + soup.find('a', {'ga-event-action':'Next '})['href']
    

## Scrape the Next 1000 Events for Your Area

Display the data sorted by the number of attendees. If there is a tie for the number attending, sort by event date.

In [None]:
#Your code here
import requests
import pandas as pd
import collections
from datetime import datetime
import time
from bs4 import BeautifulSoup

headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36"}
url = 'https://www.residentadvisor.net/events/us/newyork'
html = requests.get(url, headers=headers)
soup = BeautifulSoup(html.text, 'html.parser')

df = pd.DataFrame(dtype=object, columns=["Event_Name", "Venue", "Event_Date", "Number_of_Attendees"])

idx = 0

while idx < 50:
    item_urls = soup.findAll('a', {'itemprop':'url'})

    for url in item_urls:
        df = df.append(scrape_event(url['href'], headers), ignore_index=True)
        time.sleep(1)
        print(df)

    try:
        url = next_page_url(soup)
    except TypeError:
        break

    print(url)
    html = requests.get(url, headers=headers) 
    soup = BeautifulSoup(html.text, 'html.parser')

#     idx = len(df['Event_Name'])
    idx = 50
    time.sleep(3)
pd.to_datetime()
print(df)


    
# My beautiful work!!! Ruined!
# events = collections.defaultdict(list)
# class EventScraper():
#     @staticmethod
#     def retrieve_names(soup):
#         return [name.a.text for name in soup.findAll('h1', {'class':'event-title'})]
    
#     @staticmethod
#     def retrieve_venues(soup):
#         return [venue_lead_in.next_sibling.text for venue_lead_in in soup(text="at ")]
    
#     @staticmethod
#     def retrieve_dates(soup):
#         dates = [datetime.strptime(date.span.text, '%a, %d %b %Y /').strftime('%Y-%m-%d') for date in soup.findAll('p', {'class': 'eventDate'})]
#         events_for_date = []
#         count_events = 0
#         for string in html.text.split('mt16'):
#             cup_of_soup = BeautifulSoup(string, 'html.parser')
#             events_for_date.append(len(cup_of_soup.findAll('article', {'class':'event-item'})))
#         out_dates = []
#         for date, count in zip(dates, events_for_date):
#             out_dates.extend([date]*count)
#         return out_dates
        
#     @staticmethod
#     def retrieve_attendees(soup):
#         attendees = [int(attendees.text.split()[0]) for attendees in soup.findAll('p', {'class':'attending'})]
        

# idx = 0
# while idx < 500:
#     events['Event_Name'].extend(EventScraper.retrieve_names(soup))
#     events['Venue'].extend(EventScraper.retrieve_venues(soup))
#     events['Event_Date'].extend(EventScraper.retrieve_dates(soup))
#     events['Number_of_Attendees'].extend(EventScraper.retrieve_attendees(soup))
#     try:
#         url = next_page_url(soup)
#     except TypeError:
#         break
#     print(url)
#     html = requests.get(url, headers=headers) 
#     soup = BeautifulSoup(html.text, 'html.parser')
    
#     idx = len(events['Event_Name'])
#     time.sleep(3)
# for key in events.keys():
#     print(len(events[key]))
# pd.DataFrame(events, columns = ["Event_Name", "Venue", "Event_Date", "Number_of_Attendees"])

                             Event_Name     Venue   Event_Date  \
0  Body Music Therapy with Love Letters  Nowadays  24 Jul 2019   

  Number_of_Attendees  
0                   6  
                                          Event_Name      Venue    Event_Date  \
0               Body Music Therapy with Love Letters   Nowadays   24 Jul 2019   
1  Midnight Magic, Jacques Renault (Elsewhere Roo...  Elsewhere  24 Jul 2019    

  Number_of_Attendees  
0                   6  
1                  15  
                                          Event_Name      Venue    Event_Date  \
0               Body Music Therapy with Love Letters   Nowadays   24 Jul 2019   
1  Midnight Magic, Jacques Renault (Elsewhere Roo...  Elsewhere  24 Jul 2019    
2   John Silas, Kfeelz, Raqx, John Barera and Vilaen  Good Room   24 Jul 2019   

  Number_of_Attendees  
0                   6  
1                  15  
2                   6  
                                          Event_Name                  Venue  \
0  

                                           Event_Name                  Venue  \
0                Body Music Therapy with Love Letters               Nowadays   
1   Midnight Magic, Jacques Renault (Elsewhere Roo...              Elsewhere   
2    John Silas, Kfeelz, Raqx, John Barera and Vilaen              Good Room   
3                                Pure Immanence Xxxvi  Bossa Nova Civic Club   
4          Delivery. with Mira Fahrenheit and Friends                Ms. Yoo   
5     Ūndisclosed: Eli Fola, Catherine J, Jason Munoz           TBA Brooklyn   
6                               Open Decks Session 73                   Eris   
7                      Expansions NYC Midsummer Party       Doux Supper Club   
8   Expansions NYC Mid-Summer Jam with Josh, Jihad...                   Doux   
9   Morgana [free entry]: Mr.C, Kate Simko, Ryan C...        Brooklyn Mirage   
10  3024 presents: Moxie, Martyn, Shy Eyez, J.Albe...              Good Room   
11  Health Center presents: Kilbourne, 9

                                           Event_Name                  Venue  \
0                Body Music Therapy with Love Letters               Nowadays   
1   Midnight Magic, Jacques Renault (Elsewhere Roo...              Elsewhere   
2    John Silas, Kfeelz, Raqx, John Barera and Vilaen              Good Room   
3                                Pure Immanence Xxxvi  Bossa Nova Civic Club   
4          Delivery. with Mira Fahrenheit and Friends                Ms. Yoo   
5     Ūndisclosed: Eli Fola, Catherine J, Jason Munoz           TBA Brooklyn   
6                               Open Decks Session 73                   Eris   
7                      Expansions NYC Midsummer Party       Doux Supper Club   
8   Expansions NYC Mid-Summer Jam with Josh, Jihad...                   Doux   
9   Morgana [free entry]: Mr.C, Kate Simko, Ryan C...        Brooklyn Mirage   
10  3024 presents: Moxie, Martyn, Shy Eyez, J.Albe...              Good Room   
11  Health Center presents: Kilbourne, 9

                                           Event_Name                  Venue  \
0                Body Music Therapy with Love Letters               Nowadays   
1   Midnight Magic, Jacques Renault (Elsewhere Roo...              Elsewhere   
2    John Silas, Kfeelz, Raqx, John Barera and Vilaen              Good Room   
3                                Pure Immanence Xxxvi  Bossa Nova Civic Club   
4          Delivery. with Mira Fahrenheit and Friends                Ms. Yoo   
5     Ūndisclosed: Eli Fola, Catherine J, Jason Munoz           TBA Brooklyn   
6                               Open Decks Session 73                   Eris   
7                      Expansions NYC Midsummer Party       Doux Supper Club   
8   Expansions NYC Mid-Summer Jam with Josh, Jihad...                   Doux   
9   Morgana [free entry]: Mr.C, Kate Simko, Ryan C...        Brooklyn Mirage   
10  3024 presents: Moxie, Martyn, Shy Eyez, J.Albe...              Good Room   
11  Health Center presents: Kilbourne, 9

                                           Event_Name                  Venue  \
0                Body Music Therapy with Love Letters               Nowadays   
1   Midnight Magic, Jacques Renault (Elsewhere Roo...              Elsewhere   
2    John Silas, Kfeelz, Raqx, John Barera and Vilaen              Good Room   
3                                Pure Immanence Xxxvi  Bossa Nova Civic Club   
4          Delivery. with Mira Fahrenheit and Friends                Ms. Yoo   
5     Ūndisclosed: Eli Fola, Catherine J, Jason Munoz           TBA Brooklyn   
6                               Open Decks Session 73                   Eris   
7                      Expansions NYC Midsummer Party       Doux Supper Club   
8   Expansions NYC Mid-Summer Jam with Josh, Jihad...                   Doux   
9   Morgana [free entry]: Mr.C, Kate Simko, Ryan C...        Brooklyn Mirage   
10  3024 presents: Moxie, Martyn, Shy Eyez, J.Albe...              Good Room   
11  Health Center presents: Kilbourne, 9

                                           Event_Name                  Venue  \
0                Body Music Therapy with Love Letters               Nowadays   
1   Midnight Magic, Jacques Renault (Elsewhere Roo...              Elsewhere   
2    John Silas, Kfeelz, Raqx, John Barera and Vilaen              Good Room   
3                                Pure Immanence Xxxvi  Bossa Nova Civic Club   
4          Delivery. with Mira Fahrenheit and Friends                Ms. Yoo   
5     Ūndisclosed: Eli Fola, Catherine J, Jason Munoz           TBA Brooklyn   
6                               Open Decks Session 73                   Eris   
7                      Expansions NYC Midsummer Party       Doux Supper Club   
8   Expansions NYC Mid-Summer Jam with Josh, Jihad...                   Doux   
9   Morgana [free entry]: Mr.C, Kate Simko, Ryan C...        Brooklyn Mirage   
10  3024 presents: Moxie, Martyn, Shy Eyez, J.Albe...              Good Room   
11  Health Center presents: Kilbourne, 9

                                           Event_Name                   Venue  \
0                Body Music Therapy with Love Letters                Nowadays   
1   Midnight Magic, Jacques Renault (Elsewhere Roo...               Elsewhere   
2    John Silas, Kfeelz, Raqx, John Barera and Vilaen               Good Room   
3                                Pure Immanence Xxxvi   Bossa Nova Civic Club   
4          Delivery. with Mira Fahrenheit and Friends                 Ms. Yoo   
5     Ūndisclosed: Eli Fola, Catherine J, Jason Munoz            TBA Brooklyn   
6                               Open Decks Session 73                    Eris   
7                      Expansions NYC Midsummer Party        Doux Supper Club   
8   Expansions NYC Mid-Summer Jam with Josh, Jihad...                    Doux   
9   Morgana [free entry]: Mr.C, Kate Simko, Ryan C...         Brooklyn Mirage   
10  3024 presents: Moxie, Martyn, Shy Eyez, J.Albe...               Good Room   
11  Health Center presents: 

## Summary 

Congratulations! In this lab, you successfully scraped a website for concert event information!