**Objective:** Use BeautifulSoup to build a dataset of all Country Music Hall of Fame inductees.
Scrape the Hall of Fame members page and convert its contents into a pandas DataFrame.

In [None]:
import pandas as pd 
import requests
from bs4 import BeautifulSoup as BS
import re

In [None]:
# Request the members page from the Country Music Hall of Fame site
url = 'https://countrymusichalloffame.org/hall-of-fame/members/'
response = requests.get(url)

In [None]:
type(response)

In [None]:
response.status_code
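
A status code of 200 means the request succeeded. As a minimal sketch, the same check can be folded into a small helper that sets a timeout and raises on an HTTP error. The helper name `fetch_html`, its `get` parameter, and the 10-second timeout are illustrative assumptions and are not part of the original notebook.

```python
import requests


def fetch_html(url, get=requests.get, timeout=10):
    """Return the page body as text, raising if the request failed.

    `get` defaults to requests.get; it is a parameter only so the helper
    can be exercised without network access (an assumption of this sketch).
    """
    response = get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses
    response.raise_for_status()
    return response.text
```

In the notebook this could be called as `html = fetch_html(url)` in place of reading `response.text` directly.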

**Step 1:** Start by examining the page with the browser inspector or by viewing the page source. Can you identify a tag that would help you find the names of all inductees? Use it to create a list containing just the name of each inductee.

In [None]:
#Helpful Tag: <h3>

&emsp;&emsp;Parse the HTML into a BeautifulSoup object

In [None]:
text = BS(response.text, 'html.parser')
text

&emsp;&emsp;Find all instances of the tag

In [None]:
names_tag = text.find_all('h3')
names_tag

In [None]:
text.find('h3').text

In [None]:
names_tag[0]

&emsp;&emsp;Strip the tags to keep only the name text

In [None]:
# Keep only the text inside each <h3> tag
page1_names = [name.text for name in names_tag]
page1_names
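
The tag-stripping step can be tried without a network request. The sketch below runs it on a small HTML string; the markup in `sample_html` is a hypothetical stand-in, and the real page structure may differ. It also uses `get_text(strip=True)`, which trims stray whitespace around each name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real page structure may differ.
sample_html = """
<div><h3> Roy Acuff </h3><p>Inducted 1962</p></div>
<div><h3>Jimmie Rodgers</h3><p>Inducted 1961</p></div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# get_text(strip=True) drops the tag and trims surrounding whitespace
sample_names = [tag.get_text(strip=True) for tag in soup.find_all("h3")]
print(sample_names)
```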

&emsp;&emsp;Names of All Members

In [None]:
root_url = 'https://countrymusichalloffame.org/hall-of-fame/members/page/'
all_names = []
for p in range(1, 17):
    r = requests.get(root_url + str(p))
    t = BS(r.text, 'html.parser')
    n = t.find_all('h3')
    
    for i in n:
        all_names.append(i.text)
all_names

In [None]:
len(all_names)
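
One possible way to make the page loop reusable is to wrap it in a function. In this sketch the function name `collect_names` and its `fetch` parameter are assumptions introduced for illustration; the default page range of 1 to 16 mirrors `range(1, 17)` in the cell above.

```python
from bs4 import BeautifulSoup

ROOT_URL = "https://countrymusichalloffame.org/hall-of-fame/members/page/"


def collect_names(fetch, first_page=1, last_page=16):
    """Collect the text of every <h3> tag across the listing pages.

    `fetch` is any callable that maps a URL to an HTML string
    (a hypothetical parameter, so the loop can run without a network).
    """
    names = []
    for page in range(first_page, last_page + 1):
        soup = BeautifulSoup(fetch(f"{ROOT_URL}{page}"), "html.parser")
        names.extend(tag.get_text(strip=True) for tag in soup.find_all("h3"))
    return names
```

With `requests` imported, `collect_names(lambda u: requests.get(u).text)` would reproduce the loop above.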

**Step 2:** Try to find a tag that can be used to find the year in which each member was inducted. Extract these years into a list. Be sure to include only the year and not the full text. For example, for Roy Acuff, the list entry should be "1962" and not "Inducted 1962". Double-check that the resulting list has the correct number of elements and is in the same order as your inductees list.

In [None]:
dates = text.find_all('p')
dates

In [None]:
len(dates)

In [None]:
type(dates)

In [None]:
date_list = []

for date in dates:
    date_list.append(re.findall(r'Inducted \d{4}', date.text))

# Drop the empty lists left by paragraphs with no match
date_list2 = list(filter(None, date_list))
date_list2

In [None]:
flat_list = [i for dlist in date_list2 for i in dlist]
flat_list

In [None]:
for d in flat_list: 
    d_list = re.search(r'\d{4}', d).group(0)
    print(d_list)
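
The cells above find "Inducted YYYY", flatten the matches, and then search again for the digits. As an alternative sketch, a capture group can pull out just the year in one pass. The names `INDUCTED_RE` and `extract_year` are illustrative and do not appear in the original notebook.

```python
import re

# Capture only the four digits that follow the word "Inducted"
INDUCTED_RE = re.compile(r"Inducted\s+(\d{4})")


def extract_year(text):
    """Return the induction year as a string, or None if none is found."""
    match = INDUCTED_RE.search(text)
    return match.group(1) if match else None


print(extract_year("Inducted 1962"))
print(extract_year("No year here"))
```

Applied to the paragraphs, this would look like `[y for y in (extract_year(p.text) for p in dates) if y]`.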

&emsp;&emsp;Induction Year of All HOF Members

In [None]:
root_url = 'https://countrymusichalloffame.org/hall-of-fame/members/page/'
full_list = []

for p in range(1, 17):
    r = requests.get(root_url + str(p))
    t = BS(r.text, 'html.parser')
    d = t.find_all('p')
    
    for item in d:
        full_list.append(re.findall(r'Inducted \d{4}', item.text))

# Drop the empty lists left by paragraphs with no match
full_dlist = list(filter(None, full_list))
print(full_dlist)

In [None]:
all_dates = [i for d in full_dlist for i in d]
all_dates

In [None]:
all_dates_list = []
for d in all_dates: 
    all_dates_list.append(re.search(r'\d{4}', d).group(0))
all_dates_list

In [None]:
len(all_dates_list)
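
Step 2 asks for a check that the years line up with the names. A minimal sketch of such a check follows; the helper name `check_aligned` is hypothetical. It only confirms that the counts match and that each year is four digits, so it cannot prove the order is right, and a spot check of a few known pairs is still worthwhile.

```python
def check_aligned(names, years):
    """Raise ValueError unless names and years can be paired one-to-one."""
    if len(names) != len(years):
        raise ValueError(
            f"{len(names)} names but {len(years)} years; the lists are misaligned"
        )
    # Every year should be exactly four digits
    bad = [y for y in years if not (len(y) == 4 and y.isdigit())]
    if bad:
        raise ValueError(f"Unexpected year values: {bad}")
    return list(zip(names, years))
```

In the notebook, `check_aligned(all_names, all_dates_list)[:5]` would show the first five pairs for a quick visual check.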

**Step 3:** Take the two lists you created in Steps 1 and 2 and convert them into a pandas DataFrame.

In [None]:
HOF_members = pd.DataFrame({'Name': all_names, 'Year Inducted': all_dates_list})
HOF_members
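
The years in `all_dates_list` are strings, so the `Year Inducted` column holds text rather than numbers. The sketch below shows one way to convert the column to integers so that it sorts and filters numerically. It uses a two-row stand-in DataFrame named `sample`, which is an assumption for illustration.

```python
import pandas as pd

# Two illustrative rows standing in for all_names and all_dates_list
sample = pd.DataFrame(
    {"Name": ["Roy Acuff", "Jimmie Rodgers"], "Year Inducted": ["1962", "1961"]}
)

# Convert the year strings to integers so they sort and filter numerically
sample["Year Inducted"] = sample["Year Inducted"].astype(int)

print(sample.sort_values("Year Inducted").to_string(index=False))
```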

In [None]:
HOF_members.to_excel("HOF members.xlsx") 