#### Web Scraper Lab Solutions

##### Connecting to Data and Initializing the Web Scraper

In [1]:
# imports
from bs4 import BeautifulSoup
import pandas as pd
import requests

In [9]:
# send a GET request for the first page of search results
req = requests.get('https://www.yelp.com/search?find_desc=Restaurants&find_loc=London&start=0')
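The `start=0` query parameter in the URL above suggests that the results are paginated. The following is a minimal sketch of how URLs for several pages could be built, not part of the original solution. The helper name `build_search_url` is hypothetical, and the step of 10 results per page is an assumption (it matches the ten titles scraped below, but the site may behave differently).

```python
from urllib.parse import urlencode

BASE_URL = 'https://www.yelp.com/search'

def build_search_url(location, start=0, description='Restaurants'):
    """Build a search URL shaped like the one used above (hypothetical helper)."""
    params = {'find_desc': description, 'find_loc': location, 'start': start}
    return f'{BASE_URL}?{urlencode(params)}'

# assumption: each results page holds 10 entries, so pages begin at 0, 10, 20
urls = [build_search_url('London', start) for start in range(0, 30, 10)]
```

Each URL in `urls` could then be passed to `requests.get` in turn, exactly as the single URL is above.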

In [10]:
# the response is HTML rather than JSON, so we'll read the entire web page as text
req.text



In [11]:
# feed the text into a scraper, naming the parser explicitly so bs4 doesn't have to guess one
scraper = BeautifulSoup(req.text, 'html.parser')

##### Finding the Titles

In [12]:
# use the find_all method to select every <a> tag that carries the given class
titles = scraper.find_all('a', {'class': 'css-166la90'})

In [13]:
# if you look at the contents of this list, you'll see we have some cleaning to do!
titles

[<a class="css-166la90" href="/biz/the-mayfair-chippy-london-2?osq=Restaurants" name="The Mayfair Chippy" rel="" target="">The Mayfair Chippy</a>,
 <a class="css-166la90" href="/biz/dishoom-london?osq=Restaurants" name="Dishoom" rel="" target="">Dishoom</a>,
 <a class="css-166la90" href="/biz/flat-iron-london-2?osq=Restaurants" name="Flat Iron" rel="" target="">Flat Iron</a>,
 <a class="css-166la90" href="/biz/ffionas-restaurant-london?osq=Restaurants" name="Ffiona’s Restaurant" rel="" target="">Ffiona’s Restaurant</a>,
 <a class="css-166la90" href="/biz/restaurant-gordon-ramsay-london-3?osq=Restaurants" name="Restaurant Gordon Ramsay" rel="" target="">Restaurant Gordon Ramsay</a>,
 <a class="css-166la90" href="/biz/the-fat-bear-london?osq=Restaurants" name="The Fat Bear" rel="" target="">The Fat Bear</a>,
 <a class="css-166la90" href="/biz/the-breakfast-club-london-2?osq=Restaurants" name="The Breakfast Club" rel="" target="">The Breakfast Club</a>,
 <a class="css-166la90" href="/biz/

In [72]:
# if you check the data type of the item, you'll notice it's NOT a string, but rather a specialized scraper object
type(titles[0])

bs4.element.Tag

In [14]:
# if you want to grab the content inside the tag, you can use the text attribute
titles[0].text

'The Mayfair Chippy'
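To see `find_all` and `.text` at work without hitting the network, here is a small self-contained sketch. The HTML snippet is hand-written to resemble the output above (an assumption, not the real page), and `.get('href')` is shown as one way to read an attribute instead of the tag's text.

```python
from bs4 import BeautifulSoup

# assumption: a hand-written snippet shaped like the tags printed above
sample_html = '''
<div>
  <a class="css-166la90" href="/biz/dishoom-london?osq=Restaurants" name="Dishoom">Dishoom</a>
  <a class="css-166la90" href="/biz/flat-iron-london-2?osq=Restaurants" name="Flat Iron">Flat Iron</a>
  <a class="other" href="/about">About</a>
</div>
'''

sample = BeautifulSoup(sample_html, 'html.parser')
links = sample.find_all('a', {'class': 'css-166la90'})  # skips the "other" link

names = [link.text for link in links]         # text inside each tag
hrefs = [link.get('href') for link in links]  # value of each href attribute
```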

In [15]:
# loop and get what you need
titles = [title.text for title in titles]

In [16]:
# the list also picked up some extra values we don't need (what look like page numbers, plus an empty string)
titles

['The Mayfair Chippy',
 'Dishoom',
 'Flat Iron',
 'Ffiona’s Restaurant',
 'Restaurant Gordon Ramsay',
 'The Fat Bear',
 'The Breakfast Club',
 'Padella',
 'Dishoom',
 'The Golden Chippy',
 '2',
 '3',
 '4',
 '5',
 '6',
 '7',
 '8',
 '9',
 '']

In [17]:
# so filter them out: every unwanted value here is at most one character long
titles = [title for title in titles if len(title) > 1]
# and it looks good
titles

['The Mayfair Chippy',
 'Dishoom',
 'Flat Iron',
 'Ffiona’s Restaurant',
 'Restaurant Gordon Ramsay',
 'The Fat Bear',
 'The Breakfast Club',
 'Padella',
 'Dishoom',
 'The Golden Chippy']

Perfect!!  
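The length filter works here because the unwanted values are single digits or empty. It would let a two-digit value such as `'10'` through and would drop a one-character restaurant name. A hedged alternative, using a hypothetical helper `is_restaurant_name` and a shortened sample list, is to drop empty and purely numeric strings instead:

```python
# shortened sample of the scraped list, for illustration only
raw_titles = ['The Mayfair Chippy', 'Dishoom', '2', '3', '9', '']

def is_restaurant_name(text):
    """Keep non-empty strings that are not purely numeric (hypothetical helper)."""
    text = text.strip()
    return bool(text) and not text.isdigit()

clean_titles = [title for title in raw_titles if is_restaurant_name(title)]
```

The trade-off is that a restaurant whose name consists only of digits would be filtered out too, so neither rule is perfect.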

Now let's follow a similar process to get the number of reviews for each restaurant.

##### Step 1 Solution:

In [18]:
num_reviews = scraper.find_all('span', {'class': 'reviewCount__09f24__EUXPN'})

In [21]:
# grab the text attribute
num_reviews = [review.text for review in num_reviews]
# and it looks like we're good
num_reviews

['283', '1841', '380', '268', '204', '122', '494', '207', '547', '107']
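Note that the review counts come back as strings. If you want to sort or do arithmetic on them, they can be converted to integers. The sketch below uses a shortened sample list; stripping commas is a precaution based on the assumption that larger counts might be formatted with thousands separators, which the output above does not show.

```python
# shortened sample of the scraped strings, for illustration only
review_texts = ['283', '1841', '380']

# remove any thousands separators (an assumption) before converting to int
review_counts = [int(text.replace(',', '')) for text in review_texts]
```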

##### Step 2 Solution:

The selection criteria we used earlier also captured the price range, so we'll grab that as well.

In [22]:
price_ranges = scraper.find_all('span', {'class': 'priceRange__09f24__2O6le'})

price_ranges = [price_range.text for price_range in price_ranges]

In [23]:
# and we're good
price_ranges

['££', '££', '££', '££', '££££', '££', '££', '££', '££', '££']
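If a numeric price level is more convenient than the symbols, one simple option is to count the characters. This is a sketch on a shortened sample list, and it assumes every string contains nothing but currency symbols, as in the output above.

```python
# shortened sample of the scraped strings, for illustration only
price_texts = ['££', '££££', '££']

# assumption: each string consists only of currency symbols, so its length is the price level
price_levels = [len(text) for text in price_texts]
```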

##### Step 3 Solution:  Turning Our Data into a DataFrame

Using a step similar to the one from the previous lab, let's turn our results into a DataFrame.

In [24]:
df_dict = {
    'Name': titles,
    'NumReviews': num_reviews,
    'PriceRange': price_ranges
}

df = pd.DataFrame(df_dict)
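`pd.DataFrame` raises a `ValueError` when the lists in the dictionary have different lengths, which could happen if, say, one restaurant on the page had no price range listed. A minimal sketch of a check that reports which column is off is shown below; `build_frame` is a hypothetical helper and the sample data is shortened for illustration.

```python
import pandas as pd

def build_frame(columns):
    """Build a DataFrame after checking that every column has the same length (hypothetical helper)."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f'column lengths differ: {lengths}')
    return pd.DataFrame(columns)

sample_df = build_frame({
    'Name': ['Dishoom', 'Flat Iron'],
    'NumReviews': ['1841', '380'],
    'PriceRange': ['££', '££'],
})
```

Matching lengths do not guarantee that the rows line up correctly, so it is still worth spot-checking a few rows against the page.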

In [25]:
# beautiful :)
df

| | Name | NumReviews | PriceRange |
|---|---|---|---|
| 0 | The Mayfair Chippy | 283 | ££ |
| 1 | Dishoom | 1841 | ££ |
| 2 | Flat Iron | 380 | ££ |
| 3 | Ffiona’s Restaurant | 268 | ££ |
| 4 | Restaurant Gordon Ramsay | 204 | ££££ |
| 5 | The Fat Bear | 122 | ££ |
| 6 | The Breakfast Club | 494 | ££ |
| 7 | Padella | 207 | ££ |
| 8 | Dishoom | 547 | ££ |
| 9 | The Golden Chippy | 107 | ££ |
