In [1]:
# Run this cell to get all imported packages

import pprint
import pandas as pd
import requests
import re
from bs4 import BeautifulSoup
from enum import Enum

# Homework 3
### June 18, 2021

**Main Topics**
- Dictionaries
- Arrays
- Functions
- Loops
- Conditionals

These five concepts cover a great deal of day-to-day programming. The good news is that there's no separate review today: everything is in the main problem, which is pretty interesting.

# Today's Problem
## Where to Eat?

The Michelin Guide is famous for being extremely exclusive with its star-awarding system. Out of roughly 660,755 restaurants in the United States, only 178 have stars. The stars system is summarized in the table below.

| Stars | Amount in United States | Michelin Guide Explanation |
| :-- | :-- | :-- |
| 3 (&ast;&ast;&ast;) | 13 | Exceptional cuisine, worth a special journey! |
| 2 (&ast;&ast;) | 32 | Excellent cooking, worth a detour! |
| 1 (&ast;) | 133 | High quality cooking, worth a stop! |

Even though the guide has only 178 starred restaurants, nobody is ever going to go to all 178 of them. We want to build a tool that will help us filter through all of the starred restaurants in America.

By the end of the assignment, we will have defined a function `michelin_filter` which takes in a price limit and the number of stars you want in your restaurant, and gives you a list of restaurant names that meet those criteria.

Here's an example to demonstrate:
``` python
restaurants_dict = {
    ... # this dictionary (<restaurant name>: <list of properties>) is provided for you
}

def michelin_filter(price_limit, number_of_stars):
    restaurant_names = []
    # Do something in here to look for these restaurants in the dictionary.
    return restaurant_names
```

<a id='top'></a>
## What you get
This may seem very hard, but the good news is that I've done 99% of the hard work for you. If you've ever been curious as to how things get scraped from the internet, all the work I had to do to scrape the website information is below, so you can check it out there.

Here's a list of things you get.
1. A dictionary of all the Michelin-starred restaurants (First two previewed below)
``` python
{
    'Acadia': ['https://guide.michelin.com/us/en/illinois/chicago/restaurant/acadia',
                2,
                '**',
                '1639 S. Wabash Ave., Chicago, 60616, United States',
                'One of the true gems of South Loop, Acadia is the impassioned '
                'restaurant of the talented Ryan McCaskey...',
                'range(185, 186)',
                'Contemporary,AmericanContemporary',
                'MICHELIN Guide United States',
                'United States',
                'Illinois',
                'Acadia'],
    'Acquerello': ['https://guide.michelin.com/us/en/california/san-francisco/restaurant/acquerello',
                    2,
                    '**',
                    '1722 Sacramento St., San Francisco, 94109, United States',
                    'With its air of old-world sophistication, this is the kind of '
                    'establishment where one dresses for dinner, and celebrants of '
                    'a certain age are happy to splurge...',
                    'range(175, 276)',
                    'Italian,Contemporary',
                    'MICHELIN Guide California',
                    'California',
                    'San Francisco Restaurants',
                    'Acquerello']
}
```
2. As you can see, each key in this dictionary is a restaurant name, and each value is a list of information about that restaurant. The information in the list is laid out as follows:

<a id='properties_table'></a>
    
| Index | What it is | Example |
| :-- | :-- | :-- |
| 0 | url | https://guide.michelin.com/us/en/illinois/chicago/restaurant/acadia |
| 1 | number of stars | 2 |
| 2 | symbol for number of stars using asterisks | "&ast;&ast;" |
| 3 | address | "1722 Sacramento St., San Francisco, 94109, United States" |
| 4 | review | "One of the true gems of South Loop, Acadia..." | 
| 5 | price range | range(175, 276) |
| 6 | cuisine | "Italian,Contemporary" |
| 7 | part of which guide | "MICHELIN Guide California" |
| 8 | locale | "California" |
| 9 | location grouping | "San Francisco Restaurants" |
| 10 | restaurant name | "Acquerello" |
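
If remembering the index numbers gets tedious, one option is to give them names. The sketch below is only an illustration: the constants (`STARS_INDEX` and so on) and the `sample_properties` list are made up for this example and are not part of the provided data, but the list has the same layout as the table above.

``` python
# Hypothetical constants naming some positions from the table above
STARS_INDEX = 1
PRICE_RANGE_INDEX = 5
CUISINE_INDEX = 6

# A made-up entry with the same layout as a value in restaurants_dict
sample_properties = [
    "https://example.com/restaurant/sample",  # 0: url (placeholder)
    2,                                        # 1: number of stars
    "**",                                     # 2: star symbol
    "1 Example St., Sample City",             # 3: address (made up)
    "A short made-up review.",                # 4: review
    range(175, 276),                          # 5: price range
    "Italian,Contemporary",                   # 6: cuisine
    "MICHELIN Guide California",              # 7: guide
    "California",                             # 8: locale
    "San Francisco Restaurants",              # 9: location grouping
    "Sample Restaurant",                      # 10: name
]

stars = sample_properties[STARS_INDEX]
price_range = sample_properties[PRICE_RANGE_INDEX]
cuisines = sample_properties[CUISINE_INDEX].split(",")

print(stars)        # 2
print(price_range)  # range(175, 276)
print(cuisines)     # ['Italian', 'Contemporary']
```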

# [Jump to Question](#start_work)

In [2]:
# Give me a list of restaurant URLs

michelin_main_url = "https://guide.michelin.com"

def all_links():
    michelin_url = michelin_main_url + "/us/en/restaurants/3-stars-michelin/2-stars-michelin/1-star-michelin"
    num_pages = 9
    all_urls = []

    for i in range(num_pages):
        page_number_string = "/page/" + str(i + 1)
        page_url = michelin_url + page_number_string
        all_urls += get_restaurant_links(page_url)
    
    return all_urls

def get_restaurant_links(url):
    all_restaurants_page = requests.get(url)
    all_restaurants_soup = BeautifulSoup(all_restaurants_page.content, "html.parser")
    class_name = "col-md-6 col-lg-6 col-xl-3"
    all_restaurants = all_restaurants_soup.find_all("div", {"class": class_name})
    
    restaurant_urls = []
    
    for restaurant in all_restaurants:
        try:
            restaurant_block = restaurant.find("h3", {"class": "card__menu-content--title last pl-text pl-big"})
            restaurant_url = michelin_main_url + restaurant_block.find("a", href = True).get("href")
            restaurant_urls.append(restaurant_url)
        except AttributeError:
            restaurant_block = restaurant.find("h3", {"class": "card__menu-content--title pl-text pl-big"})
            restaurant_url = michelin_main_url + restaurant_block.find("a", href = True).get("href")
            restaurant_urls.append(restaurant_url)
    return restaurant_urls

In [3]:
class Stars(Enum):
    one = "m"
    two = "n"
    three = "o"
    
class Restaurant:
    def __init__(self, url):
        self.url = url
        self.get_details()
        
    def convert_stars(self, star):
        if star.name == "one":
            return 1, "*"
        elif star.name == "two":
            return 2, "**"
        else:
            return 3, "***"
        
    def get_details(self):
        page = requests.get(self.url)
        soup = BeautifulSoup(page.content, "html.parser")
        
        breadcrumb_items = soup.find_all(class_="breadcrumb-item")
        star_denotation = soup.find(class_="fa-michelin").text
        self.stars_count, self.star_denotation = self.convert_stars(Stars(star_denotation))
        
        self.address = soup.find(class_="restaurant-details__heading--list").find("li").text
        self.review = soup.find(class_="js-show-description-text").get_text(separator=" ").replace('\n', '').strip()
        self.cost = ""
        self.cuisine = ""

        for i, item in enumerate(breadcrumb_items):
            item_text = item.text
            if i == 0:
                self.guide_title = item_text
            elif i == 1:
                self.locale = item_text
            elif i == 2:
                self.location = item_text
            else:
                self.name = item.get_text(separator=" ").replace('\n', '').strip()

        subheading_text = soup.find("li", {"class": "restaurant-details__heading-price"}).text
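        # Strip whitespace (this character class also drops any literal '+')
        subheading = re.sub(r'[\s+]', '', subheading_text)

        # Price comes before the bullet, cuisine after it
        price = re.findall(r'.*(?=•)', subheading)
        category = re.findall(r'•(.*)', subheading)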

        if price:
            cost_soup = price[0]
            cost_soup = re.findall('.*(?=USD)', cost_soup)
            if cost_soup:
                self.cost = cost_soup[0]
            
        if category:
            self.cuisine = category[0]

In [4]:
def get_restaurant_info():
    urls = all_links()
    all_restaurants = []
    for link in urls:
        restaurant = Restaurant(link)
        all_restaurants.append(restaurant)
    return all_restaurants

In [5]:
restaurants_array = get_restaurant_info()

In [6]:
def convert_restaurant_array_to_df(restaurants_array):
    df_dict = {}

    # Column names come from the attribute names of a Restaurant object
    keys = list(restaurants_array[0].__dict__.keys())
    for restaurant in restaurants_array:
        values = list(restaurant.__dict__.values())
        df_dict[restaurant.name] = values
    
    restaurant_df = pd.DataFrame(df_dict).T
    restaurant_df.columns = keys
    
    return restaurant_df.drop("name", axis = 1)

def convert_string_to_range(string_range):
    lower_bound = 0
    upper_bound = 0

    # If it's a range
    if "-" in string_range:
        lower_bound_group = re.findall(".*(?=-)", string_range)
        upper_bound_group = re.findall(r"-(.*)", string_range)

        # If there's an item in lower_bound_group
        if lower_bound_group:
            try:
                lower_bound = int(lower_bound_group[0])
            except ValueError:
                pass
        if upper_bound_group:
            try:
                upper_bound = int(upper_bound_group[0]) + 1
            except ValueError:
                pass
    else:
        lower_bound = int(string_range)
        upper_bound = int(string_range) + 1
    
    return range(lower_bound, upper_bound)

def convert_restaurant_array_to_dict(restaurants_array):
    restaurants_dict = {}

    for restaurant in restaurants_array:
        values = list(restaurant.__dict__.values())
        # Index 5 holds the scraped cost string; convert it to a range object
        values[5] = convert_string_to_range(values[5])
        restaurants_dict[restaurant.name] = values
    
    return restaurants_dict

In [7]:
restaurants_df = convert_restaurant_array_to_df(restaurants_array)
restaurants_df = restaurants_df.sort_values(by = "stars_count", ascending = False)
restaurants_dict = convert_restaurant_array_to_dict(restaurants_array)
pprint.pprint(restaurants_dict)

{'Acadia': ['https://guide.michelin.com/us/en/illinois/chicago/restaurant/acadia',
            2,
            '**',
            '1639 S. Wabash Ave., Chicago, 60616, United States',
            'One of the true gems of South Loop, Acadia is the impassioned '
            'restaurant of the talented Ryan McCaskey. Pulling from his '
            'Vietnamese heritage as well as his travels in Maine, Chef '
            'McCaskey’s cooking is ambitious, precise and deliciously '
            'technical. Perhaps even more importantly, his kitchen’s '
            'commitment to that vision is palpable in every bite. Set between '
            'glassy apartment buildings and a small patch of grass, Acadia is '
            'a distinct stand-alone destination. The interior is decidedly '
            'elegant, with soaring ceilings and a soothing blend of warm '
            'neutrals, cool grays, as well as chocolate and sage peppered '
            'throughout the lofty space. Service is gracious an

                         "handblown water vases. It's all evidence of Chef "
                         'Daniel Humm’s masterful precision that extends '
                         'through the cuisine as it artfully unfolds before '
                         'each guest. The kitchen has a signature way with '
                         'delicate presentations; and the menu beams with '
                         'pristine ingredients that are handled with '
                         'impressive creativity. Courses are so well conceived '
                         'that they often seem as continuations of the same '
                         'dish; specialties tend to reflect a flair for the '
                         'dramatic that suit this luxurious room. The pasty '
                         'team further enhances your dining experience by '
                         'making sure that every meal ends on a high note. '
                         'Cocktails are a must and the beverage program has '


            'atop an array of cucumber and melon. Sonoma duck breast composed '
            "with stone fruit purée and the season's greenery is lean and "
            'velvety. Desserts, such as the caramel pot de crème with a double '
            'hit of almond crumble and ice cream, are simple perfection.',
            range(56, 99),
            'Contemporary,JapaneseContemporary',
            'MICHELIN Guide California',
            'California',
            'San Anselmo Restaurants',
            'Madcap'],
 'Madera': ['https://guide.michelin.com/us/en/california/menlo-park/restaurant/madera',
            1,
            '*',
            '2825 Sand Hill Rd., Menlo Park, 94025, United States',
            'This is a swanky spot for fine dining in the Rosewood Sand Hill '
            'hotel. The grand open kitchen, roaring fireplace and large '
            'outdoor patio with gorgeous views of the Santa Cruz mountains '
            'draw an affluent crowd of local techies. While lunch

                'thanks to its smoky outer layer giving way to melting '
                'richness. Finally, all the luxurious standards are perfectly '
                'executed here, including Santa Barbara uni, Hokkaido scallops '
                'and first-rate fatty tuna.',
                range(50, 121),
                'Japanese,Sushi',
                'MICHELIN Guide California',
                'California',
                'Encino Restaurants',
                'Shin Sushi'],
 'Shunji': ['https://guide.michelin.com/us/en/california/us-los-angeles/restaurant/shunji',
            1,
            '*',
            '12244 W. Pico Blvd., Los Angeles, 90064, United States',
            'Behind this notable counter is a chef whose experience runs deep. '
            'Japanese-born sushi maestro—Shunji Nakao—previously flashed his '
            "knife at Nobu Matsuhisa’s eponymous 1980's spot. He is also known "
            'for his work at Asanebo, which he opened along with his brothe

<a id='start_work'></a>
# Start Here
[Back to Prompt](#top)

Below is a preview of the first two items in `restaurants_dict`. As you can see, the key is the restaurant name, and the value is a list of properties defined [here](#properties_table). Walk through this problem step by step to end up with a working `michelin_filter` function.

Recall that our `michelin_filter` function will take in a price limit and the number of stars that you would like, and it should return a list of restaurant names that fit the criteria. 

Example below:
``` python
def michelin_filter(price_limit, number_of_stars):
    """Some Code Here"""
    return restaurants

# I'm filtering for 3-star restaurants with a price limit of $400
restaurants = michelin_filter(400, 3)
print(restaurants)
>>> ["Alinea", "Benu", "Per Se", "The French Laundry"]
```

In [8]:
# Run this cell to print the first two items in restaurants_dict
pprint.pprint(list(restaurants_dict.items())[:2])

[('Shin Sushi',
  ['https://guide.michelin.com/us/en/california/encino/restaurant/shin-sushi',
   1,
   '*',
   '16573 Ventura Blvd., Encino, 91436, United States',
   'Set in a nondescript shopping center, this highly pedigreed settler has '
   'managed to keep a low profile. But discerning locals know a great master '
   'when they see one, and Taketoshi Azumi is indeed the real deal. The chef '
   'has worked for two decades at top spots on both coasts and comes from a '
   'family of sushi connoisseurs. In fact, Shin is named for the Tokyo '
   'restaurant run by his late father, whose former sign now hangs behind the '
   'counter. Despite its pedigree, the vibe is affable and laid-back. Tables '
   'fill with as many diners ordering lunch combos as the omakase, and the '
   'friendly chef engages each customer as he slices their fish to order. His '
   'approach to shari is singular and highly personal, with a dense texture '
   'and mild flavor from sake lees vinegar. It makes a

Remember, you have access to the `restaurants_dict` dictionary, which you can operate on like this:

1. To get a certain properties list from the dictionary, key into it like this:
``` python
restaurants_dict["Eleven Madison Park"]
```
This gets you the properties list for Eleven Madison Park.

***
2. To loop through the keys in the dictionary, just write a for loop, naming the loop variable `key`.
``` python
# the key is really just the restaurant name, so we can assign 
# it to a new variable for no other reason than for clarity
for key in restaurants_dict:
    restaurant_name = key
    restaurant_properties_list = restaurants_dict[key]
```
Now we can operate on the restaurant's properties list.

***
3. To get the ith element of a list, index into the list, a lot like how you would key into a dictionary.
``` python
# in this example, I'm getting the price range of the restaurant, which is at index 5 of the properties list
price_range = restaurant_properties_list[5]
print(price_range)
>>> range(115, 300)
```
4. **New Concept Alert**: To add an element to the end of a list, use the list's `.append` method like this:
``` python
fruits_list = ["apple", "banana", "orange"] # assign a new variable to a list of fruits
fruits_list.append("durian") # This is where you use the .append method
print(fruits_list)
>>> ["apple", "banana", "orange", "durian"] # as you can see, "durian" has been added to fruits_list
```
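
One more thing worth knowing before you start: the price range at index 5 is a Python `range` object, not a pair of numbers. The sketch below shows a few ways to read it, using the same `range(115, 300)` as the example in step 3. The variable names `lowest_price` and `highest_price` are just for illustration.

``` python
price_range = range(115, 300)  # same shape as the example in step 3

# .start is the lower bound; .stop is one past the last value in the range
lowest_price = price_range.start
highest_price = price_range.stop - 1

print(lowest_price, highest_price)  # 115 299
print(price_range[0])               # 115, indexing works like a list
print(150 in price_range)           # True
print(300 in price_range)           # False, because stop is excluded
```

The scraping code earlier in this notebook adds 1 to the listed upper price when it builds each range, which is why the last value inside the range is the listed upper price.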


### Steps for this problem
These steps may seem intimidating, but the actual solution is only 10 lines of code.
1. We need something to store our list of restaurant names that fit our criteria. We will eventually return this list of restaurant names. 
    - Define a new variable named `restaurant_names` and set it to an empty list.
2. We will now look through the `restaurants_dict` dictionary, entry by entry, and see if there are any matches for our price limit and number of stars. After finding matches, we will append them to our list.
    - Loop through restaurants_dict.
    - Inside the loop, assign a new variable `restaurant_name` (note, this is a singular restaurant name as opposed to your growing list of restaurant_names) to `key`, since the keys of the restaurants_dict are just names of the restaurants
    - You want to get the restaurant properties from this restaurants dictionary, so you can check if price range and stars match your inputs.
        - Assign a variable named `restaurant_properties` to the value of the restaurant dictionary queried at `key`. 
        - Recall that you can get values from dictionaries with the following syntax: `dictionary_name[key]`
    - Now it's time to get price range and number of stars from the restaurant. 
        - Assign the element at index 5 of `restaurant_properties` to a variable named `price_range`.
        - Assign the element at index 1 of `restaurant_properties` to a variable named `restaurant_number_of_stars`.
    - It's time to finally check if:
        - The restaurant's number of stars matches the function's inputted number of stars
        - Our price limit is greater than or equal to the lower-bound of the price range
            - Why? Say a restaurant's prices range from 115 to 400 dollars, and the user enters 300 as their price limit. Something on the menu fits that budget, so you only need to check that 300 is at least 115.
    - After checking these things, if there's a match, we want to `.append` the restaurant name to our list of `restaurant_names`
    - Finally, once the loop has gone through every restaurant, return `restaurant_names`.
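
If the steps feel abstract, here is the same loop, check, and append pattern on a much smaller problem. Everything in this sketch is made up for illustration (`fruits_dict`, `fruit_filter`, and the prices), so it is not the homework solution, but the shape of the code is very similar.

``` python
# Made-up data: fruit name -> [color, price range in dollars]
fruits_dict = {
    "apple": ["red", range(1, 4)],
    "cherry": ["red", range(5, 9)],
    "banana": ["yellow", range(1, 3)],
}

def fruit_filter(price_limit, color):
    fruit_names = []
    for key in fruits_dict:
        fruit_name = key
        fruit_properties = fruits_dict[key]
        fruit_color = fruit_properties[0]
        price_range = fruit_properties[1]
        # Keep the fruit if the color matches and its cheapest price fits the budget
        if fruit_color == color and price_limit >= price_range.start:
            fruit_names.append(fruit_name)
    return fruit_names

print(fruit_filter(3, "red"))   # ['apple']
print(fruit_filter(10, "red"))  # ['apple', 'cherry']
```

Notice that the `return` sits outside the loop, so the function only returns after every entry has been checked.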

In [54]:
def michelin_filter(price_limit, number_of_stars):
    restaurant_names = []
    for ... in ...:
        restaurant_name = ...
        restaurant_properties = ...
        price_range = ...
        restaurant_number_of_stars = ...
        if ... and ...:
            ...
    return ...

"""DO NOT TOUCH BELOW"""
try:
    assert michelin_filter(50, 2) == ['Jônt']
    assert len(michelin_filter(500, 2)) == 32
    assert michelin_filter(200, 3) == ['Quince', 'Eleven Madison Park', 'Le Bernardin']
    print("Nice! Your function is working great.")
except:
    print("Looks like your michelin filter function isn't working as intended.")
    print(michelin_filter(100, 2))

SyntaxError: can't assign to Ellipsis (<ipython-input-54-739205d8ab4c>, line 3)

In [48]:
# Test your function on any price limit, and any number of stars from 1-3
michelin_filter(100, 2)

['Commis',
 'Campton Place',
 'Jean-Georges',
 'Gabriel Kreuther',
 'Aquavit',
 'Jônt']

## I want more
It's nice to have a function that gets me the restaurant names, but I want all of the information available to me in the dictionary. The good thing is that I have access to the dictionary, and my function `michelin_filter` returns a list of restaurant names, which are exactly the keys of `restaurants_dict`.

Write a function `more_information_michelin_filter` that returns a subset of `restaurants_dict` composed of all of the restaurants that would be returned by `michelin_filter`.
- Your arguments will be the same as your arguments in `michelin_filter`.
- This solution only takes 4 lines, and includes a loop to add elements to a new dictionary. Call this new dictionary `new_dict`. 
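
As a warm-up, here is how you might build a smaller dictionary out of a bigger one when you already have a list of keys. The data and names below (`fruits_dict`, `selected_names`, `subset`) are invented for this sketch.

``` python
# Made-up data: fruit name -> [color, price range in dollars]
fruits_dict = {
    "apple": ["red", range(1, 4)],
    "cherry": ["red", range(5, 9)],
    "banana": ["yellow", range(1, 3)],
}
selected_names = ["apple", "banana"]

# Start with an empty dictionary and copy over one entry per selected key
subset = {}
for name in selected_names:
    subset[name] = fruits_dict[name]

print(subset)
# {'apple': ['red', range(1, 4)], 'banana': ['yellow', range(1, 3)]}
```

Assigning to `subset[name]` creates the key if it doesn't exist yet, which is all you need to grow a dictionary inside a loop.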

In [52]:
def more_information_michelin_filter(price_limit, number_of_stars):
    # Replace this comment with your 4-line solution
    
    """DO NOT TOUCH BELOW"""
    to_be = "is" if len(new_dict) == 1 else "are"
    plural = "" if len(new_dict) == 1 else "s"
    print(f"There {to_be} {len(new_dict)} {number_of_stars}-star restaurant{plural} with selections under a budget of {price_limit}.")
    print("Specifics listed below")
    return new_dict

try:
    assert len(more_information_michelin_filter(100, 2)) == len(michelin_filter(100, 2))
    print("Good job! You've finished today's assignment.")
except:
    print("Your function isn't working as intended")

Your function isn't working as intended
