# Important note!

Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [None]:
YOUR_ID = "" # Please enter your GT login, e.g., "rvuduc3" or "gtg911x"
COLLABORATORS = [] # list of strings of your collaborators' IDs

In [None]:
import re

RE_CHECK_ID = re.compile (r'''[a-zA-Z]+\d+|[gG][tT][gG]\d+[a-zA-Z]''')
assert RE_CHECK_ID.match (YOUR_ID) is not None

collab_check = [RE_CHECK_ID.match (i) is not None for i in COLLABORATORS]
assert all (collab_check)

del collab_check
del RE_CHECK_ID
del re

**Jupyter / IPython version check.** The following code cell verifies that you are using the correct version of Jupyter/IPython.

In [None]:
import IPython
assert IPython.version_info[0] >= 3, "Your version of IPython is too old, please update it."

# Part 3: Using the Yelp! API (12 points)

Let's use the Yelp! API to find information about restaurants in various cities. The goal is to find, among the highly rated restaurants in Atlanta, the 5 with the largest numbers of reviews on Yelp!.

## Setup:

1. Go to http://www.yelp.com/developers and create an account. You can use your existing Yelp! account or create a new account by providing your name, email address, and zip code.
2. Go to http://www.yelp.com/developers/manage_api_keys to generate your app key/secret and a token by providing a website URL (it can be anything, for example a dummy URL or the course page http://cse6040.gatech.edu) and describing why you want to use the APIs -- in this case, to do your homework! Write down your **“Consumer Key”**, **“Consumer Secret”**, **“Token”**, and **“Token Secret”** and enter them in the appropriate code cell below.
3. Go to http://www.yelp.com/developers/documentation and learn how to build the URLs needed to use the Yelp! Search API and Business API.
4. You will also learn how to install a package in `Python`. We will use the tool “pip”, which should already be installed with Anaconda. If not, follow this link: https://pip.pypa.io/en/latest/installing.html, to install “pip”.
5. For any package you want to install in python, you can type **“pip install {package_name}”** at the command prompt or `!pip install {package_name}` in a code cell within Jupyter.
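
Before installing anything, it can help to check whether a package is already importable. The following is a minimal sketch and not part of the assignment: the helper name `is_installed` is made up for illustration, and it relies only on the standard-library function `importlib.util.find_spec`.

```python
import importlib.util

def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None

# Report which of the packages used in this notebook are available.
for name in ['json', 'oauth2', 'requests']:
    if is_installed(name):
        print(name, '-> found')
    else:
        print(name, '-> missing (try: pip install ' + name + ')')
```

Packages reported as missing are the ones worth passing to `pip install`.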

**Exercise 1** (1 point). Per item (2) above, enter your credentials in the following code cell.

In [None]:
CONSUMER_KEY = "" # Consumer key
CONSUMER_SECRET = "" # Consumer secret
TOKEN = "" # Token
TOKEN_SECRET = "" # Token secret

# YOUR CODE HERE
raise NotImplementedError()

We will need the packages below. If you get an error when trying to run the following code cell, try running the installation commands shown below.

In [None]:
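import oauth2
import json
import urllib.parse  # the helper below calls urllib.parse.urlencode
import requests      # used by the helper below to send the HTTP request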

In [None]:
# Uncomment these if you get an error above
#!pip install oauth2
#!pip install requests
# (json and urllib are part of the Python standard library and need no installation.)

Here is a helper function for making a search request via the Yelp! API.

In [None]:
import re
import oauth2
import requests

YELP_API_SEARCH_URL_BASE = 'http://api.yelp.com/v2/search'

def yelp_search (url_params,
                 key=CONSUMER_KEY,
                 secret=CONSUMER_SECRET,
                 token=TOKEN,
                 token_secret=TOKEN_SECRET,
                 url_base=YELP_API_SEARCH_URL_BASE):
    """What does this code do? (see exercise below)"""
    
    url = url_base + '?' + urllib.parse.urlencode (url_params)
    
    oauth_request = oauth2.Request ('GET', url, {})
    oauth_request.update(
        {
            'oauth_nonce': oauth2.generate_nonce(),
            'oauth_timestamp': oauth2.generate_timestamp(),
            'oauth_token': token,
            'oauth_consumer_key': key
        }
    )
    oauth2_consumer = oauth2.Consumer (key, secret)
    oauth2_token = oauth2.Token (token, token_secret)
    oauth_request.sign_request (oauth2.SignatureMethod_HMAC_SHA1 (),
                                oauth2_consumer, oauth2_token)
    signed_url = oauth_request.to_url ()
    
    response = requests.get (signed_url)
    assert response is not None
    assert re.search ('application/json', response.headers['Content-Type']) is not None
    return response.json ()

**Exercise 2** (2 points). Read the code above and, in your own words, explain what it does.

> Feel free to read the documentation for the various bits and pieces. You may also find the code cells below, which use the function, helpful. Lastly, your explanation does not have to be too detailed; a short paragraph's worth of text should suffice.
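
As a starting point for reading the helper, here is a small sketch of its first step only: turning a dictionary of query parameters into a URL. It uses the same base URL and the same `urllib.parse.urlencode` call as the helper; the parameter values are sample inputs, and the signing and request steps are left out.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

url_base = 'http://api.yelp.com/v2/search'
params = {'term': 'restaurants', 'location': 'Atlanta, GA', 'limit': 20}

# urlencode percent-encodes each value and joins the pairs with '&'.
url = url_base + '?' + urlencode(params)
print(url)
# http://api.yelp.com/v2/search?term=restaurants&location=Atlanta%2C+GA&limit=20

# Decoding the query string recovers the original values (as lists of strings).
decoded = parse_qs(urlsplit(url).query)
print(decoded['location'])
# ['Atlanta, GA']
```

Note how the comma and the space in `'Atlanta, GA'` are encoded so that the value can travel safely inside a URL.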

YOUR ANSWER HERE

We can now write some simple code to carry out a Yelp! search.

In [None]:
params = {'term': 'restaurants',
          'location': 'Atlanta, GA',
          'limit': 20,
          'sort': 2}
result = yelp_search (params)

print (result)

Let's write this output to a file.

In [None]:
with open ('first20.json', 'w') as f_out:
    json.dump (result, f_out, sort_keys=True, indent=2)
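
To see what `json.dump` and `json.load` do without calling the API, here is a minimal round-trip sketch. The dictionary `sample` is a hypothetical stand-in for a search result: the `businesses` key is the one the checking cells in this notebook use, while `review_count` and `total` are assumed field names that you should confirm against your own output.

```python
import json
import os
import tempfile

# Hypothetical stand-in for a search result; a real response has more fields.
sample = {'businesses': [{'name': 'Aviva by Kameel', 'review_count': 138},
                         {'name': 'Purnima', 'review_count': 43}],
          'total': 2}

# Write to a temporary location so no notebook files are overwritten.
path = os.path.join(tempfile.mkdtemp(), 'sample.json')
with open(path, 'w') as f_out:
    json.dump(sample, f_out, sort_keys=True, indent=2)

# Read the file back and inspect its top-level structure.
with open(path, 'r') as f_in:
    loaded = json.load(f_in)

print(sorted(loaded.keys()))
print(len(loaded['businesses']))
```

Opening the saved file in a text editor is often the quickest way to learn which fields a response contains.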

**Exercise 3** (3 points). Write some code to search for the next 20 results. Save these results in a file called **`next20.json`**.

> Hint: You may need to read the [Yelp! Search API's documentation](https://www.yelp.com/developers/documentation/v2/overview) to set up the correct query parameters.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
with open ('next20.json', 'r') as next20_fp:
    next20 = json.load (next20_fp)
    
assert len (next20['businesses']) == 20

# Additional instructor test code will go here.
# Feel free to write your own test code here as
# you debug.

**Exercise 4** (3 points). For each of the 40 highest-rated restaurants you collected, get the number of reviews it has received. Create a text file named **`40restaurants.csv`** to store the results. In particular, write the restaurant names and their numbers of reviews to this file, one line per restaurant, comma-delimited, with higher-rated restaurants first.

For example:
```
Aviva by Kameel,138
Purnima,43
......
```
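
One detail to watch: the checking cell for this exercise splits each line on commas and expects exactly two fields, so a restaurant name that itself contains a comma would fail that check. The sketch below shows one possible way to handle this, namely dropping commas from the name. This choice is an assumption and may not be what the instructors intend; the helper `to_line` and the second record are made up for illustration.

```python
# Hypothetical (name, review_count) pairs; the second name contains a comma.
records = [('Aviva by Kameel', 138), ('Example Grill, Midtown', 57)]

def to_line(name, count):
    """Format one record as 'name,count', dropping commas from the name."""
    return '{},{}'.format(name.replace(',', ''), count)

lines = [to_line(name, count) for name, count in records]
for line in lines:
    print(line)

# Mirror the notebook's own check: every line must split into two fields.
print(all(len(line.split(',')) == 2 for line in lines))
```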

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
with open ('40restaurants.csv', 'r') as rest40_fp:
    rest40 = [l.strip () for l in rest40_fp.readlines ()]
    
assert len (rest40) == 40
assert all ([len (r.split (',')) == 2 for r in rest40])

rest40_names = [r.split (',')[0] for r in rest40]
rest40_numrevs = [r.split (',')[1] for r in rest40]
assert all ([k.isdigit () for k in rest40_numrevs])

# Additional instructor test code will go here.
# Feel free to write your own test code here as
# you debug.

**Exercise 5** (3 points). From the 40 restaurants you collected, get the 5 restaurants with the most reviews. Create a text file named **`40restaurants_top_review_count.csv`**, and write to this file the names of these 5 restaurants and their numbers of reviews, in descending order of the number of reviews, one line per restaurant, comma-delimited.

For example:
```
Antico Pizza,1622
Fox Bros. Bar-B-Q,1168
......
```
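
As a reminder of how to pick the largest few items from a list, here is a toy sketch on made-up data. The records and the value of `k` are hypothetical; adapting the idea to your collected results is left to you.

```python
# Hypothetical (name, review_count) pairs, not real Yelp! data.
records = [('A', 120), ('B', 45), ('C', 980), ('D', 300), ('E', 15), ('F', 610)]

k = 3
# Sort by the count (second element), largest first, then keep the first k.
top_k = sorted(records, key=lambda rec: rec[1], reverse=True)[:k]
print(top_k)
# [('C', 980), ('F', 610), ('D', 300)]
```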

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
with open ('40restaurants_top_review_count.csv', 'r') as top_revd_fp:
    top_revd = [l.strip () for l in top_revd_fp.readlines ()]
    
assert len (top_revd) == 5
assert all ([len (r.split (',')) == 2 for r in top_revd])

top_revd_names = [r.split (',')[0] for r in top_revd]
top_revd_numrevs = [r.split (',')[1] for r in top_revd]
assert all ([k.isdigit () for k in top_revd_numrevs])

# Additional instructor test code will go here.
# Feel free to write your own test code here as
# you debug.