# Phase 3 - Supervised Learning

## Problem 7

### `python ok -q 07 -u` quiz

Q: What does a `predictor` function returned by `find_predictor` do?
Choose the number of the correct choice:

0. takes in a restaurant and returns the predicted location of
   that restaurant
1. takes in a restaurant and returns the predicted rating for that
   restaurant
2. returns the `r_squared` value

**Ans**: 1

Q: What does the list `xs` represent?
Choose the number of the correct choice:

0. the extracted feature value for each restaurant in restaurants
1. the restaurants reviewed by user
2. the names of restaurants in restaurants
3. the restaurants in restaurants

**Ans**: 0

Q: What does the list `ys` represent?
Choose the number of the correct choice:

0. the average rating for the restaurants in restaurants
1. user's ratings for the restaurants in restaurants
2. the names for the restaurants in restaurants
3. the names for the restaurants reviewed by user

**Ans**: 1

### Implementation of Problem 7

We start by computing the pieces of the least-squares fit: the sums of squares (`Sxx`, `Syy`, `Sxy`), the regression coefficients (`a` and `b`), and $R^2$.

In [None]:
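# Compute each mean once instead of once per element.
x_bar, y_bar = mean(xs), mean(ys)

# Sums of squares used by the least-squares formulas.
Sxx = sum((xi - x_bar) ** 2 for xi in xs)
Syy = sum((yi - y_bar) ** 2 for yi in ys)
Sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))

b = Sxy / Sxx            # slope
a = y_bar - b * x_bar    # intercept
r_squared = Sxy ** 2 / (Sxx * Syy)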
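To sanity-check these formulas, here is a minimal sketch on made-up data that lies exactly on the line $y = 2x$. The helper name `linear_fit` and the sample numbers are assumptions for illustration only, and `mean` is taken from the standard library rather than from the project.

```python
from statistics import mean

def linear_fit(xs, ys):
    """Hypothetical helper: return (a, b, r_squared) for the line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    Sxx = sum((x - x_bar) ** 2 for x in xs)
    Syy = sum((y - y_bar) ** 2 for y in ys)
    Sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = Sxy / Sxx                        # slope
    a = y_bar - b * x_bar                # intercept
    r_squared = Sxy ** 2 / (Sxx * Syy)   # 1.0 means a perfect linear fit
    return a, b, r_squared

# Made-up data where every point satisfies y = 2x.
a, b, r_squared = linear_fit([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(a, b, r_squared)  # 0.0 2.0 1.0
```

Because the points are perfectly linear, the intercept is 0, the slope is 2, and $R^2$ is 1.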

The tricky part is returning the `predictor` function, which represents the line $y = a + bx$. Two things to note:

1. The `predictor` function takes in a `restaurant`, not a number.
    * Each `x` in `xs` is the result of calling `feature_fn(restaurant)`, so `predictor` must apply `feature_fn` to its argument before using it as $x$.
2. The `ys` are the ratings the user actually gave. The `predictor` returns a predicted rating instead, computed from the fitted line.

In [None]:
def predictor(restaurant):
    return b * feature_fn(restaurant) + a

Finally, we return both the `predictor` function and the `r_squared` value.

In [None]:
return predictor, r_squared
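The returned `predictor` keeps working after `find_predictor` has finished because it is a closure: it remembers `a`, `b`, and `feature_fn` from the enclosing call. The following is only a sketch of that idea. The helper `make_predictor`, the dictionary-based restaurants, and the coefficient values are all assumptions made up for illustration, not the project's abstractions.

```python
def make_predictor(a, b, feature_fn):
    """Hypothetical helper: build a predictor for the line y = a + b*x."""
    def predictor(restaurant):
        # a, b, and feature_fn are captured from the enclosing call.
        return b * feature_fn(restaurant) + a
    return predictor

# Stand-in restaurant representation (an assumption): a plain dict.
def price(restaurant):
    return restaurant["price"]

predictor = make_predictor(a=1.0, b=0.5, feature_fn=price)
predicted = predictor({"name": "D", "price": 4})
print(predicted)  # 0.5 * 4 + 1.0 = 3.0
```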

## Problem 8

### `python ok -q 08 -u` quiz

Q: In `best_predictor`, what does the variable `reviewed` represent?
Choose the number of the correct choice:

0. a list of ratings for restaurants reviewed by the user
1. a list of restaurants reviewed by the user
2. a list of all possible restaurants

**Ans**: 1

Q: Given a `user`, a list of `restaurants`, and a feature function, what does `find_predictor` from Problem 7 return?
Choose the number of the correct choice:

0. a restaurant
1. a `predictor` function and its `r_squared` value
2. a `predictor` function
3. an `r_squared` value

**Ans**: 1

Q: After computing a list of `[predictor, r_squared]` pairs,
which predictor should we select?
Choose the number of the correct choice:

0. an arbitrary `predictor`
1. the `predictor` with the lowest `r_squared` value
2. the `predictor` with the highest `r_squared` value
3. the first `predictor` in the list

**Ans**: 2

### Implementation of Problem 8

Notice in the problem description:

"It computes a predictor function **for each feature function**..."

Since `feature_fns` contains several feature functions, we create one `predictor` for each `feature_fn`.

In [None]:
predictors_and_rsquared = [find_predictor(user, reviewed, feature_fn) for feature_fn in feature_fns]

Recall from the previous problem that `find_predictor` returns two things: a `predictor` and its `r_squared` value. A Python function that returns two comma-separated values actually returns them packed into a single tuple.

In [11]:
def lol():
    return 1, 2
a = lol()
type(a)

tuple

In [13]:
a[0] # We can access the elements in the tuple via indexing!

1

In [14]:
a[1]

2
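Besides indexing, a returned tuple can be unpacked directly into separate names. A small sketch, where `fit_stub` is a hypothetical stand-in that returns two values in the same way `find_predictor` does:

```python
def fit_stub():
    """Hypothetical stand-in: returns two values, which Python packs into a tuple."""
    return "predictor placeholder", 0.85

pair = fit_stub()
print(isinstance(pair, tuple))  # True

# Unpacking assigns each element of the tuple to its own name.
pred, r2 = fit_stub()
print(pred == pair[0], r2 == pair[1])  # True True
```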

`predictors_and_rsquared` is therefore a list of tuples, each holding a `predictor` and its `r_squared` value.

In [None]:
predictors_and_rsquared = [(pred_fn, 0.85), (pred_fn, 0.99), ...]  # Illustration only, not real data

Since we want the predictor with the greatest `r_squared`, use the `max` function with a `key` that selects index `[1]` of each tuple.
* `max` returns the whole tuple, so we still have to select index `[0]` of the result: we want the predictor, not the `r_squared` value.

In [None]:
return max(predictors_and_rsquared, key=lambda pair: pair[1])[0]
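The selection step can be tried in isolation. In this sketch the predictors are hypothetical constant functions built by `constant_predictor`, and the `r_squared` values are invented, so only the `max`/`key` mechanics carry over to the project.

```python
def constant_predictor(value):
    """Hypothetical stand-in: a predictor that ignores its argument."""
    def predictor(restaurant):
        return value
    return predictor

# Invented (predictor, r_squared) pairs for illustration.
candidates = [
    (constant_predictor(3.0), 0.85),
    (constant_predictor(4.5), 0.99),
    (constant_predictor(2.0), 0.40),
]

# key tells max what to compare; max still returns the whole tuple.
best_pair = max(candidates, key=lambda pair: pair[1])
best = best_pair[0]

print(best_pair[1])            # 0.99
print(best("any restaurant"))  # 4.5
```

If two pairs share the highest `r_squared`, `max` returns the first one it encounters.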

## Problem 9

### `python ok -q 09 -u` quiz

Q: `rate_all` returns a dictionary. What are the keys of this dictionary?
Choose the number of the correct choice:

0. restaurants
1. restaurant ratings
2. restaurant names

**Ans**: 2

Q: What are the values of the returned dictionary?
Choose the number of the correct choice:

0. lists - list of all restaurant ratings
1. numbers - a mix of user ratings and predicted ratings
2. numbers - mean restaurant ratings
3. numbers - predicted ratings only
4. numbers - user ratings only

**Ans**: 1

Q: In `rate_all`, what does the variable `reviewed` represent?
Choose the number of the correct choice:

0. a list of all possible restaurants
1. a list of restaurants reviewed by the user
2. a list of ratings for restaurants reviewed by the user

**Ans**: 1

### Implementation of Problem 9

`rate_all` returns a dictionary, so we start by creating an empty one.

In [None]:
result = {}

Then iterate through the restaurants. For each restaurant:

1. If the restaurant is present in `reviewed`, use the rating returned by the `user_rating` function.
2. Otherwise, use the rating returned by the `predictor` function.

In [None]:
for r in restaurants:
    name = restaurant_name(r)
    if r in reviewed:
        result[name] = user_rating(user, name)  # the user's actual rating
    else:
        result[name] = predictor(r)             # the predicted rating

return result
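To see the mix of actual and predicted ratings end to end, here is a self-contained sketch. It does not use the project's abstractions: restaurants are plain dicts, the user's reviews are a `ratings` dict keyed by restaurant name, and `rate_all_sketch` and the toy predictor are hypothetical names introduced only for this example.

```python
def rate_all_sketch(ratings, restaurants, predictor):
    """Hypothetical version of rate_all.

    ratings: dict mapping restaurant name -> the user's rating (stand-in
             for the user abstraction and user_rating).
    """
    result = {}
    for r in restaurants:
        name = r["name"]
        if name in ratings:
            result[name] = ratings[name]   # reviewed: keep the user's rating
        else:
            result[name] = predictor(r)    # not reviewed: predict a rating
    return result

restaurants = [
    {"name": "A", "price": 1},
    {"name": "B", "price": 2},
    {"name": "C", "price": 3},
]
ratings = {"A": 4.0}  # the user has only reviewed restaurant A

def toy_predictor(restaurant):
    # Invented rule for illustration: predicted rating = price + 1.
    return restaurant["price"] + 1.0

print(rate_all_sketch(ratings, restaurants, toy_predictor))
# {'A': 4.0, 'B': 3.0, 'C': 4.0}
```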

## Problem 10

### `python ok -q 10 -u` quiz

Q: Given a `restaurant`, what does `restaurant_categories` in `abstractions.py` return?
Choose the number of the correct choice:

0. a single string (category)
1. a list of numbers (ratings)
2. a list of strings (categories)
3. a single number (rating)

**Ans**: 2

Q: When does a restaurant match a search query?
Choose the number of the correct choice:

0. if the query string is one of the restaurant's categories
1. if the query string is mentioned in the restaurant's reviews
2. if the query string is equal to the restaurant's categories
3. if the query string is a substring of the restaurant's name

**Ans**: 0

Q: What type of object does `search` return?
Choose the number of the correct choice:

0. a dictionary that maps restaurant categories (strings) to restaurants
1. a list of restaurant names (strings)
2. a dictionary that maps restaurant names (strings) to restaurants
3. a list of restaurants

**Ans**: 3

### Implementation of Problem 10

Simply iterate through each restaurant in `restaurants` and include it if the string `query` is one of the restaurant's categories.

In [None]:
return [r for r in restaurants if query in restaurant_categories(r)]
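Because the categories are a list of strings, `query in ...` tests for an exact match against one list element, not for a substring. The sketch below illustrates this with stand-ins: `categories_of` is a hypothetical replacement for `restaurant_categories`, and the restaurants are made-up dicts.

```python
def categories_of(restaurant):
    """Hypothetical stand-in for restaurant_categories: returns a list of strings."""
    return restaurant["categories"]

def search_sketch(query, restaurants):
    # `in` on a list checks whether query equals one of its elements.
    return [r for r in restaurants if query in categories_of(r)]

restaurants = [
    {"name": "A", "categories": ["Thai", "Noodles"]},
    {"name": "B", "categories": ["Pizza"]},
]

print([r["name"] for r in search_sketch("Thai", restaurants)])  # ['A']
print(search_sketch("Tha", restaurants))                        # []
```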