<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

## Python Review With Movie Data

_Authors: Kiefer Katovich and Dave Yerrington (San Francisco)_

---

In this lab, you'll be using the [IMDb](http://www.imdb.com/) `movies` list below as your data set. 

This lab is designed to help you practice iteration and functions in particular. The standard questions are gentler, while the challenge questions are suited to advanced/expert Python students or those with prior programming experience.

Every question requires writing a function and using iteration. You should print out a test of each function you write.


### 1) Load the provided list of `movies` dictionaries.

In [1]:
# List of movies dictionaries:

movies = [
{
"name": "Usual Suspects", 
"imdb": 7.0,
"category": "Thriller"
},
{
"name": "Hitman",
"imdb": 6.3,
"category": "Action"
},
{
"name": "Dark Knight",
"imdb": 9.0,
"category": "Adventure"
},
{
"name": "The Help",
"imdb": 8.0,
"category": "Drama"
},
{
"name": "The Choice",
"imdb": 6.2,
"category": "Romance"
},
{
"name": "Colonia",
"imdb": 7.4,
"category": "Romance"
},
{
"name": "Love",
"imdb": 6.0,
"category": "Romance"
},
{
"name": "Bride Wars",
"imdb": 5.4,
"category": "Romance"
},
{
"name": "AlphaJet",
"imdb": 3.2,
"category": "War"
},
{
"name": "Ringing Crime",
"imdb": 4.0,
"category": "Crime"
},
{
"name": "Joking muck",
"imdb": 7.2,
"category": "Comedy"
},
{
"name": "What is the name",
"imdb": 9.2,
"category": "Suspense"
},
{
"name": "Detective",
"imdb": 7.0,
"category": "Suspense"
},
{
"name": "Exam",
"imdb": 4.2,
"category": "Thriller"
},
{
"name": "We Two",
"imdb": 7.2,
"category": "Romance"
}
]

---

### 2) Filtering data by IMDb score.

#### 2.1)

Write a function that:

1) Accepts a single movie dictionary from the `movies` list as an argument.
2) Returns `True` if the IMDb score is greater than 5.5.

#### 2.2 [Challenge])

Write a function that:

1) Accepts the `movies` list and a specified category.
2) Returns `True` if the average score of the category is higher than the average score of all movies.

In [2]:
def movie_selector(movie, score_limit=5.5):
    """
    Accepts a single movie dict and returns True if its IMDb score is greater than score_limit.
    
    Parameters
    ---------
    movie : dict
        Dict holding information about movie in format {name, imdb score, category}
        
    score_limit : float
        Score_limit to assess imdb score against.
    
    Returns
    -------
    boolean
        If imdb score > score_limit return True, else return False.
    """
    output = movie['imdb'] > score_limit 
    return output

In [3]:
print('Example Use')
print('Running through all movies...\n')
for movie in movies:
    name = movie['name']
    print(name, movie_selector(movie))

Example Use
Running through all movies...

Usual Suspects True
Hitman True
Dark Knight True
The Help True
The Choice True
Colonia True
Love True
Bride Wars False
AlphaJet False
Ringing Crime False
Joking muck True
What is the name True
Detective True
Exam False
We Two True
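
Since `movie_selector` returns a boolean, it can also serve as a predicate when building a filtered list. A small sketch of that idea, using a hypothetical three-movie `sample_movies` list and a condensed copy of `movie_selector` so that it runs on its own:

```python
# Hypothetical subset of the lab's `movies` list, so the sketch is self-contained.
sample_movies = [
    {"name": "Hitman", "imdb": 6.3, "category": "Action"},
    {"name": "Bride Wars", "imdb": 5.4, "category": "Romance"},
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
]


def movie_selector(movie, score_limit=5.5):
    """Return True if the movie's IMDb score is greater than score_limit."""
    return movie["imdb"] > score_limit


# Keep only the names of movies that pass the default 5.5 threshold.
passing = [movie["name"] for movie in sample_movies if movie_selector(movie)]
print(passing)
# ['Hitman', 'Dark Knight']

# The threshold can be overridden per call.
high_rated = [m["name"] for m in sample_movies if movie_selector(m, score_limit=8)]
print(high_rated)
# ['Dark Knight']
```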


In [4]:
#Challenge
def average_by_category(movies,category):
    """
    Accepts list of movie dicts and a category. Returns average imdb score of category.
    
    Parameters:
    -----------
    movies : list
        List of dicts holding information about movie in format {name, imdb score, category}
    category : str
        Movies category to be averaged over
        
    Examples:
    ---------
    
    >>> average_by_category(movies, 'suspense')
    8.1
    
    >>> average_by_category(movies, 'comedy')
    7.2
    
    Returns
    -------
    score : float
        Average score of movies matching category
        
    """
    if category is not None:
        category = category.lower()
    ratings = []
    for movie in movies:
        if category is None or movie['category'].lower() == category:
            ratings.append(movie['imdb'])
    score = sum(ratings) / len(ratings)
    return score

def average_all_movies(movies):
    """
    Accepts list of movie dicts. Returns average imdb score of all movies.
    
    Parameters:
    -----------
    movies : list
        List of dicts holding information about movie in format {name, imdb score, category}
    
    Returns
    -------
    score : float
        Average score of all movies
        
    """
    score = average_by_category(movies,category=None)
    return score

def better_than_average(movies, category):
    """
    Accepts list of movie dicts and a category. Returns True if the category's average imdb rating is greater
    than the average imdb rating for all movies.
    
    Parameters:
    -----------
    movies : list
        List of dicts holding information about movie in format {name, imdb score, category}
        
    category : str
        Movies category to be reviewed
        
    Examples:
    ---------
    
    >>> better_than_average(movies, 'suspense')
    True
    
    >>> better_than_average(movies, 'comedy')
    True
    
    >>> better_than_average(movies, 'thriller')
    False
    
    Returns
    -------
    boolean
        If average imdb score for category > average imdb score for all movies return True, else return False.
        
    """
    return average_by_category(movies, category) > average_all_movies(movies)

In [5]:
print('Example Use')
better_than_average(movies, 'suspense')

Example Use


True
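
One edge case worth noting: `average_by_category` divides by `len(ratings)`, so a category with no matching movies raises `ZeroDivisionError`. Below is a minimal sketch of one way to guard against that, assuming that returning `None` for an unknown category is acceptable. The helper name `safe_average_by_category` and the `sample_movies` list are illustrative and not part of the lab.

```python
# Hypothetical three-movie subset of the lab's `movies` list.
sample_movies = [
    {"name": "Usual Suspects", "imdb": 7.0, "category": "Thriller"},
    {"name": "Exam", "imdb": 4.2, "category": "Thriller"},
    {"name": "Hitman", "imdb": 6.3, "category": "Action"},
]


def safe_average_by_category(movies, category=None):
    """Return the mean IMDb score for a category, or None if nothing matches.

    Passing category=None averages over every movie.
    """
    wanted = None if category is None else category.lower()
    ratings = [
        movie["imdb"]
        for movie in movies
        if wanted is None or movie["category"].lower() == wanted
    ]
    if not ratings:  # avoid dividing by zero for an unknown category
        return None
    return sum(ratings) / len(ratings)


print(safe_average_by_category(sample_movies, "thriller"))
print(safe_average_by_category(sample_movies, "western"))  # no match, prints None
```

A caller can then check for `None` before comparing against the overall average.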

---

### 3) Creating subsets by numeric condition.

#### 3.1)

Write a function that:

1) Accepts the list of movies and a specified IMDb score.
2) Returns the sublist of movies that have scores greater than the one specified.

#### 3.2 [Expert])

Write a function that:

1) Accepts the `movies` list as an argument.
2) Returns the `movies` list sorted first by category, ordered by each category's average score, and then by movie within each category, ordered by individual IMDb score.

In [6]:
#3.1
def movie_sublist(movies, imdb_score):
    """
    Returns a list of movies that have an imdb score greater than the imdb_score supplied.

    Parameters:
    -----------
    movies : list
        List of dicts holding information about movie in format {name, imdb score, category}
    
    imdb_score : float
        Target imdb_score to be compared against.
    
    Returns:
    -------
    pick_list : list
        Sublist of movies where imdb rating > imdb_score
    """
    pick_list = []
    for movie in movies:
        if movie['imdb'] > imdb_score:
            pick_list.append(movie)
    return pick_list

In [7]:
#Example use
movie_sublist(movies, 7.0)

[{'name': 'Dark Knight', 'imdb': 9.0, 'category': 'Adventure'},
 {'name': 'The Help', 'imdb': 8.0, 'category': 'Drama'},
 {'name': 'Colonia', 'imdb': 7.4, 'category': 'Romance'},
 {'name': 'Joking muck', 'imdb': 7.2, 'category': 'Comedy'},
 {'name': 'What is the name', 'imdb': 9.2, 'category': 'Suspense'},
 {'name': 'We Two', 'imdb': 7.2, 'category': 'Romance'}]

In [8]:
#3.2 Expert
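def sorted_movies(movies):
    """
    Returns the movies list sorted by category average score, then by individual
    imdb score, both from highest to lowest.
    """
    data = []
    movies_by_name = {}
    output = []
    for movie in movies:
        # Average score of the category this movie belongs to
        score = average_by_category(movies, movie['category'])
        data.append([movie['name'], movie['imdb'], score])
        movies_by_name[movie['name']] = movie
    # Sort on (category average, imdb score), highest first
    data = sorted(data, key=lambda x: (x[2], x[1]), reverse=True)

    # Map the sorted names back to the original movie dicts
    for item in data:
        output.append(movies_by_name[item[0]])
    return output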

In [9]:
sorted_movies(movies)

[{'name': 'Dark Knight', 'imdb': 9.0, 'category': 'Adventure'},
 {'name': 'What is the name', 'imdb': 9.2, 'category': 'Suspense'},
 {'name': 'Detective', 'imdb': 7.0, 'category': 'Suspense'},
 {'name': 'The Help', 'imdb': 8.0, 'category': 'Drama'},
 {'name': 'Joking muck', 'imdb': 7.2, 'category': 'Comedy'},
 {'name': 'Colonia', 'imdb': 7.4, 'category': 'Romance'},
 {'name': 'We Two', 'imdb': 7.2, 'category': 'Romance'},
 {'name': 'The Choice', 'imdb': 6.2, 'category': 'Romance'},
 {'name': 'Love', 'imdb': 6.0, 'category': 'Romance'},
 {'name': 'Bride Wars', 'imdb': 5.4, 'category': 'Romance'},
 {'name': 'Hitman', 'imdb': 6.3, 'category': 'Action'},
 {'name': 'Usual Suspects', 'imdb': 7.0, 'category': 'Thriller'},
 {'name': 'Exam', 'imdb': 4.2, 'category': 'Thriller'},
 {'name': 'Ringing Crime', 'imdb': 4.0, 'category': 'Crime'},
 {'name': 'AlphaJet', 'imdb': 3.2, 'category': 'War'}]
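
`sorted_movies` recomputes the category average for every movie, which means one full pass over the list per movie. Here is a sketch of an alternative that computes each category's average once and then sorts with a tuple key. The `sample_movies` list and the name `sorted_movies_single_pass` are assumptions made for illustration, and categories are grouped by their exact spelling here.

```python
from collections import defaultdict

# Hypothetical subset of the lab's `movies` list.
sample_movies = [
    {"name": "Usual Suspects", "imdb": 7.0, "category": "Thriller"},
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
    {"name": "Exam", "imdb": 4.2, "category": "Thriller"},
    {"name": "Hitman", "imdb": 6.3, "category": "Action"},
]


def sorted_movies_single_pass(movies):
    """Sort by category average (descending), then by IMDb score (descending)."""
    # Collect the scores for each category in one pass over the list.
    scores_by_category = defaultdict(list)
    for movie in movies:
        scores_by_category[movie["category"]].append(movie["imdb"])

    # Average each category once instead of once per movie.
    category_average = {
        category: sum(scores) / len(scores)
        for category, scores in scores_by_category.items()
    }

    return sorted(
        movies,
        key=lambda movie: (category_average[movie["category"]], movie["imdb"]),
        reverse=True,
    )


for movie in sorted_movies_single_pass(sample_movies):
    print(movie["name"], movie["imdb"], movie["category"])
```

The tuple key makes Python compare category averages first and fall back to the individual score only when the averages tie.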

---

### 4) Creating subsets by string condition.

#### 4.1)

Write a function that:

1) Accepts the `movies` list and a category name.
2) Returns the movie names within that category (case-insensitive!).
3) If the category is not in the data, prints a message that says it does not exist and returns `None`.

Recall that, to convert a string to lowercase, you can use:

```python
mystring = 'Dumb and Dumber'
lowercase_mystring = mystring.lower()
print(lowercase_mystring)
# dumb and dumber
```

#### 4.2 [Challenge])

Write a function that:

1) Accepts the `movies` list and a "search string."
2) Returns a dictionary with the keys `'category'` and `'title'` whose values are lists of categories that contain the search string and titles that contain the search string, respectively (case-insensitive!).

In [10]:
#4.1
def movie_names(movies, category):
    """
    Accepts list of movie dicts and a category. Returns list of movie names within that category.
    
    Parameters:
    -----------
    movies : list
        List of dicts holding information about movie in format {name, imdb score, category}
        
    category : str
        Movies category to be selected.
        
    Returns:
    --------
    sub_list : list
        Sub list of movies, where movie category matches category.
    """
    sub_list = [movie['name'] for movie in movies if movie['category'].lower() == category.lower()]
    if not sub_list:
        print('Category does not exist: ',category)
        return None
    return sub_list

In [11]:
#Example
movie_names(movies, 'Romance')

['The Choice', 'Colonia', 'Love', 'Bride Wars', 'We Two']

In [12]:
#Example
movie_names(movies, 'Not_here')

Category does not exist:  Not_here


In [13]:
#4.2
def movie_search(movies, search_string):
    """
    Returns a dictionary of categories and movie names, where search_string present in category or movie name.

    Parameters:
    -----------
    movies : list
        List of dicts holding information about movie in format {name, imdb score, category}
        
    search_string : str
        Target string to be found in category or movie name.
    
    Returns:
    --------
    matches : dict
        Dict of {'category' : matching, 'title' : matching}
    """
    matches = {'category' : [],
              'title' : []}
    search_string = search_string.lower()
    for movie in movies:
        title = movie['name']
        category = movie['category']
        if search_string in title.lower():
            matches['title'].append(title)
        if search_string in category.lower():
            matches['category'].append(category)
    return matches

In [14]:
#Example
movie_search(movies, 'ce')

{'category': ['Romance', 'Romance', 'Romance', 'Romance', 'Romance'],
 'title': ['The Choice']}
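
Because `movie_search` appends the category once per matching movie, the same category can appear several times, as `'Romance'` does above. If unique values are preferred, one possible approach (a sketch, not part of the lab solution) is to de-duplicate while keeping first-seen order with `dict.fromkeys`. The helper name `unique_in_order` is hypothetical.

```python
def unique_in_order(items):
    """Drop repeated items while keeping the order of first appearance."""
    # Dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


# The result shown above, copied here so the sketch runs on its own.
matches = {
    "category": ["Romance", "Romance", "Romance", "Romance", "Romance"],
    "title": ["The Choice"],
}
deduplicated = {key: unique_in_order(values) for key, values in matches.items()}
print(deduplicated)
# {'category': ['Romance'], 'title': ['The Choice']}
```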

---

### 5) Multiple conditions.

#### 5.1)

Write a function that:

1) Accepts the `movies` list and a "search criteria" variable.
2) If the criteria variable is numeric, return a list of movie titles with a score greater than or equal to the criteria.
3) If the criteria variable is a string, return a list of movie titles that match that category (case-insensitive!). If there is no match, return an empty list and print an informative message.

#### 5.2 [Expert])

Write a function that:

1) Accepts the `movies` list and a string search criteria variable.
2) The search criteria variable can contain within it:
  - Boolean operations: `'AND'`, `'OR'`, and `'NOT'` (lowercase is also accepted; they are capitalized here for clarity).
  - Search criteria specified with the syntax `score=...`, `category=...`, and/or `title=...`, where the `...` indicates what to look for.
    - If `score` is present, it indicates scores greater than or equal to the value.
    - For `category` and `title`, the string indicates that the category or title must _contain_ the search string (case-insensitive).
3) Return the matches for the search criteria specified.
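
The search syntax described above could be handled in many ways. The following is a minimal sketch under several assumptions that the lab does not state: there are no parentheses, `NOT` binds tightest, then `AND`, then `OR`, a value may contain spaces but not the operator words themselves, and malformed queries are not validated. All names here (`tokenize`, `term_matches`, `evaluate`, `advanced_movie_search`, `sample_movies`) are hypothetical.

```python
OPERATORS = {"and", "or", "not"}

# Hypothetical subset of the lab's `movies` list.
sample_movies = [
    {"name": "Usual Suspects", "imdb": 7.0, "category": "Thriller"},
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
    {"name": "Colonia", "imdb": 7.4, "category": "Romance"},
    {"name": "Bride Wars", "imdb": 5.4, "category": "Romance"},
    {"name": "Exam", "imdb": 4.2, "category": "Thriller"},
]


def tokenize(query):
    """Split a query into lowercase operators and 'field=value' terms."""
    tokens, current = [], []
    for word in query.split():
        if word.lower() in OPERATORS:
            if current:  # close the term collected so far
                tokens.append(" ".join(current))
                current = []
            tokens.append(word.lower())
        else:
            current.append(word)
    if current:
        tokens.append(" ".join(current))
    return tokens


def term_matches(movie, term):
    """Check a single 'field=value' term against one movie."""
    field, _, value = term.partition("=")
    field, value = field.strip().lower(), value.strip().lower()
    if field == "score":
        return movie["imdb"] >= float(value)
    if field == "category":
        return value in movie["category"].lower()
    if field == "title":
        return value in movie["name"].lower()
    raise ValueError(f"Unknown search field: {field!r}")


def evaluate(movie, tokens):
    """Evaluate the tokens for one movie (NOT, then AND, then OR)."""
    or_result = False   # result of the OR-ed groups finished so far
    and_result = True   # result of the current AND-ed group
    negate = False      # whether the next term is preceded by NOT
    for token in tokens:
        if token == "or":
            or_result = or_result or and_result
            and_result = True
        elif token == "and":
            continue
        elif token == "not":
            negate = not negate
        else:
            value = term_matches(movie, token)
            and_result = and_result and (value != negate)
            negate = False
    return or_result or and_result


def advanced_movie_search(movies, query):
    """Return the movies that satisfy the query string."""
    tokens = tokenize(query)
    return [movie for movie in movies if evaluate(movie, tokens)]


found = advanced_movie_search(sample_movies, "category=rom AND score=6")
print([movie["name"] for movie in found])
# ['Colonia']
```

Tokenizing once and evaluating per movie keeps the parsing separate from the matching, which makes each piece easy to test in isolation.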

In [15]:
#5.1
def return_movie_by_category(movies, category):
    """
    For list of movie dicts, return a list of movie names where category matches input category.

    Parameters:
    -----------
    movies : list
        List of dicts holding information about movie in format {name, imdb score, category}
        
    category : str
        Search category to be matched.
        
    Examples:
    ---------
    
    >>> return_movie_by_category(movies, 'Romance')
    ['The Choice', 'Colonia', 'Love', 'Bride Wars', 'We Two']
        
    Returns:
    --------
    titles : list
        List of strings of movie names where category matches input category
    """
    titles = []
    for movie in movies:
        if movie['category'].lower() == category.lower():
            titles.append(movie['name'])
    return titles

def return_movie_by_rating(movies, rating):
    """
    For list of movie dicts, return a list of movie names where rating is greater than input rating.

    Parameters:
    -----------
    movies : list
        List of dicts holding information about movie in format {name, imdb score, category}
        
    rating : float
        Search rating to be compared against.
        
    Examples:
    ---------
    
    >>> return_movie_by_rating(movies, 8.9)
    ['Dark Knight', 'What is the name']
    
    >>> return_movie_by_rating(movies, 9.0)
    ['What is the name']
        
    Returns:
    --------
    titles : list
        List of strings of movie names where rating greater than input rating.
    """
    titles = []
    for movie in movies:
        if movie['imdb'] > rating:
            titles.append(movie['name'])
    return titles

def search_criteria_type(search_criteria):
    """
    Checks search criteria and determines if the criteria is a rating or a category.
    
    Examples:
    ---------
    >>> search_criteria_type(4.0)
    'rating'
    
    >>> search_criteria_type('The Dark Knight')
    'category'
    
    """
    action = ''
    try:
        float(search_criteria)
        action = 'rating'
    except (TypeError, ValueError):
        # float() rejects non-numeric input, so treat it as a category name
        action = 'category'
        
    return action

def movie_search_again(movies, search_criteria):
    """
    If the criteria is numeric, returns movie titles with a score greater than the value. If the criteria is a string,
    returns a list of movie titles with a matching category.
    
    Parameters:
    -----------
    movies : list
        List of dicts holding information about movie in format {name, imdb score, category}
        
    search_criteria : float or str
        Search term - if numeric search based on rating, if string search based on category
    
    Returns:
    --------
    output : list
        List of movie titles matching the criteria (empty if there is no match).
    """
    action = search_criteria_type(search_criteria)
    output = []
        
    if action == 'rating':
        output = return_movie_by_rating(movies, search_criteria)
    if action == 'category':
        output = return_movie_by_category(movies, search_criteria)
    
    if not output:
        print('No match')
    return output

In [16]:
#Example
movie_search_again(movies,7)

['Dark Knight',
 'The Help',
 'Colonia',
 'Joking muck',
 'What is the name',
 'We Two']

In [17]:
#Example
movie_search_again(movies, 'Thriller')

['Usual Suspects', 'Exam']

In [18]:
#Example
movie_search_again(movies, 'not_there')

No match


[]
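
Two details of the solution above differ slightly from the 5.1 specification: `return_movie_by_rating` uses a strict `>` where the task asks for greater than or equal, and `search_criteria_type` treats any value that `float()` accepts (including a string such as `'7'`) as a rating. Below is a minimal sketch of a stricter variant, assuming numeric criteria always arrive as `int` or `float`. The name `search_by_criteria` and the `sample_movies` list are illustrative.

```python
from numbers import Real

# Hypothetical subset of the lab's `movies` list.
sample_movies = [
    {"name": "Usual Suspects", "imdb": 7.0, "category": "Thriller"},
    {"name": "Hitman", "imdb": 6.3, "category": "Action"},
    {"name": "Dark Knight", "imdb": 9.0, "category": "Adventure"},
    {"name": "Exam", "imdb": 4.2, "category": "Thriller"},
]


def search_by_criteria(movies, criteria):
    """Return titles scoring >= a numeric criteria, or titles in a named category."""
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(criteria, Real) and not isinstance(criteria, bool):
        titles = [m["name"] for m in movies if m["imdb"] >= criteria]
    elif isinstance(criteria, str):
        wanted = criteria.lower()
        titles = [m["name"] for m in movies if m["category"].lower() == wanted]
    else:
        raise TypeError("criteria must be a number or a string")
    if not titles:
        print("No movies match:", criteria)
    return titles


print(search_by_criteria(sample_movies, 7))
# ['Usual Suspects', 'Dark Knight']
print(search_by_criteria(sample_movies, "thriller"))
# ['Usual Suspects', 'Exam']
```

Checking the type with `isinstance` keeps a string like `'7'` on the category path instead of comparing a float against a string.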

In [19]:
#5.2
def movies_advanced_criteria(movies, search_criteria):
    pass

In [20]:
#Example

In [21]:
import doctest
doctest.testmod()

TestResults(failed=0, attempted=10)